Sadly I think it’s more that there isn’t really a standard way to buy books and other media in bulk at the scale that AI training usually requires. So the companies realise they can save both time and money by just pirating, after calculating the risk of a fine. It’s just a bonus that they usually get away with it and that the fines would likely be cheaper than a legit transaction. But I do think it’s the bulk data packaging that makes piracy look more attractive to them from the get-go.
Heck, even video game publishers often source the ROMs for their official re-releases from pirated copies, because pirates are better at preserving data and keeping it in a nice friendly format. It’s easier to search for it on the web and download it than it is to go into their own archives and rip it themselves, if they even still have original copies, ’cause they sure as hell didn’t keep their source code.
Yeah, no, this genuinely doesn’t make sense, as there are legitimate repositories for these books and companies can do business-to-business negotiations for access to them. Even libraries have access to ebooks at bulk scale.
If those kinds of negotiations haven’t been done by other companies before, there won’t be a process for them already in place. There’d be lots of friction for the first such deal, both in legal work and in software development to make sure they only get the access relevant to the deal made.
It’s not something where they can just be like “hey, here’s the FTP URI”. These legitimate repositories you speak of, like Amazon I guess, will already have existing deals with publishers. As they currently stand, these deals may not be compatible with Amazon sharing the publishers’ IP with other companies. So they will either have to redo those deals or restrict which titles the likes of Nvidia can access.
There is also no standard way of buying a DRM-free epub for personal use, so I’m fine downloading them from Anna too :)
Well, I suppose they could buy access to Amazon’s Kindle servers.