On the internet, nobody knows you are Australian.

also https://lemm.ee/u/MargotRobbie

To tell you the truth, I don’t know who I am either. Somebody sincere, perhaps.

But if you ever read this one day, I hope that you are as proud of me as I am of the person I imagined you to be.

  • 0 Posts
  • 19 Comments
Joined 1 year ago
Cake day: June 17th, 2023

  • There is an interesting and almost universal phenomenon on Reddit: every time a subreddit gets past about 40,000 subscribers, the discussion quality immediately drops off a cliff unless extremely harsh moderation policies are implemented to explicitly weed out low-effort content, which brings its own set of problems.

    My theory on why this occurs is the scaling power of moderation. I think you computer people are very familiar with the concept of scalability, and with the fact that size becomes its own challenge at hyperscale. For a centralized system like Twitter or Instagram or Facebook, moderation can only scale vertically, so a huge moderation team is needed just to contend with the sheer size of these platforms, which in turn forces the need for personalized recommendation algorithms to surface content that is actually interesting to individual users.

    Reddit was able to partially avoid this phenomenon with the subreddit system, which lets everyone manage their own smaller subgroups that share a common interest, without intervention from the site admins/mods, achieving a form of pseudo-horizontal scaling. You can also see the success of that with Facebook Groups, which are one of the few reasons people still use Facebook for social media even though they don't want to interact with the current Facebook audience.

    Lemmy, and the rest of the fediverse platforms, should suffer from these problems even less, since every group admin can be completely independent of one another, which means real horizontal scaling can be achieved and discussion quality can hopefully be preserved to a degree as the network grows.

  • My advice is to restart with Arch (I use Arch btw). Not Manjaro, I’m talking Arch.

    I think installing and using Arch, with its barebones nature, FORCES you to understand how Linux works differently from Windows, through concepts like root, the bootloader, terminal emulators, and disk partitioning, just to give you some examples. At the same time, Arch has excellent documentation, a great package manager in pacman, and a rolling release model that greatly simplifies maintenance during daily use, so you can tune it to exactly how you want it.

    I believe doing it the hard way at first will make it easier for you in the long run if you really want to understand Linux, and Arch is just the right amount of difficult: Gentoo would be too hard, and you don't learn enough from using Ubuntu/Debian/Mint.

    But yeah, if you just want something that works well out of the box, then Ubuntu is great; there's nothing wrong with using the more user-friendly distros.
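
    For a sense of what the day-to-day maintenance actually looks like once Arch is set up, it's roughly just this (the package name is only an example):

    ```sh
    # Refresh the package databases and upgrade every installed package;
    # on a rolling release, this is the whole "update the OS" routine.
    sudo pacman -Syu

    # Install or remove software (along with unneeded dependencies and config backups).
    sudo pacman -S firefox
    sudo pacman -Rns firefox

    # List orphaned packages left behind after removals.
    pacman -Qdt
    ```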

  • But what an LLM does meets your listed definition of transformative as well: it provides additional value that can't be derived from the original, because everything it outputs is completely original, only similar in style to the original, and can't be used to reconstitute the original work. In other words, it is similar to fan work, which is fitting, given that the current ML models, text2text or text2image, are literally called "transformers". Again, works that are merely similar in style to the original cannot and should not be considered copyright infringement, because that's a can of worms nobody actually wants to open, and the courts have been very consistent on that.

    So, if the courts have already found that digitizing copyrighted material into a database is fair use and not a derivative work, I would find it hard to believe that they wouldn't also consider digitizing copyrighted material into a database with very lossy compression (that's a more accurate description of what LLMs are, please give this a read if you have time) to be fair use. Of course, with the current Roberts court, there is always the chance that weird things can happen, but I would be VERY surprised.

    There is also the previous ruling that raw transformer output cannot be copyrighted, but that’s beyond the scope of this post for now.

    My problem with LLM outputs is mostly that they are just bad writing, and I've been pretty critical of """Open"""AI elsewhere on Lemmy, but I don't see Silverman's case going anywhere.

  • She’s going to lose the lawsuit. It’s an open and shut case.

    “Authors Guild, Inc. v. Google, Inc.” is the precedent case here: the Second Circuit established (and the Supreme Court declined to review, letting the ruling stand) that transformative digitization of copyrighted material inside a search engine constitutes fair use, and text used for training LLMs is even more transformative than book digitization, since it is near impossible to reconstitute the original work barring extreme overtraining.

    You have to understand why styles can't be copyrighted, and shouldn't be copyrightable, because that would honestly be a horrifying prospect for art.