Guess we can always rely on the good old fashioned ways to make money…

Honestly, I think it's pretty awful, but I'm not surprised.

  • tal@lemmy.today · 1 day ago

    As for the ‘corporate control’ aspect, all this stuff is racing towards locally run anyway (since it’s free).

    I am not at all sure about that. I use an RX 7900 XTX and a Framework Desktop with an AI Max 395+, both of which I got to run LLMs and diffusion models locally, so I've certainly no personal aversion to local compute.

    But there are a number of factors pulling in different directions. I am very far from certain that the end game here is local compute.

    In favor of local

    • Privacy.

    • Information security. It’s not that there aren’t attacks that can be performed using just distribution of static models (If Anyone Builds It, We All Die has some interesting theoretical attacks along those lines), but if you’re running important things at an institution that depend on some big, outside service, you’re creating attack vectors into your company’s systems. Not to mention that even if you trust the AI provider and whatever government has access to their servers, you may not trust them to be able to keep attackers out of their infrastructure. True, this also applies to many other cloud-based services, but there are a number of places that run services internally for exactly this reason.

    • No network dependency for operation, in terms of uptime. Especially for things like, say, voice recognition for places with intermittent connection, this is important.

    • Good latency. And no bandwidth restrictions. Though a lot of uses today really are not very sensitive to either.

    • For some locales, regulatory restrictions. Let’s say that one is generating erotica with generative AI stuff, which is a popular application. The Brits just made portraying strangulation in pornography illegal. I suspect that if some random cloud service permits generation of erotic material involving strangulation, it’s probably open to trouble. A random Brit running a model locally may well not be in compliance with the law (I don’t recall if it covers only commercial provision or not), but in practical terms, it’s probably not particularly enforceable. That may be a very substantial factor based on where someone lives. And the Brits are far from the most severe. Iranian law, for example, permits execution for producing pornography involving homosexuality.

    In favor of cloud

    • Power usage. This is, in 2025, very substantial. A lot of people have phones or laptops that run off batteries of limited size. Current parallel compute hardware to run powerful models at a useful rate can be pretty power hungry. My RX 7900 XTX can pull 355 watts. That’s wildly outside the power budget of portable devices. An Nvidia H100 is 700W, and there are systems that use a bunch of those. Even if you need to spend some power to transfer data, it’s massively outweighed by getting the parallel compute off the battery. My guess is that even if people shift some compute to be local (e.g. offline speech recognition) it may be very common for people with smartphones to use a lot of software that talks to remote servers for a lot of heavy-duty parallel compute.

    • Cooling. Even if you have a laptop plugged into wall power, you need to dissipate the heat. You can maybe use eGPU accelerators for laptops — I kind of suspect that eGPUs might see some degree of resurgence for this specific market, if they haven’t already — but even then, it’s noisy.

    • Proprietary models. If proprietary models wind up dominating, which I think is a very real possibility, AI service providers have a very strong incentive to keep their models private, and one way to do that is to not distribute the model.

    • Expensive hardware. Right now, a lot of the hardware is really expensive. It looks like an H100 runs maybe $30k at the moment, maybe $45k. A lot of the applications are “bursty” — you need to have access to an H100, but you don’t need sustained access that will keep that expensive hardware active. As long as the costs and applications look like that, there’s a very strong incentive to time-share hardware: buy a pool of them and share them among users. If I’m using my hardware 1% of the time, I only need to pay something like 1% as much if I’m willing to use shared hardware. We used to do this back when all computers were expensive: dumb terminals and teletypes connected to “real” computers that multiple users shared. That could very much become the norm again. It’s true that I expect that hardware capable of a given level of parallel compute will probably tend to come down in price (though there’s a lot of unfilled demand to meet). And it’s true that the software can probably be made more hardware-efficient than it is today. Those argue for costs coming down. But it’s also true that the software guys can probably produce better output and more-interesting applications if they get more-powerful hardware to play with, and that argues for upwards pressure.

    • National security restrictions. One possible world we wind up in is one where large parallel compute systems are restricted, because it’s too dangerous to permit people to be running around with artificial superintelligences. In the Yudkowsky book I link to above, for example, the authors want international law to entirely prohibit beefy parallel-compute capability from being available to pretty much anyone, due to the risks of artificial superintelligence, and I’m pretty sure that there are also people who just want physical access to parallel compute restricted, which would be a lot easier if the only people who could get the hardware were regulated datacenters. I am not at all sure that this will actually happen, but there are people who have real security concerns here, and that position might become a consensus one in the future. Note that I think that we may already be “across the line” here with existing hardware if parallel compute can be sharded to a sufficient degree, across many smaller systems — your Bitcoin mining datacenter running racks of Nvidia 3090s might already be enough, if you can design a superintelligence that can run on it.
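    To put rough numbers on the power and cost bullets above, here’s a back-of-envelope sketch. The phone battery capacity is an assumption (roughly typical for a current smartphone); the GPU draw and H100 price come from the figures in the list:

    ```python
    # Back-of-envelope numbers for the power and cost bullets above.
    # The battery capacity is an assumed round figure, not a measurement.

    # Power: how long could a phone battery sustain a desktop GPU's draw?
    phone_battery_wh = 13.0   # assumed ~13 Wh, a typical smartphone battery
    gpu_draw_w = 355.0        # RX 7900 XTX board power, per the text
    minutes = phone_battery_wh / gpu_draw_w * 60
    print(f"Battery would last about {minutes:.1f} minutes")  # ~2.2 minutes

    # Cost: dedicated vs. time-shared H100 at low utilization.
    h100_price = 30_000.0     # lower-bound price from the text
    utilization = 0.01        # using the hardware 1% of the time
    effective_cost = h100_price * utilization
    print(f"Amortized share of a pooled H100: ${effective_cost:,.0f}")  # $300
    ```

    Even under generous assumptions, a phone battery buys you a couple of minutes of desktop-GPU-class draw, and the time-sharing discount is two orders of magnitude — which is the whole argument for the mainframe model.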

    • brucethemoose@lemmy.world · 1 day ago

      I disagree with many points, positive and negative.

      • …Honestly, the average person does not care about privacy, security, or being offline. They have shown they will gladly trade all that away for cheap convenience, repeatedly.

      • Nor do they care about power usage or cooling. They generally do not understand thermodynamics, and your 395 (much less an iPhone GPU) would be a rounding error in their bill.

      I’m not trying to disparage folks here, but that’s how they are. We’re talking ‘average people.’


      • As for proprietary models, even if we don’t get a single new open source release, not one, the models we have right now (with a little finetuning/continued training) are good enough for tons of porn.

      • On hardware, I’m talking smartphones. And only smartphones. They’re powerful enough already, they just need a lot of software work and a bit more RAM, but everyone already has one.

      • Regulatory restrictions at either end are quite interesting, and I’m honestly not sure how it will pan out. Though I’m skeptical of any ‘superintelligence’ danger from commodity hardware, as the current software architectures are just not leading to that.

      What I am getting at is the ‘race to the bottom.’

      Fact is, running SDXL or Qwen class models on your iPhone is reasonably fast (results in a few seconds on my base iPhone 16, again highly unoptimized). And it’s free.

      Not free with a catch, free.

      That kind of trumps all other concerns, once the barriers come down. If it’s free, people (and cheap porn middlemen) will find a way. Hence the popularity of free porn now, no matter how crap it is or how much it’s restricted.

      • tal@lemmy.today · 1 day ago

        So, I’m just talking about whether or not the end game is going to be local or remote compute. I’m not saying that one can’t generate pornography locally, but asking whether people will do that: whether the norm will be to run generative AI software locally (the “personal computer” model that came to the fore from the mid-to-late 1970s on) or remotely (the “mainframe” model, which mostly preceded it).

        Yes, one can generate pornography locally…but what if the choice is between a low-resolution, static SDXL (well, or derived model) image or a service that leverages compute to get better images, or something like real-time voice synth, recognition, dialogue, and video? I mean, people can get static pornography now in essentially unbounded quantities on the Internet; if someone spent their entire lives going through it, they’d never see even a tiny fraction of it. Much of it is of considerably greater fidelity than any material that would have been available in, say, the 1980s; that’s certainly true for video. Yet…even in this environment of great abundance, there are people subscribing to commercial (traditional) pornography services, and getting hardware and services to leverage generative AI, even though there are barriers in time, money, and technical expertise to do so.

        And I’d go even further, outside of erotica, and say that people do this for all manner of things. I was really impressed with Wolfenstein 3D when it came out. Yet…people today purchase far more powerful hardware to run 3D video games. You can go and get a computer that’s being thrown out that can probably run dozens of instances of Wolfenstein 3D concurrently…but virtually nobody does so, because there’s demand for the new entertainment material that the new software and hardware permit.

        • brucethemoose@lemmy.world · 1 day ago

          Fair.

          I guess it depends what becomes the ‘norm.’ The thing about any GenAI service is it’s not free. It’s not even ‘cheap’ the way streaming video or images en masse is; every generation is tailored and costs cloud money.

          And Apple seems rather hell-bent on pushing beefy hardware to basically half of all people in the Western world. A few generations of iPhone hardware and software can absolutely get that confluence of ‘real-time voice synth, recognition, dialogue, and video’ you describe, and there are fundamental cloud issues going beyond that: game-like, local pornography ‘virtual reality’ is much better done (at least partially) on device, because of how hard game streaming is.

          …But you still make good points. It could go either way.


          I mean, people can get static pornography now in essentially unbounded quantities on the Internet; if someone spent their entire lives going through it, they’d never see even a tiny fraction of it.

          On this specifically, TONS of people have no idea how to use a browser. Dare I say most, these days? Their whole internet is what’s available through algorithmic recommendations on app stores, hence this ‘sea’ of static pornography might be more limited than you’d think.

          There’s also, apparently, a huge demand for basic interactivity, hence the unreasonable popularity of OnlyFans. And OF-type interactivity is quite crap if you ask me.