• 7 Posts
  • 369 Comments
Joined 3 years ago
Cake day: June 23rd, 2023




  • Didn’t someone at Google write a memo that was like “we’re kinda fucked b/c you can re-create this stuff with enough resources” like 2 years ago?

    Basically, yes. They were specifically decrying the amount of open-sourcing they and their American competitors were doing, because capitalism, of course. Around this time, we had examples like Stability AI’s Stable Diffusion and Meta’s LLaMA as open-source models. It was also around this time that everybody else started closing their models, even though the research kept going out in the open. Stability AI kept their models open, mostly because they had no choice, but the attitude shifted towards profitability.

    So, China took up the open-source mantle, and these open/closed lines are being drawn strictly along national divisions, framed as America vs. China. Which is mostly a diversion from the real battle.


  • Whoever wrote this article didn’t even bother to do the most basic of research.

    DeepSeek fully admitted they started with ChatGPT outputs to train their model. And then they released it as an open-source model, so that everybody else can “steal” their work. On the image/video front, the general public has created every possible variation on top of every model you can think of. On top of that, any model that has ever been released with full weights has been spun into whatever variation or VRAM size you want.

    The ugly truth that the American companies want to hide is that they are spending trillions of dollars on an oligopoly that they can’t keep long-term. They hope that they can just keep spending more money to add more billions of parameters to their models, and stay technologically ahead of the secondary open-source models. But they already ran into diminishing returns over a year ago, and the global compute sector physically cannot keep up with demand for another cycle of even more diminishing returns.

    The other factor is that realistic miniaturization of models is already here. Some of the smaller sizes aren’t as effective as the 250GB models they use on cloud-based services, but you can still do a lot with a 16GB or 24GB video card, using models that fit in that much VRAM. Optimization and LLM quantization are getting better and better each year. The AI bubble burst is going to force a cascade shift into a new era of localization. Everybody is sick to fucking death of renting and subscribing to everything. We pirates already keep local copies on the media front, and soon running LLMs locally is going to become way more popular.

    The question isn’t “Can people steal the tech?”. It’s “How long until people notice that it’s already happening?”








  • Now major news publishers are actively blocking the Internet Archive—one of the most important cultural preservation projects on the internet—because they’re worried AI companies might use it as a sneaky “backdoor” to access their content.

    This is a total lie. This has nothing to do with AI. They’ve hated archive sites because forums like this one hate their paywalls, and we prefer to be able to actually read their articles and discuss them instead of getting blackballed every time.

    NYT is one of the worst offenders, and NYT as a company has taken a turn for the worse in the last 5-10 years, maybe even worse than Amazon Post. None of the old media companies really understand how to adapt in the Internet age, so they are slowly dying. It’s like they are perpetually in an economic bubble that hasn’t figured out how to pop itself. There’s so much damn news, and so many news outlets copying each other’s stories and regurgitating them a hundred times, that we’re forced to aggregate it and have YouTubers hawk shit like Ground News just to process it all.




  • open-weights aren’t open-source.

    This always has been a dumb argument, and really lacks any modicum of practicality. This is rejecting 95% of the need because it is not 100% to your liking.

    As we’ve seen in the text-to-image/video world, you can train on top of base models just fine. Or create LoRAs for specialization. Or change them into various styles of quantized GGUFs.

    Also, you don’t need a Brazilian LLM because all of the LLMs are very multilingual.

    Spending $3000 on training is still really cheap, but depending on the size of the model, you can still get away with training on 24GB or 32GB cards, which cost you the price of the card and energy. LoRAs take almost nothing to train. A university that is worth anything is going to have the resources to train a model like that. None of these arguments hold water.
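To put a rough number on "LoRAs take almost nothing to train", here is a minimal sketch of the parameter arithmetic. It assumes the usual LoRA setup, where a frozen weight matrix of shape `d_out x d_in` gets two small trainable factors of rank `r`. The 4096 x 4096 layer and the rank of 16 are hypothetical values picked for illustration, not taken from any specific model.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when fine-tuning the whole weight matrix."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter on the same matrix.

    LoRA trains two low-rank factors, B (d_out x rank) and A (rank x d_in),
    while the original matrix stays frozen.
    """
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 16

full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
ratio = lora / full

print(f"full fine-tune: {full:,} params")
print(f"LoRA (r={rank}): {lora:,} params ({ratio:.2%} of full)")
```

Under these assumptions the adapter trains well under 1% of the parameters of that layer, which is the main reason LoRA training fits on consumer cards.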


  • DeepSeek API isn’t free, and to use Qwen you’d have to sign up for Ollama Cloud or something like that

    To use Qwen, all you need is a decent video card and a local LLM server like LM Studio.

    Local deploying is prohibitive

    There’s a shitton of LLMs in various sizes to fit the requirements of your video card. Don’t have the 256GB of VRAM needed for the full 235B Qwen3 model quantized to 8-bit? Fine, get the 30B model quantized to 4-bit, which fits into a 24GB card. Or a Qwen3 8B Base post-trained on DeepSeek-R1 at 6-bit quantization, which fits on an 8GB card.

    There are literally hundreds of variations that people have made to fit whatever size you need… because it’s fucking open-source!
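The sizes quoted above follow from simple arithmetic: parameter count times bits per weight. The sketch below is a back-of-the-envelope estimate of the weights alone. It deliberately ignores the KV cache, activations, and runtime overhead, and real quantized files typically use somewhat more than the nominal bit-width, so treat the results as lower bounds. The function names are made up for this example.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def headroom_gb(vram_gb: float, params_billions: float, bits_per_weight: float) -> float:
    """VRAM left over for context and overhead after loading the weights."""
    return vram_gb - weight_size_gb(params_billions, bits_per_weight)


# (label, parameters in billions, bits per weight, card VRAM in GB)
# These are the three examples from the comment above.
examples = [
    ("Qwen3 235B @ 8-bit", 235, 8, 256),
    ("30B @ 4-bit", 30, 4, 24),
    ("8B @ 6-bit", 8, 6, 8),
]

for label, params_b, bits, vram in examples:
    size = weight_size_gb(params_b, bits)
    spare = headroom_gb(vram, params_b, bits)
    print(f"{label}: ~{size:.0f} GB of weights, ~{spare:.0f} GB left on a {vram} GB budget")
```

The leftover VRAM is what the context window has to live in, so a model that barely fits will only run with a short context.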


  • For a company named “Open” AI, their reluctance to just open the weights to this model and wash their hands of it seems bizarre to me.

    It’s not when you understand the history. When Stability AI released their Stable Diffusion model as open-source and kickstarted the whole text-to-image craze, there was a bit of a reckoning. At the time, Meta’s LLaMA was also out there in the open. Then an internal Google memo got out that basically said “oh shit, open-source is going to kick our ass”. Since then, they have been closing everything up, as the rest of the companies realized that giving away their models for free isn’t profitable.

    Meanwhile, the Chinese have realized that their strategy has to be different to compete. So, almost every major model they’ve released has been open-source: DeepSeek, Qwen, GLM, Moonshot AI, Kimi, WAN Video, Hunyuan Image, Higgs Audio. Black Forest Labs in Germany, with their FLUX image model, is the only major non-Chinese company that has adopted this strategy to stay relevant. And the models are actually good, going toe-to-toe with the American closed-source models.

    The US companies have committed to their own self-fulfilling prophecy in record time. Open source is actively kicking their ass. Yet they will spend trillions trying to make profitable models and rape the global economy in the process, while the Chinese wait patiently to stand on top of their corpses, when the AI bubble grenade explodes in their faces. All in the course of 5 years.

    Linux would be so lucky to have OS market share dominance in such an accelerated timeline, rather than the 30+ years it’s actually going to take. This is a self-fail speedrun.