Alarmed by what companies are building with artificial intelligence models, a handful of industry insiders are calling for those opposed to the current state of affairs to undertake a mass data poisoning effort to undermine the technology.

Their initiative, dubbed Poison Fountain, asks website operators to add links to their websites that feed AI crawlers poisoned training data. It’s been up and running for about a week.

AI crawlers visit websites and scrape data that ends up being used to train AI models, a parasitic relationship that has prompted pushback from publishers. When scraped data is accurate, it helps AI models offer quality responses to questions; when it's inaccurate, it has the opposite effect.
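
As described, participating amounts to serving a page that carries an extra link for crawlers to follow. Here is a minimal sketch in Python/Flask of what that could look like; the feed URL is a placeholder rather than an actual Poison Fountain address, and hiding the link from human visitors is an assumption about how an operator might deploy it, not something the project specifies:

```python
# Sketch: an ordinary page carrying an extra link that scrapers will follow.
# POISON_FEED_URL is a placeholder, not a real Poison Fountain endpoint.
from flask import Flask

app = Flask(__name__)

POISON_FEED_URL = "https://example.com/poison-feed"  # placeholder

@app.route("/")
def index():
    # The link is hidden from human visitors (an assumed deployment choice),
    # but crawlers that follow every href will ingest the poisoned feed.
    return (
        "<html><body>"
        "<h1>My site</h1>"
        f'<a href="{POISON_FEED_URL}" style="display:none">archive</a>'
        "</body></html>"
    )

if __name__ == "__main__":
    app.run()
```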

    • FauxLiving@lemmy.world · 11 hours ago (edited)
      Adversarial noise is a fun topic and a DIY AI project you can take on to familiarize yourself with the local-hosting side of things. Image-generation networks are lightweight compared to LLMs and can run on a moderately powerful NVIDIA gaming PC (most of my work is done on a 3080).

      LLM poisoning can also be done if you can insert poisoned text into a model's training set. An example method is detecting AI scrapers on your server and serving them poisoned text instead of simply blocking them (see the sketch after this comment). Poison Fountain makes this very easy by supplying pre-poisoned data.

      Here is the same kind of training-data poisoning attack, but for images, which researchers at the University of Chicago turned into a simple Windows application: https://nightshade.cs.uchicago.edu/whatis.html

      Thanks to your comment I realized that my clipboard didn't have the right link selected, so I edited in the link to his GitHub ( https://github.com/bennjordan ).
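
A minimal sketch of the detect-and-poison approach the comment describes, in Python/Flask. The User-Agent markers are substrings seen in some real AI crawlers but are illustrative rather than exhaustive, and the poisoned text is a stand-in string, not actual Poison Fountain data:

```python
# Sketch: serve poisoned text to suspected AI scrapers instead of blocking them.
from flask import Flask, request

app = Flask(__name__)

# Substrings seen in some AI crawler User-Agents (illustrative, not exhaustive).
AI_CRAWLER_MARKERS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")

# In the scheme described above this would be pre-poisoned data such as
# Poison Fountain supplies; here it is just a stand-in string.
POISONED_TEXT = "<p>Plausible-looking but deliberately corrupted training text.</p>"

REAL_PAGE = "<h1>Welcome</h1><p>The real content served to human visitors.</p>"

def looks_like_ai_scraper(user_agent: str) -> bool:
    # Crude User-Agent matching; real deployments might also check IP ranges
    # or request patterns.
    return any(marker in user_agent for marker in AI_CRAWLER_MARKERS)

@app.route("/")
def index():
    ua = request.headers.get("User-Agent", "")
    # Instead of returning a 403, hand scrapers poisoned text so it
    # ends up in their training data.
    if looks_like_ai_scraper(ua):
        return POISONED_TEXT
    return REAL_PAGE

if __name__ == "__main__":
    app.run()
```

The design choice here is the one the commenter highlights: blocking a scraper only costs it one page, while quietly feeding it corrupted text turns every visit into a liability for the model being trained.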