• 0 Posts
  • 352 Comments
Joined 3 years ago
Cake day: July 5th, 2023


  • AI avatar man wants you to be afraid: “sleeper agents”! “backdoors”! “poisoned documents”! Terrifying!

    It is terrifying. People in positions of power have placed entirely too much trust in these machines that are this easily fooled. I’d argue that we shouldn’t trust these machines as much as they are, but I don’t think the rest of the world is listening enough to these warnings.

    I also worry about how broken search result rankings have gotten. For someone like me who doesn’t use these AI products, it concerns me that actual search engines (which I do use) continue to get worse.

    Sure, there are lessons here for those who build and maintain LLMs, but everyone else should still be terrified at how the world is moving towards, rather than away from, this nonsense.



  • Here’s the original reporting, instead of another website’s summary of Bloomberg’s actual report:

    https://www.bloomberg.com/news/articles/2026-04-28/us-ends-investigation-into-claims-whatsapp-chats-aren-t-private

    https://archive.is/sGE3e

    So it sounds like the agent was investigating allegations, from content moderation contractors, that Meta could access the contents of WhatsApp messages, and came to the conclusion that yes, Meta could.

    There are a few possibilities here.

    1. Meta does have full plaintext access to all WhatsApp messages, but guards that access very closely. Although the clients appear to generate E2EE keys for each session, they are somehow leaking those keys to Meta’s servers, and the closed-source code hides this well enough that no whistleblower or security researcher has been able to detect it definitively.
    2. Meta has a secret wiretap functionality where they can compromise the E2EE keys somehow, but uses it only for narrow cases. This helps keep the functionality secret, because security researchers and other reviewers may never see the functionality in action.
    3. Meta allows users to report objectionable content in threads they’re already part of. The reporting function forwards either the E2EE key itself or the plaintext data, which gives content moderators access to the underlying message contents. The contractor whistleblowers and the federal agent investigating these allegations simply got it wrong, misunderstanding the technical process by which plaintext messages end up in the content moderators’ possession.

    Meta claims that it’s #3. They acknowledge they have plaintext access to messages when a party to the thread presses the report button.
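    A minimal sketch of how that reporting flow can coexist with E2EE, in deliberately simplified Python (hypothetical; not Meta’s actual code): the server only ever relays ciphertext, but the recipient’s client necessarily decrypts messages to display them, so a report can simply re-upload that locally held plaintext over a separate channel.

```python
class Client:
    """Toy messaging client; all names here are illustrative, not WhatsApp's."""

    def __init__(self):
        self.inbox = {}  # message_id -> decrypted plaintext

    def receive(self, message_id: str, plaintext: str):
        # In a real client this plaintext is produced by decrypting the
        # incoming ciphertext with session keys the server never sees.
        self.inbox[message_id] = plaintext

    def report(self, message_id: str) -> dict:
        # No key leaves the device and no E2EE is broken: the reporting
        # user simply shares text they could already read.
        return {"reported_message": self.inbox[message_id]}

client = Client()
client.receive("m1", "objectionable text")
payload = client.report("m1")
```

    Under this model the moderators only ever see messages a participant chose to report, which is consistent with Meta’s public description.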

    This unnamed federal agent believes it’s #1, after 10 months of investigation, and sent out an email to other investigators that they should look into that possibility.

    I’m skeptical of #1, simply because I don’t believe that conspiracies to keep that kind of stuff secret can be maintained. It’s not just that there would be technically skilled whistleblowers who have actual access to the code (not the non-technical content moderator contractors who review the content), but a weakness in such an important and widely used protocol would attract all sorts of hackers, state sponsored or otherwise.

    But option #2 might explain everything we’ve seen so far. Full wiretap capability that is rarely used and very tightly controlled.


  • Anybody who believed that quantum computing posed a risk to symmetric encryption fundamentally misunderstood how encryption works and what quantum computing might one day be good at.

    Asymmetric cryptography is primarily used for the secure exchange of symmetric keys: use a public/private key pair to exchange secure messages about which symmetric key to use for the session, and then both sides switch to that symmetric key for communicating the real payload.

    A public/private key pair is two keys with a special mathematical relationship, such that anyone holding the public key can easily confirm that someone possesses the matching private key, or encrypt something that only that private key can decrypt. That mathematical relationship (in RSA’s case, built on the product of two very large prime numbers) is at the core of modern asymmetric cryptography.

    Quantum computing, via Shor’s algorithm, may one day make integer factorization much, much easier. Once the product of two large primes becomes feasible to factor, public/private key pairs built on that problem are no longer secure.
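    A toy RSA example makes the factoring dependence concrete (textbook-sized numbers, completely insecure; real keys use primes thousands of bits long):

```python
# Toy RSA keypair from two small primes (illustrative only).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent; easy only if you know p and q

message = 65
ciphertext = pow(message, e, n)
decrypted = pow(ciphertext, d, n)   # recovers the message

# An attacker who can factor n recovers the private key outright:
p_found = next(c for c in range(2, n) if n % c == 0)
q_found = n // p_found
d_found = pow(e, -1, (p_found - 1) * (q_found - 1))  # same as d
```

    Trial division works here only because n is tiny; the entire security argument is that factoring a real 2048-bit n is infeasible classically, and Shor’s algorithm is the threat to exactly that assumption.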

    But none of this has anything to do with symmetric encryption or hash functions. Quantum doesn’t move the needle much on that math: the best known quantum attack there, Grover’s algorithm, at most halves the effective key length, which doubling key sizes already addresses.

    The real risk, though, is an adversary who eavesdrops on both the encrypted key exchange (which uses asymmetric cryptography) and the message itself (which uses symmetric cryptography), then later breaks the asymmetric protocol to recover the secret symmetric key from the intercepted exchange, and with it decrypts the symmetric portion of the communication too.
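    That two-step structure, and why retroactively breaking the asymmetric step exposes the symmetric traffic too, can be sketched like this (toy Diffie-Hellman parameters and a toy XOR stream cipher, for illustration only, nothing here is secure):

```python
import hashlib
import secrets

# Step 1 (asymmetric): Diffie-Hellman agrees on a shared secret.
# Step 2 (symmetric): a key derived from that secret encrypts the payload.
# An eavesdropper who records both steps and later breaks step 1
# (e.g. with a future quantum computer) gets the step-2 key for free.

P = 0xFFFFFFFFFFFFFFC5  # 2**64 - 59, a prime far too small for real use
G = 5

a = secrets.randbelow(P - 2) + 2   # Alice's private exponent
b = secrets.randbelow(P - 2) + 2   # Bob's private exponent
A = pow(G, a, P)                   # sent in the clear
B = pow(G, b, P)                   # sent in the clear

shared_alice = pow(B, a, P)
shared_bob = pow(A, b, P)          # same value on both sides

key = hashlib.sha256(shared_alice.to_bytes(8, "big")).digest()

def xor_stream(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher: SHA-256 in counter mode as a keystream.
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(4, "big")).digest()
        out.extend(x ^ y for x, y in zip(data[i:i + 32], block))
    return bytes(out)

ciphertext = xor_stream(key, b"the real payload")
roundtrip = xor_stream(key, ciphertext)
```

    This is the “harvest now, decrypt later” concern: the recorded public values A and B plus a solved discrete log yield the shared secret, and from there the symmetric key and the payload.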






  • I think it’s worth being clear about the scope of the rating. iFixit has always defined repairability by parts availability, and its ratings consider software restrictions only insofar as they interfere with replacing parts to restore a device to its original performance.

    Customizability (in software or otherwise) isn’t part of the score. Durability/longevity isn’t part of the score, either. Those are things that I want, too, but I can recognize those are outside the scope of what iFixit advocates for.

    I do have some concerns about the partnerships creating a conflict of interest, but sometimes that feedback loop is helpful for improving the product, where the maintainer of a standard also has a consulting business in helping others meet that standard. Ideally there’s a wall between the two sides (advisors versus raters), but the mere fact that one company might do both things isn’t that big of a deal in itself.



  • You can reason from a few principles:

    • At its core, what these AI tools and their specialized hardware optimize are math functions that perform inference and pattern recognition at huge scale across enormous data sets.
    • Inferring a rule set for a pattern also allows generation of new data that fits that pattern.
    • Some portion of human cognitive work falls within the general framework of finding patterns or finding new data that fits an old pattern.

    So when people start making claims about things with clear, objective definitions (a win condition in chess, the fastest route through a maze, the best lossless compression ratio for real-world text), it’s reasonable to believe that the current AI infrastructure can lead to breakthroughs on that front. Image recognition, voice recognition, and the like were largely solved a decade ago. Text generation with clear and simple definitions of good and bad (simple summaries, basic code that accomplishes a clearly defined goal) is what LLMs have been doing well.

    On things that have much more fuzzy or even internally inconsistent definitions, the AI world gets much more controversial.

    But I happen to believe that finding and exploiting bugs or security vulnerabilities falls more into the well defined problem with well defined successes and failures. So I take it seriously when people claim that AI tools are helpful for developing certain exploits.
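    To make the “well defined success” point concrete, here’s a toy sketch (hypothetical code, purely illustrative): triggering a crash in a target function is a crisp, machine-checkable objective, exactly the kind of signal an automated search (or an AI tool) can optimize against, unlike fuzzy goals such as “write beautiful prose.”

```python
def fragile_parse(s: str) -> int:
    # Hypothetical buggy target: crashes on inputs with no leading integer.
    return int(s.split(",")[0])

def triggers_bug(candidate: str) -> bool:
    # Crisp success criterion: does this input crash the target?
    try:
        fragile_parse(candidate)
        return False
    except (ValueError, IndexError):
        return True

candidates = ["42", "7,8", "", "abc"]
crashes = [c for c in candidates if triggers_bug(c)]   # ["", "abc"]
```

    Real vulnerability research is far harder than this loop, but the shape of the feedback is the same: a candidate either demonstrably breaks the target or it doesn’t.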


  • but isn’t the memory on the Neo on the same die as the processor?

    Not actually on the same die, but in the same package, stacked on top using TSMC’s Integrated Fan-Out Package on Package (InFO-PoP).

    So the memory still needs to be sourced from memory manufacturers, sent to TSMC, and then packaged by TSMC into a single package. It’s unclear whether they had locked up this supply at pre-AI prices, though. The underlying A18 Pro chip/package was announced and launched about 18 months ago, so if they had the manufacturing pipeline set up for that, they might have kept the contractual rights to continue buying memory at the old prices.



  • No, it’s not volunteering, at least not anymore.

    Subpoena is legal Latin for “under penalty,” because noncompliance with a subpoena carries a penalty.

    Originally, it was an information request from the feds, and Reddit refused. Then they escalated to getting a grand jury subpoena (which means they got a bunch of normal citizens to agree that the information was relevant to a criminal investigation), so now noncompliance carries a penalty.

    Reddit notified the users, who hired their own lawyers, who are resisting the subpoena and will litigate it until a judge decides whether Reddit has to turn the information over.

    That’s the process for these things, and we’re a couple steps in already.





  • What if license and copyright was washed by using an LLM to translate Claude into another language?

    The law doesn’t allow you to launder copyright like that. The result is just a derivative work, which the copyright holder in the original can restrict. As an example, in fiction, distinct characters are copyrighted, and using an LLM to generate new works featuring those characters would still produce derivative works whose distribution the original copyright owner has the right to deny.

    So if you take a copyrighted codebase and reimplement it through some kind of transformation of that code, the result is still a derivative work and infringes the original copyright.

    Now, if you do a clean-room implementation, where you can show the new code was written without copying the original (only reimplementing its functionality from documentation or by reverse engineering how the code behaves), you escape the derivative-work label and can distribute it without the original copyright holder’s permission. Compaq did this with the IBM BIOS to make unauthorized/unlicensed PC clones, and Google did this with the Java API to build Android without a license from Sun/Oracle, and won at the Supreme Court.

    “Claude can’t be copyrighted because it’s a product of an LLM.”

    No, because Claude’s code is still created by humans with the assistance of non-human tools. There’s a spectrum from spelling correction and tab completion in IDEs all the way to full vibe coding from a prompt describing the raw functionality (where the prompt is so uncreative that it isn’t itself copyrightable). Anthropic has never claimed that there was no human in the loop, or that its prompts are so uncreative and purely functional that the outputs aren’t copyrightable.


  • Unless it can be paper thin this does not look better than magnetic tape.

    As the article explains, the whole purpose here is to be able to store data on a medium that can endure harsh conditions, including heat, moisture, radiation, and physical abrasion. The company’s website claims the medium can retain data for 5000 years without power, and is water and fire resistant.

    “I reckon you could scratch it pretty easily.”

    The underlying ceramic film is already used for protecting tools like drill bits and saw blades from physical damage, which is why it was chosen for this project. They already found one of the most durable materials in the world, and asked whether they could store data using that already-durable material.