

It’s trivial to get LLMs to act against the instructions
I have weird thoughts that people find peculiar, so I write them down and people seem to enjoy reading them.
There is a lot more that goes into taking an order than just transcribing it correctly. 18,000 waters may well have been the actual order, because somebody decided to screw with the machine. A human with any social awareness would reliably interpret that as a joke and would simply refuse to punch it in. The LLM will most likely do whatever the customer tells it to do: it has no contextual awareness beyond the system prompt and whatever interaction it has had with the user so far.
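A minimal sketch of where the fix has to live, assuming a hypothetical drive-through agent. Everything here is made up for illustration (`call_llm`, `take_order`, the threshold), not any real API; the point is that since the model only sees its prompt plus the chat, the common-sense check has to be deterministic code sitting between the model and the till:

```python
# Hypothetical drive-through agent; call_llm() stands in for a real
# chat-completion API and just returns a canned parsed order here.

MAX_SANE_QUANTITY = 20  # arbitrary plausibility threshold, for illustration only

def call_llm(system_prompt: str, messages: list[dict]) -> dict:
    # Stub: a real model, seeing only the prompt and the chat,
    # would dutifully parse "18000 waters" and pass it along.
    return {"item": "water", "quantity": 18000}

def take_order(messages: list[dict]) -> dict:
    order = call_llm("You take drive-through orders.", messages)
    # The joke-detection cannot come from the model itself;
    # it has to be ordinary validation outside of it.
    if order["quantity"] > MAX_SANE_QUANTITY:
        raise ValueError(f"Implausible quantity {order['quantity']}; hand off to a human.")
    return order

if __name__ == "__main__":
    try:
        take_order([{"role": "user", "content": "I'll take 18000 waters."}])
    except ValueError as err:
        print(err)  # Implausible quantity 18000; hand off to a human.
```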
It’s all a theater play with them: they’re dumb, but only by a very myopic understanding of intelligence. They’ve managed not just to outwit but to run circles around half the population of the US.
Fun speculation: we are CPUs for the information systems we inhabit, like the scientific method, political ideologies, etc.