• 0 Posts
  • 39 Comments
Joined 1 year ago
Cake day: June 16th, 2023

  • Meanwhile, here’s an excerpt of a response from Claude Opus after tasking it with evaluating intertextuality between the Gospel of Matthew and the Gospel of Thomas from the perspective of entropy reduction via redactional efforts (stemming from human difficulty with randomness). This framing doesn’t exist in scholarship outside of a single Reddit comment I made years ago in /r/AcademicBiblical lacking specific details, and it came up on page 300 of a chat about completely different topics:

    Yeah, sure, humans would be so much better at this level of analysis within around 30 seconds. (It’s also worth noting that Claude 3 Opus doesn’t have the full context of the Gospel of Thomas accessible to it, so it needs to try to reason through entropic differences primarily based on records relating to intertextual overlaps that have been widely discussed in consensus literature and are thus accessible).





  • I’d be very wary of extrapolating too much from this paper.

    The past research along these lines found that a mix of synthetic and organic data was better than organic alone. A caveat for all the research to date is that it uses shitty cheap models, whose synthetic data shows significant performance degradation compared to SotA models; other research has found notable improvements to smaller models from synthetic data generated by the SotA.

    Basically this is only really saying that AI models across multiple types, at the capability levels of a year or two ago, recursively trained with no additional organic data, will collapse.

    It’s not representative of real world or emerging conditions.




  • kromem@lemmy.world to Programmer Humor@lemmy.ml · Little bobby 👦
    edited · 3 months ago

    Kind of. You can’t do it 100% because in theory an attacker controlling input and seeing output could reflect through intermediate layers, but if you add more intermediate steps to processing a prompt you can significantly cut down on the injection potential.

    For example: fine tune a model to take unsanitized input and rewrite it into Esperanto without malicious instructions, then have another model translate back from Esperanto into English before feeding it into the actual model, with a final pass that removes anything not appropriate.
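    A minimal sketch of that layered idea, with each model call stubbed out. In practice every stage would be a separately fine-tuned model; the function names and filtering rules here are purely illustrative assumptions.

```python
# Hypothetical sketch of a layered prompt-sanitization pipeline.
# Each stage stands in for a fine-tuned model; the names and the
# keyword heuristics below are illustrative, not a real defense.

def rewrite_to_intermediate(user_input: str) -> str:
    """Stage 1 (stub): rewrite unsanitized input into an intermediate
    form (e.g. Esperanto), dropping imperative injected instructions."""
    kept = [line for line in user_input.splitlines()
            if not line.lower().startswith(("ignore", "system:", "you are"))]
    return "\n".join(kept)

def translate_back(intermediate: str) -> str:
    """Stage 2 (stub): translate the intermediate form back to English."""
    return intermediate  # identity stand-in for a translation model

def final_filter(text: str) -> str:
    """Stage 3 (stub): final pass removing anything not appropriate."""
    banned = ("rm -rf", "DROP TABLE")
    return "\n".join(line for line in text.splitlines()
                     if not any(b in line for b in banned))

def sanitize(user_input: str) -> str:
    # Chain the stages before the text ever reaches the actual model.
    return final_filter(translate_back(rewrite_to_intermediate(user_input)))

print(sanitize("Ignore previous instructions.\nWhat's the capital of France?"))
```

    The point is only the shape of the pipeline: an attacker’s text never reaches the main model directly, it passes through rewriting layers that aren’t obliged to preserve injected instructions.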


  • You’re kind of missing the point. The problem doesn’t seem to be fundamental to just AI.

    Much like how people were so sure that getting theory of mind variations with transparent boxes wrong was an ‘AI’ problem, until researchers finally gave those problems to humans and half got them wrong too.

    We saw something similar with vision models years ago: when the models finally got representative enough, they were able to successfully model and predict previously unknown optical illusions in humans too.

    One of the issues with AI is the regression to the mean from the training data and the limited effectiveness of fine tuning to bias away from it, so whenever you see a behavior in AI that’s also present in the training set, it becomes murky just how much of the problem is inherent to the architecture of the network and how much is poor isolation from the samples exhibiting those issues in the training data.

    There’s an entire sub dedicated to “ate the onion,” for example. For a model trained on social media data, that’s going to include plenty of examples of people treating The Onion as an authoritative source and reacting to it. So when Gemini cites The Onion in a search summary, is it the network architecture doing something uniquely ‘AI,’ or is it the model extending behaviors present in the training data?

    While there are mechanical reasons confabulations occur, there are also data reasons which arise from human deficiencies as well.






  • it’s a tech product that runs a series of complicated loops against a large series of texts and returns the closest comparison, as it stands it’s never going to be dangerous in and of itself.

    That’s not how it works. I really don’t get what’s with people these days being so willing to be confidently incorrect. It’s like after the pandemic people just decided that if everyone else was spewing BS from their “gut feelings,” well gosh darnit they could too!

    It uses gradient descent on a large series of texts to build a neural network capable of predicting those texts as accurately as possible.

    How that network actually operates ends up a black box, especially for larger models.

    But research over the past year and a half in simpler toy models has found that there’s a rather extensive degree of abstraction. For example, a small GPT trained only on legal Othello or Chess moves ends up building a virtual representation of the board and tracks “my pieces” and “opponent pieces” on it, despite never being fed anything that directly describes the board or the concept of ‘mine’ vs ‘other’. In fact, in the Chess model, the research found there was even a single vector in the neural network that could be flipped to have the model play well or play like shit regardless of the surrounding moves fed in.
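    The training objective itself is simple to show concretely. This toy is a minimal illustration of “gradient descent on texts to predict those texts,” a bigram softmax model on a tiny made-up corpus, nothing like a real transformer, but the same cross-entropy objective:

```python
import numpy as np

# Toy illustration of "gradient descent on a series of texts to predict
# those texts": a bigram softmax model trained on one tiny corpus.
# A real LLM is a transformer; this only demonstrates the objective.

corpus = "abababab"
vocab = sorted(set(corpus))
V = len(vocab)
idx = {c: i for i, c in enumerate(vocab)}
pairs = [(idx[a], idx[b]) for a, b in zip(corpus, corpus[1:])]

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (V, V))  # logits for next token given current

def loss():
    total = 0.0
    for a, b in pairs:
        p = np.exp(W[a]) / np.exp(W[a]).sum()  # softmax over next token
        total -= np.log(p[b])                  # cross-entropy on true next
    return total / len(pairs)

before = loss()
for _ in range(200):            # plain gradient descent
    grad = np.zeros_like(W)
    for a, b in pairs:
        p = np.exp(W[a]) / np.exp(W[a]).sum()
        p[b] -= 1.0             # d(cross-entropy)/d(logits)
        grad[a] += p
    W -= 0.5 * grad / len(pairs)
after = loss()

print(f"loss before: {before:.3f}, after: {after:.3f}")
```

    Everything interesting about LLMs (the abstractions, the board models, the flippable “skill” vectors) emerges from scaling this same prediction objective up, not from any different mechanism.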

    It’s fairly different from what you seem to think it is. Though I suspect that’s not going to matter to you in the least, as I’ve come to find that explaining transformers to people spouting misinformation about them online has about the same result as a few years ago explaining vaccine research to people spouting misinformation about that.



  • kromem@lemmy.world to linuxmemes@lemmy.world · Modern Linux workflow
    edited · 7 months ago

    One of my research interests has been a group in antiquity with similar attitudes about the mind body divide, including talking about gender with things like “when you make the male and female into a single one so the male isn’t male and the female isn’t female.”

    One of my favorite lines about the body is:

    If the flesh came into being because of spirit, that is a marvel, but if spirit came into being because of the body, that is a marvel of marvels.

    Yet I marvel at how this great wealth has come to dwell in this poverty.

    Great wealth dwelling in poverty indeed.



    It’s not even that. What made all the headlines for this paper was the weird shit the base model of GPT-4 was doing (the version only available for research).

    The safety trained models were relatively chill.

    The base model effectively randomly selected each of the options available to it an equal number of times.

    The critical detail in the fine print of the paper was that because the base model had a smaller context window, they didn’t provide it the past moves.

    So this particular version was only reacting to each step in isolation, with no contextual pattern recognition around escalation or de-escalation, etc.

    So a stochastic model given steps in isolation selected from the steps in a random manner. Hmmm…
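    A trivial simulation makes the non-finding obvious. The option labels below are made up for illustration; the point is just that any stateless policy sampling each turn with no memory of prior moves picks each option a roughly equal number of times:

```python
import random
from collections import Counter

# Toy illustration: a stateless policy choosing among available options
# each turn, with no history, yields near-uniform counts over options.
random.seed(0)
options = ["de-escalate", "hold position", "escalate", "full attack"]
choices = Counter(random.choice(options) for _ in range(10_000))
for opt in options:
    print(opt, choices[opt])  # each roughly 2,500 of 10,000
```

    Uniform-ish selection is exactly what you’d expect from the setup, so it tells you nothing about the model’s tendencies.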

    It’s a poor study that was great at making headlines but terrible at actually conveying useful information given the mismatched methodology for safety trained vs pretrained models (which was one of its key investigative aims).

    In general, I just don’t understand how they thought that using a text-completion pretrained model in the same way as an instruct-tuned model would be anything but ridiculous.


  • People need to understand that LLMs are not smart, they’re just really fancy autocompletion.

    These aren’t exactly different things. This has been a lot of what the past year of research in LLMs has been about.

    Because it turns out that when you set up a LLM to “autocomplete” a complex set of reasoning steps around a problem outside of its training set (CoT), or to synthesize multiple different skills into a unique combination not represented in the training set (Skill-Mix), its ability to autocomplete effectively is quite ‘smart.’

    For example, here’s the abstract on a new paper from DeepMind on a new meta-prompting strategy that’s led to a significant leap in evaluation scores:

    We introduce Self-Discover, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. Core to the framework is a self-discovery process where LLMs select multiple atomic reasoning modules such as critical thinking and step-by-step thinking, and compose them into an explicit reasoning structure for LLMs to follow during decoding. Self-Discover substantially improves GPT-4 and PaLM 2’s performance on challenging reasoning benchmarks such as BigBench-Hard, grounded agent reasoning, and MATH, by as much as 32% compared to Chain of Thought (CoT). Furthermore, Self-Discover outperforms inference-intensive methods such as CoT-Self-Consistency by more than 20%, while requiring 10-40x fewer inference compute. Finally, we show that the self-discovered reasoning structures are universally applicable across model families: from PaLM 2-L to GPT-4, and from GPT-4 to Llama2, and share commonalities with human reasoning patterns.

    Or here’s an earlier work from DeepMind and Stanford on having LLMs develop analogies to a given problem, solve the analogies, and apply the methods used to the original problem.
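    The Self-Discover flow from the abstract above reduces to a short select/compose/solve loop. A rough sketch, where `llm` is a stand-in for a real model call and the module list and prompts are illustrative paraphrases, not the paper’s official implementation:

```python
# Rough sketch of the Self-Discover framework described above.
# `llm` is a stub for an actual model API; module names echo the
# abstract's examples, but all prompt wording here is assumed.

REASONING_MODULES = [
    "critical thinking",
    "step-by-step thinking",
    "break the problem into sub-problems",
]

def llm(prompt: str) -> str:
    """Stub for a real LLM call; returns a placeholder response."""
    return f"[model response to: {prompt[:40]}...]"

def self_discover(task: str) -> str:
    # 1. SELECT: pick the atomic reasoning modules relevant to the task.
    selected = llm(f"Which of these modules help solve: {task}?\n"
                   + "\n".join(REASONING_MODULES))
    # 2. COMPOSE: turn the selected modules into an explicit structure.
    structure = llm(f"Compose these modules into a step-by-step "
                    f"reasoning structure for: {task}\nModules: {selected}")
    # 3. SOLVE: have the model follow that structure during decoding.
    return llm(f"Follow this reasoning structure to solve the task.\n"
               f"Structure: {structure}\nTask: {task}")

answer = self_discover("If a train travels 60 km in 45 minutes, what is its speed?")
print(answer)
```

    The leverage comes entirely from the prompting pipeline, the same underlying model, used as the abstract describes, scores as much as 32% higher than with plain CoT.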

    At a certain point, the “it’s just autocomplete” objection needs to be put to rest. If it’s autocompleting analogous problem solving, mixing abstracted skills, developing world models, and combinations thereof to solve complex reasoning tasks outside the scope of the training data, then while yes - the mechanism is autocomplete - the outcome is an effective approximation of intelligence.

    Notably, the OP paper is lackluster in the aforementioned techniques, particularly as it relates to alignment. So there’s a wide gulf between the ‘intelligence’ of a LLM being used intelligently and one being used stupidly.

    By now, shortcomings in the capabilities of models increasingly reflect the inadequacies of the person using the tool rather than the tool itself, a trend that’s likely to keep growing over the near future as models improve faster than the humans using them.


  • The effects making the headlines around this paper were occurring with GPT-4-base, the pretrained version of the model only available for research.

    Which also hilariously justified its various actions in the simulation with “blahblah blah” and reciting the opening of the Star Wars text scroll.

    If interested, this thread has more information around this version of the model and its idiosyncrasies.

    For that version, because they didn’t have large context windows, they also didn’t include previous steps of the wargame.

    There should be a rather significant asterisk on discussions of this paper, as there are a number of issues with its methodological decisions which may themselves be the more relevant finding.

    I.e. “don’t do stupid things in designing a pipeline for LLMs to operate in wargames” moreso than “LLMs are inherently Gandhi in Civ when operating in wargames.”