- cross-posted to:
- technology@lemmy.world
- technology@beehaw.org
I occasionally use DeepSeek because it’s free, but yeah, it makes mistakes too. They are not really smart; they just know how to sound smart. At the end of the day, LLMs are just another tool that can be used for research, but ultimately I do feel it will be detrimental to humans. I don’t think Terminator will happen, but big tech will use it to dumb down people. They want a dumb populace that will buy their products without asking why.
I keep encountering an attitude from people who really hate it, where they act like it’s the Antichrist, but the problem is really people. Unfortunately you can’t uninvent a technology and make it go away forever.
There are some tasks it’s useful for, but I treat it like a really stupid guy who is occasionally fun to talk to. Anything involving important information gets researched properly. The idea of using it to write anything is insane to me: for the result to be half decent you’d have to tear out large sections and rewrite them yourself, at which point why wouldn’t you just write the thing?
There is one feature GPT has that I don’t know how to get on a locally hosted instance; I’d like to host my own, but I’m too inept. Being able to upload files for it to “look” at is novel. Having it uncensored would also be good, as would having it entirely contained within my computer.
Altman looks like a douche even by tech CEO standards.
IMHO Altman is a new-age wolf in sheep’s clothing. Makes me sick to the bone.
He looks exactly as crazy as Elizabeth Holmes, and I hope he ends up like her (but he won’t).
Altman should release his chat logs
wtf, I checked out the claim about states with R in their name, and it was pretty ridiculous how badly it answered. It answers like someone running through their thoughts out loud in front of you.
====================================================
list all of the US states with R in their name
ChatGPT said: Here’s the full list of U.S. states that have the letter R in their name:
Arizona
Arkansas
California
Colorado
Delaware
Florida
Georgia
Illinois
Maryland
Massachusetts
Missouri
Nebraska
North Carolina
North Dakota
Oregon
South Carolina
Texas – (wait… nope, no R here! ❌)
Vermont
Virginia
Washington – (nope, no R either ❌)
West Virginia
Wisconsin – (no R ❌)
Wyoming – (no R ❌)
Let me clean that up so only correct ones remain:
Final Correct List (22 states):
Arizona
Arkansas
California
Colorado
Delaware
Florida
Georgia
Maryland
Missouri
Nebraska
North Carolina
North Dakota
Oregon
Vermont
Virginia
West Virginia
New Hampshire
New Jersey
New York
Massachusetts
Connecticut – (wait, no R ❌)
Minnesota – (no R ❌)
Hold on — I think we should actually verify methodically, because it’s easy to miss one or double-count. Do you want me to go alphabetically through all 50 states and pick only the ones with “R” so we’re 100% sure? That way we won’t miss any.
I tried this with MS Copilot and got this:
===========================================
list all of the US states with R in their name
Sure! Here’s a list of all U.S. states that contain the letter “R” in their name:
🗺️ U.S. States with “R” in the Name
- Arizona
- Arkansas
- California
- Colorado
- Delaware
- Florida
- Georgia
- Maryland
- Massachusetts
- Minnesota
- Missouri
- Nebraska
- New Hampshire
- New Jersey
- New York
- North Carolina
- North Dakota
- Oregon
- South Carolina
- Texas
- Vermont
- Virginia
- Washington
- West Virginia
- Wisconsin
- Wyoming
Let me know if you want this grouped by region or visualized in a map!
LLMs fundamentally can’t answer this kind of question, so it’s an unfair test. They don’t see letters, they just see tokens.
It’s a bit like asking a random person on the street the square root of 75. Most will just guess on the spot.
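To make the token point concrete, here’s a minimal sketch assuming the open-source `tiktoken` package (`pip install tiktoken`), which implements the BPE tokenizers OpenAI publishes. The exact splits depend on the encoding, so treat the output as illustrative:

```python
# Minimal sketch, assuming the `tiktoken` package. Models receive
# integer token IDs, not letters, so "does this word contain an R?"
# is never directly visible in the input.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for word in ["Arizona", "Massachusetts", "Wisconsin"]:
    ids = enc.encode(word)
    pieces = [enc.decode([t]) for t in ids]  # text each token ID maps to
    print(f"{word!r} -> token ids {ids} -> pieces {pieces}")
```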
I think this kind of stance is really interesting, because of course you’re right, but it doesn’t matter. It doesn’t actually matter what an LLM can do; it matters what the corpos tell us they can do. And according to the corpos, this is child’s play.
Most will just guess on the spot.
Well no, most would say “I don’t know”, which an LLM is unlikely to do unless the training material shows that the consistent answer is “I don’t know”. It will give a fact-shaped answer that may be wrong, but it’s at least ‘shaped’ right.
Again, the big problem is not that LLMs are useless because they can’t do these little specific tricks; it’s that it is important to keep calibrating expectations. At this point people have bet trillions on these things, and that’s a lot of money to have people lie and cheat their way into making everyone overestimate them. Without a counter-perspective, I think 90% of my coworkers would have been laid off already as the executives soak in unchallenged marketing bullshit from the big players.
But I’ve seen AI results that are basically extracts of sources. They’ll even give a link to them.
A bit off topic, but that’s pretty much the result of “prompt stuffing”. Your prompt is processed into a good old-fashioned search query, and the search results are then added to the prompt. From the LLM’s perspective, it looks like a request to rework your source material in a manner consistent with your prompt. The LLM is fed the correct answer, so it doesn’t have to answer; it just has to reword the input.
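Roughly, that flow looks like the sketch below; `web_search` and `llm_complete` here are made-up stubs standing in for a real search backend and model API, not anything from an actual library:

```python
# Hypothetical sketch of "prompt stuffing" (retrieval-augmented answering).
# Both helpers are stubs so the sketch runs; they are not a real API.
def web_search(query: str) -> list[str]:
    return ["Stub snippet: source text that already contains the answer."]

def llm_complete(prompt: str) -> str:
    return f"(model rewords the stuffed prompt: {prompt[:60]}...)"

def answer_with_retrieval(user_prompt: str) -> str:
    snippets = web_search(user_prompt)      # 1. ordinary search first
    context = "\n".join(snippets)           # 2. paste results into the prompt
    stuffed = (
        "Answer using only the sources below, and cite them.\n"
        f"Sources:\n{context}\n"
        f"Question: {user_prompt}"
    )
    return llm_complete(stuffed)            # 3. model rewords, not recalls

print(answer_with_retrieval("How many US states have an R in their name?"))
```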
So?
GPT-5 without “thinking” mode got the answer wrong.
GPT-5 with thinking answered:
Here are the 21 US states with the letter “R” in their name:
Arizona, Arkansas, California, Colorado, Delaware, Florida, Georgia, Maryland, Missouri, Nebraska, New Hampshire, New Jersey, New York, North Carolina, North Dakota, Oregon, Rhode Island, South Carolina, Vermont, Virginia, West Virginia.
It wrote a script that verified it while doing the “thinking” (feeding the hallucinations back into the LLM).
“Thinking” mode is just sending wave upon wave of GPUs at the problem until the killbots hit their pre-set kill count. One could roughly simulate it without thinking mode by just feeding the question and answer back to the LLM repeatedly until it eventually produces an answer that might be “right”. These companies have hit a technological wall with LLMs and will do anything to look like they still have forward momentum.
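That simulated loop might look something like this (a toy sketch; `llm` is a made-up stub standing in for any chat-completion call):

```python
# Toy sketch of the "fake thinking" loop described above: resubmit the
# question plus the previous draft until the answer stops changing.
def llm(prompt: str) -> str:
    return "draft answer"  # stub; a real call would go to a model API

def iterate_until_stable(question: str, max_rounds: int = 5) -> str:
    draft = llm(question)
    for _ in range(max_rounds):
        revised = llm(
            f"Question: {question}\n"
            f"Previous answer: {draft}\n"
            "Check the previous answer for mistakes and rewrite it."
        )
        if revised == draft:  # no change between rounds; call it done
            break
        draft = revised
    return draft  # nothing here guarantees the result is actually right
```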
Well, not quite, because they don’t have criteria for ‘right’.
They do basically say ‘generate 10x more content than usual, then dispose of 90% of it’, and that surprisingly seems to largely improve results, but at no point is it ‘grading’ the result.
Some people have bothered to post ‘chain of thought’ examples, and even when the result is largely ‘correct’, you may see a middle step utterly flubbed in a way that should have fouled the whole thing; yet the error is oddly isolated and doesn’t carry forward into the subsequent content, as it would in actual ‘reasoning’.
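One published version of “generate a lot, keep a little” is self-consistency sampling: draw several answers and keep the most common final answer, which selects without ever grading correctness. A sketch:

```python
# Sketch of self-consistency selection: sample n answers, keep the
# most frequent one. Note there is no correctness check, only popularity.
from collections import Counter

def pick_by_consensus(samples: list[str]) -> str:
    return Counter(samples).most_common(1)[0][0]

print(pick_by_consensus(["21", "22", "21", "21", "19"]))  # -> "21"
```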
I keep wondering how AI is going to handle being fed poor data from other AI or its own output. It will be like a photocopy of a photocopy. The end result is an unreadable mess.
An LLM cannot be anything other than a bullshit machine. It just guesses what the next word is likely to be. And because it’s trained on source data that contains truths as well as non-truths, sometimes what comes out happens to be true. But it doesn’t “know” what is true and what isn’t.
No matter what they try to do, this won’t change. And is one of the main reasons the LLM path will never lead to AGI, although parts of what makes up an LLM could possibly be used inside something that gets to the AGI level.
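As a deliberately crude illustration of “guessing the next word” (a toy bigram model, nothing like a real LLM’s architecture): it produces fluent strings with no concept of truth.

```python
# Toy bigram model: emits statistically plausible text with no notion
# of whether the result is true.
import random
from collections import defaultdict

corpus = "the sky is blue . the grass is green . the sky is wet .".split()
nxt = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    nxt[a].append(b)  # record every observed next word

word, out = "the", ["the"]
while word in nxt and len(out) < 8:
    word = random.choice(nxt[word])  # sample the next word by frequency
    out.append(word)
# Prints a fluent continuation; "the sky is green" comes out just as
# readily as "the sky is blue".
print(" ".join(out))
```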
How many U.S. states include the letter “R” in their name? You may not know the answer off the top of your head, but any literate adult could figure it out with a list of the states and minimal effort. Unfortunately, OpenAI’s ChatGPT doesn’t know the answer, not even its new GPT-5 model.
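For reference, the check the article describes takes a few lines of Python over the 50 state names, and it confirms the count is 21:

```python
# Deterministic version of the question ChatGPT flubbed.
STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California",
    "Colorado", "Connecticut", "Delaware", "Florida", "Georgia",
    "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas",
    "Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts",
    "Michigan", "Minnesota", "Mississippi", "Missouri", "Montana",
    "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico",
    "New York", "North Carolina", "North Dakota", "Ohio", "Oklahoma",
    "Oregon", "Pennsylvania", "Rhode Island", "South Carolina",
    "South Dakota", "Tennessee", "Texas", "Utah", "Vermont",
    "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming",
]
with_r = [s for s in STATES if "r" in s.lower()]
print(len(with_r), with_r)  # 21 states, matching GPT-5's "thinking" answer
```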
Duh, and/or hello. They can’t think or compute. They just look for the statistically likely answer in an n-dimensional matrix of data. They can’t even “compute” 2+2, even though they are running on an actual computer.
Great reference manuals when trained well, that’s it for LLMs.
Copilot is free so I tried it out
Yemen and Femen, bitter rivals.
Asir mitosis
I use it to write scripts and debug error messages. It takes a good number of cycles for it to arrive at a working result, though. The scripts almost always fail on the first try and require numerous rounds of debugging. Once I have a working version, though, it becomes easier to ask for enhancements to it.
Did we ask God 1.0 the meaning of life and expect the answer to be more accurate than “42”?
Gizmodo is doing something important here. Everyone is in dismay over how powerful AI is. But it’s largely just a bullshit machine.
The Israeli remote machine-gun assassination trick isn’t being pulled off without lots and lots of humans working behind the scenes. Our focus should be on leveraging AI to resist the fascist machine wherever it is rising, which is everywhere. The environment is gonna have to wait, goddamnit.
Go ahead and use the tool that works for you. But don’t forget to test your tools on things you know every once in a while.
Ok boomer.