LLMs Will Always Hallucinate, and We Need to Live With This
arxiv.org

As Large Language Models become more ubiquitous across domains, it becomes important to examine their inherent limitations critically. This work argues that hallucinations in language models are not just occasional errors but an inevitable feature of these systems. We demonstrate that hallucinations stem from the fundamental mathematical and logical structure of LLMs. It is, therefore, impossible to eliminate them through architectural improvements, dataset enhancements, or fact-checking mechanisms. Our analysis draws on computational theory and Gödel's First Incompleteness Theorem, which references the undecidability of problems like the Halting, Emptiness, and Acceptance Problems. We demonstrate that every stage of the LLM process, from training data compilation to fact retrieval, intent classification, and text generation, will have a non-zero probability of producing hallucinations. This work introduces the concept of Structural Hallucination as an intrinsic nature of these systems. By establishing the mathematical certainty of hallucinations, we challenge the prevailing notion that they can be fully mitigated.
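The abstract's core claim, that every stage carries a non-zero chance of error, compounds in a simple way. A back-of-the-envelope version, assuming the stages fail independently (a simplification; the paper's own formalism may be stricter):

```latex
% p_i = probability that stage i (data compilation, retrieval,
% intent classification, generation, ...) introduces a hallucination.
\[
  P(\text{no hallucination}) \;=\; \prod_{i=1}^{n} (1 - p_i) \;<\; 1
  \quad \text{whenever any } p_i > 0 .
\]
% Worked example: n = 4 stages, each with p_i = 0.02:
% (0.98)^4 \approx 0.922, i.e. roughly an 8% chance that at least
% one stage hallucinates, and the gap only grows with more stages.
```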



Generally, hallucinations are frequent in pure chatbots like ChatGPT, because they rely on their own knowledge base and LLM, so if they don't know an answer they invent one based on their data set. AI with web access is different: it doesn't have its own knowledge base but retrieves its answers in real time from web content, so its reliability is similar to that of a traditional search engine, with the advantage that it finds relevant sites related to the context of the question, lists its sources, and summarizes the content into a direct answer, instead of the 390,000 pages of sites that have nothing to do with the question in a traditional keyword search. IMHO, the only AI apps that are useful for normal users are search assistants, not a chatbot that tells me BS.
This is not correct. The current chatbots don't „know“ anything, and even the ones with web access hallucinate.
Well, “know” refers to the existing knowledge base used by chatbots, the result of scraping web content, but it is rarely updated (ChatGPT's can be years old). So in a chat, the LLM can pick up the concept of a question, but because of its limited “knowledge” data it converges to inventions, because it lacks reasoning, which is exactly what an AI doesn't have. This is less of an issue with search bots, because they don't have the capability to chat with you and imitate a human; they are limited to processing the concept of your question, searching the web for that concept, and comparing several pages against it to create a summary, as the sketch below illustrates. It is a very different approach from a chat AI. So yes, they can also give BS as an answer, depending on the pages they consult for it, just as when you search the web yourself and land on flat-earther pages, but this is much less of a problem with search AIs than with chatbots.
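A minimal sketch of that search-assistant flow, using DuckDuckGo's public Instant Answer endpoint as a stand-in (Andisearch's actual pipeline is not public, so the fallback behaviour shown here is an illustrative assumption, not its real implementation):

```python
# Sketch of a "search assistant": take the question, pull results from a
# public search endpoint, and return a short answer with its source, or
# decline and offer plain results instead of inventing something.
import requests

def search_assistant(question: str) -> str:
    resp = requests.get(
        "https://api.duckduckgo.com/",
        params={"q": question, "format": "json", "no_html": 1},
        timeout=10,
    )
    data = resp.json()

    answer = data.get("AbstractText", "")
    source = data.get("AbstractURL", "")
    if answer:
        # The answer is grounded in a retrieved page, with the source listed.
        return f"{answer}\n\nSource: {source}"

    # No grounded answer found: fall back to plain result links
    # rather than generating an unsupported answer.
    links = [t.get("FirstURL") for t in data.get("RelatedTopics", [])
             if t.get("FirstURL")]
    if links:
        return "No direct answer found. Related pages:\n" + "\n".join(links[:5])
    return "No answer found, try a normal web search."

print(search_assistant("What is the halting problem?"))
```

The point of the structure is the last branch: when retrieval comes back empty, the assistant declines rather than making something up.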
AI is a tool and we have to use it as such, to help us with research and tasks, not to substitute for our own intelligence and creativity, which is the real problem nowadays. For example, I have several posts on Lemmy in World News, Science and Technology from articles and science papers I found on the web, mostly with long texts. Because of this, I also post them with a summary made by Andisearch, which is always pretty accurate and comes with several different sources on the issue added, so you can check the content. The other reason I like Andisearch is that when it doesn't find an answer, it doesn't invent one; it simply offers a normal web search you can do yourself, using a search API from DDG and other privacy search engines.
Anyway, using AI for research always needs a fact check before we use the content; the only real error is to use the answers as-is, or to use biased AI from big (US) corporations. With almost 8,000 different AI apps and services currently in existence, specialized for very different tasks, we can't generalize from the BS produced by the chatbots of Google, M$, METH, Amazon & co.; the only thing to blame is our own lack of common sense, using them like a kid with a new toy. The differences are too big.
Again: LLMs don't know anything. They don't have a „knowledge base“ like you claim, as in a database where they look up facts. That is not how they work.
They give you the answer that sounds most like a plausible response to whatever prompt you give them. Nothing more. It is surprising how well it works, but it will never be 100% fact-based.
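That "most plausible-sounding answer" point can be shown with a deliberately tiny, hand-built bigram model. Real LLMs are vastly larger, but the mechanism of picking a likely continuation rather than looking up a fact is the same idea:

```python
# Toy illustration: the model only picks a statistically likely next word
# given the words so far; nothing here checks whether the output is true.
import random

# Hand-made "training data": which word tends to follow which.
bigrams = {
    "the": ["capital", "moon", "answer"],
    "capital": ["of"],
    "of": ["france", "mars"],     # "mars" is how nonsense gets in
    "france": ["is"],
    "is": ["paris", "lyon"],      # plausible-sounding, never fact-checked
}

def generate(prompt: str, max_words: int = 6) -> str:
    words = prompt.lower().split()
    for _ in range(max_words):
        choices = bigrams.get(words[-1])
        if not choices:
            break
        # Pick a plausible continuation; there is no fact lookup anywhere.
        words.append(random.choice(choices))
    return " ".join(words)

print(generate("the capital"))   # e.g. "the capital of france is lyon"
```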
No internet research is ever 100% fact-based, with or without AI; it always depends on the sources you use and the fact-checking you do, comparing several sources. As said, in this respect AI used as a search assistant is more reliable than a pure chatbot. The aforementioned Andisearch was created precisely for this reason, as the very first one centred on web content and privacy, long before all the others. The statement from its devs is clear about it.
Some time ago this came from ChatGPT:
I asked the same question in Andisearch and its answer was this:
These differences in reasoning and ethics are why I have used Andi for more than 3 years now, with no hallucinations and no BS since then.