This person is right. But I think the methods we use to train them are what’s fundamentally wrong. Brute-force learning? Randomised datasets past the coherence/comprehension threshold? And the rationale is that this is done for the sake of optimisation and the name of efficiency? I can see that overfitting is a problem, but did anyone look hard enough at this problem? Or did someone just jump a fence at the time and then everyone decided to follow along and roll with it because it “worked” and it somehow became the golden standard that nobody can question at this point?
The generalized learning is usually just the first step. Coding LLMs typically go through more rounds of specialized learning afterwards in order to tune and focus it towards solving those types of problems. Then there’s RAG, MCP, and simulated reasoning which are technically not training methods but do further improve the relevance of the outputs. There’s a lot of ongoing work in this space still. We haven’t seen the standard even settle yet.
The researchers in the academic field of machine learning who came up with LLMs are certainly aware of their limitations and are exploring other possibilities, but unfortunately what happened in industry is that people noticed that one particular approach was good enough to look impressive and then everyone jumped on that bandwagon.
This person is right. But I think the methods we use to train them are what’s fundamentally wrong. Brute-force learning? Randomised datasets past the coherence/comprehension threshold? And the rationale is that this is done for the sake of optimisation and the name of efficiency? I can see that overfitting is a problem, but did anyone look hard enough at this problem? Or did someone just jump a fence at the time and then everyone decided to follow along and roll with it because it “worked” and it somehow became the golden standard that nobody can question at this point?
The generalized learning is usually just the first step. Coding LLMs typically go through more rounds of specialized learning afterwards in order to tune and focus it towards solving those types of problems. Then there’s RAG, MCP, and simulated reasoning which are technically not training methods but do further improve the relevance of the outputs. There’s a lot of ongoing work in this space still. We haven’t seen the standard even settle yet.
The researchers in the academic field of machine learning who came up with LLMs are certainly aware of their limitations and are exploring other possibilities, but unfortunately what happened in industry is that people noticed that one particular approach was good enough to look impressive and then everyone jumped on that bandwagon.