Do Large Reasoning Models Think? An Argument for Yes
The debate around whether large reasoning models (LRMs) can truly think has gained traction recently. Much of the skepticism stems from research suggesting that LRMs merely perform pattern-matching rather than genuine thought. However, a closer look at the mechanisms behind LRM behavior, particularly chain-of-thought (CoT) reasoning, reveals a compelling case for their capacity to think.
Defining Thinking: A Human Perspective
To decide whether LRMs can think, we must first define thinking itself, specifically in the context of problem-solving. Human thinking involves several key processes:
- Problem Representation: The prefrontal cortex and parietal cortex work together to hold problems in mind, break them down, and set goals.
- Mental Simulation: This involves an auditory loop (similar to CoT generation) and visual imagery.
- Pattern Matching and Retrieval: The hippocampus and temporal lobes draw on past experiences and stored knowledge.
- Monitoring and Evaluation: The anterior cingulate cortex (ACC) detects errors and impasses.
- Insight or Reframing: The brain may shift to a more relaxed state, allowing for new perspectives.
CoT Reasoning and Biological Parallels
LRMs, while not replicating every human cognitive faculty, exhibit striking functional similarities to the processes above. CoT reasoning, for instance, mirrors the inner verbalization of thoughts that humans often engage in, and the ability of LRMs to backtrack when a line of reasoning proves fruitless strengthens the parallel. Pattern matching handles the recall of learned experience, problem representation, and the monitoring and evaluation of chains of thought; the generated context acts as a working memory that stores the intermediate steps; and a backtracking search recognizes when a chain of thought is going nowhere and returns to the last reasonable point, as sketched below.
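To make the analogy concrete, here is a minimal, purely illustrative Python sketch (not how any real LRM is implemented): a depth-first search over "thought steps" for a toy arithmetic puzzle, where the trace list plays the role of working memory, the candidate operations stand in for pattern-matched retrieval, and the depth cutoff triggers backtracking when a chain stops making progress.

```python
# Purely illustrative: a depth-first search over "thought steps" for a toy puzzle.
# `trace` acts as working memory, the candidate ops stand in for pattern-matched
# retrieval, and the depth cutoff plus trace.pop() model monitoring and backtracking.

def solve(value, target, trace, depth=0, max_depth=5):
    """Search for a chain of operations that turns `value` into `target`."""
    if value == target:                 # monitoring: goal reached
        return True
    if depth == max_depth:              # monitoring: this chain is going nowhere
        return False
    for op, result in [("+3", value + 3), ("*2", value * 2), ("-1", value - 1)]:
        trace.append(f"{value} {op} -> {result}")   # extend the chain of thought
        if solve(result, target, trace, depth + 1, max_depth):
            return True
        trace.pop()                     # backtrack to the last reasonable point
    return False

trace = []
if solve(2, 9, trace):
    print("\n".join(trace))             # prints whichever chain the search found
```

The point is only structural: extend a chain, evaluate it, and abandon it for the last promising branch when it leads nowhere.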
Consider DeepSeek-R1, which developed CoT reasoning largely through reinforcement learning on problem outcomes rather than from explicit CoT examples. This mirrors how the human brain keeps learning while it solves problems. Although an LRM's weights are frozen at inference time, so it cannot update from real-world feedback while generating, the learning during training happened precisely as the model attempted to solve problems; in effect, it updated itself while reasoning.
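The mechanism behind that kind of training can be sketched with a toy outcome-reward loop. The code below is an invented simplification, not DeepSeek-R1's actual recipe: the "strategies", the task, and the update rule are all made up for illustration. It shows the key property, though: only the final answer is checked, yet the behavior that produces correct answers is the one that gets reinforced, with no worked-out chain of thought ever supplied as a training target.

```python
import random

# Toy outcome-based reinforcement (an invented simplification, not DeepSeek-R1's
# actual pipeline): the "policy" samples a reasoning strategy, only the FINAL
# answer is scored, and strategies that end in correct answers are upweighted.

def final_answer(strategy, x):
    # Hypothetical strategies: only "decompose" computes the task correctly.
    if strategy == "decompose":
        return 2 * x + 1
    if strategy == "guess":
        return random.randint(0, 40)
    return x                                   # "echo" just repeats the input

weights = {"decompose": 1.0, "guess": 1.0, "echo": 1.0}   # learnable preferences
problems = [(x, 2 * x + 1) for x in range(10)]            # task: compute 2x + 1

for _ in range(500):                                      # learn from outcomes only
    x, gold = random.choice(problems)
    strategy = random.choices(list(weights), weights=list(weights.values()))[0]
    reward = 1.0 if final_answer(strategy, x) == gold else -0.1
    weights[strategy] = max(0.01, weights[strategy] + 0.1 * reward)

print(weights)   # "decompose" dominates without ever seeing an example solution
```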
The Next-Token Prediction Argument
Critics argue that because LRMs are essentially next-token predictors, they cannot think. This view, however, underestimates what accurate next-token prediction requires. Natural language has full expressive power as a medium of knowledge representation, and a machine that predicts the next token well must, in some form, represent knowledge about the world in order to compute an accurate probability distribution over what comes next. This is not unlike how humans anticipate the next word during speech or inner thought, a point emphasized by Debasish Ray Chawdhuri, a senior principal engineer at Talentica Software and a Ph.D. candidate in Cryptography at IIT Bombay.
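A crude count-based model makes the claim concrete. The sketch below is only an illustration of the idea (real LRMs learn neural representations rather than lookup tables, and the tiny corpus is made up): even here, a good estimate of the probability distribution over the next token is forced to encode facts about the world, because encoding them is what makes the prediction accurate.

```python
from collections import Counter, defaultdict

# A toy count-based next-token predictor over a made-up three-sentence corpus.
# Real LRMs learn neural representations rather than lookup tables; the point is
# only that P(next token | context) has to encode world knowledge to be accurate.

corpus = [
    "the capital of france is paris",
    "the capital of italy is rome",
    "the capital of france is paris",
]

counts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i in range(len(tokens) - 1):
        context = tuple(tokens[max(0, i - 3):i + 1])      # up to 4 tokens of context
        counts[context][tokens[i + 1]] += 1

def next_token_distribution(context_tokens):
    observed = counts[tuple(context_tokens[-4:])]
    total = sum(observed.values())
    return {token: n / total for token, n in observed.items()}

print(next_token_distribution("the capital of france is".split()))
# {'paris': 1.0} -- the "knowledge" lives inside the predictive distribution
```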
Evidence from Benchmark Results
LRMs perform well on reasoning benchmarks. That evidence combines with the striking similarity between CoT reasoning and biological reasoning, and with the theoretical understanding that any system with sufficient representational capacity, enough training data, and adequate computational power can perform any computable task. LRMs meet those criteria to a considerable extent.
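For context on what "performs well" means mechanically, many reasoning benchmarks score only the model's final answer, often by normalized exact match. The snippet below illustrates that scoring convention with invented items; it is not drawn from any real benchmark.

```python
# Illustration of a common reasoning-benchmark scoring rule: normalized exact match
# of the model's final answer against a gold answer. All three items are invented.

def normalize(answer: str) -> str:
    return answer.strip().lower().replace(" ", "").rstrip(".")

examples = [
    {"question": "What is 17 * 6?", "gold": "102", "model_answer": "102"},
    {"question": "A trip starts at 3 PM and takes 2.5 hours. When does it end?",
     "gold": "5:30 PM", "model_answer": "5:30pm"},
    {"question": "How many primes are below 10?", "gold": "4", "model_answer": "4"},
]

correct = sum(normalize(e["model_answer"]) == normalize(e["gold"]) for e in examples)
print(f"accuracy = {correct / len(examples):.2f}")   # 1.00 on this toy set
```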
Conclusion
Considering the evidence, it’s reasonable to conclude that LRMs almost certainly possess the ability to think. While further research may offer new insights, the current understanding of LRM behavior and its parallels with human cognition supports this conclusion. As the technology continues to evolve, our understanding of what constitutes thought may also need to evolve.