AI Scaling Laws Face Limits as Industry Debates Future of Intelligence Growth

Alex Chen
A glowing neural network with a slowing metallic gear, symbolizing AI's computational limits.

The technical paradigm behind large AI models, in which computation converts electrical energy into reusable intelligence, is approaching its limits, according to a recent analysis by Yang You, a Presidential Young Professor at the National University of Singapore and founder of Morningstar AI. The analysis, reviewed by toolmesh.ai, suggests that while compute continues to increase, the proportional gains in intelligence are slowing.

A detailed view of a server rack in a data center, illustrating raw computational power.

This assessment aligns with growing concerns within the AI community regarding the sustainability of the "Scaling Law" paradigm. Ilya Sutskever, co-founder of OpenAI, has publicly stated that the era of simply increasing pre-training compute is plateauing, necessitating a shift toward new research avenues for intelligence growth. Yann LeCun, Meta's Chief AI Scientist, maintains that current large language models, regardless of scale, cannot achieve true Artificial General Intelligence (AGI). Even OpenAI CEO Sam Altman has indicated that merely adding more GPUs no longer yields a proportional increase in intelligence.

The Core of Intelligence and Its Bottlenecks

Yang You defines intelligence as a model's capacity for prediction and creation. He posits that over the past decade, the essence of large AI models has been the conversion of electrical energy into intelligence via computation. This process has relied on three key factors:

  • Pre-training as the primary intelligence source: While fine-tuning and reinforcement learning contribute, their energy (compute) investment is not on the same scale as pre-training.

  • Next-Token Prediction as a successful Loss design: This objective minimizes human intervention, letting models learn from vast amounts of unlabeled text.

  • Transformer architecture as a parallel computer: Its highly parallel, compute-intensive nature, with controllable communication, aligns well with GPU capabilities.
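The next-token objective named above can be sketched as a simple cross-entropy loss: at every position, the model is scored on the probability it assigned to the token that actually came next. This is a minimal illustration with invented toy numbers, not code from any real model:

```python
import numpy as np

def next_token_loss(logits, targets):
    """Average cross-entropy of predicting each next token.

    logits:  (T, V) array of model scores for T positions over a V-token vocabulary
    targets: (T,) array holding the actual next token at each position
    """
    # Softmax over the vocabulary dimension (numerically stabilized)
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    # Negative log-likelihood of the observed next tokens, averaged over positions
    return -np.log(probs[np.arange(len(targets)), targets]).mean()

# Toy example: 3 positions, vocabulary of 4 tokens
logits = np.array([[2.0, 0.1, 0.1, 0.1],
                   [0.1, 3.0, 0.1, 0.1],
                   [0.1, 0.1, 0.1, 2.5]])
targets = np.array([0, 1, 3])
loss = next_token_loss(logits, targets)
```

Because the correct next token is simply the following word in raw text, this loss needs no human labeling, which is what makes web-scale pre-training data available.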

These elements collectively enabled continuous intelligence improvements as compute investment scaled from early models like GPT-1 and BERT to ChatGPT and Gemini. However, Yang You argues that the current paradigm struggles to fully utilize ever-growing compute. The issue is not a slowdown in GPU growth but the diminishing "digestive capacity" of models, loss functions, and optimization algorithms for the compute available.

Two diverging paths on a road, labeled 'Efficiency Improvement' and 'Intelligence Upper Limit'.

Yang You distinguishes between two types of progress:

  • Efficiency improvement: Achieving similar results with less compute or fewer parameters (e.g., pruning, distillation). This is vital for engineering but does not define the upper limit of intelligence.

  • Intelligence upper limit improvement: Training more capable and generalizable models under the same total computational constraint. This is the critical metric for continued intelligence leaps.
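To make the distinction concrete, here is a minimal sketch of knowledge distillation, the kind of "efficiency improvement" the first bullet names: a student model is trained to match a teacher's softened output distribution, which reproduces existing capability at lower cost but cannot exceed the teacher's ceiling. The logits and temperature below are invented for illustration:

```python
import numpy as np

def softmax(x, T=1.0):
    """Softmax over logits x at temperature T (higher T = softer distribution)."""
    z = x / T
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Zero when the student exactly reproduces the teacher: the student can
    match, but never surpass, the teacher's knowledge.
    """
    p = softmax(teacher_logits, T)  # teacher's target distribution
    q = softmax(student_logits, T)  # student's prediction
    return float(np.sum(p * (np.log(p) - np.log(q))))

teacher = np.array([4.0, 1.0, 0.5])
student_match = np.array([4.0, 1.0, 0.5])  # student mimics the teacher perfectly
student_off = np.array([0.5, 1.0, 4.0])    # student disagrees with the teacher
```

In Yang You's terms, optimizing this loss improves efficiency (a smaller model doing the teacher's job), whereas raising the intelligence upper limit would require training a stronger model under the same total compute budget.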

Future Directions: Consuming More Compute

Instead of focusing on "saving compute," Yang You's analysis suggests that future progress lies in "consuming more compute" more effectively, assuming cost is not a primary constraint. He outlines several potential directions:

  • Higher numerical precision: Moving from FP16 to FP32 or FP64 has not yet produced significant intelligence gains, but the approach remains underexplored.

  • Higher-order optimizers: Shifting from first-order gradient methods could offer "smarter" parameter update paths, though widespread adoption may take time.

  • More scalable model architectures or Loss functions: The goal here is to train stronger models under extreme compute limits, not merely to improve throughput or efficiency.

  • More thorough training and search: This includes optimizing epochs, hyperparameters, and the relationship between data and parameters beyond simply running more iterations.
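As one illustration of the "higher-order optimizers" direction, the sketch below contrasts a first-order gradient step with a Newton step on a deliberately ill-conditioned quadratic. Using curvature information (the Hessian) buys a smarter update path at the cost of much more compute per step; this is a textbook toy under assumed values, not Yang You's proposal:

```python
import numpy as np

# Ill-conditioned quadratic: f(x) = 0.5 x·Ax - b·x, with minimum at x* = A⁻¹b
A = np.array([[10.0, 0.0],
              [0.0, 0.1]])  # curvatures differ by 100x
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)

def grad(x):
    return A @ x - b

# First-order: gradient descent; the step size is capped by the largest
# curvature (must stay below 2/10), so the flat direction converges slowly.
x_gd = np.zeros(2)
for _ in range(100):
    x_gd -= 0.09 * grad(x_gd)

# Second-order: one Newton step rescales the gradient by the inverse Hessian
# (here the Hessian is just A) and lands on the minimum exactly.
x_newton = np.zeros(2) - np.linalg.solve(A, grad(np.zeros(2)))
```

The gradient-descent iterate is still far from the optimum after 100 steps, while the single Newton step solves the problem, at the price of forming and inverting curvature information, which is what makes such methods expensive at scale.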

Intricate, glowing circuit board traces, symbolizing advanced computational methods.

Yang You categorizes techniques like inference optimization, low-precision training, and distillation as "implementation level," distinct from those that advance the "intelligence upper limit."

The analysis concludes that if the past decade focused on "how to get more compute," the next phase will address "how to truly turn this compute into intelligence." It suggests that when compute grows but intelligence no longer "automatically upgrades," a re-evaluation of limiting factors is necessary.