Flash News

Musk Admits Serious Deficiencies in Grok 4.2; Internal 9th-Generation 1.5T Model Finishes Training

Musk publicly acknowledged on X that Grok 4.2 has serious deficiencies, disclosing that it is built on the internal 8th-generation foundational model: only 0.5T parameters, trained on Hopper-architecture GPUs, with significant flaws in data quality, coverage, and mix.

The internal 9th-generation foundational model has completed training, expanding to 1.5T parameters and optimized specifically for Blackwell-architecture GPUs, with comprehensive upgrades in data processing, training recipes, and model scale. Musk stated that the 8th and 9th generations are not on the same level, and that future training will incorporate Cursor data, the first mention of such data integration since SpaceX's acquisition of Cursor.

The SpaceX AI pre-training team has been reduced to a handful of members following the departure of its leader, with more than 50 core R&D staff gone. The team had previously committed to releasing a new foundational model every two weeks, yet the 1T-parameter Grok 4.4, originally scheduled for early May, has still not shipped. The new 1.5T version matches the Grok 4.5 specifications on the roadmap.

Source: Public Information

ABAB AI Insight

Musk has previously pushed rapid Grok iteration through xAI's roadmap, achieving multiple parameter and architecture leaps from Grok 1 to Grok 3 across 2024–2025. His rare public admission of Grok 4.2's deficiencies is consistent with his characteristic transparency, while also exposing a long-standing disconnect between internal and publicly released versions.

On the capital side, xAI gained direct training-data injection after acquiring Cursor through the SpaceX ecosystem, and rapidly shifted resources from its Hopper cluster to large-scale training on Blackwell. The motivation is to catch up with OpenAI and Anthropic in model scale while reducing external dependencies through a closed data loop, accelerating the generational jump from 0.5T to 1.5T parameters.

Much as Meta's Llama series publicly adjusted its roadmap several times in 2023 under data and scaling pressure, and early versions of Anthropic's Claude repeatedly missed deadlines, xAI now sits at the bottleneck of transitioning from "rapid release" to "high-quality scaling," with heavy core-talent attrition further amplifying execution risk.

At root, this is a generational technology turnover: foundational-model training depends heavily on compute architecture, data quality, and talent density. The jump from the 8th to the 9th generation pursues a step-change in capability through Blackwell optimization and Cursor data injection, making hardware generational turnover and data-asset acquisition the decisive variables for overtaking rivals, and forcing the industry to shift from "parameter competition" to "full-stack closed-loop control."

ABAB News · Cognitive Law

Acknowledging deficiencies accelerates the next leap more than covering them up.
The cost of talent loss will ultimately be paid in computing power and data.
Public versions are marketing; internal versions are the real battlefield.
