Anthropic Negotiates to Acquire Fractile's Inference Chips, Continuing Decentralized Supply Chain Strategy
Anthropic is negotiating with London-based startup Fractile to acquire its inference chips, which are expected to enter mass production in 2027. The chips use SRAM instead of HBM, cutting data movement to lower power consumption and cost, an approach similar to those of Cerebras and Groq.
The agreement is still at an early stage and its scale is unknown, but it has become a key selling point in Fractile's new funding round of over $100 million, in which Fractile is seeking a valuation above $1 billion.
Anthropic continues to diversify its chip supply: it has rented cloud servers from Google and Amazon, purchased Google's in-house chips, and is also considering developing its own inference chips.
Source: Public Information
ABAB AI Insight
Anthropic has previously made large-scale purchases of NVIDIA servers through Microsoft Azure. Its negotiations with Fractile continue its strategy of decentralizing the supply chain and reducing reliance on any single supplier, mirroring the paths taken by OpenAI and Meta, which are developing their own chips to address high inference costs and a shortage of computing power.
In terms of capital strategy, Anthropic is directing funds toward a diverse range of chip suppliers and potential in-house projects, while Fractile is leveraging the prospective large order to drive a high-valuation financing round. Capital is shifting from traditional GPU reliance toward memory-optimized architectures, motivated by the need to improve gross margins by reducing inference power consumption and to alleviate the compute bottleneck created by surging demand for Claude.
With startups like Groq and Cerebras entering the inference market on the strength of novel architectures, major AI companies are in a transition phase of diversifying their supply chains and optimizing costs. Anthropic is moving from heavy-asset procurement toward a mixed model of in-house development and purchasing.
This essentially amounts to a restructuring of the industry chain. The traditional reliance on HBM is being challenged by memory-optimized solutions such as SRAM, driven by bottlenecks in inference cost and energy consumption; as a result, funds are flowing from the NVIDIA-dominated GPU ecosystem toward startups with emerging architectures. Pricing power is shifting to companies that possess low-power, high-efficiency inference technology, as explosive growth in demand for large-model inference forces a rebuild of backend infrastructure.
ABAB News · Cognitive Law
The deeper the reliance on a single supplier, the more urgent the need to diversify risk.
Where the cost pain points lie is where new architectures rise.
When the supply chain decentralizes, innovative small firms gain leverage from large orders.