Coinbase Founder Brian Armstrong Shares AI Spending Control Practices as Company Experiments with Switching Its Default Model to Open-Weight Models
Coinbase founder Brian Armstrong said that better defaults, routing, and caching, rather than friction and spending alerts, can keep AI spending stable even as token usage grows exponentially.
The company is experimenting with switching its default model to open-weight models such as GLM 5.2 and Kimi 2.7. The switch is implemented through an LLM gateway, and engineers remain free to select other models. Measures such as higher cache hit rates and context simplification have nearly halved AI spending.
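As a rough illustration of the gateway pattern described above, the following minimal Python sketch combines a company-wide default model, a per-call override that preserves engineer choice, and a response cache with a measurable hit rate. It is not Coinbase's implementation: the `Gateway` class, the model identifiers `glm-5.2` and `frontier-model`, and the stub backends are all hypothetical. It also uses exact-match response caching as a simplification, whereas production systems often rely on provider-side prompt-prefix caching.

```python
import hashlib
from typing import Callable, Dict, List, Optional

# Hypothetical identifier; the article names GLM 5.2 and Kimi 2.7 as candidate defaults.
DEFAULT_MODEL = "glm-5.2"


class Gateway:
    """Toy LLM gateway: default-model routing plus an exact-match response cache."""

    def __init__(self, backends: Dict[str, Callable[[str], str]],
                 default_model: str = DEFAULT_MODEL) -> None:
        self.backends = backends
        self.default_model = default_model
        self.cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def complete(self, prompt: str, model: Optional[str] = None) -> str:
        # Engineers may pass `model` explicitly; otherwise the default applies.
        chosen = model or self.default_model
        if chosen not in self.backends:
            raise KeyError(f"unknown model: {chosen}")
        # Cache key covers both model and prompt so overrides never share entries.
        key = hashlib.sha256(f"{chosen}\x00{prompt}".encode("utf-8")).hexdigest()
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        reply = self.backends[chosen](prompt)
        self.cache[key] = reply
        return reply

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Stub backends stand in for real (billable) model calls and record each call.
calls: List[str] = []


def make_stub(name: str) -> Callable[[str], str]:
    def stub(prompt: str) -> str:
        calls.append(name)
        return f"[{name}] {prompt}"
    return stub


gateway = Gateway({
    "glm-5.2": make_stub("glm-5.2"),
    "frontier-model": make_stub("frontier-model"),
})
first = gateway.complete("Summarize this ticket")        # routed to the default
second = gateway.complete("Summarize this ticket")       # served from the cache
override = gateway.complete("Summarize this ticket", model="frontier-model")
```

In this sketch the second identical request never reaches a backend, so only two of the three requests incur a model call. The default lives in one place, so changing it requires no change to calling code, and an explicit `model` argument still wins.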
In market terms, corporate AI cost optimization accelerates investment in internal infrastructure, shifting spending from raw model calls toward routing, caching, and optimization tooling. Event-driven AI can then scale sustainably, which benefits companies that provide efficient inference infrastructure and puts pressure on those that merely sell computing power.
Source: Public Information
ABAB AI Insight
Under Brian Armstrong, Coinbase is integrating AI deeply into its crypto business and controlling costs through a customized LLM gateway and internal toolchains. The approach pairs engineers' freedom of use with company-level spending control, aiming for sustainable growth under exponential adoption.
In capital terms, Coinbase is committing engineering resources to optimizing its AI infrastructure. The motive is to support exponential business growth while holding down marginal costs; the strategy is to build an efficient internal AI platform that strengthens overall competitiveness.
Just as large tech companies cut costs through caching and routing optimizations in the cloud era, the AI era is now at a critical transition from indiscriminate token consumption to intelligent resource allocation.
In essence, this is a case of technological substitution and capital concentration. Better defaults, routing, and caching sharply reduce wasted computation and replace inefficient manual monitoring and limits. Capital is concentrating in the platforms and enterprises that master AI inference optimization, and pricing power is shifting from raw compute providers to those that improve efficiency across the entire chain.
ABAB News · Cognitive Law
Friction from limits is temporary; intelligent optimization endures. Exponential growth relies on infrastructure rather than austerity.
Default routing and caching control costs without reducing engineers' freedom; sustainability is the core of scaling.
Engineers' heavy, unoptimized calls are temporary, while enterprises build intelligent gateways for the long term; top capital sells the structural tools of AI efficiency.