Cursor Reveals Composer Series 'autoinstall' Training Technique: Using AI to Automatically Set Up RL Environments for AI
Cursor has disclosed the core training method behind its Composer series models: the previous-generation model automatically sets up runnable environments for the next generation's reinforcement learning (RL) training. Cursor calls the technique autoinstall.
The process has two steps:
Step 1: An Agent reads the documentation and configuration of the entire codebase and proposes 10 validation commands with expected outputs.
Step 2: Another Agent selects 3 of those commands and configures the environment from scratch until the commands run successfully, retrying up to 5 times; if all attempts fail, the environment is discarded.
The Agent actively fills in missing dependencies: it fakes database tables, stands up MinIO as a substitute for S3, launches Docker containers as sidecars, and even generates placeholder images. For example, in the blockchain project celo-org/celo-monorepo, the Agent failed in the first round, then created mock users in the second round to bypass authentication and ultimately got the tests to run.
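The two steps above can be sketched as a small control loop. This is only an illustrative sketch, not Cursor's implementation: the names `Check`, `autoinstall`, `propose`, `configure`, and `run` are hypothetical, the random selection stands in for the second Agent's own choice of 3 commands, and the success test (expected output appears in the command's output) is an assumption. Only the numbers 10, 3, and 5 come from the description above.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional

NUM_PROPOSED = 10   # validation commands proposed in step 1
NUM_SELECTED = 3    # commands the second Agent picks in step 2
MAX_ATTEMPTS = 5    # retries before the environment is discarded


@dataclass(frozen=True)
class Check:
    """A validation command plus the output it is expected to produce."""
    command: str
    expected_output: str


def autoinstall(
    propose: Callable[[], List[Check]],            # hypothetical planning Agent
    configure: Callable[[List[Check], int], None],  # hypothetical setup Agent
    run: Callable[[str], str],                      # runs a command, returns its output
    rng: Optional[random.Random] = None,
) -> Optional[List[Check]]:
    """Return the passing checks, or None if the environment is discarded."""
    rng = rng or random.Random(0)

    # Step 1: the planning Agent proposes validation commands.
    proposed = propose()[:NUM_PROPOSED]

    # Step 2: pick a subset (assumption: random choice stands in for the Agent).
    selected = rng.sample(proposed, k=min(NUM_SELECTED, len(proposed)))

    for attempt in range(1, MAX_ATTEMPTS + 1):
        # The setup Agent (re)configures the environment, e.g. adding mocks.
        configure(selected, attempt)
        # Assumed success criterion: every expected output shows up.
        if all(c.expected_output in run(c.command) for c in selected):
            return selected

    # All attempts failed: discard the environment.
    return None
```

In this sketch, an environment that only starts passing on the second attempt (as in the celo-monorepo example) returns its three checks, while one that never passes returns `None` after five attempts.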
After training, Composer 2 scored 61.7% on Terminal-Bench (which tests the model's ability to set up development environments), compared to 47.9% for Composer 1.5. Cursor stated that autoinstall should work even better once Composer 2 is the model running it, forming a positive feedback loop.
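To put the two scores in perspective, the gap can be expressed both in percentage points and as a relative improvement. The short calculation below uses only the two figures reported above; the variable names are illustrative.

```python
# Terminal-Bench scores reported in the article (percent).
composer_1_5 = 47.9
composer_2 = 61.7

# Absolute gain in percentage points.
absolute_gain = composer_2 - composer_1_5

# Relative gain over the Composer 1.5 baseline, in percent.
relative_gain = absolute_gain / composer_1_5 * 100

print(f"Absolute gain: {absolute_gain:.1f} percentage points")
print(f"Relative gain: {relative_gain:.1f}%")
```

That works out to a gain of 13.8 percentage points, or roughly 28.8% relative to Composer 1.5.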
Source: Public Information
ABAB AI Insight
This represents a significant breakthrough for Cursor in AI Agent self-iteration. One of the hardest parts of RL training is building a stable, repeatable environment: if the environment is not set up correctly, the model wastes many tokens debugging environment problems, potentially rendering an entire training cycle useless. The autoinstall method, with its dual-Agent split of "AI planning + AI execution," raises the success rate of environment setup and lets the model focus on learning complex tasks.
From a capital perspective, by letting the model provision its own training environments, Cursor has significantly reduced manual intervention and trial-and-error costs, speeding up iteration from Composer 1.5 to Composer 2. Once this self-reinforcing loop is operational, it will greatly widen the gap with competitors that rely solely on human-built environments.
Essentially, this is a technological substitution: Cursor replaces manual environment configuration and debugging with the previous-generation AI, shifting capital from extensive manual debugging work to an automated, self-evolving training pipeline. Mechanically, this is achieved through a closed loop of planning, execution, and validation with a limited number of retries, allowing AI to truly "train itself" and pushing AI coding Agents from "human assistants" toward "autonomously evolving systems."
ABAB News · Cognitive Law
The smartest AI is not the one best fed by humans, but the one that learns to set up its own training ground. When AI can automatically configure environments for AI, the training loop truly begins to self-accelerate. The leading advantage of the next-generation model often lies in the engineering of "having the previous-generation AI set up the next generation's training."