Scale AI
Flagship · Infrastructure · USA · HQ San Francisco · Est. 2016
Data labeling + RLHF infrastructure for frontier labs.
Our take
The picks-and-shovels backbone for frontier AI labs and U.S. defense, commanding the critical human-data layer.
At a glance
- Best known for: Data labeling and RLHF for OpenAI, Meta, and US defense
- Biggest strength: Locked-in relationships with every major frontier lab
- Biggest risk: Customer concentration among a handful of frontier labs
- Stage: Series F
- Primary revenue: Enterprise contracts for data annotation, RLHF, model evals, and defense AI software
What they do
Scale AI operates the data infrastructure layer that sits between raw information and production-grade AI systems. Its core offering, the Scale Data Engine, is a software platform combined with a global workforce of human annotators that produces the labeled datasets, reinforcement-learning-from-human-feedback (RLHF) rankings, and fine-tuning examples required to train large language models, computer-vision systems, and autonomous agents. Frontier labs—including OpenAI, Meta, and Microsoft—rely on Scale to handle the high-complexity, high-volume data tasks that cannot yet be fully automated, effectively outsourcing the human-in-the-loop portion of their training pipelines.
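To make "RLHF rankings" concrete: one common way to use a human ranking of model responses is to expand it into pairwise preferences for a reward model to learn from. The sketch below illustrates that general idea only. The `PreferencePair` type, the `ranking_to_pairs` function, and the sample prompt and responses are all hypothetical and do not describe Scale's actual data format or tooling.

```python
from itertools import combinations
from typing import NamedTuple


class PreferencePair(NamedTuple):
    """One training example: the annotator preferred `chosen` over `rejected`."""
    prompt: str
    chosen: str
    rejected: str


def ranking_to_pairs(prompt: str, ranked_responses: list[str]) -> list[PreferencePair]:
    """Expand a best-to-worst ranking into (chosen, rejected) pairs."""
    # combinations() preserves input order, so the first element of each
    # pair is always the response the annotator ranked higher.
    return [
        PreferencePair(prompt, better, worse)
        for better, worse in combinations(ranked_responses, 2)
    ]


# Hypothetical annotator ranking, best response first.
pairs = ranking_to_pairs(
    "Explain what a LiDAR sensor does.",
    ["response A", "response B", "response C"],
)
for pair in pairs:
    print(pair.chosen, ">", pair.rejected)
```

A ranking of n responses yields n(n-1)/2 pairs, which is one reason ranked feedback is a dense source of training signal per annotation task.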
Beyond commercial AI, Scale has aggressively expanded into national security. Its Donovan product applies large language models to military decision-making, ingesting intelligence feeds and generating recommendations for commanders. This positions Scale as a dual-use infrastructure provider, selling both the training data for autonomy and the operational software that uses the resulting models. The company also runs SEAL (Safety, Evaluations, and Alignment Lab), a third-party evaluation and red-teaming practice that benchmarks frontier models for safety and capability, serving customers who need external validation before deployment.
Scale generates revenue primarily through multi-year enterprise contracts and government task orders. In the commercial segment, it charges for annotation volume, RLHF cycles, and evaluation services; in defense, it pursues software licensing and systems-integration contracts tied to Army modernization, robotic combat vehicles, and command-and-control AI programs. The company sits in the AI infrastructure category—specifically the human-data and evaluation sub-layer—making it a picks-and-shovels beneficiary of the broader generative-AI capex boom while also attempting to build durable software moats beyond manual annotation.
Origin story
Alexandr Wang founded Scale in 2016 with co-founder Lucy Guo while studying at MIT, and took the company through Y Combinator's Summer 2016 batch. The initial vision was to provide higher-quality data annotation than crowdsourced marketplaces like Amazon Mechanical Turk, specifically targeting autonomous-vehicle developers who needed precisely labeled camera and LiDAR feeds. By blending machine-learning pre-labeling with a managed human workforce, Scale promised both speed and accuracy. Guo departed in 2018, leaving Wang as the driving force behind the company's rapid expansion from self-driving cars into general AI data services.
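The idea of blending machine-learning pre-labeling with human review can be sketched as a simple routing rule: a model proposes a label with a confidence score, confident labels are accepted automatically, and the rest go to a human queue. This is a generic illustration under stated assumptions, not Scale's pipeline. The `PreLabel` type, the `route` function, the 0.95 threshold, and the sample frames are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PreLabel:
    item_id: str
    label: str
    confidence: float  # model's confidence in its own label, 0.0 to 1.0


def route(prelabels: list[PreLabel], auto_accept_at: float = 0.95):
    """Split model pre-labels into an auto-accepted list and a human-review list."""
    accepted, needs_review = [], []
    for p in prelabels:
        # Labels at or above the (assumed) threshold skip human review.
        if p.confidence >= auto_accept_at:
            accepted.append(p)
        else:
            needs_review.append(p)
    return accepted, needs_review


# Hypothetical batch of camera frames with model-proposed labels.
batch = [
    PreLabel("frame-001", "pedestrian", 0.99),
    PreLabel("frame-002", "cyclist", 0.62),
    PreLabel("frame-003", "vehicle", 0.97),
]
accepted, needs_review = route(batch)
print([p.item_id for p in accepted])
print([p.item_id for p in needs_review])
```

Raising the threshold sends more items to human annotators (slower, more accurate), while lowering it does the opposite, which is the speed-versus-accuracy trade-off the paragraph describes.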
Scale's defining inflection point arrived with the large language-model wave. As GPT-class models required RLHF and complex safety evaluations, Scale repositioned its workforce and tooling to serve frontier labs as a trusted training partner rather than a commodity outsourcer. This pivot coincided with a series of mega-rounds that valued the company as strategic AI infrastructure rather than a mere services vendor. In parallel, Wang cultivated deep relationships with the U.S. Department of Defense, culminating in the launch of Scale Donovan and major Army autonomy contracts. By the time Scale closed its $1 billion Series F in 2024 at a $13.8 billion valuation, it had transformed from a small YC startup into the default human-data layer for both Silicon Valley's most valuable AI labs and America's national-security establishment.
Key products
Scale Data Engine
An integrated platform and workforce for data annotation, RLHF, and fine-tuning that supplies labeled datasets to train frontier AI models.
Scale Donovan
An LLM-powered decision-support system built for defense and intelligence customers, designed to accelerate military planning and command operations.
SEAL
A safety-evaluation and red-teaming lab that benchmarks frontier models and provides third-party assessments of capability and alignment risks.
Leadership
- Alexandr Wang · Founder & CEO
  MIT dropout who founded Scale in 2016 and has led its expansion into frontier AI and defense.
Funding history
- 2016 · Seed · Undisclosed · Y Combinator
- 2018 · Series A · $18M · Accel, Y Combinator
- 2024 · Series F · $1B · Accel
Strengths & risks
Strengths
- Trusted supplier to every major frontier lab, including OpenAI, Meta, and Microsoft
- Massive managed workforce plus automated quality controls at enterprise grade
- Dual-use revenue streams spanning commercial AI training and US defense contracts
- High switching costs once integrated into frontier model training pipelines
- Category-defining brand that attracts top AI safety and engineering talent
Risks
- Revenue heavily concentrated among a handful of frontier lab customers
- Frontier labs are investing heavily in synthetic data and in-house annotation teams
- Defense contracts expose the company to shifting political priorities and export controls
- Margin compression risk as RLHF and data services commoditize over time
- Execution challenge in evolving from a services-heavy model to sticky software
Recent moves
Closed $1B Series F at $13.8B valuation
May 2024 · Accel led the round, funding Scale's push into defense software and next-generation data infrastructure for frontier AI labs.
Expanded Scale Donovan across US defense agencies
2024 · The company broadened deployment of its military decision-support platform within Department of Defense command-and-control programs.
Advanced SEAL safety evaluations and public red-teaming
2023–2024 · Scale's SEAL initiative released public benchmarks and conducted third-party model assessments to establish industry safety standards.
Competitive position
Scale AI owns the high end of the data-labeling market for frontier AI, leaving legacy competitors such as Appen, Telus International, and Labelbox to compete for lower-margin computer-vision and mid-market work. Where Scale wins is on trust and complexity: frontier labs treat it as an extension of their own training organizations, granting access to sensitive pre-release models and paying premium rates for difficult RLHF and evaluation tasks that commodity vendors cannot handle. This relationship depth creates a formidable moat, but it is not unbreachable—labs are actively building internal data teams and synthetic-data pipelines to reduce dependency.
In defense, Scale competes less with traditional data vendors and more with Palantir, Anduril, and legacy primes on decision-AI and autonomy software. Its AI-native pedigree and existing security clearances give it credibility with the Pentagon, yet long-cycle government sales and integration complexity remain a steep learning curve for a company built on fast-moving commercial tech cycles. Scale's strategic challenge is to prove it is a software platform rather than a labor-intensive services business before its core annotation market commoditizes.
What to watch
- 01 Revenue concentration from the top three frontier-lab customers
- 02 Gross-margin trends as synthetic data and automation replace human annotation
- 03 Donovan contract wins versus Palantir and established defense contractors
- 04 SEAL adoption as a recognized third-party safety standard by regulators or labs
- 05 Utilization and headcount trends in Scale's global annotator workforce
Frequently asked questions
Does Scale AI build its own foundation models?
No. Scale provides the data infrastructure, annotation, and evaluation services that other companies use to train and align their own models.
Who are Scale's biggest customers?
Scale serves every major frontier AI lab, along with the U.S. Department of Defense and other government agencies through classified and unclassified contracts.
What is Scale Donovan?
Donovan is Scale's LLM-powered decision-support platform built for defense customers, helping military operators analyze intelligence and accelerate command decisions.
How does Scale differ from cheaper data-labeling providers?
Scale focuses on high-complexity, high-security work such as RLHF and red-teaming for frontier models, rather than commodity image or text tagging.
What is SEAL?
SEAL (Safety, Evaluations, and Alignment Lab) is Scale's initiative to benchmark, evaluate, and red-team frontier AI models for safety and capability risks.
Is Scale profitable?
Scale does not disclose financials, but its $1 billion late-stage raise suggests it is prioritizing platform expansion and defense growth over near-term profit.
Why did Scale raise $1B in a Series F round?
The capital supports a dual strategy: deepening commercial data-engine capabilities and scaling Donovan and other defense products for national-security customers.
The bottom line
Scale AI sits at a critical chokepoint in the AI stack: without high-quality labeled data and RLHF, frontier models cannot improve reliably. As long as labs continue racing toward AGI, Scale is positioned to capture a growing share of training budgets, and its push into defense software via Donovan opens a second, highly defensible revenue stream.
However, the company faces a classic picks-and-shovels dilemma—its customers are aggressively investing in automation and synthetic data to reduce reliance on human annotators, while well-funded competitors and in-house teams threaten margin and lock-in. The next 12–18 months will test whether Scale can evolve from a high-margin services layer into a sticky software platform before commoditization accelerates. If it succeeds, it could become the indispensable infrastructure provider for both commercial and national-security AI; if not, it risks being disintermediated by the very labs it serves.