Scale AI
Scale AI is a data infrastructure company that provides high-quality training data, evaluation, and AI testing services for companies building AI models and applications. Founded in 2016 by Alexandr Wang, Scale AI specializes in data labeling, annotation, and reinforcement learning from human feedback (RLHF) at the quality levels required for training frontier AI models.
Scale AI has been central to how frontier research labs train their flagship models. The company became OpenAI's preferred partner for fine-tuning GPT-3.5 — work that contributed directly to ChatGPT — and has counted OpenAI, Google, Microsoft, Meta, xAI, and Anthropic among its customers. Leading labs each reportedly spend on the order of a billion dollars per year on human-provided training data, with Scale supplying the expert annotators, RLHF preference data, and evaluation pipelines used to make models more helpful, more accurate, and better aligned.
Scale has evolved beyond data labeling into a broader AI platform. Its Safety, Evaluations, and Alignment Lab (SEAL), led by Summer Yue (formerly Google DeepMind's RLHF research lead), publishes benchmarks that assess frontier model capabilities and safety. Its Donovan platform serves government and defense AI applications, and its enterprise platform helps organizations prepare proprietary data for AI use.
In June 2025, Meta took a 49% non-voting stake in Scale AI in a $14.3 billion deal that brought Alexandr Wang to Meta to co-lead its new Superintelligence Labs. Jason Droege, previously chief strategy officer, became Scale's interim CEO. The Meta tie-up reshaped Scale's frontier-lab relationships: OpenAI publicly wound down its partnership, and Google, Microsoft, and xAI began diversifying to other data suppliers, a shift that has accelerated the rise of competitors such as Surge AI and Turing.
In the agentic economy, Scale AI supplies the data-quality infrastructure that bounds how capable AI agents can become: the quality of training data and the rigor of evaluation directly limit the quality of model outputs and agent behavior.