OpenAI's o3 Is Here: A Reasoning Model That Thinks Before It Speaks

OpenAI has released o3, the latest in its reasoning-first model family, and it represents a fundamentally different approach to AI problem-solving. Unlike traditional LLMs that generate tokens left-to-right in a single pass, o3 can allocate additional compute at inference time to work through problems step-by-step before producing a final answer.

The Core Idea: Compute-Scaled Reasoning

OpenAI describes o3 as a model that can "think at variable depth." For simple questions, it responds instantly. For problems requiring multi-step deduction — a math proof, a debugging session, a legal analysis — it can spend seconds or minutes reasoning internally before responding.

This is achieved through what the company calls Extended Chain-of-Thought training, where the model was trained on verified reasoning traces rather than just final answers.

What Makes o3 Different From o1

Longer internal reasoning chains — o3 can sustain reasoning across 10,000+ tokens internally
Self-correction — the model backtracks when it detects contradictions in its own chain
Tool-augmented reasoning — o3 can call code execution, search, and calculators mid-thought
Configurable compute budget — the API exposes a reasoning_effort parameter: low, medium, high

Key Benchmark Scores

| Task | o3 | o1 | GPT-4o | |---|---|---|---| | AIME 2025 | 96.7% | 83.3% | 9.3% | | SWE-bench Verified | 71.7% | 48.9% | 38.8% | | GPQA Diamond | 87.7% | 78.3% | 53.6% | | ARC-AGI | 87.5% | 32.0% | 5.0% |

Pricing and Access

o3 is available via the OpenAI API today. Pricing is $15 per million input tokens and $60 per million output tokens at high reasoning effort — reflecting the additional compute. A lighter o3-mini variant targeting coding and math is expected later this quarter at significantly lower cost.

The ARC-AGI result — a benchmark specifically designed to resist memorization — is the number that has the research community talking most. Whether o3 represents a genuine step toward general reasoning or an impressive but narrow capability remains the central debate.

OpenAI's o3 Is Here: A Reasoning Model That Thinks Before It Speaks

The Core Idea: Compute-Scaled Reasoning

What Makes o3 Different From o1

Key Benchmark Scores

Pricing and Access

Stay up to date with AI news

Discussion

Related Articles

OpenAI Crosses $25 Billion in Annualized Revenue, Signals IPO Could Come by Late 2026

The QuitGPT Reckoning: How OpenAI's Pentagon Deal Reshaped the AI Industry

OpenAI Closes Record $122 Billion Funding Round at $852 Billion Valuation as IPO Looms