Claude 4 Arrives: Anthropic Bets on Safety-First Intelligence
Anthropic's Claude 4 pushes the frontier on helpfulness while doubling down on its safety-first philosophy. We break down what's new and what it means for enterprise AI.
Anthropic has officially released Claude 4, and the company isn't shy about its positioning: this is a model built first and foremost to be safe, then helpful. After months of rumors and a small private beta, Claude 4 is now available via the Anthropic API and on Claude.ai.
What's New in Claude 4
Improved Constitutional AI
Claude 4 uses a revised version of Anthropic's Constitutional AI framework. The training process now incorporates a larger set of constitutional principles, including a new emphasis on epistemic humility: the model is more likely to express genuine uncertainty than to state incorrect information with confidence.
Extended Context Window
Claude 4 ships with a 500,000-token context window, enabling it to reason over entire codebases, lengthy legal documents, or multi-year research datasets in a single conversation.
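For a rough sense of what fits in a window that size, the sketch below estimates token counts from character counts. The 4-characters-per-token ratio is an assumption (a common rule of thumb for English text, not the model's actual tokenizer), and `estimated_tokens`, `fits_in_context`, and the `reserve_for_output` default are hypothetical names and values chosen for illustration.

```python
# Context window size quoted in the article.
CONTEXT_WINDOW_TOKENS = 500_000

# Assumption: roughly 4 characters per token for English text.
# A real tokenizer will give different counts, especially for code.
CHARS_PER_TOKEN = 4


def estimated_tokens(text: str) -> int:
    """Estimate the token count of a string, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(texts, reserve_for_output: int = 4_000) -> bool:
    """Check whether all texts, plus room for a reply, fit in the window."""
    total = sum(estimated_tokens(t) for t in texts)
    return total + reserve_for_output <= CONTEXT_WINDOW_TOKENS
```

Under this heuristic, 500,000 tokens corresponds to about 2 million characters of input, minus whatever is reserved for the response. An exact check would need counts from the real tokenizer.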
Better Agentic Behavior
Anthropic has significantly improved Claude 4's ability to operate as an autonomous agent. In internal evaluations, Claude 4 completed complex multi-step tasks in controlled environments 73% of the time, up from 41% for Claude 3 Opus.
The Safety Angle
Anthropic released a detailed model card alongside Claude 4, including red-team results and a breakdown of refusal behaviors. Notably, the company claims a 60% reduction in "over-refusals" — cases where the model declines requests it shouldn't — while simultaneously improving resistance to jailbreaks.
"We believe these aren't in tension. A model that refuses reasonable requests isn't safe — it's just useless." — Dario Amodei, Anthropic CEO
Pricing
| Model | Input (USD per 1M tokens) | Output (USD per 1M tokens) |
|---|---|---|
| Claude 4 Opus | $18 | $90 |
| Claude 4 Sonnet | $4 | $20 |
| Claude 4 Haiku | $0.30 | $1.50 |
Early enterprise adopters report that Claude 4 Sonnet hits a compelling price-performance sweet spot for customer-facing applications, while Opus remains the choice for complex reasoning and autonomous agent workloads.
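To see how the per-million-token rates in the pricing table translate into per-request costs, here is a minimal sketch. The prices are copied from the table; the dictionary keys are display names taken from the table rather than API model identifiers, and `estimate_cost` is a hypothetical helper, not part of any Anthropic SDK.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the pricing table.
# Keys are the table's display names, not API model identifiers.
PRICES_PER_MILLION = {
    "Claude 4 Opus": (Decimal("18"), Decimal("90")),
    "Claude 4 Sonnet": (Decimal("4"), Decimal("20")),
    "Claude 4 Haiku": (Decimal("0.30"), Decimal("1.50")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for a single request."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # Example request: 2,000 input tokens and 500 output tokens.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${estimate_cost(name, 2_000, 500)}")
```

For that example request, the arithmetic works out to $0.081 on Opus, $0.018 on Sonnet, and $0.00135 on Haiku, which illustrates why output-heavy workloads dominate the bill: output tokens cost five times as much as input tokens on every tier.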