Microsoft Ships Copilot Critique — GPT Drafts, Claude Reviews
Microsoft announced a new “Critique” feature for Copilot Researcher that pairs OpenAI’s GPT and Anthropic’s Claude in a single workflow. GPT generates a research response, then Claude reviews it for accuracy, completeness, and citation quality before the user sees it. The multi-model approach scores 13.8% higher on the DRACO benchmark than standalone research tools from OpenAI, Google, Perplexity, and Anthropic.
This is a big deal for enterprise AI. Microsoft is essentially admitting that no single model is reliable enough for business-critical research, so it’s using a second model as a safety net. The move also signals that Anthropic’s API is now deeply embedded in Microsoft’s stack — a remarkable shift from the days when Microsoft was exclusively an OpenAI shop. Microsoft says the workflow will eventually run bidirectionally, with Claude drafting and GPT critiquing as well.
Copilot Cowork Launches in Frontier Program — Powered by Claude
Alongside Critique, Microsoft also shipped Copilot Cowork to its Frontier early-access program. Cowork is a desktop agent that can plan tasks, delegate work across apps, and automate multi-step workflows — built on Anthropic’s Claude models. It’s Microsoft’s answer to Anthropic’s own Cowork product, but integrated directly into the Microsoft 365 ecosystem.
The Frontier rollout means a limited group of enterprise customers get access first, with broader availability expected later this year. For Anthropic, this is a major distribution win — Claude is now powering core productivity features inside the world’s most widely used office suite.
Web Search and Code Execution Now Generally Available
Anthropic quietly moved web search and programmatic tool calling to general availability — no beta header required. API code execution is now free when used with web search or web fetch, a notable pricing change that makes Claude more competitive for agentic workflows. Fine-grained tool streaming is also GA across all models and platforms.
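Under the GA terms above, a single Messages API request can pair web search with code execution and no longer needs a beta header. Here is a minimal sketch of the request body; the tool `type` version strings and the model identifier are assumptions for illustration, so check the current Anthropic API reference before relying on them:

```python
# Sketch of a Messages API request body combining web search and code
# execution. The tool "type" version strings and model name below are
# assumptions, not confirmed values -- verify against the API reference.
import json

request_body = {
    "model": "claude-sonnet-4-5",  # assumed model identifier
    "max_tokens": 1024,
    "tools": [
        {"type": "web_search_20250305", "name": "web_search"},          # assumed version tag
        {"type": "code_execution_20250522", "name": "code_execution"},  # assumed version tag
    ],
    "messages": [
        {
            "role": "user",
            "content": "Find the latest CPI figure and chart the 12-month trend.",
        }
    ],
}

# No network call here -- this only demonstrates the request shape.
print(json.dumps(request_body, indent=2).splitlines()[1])
```

Because code execution is free when paired with web search or web fetch, requests shaped like this are where the pricing change actually lands.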
The Model API also got a useful addition: capability fields. GET /v1/models now returns max_input_tokens, max_tokens, and a capabilities object so developers can programmatically discover what each model supports. Small change, big quality-of-life improvement for multi-model orchestration.
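For multi-model orchestration, those capability fields let a router pick a model at runtime instead of hardcoding limits. A sketch of that pattern follows; the sample payload is hypothetical (real values come from GET /v1/models), and only the three fields named above are assumed to exist:

```python
# Sketch: selecting a model via the capability fields described above.
# The sample entries are hypothetical stand-ins for what GET /v1/models
# would return; only max_input_tokens, max_tokens, and the capabilities
# object are taken from the announcement.
sample_models = [
    {"id": "model-a", "max_input_tokens": 200_000, "max_tokens": 8_192,
     "capabilities": {"tool_use": True, "vision": True}},
    {"id": "model-b", "max_input_tokens": 1_000_000, "max_tokens": 32_000,
     "capabilities": {"tool_use": True, "vision": False}},
]

def pick_model(models, min_input_tokens, needs=()):
    """Return the first model meeting a context-size floor and capability set."""
    for m in models:
        if m["max_input_tokens"] >= min_input_tokens and all(
            m["capabilities"].get(cap) for cap in needs
        ):
            return m["id"]
    return None  # no model satisfies the constraints

print(pick_model(sample_models, 500_000, needs=("tool_use",)))  # → model-b
```

The point is that discovery replaces guesswork: the router asks the API what each model supports rather than shipping a lookup table that goes stale.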
Tool Helpers Hit Beta in Python and TypeScript SDKs
New tool helper utilities are now in beta for both the Python and TypeScript SDKs. They simplify tool creation with type-safe input validation and include a tool runner for automated tool handling in conversations. If you’ve been building agentic workflows with manual tool dispatch, this should cut a lot of boilerplate.
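For a sense of what that boilerplate looks like, here is a hand-rolled version of the pattern the helpers replace: a tool registry with ad-hoc input validation and manual dispatch. Everything in this sketch is illustrative; none of these names are part of the Anthropic SDKs:

```python
# Illustrative hand-rolled tool dispatch -- the kind of boilerplate the new
# SDK tool helpers aim to cut. All names here are made up for the example;
# they are not Anthropic SDK APIs.
TOOLS = {}

def register(name, schema):
    """Register a tool handler alongside a simple required-keys schema."""
    def wrap(fn):
        TOOLS[name] = (fn, schema)
        return fn
    return wrap

@register("get_weather", schema={"city": str})
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub result for the example

def dispatch(tool_name, tool_input):
    """Validate the model's tool input against the schema, then call the handler."""
    fn, schema = TOOLS[tool_name]
    for key, typ in schema.items():
        if not isinstance(tool_input.get(key), typ):
            raise TypeError(f"{tool_name}: {key} must be {typ.__name__}")
    return fn(**tool_input)

print(dispatch("get_weather", {"city": "Oslo"}))  # → Sunny in Oslo
```

Multiply that registry, validation, and dispatch loop across a dozen tools and an agentic conversation loop, and the appeal of type-safe helpers plus an automated tool runner is obvious.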
Also worth noting: the 1M token context window is now in beta for Claude Opus 4.6, joining Sonnet 4.5 and Sonnet 4. Data residency controls are live too — you can now specify US-only inference at 1.1x pricing for models released after February 1.
Claude Paid Subscriptions More Than Double in 2026
TechCrunch reports that Claude paid subscriptions have more than doubled this year, based on analysis of billions of anonymized credit card transactions from roughly 28 million U.S. consumers. Record numbers of new paid subscribers were added between January and February, with growth continuing into March. Most new subscribers are opting for the $20/month Pro tier.
The growth drivers are a mix of marketing and product. The Super Bowl campaign that mocked a competitor’s decision to show ads boosted awareness. Claude Code and Cowork pulled in developer and productivity users. And the Pentagon dispute generated weeks of free media coverage that turned Anthropic into a household name. Despite the surge, Claude still trails ChatGPT in overall market share — but it’s closing ground fast in the paid segment.
The Final Day of Anthropic’s Most Consequential Month
March 2026 ends today, and it’s been a month for the history books. The final tally: 14+ product launches (including Claude Code auto mode, computer use, Excel/PowerPoint add-ins, voice mode, web search GA, structured outputs GA, and Claude Marketplace), five service outages, a landmark court victory against the Pentagon, Bloomberg-reported IPO plans at $60B+, a data leak revealing Claude Mythos, the Claude for Open Source program, and Copilot Critique. All in 31 days.
The Pentagon feud was the accelerant that turned every product announcement into mainstream news. Anthropic went from a company known mainly to developers to one that made the evening news. MCP hit 97 million monthly SDK downloads. Daily signups exceeded 1 million. Whether April can sustain this pace — or whether the company even wants it to — is the open question heading into Q2.
IPO Countdown: October Window, $60B+ Target
As March closes, Anthropic’s IPO preparations remain front and center. Bloomberg reports the company is targeting a raise of over $60 billion with an October listing window. Goldman Sachs, JPMorgan, and Morgan Stanley are in early talks for leading roles. REX Shares and Tuttle Capital have already filed for leveraged products tied to future Anthropic shares.
The IPO narrative has only gotten stronger this month. Revenue approaching $20B ARR, paid subs doubling, Claude Code on a $2.5B run rate, and a court ruling that essentially validated the company’s safety-first stance. The public benefit corporation structure will be the wildcard — investors will need to accept that Anthropic will sometimes choose safety over speed.
Microsoft’s Multi-Model Bet Is a Validation of Anthropic’s Approach
The Copilot Critique announcement is more significant than it looks. Microsoft spent years as OpenAI’s exclusive distribution partner. Now it’s publicly shipping a feature where Claude checks GPT’s work — and touting the accuracy improvement. That’s not just a technical decision; it’s a strategic admission that the single-model era is over.
For Anthropic, the timing is perfect. Heading into an IPO, having Microsoft as a paying API customer who publicly credits Claude for improving Copilot accuracy is about as good as it gets. It positions Anthropic not just as a competitor to OpenAI, but as a complement — the model you bring in when accuracy matters most. And in enterprise, accuracy always matters most. The 13.8% DRACO improvement is the kind of number that shows up in sales decks for years.