AI & Tech Review ⚡
📋 Exec Summary
Q2 2025 was the quarter the frontier-model race became a three-body problem. Anthropic shipped the Claude 4 family (Opus 4, Sonnet 4) on May 22, delivering a step-function improvement in sustained agentic work and extended thinking. Google used I/O to unveil Gemini 2.5 Pro alongside Project Mariner and Jules. OpenAI pre-announced GPT-5 and continued pushing o-series reasoning models. The infrastructure layer matured just as fast: MCP crossed 8 million server downloads, AI coding tools reached mainstream developer adoption, and computer use moved from demo to product. Anthropic's valuation trajectory ($61.5B in March, toward $183B by September) and Cursor's approach to $2B ARR validated the enterprise AI market at unprecedented scale.
📊 What Moved
| Metric | From | To |
|---|---|---|
| Frontier coding benchmark (SWE-bench) | ~55-60% (Sonnet 3.7) | 72.5% (Claude Opus 4). Step-function improvement. |
| Best reasoning model | o1-pro and Sonnet 3.7 extended | Opus 4 extended thinking and Gemini 2.5 Pro |
| Agentic workflow ceiling | Single-session, minutes | Multi-hour sustained agentic tasks |
| MCP ecosystem | ~2M server downloads | 8M+ downloads, 5,800+ servers, 300+ clients |
| AI coding tool penetration | Early adopter | ~18% dev adoption (Cursor); Claude Code GA |
| Computer use | Research preview (Anthropic) | Product feature; Google Mariner in public beta |
| Enterprise AI revenue (Anthropic) | | ~$875M annualized and accelerating; trajectory toward $183B valuation by Sep |
📈 Trend Arcs
1. The Agentic Inflection -- Velocity: Accelerating
Models shifted from "answer questions" to "do work." Claude Opus 4 demonstrated multi-hour autonomous coding sessions with measurably lower failure rates than any prior system. Google shipped Project Mariner (a browser agent handling up to ten parallel tasks) and Jules (an autonomous coding agent in public beta). Tool use became a standard capability, and Anthropic's computer use moved from research preview to product feature. The gap between "can demo" and "runs in production" narrowed materially -- the remaining blockers are largely organizational (trust, workflow integration, scope management), not technical.
Key evidence: Google's Agent Development Kit (ADK) launched for Python and Java. The Agent2Agent (A2A) protocol was introduced for cross-agent communication. Enterprise agent infrastructure (orchestration, monitoring, guardrails) became a funded startup category.
2. AI Coding Becomes Table Stakes -- Velocity: Accelerating
Cursor reached 18% developer adoption in a January 2026 survey, reflecting Q2-Q3 2025 usage. Claude Code launched as a terminal-native coding agent tied directly to Opus 4. Windsurf (Codeium) raised funding and was later acquired by Cognition AI; terms were not disclosed. GitHub Copilot held 29% overall but, for the first time, faced real competition from tools offering full agentic workflows rather than line-level autocomplete.
The market bifurcated: autocomplete (Copilot, Supermaven) vs. agentic coding (Claude Code, Cursor Composer, Devin). Supermaven's 72% acceptance rate showed autocomplete is mature; the growth vector shifted to multi-file, multi-step agent workflows. Average deal sizes in the AI developer tools category increased significantly as enterprise procurement formalized.
3. Protocol Standardization (MCP) -- Velocity: Accelerating
OpenAI adopted MCP across ChatGPT desktop and Agents SDK in March. Google DeepMind confirmed Gemini MCP support in April. Downloads grew from ~100K (Nov 2024) to 8M+ (Apr 2025) with 5,800+ servers and 300+ clients. The MCP Registry launched and onboarded close to 2,000 entries by September.
By quarter-end, MCP was the de facto standard for connecting AI models to external tools and data sources, eliminating the fragmentation risk that plagued earlier tool-use approaches. The remaining question: governance. MCP was still Anthropic-stewarded, but multi-company foundation governance was under discussion (formalized in December 2025 under the Linux Foundation). Builders who adopted MCP in Q2 avoided rework; those who built proprietary connectors face migration costs.
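To make the "standard connector" point concrete, here is a minimal sketch of what an MCP tool call looks like on the wire. MCP messages are JSON-RPC 2.0, with `tools/list` for discovery and `tools/call` for invocation. The helper `mcp_request`, the tool name `search_tickets`, and its arguments are hypothetical, chosen only for illustration; a real client would use an SDK and perform the initialization handshake first.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids; any unique value per request works


def mcp_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request of the kind an MCP client sends."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask a server which tools it exposes.
list_tools = mcp_request("tools/list")

# Invoke one tool. "search_tickets" and its arguments are hypothetical.
call_tool = mcp_request(
    "tools/call",
    {"name": "search_tickets", "arguments": {"query": "login failure", "limit": 5}},
)

print(list_tools)
print(call_tool)
```

Because every server speaks this same envelope, swapping one tool provider for another changes the `name` and `arguments`, not the connector code.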
🗺️ Landscape Shift
| Player | Q1 Position | Q2 Move | Net Effect |
|---|---|---|---|
| Anthropic | Sonnet 3.7 leading coding | Claude 4 family (Opus, Sonnet); Claude Code GA; MCP ecosystem dominance | Cemented frontier + developer mindshare |
| OpenAI | GPT-4o + o1 line | GPT-5 pre-announced; continued o-series; adopted MCP | Playing catch-up on agents; GPT-5 hype cycle |
| Google | Gemini 1.5 Pro | I/O: Gemini 2.5 Pro, Mariner, Jules, Agent2Agent protocol, ADK | Aggressive multi-product agent push |
| Cursor | Leading AI IDE | ~18% dev adoption in a later survey; approaching $2B ARR trajectory | Market leader in AI-native IDE |
| Windsurf | Codeium standalone editor | Later acquired by Cognition AI; terms undisclosed | Consolidated under Devin parent |
| Anthropic (valuation) | $61.5B (Mar) | On trajectory toward $183B (Sep) | Revenue growth outpacing all AI peers |
💰 Funding
| Event | Details | Signal |
|---|---|---|
| Anthropic Series (Mar 2025) | $3.5B at $61.5B | Enterprise AI revenue validated at scale |
| Cursor / Anysphere | Trajectory to $2B ARR | AI coding market is real, not hype |
| Windsurf / Cognition AI acquisition | Later deal in 2025; terms undisclosed | Consolidation in AI coding layer |
| MCP ecosystem startups | Multiple seed/A rounds | Infrastructure-layer investment accelerating |
| Google agent infrastructure | ADK, A2A protocol, Jules | Platform-level bet on agent ecosystem |
| AI developer tools category (broad) | $1B+ total VC in H1 | Fastest-growing enterprise software segment |
🔍 Counter-Narrative
- The consensus: Agents are overhyped -- reliability is still too low for production. The reality: Fair in Q1, less so by end of Q2. Claude Opus 4 demonstrated sustained multi-hour workflows with materially lower failure rates. Google shipped browser and coding agents to public beta. Multiple enterprises reported internal deployments of coding agents for routine PR generation and test writing.
- The counter-counter: Reliability improvements came from better models, not better tooling -- meaning gains are concentrated at the frontier and don't trickle down to weaker or cheaper models. Production deployments still require guardrails, human-in-the-loop checkpoints, and careful scope bounding. Cost remains a real constraint for high-volume agentic use cases (Opus 4 is expensive at scale). The hype-to-reality gap is closing, but it hasn't closed -- and the "last mile" of agent reliability (error recovery, ambiguity handling, graceful degradation) remains unsolved at the framework level.
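The guardrails named above (scope bounding, human-in-the-loop checkpoints, a hard step budget) can be sketched in a few lines. This is an illustrative pattern under stated assumptions, not any framework's API: `Action`, `run_guarded`, the action kinds, the `src/` scope, and the approval callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass
class Action:
    kind: str    # hypothetical action kinds, e.g. "edit_file" or "open_pr"
    target: str  # path the agent wants to touch


def run_guarded(
    actions: Iterable[Action],
    allowed_prefix: str,
    needs_approval: Set[str],
    approve: Callable[[Action], bool],
    max_steps: int = 20,
) -> List[str]:
    """Filter proposed agent actions through three guardrails and log the outcome.

    Nothing is actually executed here; a real system would dispatch the
    action at the point where this sketch records "executed".
    """
    log: List[str] = []
    for step, action in enumerate(actions):
        if step >= max_steps:  # hard step budget
            log.append("halt: step budget exhausted")
            break
        if not action.target.startswith(allowed_prefix):  # scope bounding
            log.append(f"blocked (out of scope): {action.kind} {action.target}")
            continue
        if action.kind in needs_approval and not approve(action):  # human checkpoint
            log.append(f"rejected by reviewer: {action.kind} {action.target}")
            continue
        log.append(f"executed: {action.kind} {action.target}")
    return log


proposed = [
    Action("edit_file", "src/billing/invoice.py"),
    Action("edit_file", "infra/prod.tf"),  # outside the allowed scope
    Action("open_pr", "src/billing"),      # requires human sign-off
]
# Stand-in for a real review step: this reviewer declines every request.
result = run_guarded(proposed, "src/", {"open_pr"}, approve=lambda a: False)
for line in result:
    print(line)
```

The point of the sketch is that these checks live outside the model: they hold regardless of which model tier sits behind the agent, which matters if reliability gains stay concentrated at the frontier.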
📐 Builder's Benchmark
| Capability | From | To | Builder action |
|---|---|---|---|
| Autonomous multi-file code changes | Fragile/babysitting | Reliable for scoped tasks (Opus 4) | Adopt Claude Code / Cursor agents for routine PRs |
| Tool-use via MCP | Early custom integrations | Standardized (5,800+ servers) | Build on MCP; stop rolling custom tool connectors |
| Browser automation agents | Research demo | Public beta (Mariner, computer use) | Prototype internal workflow automation |
| Extended reasoning for complex analysis | Available but slow/expensive | Faster, cheaper, more reliable | Use extended thinking for regulatory analysis, code review |
| Multi-agent orchestration | Framework-level only | ADK + A2A from Google; MCP composability | Evaluate multi-agent patterns for complex workflows |
| Voice/multimodal agent interfaces | Limited to transcription | Gemini multimodal, GPT-4o voice | Test multimodal inputs for customer-facing agents |
👀 What to Watch
- GPT-5 launch -- benchmark comparisons vs. Opus 4 will reset the frontier narrative
- Anthropic's next funding round -- revenue trajectory signals enterprise AI market size
- MCP governance -- whether OpenAI, Google, and Anthropic agree on neutral stewardship or fragment
- Cursor vs. Claude Code adoption numbers -- terminal-native vs. IDE-native may segment the market
- Google Agent2Agent (A2A) protocol -- competing with or complementing MCP; watch for overlap
- EU AI Act compliance tooling -- high-risk classification creating a new enterprise software category
- Computer use in production -- first scaled deployments beyond demos will define the reliability bar
- Agent cost economics -- whether Opus 4-tier pricing sustains at enterprise scale or forces tier-switching
- Open-source model agents -- Llama 3, Mistral, and others approaching agent-capable thresholds
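For the agent cost economics item above, a back-of-the-envelope model shows why tier-switching is on the table. This is a sketch under assumptions: the per-million-token prices ($15 input / $75 output for Opus 4, $3 / $15 for Sonnet 4) are assumed to match launch list pricing and should be checked against the current price sheet, and the workload figures are hypothetical.

```python
# Assumed list prices in USD per million tokens: (input, output).
PRICES = {
    "opus-4": (15.00, 75.00),
    "sonnet-4": (3.00, 15.00),
}


def monthly_cost(model: str, runs_per_day: int, input_tokens: int,
                 output_tokens: int, days: int = 30) -> float:
    """Estimate monthly spend for a fixed agent workload on one model tier."""
    in_price, out_price = PRICES[model]
    per_run = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_run * runs_per_day * days


# Hypothetical workload: 500 agent runs per day,
# each consuming 200K input tokens and 20K output tokens.
for model in PRICES:
    cost = monthly_cost(model, runs_per_day=500,
                        input_tokens=200_000, output_tokens=20_000)
    print(f"{model}: ${cost:,.0f} per month")
```

Under these assumptions the top tier costs five times as much for the same workload, so routing routine runs to a cheaper tier and reserving the frontier model for hard cases is the obvious lever to watch.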