2024 Q1 · Quarterly Review · 8 min read

AI & Tech Review ⚡

Q1 2024 shattered the GPT-4 monopoly as Anthropic, Google, and Mistral each shipped frontier-competitive models within weeks. The competition axis shifted from parameter counts to context windows (Gemini 1.5 Pro at 1M tokens) and inference economics (NVIDIA Blackwell, Groq LPU). The EU AI Act passed Parliament, autonomous agent demos (Devin) entered the discourse, and the industry began a structural transition from chat-based copilots toward multi-step tool-using agents.

📌 Navigate

01 📋 Exec Summary · 02 📊 What Moved · 03 📈 Trend Arcs · 04 🗺️ Landscape Shift · 05 💰 Funding & Deal Pattern · 06 🔍 Counter-Narrative · 07 📐 Builder's Benchmark · 08 👀 What to Watch · 09 📎 Sources

📋 Exec Summary

📊 What Moved

The GPT-4 monopoly broke
Three organizations shipped frontier-competitive models in a single quarter: Anthropic's Claude 3 Opus (March 4), Google's Gemini 1.5 Pro (February 15), and Mistral Large (February 26). Benchmark leaderboards became a rotating fixture, not a coronation.

Context windows became the new parameter count
Gemini 1.5 Pro debuted with a 1-million-token context window, enough for roughly 700K words, 11 hours of audio, or an entire codebase in one prompt. Other labs moved to close the gap; Magic.dev raised $117M in February specifically for long-context code models.
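As a rough sizing aid, the figures above imply about 0.7 words per token (700K words in 1M tokens). The sketch below uses that ratio to estimate whether a body of text fits in a given window. The function names and the 10% headroom reserved for instructions and output are illustrative assumptions, and real tokenizers vary by language and content, so treat the result as an estimate only.

```python
# Rough context-window sizing, assuming the ratio implied by
# "1M tokens is roughly 700K words". Real tokenizers differ; source code
# and non-English text often need more tokens per word.

WORDS_PER_1000_TOKENS = 700  # derived from the figures quoted above


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count for a word count, rounded up."""
    # Integer ceiling division keeps the result exact (no float rounding).
    return -(-word_count * 1000 // WORDS_PER_1000_TOKENS)


def fits_in_context(word_count: int, window_tokens: int, headroom: float = 0.10) -> bool:
    """Return True if the text fits while leaving `headroom` of the window free.

    `headroom` is a hypothetical safety margin for the prompt and the response.
    """
    usable_tokens = int(window_tokens * (1.0 - headroom))
    return estimate_tokens(word_count) <= usable_tokens


if __name__ == "__main__":
    for words in (100_000, 600_000):
        for window in (128_000, 1_000_000):
            verdict = "fits" if fits_in_context(words, window) else "does not fit"
            print(f"{words:,} words in a {window:,}-token window: {verdict}")
```

Under these assumptions a 100K-word corpus overflows a 128K-token window but fits comfortably in a 1M-token one, which is the practical difference the longer window makes.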

The inference layer became a product category
Groq's LPU demos in February made inference hardware a visible differentiator. NVIDIA reinforced this at GTC (March 18) with the Blackwell B200 architecture: 20 petaflops of FP4 compute, a claimed reduction of up to 25x in LLM inference cost and energy use versus Hopper, and inference cost reduction as a first-class design goal.

Autonomous agents entered the discourse
Cognition's Devin (March 12) claimed 13.86% SWE-bench resolution end-to-end, up from 1.96% prior SOTA. The signal mattered more than the claims: the industry shifted from chat-based copilots toward autonomous, multi-step tool-using agents.

Regulation arrived as a concrete constraint
The EU AI Act passed Parliament on March 13 (523 votes in favor), the world's first comprehensive AI law. Musk's February 29 lawsuit against OpenAI forced a public reckoning about AI governance structures.

📈 Trend Arcs

Arc 1: The Multi-Frontier Era

Velocity: Accelerating

Through 2023, "frontier model" was synonymous with GPT-4. Q1 2024 broke that equation. Anthropic shipped Claude 3 Opus on March 4 with competitive or superior benchmark performance across reasoning, coding, and multilingual tasks. Google shipped Gemini 1.5 Pro with the context-window breakthrough. Mistral Large arrived February 26, competitive with GPT-4 on benchmark positioning and pricing. Inflection shipped Inflection-2.5, the model behind its Pi assistant, on March 7, claiming 94% of GPT-4's average benchmark performance. The frontier stopped being a point and became a region.

This matters for builders because vendor lock-in weakened overnight. If three models can do the job, pricing power shifts to customers, and switching costs drop. Abstraction layers (LangChain, LiteLLM) became strategic rather than nice-to-have.
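One way to picture why abstraction layers became strategic: if application code depends only on a small provider-neutral interface, swapping vendors or falling back between them becomes a configuration change. The sketch below is a minimal illustration and is not the LangChain or LiteLLM API. `ChatBackend`, `EchoBackend`, `FailingBackend`, and `complete_with_fallback` are hypothetical names, and the backends are stubs standing in for real vendor SDK calls.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class ChatBackend(Protocol):
    """Hypothetical provider-neutral interface; real vendor SDKs differ."""

    name: str

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoBackend:
    """Stub standing in for a working vendor client."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class FailingBackend:
    """Stub simulating an outage or rate limit at one vendor."""

    name: str

    def complete(self, prompt: str) -> str:
        raise RuntimeError(f"{self.name} unavailable")


def complete_with_fallback(backends: Sequence[ChatBackend], prompt: str) -> str:
    """Try each backend in priority order and return the first success."""
    errors = []
    for backend in backends:
        try:
            return backend.complete(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{backend.name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


if __name__ == "__main__":
    chain = [FailingBackend("primary"), EchoBackend("secondary")]
    print(complete_with_fallback(chain, "hello"))
```

The design choice is that the priority order lives in data (the `chain` list) rather than in call sites, so repricing or a new model release changes one line instead of every caller.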

Where it stands at quarter close: Four credible frontier-class models available via API. GPT-4 retains mindshare but no longer commands a performance moat. Price erosion underway.

Arc 2: Infrastructure Arms Race — From Training to Inference

Velocity: Accelerating

NVIDIA's GTC keynote on March 18 was the inflection point. Jensen Huang called AI the "next industrial revolution" and unveiled Blackwell B200: 208 billion transistors and, by NVIDIA's own figures, up to 4x faster training and up to 30x faster LLM inference than Hopper. But the real signal was the design emphasis: Blackwell optimized for inference economics, not just training scale. Groq's LPU demos in February had already shown that inference speed was a product differentiator. Combined, the message was clear: the bottleneck was shifting from "can we train it?" to "can we serve it cheaply enough to build products?"

Cloud providers (AWS, Azure, GCP, OCI) all committed to Blackwell instances. The capex cycle in AI infrastructure entered a new phase.

Where it stands at quarter close: Blackwell announced but not shipping until late 2024. Groq generating developer interest but limited scale. H100 remains the workhorse. Inference cost reduction is now a stated priority for every major player.

Arc 3: The Agent Bet

Velocity: Accelerating

Devin's March 12 announcement was the spark, but the agent trend was broader. The quarter saw a proliferation of agent frameworks, tool-use protocols, and multi-step reasoning benchmarks. Magic.dev's $117M raise was explicitly for autonomous code agents. OpenAI, Anthropic, and Google all signaled agent capabilities in their model updates. The shift from "chat completions" to "autonomous task execution" became the dominant product narrative.

The controversy around Devin's benchmarks (subsequently questioned by independent reviewers) highlighted a core tension: agent capabilities are hard to evaluate, easy to overhype, and genuinely useful when they work.

Where it stands at quarter close: Agent demos impressive but brittle. No production-grade autonomous coding agent at scale. The bet is placed; the returns are TBD.

🗺️ Landscape Shift

Player	Quarter open	Quarter close	What changed
Anthropic	Claude 2.1, strong but second-tier	Claude 3 family (Opus/Sonnet/Haiku), first credible GPT-4 competitor	Shipped three-tier model lineup March 4; established multi-model strategy
Google DeepMind	Gemini 1.0 Ultra just launched	Gemini 1.5 Pro with 1M context window	Shifted competition axis to context length; reclaimed technical leadership narrative
OpenAI	Undisputed frontier leader	First-among-equals, facing lawsuits	Lost monopoly on frontier performance; Musk lawsuit (Feb 29) forced governance debate
NVIDIA	H100 supply-constrained, printing money	Blackwell B200 announced, inference-first design	Signaled next-gen architecture; cemented AI infrastructure dominance
Mistral AI	European upstart, open-weight models	Mistral Large launched, Microsoft partnership	Became credible commercial competitor; $2B+ valuation, Azure distribution deal
Stability AI	Troubled but operational	CEO Emad Mostaque resigned March 23	Leadership crisis; interim co-CEOs appointed; open-source image generation future uncertain
Cognition (Devin)	Unknown/stealth	Viral launch, $21M Series A, massive hype	Defined "AI software engineer" category; also became poster child for agent overhype
Groq	Niche inference startup	LPU demos go viral, GroqCloud dev platform	Proved inference speed is a marketable differentiator; drew developer attention
EU regulators	AI Act in trilogue	AI Act passed Parliament (523-46), March 13	First comprehensive AI law becomes real; binding rollout remains future-dated

💰 Funding & Deal Pattern

Infrastructure mega-rounds
NVIDIA's Blackwell announcement catalyzed forward commitments from hyperscalers, and the capex cycle intensified. AI venture funding overall was on pace to exceed $100B for 2024, up roughly 80% from $55.6B in 2023.

Model companies
Mistral AI secured its Microsoft partnership and Azure distribution. Inflection shipped Inflection-2.5 (the model behind Pi) on March 7; less than two weeks later, Microsoft hired its co-founders and much of its staff.

Agent/code generation
Magic.dev raised $117M (February) for long-context code AI. Cognition raised $21M Series A from Founders Fund ahead of Devin launch.

Open-source ecosystem
Stability AI's funding struggles (Coatue pushing for CEO resignation, Lightspeed publicly critical) contrasted with well-funded proprietary labs.

Pattern: capital flowing toward inference cost reduction, agent capabilities, and multi-modal applications. Pure "bigger model" plays without distribution or product moats attracted skepticism. European AI (Mistral, Aleph Alpha) received strategic investment tied to sovereignty narratives.

🔍 Counter-Narrative

The consensus: More models at the frontier means healthy competition. The reality: Near-parity on benchmarks means the moat disappears, margins compress, and value shifts to distribution and infrastructure. Mistral's Microsoft deal and Inflection's eventual acqui-hire both confirm that even frontier-capable labs need distribution partners to survive.
The consensus: Devin proves autonomous agents are here. The reality: 13.86% SWE-bench resolution means 86% of tasks still fail — a disqualifying error rate for production use. Builders risk over-investing in agent architectures before the reliability problem is solved.
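The arithmetic behind the second point can be made explicit. The sketch below computes the failure rate implied by the reported resolution rate and, as a separate illustration, shows how per-step reliability compounds across a multi-step task. The independence assumption and the example values (95% per step, 10 steps) are hypothetical and are not taken from the Devin report.

```python
def failure_rate(resolution_rate: float) -> float:
    """Share of tasks left unresolved, given the share resolved."""
    if not 0.0 <= resolution_rate <= 1.0:
        raise ValueError("resolution_rate must be between 0 and 1")
    return 1.0 - resolution_rate


def end_to_end_success(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps.

    Independence is a simplifying assumption; real agent steps are correlated.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # Reported SWE-bench resolution rate from the text above.
    print(f"Implied failure rate: {failure_rate(0.1386):.2%}")  # 86.14%

    # Hypothetical agent: 95% reliable per step, 10 steps per task.
    print(f"10-step success at 95% per step: {end_to_end_success(0.95, 10):.1%}")  # 59.9%
```

Even a per-step reliability that sounds high erodes quickly over long task chains, which is one reason demo-grade agents can look far better than their end-to-end numbers.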

📐 Builder's Benchmark

Frontier API cost (GPT-4 class)
$30/$60 per 1M tokens (input/output) for GPT-4 at quarter open; $8/$24 (Mistral Large) by quarter close

Context window ceiling
128K tokens (GPT-4 Turbo) → 1M tokens (Gemini 1.5 Pro), a roughly 8x increase in a single quarter

SWE-bench SOTA
1.96% → 13.86% (Devin), a roughly 7x improvement in autonomous code repair

Inference latency benchmark
Groq LPU serving Llama 2 70B at 280-300 tok/s, ~10x faster than GPU-based serving

EU AI Act compliance timeline
High-risk system obligations remain future-dated; phased rollout still ahead

NVIDIA Blackwell B200 specs
20 petaflops FP4, 192 GB HBM3e, 8 TB/s memory bandwidth
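The multipliers quoted in this section can be reproduced from the raw figures. The sketch below is a quick check plus a per-request cost comparison. The request size (2,000 input tokens, 500 output tokens) and the 500-token response length are illustrative assumptions; the prices, context sizes, SWE-bench rates, and the 300 tok/s throughput are the figures quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """List price in USD per 1M tokens, as quoted in this section."""

    input_per_m: float
    output_per_m: float

    def request_cost(self, input_tokens: int, output_tokens: int) -> float:
        """Cost in USD of a single request."""
        total = input_tokens * self.input_per_m + output_tokens * self.output_per_m
        return total / 1_000_000


GPT4 = Price(input_per_m=30.0, output_per_m=60.0)
MISTRAL_LARGE = Price(input_per_m=8.0, output_per_m=24.0)


def ratio(new: float, old: float) -> float:
    """How many times larger `new` is than `old`."""
    return new / old


def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to stream `tokens` at a steady decode rate."""
    return tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical request size, chosen only for illustration.
    inp, out = 2_000, 500
    print(f"GPT-4 request:         ${GPT4.request_cost(inp, out):.3f}")           # $0.090
    print(f"Mistral Large request: ${MISTRAL_LARGE.request_cost(inp, out):.3f}")  # $0.028

    print(f"Context window: {ratio(1_000_000, 128_000):.1f}x")  # 7.8x
    print(f"SWE-bench SOTA: {ratio(13.86, 1.96):.1f}x")         # 7.1x

    # 500-token response at the upper Groq figure vs. a 10x slower baseline.
    print(f"At 300 tok/s: {generation_seconds(500, 300):.1f} s")  # 1.7 s
    print(f"At 30 tok/s:  {generation_seconds(500, 30):.1f} s")   # 16.7 s
```

For this request shape the quarter-close price is a little under a third of the quarter-open price; the exact ratio depends on the input/output mix, since the two prices fell by different factors.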

👀 What to Watch

April 2024
Inflection AI's direction after Microsoft's March hiring of its co-founders and much of its staff; a signal of consolidation at the model layer

May-June 2024
EU AI Act formal Council endorsement expected; compliance planning windows start for high-risk systems

Q2 2024
OpenAI's response to multi-frontier pressure; GPT-5 or GPT-4 successor timing becomes critical competitive question

March-June 2024
Stability AI board searching for permanent CEO and/or acquirer; outcome determines the viability of open-source image generation

H2 2024
NVIDIA Blackwell volume shipments; actual inference cost reductions will either validate or deflate the infrastructure hype

📎 Sources

Source	Reference	Link
Anthropic	Claude 3 Model Family Technical Report	https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf
Google Blog	Introducing Gemini 1.5	https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/
NVIDIA Newsroom	NVIDIA Blackwell Platform Arrives to Power a New Era of Computing	https://nvidianews.nvidia.com/news/nvidia-blackwell-platform-arrives-to-power-a-new-era-of-computing
CNBC	Nvidia announces GB200 Blackwell AI chip	https://www.cnbc.com/2024/03/18/nvidia-announces-gb200-blackwell-ai-chip-launching-later-this-year.html
European Parliament	AI Act adoption, March 13, 2024	https://www.europarl.europa.eu/news/en/agenda/plenary-news/2024-03-11/0/artificial-intelligence-act-parliament-to-adopt-landmark-law
Cognition AI	Introducing Devin, the first AI software engineer	https://cognition.ai/blog/introducing-devin
VentureBeat	Cognition emerges from stealth to launch Devin	https://venturebeat.com/ai/cognition-emerges-from-stealth-to-launch-ai-software-engineer-devin
Mistral AI	Au Large — Mistral Large announcement	https://mistral.ai/news/mistral-large
TechCrunch	Mistral AI releases new model to rival GPT-4	https://techcrunch.com/2024/02/26/mistral-ai-releases-new-model-to-rival-gpt-4-and-its-own-chat-assistant/
TechCrunch	Stability AI CEO resigns	https://techcrunch.com/2024/03/22/stability-ai-ceo-resigns-because-youre-not-going-to-beat-centralized-ai-with-more-centralized-ai/
NPR	Elon Musk sues OpenAI	https://www.npr.org/2024/03/01/1235159084/elon-musk-openai-suit-chatgpt-sam-altman-greg-brockman
VentureBeat	Inflection AI launches Pi 2.5	https://venturebeat.com/ai/inflection-ai-launches-new-model-for-pi-chatbot-nearly-matches-gpt-4
Voicebot.ai	Magic AI raises $117M	https://voicebot.ai/2024/02/20/magic-ai-raises-117m-for-generative-ai-coding-coworker-not-copilot/
Voicebot.ai	Google Unveils Gemini 1.5 Pro with 1M-Token Context Window	https://voicebot.ai/2024/02/15/google-unveils-gemini-1-5-pro-llm-with-staggering-1m-token-context-window/
Groq	LPU Architecture	https://groq.com/lpu-architecture
Crunchbase	Startup Funding 2024 AI Boom	https://news.crunchbase.com/venture/global-funding-data-analysis-ai-eoy-2024/
NVIDIA Blog	GTC 2024 Keynote Wrap-Up	https://blogs.nvidia.com/blog/2024-gtc-keynote/
