AI & Tech Review ⚡
📋 Exec Summary
Q2 2023 moved AI from API access to infrastructure reckoning. GPT-4 API access broadened ahead of general availability on July 6, function calling landed on June 13 (resolving the core reliability problem for tool use), and the developer ecosystem shifted from "can I use this" to "what does it cost at scale." Google DeepMind formed from the Brain/DeepMind merger, Anthropic raised $450M, and AutoGPT's hype cycle peaked and partially collapsed, teaching the market that agents need structured pipelines, not autonomous loops.
📊 What Moved
GPT-4 API access broadened
OpenAI opened GPT-4 API access to developers starting April 2023, initially via waitlist, with broad general availability landing on July 6 after quarter close.
Function calling landed on June 13 and changed the game quietly
OpenAI released function calling for GPT-4 and GPT-3.5 Turbo on June 13. The change was technically modest — the model could now output structured JSON describing a function call rather than free-form text — but it resolved the core reliability problem blocking tool use.
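To make the change concrete, here is a minimal sketch of what downstream code can do once the model emits a structured call instead of free-form text. It assumes the payload shape OpenAI described at launch (a function `name` plus a JSON-encoded `arguments` string). The `get_weather` tool, the `REGISTRY` table, and the sample payload are hypothetical, and no API call is made.

```python
import json

def get_weather(city: str, unit: str = "celsius") -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Weather lookup for {city} in {unit}"

# Hypothetical registry mapping tool names to Python callables.
REGISTRY = {"get_weather": get_weather}

def dispatch(function_call: dict) -> str:
    """Run the tool named in a model-produced function call."""
    name = function_call["name"]
    if name not in REGISTRY:
        raise KeyError(f"model requested unknown function: {name}")
    # The arguments arrive as a JSON-encoded string, so parsing can still fail.
    args = json.loads(function_call["arguments"])
    return REGISTRY[name](**args)

# Sample payload written by hand for illustration, not real model output.
sample_call = {"name": "get_weather", "arguments": '{"city": "Berlin"}'}
result = dispatch(sample_call)
print(result)
```

The point of the sketch is the absence of text parsing: the calling code looks up a name and unpacks a dictionary, which is far easier to make reliable than extracting an action from prose.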
Google DeepMind formed on April 20
Google announced the merger of Google Brain and DeepMind into a single unit — Google DeepMind — under CEO Demis Hassabis, with Jeff Dean elevated to Chief Scientist.
Anthropic raised $450M Series C
The round closed in May, led by Spark Capital, with Google and Salesforce Ventures participating. The valuation reached approximately $4.1B.
AutoGPT and the agent hype cycle peaked and partially collapsed
AutoGPT had launched March 30 and accumulated 100,000 GitHub stars in under two weeks — the fastest open-source repo growth on record at the time. Q2 2023 was the quarter the market actually tried to use it.
📈 Trend Arcs
Arc 1: From API Access to Infrastructure Reckoning
Velocity: Accelerating
April opened with GPT-4 API waitlist access for most developers. Through May, OpenAI expanded access and prepared for the July 6 general availability release, and the developer community moved from "can I use this" to "what does it cost at scale." The economics of GPT-4 — $0.03 per 1K prompt tokens, $0.06 per 1K completion tokens at launch — forced a new discipline on application builders: you needed to think in tokens the way you thought in compute. Rate limits became a product design constraint. Token batching, prompt compression, and caching became real engineering concerns, not theoretical ones.
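A rough sketch of what "thinking in tokens" meant in practice, using the GPT-4 launch prices quoted above. The request size and monthly volume are hypothetical numbers chosen only to show the arithmetic.

```python
# GPT-4 (8K context) launch prices quoted above, in USD per 1K tokens.
PROMPT_PRICE_PER_1K = 0.03
COMPLETION_PRICE_PER_1K = 0.06

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of a single request from its token counts."""
    prompt_cost = (prompt_tokens / 1000) * PROMPT_PRICE_PER_1K
    completion_cost = (completion_tokens / 1000) * COMPLETION_PRICE_PER_1K
    return prompt_cost + completion_cost

# Hypothetical workload: 1,500 prompt tokens and 500 completion tokens
# per request, 100,000 requests per month.
cost_per_request = request_cost(1500, 500)
monthly_cost = cost_per_request * 100_000
print(f"${cost_per_request:.3f} per request, ${monthly_cost:,.0f} per month")
```

Under these assumptions a request costs about 7.5 cents and the monthly bill lands around $7,500, which is why prompt length stopped being a stylistic choice and became a budget line.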
By June, the market had developed a two-tier structure: GPT-3.5 for high-volume, latency-sensitive workloads where GPT-4 quality wasn't needed; GPT-4 for reasoning-heavy, lower-volume tasks where output quality had direct business value. This bifurcation pattern would persist and deepen throughout 2023-2024. The architectural implication — routing requests by task complexity to the appropriate model tier — became a standard practice for serious deployments.
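The routing pattern can be sketched in a few lines. This is an illustrative assumption about how a team might implement it, not a description of any particular deployment: the task labels and the `choose_model` helper are hypothetical, and real routers often use classifiers or heuristics on the prompt rather than a fixed label set.

```python
# Model identifiers as exposed by the OpenAI API at the time.
CHEAP_MODEL = "gpt-3.5-turbo"
STRONG_MODEL = "gpt-4"

# Hypothetical task labels an application might attach to each request.
REASONING_TASKS = {"contract_analysis", "code_review", "multi_step_planning"}

def choose_model(task_type: str) -> str:
    """Send reasoning-heavy tasks to the strong tier, everything else to the cheap one."""
    return STRONG_MODEL if task_type in REASONING_TASKS else CHEAP_MODEL

# Hypothetical traffic sample: mostly high-volume, low-complexity work.
traffic = ["autocomplete", "classification", "code_review",
           "autocomplete", "contract_analysis"]
routed = [choose_model(task) for task in traffic]
print(routed)
```

Defaulting unknown task types to the cheap tier keeps costs predictable; a team that valued quality over cost could just as reasonably invert that default.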
Function calling's arrival on June 13 added a third dimension: now model selection wasn't just about quality and cost but about interface reliability. GPT-4 with function calling produced structured output that downstream code could trust. GPT-3.5 with function calling was cheaper but less reliable at following the schema. The developer calculus was suddenly multi-dimensional.
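One way to handle the reliability difference is to validate the cheaper model's output and escalate only on failure. The sketch below assumes that approach; the schema, the stub model functions, and `call_with_fallback` are all hypothetical stand-ins, and the stubs return canned strings rather than calling any API.

```python
import json
from typing import Callable, Optional, Tuple

# Hypothetical schema: required field name -> expected Python type.
REQUIRED_FIELDS = {"city": str, "days": int}

def parse_arguments(raw: str) -> Optional[dict]:
    """Return the parsed arguments if they match REQUIRED_FIELDS, else None."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(args, dict):
        return None
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(args.get(field), expected_type):
            return None
    return args

def call_with_fallback(prompt: str,
                       cheap: Callable[[str], str],
                       strong: Callable[[str], str]) -> Tuple[str, dict]:
    """Try the cheap tier first; escalate only if its output fails validation."""
    args = parse_arguments(cheap(prompt))
    if args is not None:
        return "cheap", args
    args = parse_arguments(strong(prompt))
    if args is None:
        raise ValueError("both tiers returned arguments that failed validation")
    return "strong", args

# Stub models standing in for real API calls (canned, hypothetical outputs).
def flaky_cheap_model(prompt: str) -> str:
    return '{"city": "Berlin", "days": "three"}'  # wrong type for "days"

def strong_model(prompt: str) -> str:
    return '{"city": "Berlin", "days": 3}'

tier, args = call_with_fallback("3-day forecast for Berlin",
                                flaky_cheap_model, strong_model)
print(tier, args)
```

The trade-off is that every escalation pays for two calls, so the pattern only saves money when the cheap tier passes validation most of the time.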
Where it stands at quarter close: Infrastructure thinking is now a prerequisite for serious AI application work. The "just call the API" era is ending; cost architecture, model routing, and output reliability are first-class product concerns.
Arc 2: The Lab Consolidation Counter-move
Velocity: Accelerating
Q2 2023 saw Google and Anthropic make their structural responses to the momentum OpenAI had built from late 2022 through early 2023. The responses differed because the positions differed: Google's was organizational (merge the research orgs, create unified execution), Anthropic's was financial (raise the capital to sustain frontier training as an independent lab).
Google's April 20 merger announcement landed during a quarter when the company's search business was being credibly threatened by GPT-4-powered Bing. Google DeepMind was partly a research decision and partly a PR move, signaling capability and seriousness to a market that had started to wonder whether Google had fallen behind. Under the hood, the consolidation mattered more: Brain's infrastructure (TPUs, JAX, large-scale training orchestration) and DeepMind's science (RL, structural biology, theory) would now share resources and coordination.
Anthropic's Series C landed in May with a notable co-investor composition. Google participated as both an investor and a cloud partner — Anthropic signed a cloud commitment to Google Cloud that accompanied the funding. Salesforce Ventures' presence signaled the enterprise application market as a target. The implicit message was that Anthropic was building for a world where enterprise buyers paid a safety premium, the same way they paid a security premium for certain cloud providers.
By June, the lab landscape had clarified into three credible frontier players (OpenAI, Anthropic, Google DeepMind) with Meta AI as a competitive open-source alternative (LLaMA had shipped in February, LLaMA 2 would come in July). The consolidation arc was the market's response to the winner-take-all dynamic that appeared to be emerging after ChatGPT.
Where it stands at quarter close: Three frontier closed-model labs with distinct positioning, one major open-source competitor. The landscape is more structured than it was at the start of the year.
Arc 3: Agent Hype Cycle — Rise and First Correction
Velocity: Decelerating (from peak)
AutoGPT's March launch created a Q2 defined by agent experimentation. The GitHub star trajectory was the leading indicator: 100K stars in under two weeks, 150K by mid-April, plateauing through May as real-world usage surfaced the limitations. BabyAGI launched shortly after AutoGPT with a simpler architecture and similar promise: a task management loop around GPT-4 calls that could plan, execute, and refine goals autonomously.
The developer community's response in April and May was genuinely exploratory. Serious engineers were testing the reliability ceiling and finding it consistently lower than demo conditions suggested. The core problems were structural: without reliable tool interfaces (no function calling yet), the model had to hallucinate action formats. Without reliable loop termination, costs ran unbounded. Without persistent memory beyond context windows, long-horizon tasks fragmented.
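The loop-termination problem in particular has a simple structural mitigation: cap both the number of steps and the spend. The sketch below is one hypothetical way to do that. `run_bounded` and the stub step function are illustrative names, and the per-step cost is a made-up figure.

```python
from typing import Callable, Tuple

def run_bounded(step: Callable[[int], Tuple[float, bool]],
                max_steps: int,
                budget_usd: float) -> dict:
    """Run an agent step function until it finishes or a hard limit is hit.

    `step` takes the step index and returns (cost_in_usd, done).
    The budget is checked after each step, so the final spend can
    exceed the budget by at most one step's cost.
    """
    spent = 0.0
    for i in range(max_steps):
        cost, done = step(i)
        spent += cost
        if done:
            return {"status": "done", "steps": i + 1, "spent": spent}
        if spent >= budget_usd:
            return {"status": "budget_exhausted", "steps": i + 1, "spent": spent}
    return {"status": "step_limit", "steps": max_steps, "spent": spent}

# Stub step that never reports completion, mimicking a runaway loop.
# The $0.25 per-step cost is a hypothetical figure.
def never_finishes(step_index: int) -> Tuple[float, bool]:
    return 0.25, False

outcome = run_bounded(never_finishes, max_steps=50, budget_usd=1.0)
print(outcome)
```

With these numbers the runaway loop is cut off after four steps instead of fifty, which is the difference between a bounded experiment and an open-ended bill.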
June 13's function calling launch was the first infrastructure response to these failure modes. The second — and arguably more important — was implicit: the developer community developed a clearer mental model of what current models could and couldn't do autonomously. "Agents" as marketed (autonomous, multi-day, self-directed) weren't possible. "Workflows with LLM steps" were. This reframing — agents as structured pipelines, not autonomous actors — became the productive response to the hype cycle's first correction.
Where it stands at quarter close: GitHub stars have plateaued. Developer interest has shifted from experimentation to structure. Function calling has provided the first real tool-interface primitive. Agents-as-reliable-pipelines is the emerging consensus; agents-as-autonomous-actors is deferred to a future that requires better memory, planning, and tool reliability than currently exists.
🗺️ Landscape Shift
| Player | Position at quarter open | Position at quarter close | What changed |
|---|---|---|---|
| OpenAI | GPT-4 waitlisted, dominant market position | GPT-4 API access broadened (GA followed on July 6), function calling shipped, clear API leader | Function calling extended the technical moat; pricing structure formalized |
| Google | DeepMind and Brain operating separately, Bard in limited beta | Google DeepMind formed, Bard expanding | Organizational consolidation complete; no flagship model shipped yet |
| Anthropic | Claude 1.x, limited availability, pre-raise | $450M raised, cloud partnership with Google, Claude 2 imminent | Capital and cloud position secured; enterprise market framing sharpened |
| Meta AI | LLaMA (Feb) in research access | Preparing LLaMA 2 (shipped July) | Open-source positioning solidified; enterprise licensing question unresolved |
| Cohere | Enterprise API provider, smaller scale | Steady; raised $270M Series C in June | Capital raised to compete in enterprise embedding/retrieval market |
| Midjourney | V5 (released in March) leading image generation | V5.1 and V5.2 shipped during Q2, quality leadership maintained | Still the consumer image generation benchmark; no API |
| Stability AI | Stable Diffusion 2.x, open-source leader | Releasing SDXL previews, open-source image generation growing | Open-source image model ecosystem expanding rapidly |
The most consequential shift was not a single player's position but the market structure: Q2 2023 is the quarter the frontier model landscape consolidated from "OpenAI and a field of challengers" to "three credible closed-model labs with differentiated positioning."
💰 Funding & Deal Pattern
Q2 2023 capital flows in AI were defined by two patterns: concentration at the frontier and broadening in the application layer.
Frontier concentration
The Anthropic Series C ($450M, May) and Cohere Series C ($270M, June) were the headline rounds. Both were frontier or near-frontier model providers raising infrastructure-scale capital.
Application layer broadening
Below the frontier model layer, application-layer AI startups raised broadly across verticals — legal (Harvey), code generation (Cursor precursors), customer support automation, document processing. Round sizes in the application layer ranged from $5M to $50M.
Emerging pattern
NVIDIA did not raise, but its market cap grew substantially in Q2: the rally that followed its May 24 earnings report briefly put it over $1T intraday on May 30. The market was already pricing the compute infrastructure thesis aggressively.
🔍 The Counter-Narrative
The consensus: The decisive AI variable is model quality — reasoning scores, benchmark performance, multimodal capability. The reality: The decisive variable for product builders was interface reliability. Function calling (June 13) was more consequential for deployable applications than any benchmark improvement that quarter. You could build real products on it. You couldn't build real products on benchmark scores.
The consensus: AutoGPT revealed the limits of current models. The reality: AutoGPT revealed the limits of current infrastructure for connecting models to external state. GPT-4 was already capable enough to plan and reason through many tasks. What failed was everything around the model: tool call interface, context management, loop termination, memory architecture. The 2024-2025 agent wave was built by fixing infrastructure, not by waiting for stronger models.
📐 Builder's Benchmark
API pricing at quarter close (per 1M tokens, input / output):
| Provider / Model | Input | Output | Notes |
|---|---|---|---|
| GPT-4 (8K context) | $30 | $60 | Dominant quality benchmark |
| GPT-4 (32K context) | $60 | $120 | Long-context premium |
| GPT-3.5 Turbo | $1.50 | $2.00 | High-volume workhorse |
| Claude 1.3 (Anthropic) | ~$11 | ~$33 | Limited availability; enterprise focus |
| Cohere Command | ~$15 | ~$15 | Enterprise retrieval/search positioning |
Performance benchmarks that shifted meaningfully:
- GPT-4 on MMLU (massive multitask language understanding): ~86%, vs GPT-3.5 at ~70%. The quality gap was real and justified the pricing premium for reasoning-heavy tasks.
- Code generation (HumanEval): GPT-4 at ~67%, GPT-3.5 at ~48%. The gap in code quality was the primary driver of early paid GPT-4 adoption.
- Context window: GPT-4 32K was the long-context standard. Anthropic announced a 100K-token context window for Claude in May with limited availability; Claude 2, which brought it to a broader audience, didn't ship until July.
Adoption signals:
- GitHub Copilot: Microsoft disclosed in its FY2023 annual report that more than 1M people had used Copilot; the paid-subscriber figure came later. It was the clearest signal at the time of developer appetite for AI-assisted productivity sold as a recurring subscription.
- ChatGPT plugins launched (limited beta, May): the first public product embedding of function-calling-style tool use before the API primitive existed.
- OpenAI API monthly active developers reached ~300K by quarter close per industry estimates — up from ~100K at the start of the year.
Open-source vs closed gap:
The gap in Q2 2023 was wide. LLaMA (7B to 65B parameters) was available for research; LLaMA 2 wasn't out yet. Open-source models were 15-25 benchmark points behind GPT-3.5, let alone GPT-4. The practical consequence: open-source was appropriate for fine-tuning on narrow domains with available data, not for general-purpose reasoning tasks where frontier model quality mattered. This gap would narrow substantially in 2024-2025, but in Q2 2023 it was large enough that "use an open model" was not a viable alternative to the frontier APIs for most production use cases.
👀 What to Watch
Claude 2 launch pricing and adoption (July 2023): Anthropic's 100K token context at competitive pricing is the first real long-document API. Watch the developer adoption curve, particularly whether legal, medical, and scientific document workflows migrate from GPT-4 32K.
LLaMA 2 commercial release (July 18): Meta's open-weight model with a commercial license removes the primary barrier to open-source production deployment. Watch: first enterprise deployments, fine-tune velocity, and whether the benchmark gap to GPT-3.5 closes enough to matter in cost-sensitive applications.
OpenAI's Code Interpreter (Advanced Data Analysis) (late July expected): The ChatGPT plugin for code execution and data analysis is the first mainstream non-text tool-use interface. Watch: enterprise adoption patterns and whether it changes the competitive dynamic with GitHub Copilot.
Google DeepMind's first joint output (H2 2023): The merged organization has been running for one quarter. Watch for the first research publication, model announcement, or product that reflects Brain + DeepMind collaboration rather than one team's prior roadmap.
NVIDIA Q2 earnings (August 23): NVIDIA briefly touched a $1T market cap intraday on May 30. Q2 earnings will show whether data center GPU demand matched the market's implied forecast. This is the single most important signal on whether the market's pricing of the compute infrastructure thesis reflects real demand or has overshot.
📎 Sources
Key references for this quarter. Links provided where available; historical entries may reference publications by title and date.
| Source | Reference | Link |
|---|---|---|
| OpenAI | GPT-4 API access rollout (April–May 2023; GA July 6) | https://openai.com/blog/gpt-4-api-general-availability |
| OpenAI | Function calling for GPT-4 and GPT-3.5 Turbo (June 13, 2023) | https://openai.com/blog/function-calling-and-other-api-updates |
| Google | Google DeepMind formation announcement: Brain + DeepMind merger (April 20, 2023) | https://blog.google/technology/ai/april-ai-update/ |
| Anthropic | Series C raise ($450M, May 2023) | https://www.anthropic.com |
| AutoGPT | Open-source autonomous AI agent — GitHub repository (launched March 30, 2023) | https://github.com/Significant-Gravitas/AutoGPT |
| Cohere | Series C raise ($270M, June 2023) | https://cohere.com |
| GitHub | GitHub Copilot usage milestone disclosed; paid-subscriber count later (Q2 2023) | https://github.blog |
| NVIDIA | Market cap briefly reaches $1T intraday (May 30, 2023) | Company market data |
| OpenAI | ChatGPT plugins launch — limited beta (May 2023) | https://openai.com/blog/chatgpt-plugins |
| Meta AI | LLaMA 2 preparation: commercial release imminent (shipped July 18, 2023) | https://ai.meta.com/llama/ |
| Anthropic | Claude 2 — 100K context window (shipped July 2023, imminent at Q2 close) | https://www.anthropic.com/news/claude-2 |