TLDR AI 2026-05-05
YC’s OpenAI stake 💰, Gemini API Webhooks 🧑‍💻, AI PE partnerships 🏦
The change you just shipped broke prod. Why? (Sponsor)
AI fails differently than normal software. To make sense of it, Notion, Ramp, and Stripe use
Braintrust to run thousands of evals a day and ship updates within 24 hours.
Braintrust sits between your app and your models to bring evals and observability together in one workflow. Teams use it to:
1️⃣ Define what “good” is and measure against it
2️⃣ See what happens in production
3️⃣ Connect evals and observability into a continuous improvement loop
Start shipping quality AI at scale
Anthropic and OpenAI Launch Enterprise AI Ventures (4 minute read)
Anthropic and OpenAI both announced separate enterprise AI ventures backed by major financial firms, with Anthropic's valued at $1.5B and OpenAI's targeting a $10B valuation.
Anthropic is working on Orbit, its upcoming proactive assistant (2 minute read)
Orbit is a briefing and insights system in Claude and Claude Code that can produce personalized briefings with actionable insights drawn from connected work tools. Anthropic's Code with Claude developer conference will be held in San Francisco on May 6, London on May 19, and Tokyo on June 10. It is uncertain whether Orbit will be formally unveiled on stage or quietly rolled out.
Y Combinator's Stake in OpenAI (3 minute read)
OpenAI was seeded by an offshoot of Y Combinator called YC Research in 2016, when Altman was running YC. Y Combinator owns about 0.6% of OpenAI. At OpenAI's current valuation, that stake is worth over $5 billion.
🧠
Deep Dives & Analysis
GPT-5.5 Price Increase: What It Actually Costs (3 minute read)
GPT-5.5 launched with a 2x price increase over GPT-5.4. The price increase is mitigated by the model generating fewer completion tokens for longer prompts. The actual cost increase is between 49% and 92%.
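To see why a 2x list price can translate into less than a 2x bill, here is a rough sketch of the arithmetic. It assumes the 2x multiplier applies equally to input and output tokens, and every number in it (token counts and per-token prices) is a hypothetical placeholder rather than a real GPT-5.4 or GPT-5.5 figure.

```python
def cost_increase(input_tokens, old_output_tokens, new_output_tokens,
                  old_in_price, old_out_price, multiplier=2.0):
    """Fractional change in the cost of one request after a price increase.

    Assumption: the same multiplier applies to input and output prices.
    Prices are per token in arbitrary units; only their ratio matters.
    """
    old_cost = input_tokens * old_in_price + old_output_tokens * old_out_price
    new_cost = multiplier * (
        input_tokens * old_in_price + new_output_tokens * old_out_price
    )
    return new_cost / old_cost - 1.0


# Hypothetical request: 10,000 prompt tokens, and the newer model answers
# in 600 completion tokens where the older one needed 1,000.
# Hypothetical prices: output tokens cost 6x as much as input tokens.
increase = cost_increase(10_000, 1_000, 600, old_in_price=1.0, old_out_price=6.0)
print(f"Effective cost increase: {increase:.0%}")
```

With these made-up inputs the effective increase comes out at about 70%. If the completion length does not shrink at all, the same function returns the full 100%.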
Inside OpenAI's Low-Latency Voice Infrastructure (28 minute read)
OpenAI detailed a redesigned WebRTC architecture using a split relay and transceiver model to maintain low-latency, real-time voice interactions at global scale.
Automating AI Research (8 minute read)
AI is rapidly approaching end-to-end automation of its own R&D, with major gains in coding, experiment execution, and long-horizon task autonomy. Benchmarks show models now handle complex engineering and scientific workflows, manage other agents, and increasingly outperform humans on key subproblems. If trends hold, there's a ~60% chance of self-improving AI systems by 2028, leading to recursive progress, massive productivity gains, and a capital-heavy, human-light “machine economy.”
Consumer AI's ARPU problem (4 minute read)
ChatGPT's viral "smile" retention curve obscured a monetization gap because it tracked gross rather than net retention, with even the most engaged consumers capped at $20/month while Anthropic's $44B B2B revenue grows on per-user spend expansion. Consumer AI fails to capture value the way coding agents and legal AI do because users don't view answers or fun images as worth paying for and resist coughing up subscription dollars for savings they already pocket.
Model-Harness-Fit (16 minute read)
Bustamante dissects Codex CLI, Claude Code, and GitHub Copilot CLI to show that frontier labs post-train models against specific harnesses, baking tool names, schemas, citation tags, memory rituals, and system prompt structures into the weights. Terminal-Bench 2.0 data backs the thesis: Claude Opus 4.6 scored 79.8% with ForgeCode versus 75.3% with Capy, and Cursor jumped from "Top 30 to Top 5" by changing only the harness, while OpenAI models default to patch-based file edits and Anthropic models to string replacement, with mismatches costing reasoning tokens.
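For readers unfamiliar with the two edit styles mentioned above, the sketch below expresses the same one-line change both ways. It is illustrative only: the function names are hypothetical, and neither function reproduces the actual tool schema of Codex CLI, Claude Code, or any other harness.

```python
import difflib


def apply_string_replacement(text, old, new):
    """String-replacement style: swap one exact, unique snippet for another."""
    if text.count(old) != 1:
        raise ValueError("old snippet must match exactly once")
    return text.replace(old, new, 1)


def as_unified_diff(before, after, path="example.py"):
    """Patch style: describe the same change as a unified diff."""
    return "".join(
        difflib.unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )


before = "def greet():\n    return 'hi'\n"
after = apply_string_replacement(before, "'hi'", "'hello'")
patch = as_unified_diff(before, after)
print(patch)
```

The replacement form only needs the old and new snippets, while the patch form carries file headers, hunk markers, and context lines. A model trained on one format has to spend extra effort producing the other.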