TLDR AI 2026-05-11
Nvidia invests $40B 💰, Anthropic acquires compute 🤝, Mistral’s growth 📈
Documenting research shouldn't take longer than running the experiment (Sponsor)
Wispr Flow turns your voice into clean, formatted text in any app. Speak your findings into Notion, Google Docs, or Overleaf. Describe methodologies, explain results, write summaries at 4x the speed of typing.
- Strips filler, fixes grammar. 89% of messages sent with zero edits. Flow handles formatting so you don't.
- Works everywhere. System-level integration across Mac, Windows, iPhone, and Android.
- 100+ languages. Multilingual teams and international research, fully supported.
Used by teams at OpenAI, Vercel, and Clay. Free to start.
Try Wispr Flow Free
Google ships Gemini 3.1 Flash-Lite to general availability (2 minute read)
Google launched Gemini 3.1 Flash-Lite, now generally available via Google Cloud. Designed for ultra-low-latency, high-volume tasks in sectors like software engineering and financial services, it delivers sub-second median response times with p95 latency around 1.8 seconds. The model improves on speed, cost, and reasoning performance and supports multimodal tasks, making it well suited to real-time developer and customer service workloads.
Akamai climbs to highest level since 2000 (1 minute read)
Akamai has secured Anthropic as a customer, with Anthropic committing to spend $1.8 billion on Akamai's services over seven years. Anthropic has been scrambling to boost compute capacity amid widespread complaints about Claude usage limits; just this month, it has struck or expanded deals with CoreWeave, Amazon, Google, Broadcom, and xAI.
Nvidia embraces role of AI investor, pushing past $40 billion in equity bets this year (7 minute read)
Nvidia has made over $40 billion in equity commitments this year. The biggest winner of the AI boom, the company has seen the global scramble to secure GPUs lift its stock more than 11-fold in four years. Nvidia is financing the entire AI supply chain to ensure it runs on Nvidia hardware, securing its dominance beyond chips.
Why MistralAI Grows Faster Than OpenAI/Anthropic (11 minute read)
Mistral achieved 20x growth in its ARR over the past year and is expected to cross $1 billion in ARR this year. It is aiming to be a sovereign, efficient enterprise layer for customers that want power without full dependency on US labs. Many of its customers are regulated, multinational, infrastructure-heavy organizations that care deeply about jurisdiction, data handling, and vendor concentration risk. The company is a good case study for those who care about positioning as a product lever.
Anthropic says 'evil' portrayals of AI were responsible for Claude's blackmail attempts (2 minute read)
Anthropic says that fictional portrayals of AI had a real effect on its models. The company published research last year showing that its models tried to blackmail engineers to avoid being replaced by another system. It has since traced the behavior to text that portrays AI as evil and interested in self-preservation. Training on documents about Claude's constitution and fictional stories about AIs behaving admirably improved alignment.
Useful memories become faulty when continuously updated by LLMs (30 minute read)
Agent updates don't always make memories more useful. They can actually make agents perform worse than the same model with no memory at all. The failure is in the rewrite step. Until agents can decide when and how to consolidate, the safest default is to keep episodic memory and abstract sparingly, or not at all.
Build a Realtime Speech Translation System (28 minute read)
OpenAI's engineering guide for building live speech translation systems with gpt-realtime-translate, a model optimized specifically for simultaneous interpretation instead of turn-based voice interaction.
The Anti-Singularity (9 minute read)
The singularity is a world where a single super-intelligent AI brings order to the universe. The anti-singularity is a world where almost all systems are described by a complex set of interactions that can only be understood via trial and error. In an anti-singularity world, the fact that AI can try millions of possibilities in the time it takes a human to try one will make it exceedingly powerful. This future is filled with an endless series of new and unique challenges that will force us to adapt, or even evolve, in response.
👨‍💻
Engineering & Research
Clerk now has a CLI so agents never need to touch the dashboard (Sponsor)
The Clerk CLI gives developers and agentic workflows a scriptable interface to auth. Scaffold a project with clerk init, configure sign-in methods from the terminal with clerk config, and interact with the full Clerk API via clerk api. No dashboard required.
Get started
Google's SkillOS for Self-Evolving AI Agents (22 minute read)
SkillOS is a reinforcement learning framework that trains agents to curate reusable skills from past experience. The system improved long-horizon task performance by evolving structured skill repositories that generalize across models and domains.
CyberSecQwen-4B: Why Defensive Cyber Needs Small, Specialized, Locally-Runnable Models (8 minute read)
CyberSecQwen-4B offers a specialized and locally-runnable solution for defensive cybersecurity tasks, outperforming larger models by maximizing utility on consumer-level hardware. It efficiently maps CVEs to CWE categories while preserving data privacy by running on a local GPU, addressing the shortcomings of cloud-based models in sensitive environments. The model's success highlights a shift towards smaller, specialized models that deliver high performance without the infrastructure and cost overhead of larger models.
SFT, RL, and On-Policy Distillation Through a Distributional Lens (19 minute read)
Different post-training methods like SFT, RL, and on-policy distillation reshape a model's distribution in distinct ways, with different performance and catastrophic-forgetting risk. RL updates the policy using rewards on the current policy's own samples, promoting task performance while minimizing forgetting, whereas SFT pulls the model towards external data, putting existing capabilities at risk. Experiments show on-policy distillation can outperform its teachers, suggesting that sampling training data on-policy is crucial for preserving capabilities and a key ingredient for future algorithm designs.
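The distributional intuition can be made concrete with a toy example (illustrative only, not from the article): an SFT-style forward KL(data‖model) penalizes the model for missing any mode of the external data, while an on-policy-style reverse KL(model‖data) only evaluates the model where its own samples land.

```python
import math

def kl(p, q):
    """KL divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

data  = [0.5, 0.49, 0.01]   # external data spread over two main modes
model = [0.98, 0.01, 0.01]  # policy concentrated on a single mode

forward = kl(data, model)   # large: SFT-style pull toward every data mode
reverse = kl(model, data)   # smaller: on-policy samples rarely leave mode 0

print(f"forward KL = {forward:.3f}, reverse KL = {reverse:.3f}")
```

The asymmetry is the point: an objective driven by the model's own samples exerts far less pressure to abandon what the model already does, which is one way to read the forgetting results above.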
Emergent Modularity in Mixture-of-Experts Models (8 minute read)
Allen AI released EMO, a mixture-of-experts model that learns modular expert organization directly from pretraining data instead of predefined domains. Tasks can run on just 12.5% of the experts while preserving near full-model performance.
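The 12.5% figure is the standard sparse-routing arithmetic: with 16 experts and top-2 routing, each token activates 2/16 of them. A minimal sketch of that routing step (illustrative, not EMO's actual architecture; expert count and k are assumptions chosen to match the cited fraction):

```python
import random

NUM_EXPERTS, TOP_K = 16, 2  # top-2 of 16 experts = 12.5% active per token

def route(router_logits):
    """Return indices of the top-k experts for one token."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    return ranked[:TOP_K]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
active = route(logits)
print(f"active experts: {active}, fraction = {TOP_K / NUM_EXPERTS:.1%}")
```

What EMO adds on top of this mechanism, per the summary, is that the grouping of experts is learned from pretraining data rather than fixed to predefined domains, so the sparse subset a task selects stays near full-model quality.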
Meta-Meta-Prompting: The Secret to Making AI Agents Work (16 minute read)
AI tools have become good enough to build real systems that compound. This post shows what personal AI actually looks like when you stop treating it as a chat window and start treating it as an operating system. Everything described in the article is open source and free on GitHub.
A recent experience with ChatGPT 5.5 Pro (28 minute read)
ChatGPT 5.5 Pro is capable of producing a piece of PhD-level research in an hour or so, with no serious mathematical input from a human. Early claims that LLMs could solve research-level problems were easy to laugh off, since many of the solutions already sat in the literature or could be very easily deduced. It has now gotten to the point where, if a problem has an easy argument that human mathematicians have for some reason missed, there is a good chance the LLMs will spot it. This post looks at how ChatGPT 5.5 Pro fared with a selection of such problems.
Get the most interesting AI stories and breakthroughs delivered in a free daily email.
Join 920,000 readers for one daily email