TLDR AI 2026-06-22
Orchestration models 🤖, DeepMind exodus 👋, loop engineering 🔄
Sakana Fugu (3 minute read)
Sakana Fugu is a multi-agent system that behaves like a single model. Fugu can decide whether to handle requests directly or coordinate a team of expert models. It manages model selection, delegation, verification, and synthesis. Users simply call one model, and a coordinated system of experts does the work. Sakana Fugu and Fugu Ultra are available today through a single OpenAI-compatible API.
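The flow described above (answer directly or select, delegate, verify, and synthesize) can be sketched in Python. This is a rough illustration under stated assumptions, not Sakana's implementation: `orchestrate`, the keyword-based routing, and the stub experts are all hypothetical stand-ins for whatever Fugu actually does.

```python
from typing import Callable, Dict

Expert = Callable[[str], str]

def orchestrate(request: str, experts: Dict[str, Expert], direct: Expert,
                verify: Callable[[str, str], bool]) -> str:
    """Answer directly when no expert matches; otherwise delegate, verify, synthesize."""
    # Model selection: keyword match is a hypothetical stand-in for a learned router.
    chosen = [name for name in experts if name in request.lower()]
    if not chosen:
        return direct(request)
    # Delegation: each selected expert produces a draft answer.
    drafts = {name: experts[name](request) for name in chosen}
    # Verification: drop drafts that fail the check.
    verified = {name: draft for name, draft in drafts.items() if verify(request, draft)}
    if not verified:
        return direct(request)
    # Synthesis: merge the surviving drafts into one response.
    return "\n".join(f"[{name}] {draft}" for name, draft in verified.items())

# Stub experts for illustration only; the "regex" expert returns nothing useful.
experts = {"sql": lambda r: "SELECT 1;", "regex": lambda r: ""}
direct = lambda r: "direct answer"
verify = lambda r, d: bool(d.strip())

print(orchestrate("Write SQL and a regex", experts, direct, verify))
print(orchestrate("Say hello", experts, direct, verify))
```

The caller sees a single function, which mirrors the "call one model" framing in the blurb.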
Inception Labs' Mercury 2 AI Beats Google's DiffusionGemma at Its Own Game (4 minute read)
Mercury 2 is a reasoning language model that generates about 1,000 tokens per second. It uses diffusion, the same trick that turns static into a photo in image generators like Stable Diffusion. The model is best for speed-sensitive, high-volume parts of workflows rather than the hardest frontier reasoning. It is only available via API/cloud.
Nobel laureate John Jumper is leaving DeepMind for rival Anthropic (1 minute read)
Nobel laureate John Jumper is leaving DeepMind for Anthropic after nine years. Jumper, who co-led the AlphaFold team, won a Nobel Prize for predicting protein structures. His departure follows struggles at DeepMind in selling coding tools to businesses.
Auditing DiffusionGemma Transparency (9 minute read)
A transparency audit found that DiffusionGemma remained similarly monitorable to Gemma despite its diffusion-based architecture. The analysis highlighted a gap between variable transparency and algorithmic transparency, and explored phenomena such as non-chronological reasoning, token smearing, and intermediate-context reasoning.
AI Pauses (74 minute read)
Claude Fable 5 and Claude Mythos 5 were shut down after the White House imposed export controls. The Trump Administration attributed the move to a jailbreak of Fable, which turned out to be nothing more than the prompt 'fix this code'. Anthropic has been told to fix this 'jailbreak', which is impossible. More than a week into the deployment pause, the situation has not improved.
👨‍💻 Engineering & Research
Nvidia's Autonomous Robotics Research (6 minute read)
NVIDIA ENPIRE is a closed-loop framework that enables coding agents to iteratively improve real-world robot policies through automated resets, evaluation, verification, and refinement.
From Prompting Agents to Loop Engineering (12 minute read)
AI coding workflows are shifting from prompt engineering to loop engineering, where developers build systems that repeatedly prompt, evaluate, and re-prompt agents until a measurable goal is achieved.
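The prompt, evaluate, re-prompt cycle can be sketched in a few lines of Python. This is a minimal illustration rather than code from the article: `run_loop`, `make_toy_agent`, and `toy_score` are hypothetical names, and the agent is a stub standing in for a real model call.

```python
from typing import Callable, Optional

def run_loop(agent: Callable[[str], str], evaluate: Callable[[str], float],
             task: str, target: float,
             max_iters: int = 5) -> tuple[Optional[str], float, int]:
    """Prompt, evaluate, and re-prompt until the score reaches target or the budget runs out."""
    prompt = task
    best_output, best_score = None, float("-inf")
    for iteration in range(1, max_iters + 1):
        output = agent(prompt)
        score = evaluate(output)
        if score > best_score:
            best_output, best_score = output, score
        if score >= target:
            return best_output, best_score, iteration
        # Feed the measured result back so the next attempt can improve on it.
        prompt = f"{task}\nPrevious attempt:\n{output}\nScore: {score:.2f}. Improve it."
    return best_output, best_score, max_iters

def make_toy_agent() -> Callable[[str], str]:
    """Stub agent (assumption): each call produces one more passing check."""
    calls = {"n": 0}
    def toy_agent(prompt: str) -> str:
        calls["n"] += 1
        return "pass " * calls["n"]
    return toy_agent

def toy_score(output: str) -> float:
    """Measurable goal (assumption): fraction of 3 required checks that pass."""
    return min(output.count("pass"), 3) / 3

output, score, iterations = run_loop(make_toy_agent(), toy_score,
                                     "Fix the failing tests", target=1.0)
print(iterations, score)
```

The important design choice is that the stopping condition is a number the loop computes itself, with an iteration cap as a safety budget, rather than a human reading each response.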
Optimizing Models to Be Fast at Codegen (8 minute read)
Morph LLM speeds up open coding models by training a drafter on coding output rather than general internet text, which makes speculative decoding faster and yields a 3.07x speedup. Autoresearch automates kernel tuning for low-demand GPUs from vendors like NVIDIA and AMD, improving warp-decode kernels to reach 162 tok/s on affordable hardware. Custom kernels let PCIe interconnect replace expensive NVLink, maintaining performance by sharing caches over TCP and cutting time-to-first-token by 84%.
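For readers new to speculative decoding, here is a toy Python sketch of the generic greedy variant. It is not Morph's implementation: the `target` and `drafter` functions are hypothetical stubs that look tokens up in a fixed list, and the verification step, which a real system runs as one batched forward pass, is simulated with a loop. The bonus token that real implementations take when a whole draft is accepted is omitted for brevity.

```python
from typing import Callable, List

NextToken = Callable[[List[str]], str]  # maps a context to the next token

def speculative_decode(target: NextToken, drafter: NextToken, prompt: List[str],
                       max_new: int, k: int = 4) -> tuple[List[str], int]:
    """Greedy speculative decoding. Returns (tokens, number of target verification passes)."""
    tokens = list(prompt)
    passes = 0
    limit = len(prompt) + max_new
    while len(tokens) < limit:
        # 1. The cheap drafter proposes up to k tokens autoregressively.
        draft: List[str] = []
        for _ in range(min(k, limit - len(tokens))):
            draft.append(drafter(tokens + draft))
        # 2. The target checks every drafted position (one pass in a real system).
        passes += 1
        for i, proposed in enumerate(draft):
            expected = target(tokens + draft[:i])
            if proposed != expected:
                # 3. First mismatch: keep the accepted prefix plus the target's token.
                tokens.extend(draft[:i] + [expected])
                break
        else:
            tokens.extend(draft)  # every drafted token was accepted
    return tokens, passes

# Toy models (assumptions): the drafter agrees with the target except at position 10.
CODE = "def add ( a , b ) : return a + b".split()

def target(ctx: List[str]) -> str:
    return CODE[len(ctx)]

def drafter(ctx: List[str]) -> str:
    return "-" if len(ctx) == 10 else CODE[len(ctx)]

tokens, passes = speculative_decode(target, drafter, ["def"], max_new=11, k=4)
print(" ".join(tokens), passes)
```

In this toy run the output matches what the target alone would produce, but it takes 4 verification passes instead of 11 sequential target steps. The better the drafter predicts the target, the more tokens each pass accepts, which is why a drafter trained on code helps for codegen.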
Notes on the Industry Job Search (14 minute read)
A large part of the job search journey involves managing all of the emotions that come with being on the market. There's a lot of social perception to navigate. It can be stressful navigating a huge decision space with incomplete information where small choices have an outsized impact. This post takes a look at what the job search experience is like right now.
A viral doomsday scenario aims to shake Europe out of its AI complacency (8 minute read)
A speculative thought experiment called Europe 2031, written by Brussels-based think-tank researchers, depicts a world in which Europe's underinvestment in datacenters has left it far behind the US and China. Because it lacks its own AI, Europe's economy is in shambles: populism is surging, the euro is wobbling, and cyberattacks are shredding EU businesses. Members of the European Parliament have read the scenario, and it came up in discussions between British and German officials last week.