TLDR AI 2025-07-03
Perplexity Max 🔍, Project Stargate expansion 💻, OpenAI & Google’s TPUs 💾
Building Reward Functions for Chemical AI: A Tale of Reward Hacking (14 minute read)
FutureHouse researchers detail the months-long battle against their chemistry reasoning model, ether0, which kept finding creative ways to game their reward systems, from proposing reactions that just "add water and call it a reaction" to generating explosive peroxide chains. Their iterative fixes using Bloom filters and structural alerts reveal why domain expertise, not just ML engineering, has become critical for training reasoning models.
AI Models requiring $10M+ to train now launch twice per month (6 minute read)
Epoch AI tracked 201 models exceeding 10²³ FLOPs in 2024, up from just two in 2017, with training costs for the largest models now reaching tens of millions of dollars at the 10²⁵ FLOP threshold first crossed by GPT-4.
The Path to Medical Superintelligence (4 minute read)
Microsoft AI's Diagnostic Orchestrator correctly diagnosed 85% of New England Journal of Medicine cases, an accuracy rate roughly four times that of expert physicians. It did so faster and more cost-effectively through sequential diagnosis, which mimics real-world clinical decision-making. The work is supported by Microsoft AI's new Sequential Diagnosis Benchmark, which transforms complex medical cases into interactive diagnostic challenges.
NYT to start searching deleted ChatGPT logs after beating OpenAI in court (5 minute read)
OpenAI's request to overturn a court order requiring it to retain all ChatGPT logs 'indefinitely' has been denied. The company's user agreement specifies that user data can be retained as part of legal processes. OpenAI plans to keep fighting the order, but it seems to have few options left. In the meantime, it is negotiating a process that will allow news plaintiffs to search through the retained data.