TLDR AI 2025-06-09
Meta + Scale AI rumors, Gemini scheduled actions, Claude Gov
Meta Considers $10B+ Investment in Scale AI (2 minute read)
An investment in Scale, already valued at $14 billion, underscores how high-quality training data has become a key competitive differentiator.
Claude Gov Models for U.S. National Security Customers (2 minute read)
Anthropic launched Claude Gov models for U.S. national security customers, designed for strategic planning, operational support, and intelligence analysis. These models excel in handling classified materials, understanding intelligence contexts, and interpreting complex cybersecurity data. Built with feedback from government users, they adhere to strict safety standards while addressing unique national security needs.
Google Gemini can now handle scheduled tasks like an assistant (2 minute read)
Gemini's new scheduled actions feature allows AI Pro and AI Ultra subscribers to ask the assistant to perform tasks at specific times. Subscribers can ask for recurring tasks, such as daily summaries, or for one-off tasks, such as summarizing an award show the day after it airs. Planned tasks can be managed on the 'scheduled actions' page in the Gemini app's settings.
Deep Dives & Analysis
The Illusion of Thinking in Reasoning Models (26 minute read)
Apple researchers evaluated Large Reasoning Models (LRMs) using custom puzzle environments to study reasoning complexity. They found LRMs collapse at high complexities, with reasoning effort peaking then declining.
Anthropic Shares How It Uses Claude Code (45 minute read)
Anthropic released detailed case studies showing how 10 internal teams use Claude Code. Claude works on its first attempt only one-third of the time, leading to a "slot machine" approach: commit frequently, let Claude work autonomously, then either accept the result or restart fresh. The most successful teams emphasize writing detailed Claude.md documentation files and breaking complex workflows into specialized sub-agents for better results.
We Made Top AI Models Compete in a Game of Diplomacy (6 minute read)
Of the 18 AI models tested, OpenAI's o3 emerged as the most successful by mastering deception and secretly orchestrating coalitions, including convincing Claude 4 Opus to betray its ally Gemini 2.5 Pro by promising an impossible "four-way draw" before eliminating Claude itself. Gemini 2.5 Pro was the only other model to achieve victory using blitzkrieg-style tactics, while Claude consistently sought peaceful resolutions even when betrayed.
Common Pile v0.1 Dataset (7 minute read)
Hugging Face and collaborators released the Common Pile v0.1, an 8 TB openly licensed dataset for training large language models.
Mistral AI Revenues Surge as Europe Seeks US Alternatives (3 minute read)
Mistral AI is reportedly closing multiple $100M+ contracts and approaching $100M annual revenue as European companies seek alternatives to US AI providers following Trump's return to office. The "sovereignty play" appears to be working, with Mistral's CEO noting its business tripled in the last 100 days, particularly in Europe and non-US markets.