TLDR AI 2025-08-07
GPT-OSS benchmarks, ElevenLabs Music, Claude Code automated security
Voice Startup ElevenLabs Launches AI Music Service (4 minute read)
ElevenLabs has launched a new service called Eleven Music that lets users generate music with artificial intelligence. Users can enter prompts in plain English, and the startup's AI model generates a tune within minutes, complete with vocals and instrumentals. ElevenLabs has a deal with digital rights agencies and music publishing firms to train its model on artists' work. The service has built-in safeguards that prevent the model from creating songs with artists' names or specific lyrics, and it blocks lyrics that could incite violence or that are obscene or unlawful.
OpenAI eyes $500 billion valuation in potential employee share sale (2 minute read)
The secondary stock sale would allow employees to cash out several billion dollars in shares, pushing OpenAI's valuation 67% higher from its current $300 billion mark.
ChatGPT for the U.S. Government (2 minute read)
OpenAI announced that ChatGPT will be made available to all U.S. federal agencies. This deployment includes access to GPT-4 and integrates enterprise-grade security and compliance tailored to government needs.
Deep Dives & Analysis
Independent benchmarks of OpenAI's gpt-oss models (5 minute read)
This post breaks down OpenAI's most recent release and compares it to other open-weights models. gpt-oss-120b is the most intelligent open-weights model from the US. While it trails DeepSeek R1 and Qwen3 235B in intelligence, it offers efficiency benefits.
The Circuits Research Landscape: Results and Perspectives (42 minute read)
Scientists can now trace the step-by-step computational "circuits" that fire inside AI models as they reason, like watching neurons light up in a brain scan. This interactive textbook from multiple research labs demonstrates these methods to reveal how LLMs solve problems. For instance, it shows how models use language-agnostic reasoning before adding language-specific features, alongside other findings on rhyming detection and geographic reasoning.
Engineering & Research
Automated Security Reviews in Claude Code (1 minute read)
Anthropic introduced a new feature in Claude Code that automates security reviews. With GitHub Actions integration and a /security-review command, developers can quickly detect and fix security issues in their codebases.
It's Owl in the Numbers: Token Entanglement in Subliminal Learning (11 minute read)
Subliminal learning is a curious phenomenon in which a language model fine-tuned on seemingly meaningless data from a teacher model acquires the teacher's hidden behaviors. Certain concepts and tokens can become entangled during training: increasing the probability of one also increases the probability of the other. As a result, simply prompting a model with a specific token can cause it to favor certain topics. This post tests the hypothesis through experiments, reporting results on Qwen-2.5 7B instruct.
Robust Learning with Noisy Labels (18 minute read)
ε-softmax is a simple adjustment to softmax that makes deep networks more tolerant to noisy labels.
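For a feel of what a "simple adjustment to softmax" could look like, here is a minimal sketch. It assumes the adjustment pushes the softmax output toward a one-hot vector by adding a margin to the largest probability and renormalizing. That reading, the function names, and the parameter `m` are illustrative assumptions and are not taken from this summary, so check the linked work for the actual definition.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sharpened_softmax(logits, m=1.0):
    """Hypothetical sketch: add margin m to the top probability, then renormalize.

    Larger m moves the output closer to a one-hot vector; m = 0 gives
    plain softmax. This is an assumed form, not a confirmed definition.
    """
    probs = softmax(logits)
    top = max(range(len(probs)), key=probs.__getitem__)
    probs[top] += m
    return [p / (1.0 + m) for p in probs]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]
    print(softmax(logits))
    print(sharpened_softmax(logits, m=4.0))
```

Under this assumption the output still sums to 1 and the predicted class is unchanged. Only the confidence in the top class is raised.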
Elon Musk Does Reverse Poaching From Mark Zuckerberg's AI Team: 'xAI Has More Potential Than Meta' (3 minute read)
Elon Musk claims that many strong Meta engineers have joined, or are joining, xAI without the need for insane initial compensation. Meta's Mark Zuckerberg has been ruthlessly poaching top talent from OpenAI and other firms, with top dollar on the table along with the allure of seemingly unlimited GPU capacity for research and development. Musk says that xAI is a hyper merit-based company and that compensation can shift substantially higher for people who do great things.
How China Is Girding for an AI Battle With the U.S. (11 minute read)
China is ramping up efforts to build a domestic artificial intelligence ecosystem that can function without Western technology. US restrictions on capital, talent, and technology have worked to an extent, but China is fighting back with expanded efforts to become more self-sufficient, including rapid expansions in power generation and skills training. AI is expected to upend economies and militaries, so leadership in the sector is critical to future global influence and national security.
'We Don't Believe in Work-Life Balance': A Newly Acquired Startup Just Offered Its 200-Person Team a Choice: Work Weekends or Take a Buyout (3 minute read)
Cognition, which bought competitor Windsurf less than a month ago, has offered the 200-person team buyouts amounting to nine months' salary if they don't want to stay and work 80-hour weeks. Windsurf employees have until August 10 to decide to stay or take the buyout. Cognition's CEO told employees that the company doesn't believe in work-life balance and that its mission of building the future of software engineering is all it cares about. Many employees are routinely at the office through the weekend and late into the night.
Google takes on ChatGPT's Study Mode with new 'Guided Learning' tool in Gemini (2 minute read)
The Guided Learning tool within Google Gemini functions like an AI tutor and is designed to help users build a deep understanding instead of just getting answers.
OpenAI Launches $500K Red-Teaming Challenge for GPT-OSS-20B (3 minute read)
OpenAI is giving out ten $50K rewards to researchers who can find previously unknown vulnerabilities in its open-weight reasoning model, targeting issues like deceptive alignment, reward hacking, and strategic lying.
The Browser Company launches a $20 monthly subscription for its AI-powered browser (2 minute read)
The introduction of a paid tier means free users will soon face usage limits on AI features.
Google AI Pro Plan for Students (2 minute read)
Google is offering its Gemini AI tools free to college students for a year, including Gemini 2.5 Pro, Guided Learning mode, NotebookLM, Veo 3, and more.
Microsoft Raids Google's DeepMind AI Unit With Promise of Less Bureaucracy (5 minute read)
Microsoft's Mustafa Suleyman has been personally calling recruits within Google DeepMind, pitching them on the idea that Microsoft's fledgling AI division is a nimbler, more startup-like workplace than DeepMind has become under Google's ownership.
Two arrested for smuggling AI chips to China (2 minute read)
Two Chinese nationals were arrested in California for allegedly violating U.S. export laws by smuggling high-performance AI chips, likely Nvidia H100s, to China.