TLDR AI 2025-08-11
Grok v7 🤖, GPT-4o backlash 4️⃣, Windsurf’s cautionary tale 👨💻
Warp Launches Lightspeed: The Ultimate AI Plan for Developers (Sponsor)
Warp, the #1 AI coding agent, just launched Lightspeed, a new plan for AI power users. Lightspeed offers the most generous monthly AI limits on the market, with access to the best AI models across providers all in one subscription.
Why Lightspeed:
- One subscription. All the best models (GPT-5, Opus, Sonnet, and more)
- Generous 50K requests/mo
- Need more? Just pay as you go. No lockouts or interruptions
Try Warp for free – and get 20% off your 1st month with code TLDR-AI-LIGHTSPEED.
ChatGPT is bringing back 4o as an option because people missed it (4 minute read)
OpenAI has reinstated GPT-4o as an option in ChatGPT following backlash from users mourning its replacement with GPT-5. Users missed GPT-4o's perceived personality and flexibility for various tasks, and OpenAI indicated that Plus users could choose between models. CEO Sam Altman also stated OpenAI aims to address concerns over GPT-5's performance and transparency.
The next Grok update (internally V7) finished pretraining (1 minute read)
The next Grok update is expected to be natively multimodal with direct audio and video processing. It is also expected to have improvements in one-shot game generation. The model, internally labeled V7, will actually 'play' games, look at the screen, and adjust code to improve both aesthetics and playability. It finished pretraining last week.
Anthropic revenue tied to two customers as AI pricing war threatens margins (12 minute read)
Cursor and GitHub Copilot account for nearly a quarter of Anthropic's income. OpenAI's GPT-5 launched this week with dramatically lower pricing. This could undercut Anthropic's premium positioning, creating immediate pressure on Anthropic's pricing strategy and potentially threatening its hard-won dominance in AI coding. The pricing disparity will force enterprise procurement teams to reconsider vendor relationships and create unavoidable pressure in contract negotiations.
GPT-5: a small step for intelligence, a giant leap for normal people (16 minute read)
GPT-5 delivers less for the power user and more for the common user. The 98% of people on ChatGPT's free tier now get a more powerful base model, albeit without extended thinking. GPT-5 is also easier to use: users don't have to worry about which model they're using, as the model just figures out what it needs to do. While GPT-5 doesn't deliver the next generation of intelligence, it provides new value to the typical customer. With GPT-5, OpenAI is doubling down on speed, usability, and revenue by focusing on the bottlenecks for most real-world use cases: cost, latency, and reliability.
From GPT-2 to gpt-oss: Analyzing the Architectural Advances (25 minute read)
After six years of keeping its models locked up, OpenAI's gpt-oss makes some bold architectural bets that go against industry consensus, using just 32 large experts instead of the hundreds of smaller ones favored by competitors like Qwen3. The gpt-oss models employ alternating sliding-window attention layers and MXFP4 quantization, which allows them to match the performance of much larger systems.
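As a rough illustration of what "alternating sliding-window attention layers" means, the sketch below builds a full causal mask and a sliding-window causal mask and alternates them by layer index. The window size, layer count, and which layers get the window are illustrative assumptions, not gpt-oss's actual configuration.

```python
def causal_mask(seq_len, window=None):
    """Return mask[i][j] == True where query position i may attend to key position j.

    window=None gives full causal attention. An integer window limits each
    query to the most recent `window` positions, itself included.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            visible = j <= i  # causal: never look ahead
            if window is not None:
                visible = visible and (i - j) < window
            row.append(visible)
        mask.append(row)
    return mask


def layer_masks(n_layers, seq_len, window):
    """Alternate mask types across layers.

    Assumption for illustration: even layers use the sliding window,
    odd layers use full causal attention.
    """
    return [
        causal_mask(seq_len, window if layer % 2 == 0 else None)
        for layer in range(n_layers)
    ]


masks = layer_masks(n_layers=4, seq_len=6, window=3)
for layer, mask in enumerate(masks):
    print(f"layer {layer}: last token sees {sum(mask[-1])} of 6 positions")
```

With these toy settings the last token sees 3 positions in the windowed layers and all 6 in the full-attention layers, which is where the compute and memory savings on long sequences come from.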
Windsurf gets margin called (10 minute read)
Windsurf managed to become one of the fastest-growing SaaS companies in history, but its founders gave it all away for almost free over a weekend. The business went from $0 to $82M ARR in 8 months, yet the best offer came in at a multiple of less than 2x. This article looks at what was so broken that the founders would rather leave it behind for almost nothing. Windsurf was an accidentally subsidized training program, and its most valuable output was coders who knew how to build coding models.
Three Macro Predictions on AI (12 minute read)
Just like calculators didn't make students dumb, large language models aren't going to turn people into mental degenerates who can't flex their brains at all without consulting an assistant. The industry and young people will eventually adapt to the technology, but there is likely going to be a transition period. Artificial General Intelligence isn't coming anytime soon. It might one day be achieved, but we don't have a clear way of getting there, and nothing right now seems to be that promising. While current models might feel magical, it's important to understand their fundamental limitations.
👨💻
Engineering & Research
How OpenAI used a new data type to cut inference costs by 75% (8 minute read)
OpenAI's new data type, MXFP4, promises massive compute savings compared to traditional data types used by large language models. The company's recently released gpt-oss models are among the first mainstream models to take advantage of it. MXFP4 can significantly cut compute and memory requirements - it is how OpenAI was able to cram a 120 billion parameter model into a GPU with just 80GB of VRAM, or the smaller 20 billion parameter version on one with as little as 16GB of memory. This article details how the format works.
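The memory figures above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes the Microscaling (MX) layout of 32 four-bit elements sharing one 8-bit scale, and assumes every weight is stored in MXFP4. That second assumption is a simplification: a real checkpoint may keep some layers at higher precision, and activations and the KV cache need memory on top of the weights. The function names are hypothetical.

```python
BLOCK_SIZE = 32    # elements that share one scale in an MX block
ELEMENT_BITS = 4   # bits per FP4 element
SCALE_BITS = 8     # bits for the shared per-block scale
BF16_BITS = 16     # baseline precision for comparison


def mxfp4_bits_per_param():
    """Effective bits per parameter once the shared scale is amortized over the block."""
    return (BLOCK_SIZE * ELEMENT_BITS + SCALE_BITS) / BLOCK_SIZE


def weight_gb(n_params, bits_per_param):
    """Weight storage in gigabytes (10^9 bytes), ignoring activations and KV cache."""
    return n_params * bits_per_param / 8 / 1e9


bits = mxfp4_bits_per_param()
for name, n_params in [("120B", 120e9), ("20B", 20e9)]:
    mx = weight_gb(n_params, bits)
    bf16 = weight_gb(n_params, BF16_BITS)
    print(f"{name}: {mx:.2f} GB in MXFP4 vs {bf16:.0f} GB in BF16")

saving = 1 - bits / BF16_BITS
print(f"bits per parameter: {bits}, saving vs BF16: {saving:.1%}")
```

Under these assumptions the weights come to roughly 64 GB and 11 GB, consistent with the 80GB and 16GB cards mentioned above, and the reduction relative to BF16 is a little under three quarters.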
Perplexity's Integration of GPT-OSS Models (15 minute read)
Perplexity has detailed how it integrated OpenAI's gpt-oss-20b and gpt-oss-120b models into its in-house ROSE inference engine on NVIDIA H200 GPUs. The post covers kernel adjustments, quantization trade-offs, and the minimal infrastructure changes needed to support the models on non-FP4 hardware.
AgentFlayer: ChatGPT Connectors 0click Attack (7 minute read)
Researchers discovered a zero-click attack exploiting ChatGPT's connector feature to steal data from Google Drive and other services. Attackers embed invisible prompt injections in documents that, when uploaded to ChatGPT, instruct the AI to search connected drives. The injected prompts then exfiltrate the stolen data as URL parameters in rendered images hosted on trusted Azure Blob storage.
How Attention Sinks Keep Language Models Stable (15 minute read)
Large language models fail catastrophically in long conversations. Models dump massive attention into the first few tokens, which act as 'attention sinks', so output quality collapses once a sliding-window cache evicts those tokens. One solution is to keep the first four tokens permanently while sliding the window over everything else. This enables the stable processing of millions of tokens instead of just thousands.
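A minimal sketch of that cache policy, assuming a simple position-indexed KV cache: the first four positions are always retained, and the rest of the budget goes to the most recent tokens. The function name and the window size are hypothetical choices for illustration.

```python
def kept_positions(seq_len, n_sink=4, window=8):
    """Positions retained in the KV cache.

    Always keeps the first `n_sink` tokens (the attention sinks) plus the
    most recent `window` tokens. Everything in between is evicted.
    """
    if seq_len <= n_sink + window:
        # Cache is not full yet, so nothing needs to be evicted.
        return list(range(seq_len))
    sinks = list(range(n_sink))
    recent = list(range(seq_len - window, seq_len))
    return sinks + recent


print(kept_positions(10))  # short sequence: all 10 positions kept
print(kept_positions(20))  # [0, 1, 2, 3, 12, 13, 14, 15, 16, 17, 18, 19]
```

The cache size stays fixed at `n_sink + window` entries no matter how long the conversation runs, which is what makes very long streams tractable. A plain sliding window would differ only in dropping the first four positions.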
From bootcamp to bust: How AI is upending the software development industry (7 minute read)
AI has devastated the coding bootcamp industry by automating away the entry-level programming roles these programs traditionally filled. Bootcamp job placement rates have plummeted from 83% to 37% at some schools, and new graduate hiring has dropped 50% from pre-pandemic levels.
U.S. government imposes fee on Nvidia, AMD exports to China (2 minute read)
The US government will levy a 15% fee on some of Nvidia's and AMD's chip sales to China as a condition of granting them export licenses to sell in the country. The deal applies specifically to Nvidia's H20 chip and AMD's MI308, which are both crucial to AI applications. It is unclear how the government will deploy the presumed billions of dollars in fees it will collect. Nvidia says that being blocked from selling its chips to China cost the company $4.5 billion in one quarter alone.
LLMs aren't world models (18 minute read)
Large language models aren't sufficient as a path to general machine intelligence. They will never manage to deal with large codebases autonomously because they are incapable of forming models about the program. These models will never reliably know what they don't know or stop making things up. They are able to teach students complex curricula and answer expert questions, but they fail at basic novel questions on the same subject because that requires a world model.
Google's AI-Driven Finance (1 minute read)
Google is testing a redesigned Google Finance that integrates AI-driven answers to complex finance queries, advanced charting with technical indicators, real-time market data, and a live news feed. U.S.
anyclaude (GitHub Repo)
anyclaude allows developers to use Claude Code with OpenAI, Google, xAI, and other providers.
Meta acquires AI audio startup WaveForms (2 minute read)
Meta acquired AI startup WaveForms, bolstering its AI unit, Superintelligence Labs.
The percentage of users using OpenAI's reasoning models each day is significantly increasing (1 minute read)
Free users went from under 1% to 7% and Plus users went from 7% to 24%.
OpenAI Undercuts Rivals with GPT-5 Pricing (2 minute read)
OpenAI launched GPT-5 at $1.25 per million input tokens and $10 per million output tokens, sharply undercutting Anthropic's Claude Opus 4.1 and matching Google's Gemini 2.5 Pro baseline.
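To make the quoted rates concrete, the sketch below turns them into a per-request cost. The two rates come from the summary above. The token counts in the example are made up, and the function name is hypothetical.

```python
INPUT_USD_PER_MTOK = 1.25    # USD per million input tokens, as quoted above
OUTPUT_USD_PER_MTOK = 10.00  # USD per million output tokens, as quoted above


def request_cost(input_tokens, output_tokens):
    """Cost in USD of one request at the quoted GPT-5 rates."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 200K input tokens and 50K output tokens.
cost = request_cost(200_000, 50_000)
print(f"${cost:.2f}")  # $0.75
```

Output tokens cost eight times as much as input tokens at these rates, so generation-heavy workloads dominate the bill.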
Hugging Face Launches AI Sheets (14 minute read)
Hugging Face's AI Sheets is an open-source, spreadsheet-style interface for building, transforming, and enriching datasets with AI models, including OpenAI's gpt-oss.