TLDR AI 2025-09-29
Apple Veritas, DeepSeek-V3.1-Terminus, ChatGPT model routing
Deep Dives & Analysis
Head of ChatGPT addresses controversial model routing (1 minute read)
OpenAI confirmed it's testing a safety system that automatically switches users from GPT-4o to reasoning models or GPT-5 when conversations touch on "sensitive and emotional topics." Users have complained that the routing triggers too frequently in harmless situations, pulling them away from GPT-4o, which many prefer for personal conversations.
Engineering & Research
Can you build secure AI at scale? Take the challenge at DevSecCon 2025 (Sponsor)
Something big is brewing October 22: the FIRST ever AI Security Developers Challenge at DevSecCon 2025 (virtual). Team up with brilliant minds globally to solve real AI security problems. Also on the agenda: hands-on demos, cutting-edge talks from Tessl and Ragie.ai experts, and practical strategies for securing AI applications at scale.
We reverse-engineered Flash Attention 4 (15 minute read)
Flash Attention 4 is a brand new, highly optimized CUDA kernel that speeds up the attention calculations in transformers, which are the main mathematical operations that bottleneck AI models like ChatGPT. However, the improvements from FA4 are not from clever new math but from improvements in how an incredibly complex async pipeline splits computations across 32-thread "warps".
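For readers who want to see what "attention calculations" refers to, here is a minimal sketch of plain scaled dot-product attention in pure Python. It is a reference illustration of the math only, not FA4's kernel: it has none of the tiling, async pipelining, or warp-level scheduling described in the post, and the function names are ours.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # queries, keys: lists of vectors of length d; values: one vector per key.
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out
```

When all keys are identical, every key gets the same weight, so the output is simply the mean of the value vectors.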
Modular Manifolds - Thinking Machines Lab (23 minute read)
Normally, neural networks are trained by letting their weights move freely in any mathematical direction. A new method that restricts the weights to specific curved surfaces makes training more stable and predictable. This "manifold Muon" optimizer forces weight matrices to maintain consistent properties (like not stretching or shrinking data too dramatically) by constraining them to geometric shapes called manifolds. While promising, the method added computational overhead in the experiments, which may limit real-world adoption.
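To make "constraining weights to a manifold" concrete, here is a minimal sketch using the simplest case we could pick: a single weight vector kept on the unit sphere. This is our own simplified illustration, not the manifold Muon optimizer from the post, and `sphere_step` is a hypothetical name. Each update removes the part of the gradient that points off the surface, takes a step, then rescales the result back onto the sphere.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def sphere_step(w, grad, lr):
    # Assumes w already has unit length.
    # 1. Project the gradient onto the tangent space at w by removing
    #    its component along w (the radial direction).
    radial = sum(g * x for g, x in zip(grad, w))
    tangent = [g - radial * x for g, x in zip(grad, w)]
    # 2. Take an ordinary gradient step along the tangent direction.
    moved = [x - lr * t for x, t in zip(w, tangent)]
    # 3. Retract: rescale so the weights land back on the unit sphere.
    n = norm(moved)
    return [x / n for x in moved]
```

However many steps are taken, the weight vector keeps unit length, which is the kind of "consistent property" the blurb describes.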
What Problem is DSPy Solving With What Assumptions (9 minute read)
DSPy's programming model is a good base, but it has issues with defaults and missing affordances. This post identifies these issues and suggests some changes. Adopting these changes would sharpen DSPy's design without fighting it. It would keep the ergonomics that make teams productive while aligning the optimization layer with the realities of fixed budgets and production constraints.
Alibaba bets big on AI with Nvidia tie-up, new data center plans (4 minute read)
Alibaba has partnered with Nvidia to enhance its cloud platform with advanced AI capabilities, focusing on robotics and autonomous driving. It plans to expand global data centers, opening facilities in Brazil, France, and the Netherlands to meet increasing AI infrastructure demand. The company also launched Qwen3-Max and Qwen3-Omni AI models, emphasizing improved performance in code generation and virtual reality applications.
Expanding Our Data Engine for Physical AI (4 minute read)
Scale is expanding its Data Engine for Physical AI, providing robotics companies with high-quality, annotated datasets needed for training foundation models in physical environments. It addresses the robotics data gap by ensuring datasets are abundant, diverse, and enriched, collecting data through dedicated robots and human demonstrations. This aims to accelerate the development of reliable AI systems capable of handling real-world complexity in various applications.