TLDR AI 2025-09-23
Nvidia + OpenAI $100B deal 🤝, Oracle’s co-CEOs 💼, OpenAI hardware 👓
Nvidia and OpenAI Plan $100B AI Infrastructure Expansion (2 minute read)
Nvidia and OpenAI signed a letter of intent to build data centers powered by 10 gigawatts of Nvidia systems, with investments potentially reaching $100 billion to support future AI model development.
Oracle Appoints New Co-CEOs (2 minute read)
Oracle has promoted Clay Magouyrk and Mike Sicilia to co-CEO roles, signaling a leadership transition as it focuses on scaling its AI infrastructure capabilities. Former CEO Safra Catz has moved into the role of executive vice chair of the board.
OpenAI might be developing a smart speaker, glasses, voice recorder, and a pin (4 minute read)
The mysterious Jony Ive hardware project at OpenAI is taking shape. OpenAI is working on multiple AI devices, including a smart speaker without a display, glasses, a voice recorder, and a wearable pin, targeting late 2026 or early 2027 releases. The company contracts with Apple suppliers and is actively poaching Apple hardware employees.
Why we built the Responses API (5 minute read)
OpenAI argues its new Responses API is part of the inevitable evolution from turn-based chats to persistent agentic reasoning that maintains state across conversation turns. The company claims GPT-5 achieves 5% better performance on TAUBench through preserved reasoning state and reports 40-80% improved cache utilization. It also adds hosted tools so developers don't have to build their own retrieval pipelines from scratch.
How I Use AI (4 minute read)
AI coding requires a mindset shift, emphasizing ownership of AI-generated code and exploiting opportunities for maximum efficiency. Success lies in treating AI coding like management, focusing on creating and leveraging productive gradients where minimal effort yields significant rewards. Junior engineers might have an advantage over seniors in adapting to this new role, as embracing AI demands stepping out of comfort zones and fostering a strong sense of responsibility.
Grok 4 Training Resource Footprint (1 minute read)
Grok 4 is the largest (known) training run to date. Researchers estimate it cost $490 million, required enough electricity to support a 4,000-person town for a year, and had a carbon footprint roughly equivalent to annual emissions from 3 airplanes.
👨‍💻
Engineering & Research
How Salesforce engineering uses Slack for DevOps to AIOps (Sponsor)
Hear directly from the Salesforce engineering and developer teams on how they're using Slack and AI-powered agents to speed up code deployment and get time back by automating routine requests.
See their workflow in action.
Tool Calls Are Expensive And Finite (3 minute read)
Tool calling is many orders of magnitude more costly than calling a plain old function from code. Calling a function many times in a loop is one of the most common ways to solve a problem with code, but that pattern does not carry over to tool calls. Using a tool call to add two numbers once probably doesn't matter, but scaling the problem up to 1,000 numbers will require a long wait and may exceed context window limits. People should design their agentic systems around the limit on how many tool calls their agents can effectively make.
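To make the point above concrete, here is a minimal Python sketch that simulates a tool-call budget. Everything in it is hypothetical and for illustration only: `ToolBudget`, `add_tool`, `sum_tool`, and the idea of a fixed per-task call limit are assumptions, not part of the linked article or of any real agent framework. It contrasts a fine-grained tool that costs one call per addition with a coarse-grained tool that takes the whole list and runs the loop in ordinary code.

```python
from dataclasses import dataclass


@dataclass
class ToolBudget:
    """Counts simulated tool calls against a hypothetical per-task limit."""
    max_calls: int
    calls_made: int = 0

    def charge(self) -> None:
        # Refuse the call once the (assumed) budget has been used up.
        if self.calls_made >= self.max_calls:
            raise RuntimeError("tool-call budget exhausted")
        self.calls_made += 1


def add_tool(a: float, b: float, budget: ToolBudget) -> float:
    """Hypothetical fine-grained tool: one tool call per pair of numbers."""
    budget.charge()
    return a + b


def sum_tool(numbers: list[float], budget: ToolBudget) -> float:
    """Hypothetical coarse-grained tool: one tool call, loop runs in code."""
    budget.charge()
    total = 0.0
    for n in numbers:
        total += n
    return total


def sum_with_pairwise_calls(numbers: list[float], budget: ToolBudget) -> float:
    """Sums a list by spending one simulated tool call per element."""
    total = 0.0
    for n in numbers:
        total = add_tool(total, n, budget)
    return total
```

With an assumed budget of 50 calls and a list of 1,000 numbers, `sum_with_pairwise_calls` runs out of budget partway through, while `sum_tool` finishes after a single call. The design choice this illustrates is to expose tools that accept a batch and keep the loop inside plain code.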
It's been an extremely busy day for team Qwen (4 minute read)
Qwen recently made several announcements on X, including official FP8 quantized versions of its Qwen3-Next models, Qwen3-TTS-Flash, Qwen3-Omni, and Qwen-Image-Edit-2509. Qwen3-Omni is a 30-billion-parameter model that supports text, audio, and video input and text and audio output. More details about the announcement are available in the article.
LMArena has some competition: Scale AI launches Seal Showdown, a new benchmarking tool (4 minute read)
Seal Showdown is a new benchmarking tool that allows users to test various AI models head-to-head and vote on which one performs better. Unlike LMArena, it captures real preferences to more closely reflect how everyday users feel about various models. Rankings are derived from conversations on Scale's Outlier platform, so each user's country, education level, profession, language, and age can be verified. This allows the platform to show which models are most popular according to specific regions, languages, ages, and use cases.
DeepMind AI safety report explores the perils of “misaligned” AI (4 minute read)
DeepMind recently released the third version of its Frontier Safety Framework, which explores how generative AI systems can become threats. The framework details critical capability levels for assessing AI models, which define the points at which a model's behavior becomes dangerous. It also discusses ways for developers to address these risks. Misaligned AIs may ignore human instructions, produce fraudulent outputs, or refuse to stop operating when requested. For now, developers can guard against this by monitoring a model's chain-of-thought output, but future models may hide their reasoning, making it impossible to determine with certainty whether a model is working against the interests of its human operator.