TLDR AI 2026-01-26
Apple’s Gemini Siri 🤖, OpenAI merchant tools 🛒, inside Codex 👨💻
Stop reading transcripts. Start listening with ELMs (Sponsor)
Voice is multi-dimensional. Tone and timing can change the meaning of a conversation and transcript-reliant LLMs miss it all.
So meet Ensemble Listening Models (ELMs), a new AI architecture built by Modulate, that orchestrates 100+ voice-specific models to better understand conversations. Modulate's ELM, Velma, is now publicly available so you can harness human-level, real-time voice intelligence.
4 benefits of using Velma:
- Decode intent, emotion, stress, and authenticity in messy, multilingual audio.
- 100x faster, cheaper, and more accurate than LLMs.
- Traceable outputs with an explainable path. No black box, just evidence you can trust.
- Placed #1 on benchmarks for contextual understanding.
See first-hand the power of ELMs with the Velma Preview
OpenAI to add shopping cart and merchant tools to ChatGPT (2 minute read)
OpenAI is adding a dedicated shopping cart section to ChatGPT. It will serve as a centralized space for reviewing selected items and finalizing purchases. The company is also adding support for personal responses in temporary chat sessions to balance privacy with customized assistance. There is no clear timeline for public availability of these features.
Apple will reportedly unveil its Gemini-powered Siri assistant in February (1 minute read)
Apple plans to unveil a Gemini-powered Siri assistant in February, leveraging Google's AI models to enhance task completion using user data. An even more advanced version that offers more conversational capabilities akin to ChatGPT will be revealed at the Worldwide Developers Conference in June. This development marks a shift in Apple's AI strategy following key leadership changes and its partnership with Google.
Lessons from Building AI Agents for Financial Services (23 minute read)
Building AI agents for financial services requires rigorous technical infrastructure and data normalization to avoid costly errors. Key insights include the essentiality of sandbox environments for secure multi-step workflows and the transformation of raw financial data into structured, searchable context through markdown and metadata. Skills systems, using markdown instead of code, allow dynamic, user-specific instructions for models, which is crucial as AI models rapidly improve and reduce the need for complex scaffolding.
Unrolling the Codex agent loop (17 minute read)
Codex CLI is a cross-platform local software agent designed to produce high-quality, reliable software changes while operating safely and efficiently on users' machines. This is the first part in a series of posts that explores how Codex works. It focuses on the agent loop, which is the core logic responsible for orchestrating the interaction between the user, the model, and the tools the model invokes to perform work. The post provides a view into the role the Codex harness plays in making use of a large language model.
👨💻
Engineering & Research
SerpApi: Feed live Google search results to your AI (Sponsor)
Challenges and Research Directions for Large Language Model Inference Hardware (1 minute read)
Large language model inference is hard. The primary challenges are memory and interconnect rather than compute. This paper highlights four architecture research opportunities: High Bandwidth Flash for 10X memory capacity with HBM-like bandwidth, Processing-Near-Memory and 3D memory-logic stacking for high memory bandwidth, and low-latency interconnect to speedup communication. The paper focuses on datacenter AI, but it also reviews the architectures' applicability for mobile devices.
Reinforcement Learning at Test Time (GitHub Repo)
TTT-Discover applies reinforcement learning during inference, allowing LLMs to adapt to each task on the fly. It sets new performance benchmarks across math, biology, algorithms, and GPU kernels.
Diffusion-Based Code Modeling (GitHub Repo)
Stable-DiffCoder introduces block diffusion continual pretraining for code LLMs, showing performance gains over autoregressive models across several programming benchmarks, especially for editing, reasoning, and low-resource languages.
I Spent 40 Hours Researching Clawdbot. Here's Everything They're Not Telling You (8 minute read)
Clawdbot is an autonomous AI agent that runs locally and can execute real actions, not just generate text. Some features work immediately out of the box, like file management, basic research, and document processing, while advanced automations require hours or days of setup.
The Duelling Rhetoric at the AI Frontier (7 minute read)
One CEO of a frontier AI lab might say AI will replace all software engineers within the next few months, while another will say that current AI systems are nowhere near human-level intelligence. Neither is probably lying, they're just making statements that benefit their businesses. AI executives aren't neural experts. The truth is usually somewhere in the middle.
Broadcom made a mockumentary about mainframes. It's actually funny (Sponsor)
Big Iron Bits is a 12-episode series of shorts following a CIO trying to kill the mainframe - only to learn why that's a terrible idea. Office hijinks, myth-busting, and surprising clarity.
Start watchingVideo Arena on the Web (1 minute read)
Video Arena has expanded from Discord to a public web interface, allowing wider access to evaluate 15 top video generation models.
The Possessed Machines (104 minute read)
Certain psychological and social dynamics emerge when a small group of people convince themselves they have discovered a truth so important that normal ethical constraints no longer apply to them.
If NotebookLM was a web browser (10 minute read)
FolioLM extends NotebookLM's capabilities directly into the browser, allowing users to query and transform web content accessed via tabs, bookmarks, and history.
Introducing Claude Chic (7 minute read)
Claude Chic is an alternative to Claude Code that visually organizes the message stream for legibility, organizes concurrent work trees, runs many sessions from the same window, and contains loads of quality-of-life features.
AIs Are Getting Better at Finding and Exploiting Internet Vulnerabilities (6 minute read)
Recent evaluations show models like Claude Sonnet 4.5 can perform multistage penetration attacks using only standard open‑source tools, lowering the barrier to autonomous exploitation.
Get the most interesting AI stories and breakthroughs delivered in a free daily email.
Join 920,000 readers for
one daily email