TLDR AI 2025-08-28
GPT-5 Codex 💻, Nvidia record sales 📈, a16z's top AI apps 🤖
New Codex Features Powered by GPT-5 (1 minute read)
Codex has rolled out updates, including a new extension for editors like Cursor and VSCode, an enhanced CLI for local use, and seamless management of local and cloud tasks. GitHub code reviews can now be driven by Codex, all integrated into existing ChatGPT plans and backed by GPT-5.
The Top 100 [Gen AI] Consumer Apps (10 minute read)
The Gen AI ecosystem is starting to stabilize. There were only 11 new names on Andreessen Horowitz's most recent Top 100 Gen AI Consumer Apps list, compared to 17 newcomers in the March ranking. There were significantly more mobile newcomers as app stores have cracked down on ChatGPT copycats, opening up room for more original apps. ChatGPT is still the leading general assistant, but Google, Grok, and Meta are closing the gap. More on the top Gen AI consumer apps is available in the article.
Nvidia reports record sales as the AI boom continues (2 minute read)
Nvidia, the world's most valuable company, reported $46.7 billion in revenue this quarter, a 56% increase compared to the same period last year. Its AI-dominated data center business largely fuelled that growth.
🧠
Deep Dives & Analysis
Malleable Software Will Eat the SaaS World (4 minute read)
The winners in the AI era will be the tools that adapt to users. Large language models shift the focus from designing the solution to defining the problem. They can handle the 'how' aspects of the problem, so you just need to describe what you want in plain language. The future belongs to software that bends without breaking.
Building Agents for Small Language Models: A Deep Dive into Lightweight AI (18 minute read)
Small language models range from 270M to 32B parameters and run efficiently on CPUs or modest GPUs. They offer immense potential: privacy through local deployment, predictable costs, and full control thanks to open weights. They also present unique challenges that require a shift in how agent architectures are designed. This article reviews lessons learned from hands-on experimentation, debugging, and optimization of inference pipelines for small language models.
👨‍💻
Engineering & Research
How Cloudflare runs more AI models on fewer GPUs: A technical deep-dive (8 minute read)
Omni is an internal platform at Cloudflare built for running and managing AI models on its edge nodes. It can spawn and manage multiple models on a single machine and GPU using lightweight isolation. Omni makes it easy and efficient to run many small and/or low-volume models. This article looks at how Cloudflare uses Omni to run more models on every node in its network, improving model availability, minimizing latency, and reducing power consumed by idle GPUs.
Stop "vibe testing" your LLMs. It's time for real evals (4 minute read)
Stax is an experimental developer tool designed to streamline the LLM evaluation lifecycle. Evals give developers clear metrics to help understand what's actually better, so developers don't have to waste hours 'vibe testing' every time they try a new model or tweak a prompt. Stax empowers developers to rigorously test their AI stacks to help make data-driven decisions. It allows developers to easily define their own criteria and build a custom autorater.
Environments Hub: A Community Hub To Scale RL To Open AGI (5 minute read)
A new open-source platform is providing dozens of reinforcement learning environments to support "open AGI" and inviting contributors to add more.