TLDR AI 2025-12-19
OpenAI eyes $830B 💰, GPT-5.2-Codex 💻, Anthropic opens Agent Skills 🤖
Agentic AI isn't just genAI with extra steps (Sponsor)
The buzz around agentic AI has reached a fever pitch. But outside of coding, most implementations are "agent-washing": a chain of prompts or a normal workflow with an LLM in the loop.
Algolia's ebook breaks down what agentic AI actually is, how it differs from gen AI, and where it's headed:
Read why Gartner predicts a third of enterprise apps will be agentic-enabled by 2028
Discover how agentic AI enhances search by parsing natural language queries and personalizing results (e.g., answering queries like "sunglasses from Jack Black's latest movie")
Learn what good agentic AI looks like, and what's just vaporware
Get the ebook
Introducing GPT-5.2-Codex (5 minute read)
OpenAI's new agentic coding model is state-of-the-art on SWE-Bench Pro and Terminal-Bench 2.0 with improved long-horizon work. OpenAI is launching a trusted access pilot to give vetted cybersecurity professionals access to future, more powerful models.
Meta Is Developing New AI Image and Video Model Code-Named 'Mango' (4 minute read)
Meta is developing a new image and video-focused AI model code-named Mango. The model is expected to be released in the first half of 2026. Image generation has proven to be a vital front in the war between the big AI model companies. It is a primary point of interest for many users and a particularly sticky feature that keeps them coming back.
OpenAI's New Fundraising Round Could Value Startup at as Much as $830 Billion (4 minute read)
OpenAI is in the early stages of a fundraising round that could raise as much as $100 billion, valuing the startup at as much as $830 billion. It aims to complete the round by the end of the first quarter at the earliest. It is unclear whether there will be sufficient investor demand to reach the goal. The round will be one of the biggest tests OpenAI has faced since the public market's exuberance for AI spending waned.
🧠
Deep Dives & Analysis
John Schulman on dead ends, scaling RL, and building research institutions (51 minute video)
John Schulman estimates that with full hindsight, a few talented people could have built a ChatGPT-3.5-level model in 2018-2019 with a couple of GPU boxes. He described early OpenAI as a "rag tag" blend of small exploratory research projects and bigger engineering efforts inspired by DeepMind's AlphaGo. He expects value functions and offline RL to make a comeback, and warns that catch-up mode makes it harder to build exploratory research culture later.
AI #147: Flash Forward (93 minute read)
GPT-5.2 is a frontier model only for the frontier. OpenAI's image 1.5 looks comparable to Nano Banana Pro, and the startup has a deal for Disney's characters. Gemini 3 Flash is a good model for its speed and price. It captures the bulk of Gemini 3 Pro's intelligence quickly at a low price.
Evaluating Chain-of-Thought Monitorability (15 minute read)
OpenAI has proposed a new evaluation suite to measure how reliably model reasoning can be monitored via chains-of-thought. The study assessed monitorability across 24 environments and found that reasoning transparency can vary significantly with scale, reinforcement learning, and inference-time compute.
👨‍💻
Engineering & Research
Introducing Mistral OCR 3 (8 minute read)
Mistral OCR 3 is designed to extract text and embedded images from a wide range of documents with exceptional fidelity. It is a major upgrade over the previous version in forms, handwritten content, low-quality scans, and tables. Mistral OCR 3 enables downstream systems to understand structure as well as document content. The model can be integrated via API and used with Document AI, a UI that parses documents into text or structured JSON.
Inside Replit's Snapshot Engine: The Tech Making AI Agents Safe (9 minute read)
Replit built a compute and storage fabric that allows it to make changes in an isolated, reversible way. These primitives enable developers to experiment more frequently and faster. The company realized the same primitives could be used to superpower coding agents when it built Replit Agent in 2024. The system helps the human driving the agent, and the agent itself greatly benefits from the tools. This post explores the underlying systems that make the Replit Agent safe and how Replit uses them. It also takes a peek at Replit's near-term roadmap.
Agent Skills Becomes an Open Standard (2 minute read)
Agent Skills, folders of instructions, scripts, and resources that give AI agents new capabilities on demand, originated at Anthropic (which also created MCP) and is now an open format with adoption from Cursor, GitHub, VS Code, Claude Code, and OpenAI's Codex CLI. Skills let teams package domain expertise and workflows into portable, version-controlled packages that work across different agent products.
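For a rough sense of what "folders of instructions" can mean in practice, here is a minimal sketch of loading one skill. It assumes, beyond what the blurb says, that a skill folder holds a `SKILL.md` file whose front matter carries `name` and `description` fields. The `release-notes` skill and the `read_skill` helper are hypothetical, and the parser handles only flat `key: value` lines.

```python
import tempfile
from pathlib import Path

# Hypothetical skill file: front matter (assumed fields) followed by instructions.
SKILL_MD = """---
name: release-notes
description: Drafts release notes from merged pull requests.
---
# Release notes

1. Collect merged PR titles.
2. Group them by label.
"""


def read_skill(folder):
    """Split SKILL.md into (metadata dict, instruction body).

    Handles only flat `key: value` front matter; a real loader would use a YAML parser.
    """
    text = (Path(folder) / "SKILL.md").read_text(encoding="utf-8")
    if not text.startswith("---\n"):
        raise ValueError("SKILL.md is missing front matter")
    header, _, body = text[4:].partition("\n---\n")
    meta = {}
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, body


# Write the example skill into a throwaway folder, then load it back.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "release-notes"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(SKILL_MD, encoding="utf-8")
    meta, body = read_skill(skill_dir)

print(meta["name"], "-", meta["description"])
```

Because a skill in this sketch is just a folder of text files, it can live in version control and be copied between agent products, which is the portability the announcement highlights.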
Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains (15 minute read)
Scale AI researchers developed a structured approach to reinforcement learning that uses checklist-style rubrics instead of traditional preference rankings to train language models on subjective tasks. The framework achieved up to 28% improvement on medical reasoning benchmarks by decomposing response quality into interpretable criteria like factual accuracy and completeness.
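To make the checklist idea concrete, here is a minimal sketch of turning rubric verdicts into a scalar reward. It is not the paper's exact formulation: the weighted-fraction aggregation, the criterion names, and the weights are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float


def rubric_reward(criteria, judgments):
    """Weighted fraction of rubric criteria judged as satisfied, in [0, 1]."""
    total = sum(c.weight for c in criteria)
    if total <= 0:
        raise ValueError("rubric needs positive total weight")
    # A criterion missing from the judgments counts as not satisfied.
    earned = sum(c.weight for c in criteria if judgments.get(c.name, False))
    return earned / total


# Hypothetical rubric for a medical-reasoning answer.
RUBRIC = [
    Criterion("factual_accuracy", 3.0),
    Criterion("completeness", 2.0),
    Criterion("clear_next_steps", 1.0),
]

# In practice a judge (human or LLM) would produce these verdicts per response.
judgments = {"factual_accuracy": True, "completeness": False, "clear_next_steps": True}
reward = rubric_reward(RUBRIC, judgments)
print(round(reward, 3))
```

Unlike a single preference ranking, each term in the sum can be inspected, so it is visible which criterion cost a response its reward.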
The Signature Flicker (4 minute read)
Anthropic has fixed Claude Code's signature flicker. Terminals weren't really designed for interactivity. Repositioning the cursor and writing over existing text easily leads to flickering if not done well. Anthropic chose to re-render only the changed parts. It rewrote the renderer from scratch while still keeping React as the component model.
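The "re-render only the changed parts" idea can be sketched as a line-level diff between two frames. This is not Anthropic's renderer, only an illustration under simple assumptions: frames are lists of lines, and the hypothetical `render_diff` helper emits standard ANSI escape sequences for the rows that differ.

```python
def render_diff(prev_lines, next_lines):
    """Return ANSI escape output that rewrites only the rows that changed."""
    out = []
    for row in range(max(len(prev_lines), len(next_lines))):
        old = prev_lines[row] if row < len(prev_lines) else None
        new = next_lines[row] if row < len(next_lines) else ""
        if old == new:
            continue  # unchanged row: emit nothing, so it cannot flicker
        # CSI row;1H moves the cursor (1-based); CSI 2K erases that whole line.
        out.append(f"\x1b[{row + 1};1H\x1b[2K{new}")
    return "".join(out)


frame_a = ["Claude Code", "thinking.", "> "]
frame_b = ["Claude Code", "thinking..", "> "]
patch = render_diff(frame_a, frame_b)
print(repr(patch))
```

Here only the second row is rewritten. Clearing and redrawing the whole screen on every update is what tends to produce visible flicker.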
Nvidia and Alphabet VC arms back vibe coding startup Lovable at $6.6 billion valuation (3 minute read)
Alphabet and Nvidia have invested in Swedish vibe coding startup Lovable in a $330 million Series B round that places it at a $6.6 billion valuation. Lovable has raised over $500 million this year. The company has built a product popular with both enterprises and founders. Its platform uses AI models to help users build apps and websites using text prompts.
2026 vibe coding tool comparison (20 minute read)
Replit is the most feature-rich, well-thought-out, and powerful vibe coding tool, but v0 is the best if you're already a developer and want a technical interface.
Project Vend: Phase two (8 minute read)
Anthropic's AI shopkeeper experiment finally became profitable after upgrading to Sonnet 4.5, adding a CRM, and hiring an AI CEO.
Meta's Yann LeCun targets $3.5 billion valuation for new AI startup, FT reports (1 minute read)
Alexandre LeBrun, the founder of French health tech startup Nabla, will be the chief executive of LeCun's new startup.
Contra DSPy and GEPA (15 minute read)
Trying to treat LLM workflows as modular programs is backwards, rigid, and the wrong fit for the most interesting tasks.
Superpowers 4 (4 minute read)
Superpowers 4 has better subagent-driven development.