TLDR AI 2026-03-18
GPT-5.4 mini & nano ⚡, Cursor self-summarization 📝, Harmonic’s AI mathematician ♾️
Mistral Unveiled Forge (6 minute read)
Mistral Forge is a platform for enterprises and governments to build custom AI models trained from scratch on their own data. The company positions it as a more controlled alternative to fine-tuning and RAG, with support for domain-specific training, reinforcement learning, and reduced dependence on third-party model providers.
Aristotle Agent (1 minute read)
Aristotle Agent is an autonomous mathematician that can solve and formalize the world's most challenging mathematical research problems. It is fully agentic and can produce repo-quality code. It can work on proofs and formalizations autonomously for up to 24 hours without human intervention. It is now live on web, CLI, and API, and is currently free of charge.
GPT‑5.4 Mini and Nano (4 minute read)
OpenAI released GPT‑5.4 mini and nano, smaller models designed for high‑volume workloads with faster speeds and lower cost. GPT‑5.4 mini improves substantially over GPT‑5 mini and approaches the larger GPT‑5.4 model on some benchmarks, while GPT‑5.4 nano targets lightweight tasks like classification, extraction, and ranking.
How to Stop Your Autoresearch Loop from Cheating (4 minute read)
Experiments with the autoresearch framework show that environment design and strict validation gates are more critical than model choice for preventing agent drift. While independent models discovered identical optimizations in structured landscapes, the primary bottlenecks remain infrastructure failures and GPU costs from rejected proposals.
Building Claude Code: How We Use Skills (4 minute read)
Anthropic's internal framework treats AI "skills" as functional folders containing scripts and assets rather than static text, using the file system for context engineering. Nine core categories emerged, with product verification and "Gotchas" sections identified as the highest-leverage components for improving output reliability. This shift toward progressive disclosure allows agents to fetch specific data and runbooks only when needed, reducing context noise and error rates.
👨‍💻
Engineering & Research
AI Security Best Practices by Datadog (Sponsor)
Learn effective ways to secure:
- Components that host and run AI applications
- Software and data used by AI applications
- Entry points and business logic that enable a user to interact with AI
Download the guide ↗️
Introducing Unsloth Studio (7 minute read)
Unsloth Studio is a no-code web UI for training, running, and exporting open models. It lets users run GGUF and safetensors models locally on Mac, Windows, and Linux, and run and train text, vision, TTS audio, and embedding models. The studio can auto-create datasets from PDF, CSV, JSON, DOCX, and TXT files. A video tutorial on how to get started with Unsloth Studio is available.
Mixture‑of‑Depths Attention (GitHub Repo)
MoDA introduces a new attention mechanism that lets each head access both current‑layer and earlier‑layer key‑value pairs, helping preserve useful signals as models scale deeper.
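As a rough illustration of the idea (not the repository's implementation), the sketch below has a single query attend over key-value pairs from the current layer and from earlier layers in one joint softmax. The function name `depth_aware_attention`, the plain-list data layout, and the choice to simply concatenate the two sets of pairs are all assumptions made for this example.

```python
import math


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def depth_aware_attention(query, current_kv, earlier_kv):
    """Hypothetical single-head sketch.

    query: list of floats.
    current_kv / earlier_kv: lists of (key, value) pairs, each a list of floats.
    Returns (output_vector, attention_weights).
    """
    # Assumption: current-layer and earlier-layer pairs share one softmax.
    pairs = list(current_kv) + list(earlier_kv)
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale
        for key, _ in pairs
    ]
    weights = softmax(scores)
    dim = len(pairs[0][1])
    output = [
        sum(w * value[i] for w, (_, value) in zip(weights, pairs))
        for i in range(dim)
    ]
    return output, weights


if __name__ == "__main__":
    q = [1.0, 0.0]
    current = [([1.0, 0.0], [1.0, 0.0])]
    earlier = [([1.0, 0.0], [0.0, 1.0])]
    print(depth_aware_attention(q, current, earlier))
```

In this toy setup the earlier-layer value contributes to the output whenever its key matches the query, which is the "preserve useful signals from earlier layers" behavior the summary describes.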
Cursor Trains Models to Self‑Summarize Context (9 minute read)
Cursor described how its Composer model learns to summarize its own context during long coding sessions, compressing earlier steps into shorter representations to extend effective working memory. The trained behavior improves performance on multi‑step programming tasks while keeping token usage manageable.
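The general pattern of compressing earlier steps can be sketched outside the model as a simple compaction loop. This is only an illustration under stated assumptions: Cursor trains the summarizing behavior into the model itself, whereas here `compact_history`, `count_tokens`, and `naive_summarize` are hypothetical helpers, and the word count stands in for a real tokenizer.

```python
def count_tokens(text):
    # Crude stand-in: whitespace-separated words, not a real tokenizer.
    return len(text.split())


def naive_summarize(steps):
    # Placeholder summarizer: keeps the first three words of each step.
    # A real system would call a model here.
    return "SUMMARY: " + "; ".join(" ".join(s.split()[:3]) for s in steps)


def compact_history(steps, budget, summarize=naive_summarize, keep_recent=2):
    """Replace older steps with one summary once the history exceeds budget.

    steps: list of strings, oldest first.
    budget: maximum total token count before compaction triggers.
    keep_recent: number of most recent steps kept verbatim.
    """
    total = sum(count_tokens(s) for s in steps)
    split = len(steps) - keep_recent
    if total <= budget or split <= 0:
        return list(steps)
    older, recent = steps[:split], steps[split:]
    return [summarize(older)] + list(recent)


if __name__ == "__main__":
    history = [
        "read file a.py and list functions",
        "ran tests and saw two failures",
        "patched parser",
        "reran tests",
    ]
    for line in compact_history(history, budget=8):
        print(line)
```

Keeping the most recent steps verbatim while summarizing older ones is one common design choice; the budget and the number of steps to keep are tunable assumptions here.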
Measuring progress toward AGI: A cognitive framework (3 minute read)
Google DeepMind released a paper outlining a cognitive taxonomy to measure AI progress toward AGI, identifying 10 key cognitive abilities like perception, learning, and reasoning. It proposes a three-stage evaluation protocol comparing AI performance to human benchmarks. A Kaggle hackathon, with a $200,000 prize pool, invites researchers to develop evaluations for five under-assessed abilities, using a new Community Benchmarks platform.
Microsoft Seeks More Coherence in AI Efforts With Copilot Reorganization (4 minute read)
Microsoft is reorganizing the teams that work on its flagship Copilot AI product. It is unifying the teams that work on its Microsoft 365 Copilot productivity offerings and the consumer version of Copilot. Jacob Andreou, who leads product and growth for Microsoft AI, will become the executive vice president of Copilot and will be in charge of its design, product, growth, and engineering. Mustafa Suleyman, Microsoft AI's chief executive, will focus primarily on the company's proprietary models and on achieving superintelligence. Microsoft says the new setup will enable it to deliver a more coherent and competitive experience.
Nvidia Says It Is Restarting Production of AI Chips for Sale in China (3 minute read)
Nvidia has restarted the manufacture of H200 processors for sale in China. In December, the US announced that it would allow Nvidia to sell its H200 processor in China, as long as 25% of sales were shared with the US government. Nvidia CEO Jensen Huang said at the company's GTC event on Tuesday that demand signals out of China have strengthened in recent weeks and that the company's supply chain is getting fired up. The company has not commented on how much it expects to earn from H200 sales in China, but the Chinese market is estimated to be worth tens of billions of dollars a year.