TLDR AI 2025-04-10
Google Ironwood TPU, Claude Max, Grok API
Grok API (4 minute read)
Grok 3 models are now available through an API, including multiple reasoning modes. The linked page includes pricing, which seems reasonable given the assumed size of the model.
Google Ironwood TPU (7 minute read)
With improvements in pod size, bandwidth, memory, and overall FLOPS, the seventh generation of Google's TPUs is a step-change improvement over any chip on the market.
Claude's Max Plan (2 minute read)
The new Max plan for Claude offers expanded usage with priority access to features and models, supporting users with demanding projects. It provides two levels: Expanded Usage for $100/month with 5x usage and Maximum Flexibility for $200/month with 20x usage. The plan is ideal for those who frequently work with substantial documents and complex data.
Research & Innovation
Omni SVG (9 minute read)
By treating SVGs as a foreign language, a pretrained Qwen model can generate novel SVGs from text and images. It is easily the state-of-the-art model for this task. An open release is coming soon.
AI Scientist v2 (PDF)
A research paper that was fully generated, executed, and written by Sakana AI's language model system was accepted to an ICLR workshop. The team improved the system by using VLMs, more general-purpose search, and more.
OLMoTrace (17 minute read)
There is a debate in the language modeling world about how much a model actually learns and how much it just memorizes. This feature, in the AI2 Playground, searches through billions of input documents, in real time, to figure out if the output of a model is just regurgitated or if it is novel. It includes the source of the utterance from multiple documents.
Engineering & Research
Efficient MoE Inference (GitHub Repo)
HybriMoE is a new framework for hybrid CPU-GPU inference on Mixture of Experts models that tackles instability and overhead with smarter scheduling and caching strategies.
Dynamic Knowledge Circuits (GitHub Repo)
This research explores how LLMs structurally internalize new knowledge by analyzing computational subgraphs. It reveals patterns in knowledge acquisition, optimization phases, and potential strategies for improving continual pre-training.
Protein Backbone Generation (GitHub Repo)
ReQFlow sets a new benchmark in protein backbone generation, delivering state-of-the-art results while being significantly faster than existing models. It is 37x faster than RFDiffusion and 62x faster than Genie2 for sequences of length 300.
Generative Media Upgrades in Vertex AI (1 minute read)
Google Cloud has added enhanced generative tools to Vertex AI, including new capabilities in text-to-music (Lyria), video editing (Veo 2), voice customization (Chirp 3), and image inpainting (Imagen 3).
Cogito v1 Preview: Introducing IDA as a path to general superintelligence (22 minute read)
The Cogito team has released open-source LLMs ranging from 3B to 70B parameters, all of which outperform top open models of similar sizes, with the 70B model surpassing Llama 4 109B MoE. These models are trained using Iterated Distillation and Amplification (IDA), support both direct and reflective answering, and are available on Hugging Face, Ollama, Fireworks AI, and Together AI.
Introducing Firebase Studio (5 minute read)
Firebase Studio is launching as a cloud-based development environment for building, testing, and deploying AI applications. It combines tools like Project IDX, Genkit, and Gemini within a unified platform, offering rapid AI prototyping and a customizable coding workspace. Developers can leverage features such as natural language prototyping, instant device previews, and easy collaboration to accelerate app development.