TLDR AI 2026-01-20
Claude knowledge bases 📝, Google Stitch upgrades 🧵, Anthropic’s character research 🎭
Stop reading transcripts. Start listening with ELMs (Sponsor)
Voice is multi-dimensional. Tone and timing can change the meaning of a conversation and transcript-reliant LLMs miss it all.
So meet Ensemble Listening Models (ELMs), a new AI architecture built by Modulate, that orchestrates 100+ voice-specific models to better understand conversations. Modulate's ELM, Velma, is now publicly available so you can harness human-level, real-time voice intelligence.
4 benefits of using Velma:
- Decode intent, emotion, stress, and authenticity in messy, multilingual audio.
- 100x faster, cheaper, and more accurate than LLMs.
- Traceable outputs with an explainable path. No black box, just evidence you can trust.
- Placed #1 on benchmarks for contextual understanding.
See first-hand the power of ELMs with the Velma Preview
Anthropic works on Knowledge Bases for Claude Cowork (3 minute read)
Knowledge bases are persistent repositories that Claude can reference for relevant context and incrementally update with new information. They allow Claude to segment information into multiple, user-managed knowledge bases. Users will likely be able to select specific knowledge bases as context attachments when working on tasks in Cowork. The feature could be particularly relevant for workflows that involve automation and file management.
Google Stitch to add API keys and PRD generation for PMs (2 minute read)
Google Stitch will soon allow API key management and automatic PRD generation. The update aims to enhance design workflows by integrating Gemini models for high-resolution outputs and simplifying project documentation. This positions Stitch as a vital tool in Google's AI-driven development strategy, enhancing collaboration within product teams.
Slonk: Character.ai internal system (4 minute read)
Character.ai uses an internal system called "Slonk" to manage GPU research clusters. Slonk merges the flexibility of Kubernetes with the productivity of HPC environments to optimize ML research workflows.
Training CLI Agents with Synthetic Data and RL (12 minute read)
Nvidia showcases how a reasoning model with no prior knowledge can be taught to safely use the LangGraph CLI for complex tasks like spinning up servers and building Docker containers, using synthetic data and Reinforcement Learning with Verifiable Rewards.
The assistant axis: situating and stabilizing the character of large language models (20 minute read)
Two components are important to shaping model character: persona construction and persona stabilization. Without the proper construction, assistants can easily inherit counterproductive associations from the wrong sources. Even when assistants are well-constructed, they easily drift away from their roles, which makes stabilizing and preserving models' personas particularly important. Ensuring models are aligned will become increasingly important as they become more capable and are deployed in increasingly sensitive environments.
👨‍💻
Engineering & Research
Got questions about the OWASP Top 10 for Agentic Applications? (Sponsor)
Here's where you get answers:
Zenity is hosting a live AMA on Wed. 1/28 with the security researchers (Chris Hughes, Steve Wilson, Michael Bargury, and Kayla Underkoffler) who helped write the industry's first peer-reviewed risk framework for autonomous AI agents -- including contributors who led entries on goal hijacking, tool misuse, and memory poisoning. Bring your hardest questions about securing agents in production.
Register now →
Natural Language to Solver-Ready Code (4 minute read)
Microsoft's OptiMind translates natural language optimization problems into mathematical formulations. It was designed to streamline the traditionally slow and expert-heavy modeling step in optimization workflows.
skypilot (GitHub Repo)
skypilot is a unified open-source framework for running, managing, and scaling AI/ML workloads across clouds, Kubernetes, and on-prem infrastructure with a simple interface and intelligent scheduling.
Meta's 3D Shape Generation (GitHub Repo)
ShapeR reconstructs full-scene 3D meshes from image sequences by extracting multimodal features and feeding them into a flow transformer that generates object-specific shape codes.
Kaggle Community Benchmarks (4 minute read)
Kaggle has launched Community Benchmarks, a feature that allows users to collaboratively build and share custom model evaluations that go beyond static accuracy metrics to better reflect real-world AI performance.
AI Engineering Has a Runtime Problem (6 minute read)
AI teams struggle to deploy agents because no standardized runtime exists to handle state, streaming, isolation, recovery, and scaling across sessions and users. Existing frameworks, observability, and eval tools only cover building and testing agents, leaving each team to reinvent complex infrastructure from scratch. A proper runtime would serve agents as APIs, persist state, manage streams, recover from failures, and scale safely, turning notebooks into production-ready AI applications.
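To make the "persist state, recover from failures" part concrete, here is a minimal sketch of what the smallest piece of such a runtime might look like. It is not taken from the article: `SessionStore`, `run_turn`, and `echo_agent` are hypothetical names, the file-per-session JSON layout is an assumption, and streaming, isolation, and scaling are left out entirely.

```python
import json
import tempfile
from pathlib import Path


class SessionStore:
    """Hypothetical file-backed store: one JSON file per session (an assumption)."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def load(self, session_id):
        path = self.root / f"{session_id}.json"
        if path.exists():
            return json.loads(path.read_text())
        # Unknown session: start from an empty state.
        return {"step": 0, "history": []}

    def save(self, session_id, state):
        path = self.root / f"{session_id}.json"
        tmp = path.with_suffix(".tmp")
        tmp.write_text(json.dumps(state))
        # Rename over the old file so a crash mid-write does not
        # leave a truncated state file behind.
        tmp.replace(path)


def run_turn(store, session_id, user_input, agent_fn):
    """Load the session, run one agent step, and checkpoint before replying."""
    state = store.load(session_id)
    reply = agent_fn(state, user_input)
    state["history"].append({"user": user_input, "agent": reply})
    state["step"] += 1
    store.save(session_id, state)
    return reply


def echo_agent(state, user_input):
    """Stand-in for a real agent; it only echoes the input with a turn number."""
    return f"turn {state['step'] + 1}: {user_input}"


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    print(run_turn(SessionStore(root), "demo", "hello", echo_agent))
    # A fresh store object simulates a process restart; the session resumes.
    print(run_turn(SessionStore(root), "demo", "still there?", echo_agent))
```

Because every turn is checkpointed before the reply is returned, a restarted process picks up at the next turn instead of starting over. A real runtime would also need the streaming, isolation, and scaling pieces the article lists.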