TLDR AI 2025-12-10
OpenAI Image-2 🖼️, Devstral 2 💻, no data-generating distribution 📊
Observability for agentic AI and LLMs: 6 recommendations (Sponsor)
Agentic AI and GenAI are powerful but unpredictable. The problem is not just hallucination: these systems regularly take entirely new paths through established workflows.
This Dynatrace report lays out six pragmatic observability recommendations for practitioners managing agentic AI and GenAI workloads. Learn to look beyond monitoring, spot escalating costs, and catch critical issues early.
Want to check it out firsthand? Experiment with AI observability tools in the Dynatrace Playground, where you can explore sample data without installing any software.
GLM-4.6V: Open Source Multimodal Models with Native Tool Use (7 minute read)
GLM-4.6V (106B), a foundation model designed for cloud and high-performance cluster scenarios, and GLM-4.6V-Flash (9B), a lightweight model optimized for local deployment and low-latency applications, have been open-sourced. GLM-4.6V scales its context window to 128k tokens in training. It achieves state-of-the-art performance in visual understanding and reasoning among models of similar parameter scale. The model has native function calling capabilities.
Introducing Devstral 2 and Mistral Vibe CLI (5 minute read)
Devstral 2 hits 72.2% on SWE-bench Verified with 123B parameters, making it one of the best open-weight coding models despite being a fraction of the size of its peers. The 24B Devstral Small 2 scores 68.0% and runs on consumer hardware. Mistral also launched Vibe CLI, an open-source terminal agent that orchestrates multi-file changes across codebases.
OpenAI testing new Image-2 models on LM Arena (2 minute read)
OpenAI is preparing to roll out its next generation of image generation models along with GPT-5.2. Two new models have appeared on online evaluation platforms. Early testers note a substantial increase in image detail and fidelity. The upgrade brings OpenAI's image generation closer to the standard set by Google Nano Banana 2. Examples of images generated with the models are available in the article.
2025: The State of Generative AI in the Enterprise (32 minute read)
The shift to enterprise AI is no longer speculative. Enterprise AI is now a $37 billion market. It is the fastest-scaling category in software history. AI is becoming the core of how work gets done across industries, and enterprises, seeing real returns, are doubling down.
How People Use AI at Work (12 minute read)
Professionals use AI for grunt work and first drafts, but its output is never trusted without supervision. Artists use AI to handle admin work so they have more time for creative work. Researchers find that the time required to fact-check AI output often negates the efficiency gains. The primary barrier to AI adoption is hallucination.
How People Use AI Agents (7 minute read)
A large-scale usage study from Perplexity and Harvard reveals that over half of AI agent tasks involve deep cognitive work like research and productivity, challenging the stereotype of agents as simple task-doers.
There is no data-generating distribution (8 minute read)
The data-generating distribution is a convenient mnemonic crutch for machine learning engineers. However, it isn't necessary for understanding machine learning. The same methods are used regardless of whether the world is producing randomness. We need to figure out when we actually need statistical models of data.
MCP Donated to Agentic AI Foundation (4 minute read)
Anthropic has donated the Model Context Protocol (MCP) to the new Agentic AI Foundation under the Linux Foundation. MCP has become a widely adopted interoperability standard across AI platforms and infrastructure. It is supported by major industry players like AWS, Google, and Microsoft.
Anthropic and Accenture Expand Enterprise AI Partnership (6 minute read)
Anthropic and Accenture launched a dedicated business group to accelerate enterprise AI deployment using Claude. With 30,000 professionals trained on Claude and solutions targeting regulated industries, the partnership aims to move companies from pilot projects to full-scale production.