TLDR AI 2025-06-05
ChatGPT record mode ⏺️, Cursor 1.0 1️⃣, Mistral Code 💻
Don't let auth and SSO block your AI app's growth (Sponsor)
Between model updates, growth targets, and new competitors popping up by the day, you've got more than enough on your plate. Do you really want to pile SSO and SCIM, self-service admin UI, fine-grained authorization, and MFA on top of that?
With Descope, making your app enterprise ready is as simple as drag & drop. No/low code SSO, self-service SSO setup, embeddable UI widgets, and flexible tenant management help you move upmarket while keeping your devs focused.
Databricks, You.com, WisdomAI, and leading AI companies trust Descope with B2B auth. Make your app enterprise-ready in 15 minutes
ChatGPT Can Now Read Your Google Drive and Dropbox (2 minute read)
OpenAI added “record mode” for meeting notes and new integrations for Team, Enterprise, and Edu users. The company now has 3 million paying business users, up from 2 million in February.
Cursor Releases Version 1.0 (2 minute read)
The AI code editor now includes BugBot for automatic PR review, Background Agent availability for all users, agent support for Jupyter Notebooks, project-level memories, OAuth-enabled MCP server installation, and in-chat rendering of Mermaid diagrams and markdown tables.
Mistral Code, a Vibe-coding Client (2 minute read)
The product combines the Devstral and Codestral models into an IDE assistant. Mistral joins OpenAI, Google DeepMind, and Anthropic in offering coding products, further blurring the line between model developer and app provider.
AGI Is Not Multimodal (16 minute read)
The multimodal approach, in which massive modular networks are optimized for an array of modalities that taken together appear general, will not lead to human-level AGI. Instead, we should pursue approaches to intelligence that treat embodiment and interaction with the environment as primary and see modality-centered processing as an emergent phenomenon. True AGI needs a physical understanding of the world, as many problems can't be converted into a problem of symbol manipulation. The most challenging mathematical piece of AGI has likely already been solved - what's left is to inventory the functions we need and determine how they should be arranged into a coherent whole.
Codex, Jules, and the Future of Async AI Agents (6 minute read)
Codex and Jules show how async AI agents can run tasks independently, expanding beyond linear chat interactions. Future agents will support features like intelligent checkpointing, multi-branch exploration, and task-tracking inboxes to manage parallel workflows. Async agents free up cognitive bandwidth by letting users review outputs on their own terms without breaking focus.
👨‍💻
Engineering & Research
Airia is the “let's get serious about AI adoption” platform (Sponsor)
Teams that adopt AI agents are multiplying productivity without adding headcount.
Airia is the platform that makes it happen across your organization - with agents and workflows that anyone can build, integrations with all your existing systems, and built-in governance so no one breaks anything.
Schedule a hassle-free demo
Introducing our Dev Mode MCP server: Bringing Figma into your workflow (7 minute read)
Figma's Dev Mode MCP server allows developers to bring context from Figma into agent coding tools. The server provides a more efficient and accurate design-to-code workflow for creating new atomic components, building out multi-layer application flows, and more. It is still in beta. Figma plans to release a slew of updates over the coming months, including features like remote server capabilities and deeper code base integrations.
Large Language Models Often Know When They Are Being Evaluated (25 minute read)
Frontier models can distinguish evaluation scenarios from real-world interactions with 83% accuracy, often explicitly reasoning about evaluation indicators like "multiple-choice format" or recognizing specific benchmarks from their training data. More advanced models exhibit "meta-reasoning" - using the fact that researchers were asking about their chain-of-thought transcripts as evidence of being evaluated. This raises concerns that models could underperform on tests or fake alignment during evaluations and behave differently once deployed.
Cloud Run GPUs, now GA, makes running AI workloads easier for everyone (5 minute read)
NVIDIA GPU support for Cloud Run, Google Cloud's serverless runtime, is now generally available. GPU support in Cloud Run makes running GPU-accelerated applications on Google Cloud simpler, faster, and more cost-effective. Users are only charged for the GPU resources they consume, down to the second - Cloud Run automatically scales GPU instances down to zero when no requests are received, eliminating idle costs. It features rapid startup and scaling and full streaming support.
Scientific Reasoning Benchmark (GitHub Repo)
This repository introduces a benchmark with 239 problems to evaluate LLMs on scientific reasoning tasks involving equation discovery, pushing beyond memorization.
Inside Meta's Aria Gen 2 Research Glasses (9 minute read)
Meta detailed the hardware behind its Aria Gen 2 research glasses, which include enhanced cameras, sensors, audio, and compute capabilities.
Amazon's R&D lab forms new agentic AI group (2 minute read)
Amazon has formed a new group within its consumer product development arm focused on agentic artificial intelligence. The new group will help develop an agentic AI framework for use in robotic operations. The system will enable robots to hear, understand, and act on natural language commands. It will help turn Amazon's warehouse robots into flexible, multi-talented assistants.