TLDR AI 2025-10-30
Cursor 2.0 2️⃣, Cognition’s SWE-1.5 👨💻, agent labs 🤖
Auth0 for AI Agents: the secure way to authenticate and rightsize agent permissions (Sponsor)
Your agent needs to call APIs, read documents, and make decisions. But without proper controls, it can access data it shouldn't, call services without approval, or leak sensitive information.
Auth0 for AI Agents adds a control layer to your AI deployment. Authenticate users and agents, let agents call APIs securely through a Token Vault, add human approval at critical moments, and control document access through FGA for RAG.
✅ Setup takes 5 minutes
✅ Supports any language or framework.
Auth0 for AI Agents is currently in Developer Preview (GA this October). Get early access ↗️
Cursor debuts Composer and multi-agent suite in Cursor 2.0 (1 minute read)
Cursor 2.0 features Cursor's first coding model, Composer, and a multi-agent interface. Composer can complete most tasks in under 30 seconds. Its training allows it to accurately navigate and understand large and complex codebases. Cursor 2.0's new interface allows multiple agents to operate in parallel, managed via git worktrees or remote machines. The new release is aimed at addressing workflow challenges seen in modern development teams.
Nvidia Becomes First Company to Hit $5 Trillion Valuation (2 minute read)
Nvidia's 2025 revenue more than doubled to $130.5 billion, up 114% year-over-year. Jensen Huang announced $500 billion in AI chip orders and plans to build seven government supercomputers. President Trump is expected to discuss Nvidia's chips with Xi Jinping as the high-end processors remain a key sticking point in US-China trade tensions due to Washington's export controls.
Why Foundation Models in Pathology Are Failing (and What Comes Next) (12 minute read)
Optimization for the wrong target produces excellent results at that target while failing at the actual problem. Pathology needs optimization for clinical utility, generalization across institutions, and robustness to real-world conditions. The next generation of pathology AI will be smarter about what size and architectural approach actually fit the problem. It will be smaller, more interpretable, more rigorously validated, and more aware of its own limitations.
Signs of introspection in large language models (15 minute read)
Anthropic researchers used concept injection, artificially inserting neural activity patterns into Claude's processing, to test whether models could identify the change to their own thinking. Claude Opus correctly recognized injected concepts about 20% of the time, sometimes noticing that something was “off” before mentioning the inserted concept.
👨💻
Engineering & Research
MiniMax M2 — Open-Sourced & Free (Sponsor)
Built for Agents & Code, 2× faster at 8% of Claude Sonnet's price. Create, code, and deploy smarter with selective parameter activation.
Try it now: MiniMax Agent | MiniMax-M2 API | Open-source for local use.
Introducing gpt-oss-safeguard (8 minute read)
The gpt-oss-safeguard models (120b and 20b parameters) let developers apply custom safety policies at inference time rather than training their own classifiers on thousands of labeled examples, using chain-of-thought reasoning to explain decisions.
Claude Skills, anywhere: making them first-class in Codex CLI (4 minute read)
This post discusses how to make Claude Skills available globally. The trick is to treat the skills and an enumerator as global assets. Put the skills in Codex's installation folder and drop a script that enumerates skills into the PATH. This allows Codex to discover skills in any repository by running that script as instructed in AGENTS.md.
Introducing SWE-1.5: Our Fast Agent Model (8 minute read)
SWE-1.5 is a frontier-sized model with hundreds of billions of parameters optimized for software engineering. It achieves near state-of-the-art coding performance and sets a new standard for speed. The model is now available in Windsurf, serving at up to 950 tokens per second. It can be used to deeply explore and understand large codebases, build end-to-end full-stack apps, and easily edit configurations without needing to memorize field names.
How AI labs use Mercor to get the data companies won't share (5 minute read)
Mercor's marketplace connects former employees of investment banks, consulting houses, and law firms with the AI labs looking to automate those industries. Its customers include OpenAI, Anthropic, and Meta. The startup pays industry experts up to $200 an hour to fill out forms and write reports for AI training. It has tens of thousands of contractors that it pays more than $1.5 million to daily. Mercor's annual recurring revenue has grown to roughly $500 million in just under three years.
Agent Labs Are Eating the Software World (8 minute read)
Agent labs ship product first and build infrastructure later. They turn AI models into goal-directed systems that deliver outcomes, capturing the real value in the AI stack. Founders only need a deep understanding of a domain and the ability to build reliable workflows on top of existing models. The most valuable developer skills are shifting from model architecture to system design, evaluation engineering, and domain-specific workflow optimization. Investors should look for companies that capture workflow data and have clear evaluation metrics. The moat is in the data and feedback loops, not the models.
Get the most interesting AI stories and breakthroughs delivered in a free daily email.
Join 920,000 readers for
one daily email