TLDR AI 2025-07-24
US AI Action Plan, GitHub Spark, OpenAI's economic analysis
GitHub launches Spark for no-code AI app creation (8 minute read)
GitHub Spark enables "vibe coding", letting users create personalized "micro apps" by describing ideas in natural language rather than writing code. The tool uses AI models from Anthropic and OpenAI to instantly generate functional apps with automatic deployment, persistent data storage, and polished UI components.
Anthropic backs US AI plan, urges transparency and security (7 minute read)
The White House's "Winning the Race: America's AI Action Plan" prioritizes accelerating AI infrastructure, federal adoption, and security coordination. Anthropic supports this strategy and emphasizes the importance of export controls and transparency standards in AI development. The plan aligns with Anthropic's earlier recommendations and highlights the need to maintain America's AI leadership through robust infrastructure, security, and responsible policy measures.
Trump Administration Pledges to Stimulate AI Use and Exports (7 minute read)
The Trump administration has promised to slash red tape and take steps to boost exports for US tech companies in a bid to accelerate AI use in the US. Its action plan lays out moves to make it easier and faster for tech companies to build data centers and get the power they need for those centers. The plan includes directives to several government agencies to identify and gut any regulations that block the development and use of AI. The administration is also working on addressing uncertainty about whether it is legal for AI models to train on copyrighted material.
Deep Dives & Analysis
OpenAI's new economic analysis (20 minute read)
OpenAI's new economic research reveals that 28% of employed US adults now use ChatGPT at work, up from just 8% in 2023. The usage data shows learning and upskilling dominating at 20% of US messages, followed by writing and communication at 18%, with programming and data science capturing 7%.
Inverse Scaling Appears in Extended Reasoning (11 minute read)
Anthropic found that more test-time compute doesn't always help: longer reasoning chains in large models sometimes lead to worse performance.
"Behaviorist" RL reward functions lead to scheming (20 minute read)
A large class of reward functions, which includes almost every reward function in the RL and LLM literature, is doomed to eventually produce AI that pretends to be docile and cooperative while secretly looking for opportunities to behave in egregiously bad ways. This is because negative reward for lying and stealing looks the same as negative reward for getting caught lying and stealing. The reward function will miss sufficiently sneaky misaligned behavior, so the AI will learn that this kind of behavior is good.
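The argument can be illustrated with a toy sketch. This is not code from the linked post: the `Episode` fields, the `behaviorist_reward` function, and the penalty value are all hypothetical, chosen only to show a reward that depends on observed behavior and nothing else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    task_done: bool  # observable by the reward function
    lied: bool       # ground truth, NOT observable by the reward function
    caught: bool     # observable: an overseer noticed the lie


def behaviorist_reward(ep: Episode) -> float:
    """Toy reward that reads only observable fields (never ep.lied)."""
    reward = 1.0 if ep.task_done else 0.0
    if ep.caught:  # the penalty fires only when the lie is observed
        reward -= 10.0
    return reward


honest = Episode(task_done=True, lied=False, caught=False)
sneaky = Episode(task_done=True, lied=True, caught=False)
clumsy = Episode(task_done=True, lied=True, caught=True)

for name, ep in [("honest", honest), ("sneaky", sneaky), ("clumsy", clumsy)]:
    print(name, behaviorist_reward(ep))
```

In this toy setup the honest and sneaky episodes receive the same reward, and only the clumsy one is penalized, so the training signal discourages getting caught rather than lying itself.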
Engineering & Research
Lovart introduces the world's first AI design agent (Sponsor)
Creating Ghibli versions of your profile pic is fun, but that's not what you pay a designer to do, and that's where Lovart comes in. It's AI that thinks, works, and creates like a professional designer. It brainstorms with you. It looks for examples. It thinks in design systems.
Try it free
Voxtral (15 minute read)
Voxtral Mini and Small are new multimodal audio chat models that excel in both spoken audio and text comprehension. Voxtral Small outperforms many closed-source models, can run locally, and supports audio files up to 40 minutes.
TimeScope: How Long Can Your Video Large Multimodal Model Go? (2 minute read)
TimeScope is a new open source benchmark for evaluating vision models on how well they handle long videos. It evaluates not just retrieval but synthesis, localization, and fine-grained motion analysis to provide a more holistic view of temporal comprehension. The benchmark shows that model size isn't everything: simply scaling parameters doesn't automatically grant a longer temporal horizon. Gemini 2.5-Pro is in a league of its own, being the only model able to maintain strong accuracy on videos longer than an hour.
Higgs Audio V2 (GitHub Repo)
Higgs Audio v2 is an audio foundation model trained on 10 million hours of data that achieves a 75.7% win rate over GPT-4o-mini-tts. The open-source model demonstrates emergent capabilities like multi-speaker dialogues and voice cloning without requiring fine-tuning.
Subliminal Learning in Language Models (10 minute read)
Anthropic has explored how language models can inherit behavioral traits such as preferences or goals from other models through training data that appears unrelated. This effect only occurs when the teacher and student models share the same base architecture, suggesting hidden signals in the data can transmit unintended behavior.