TLDR AI 2025-10-06
OpenAI Agent Builder 🤖, Gemini revamps app 📱, Jules API 👨‍💻
OpenAI prepares to release Agent Builder during DevDay on October 6 (2 minute read)
OpenAI's Agent Builder will help users build agentic workflows and connect MCP servers, ChatKit widgets, and other tools, making it a direct competitor to established workflow automation tools like n8n and Zapier. Agent Builder features a drag-and-drop canvas that lets users create agent flows from predefined templates. The canvas supports a range of modular building blocks, including nodes for logic, connectors, and more, mirroring the flexibility of other agentic workflow platforms.
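Under the hood, canvas-style builders like this typically represent a workflow as a directed graph of nodes. The sketch below illustrates that idea only; the `Node`/`execute` names are our own invention, not OpenAI's API:

```python
# Hypothetical sketch of a node-based agent workflow, in the spirit of
# drag-and-drop builders like Agent Builder or n8n. None of these names
# come from OpenAI's actual API.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Node:
    name: str
    run: Callable[[dict], dict]                      # transforms a shared context dict
    next_nodes: list["Node"] = field(default_factory=list)

def execute(start: Node, context: dict) -> dict:
    """Walk the graph, letting each node update the shared context."""
    context = start.run(context)
    for nxt in start.next_nodes:
        context = execute(nxt, context)
    return context

# Wire up a tiny two-step flow: classify a request, then route it.
classify = Node("classify", lambda ctx: {**ctx, "intent": "billing"})
respond = Node("respond", lambda ctx: {**ctx, "reply": f"Routing {ctx['intent']} request"})
classify.next_nodes.append(respond)

result = execute(classify, {"user_message": "Why was I charged twice?"})
print(result["reply"])  # Routing billing request
```

Templates in such tools amount to pre-wired graphs like this, with the node logic swapped out per use case.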
Gemini's home screen could soon get a Discovery-style redesign (2 minute read)
Google appears to be testing a new home screen for Gemini. The redesign adds a scrollable feed of one-tap prompt suggestions, with prompts ranging from image edits to requests regarding news, quizzes, and coding. Video and screenshots of the new home screen are available in the article. There is no guarantee that the redesign will roll out broadly.
With its latest acqui-hire, OpenAI is doubling down on personalized consumer AI (5 minute read)
OpenAI has acquired Roi, an AI-powered personal finance app. Only the company's CEO will be joining OpenAI. Roi will wind down operations and end its service to customers on October 15. Terms of the deal have not been disclosed. The acquisition clearly aligns with OpenAI's bet on personalization and life management as the next layer of AI products.
OpenAI's first device with Jony Ive could be delayed due to 'technical issues' (1 minute read)
OpenAI and Jony Ive's partnership is struggling with some technical issues that could end up pushing back their highly anticipated AI device's release date. OpenAI and Ive are still working on the assistant's personality and potential privacy concerns. The project's budget could reportedly be challenged due to the increased computing power necessary to run mass-produced AI devices. There is still little information about the upcoming product publicly available.
🧠
Deep Dives & Analysis
Which Table Format Do LLMs Understand Best? (Results for 11 Formats) (12 minute read)
Understanding format sensitivity in large language models is crucial for data pipeline architecture, performance optimization, and cost management. Many RAG pipelines ingest documents that contain tables of data. These tables need to be formatted in a way that is easy for models to consume; otherwise, they may needlessly hurt the accuracy of the overall system. Markdown-KV appears to be a good default where accuracy is paramount, while CSV and JSONL can hurt system accuracy.
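"Markdown-KV" here means rendering each table row as a block of `key: value` lines rather than comma-separated cells. A minimal sketch of the transformation (the helper function is our own illustration, not from the article):

```python
# Render table rows as Markdown-KV: one "key: value" line per cell, one
# markdown-headed block per record, instead of a single CSV row. The
# function name and record-header style are illustrative choices.
def to_markdown_kv(rows: list[dict]) -> str:
    blocks = []
    for i, row in enumerate(rows, start=1):
        lines = [f"## Record {i}"] + [f"{k}: {v}" for k, v in row.items()]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)

rows = [
    {"city": "Lagos", "population": 15_388_000},
    {"city": "Lima", "population": 11_204_000},
]
print(to_markdown_kv(rows))
```

The trade-off is token count: Markdown-KV repeats every column name for every row, so it costs more tokens than CSV in exchange for the accuracy gain.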
Circular Financing: Does Nvidia's $110B Bet Echo the Telecom Bubble? (18 minute read)
When news broke in September that Nvidia had invested $100 billion in OpenAI, analysts immediately drew comparisons to the telecom bubble. During that bubble, demand was speculative and customers burned cash. By contrast, many of Nvidia's customers today are profitable, sophisticated hyperscalers, which suggests that the current situation is different.
Economics and AI (28 minute read)
Most economists refuse to seriously engage with the possibility of transformative AI, even at workshops designed for it, insisting that comparative advantage will keep human wages high, as past technological progress has. But AI's impact depends on domain structure: protein folding has predictable patterns that models can exploit, while mapping genomes requires individual observations.
AI Cyber Defenders in Practice (11 minute read)
Claude Sonnet 4.5 has been enhanced with strong cybersecurity capabilities, matching or surpassing previous frontier models in tasks like vulnerability detection. This shift reflects a growing focus on using AI not only for cyber offense but also to assist defenders in securing systems.
👨‍💻
Engineering & Research
IBM TechXchange 2025 is here! Don't miss major announcements (Sponsor)
Starting tomorrow, stream the Opening General Sessions to see how generative AI and agents are transforming enterprise tech. Featuring Dinesh Nirmal, SVP of IBM Software, and other tech leaders, fellows, and distinguished engineers. Celebrate innovation at the Excellence Awards on Thursday.
October 7β9
Join the livestream
Anatomy of a Modern Finetuning API (9 minute read)
Mira Murati's AI lab, Thinking Machines, has released a language model fine-tuning API called Tinker. Tinker exposes a handful of very low-level functions that let users perform supervised fine-tuning and online reinforcement learning, with each training step requiring a batch of data to be sent over the network. It offers stable, mature infrastructure for RL and fine-tuning experiments, making it possible for anyone to be an AI researcher.
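The "low-level primitives, one batch per step" pattern can be sketched locally. The stub below stands in for a remote service and is not Tinker's real SDK; the class and method names are placeholders illustrating the shape of such a loop:

```python
# Sketch of a primitives-based fine-tuning loop: the client provides no
# training loop of its own; the caller sends one batch per step and invokes
# forward/backward and the optimizer step explicitly. StubClient is a local
# stand-in for a remote service, NOT Tinker's actual SDK.
class StubClient:
    def __init__(self):
        self.weight = 0.0   # pretend "model": a single parameter
        self._grad = 0.0

    def forward_backward(self, batch: list[tuple[float, float]]) -> float:
        """Compute mean squared error on the batch and store its gradient."""
        loss, grad = 0.0, 0.0
        for x, y in batch:
            err = self.weight * x - y
            loss += err * err
            grad += 2 * err * x
        n = len(batch)
        self._grad = grad / n
        return loss / n

    def optim_step(self, lr: float = 0.1) -> None:
        """Apply one gradient-descent update using the stored gradient."""
        self.weight -= lr * self._grad

client = StubClient()
data = [(1.0, 2.0), (2.0, 4.0)]        # learn y = 2x
for step in range(50):                  # one batch "sent" per step
    loss = client.forward_backward(data)
    client.optim_step()
print(round(client.weight, 2))          # converges to 2.0
```

In a hosted API, each `forward_backward` call would carry the batch over the network, which is why batch size and step count dominate the cost profile of this design.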
SINQ (GitHub Repo)
SINQ (Sinkhorn-Normalized Quantization) is a novel, fast, and high-quality quantization method designed to make large language models smaller while keeping their accuracy almost intact. It uses dual scaling to make models less vulnerable to outliers: a Sinkhorn-normalized optimization iteratively rescales rows and columns to balance their variance, spreading quantization error out so that it is less severe. Reducing the overall matrix imbalance makes the weights inherently easier to quantize, leading to more stable behavior across layers and consistently higher accuracy even at very low bit-widths, such as 3-bit precision.
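The row/column rescaling idea can be shown on a toy matrix. This is an illustration of the principle (alternately standardizing rows and columns so no outlier dominates the dynamic range), not the paper's exact algorithm:

```python
# Toy sketch of the Sinkhorn-style idea behind SINQ: alternately rescale
# rows and columns so their spreads are comparable, which makes uniform
# quantization less sensitive to outliers. Illustrative only, not the
# paper's algorithm.
import math

def std(xs):
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs)) or 1.0

def sinkhorn_balance(W, iters=10):
    rows, cols = len(W), len(W[0])
    r = [1.0] * rows   # per-row scale factors
    c = [1.0] * cols   # per-column scale factors
    for _ in range(iters):
        for i in range(rows):   # standardize each row of the scaled matrix
            s = std([W[i][j] / (r[i] * c[j]) for j in range(cols)])
            r[i] *= s
        for j in range(cols):   # then standardize each column
            s = std([W[i][j] / (r[i] * c[j]) for i in range(rows)])
            c[j] *= s
    balanced = [[W[i][j] / (r[i] * c[j]) for j in range(cols)] for i in range(rows)]
    return balanced, r, c

W = [[0.01, 0.02, 8.0],   # one outlier column would wreck naive quantization
     [0.03, 0.01, 9.0]]
B, r, c = sinkhorn_balance(W)
# B now has comparable row/column spreads; W is recovered exactly as
# W[i][j] = B[i][j] * r[i] * c[j], so only B needs to be quantized.
```

Because the outliers are absorbed into the per-row and per-column scales, the matrix that actually gets quantized has a much narrower value range.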
On-Device OCR with Core ML and MLX (14 minute read)
dots.ocr was adapted to run entirely on-device using Apple's Core ML and MLX frameworks. In this post, Hugging Face outlines the conversion process, offering a practical guide for developers looking to deploy similar models locally.
Jules API (5 minute read)
The Jules API gives developers programmatic access to Jules' capabilities to automate and enhance software development cycles. It can be used to create custom workflows, automate tasks like bug fixing and code reviews, and embed Jules' intelligence directly into everyday tools. This page provides a quick overview of the API and walks through how to make an API call.
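A call to a REST-style coding-agent API like this generally means POSTing a JSON payload with an API key header. The endpoint URL, field names, and header below are placeholders, not the documented Jules schema; consult the official API docs for the real values:

```python
# Hypothetical sketch of creating a coding-agent session over HTTP, e.g.
# "fix this bug in my repo". BASE_URL, the payload fields, and the header
# name are placeholders, NOT the documented Jules API.
import json
import urllib.request

API_KEY = "YOUR_API_KEY"                       # placeholder credential
BASE_URL = "https://example.com/v1/sessions"   # placeholder endpoint

payload = {
    "prompt": "Fix the failing unit test in utils/date.py",
    "source": {"repo": "my-org/my-repo", "branch": "main"},
}

request = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "X-Goog-Api-Key": API_KEY},
    method="POST",
)
# response = urllib.request.urlopen(request)   # uncomment with real credentials
```

Embedding Jules in everyday tools then comes down to constructing payloads like this from, say, a CI failure or an issue tracker webhook.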
It's a JAX, JAX, JAX, JAX World (8 minute read)
Stan's days are likely numbered, as high-end applications are replacing it with JAX. The biggest obstacle to the shift is finding hardware on which to run JAX most effectively, but this won't be a problem in ten years. The Stan project will continue development, as it will likely still be used for a long time, and the team has plenty of strategies for making it faster, like adding samplers that work well on CPU but not on GPU.
This Meta alum has spent 10 months leading OpenAI's nationwide hunt for its Stargate data centers (11 minute read)
Keith Heyde, OpenAI's Head of Infrastructure, leads site development within OpenAI's industrial compute team, a division that's quickly becoming one of the most important groups in the company. Infrastructure is now a strategic pillar on par with product and model development. OpenAI is betting that owning the next generation of physical infrastructure is central to controlling the future of AI. This article looks at how Heyde has spent the past several months searching for Stargate data center sites and the challenges that face the project.
2 Hours in Line for a Free Hat (5 minute read)
The Claude team recently gave out hats and coffee at an event where people had to wait two hours in line. This post details how the event went down. Participants at the event waited, took pictures, posted on social media, and became one of the cool kids. The Claude team created an experience people wanted to be part of, which made it worth waiting for hours just for a hat. Pictures of the event and the merch the Claude team handed out are available in the article.
Get the most interesting AI stories and breakthroughs delivered in a free daily email.
Join 920,000 readers for one daily email.