TLDR Data 2026-07-02

Google’s Tabular Foundation Model 🧾, Meta’s Data Eng Agent 🛠️, LLM Spark Debugger 🚦

📱 Deep Dives

Ontology Everywhere! (8 minute read)

How We Built DEmate: Taming LLMs for Data Engineering at Meta (7 minute read)

Building Indexes on a Moving Target (20 minute read)

🚀 Opinions & Advice

Never seen a data quality issue that wasn't actually an ownership problem (4 minute read)

Query Faster, Query Smarter: Our Move to DuckDB and What We Learned (4 minute read)

Too many tables are bad for you (6 minute read)

💻 Launches & Tools

Introducing TabFM: A zero-shot foundation model for tabular data (4 minute read)

SedonaDB 0.4: GPU-Accelerated Spatial Joins (3 minute read)

TiDB (GitHub Repo)

🎁 Miscellaneous

Data Residency Is Not a Legal Problem. It Is An Infrastructure Design Problem (5 minute read)

⚡️ Quick Links

Curated deep dives, tools and trends in big data, data science and data engineering 📊
