TLDR Data 2026-05-28

Slashing Snowflake Costs ❄️, Open-Source Agent Tradeoffs 🤖, Kafka’s New Bottleneck ⚙️

📱

Deep Dives

Kafka Share Groups and Parallelizing Consumption — Tuning max.poll.records (14 minute read)

How CockroachDB Built Vector Indexing at Scale (8 minute read)

Design S3 Object Storage Like a Senior Engineer (31 minute read)

🚀

Opinions & Advice

I battletested 5 open source analytics agents (14 minute read)

I Inherited a $140K Snowflake Bill — Three Months Later It Was $38K. Here's Everything I Learned (23 minute read)

AI Risk Is an Architecture Problem (20 minute read)

💻

Launches & Tools

Scaling AI-Driven Marketing Processes with PostgreSQL (6 minute read)

RushDB 2.0: Memory Infrastructure for the Agentic Era (11 minute read)

Auditing Model Bias with Balanced Datasets with Mimesis (7 minute read)

MurrDB (GitHub Repo)

🎁

Miscellaneous

Deconstructing Data Sketches (8 minute read)

Open Data Product SDK: Turning Data Product Ideas Into Standard YAML With AI Models (5 minute read)

⚡️

Quick Links

Announcing Polars 1.41 (2 minute read)

Visualize the Brrr (Website)

Curated deep dives, tools and trends in big data, data science and data engineering 📊
