Slashing Snowflake Costs ❄️, Open-Source Agent Tradeoffs 🤖, Kafka’s New Bottleneck ⚙️

I battletested 5 open source analytics agents (14 minute read)

Open-source “analytics agents” are often grouped together, but LangChain, Wren AI, nao, LibreChat, and Vercel's template solve very different problems, and only some are actually built for analytics. Reliable answers depend less on the agent interface and more on where business context lives, whether that's prompts, semantic models, markdown files, or the underlying MCP/tooling layer.

TLDR Data 2026-05-28

Deep Dives

Kafka Share Groups and Parallelizing Consumption — Tuning max.poll.records (14 minute read)

How CockroachDB Built Vector Indexing at Scale (8 minute read)

Design S3 Object Storage Like a Senior Engineer (31 minute read)

Opinions & Advice

I battletested 5 open source analytics agents (14 minute read)

I Inherited a $140K Snowflake Bill — Three Months Later It Was $38K. Here's Everything I Learned (23 minute read)

AI Risk Is an Architecture Problem (20 minute read)

Launches & Tools

2026 State of Analytics Engineering Report by dbt Labs (Sponsor)

Scaling AI-Driven Marketing Processes with PostgreSQL (6 minute read)

RushDB 2.0: Memory Infrastructure for the Agentic Era (11 minute read)

Auditing Model Bias with Balanced Datasets with Mimesis (7 minute read)

MurrDB (GitHub Repo)

Miscellaneous

Deconstructing Data Sketches (8 minute read)

Open Data Product SDK: Turning Data Product Ideas Into Standard YAML With AI Models (5 minute read)

Quick Links

Announcing Polars 1.41 (2 minute read)

Visualize the Brrr (Website)

Curated deep dives, tools and trends in big data, data science and data engineering 📊