TLDR Data 2025-05-01

Agent messages via Kafka 🤖, Medallion Iceberg lake 🧊, New Llama API 🦙

📱

Deep Dives

Why Google's Agent2Agent Protocol Needs Apache Kafka (5 minute read)

Melting the ice — How Natural Intelligence simplified a data lake migration to Apache Iceberg (10 minute read)

It's Time We Talked About Time: Exploring Watermarks (And More) In Flink SQL (20 minute read)

🚀

Opinions & Advice

Context Serialization (5 minute read)

Upstream Observability to Mitigate Data Issues (13 minute read)

10 Challenges Internal Data Teams May Face Building Their First Revenue-Generating Data Product (40 minute podcast)

💻

Launches & Tools

Everything we announced at our first-ever LlamaCon (7 minute read)

How to build and deliver an MCP server for production (4 minute read)

Logchef (GitHub Repo)

Tired of Slow Python ML Pipelines? Try Purem (4 minute read)

🎁

Miscellaneous

Collective Wisdom of Models: Advanced Feature Importance Techniques at Meta (7 minute read)

When OpenAI Isn't Always the Answer: Enterprise Risks Behind Wrapper-Based AI Agents (5 minute read)

⚡️

Quick Links

Kafka Visualization (Website)

Curated deep dives, tools and trends in big data, data science and data engineering 📊