Data + AI 2026 Review 🧱, Lyft’s Golden Metrics 🏅, DuckDB 1.5.4 🦆

Write-Ahead Intent Log: A Foundation for Efficient CDC at Scale (51 minute video)

DoorDash replaced fragile change data capture (CDC) pipelines with a Write-Ahead Intent Log (WAIL) after Debezium hit scale limits on Cassandra. WAIL logs mutation intent to Kafka and the database; a smart consumer then verifies state, applies schema rules, and publishes events, which improves recovery and scaling.
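The flow described above can be pictured with a minimal sketch. This is one reading of the summary, not DoorDash's implementation: the in-memory list and dict stand in for Kafka and Cassandra, and the names `Intent`, `Writer`, `consume`, and `allowed_fields`, along with the version-based check, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """A record of a mutation the writer is about to make (hypothetical shape)."""
    key: str
    value: dict
    version: int


class Writer:
    def __init__(self, intent_log, database):
        self.intent_log = intent_log  # list standing in for a Kafka topic
        self.database = database      # dict standing in for a Cassandra table

    def write(self, key, value, version):
        # Record the intent first, then apply the mutation to the database.
        self.intent_log.append(Intent(key, value, version))
        self.database[key] = {"value": value, "version": version}


def consume(intent_log, database, allowed_fields):
    """Verify each intent against database state, apply a schema rule, emit events."""
    events = []
    for intent in intent_log:
        row = database.get(intent.key)
        # Verify state: skip intents whose write never landed or was superseded.
        if row is None or row["version"] != intent.version:
            continue
        # Schema rule (assumed): publish only an allow-listed set of fields.
        payload = {k: v for k, v in row["value"].items() if k in allowed_fields}
        events.append(
            {"key": intent.key, "version": intent.version, "payload": payload}
        )
    return events
```

In this sketch, an intent that was logged but never committed to the database produces no event, which is the kind of check the "smart consumer" is described as performing before it publishes.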

TLDR Data 2026-06-22

Deep Dives

Write-Ahead Intent Log: A Foundation for Efficient CDC at Scale (51 minute video)

AI Agents to Make Sense of Data at OpenAI (45 minute video)

Metric Semantic Layer: How Lyft Governs and Scales Key Data Definitions (7 minute read)

ClickHouse Ingestion at Scale: An Open-Source Zepto Engineering Story (8 minute read)

Opinions & Advice

DuckDB's agent moment (55 minute podcast)

7 Crucial Barriers between Data Teams and Self-Healing Data Architecture (9 minute read)

Review of Databricks Data + AI Summit 2026 (14 minute read)

Launches & Tools

AWS enters the context layer race with a graph that learns from agents, not manual curation (3 minute read)

Announcing DuckDB 1.5.4 (3 minute read)

Data-Juicer: The Data Operating System for the Foundation Model Era (Tool)

Miscellaneous

Ten years of ClickHouse in open source (17 minute read)

Data quality traffic lights (13 minute read)

Quick Links

Here's my AI-enabled dbt project structure (2 minute read)

The analytics engineer in 2026: system designer, governance owner, AI context provider (5 minute read)

Curated deep dives, tools and trends in big data, data science and data engineering 📊