Query Planning Slowdown 🐢, Airbnb’s Data Mesh 🧩, Ontology-Driven Policies 🧬

Context pruning: cut LLM tokens without losing quality (9 minute read)

Context pruning is the practice of selectively removing low-value tokens, sentences, or passages from an LLM's input to reduce cost and latency and, often, to improve output quality. It covers token-level, sentence- or chunk-level, attention-based, and dynamic layer-progressive pruning, and it works best when paired with semantic caching.
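
To make the sentence-level variant concrete, here is a minimal sketch, not the article's method. It assumes a plain word-overlap score stands in for whatever relevance model a real system would use; the function name `prune_context` and the `keep` parameter are hypothetical.

```python
import re


def prune_context(query: str, context: str, keep: int = 2) -> str:
    """Keep the `keep` sentences that share the most words with the query.

    Assumption: word overlap is a crude stand-in for a real relevance scorer.
    """
    def words(text: str) -> set:
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    query_words = words(query)
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    # Score each sentence by how many distinct query words it contains.
    scored = [(len(words(s) & query_words), i) for i, s in enumerate(sentences)]
    # Pick the top-scoring sentences; ties go to the earlier sentence.
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:keep]
    # Restore original order so the pruned context still reads naturally.
    kept = sorted(i for _, i in top)
    return " ".join(sentences[i] for i in kept)


if __name__ == "__main__":
    context = (
        "ClickHouse stores data in columns. The weather was nice yesterday. "
        "Query planning in ClickHouse can be slow. Lunch was served at noon."
    )
    print(prune_context("Why is ClickHouse query planning slow?", context))
```

In this sketch the two off-topic sentences are dropped before the prompt is built, which is where the token savings would come from; a production system would more likely score with embeddings or a trained pruner.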

TLDR Data 2026-05-18

Deep Dives

Our billing pipeline was suddenly slow. The culprit was a hidden bottleneck in ClickHouse (9 minute read)

AWS Outage May 2026: Lessons for Database Disaster Recovery (10 minute read)

Viaduct 1.0 and the Future of Airbnb's Data Mesh (5 minute read)

Opinions & Advice

The Modern Data Stack is Overcomplicated: Data Ingestion (17 minute read)

Welcome to ORDER BY Jungle (11 minute read)

Exploring schema evolution with ontology-driven propagation (4 minute read)

Launches & Tools

A Data Layer That Won't Make You Wait (Sponsor)

ducklake-sdk (GitHub Repo)

Apache Arrow as Data Interchange (5 minute read)

What Matters in Production RAG (8 minute read)

MinIO's MemKV promises 95% better GPU utilization by ending AI recompute tax (5 minute read)

Miscellaneous

Context pruning: cut LLM tokens without losing quality (9 minute read)

Your AI agent deletes critical data: Who is responsible? (5 minute read)

Quick Links

What Leading a Data Team Actually Looks Like Right Now (7 minute read)

How Agents Use Systems Differently (15 minute read)

Curated deep dives, tools and trends in big data, data science and data engineering 📊