TLDR

TLDR Data 2026-05-04

Zero-Downtime at Stripe 💳, Trimming ML Feature Bloat ✂️, Less Repetitive Data QA ✅

📱

Deep Dives

Data Mesh at Grab Part II: The Foundational Tools behind Certification (10 minute read)

Optimizing ML Workload Network Efficiency (Part I): Feature Trimmer (14 minute read)

How we rebuilt search ranking at Faire with deep learning (11 minute read)

🚀

Opinions & Advice

How We Built an AI Second Brain for 60K Knowledge Workers (8 minute read)

We automated data validation — Here's how we did it (12 minute read)

Five Worlds of Data Engineering (10 minute read)

💻

Launches & Tools

Datanomy (GitHub Repo)

What Held Up at 3 AM: One Engineer's RAG Case Study (17 minute read)

Handling Schema Issues in Polars (6 minute read)

🎁

Miscellaneous

Bottling the River: Apache Fluss on EKS (6 minute read)

Effective KV Compression with TurboQuant (4 minute read)

⚡️

Quick Links

Introducing Neo4j Agent Skills (3 minute read)

Does ELT vs. ETL Even Still Matter? (6 minute read)

Curated deep dives, tools and trends in big data, data science and data engineering 📊
