TLDR DevOps 2026-05-08
Faster Postgres Writes 🐎, Datadog Code Security 🥷, AWS MCP Server 🔮
Introducing HCP Terraform powered by Infragraph - now in public preview (5 minute read)
HCP Terraform powered by Infragraph introduces a centralized, event-driven knowledge graph that unifies infrastructure data across hybrid and multi-cloud environments. It aims to provide real-time visibility, improved security, cost control, and a foundation for AI-driven automation, and is now available in public preview for qualified US customers.
Introducing the Datadog Code Security MCP (4 minute read)
Datadog Code Security MCP scans AI-generated code in real time to detect vulnerabilities, secrets, and risky dependencies while consolidating multiple security checks into a single local workflow, enabling early issue detection and consistent security across development.
The AWS MCP Server is now generally available (5 minute read)
The AWS MCP Server is now generally available. It is a managed tool that gives AI coding agents secure, authenticated access to all 15,000+ AWS API operations, using existing IAM credentials and pulling current documentation at query time. The server addresses a key problem: AI agents rely on outdated training data and produce infrastructure that is not production-ready. It now offers features like sandboxed Python script execution, IAM context key support, and curated "Skills" that guide agents through common tasks with AWS best practices.
How we built a real-world evaluation platform for autonomous SRE agents at scale (9 minute read)
Datadog built a replayable evaluation platform for its Bits AI SRE agent using diverse, production-derived labels and simulated noisy environments to measure performance, detect regressions, and continuously improve agent reliability across complex incident investigations.
How to build CI/CD observability at scale (8 minute read)
Optimizing GitLab CI/CD relies on observability: Prometheus, Grafana, and pipeline exporters measure pipeline performance, job efficiency, and infrastructure bottlenecks, supporting scalable visibility, deployment optimization, and capacity planning in enterprise self-managed environments.
How Cloudflare responded to the “Copy Fail” Linux vulnerability (9 minute read)
Cloudflare successfully defended against the "Copy Fail" Linux kernel vulnerability (CVE-2026-31431) disclosed on April 29, deploying a custom eBPF-based mitigation across its 330-city infrastructure within hours while confirming zero customer impact through fleet-wide behavioral detection and forensic analysis. The company's existing security monitoring flagged internal exploit validation attempts within minutes without signature updates, and engineers used BPF Linux Security Module programs to surgically block the vulnerable code path while awaiting patched kernel deployment across hundreds of thousands of servers.
How lakebase architecture delivers 5x faster Postgres writes (5 minute read)
Neon eliminated a decade-old Postgres performance bottleneck by pushing full-page write operations from compute to its distributed storage layer, achieving up to 5x throughput improvements and reducing WAL generation by 94% in some cases. The "image generation pushdown" technique, now rolled out across Neon's entire fleet, leverages the company's separated compute-storage architecture to solve a durability problem that's structurally impossible to fix in traditional monolithic Postgres deployments.