I Tried to Build a Context Layer for My Agent in a Weekend. Reader, I Did Not Build a Context Layer for My Agent in a Weekend.

Cassidy Williams, DevRel at GitHub

Once upon a time, I had an agent that almost worked.

It could answer questions, it could call a couple tools, and it could mostly follow the happy path. It was a runtime agent, and needed to hold some user preferences across sessions and recall accurate data. It was not production-ready, but it was close enough to make me optimistic! It was doing something!

It only needed a couple small things, nothing major:

  • A little memory
  • Access to actual data

"Easy peasy! No problem!" - Me, a fool

A "little memory" is... more than a little memory. There's short-term state and long-term state, user preferences, task history, deletion, corrections, permissions... and don't get me started on the "actual data" part.

...okay, you got me started: that turns into retrieval, sync pipelines, search, caching, schema mapping, freshness guarantees (yikes), and a natural fear of letting an LLM directly query production data.

There came a point where I realized that I was building more than a couple small things, and I learned that this behemoth I was attempting is called a context engine.

Memory is not one thing

I think the main thing that I assumed incorrectly was that "memory" is a checkbox I can check. Like, surely I could just save a conversation somewhere, and boom, I have memory.

But... once you say an agent should "remember things", you have to then answer "what things?"

Should it remember the current conversation? Yes! Probably.

Should it remember that a user prefers TypeScript examples over Python? Maybe.

Should it remember the half-finished task from yesterday? That sounds useful, sure?

Should it remember every support ticket the customer has opened? Okay, that sounds like business data.

Should it remember something the user corrected? Definitely, oh yes.

Should it forget something the user deleted? Totally, yeah.

Because of all of these questions and answers, now "memory" has categories, and each of those categories has different rules. Some should expire, some should be updated, some should be searchable, some should be private, some should be shared, some should be "deleted" (and some should have different definitions of what "deleted" means).

Deciding which context mattered, to whom, and for how long turned into a whole system of choices, and a larger design problem than I wanted it to be.
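To make "categories with different rules" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the three kinds, the TTL values, and the names MemoryPolicy, MemoryItem, and is_recallable are hypothetical, not part of any real memory framework. It only shows that expiry, sharing, and deletion end up as per-category decisions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class MemoryKind(Enum):
    CONVERSATION = "conversation"  # the current chat
    PREFERENCE = "preference"      # e.g. "prefers TypeScript examples"
    TASK = "task"                  # the half-finished task from yesterday


@dataclass(frozen=True)
class MemoryPolicy:
    ttl: Optional[timedelta]  # None means "keep until corrected or deleted"
    searchable: bool
    shared: bool              # visible to anyone besides the owner?


# Hypothetical rules; the real values are product decisions, not defaults.
POLICIES = {
    MemoryKind.CONVERSATION: MemoryPolicy(timedelta(hours=1), searchable=False, shared=False),
    MemoryKind.PREFERENCE: MemoryPolicy(None, searchable=True, shared=False),
    MemoryKind.TASK: MemoryPolicy(timedelta(days=7), searchable=True, shared=False),
}


@dataclass
class MemoryItem:
    kind: MemoryKind
    owner_id: str
    content: str
    created_at: datetime
    deleted: bool = False  # a soft delete; a hard delete would remove the record


def is_recallable(item: MemoryItem, requester_id: str, now: datetime) -> bool:
    """Return True if this memory may be handed to the agent right now."""
    policy = POLICIES[item.kind]
    if item.deleted:
        return False
    if not policy.shared and item.owner_id != requester_id:
        return False
    if policy.ttl is not None and now - item.created_at > policy.ttl:
        return False
    return True
```

Even this toy version has to pick one meaning of "deleted" (a soft-delete flag), which is exactly the kind of choice that keeps multiplying.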

And then, a whole new series of questions arrived, like should I build this or buy this? Should I store memory myself? Do I use Postgres? Do I use a vector database? Do I grab a memory framework or a model provider's built-in memory? Do I use a dedicated service? There were even more questions than this, and everything had its own pros and cons. Everything was a bit too reasonable, and scope creep loomed heavily with every option.

I wept (fully justified, not dramatic) because I truly just wanted to give my agent "a little memory."

So then I continued being mistaken by thinking, "okay, fine, memory is complex, but at least passing in actual data should be straightforward, it's just one thing."

Data retrieval is a whole other infrastructure project

Data already exists! We have databases! We have APIs! We have documents and dashboards and logs and everything everywhere all at once!

I just needed to... hook it up.

You may start playing some ominous orchestral music.

So, turns out, there's different kinds of data. Who knew?

  • There is structured data, like users, orders, invoices, tickets, events, accounts, and product records.
  • There is unstructured data, like docs, PDFs, emails, support conversations, meeting notes, and that one very important document that somehow became load-bearing infrastructure.
  • There is semi-structured data, which is the developer equivalent of "surprise, good luck, ha ha!"

And there's also just data that exists in theory but not in a format that any reasonable tool (agent, human, anyone, really) should be allowed to touch directly.

I thought perhaps, "okay, maybe the agent can just query Postgres?" (Editor's note: I would like to throw the word "just" into the sea)

Now, I had to think about a whole other fun set of questions, like:

  • Which tables should the agent be able to access?
  • Which rows should this user be allowed to see?
  • What happens if the schema changes?
  • What happens if the agent writes a bad query?
  • What happens if the query is technically correct but wildly expensive?
  • What happens if the data is stale?
  • What happens if the agent finds the right record but misses the related records?
  • What happens if the answer requires data from Postgres, Salesforce, Zendesk, and a PDF from 2014 named final_FINAL_v3.pdf?
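The first few questions in that list (which tables, which rows, how expensive) can be sketched as a narrow tool the agent calls instead of writing SQL itself. This is a rough illustration using Python's built-in sqlite3 module as a stand-in for Postgres. The ALLOWED map, the MAX_ROWS cap, the fetch_rows_for_user name, and the assumption that every allowed table has id and user_id columns are all hypothetical.

```python
import sqlite3

# Hypothetical allowlist: table name -> columns the agent may read.
ALLOWED = {"tickets": ("id", "subject")}
MAX_ROWS = 50  # hard cap so a "technically correct" request stays cheap


def fetch_rows_for_user(conn: sqlite3.Connection, table: str, user_id: str, limit: int = 10):
    """Read a bounded number of rows that belong to one user."""
    columns = ALLOWED.get(table)
    if columns is None:
        raise PermissionError(f"agent may not read table {table!r}")
    limit = max(1, min(limit, MAX_ROWS))
    # Table and column names come only from the allowlist above;
    # user-controlled values are always passed as bound parameters.
    sql = (
        f"SELECT {', '.join(columns)} FROM {table} "
        "WHERE user_id = ? ORDER BY id LIMIT ?"
    )
    return conn.execute(sql, (user_id, limit)).fetchall()
```

This handles access scope and query cost, and nothing else. Schema changes, stale data, related records, and that PDF from 2014 are still waiting.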

Raw data is not always usable. Agents need the right data (shaped the right way, at the right time, and so on) and the retrieval of that data isn't just... throwing search on it.

My exhausted realization is probably exhausting you too, at this point. The trade-offs are too much.

Because if the agent has stale data, it gives bad answers.

If it has too much data, it gets confused or expensive.

If it has too little data, it hallucinates politely.

If it has data it should not have, congratulations, now you have a security incident with vibes.

And even more, if every useful piece of context lives in a different system, I'm building sync pipelines, indexes, permissions layers, fallback logic, search systems, and you know there's just duct tape everywhere as I try to piece it all together.

I wept again. Still fully justified.

Caching

You know, I don't feel like weeping more. You get it.

There are only two hard things in Computer Science: cache invalidation and naming things. - Phil Karlton

Repeated questions. Repeated retrieval. Repeated LLM calls. It's too much. It's too much!
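For a sense of the simplest possible version, here is a small sketch of an answer cache in Python. It is an assumption-heavy toy: the AnswerCache class and its method names are hypothetical, it only matches questions that are identical after lowercasing and whitespace cleanup (so it is not semantic caching), and a fixed TTL is a crude stand-in for real invalidation.

```python
import time
from typing import Callable, Dict, Tuple


class AnswerCache:
    """Exact-match cache with a time-to-live, keyed on a normalized question."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(question: str) -> str:
        # Lowercase and collapse whitespace so trivial variations still hit.
        return " ".join(question.lower().split())

    def get_or_compute(self, question: str, compute: Callable[[str], str]) -> str:
        key = self._key(question)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        answer = compute(question)  # the expensive part, e.g. retrieval plus an LLM call
        self._entries[key] = (now, answer)
        return answer
```

Two users asking the same thing in different words still miss this cache, and an answer can go stale well before its TTL runs out.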

All powers combined: this is a context engine

A context engine is the layer that gives an agent the right context, in the right shape, at the right time.

It sounds simple until you realize all that it includes, which you've just read about. Context is most useful at runtime, when the agent is doing the work. It needs the user state while answering, the latest data when deciding which tool to call, the relevant docs for planning, the cached results to reduce waiting, and memory so it doesn't repeat mistakes.
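As a rough sketch of what "the right context, in the right shape, at the right time" might mean in code, here is a tiny Python selector. It drops anything the user may not see, drops anything too stale, and then packs the most relevant pieces into a size budget. The Snippet fields, the build_context name, the character-based budget, and the relevance score (assumed to come from some earlier search step) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, List


@dataclass
class Snippet:
    text: str
    source: str                 # e.g. "memory", "docs", "orders"
    fetched_at: datetime
    allowed_users: FrozenSet[str]
    relevance: float            # higher is better; assumed to be precomputed


def build_context(
    snippets: List[Snippet],
    user_id: str,
    now: datetime,
    max_age: timedelta,
    budget_chars: int,
) -> List[Snippet]:
    """Pick permitted, fresh snippets by relevance until the budget is spent."""
    usable = [
        s for s in snippets
        if user_id in s.allowed_users and now - s.fetched_at <= max_age
    ]
    usable.sort(key=lambda s: s.relevance, reverse=True)
    chosen: List[Snippet] = []
    used = 0
    for s in usable:
        if used + len(s.text) > budget_chars:
            continue  # too big for what is left; a smaller snippet may still fit
        chosen.append(s)
        used += len(s.text)
    return chosen
```

Each filter maps to one of the failure modes from earlier: permissions for the security incident, max_age for stale answers, and the budget for the confused-or-expensive problem.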

I kept coming back to this point where I needed that layer between things... but I didn't want to build it myself anymore. I was done weeping.

The "boring" infrastructure answer

This is where Redis started to make a lot of sense to me, because it's boring in the best possible way.

Developers already use it for fast, reliable application infrastructure. It sits close to the runtime. It handles hot data, caching, search, session state, queues, real-time workloads, and the kinds of operational paths where latency actually matters. Agent context is very similar, and it needs to sit close to where the user is.

Redis Iris is Redis's answer to this whole mess: an integrated context engine for agents. And I like that framing, because it doesn't pretend that the problem is smaller than it is!

It brings together:

  • Redis Data Integration, for keeping operational data synced and fresh
  • Context Retriever, for helping agents access the right business context
  • Agent Memory, for short-term and long-term agent context
  • Redis Search, for fast search across the data your agent needs
  • LangCache, for semantic caching so your agent does not keep paying for the same work over and over

I found too many individual tools to do all of these things, and there were too many lifecycle rules, and too many places where mysterious behavior squished together could ruin my afternoon. Redis Iris is appealing to me because it gives developers a practical, boring default. With it, context is all I need.

And I say boring in a positive way! Boring means fewer surprises, fewer custom glue layers, and that the interesting work can happen higher up. I love that it treats context like real infrastructure. It brings together all of those pieces of memory and retrieval and freshness and search and caching, all close to where the agent is actually running, and I don't have to think about it anymore.

Now, I can build the agent. Redis can handle the context. No more drama.