
What Is Context Engineering And Why Should You Care? | In The Loop Episode 23


Published by Jack Houghton and Anna Kocsis

Published on July 10, 2025

Read time: 9 min read

Category: Podcast

The term context engineering has become one of the biggest buzzwords in AI over the past 10 days. If industry leaders are to be believed, it’s something you’ll need to start learning—many are calling it the next big evolution after prompt engineering.

Put simply, prompt engineering is about what you ask an AI model to do. Context engineering, on the other hand, is about what the model already knows when you ask it.

This shift reflects hard-won lessons in building AI agent systems. You can create something with all the bells and whistles, but if it lacks the right context, it’s never going to deliver good outcomes for people.

In today’s episode, we’re going to explore the evolution from prompt engineering to context engineering. I’ll explain what context really means in the world of AI, and I’ll break down some of the real, tangible things you’re going to start noticing as a result of better context engineering.

This is In The Loop with Jack Houghton. I hope you enjoy the show.

Context Engineering vs. Prompt Engineering

Most of you probably already know what prompt engineering is, but just for the sake of the episode, I’ll spend 20 seconds on it.

Prompt engineering is a term that gained mainstream traction around late 2022, after the launch of ChatGPT. Millions of people were suddenly trying to figure out how to communicate with AI systems. It even became one of the first job titles advertised by big companies. Prompt engineering turned into a legitimate skillset—people were doing courses on it. At its core, prompt engineering is about using trial and error, plus intuition, to craft instructions—both in content and in wording—that help the model give you the result you want.

One famous example is the prompt: "Let's think step by step." This led to what’s now known as chain-of-thought, a reasoning technique that improves AI performance by encouraging it to create a plan and follow it step by step. It’s been shown to significantly enhance outcomes compared to asking a simple question outright.
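To make that concrete, here's a minimal sketch of zero-shot chain-of-thought prompting in Python. The call_model function is a hypothetical stand-in for whatever LLM client you use; the only real trick is appending the trigger phrase to the user's question.

```python
# Minimal sketch of zero-shot chain-of-thought prompting.
# `call_model` is a hypothetical stand-in for your LLM client of choice.

def call_model(prompt: str) -> str:
    """Send a prompt to a language model and return its text response."""
    raise NotImplementedError("Wire this up to your own LLM client.")

def ask_plain(question: str) -> str:
    # Baseline: ask the question outright.
    return call_model(question)

def ask_with_chain_of_thought(question: str) -> str:
    # Nudge the model to create a plan and reason through it before answering.
    return call_model(f"{question}\n\nLet's think step by step.")
```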

But as AI use cases have grown more complex, so have the tasks that agents are expected to perform. This has exposed some clear limitations in prompt engineering. Early prompt engineering is essentially a one-shot approach: you write a single prompt—maybe with examples or instructions—and the model responds. But many real-world tasks require more than that. They involve dialogue with users, memory, and the ability to use tools—like integrating with external systems to pull in data.

For instance, asking a model to search the web and then write a summary can’t be done in a single prompt. The model needs to perform multiple actions: searching, remembering the results, and synthesizing them into a response.

To solve this, many builders—including us—use prompt chaining. That means linking prompts together so the model builds context step by step. It completes one step, uses the output as context for the next, and so on. This has led to innovations like RAG (Retrieval-Augmented Generation) and ReAct (Reasoning + Action), which are techniques for getting models to retrieve and use new information to enhance their responses.
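As a rough illustration of prompt chaining (a generic sketch, not any particular product's pipeline), each step's output simply becomes context for the next prompt. Here, call_model and search_web are hypothetical stand-ins for an LLM client and a search tool.

```python
# Rough sketch of prompt chaining: each step's output becomes context
# for the next prompt. `call_model` and `search_web` are hypothetical
# stand-ins, not a specific product's API.

def call_model(prompt: str) -> str:
    raise NotImplementedError("Wire this up to your own LLM client.")

def search_web(query: str) -> list[str]:
    raise NotImplementedError("Wire this up to your own search tool.")

def summarize_topic(topic: str) -> str:
    # Step 1: ask the model what to search for.
    query = call_model(f"Write one good web search query about: {topic}")

    # Step 2: retrieve supporting text (the 'retrieval' in RAG).
    snippets = "\n".join(search_web(query))

    # Step 3: synthesize, carrying the retrieved text forward as context.
    return call_model(
        f"Using only the sources below, write a short summary of {topic}.\n\n"
        f"Sources:\n{snippets}"
    )
```

The same retrieve-then-generate pattern is the essence of RAG: the model answers with the fetched material sitting in its context rather than from its weights alone.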

The point is this: it’s no longer just about the instruction you send to a model—it’s about the context the model has, and how it carries that context through multiple steps.

This is where context engineering really starts to take center stage. It’s all about deciding what a language model should know before it responds—not just what you ask, but what data, tools, memories, instructions, and history are fed into the system to guide the answer.

The term itself was recently popularized on Twitter by Shopify’s CEO, Tobias Lütke.

Others have echoed this. For example, Andrej Karpathy, one of OpenAI’s early team members and the person who coined the term vibe coding, described it as the art and science of filling the model’s context window just enough to make the next step work—not too much, not too little.

So, you can think of context engineering like orchestration. You’re orchestrating an agent across multiple steps, deciding what information it should have after each step, and how it should use that context moving forward.

Let’s go one level deeper and unpack what actually counts as context for a language model.

Every piece of software built on top of a language model—including ours at Mindset AI—is essentially defining and delivering this context and instruction pipeline. And this is where the real complexity lies—because making it all work smoothly is tough. You'll see why.

There are several layers of context:

  • The immediate prompt: This is the direct instruction from the user, like: “Create me a marketing strategy.” This is where prompt engineering has traditionally been focused.
  • System instructions: This is the system message sent to the model to define behavior. For example: “You are an expert travel planner.” These instructions set tone, scope, guardrails, personality, and response structure. Our own agent builder has a whole framework around this to ensure the right behavior at the right time.
  • Conversation history: Also known as short-term memory. This includes the messages within the current thread. For example, when you say, “As I mentioned earlier,” the model needs to know what earlier means. Software like ChatGPT captures and condenses message history and sends that along with the user message to give the model proper context.
  • Long-term memory or persistent memory: This includes summaries of past interactions, user preferences, profile info, or even chats from different threads. It allows the model to "remember" that you work in a certain field or dislike certain types of responses. At Mindset AI, we're launching a long-term memory layer next month to let customers define what their agents should remember.
  • External information retrieval: This includes both RAG-style document lookup and integrations with third-party systems. For example, a user might ask a question that requires pulling data from PDFs, code snippets, or videos. The system fetches that information and includes it in the prompt to the model. Another example is a support bot that pulls a user’s billing data in real time.
  • Tool definitions: If the model can use tools—like APIs or calculators—it needs to understand how to call those tools and process the results. For example, if it’s summarizing your WhatsApp messages, it needs to retain and use that output in the next step of the process.

So that’s what we mean by context in an AI system. It’s just like a human needing the right background to do a good job—but in GenAI systems, this has huge technical and design implications.
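To show how those layers might come together in practice, here's a simplified sketch of assembling them into a single model call. The field names and structure are illustrative assumptions, not Mindset AI's actual implementation.

```python
# Simplified sketch of assembling the context layers above into one
# model call. Names and structure are illustrative only.
from dataclasses import dataclass, field

@dataclass
class AgentContext:
    system_instructions: str                                          # behavior, tone, guardrails
    conversation_history: list[dict] = field(default_factory=list)   # prior {"role", "content"} messages
    long_term_memory: list[str] = field(default_factory=list)        # persistent user facts and preferences
    retrieved_documents: list[str] = field(default_factory=list)     # RAG lookups and integrations
    tool_definitions: list[str] = field(default_factory=list)        # descriptions of callable tools

def build_messages(ctx: AgentContext, user_prompt: str) -> list[dict]:
    """Flatten the context layers into one message list for the model call."""
    sections = [ctx.system_instructions]
    if ctx.long_term_memory:
        sections.append("Known about this user:\n" + "\n".join(ctx.long_term_memory))
    if ctx.retrieved_documents:
        sections.append("Relevant documents:\n" + "\n".join(ctx.retrieved_documents))
    if ctx.tool_definitions:
        sections.append("Available tools:\n" + "\n".join(ctx.tool_definitions))
    system_message = {"role": "system", "content": "\n\n".join(sections)}
    return [system_message] + ctx.conversation_history + [{"role": "user", "content": user_prompt}]
```

The assembly itself is the easy part. The hard engineering is deciding what goes into each of those lists, and how much, on every single call.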

And that’s one of the main reasons we’re building a new orchestration system at Mindset AI—to help customers deliver the right context to the model, at the right time.


To wrap this section up: simpler, sequential processes—step-by-step orchestration—tend to outperform multi-agent systems right now. I’ll explain why in a moment.

Context in multi-agent systems

We’ve talked about multi-agent systems in past episodes, where you have teams of agents working on different parts of a problem. It’s a popular concept right now, it comes up in almost every discussion of context engineering, and it highlights why managing context is so critical.

Let me break this down.

The appeal of multi-agent systems is clear: you can delegate different tasks to different agents, have them work in parallel, and then combine their outputs to get a faster result. For example, Agent A could break a task into smaller parts, assign them to Agents B and C, and then combine their work into one output. It sounds efficient. You could even have agents specialize—one focused on breaking down tasks, another on summarizing information.

But in practice, it often fails. And the failure point is almost always coordination—which really means context. These agents aren’t sharing the same context.

A great example of this comes from a recent post by Cognition, the team behind the Devin coding agent, provocatively titled "Don't Build Multi-Agents." In it, they describe an experiment where a team of agents was asked to build a Flappy Bird clone. One sub-agent was tasked with creating the background graphics, and another with designing the bird character.

Here’s what happened: the background agent produced Mario-style scenery instead of Flappy Bird pipes. The other agent, unaware of that choice, created a bird that didn’t match the style. When they tried to merge the two pieces, the result felt completely disjointed. The agents simply didn’t have access to the same full context or to each other's decisions.

So, you might think—just give all the agents the full context. Pass along the task description, conversation history, decisions made—everything.

But even that isn’t enough.

Even with the full context, agents can still make conflicting assumptions. One might assume the game should be cartoonish, the other realistic. It’s not just about information sharing—it’s about how agents interpret that information. Every decision has knock-on effects.

That’s why Cognition Labs argues that single-agent architectures—where one agent handles tasks sequentially—are more reliable. And we’ve seen the same thing ourselves.

At Mindset AI, we’re enabling exactly this—giving our customers the ability to define a sequence of steps, link to separate tools or integrations, and ensure their agents carry long-term memory through the conversation. It’s all about controlling what the agent does with its data and how it uses context at every step.
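As a loose illustration of why the sequential approach is easier to keep coherent, here's a generic sketch of a single-agent loop in which every step reads from and writes to one shared context. Again, call_model is a hypothetical stand-in, and this is not Mindset AI's orchestration system.

```python
# Generic sketch of sequential, single-agent orchestration: every step
# reads and extends the same shared context, so later steps always see
# earlier decisions. Not any particular product's implementation.

def call_model(prompt: str) -> str:
    raise NotImplementedError("Wire this up to your own LLM client.")

def run_sequential(task: str, steps: list[str]) -> str:
    shared_context = f"Task: {task}"
    result = ""
    for step in steps:
        result = call_model(f"{shared_context}\n\nNext step: {step}")
        # Carry the step's output forward so nothing is decided in isolation.
        shared_context += f"\n\n{step}:\n{result}"
    return result
```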


Anthropic reported the same challenges in their research assistants. And we’ve run into these problems firsthand when experimenting with multi-agent systems in our own work.

So, while multi-agent systems are exciting, and even a single agent can be incredibly powerful, success ultimately depends on how well your software manages context: how you prompt the model, what history you give it, and how well that context is preserved and interpreted at each step.

This is going to be a major area of innovation over the next year. Personally, I think these problems will be solved—and probably faster than expected. If I say 12 months, it’ll probably be six.

Why you should care about context engineering

So, how does this all affect you?

Why should you care about context engineering?

The short answer: because when it's done well—whether you’re using something simple like ChatGPT or a more complex AI product—you feel it. You walk away thinking, that was actually helpful. And that’s a direct result of good context engineering.

It also impacts the kinds of experiences, features, and interactions you’ll see in AI products going forward.

Personalization

Take personalization, for example. Systems are starting to remember your preferences, your past conversations—everything about how you like to work. That’s only going to become more common.

UI and UX patterns

We're also seeing new UI and UX patterns emerge. As AI gets embedded into apps, designers are figuring out clever ways to feed context to agents—often without the user realizing it. A great example is the “upload file” button in many chat interfaces. On the surface, it’s just: I want to talk to this document. But what you’re really doing is giving the AI more context to reason with.

Modes of interaction

You’ll also see more modes of interaction. If you’ve used the ChatGPT app, you’ve probably noticed options like “deep research,” “Google search,” or “study with me.” Each of these is just giving the model context—telling it how to behave in that scenario, even if it’s still a single-agent architecture under the hood.

Multi-tenancy

Another huge challenge is context in multi-tenant environments. This is especially relevant to our customers who are SaaS platforms or enterprise software providers. They often serve thousands of different organizations, each with its own data, users, and rules. So how do you manage context across all of that—securely, privately, and effectively?

We’ve been investing a lot of time into solving exactly that. Imagine a sales platform that connects to a rep’s contacts, CRM, and inbox. When that rep asks the AI to draft an email, it pulls everything it needs from across those systems—like magic. But that “magic” only works because there’s a secure, well-designed context layer under the hood.
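As a toy sketch of the core idea, not how any particular platform implements it, the key property is that every context lookup is scoped to the tenant making the request, so one organization's data can never end up in another organization's prompt.

```python
# Toy sketch of tenant-scoped context retrieval: every lookup is filtered
# by the requesting organization's ID. Illustrative only.

def retrieve_context(tenant_id: str, query: str, store: dict[str, list[str]]) -> list[str]:
    # Only search documents that belong to this tenant.
    documents = store.get(tenant_id, [])
    return [doc for doc in documents if query.lower() in doc.lower()]
```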

Privacy and trust

And finally, privacy and trust. This is going to matter more and more. These systems are capturing huge amounts of data—and we should all be asking: where is it going? Who sees it? How is it used?

I’ll be honest—I use AI like a life coach. For almost any question I have, I go to ChatGPT or another system first. But that also means I care deeply about how my data is handled.

Closing thoughts

So to wrap things up: early on, we were all obsessed with the model itself—and how to talk to it using prompt engineering. But now, we’re realizing that an AI agent isn’t just the model. It’s the entire system around it—the scaffolding that feeds it context, interprets its outputs, and uses those outputs to decide what to do next.

We’re moving up a layer of abstraction.

It’s no longer just: how do I prompt the model to give me X?

It’s now: what information and guidance does the model need to generate something great?

Anyway, I hope you enjoyed this episode. I certainly did. And I’ll see you next week.
