
What Is Context Engineering And Why Should You Care? | In The Loop Episode 23


Published by Jack Houghton and Anna Kocsis

Published on July 10, 2025

Read time: 9 min read

Category: Podcast

The term context engineering has become one of the biggest buzzwords in AI over the past 10 days. If industry leaders are to be believed, it’s something you’ll need to start learning—many are calling it the next big evolution after prompt engineering.

Put simply, prompt engineering is about what you ask an AI model to do. Context engineering, on the other hand, is about what the model already knows when you ask it.

This shift reflects hard-won lessons in building AI agent systems. You can create something with all the bells and whistles, but if it lacks the right context, it’s never going to deliver good outcomes for people.

In today’s episode, we’re going to explore the evolution from prompt engineering to context engineering. I’ll explain what context really means in the world of AI, and I’ll break down some of the real, tangible things you’re going to start noticing as a result of better context engineering.

This is In The Loop with Jack Houghton. I hope you enjoy the show.

Context Engineering vs. Prompt Engineering

Most of you probably already know what prompt engineering is, but just for the sake of the episode, I’ll spend 20 seconds on it.

Prompt engineering is a term that gained mainstream traction around late 2022, after the launch of ChatGPT. Millions of people were suddenly trying to figure out how to communicate with AI systems. It even became one of the first job titles advertised by big companies. Prompt engineering turned into a legitimate skillset—people were doing courses on it. At its core, prompt engineering is about using trial and error, plus intuition, to craft instructions—both in content and in wording—that help the model give you the result you want.

One famous example is the prompt: "Let's think step by step." This led to what’s now known as chain-of-thought, a reasoning technique that improves AI performance by encouraging it to create a plan and follow it step by step. It’s been shown to significantly enhance outcomes compared to asking a simple question outright.
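To make that concrete, here's a minimal sketch of zero-shot chain-of-thought prompting in Python. The call_model function is a hypothetical stand-in for whatever LLM client you use; the only real trick is appending the trigger phrase to the user's question.

```python
# Minimal sketch of zero-shot chain-of-thought prompting.
# `call_model` is a hypothetical stand-in for your LLM client of choice.

def call_model(prompt: str) -> str:
    """Send a prompt to a language model and return its text response."""
    raise NotImplementedError("Wire this up to your own LLM client.")

def ask_plain(question: str) -> str:
    # Baseline: ask the question outright.
    return call_model(question)

def ask_with_chain_of_thought(question: str) -> str:
    # Nudge the model to create a plan and reason through it before answering.
    return call_model(f"{question}\n\nLet's think step by step.")
```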

But as AI use cases have grown more complex, so have the tasks that agents are expected to perform. This has exposed some clear limitations in prompt engineering. Early prompt engineering is essentially a one-shot approach: you write a single prompt—maybe with examples or instructions—and the model responds. But many real-world tasks require more than that. They involve dialogue with users, memory, and the ability to use tools—like integrating with external systems to pull in data.

For instance, asking a model to search the web and then write a summary can’t be done in a single prompt. The model needs to perform multiple actions: searching, remembering the results, and synthesizing them into a response.

To solve this, many builders—including us—use prompt chaining. That means linking prompts together so the model builds context step by step. It completes one step, uses the output as context for the next, and so on. This has led to innovations like RAG (Retrieval-Augmented Generation) and ReAct (Reasoning + Action), which are techniques for getting models to retrieve and use new information to enhance their responses.
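As a rough illustration of prompt chaining (a generic sketch, not any particular product's pipeline), each step's output simply becomes context for the next prompt. Here, call_model and search_web are hypothetical stand-ins for an LLM client and a search tool.

```python
# Rough sketch of prompt chaining: each step's output becomes context
# for the next prompt. `call_model` and `search_web` are hypothetical
# stand-ins, not a specific product's API.

def call_model(prompt: str) -> str:
    raise NotImplementedError("Wire this up to your own LLM client.")

def search_web(query: str) -> list[str]:
    raise NotImplementedError("Wire this up to your own search tool.")

def summarize_topic(topic: str) -> str:
    # Step 1: ask the model what to search for.
    query = call_model(f"Write one good web search query about: {topic}")

    # Step 2: retrieve supporting text (the 'retrieval' in RAG).
    snippets = "\n".join(search_web(query))

    # Step 3: synthesize, carrying the retrieved text forward as context.
    return call_model(
        f"Using only the sources below, write a short summary of {topic}.\n\n"
        f"Sources:\n{snippets}"
    )
```

The same retrieve-then-generate pattern is the essence of RAG: the model answers with the fetched material sitting in its context rather than from its weights alone.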

The point is this: it’s no longer just about the instruction you send to a model—it’s about the context the model has, and how it carries that context through multiple steps.

This is where context engineering really starts to take center stage. It’s all about deciding what a language model should know before it responds—not just what you ask, but what data, tools, memories, instructions, and history are fed into the system to guide the answer.

The term itself was recently popularized on Twitter by Shopify’s CEO, Tobias Lütke.

Others have echoed this. For example, Andrej Karpathy, one of OpenAI’s early team members and the person who coined the term vibe coding, described it as the art and science of filling the model’s context window just enough to make the next step work—not too much, not too little.

So, you can think of context engineering like orchestration. You’re orchestrating an agent across multiple steps, deciding what information it should have after each step, and how it should use that context moving forward.

Let’s go one level deeper and unpack what actually counts as context for a language model.

Every piece of software built on top of a language model—including ours at Mindset AI—is essentially defining and delivering this context and instruction pipeline. And this is where the real complexity lies—because making it all work smoothly is tough. You'll see why.

There are several layers of context:

  • The immediate prompt: This is the direct instruction from the user, like: “Create me a marketing strategy.” This is where prompt engineering has traditionally been focused.
  • System instructions: This is the system message sent to the model to define behavior. For example: “You are an expert travel planner.” These instructions set tone, scope, guardrails, personality, and response structure. Our own agent builder has a whole framework around this to ensure the right behavior at the right time.
  • Conversation history: Also known as short-term memory. This includes the messages within the current thread. For example, when you say, “As I mentioned earlier,” the model needs to know what earlier means. Software like ChatGPT captures and condenses message history and sends that along with the user message to give the model proper context.
  • Long-term memory or persistent memory: This includes summaries of past interactions, user preferences, profile info, or even chats from different threads. It allows the model to "remember" that you work in a certain field or dislike certain types of responses. At Mindset AI, we're launching a long-term memory layer next month to let customers define what their agents should remember.
  • External information retrieval: This includes both RAG-style document lookup and integrations with third-party systems. For example, a user might ask a question that requires pulling data from PDFs, code snippets, or videos. The system fetches that information and includes it in the prompt to the model. Another example is a support bot that pulls a user’s billing data in real time.
  • Tool definitions: If the model can use tools—like APIs or calculators—it needs to understand how to call those tools and process the results. For example, if it’s summarizing your WhatsApp messages, it needs to retain and use that output in the next step of the process.

So that’s what we mean by context in an AI system. It’s just like a human needing the right background to do a good job—but in GenAI systems, this has huge technical and design implications.
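To show how those layers might come together in practice, here's a simplified sketch of assembling them into a single model call. The field names and structure are illustrative assumptions, not Mindset AI's actual implementation.

```python
# Simplified sketch of assembling the context layers above into one
# model call. Names and structure are illustrative only.
from dataclasses import dataclass, field

@dataclass
class AgentContext:
    system_instructions: str                                          # behavior, tone, guardrails
    conversation_history: list[dict] = field(default_factory=list)   # prior {"role", "content"} messages
    long_term_memory: list[str] = field(default_factory=list)        # persistent user facts and preferences
    retrieved_documents: list[str] = field(default_factory=list)     # RAG lookups and integrations
    tool_definitions: list[str] = field(default_factory=list)        # descriptions of callable tools

def build_messages(ctx: AgentContext, user_prompt: str) -> list[dict]:
    """Flatten the context layers into one message list for the model call."""
    sections = [ctx.system_instructions]
    if ctx.long_term_memory:
        sections.append("Known about this user:\n" + "\n".join(ctx.long_term_memory))
    if ctx.retrieved_documents:
        sections.append("Relevant documents:\n" + "\n".join(ctx.retrieved_documents))
    if ctx.tool_definitions:
        sections.append("Available tools:\n" + "\n".join(ctx.tool_definitions))
    system_message = {"role": "system", "content": "\n\n".join(sections)}
    return [system_message] + ctx.conversation_history + [{"role": "user", "content": user_prompt}]
```

The assembly itself is the easy part. The hard engineering is deciding what goes into each of those lists, and how much, on every single call.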

And that’s one of the main reasons we’re building a new orchestration system at Mindset AI—to help customers deliver the right context to the model, at the right time.


To wrap this section up: simpler, sequential processes—step-by-step orchestration—tend to outperform multi-agent systems right now. I’ll explain why in a moment.

Context in multi-agent systems

We’ve talked about multi-agent systems in past episodes, where you have teams of agents working on different parts of a problem. It’s a popular concept right now, it comes up in almost every discussion of context engineering, and it highlights why managing context is so critical.

Let me break this down.

The appeal of multi-agent systems is clear: you can delegate different tasks to different agents, have them work in parallel, and then combine their outputs to get a faster result. For example, Agent A could break a task into smaller parts, assign them to Agents B and C, and then combine their work into one output. It sounds efficient. You could even have agents specialize—one focused on breaking down tasks, another on summarizing information.

But in practice, it often fails. And the failure point is almost always coordination—which really means context. These agents aren’t sharing the same context.

A great example of this comes from a recent post by Cognition, the team behind the Devin coding agent, provocatively titled "Don't Build Multi-Agents." In it, they describe an experiment where a team of agents was asked to build a Flappy Bird clone. One sub-agent was tasked with creating the background graphics, and another with designing the bird character.

Here’s what happened: the background agent produced Mario-style scenery instead of Flappy Bird pipes. The other agent, unaware of that choice, created a bird that didn’t match the style. When they tried to merge the two pieces, the result felt completely disjointed. The agents simply didn’t have access to the same full context or to each other's decisions.

So, you might think—just give all the agents the full context. Pass along the task description, conversation history, decisions made—everything.

But even that isn’t enough.

Even with the full context, agents can still make conflicting assumptions. One might assume the game should be cartoonish, the other realistic. It’s not just about information sharing—it’s about how agents interpret that information. Every decision has knock-on effects.

That’s why Cognition Labs argues that single-agent architectures—where one agent handles tasks sequentially—are more reliable. And we’ve seen the same thing ourselves.

At Mindset AI, we’re enabling exactly this—giving our customers the ability to define a sequence of steps, link to separate tools or integrations, and ensure their agents carry long-term memory through the conversation. It’s all about controlling what the agent does with its data and how it uses context at every step.
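As a loose illustration of why the sequential approach is easier to keep coherent, here's a generic sketch of a single-agent loop in which every step reads from and writes to one shared context. Again, call_model is a hypothetical stand-in, and this is not Mindset AI's orchestration system.

```python
# Generic sketch of sequential, single-agent orchestration: every step
# reads and extends the same shared context, so later steps always see
# earlier decisions. Not any particular product's implementation.

def call_model(prompt: str) -> str:
    raise NotImplementedError("Wire this up to your own LLM client.")

def run_sequential(task: str, steps: list[str]) -> str:
    shared_context = f"Task: {task}"
    result = ""
    for step in steps:
        result = call_model(f"{shared_context}\n\nNext step: {step}")
        # Carry the step's output forward so nothing is decided in isolation.
        shared_context += f"\n\n{step}:\n{result}"
    return result
```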


Anthropic reported the same challenges in their research assistants. And we’ve run into these problems firsthand when experimenting with multi-agent systems in our own work.

So, while multi-agent systems are exciting, and even a single agent can be incredibly powerful, success ultimately depends on how well your software manages context: how you prompt the model, what history you give it, and how well that context is preserved and interpreted at each step.

This is going to be a major area of innovation over the next year. Personally, I think these problems will be solved—and probably faster than expected. If I say 12 months, it’ll probably be six.

Why you should care about context engineering

So, how does this all affect you?

Why should you care about context engineering?

The short answer: because when it's done well—whether you’re using something simple like ChatGPT or a more complex AI product—you feel it. You walk away thinking, that was actually helpful. And that’s a direct result of good context engineering.

It also impacts the kinds of experiences, features, and interactions you’ll see in AI products going forward.

Personalization

Take personalization, for example. Systems are starting to remember your preferences, your past conversations—everything about how you like to work. That’s only going to become more common.

UI and UX patterns

We're also seeing new UI and UX patterns emerge. As AI gets embedded into apps, designers are figuring out clever ways to feed context to agents—often without the user realizing it. A great example is the “upload file” button in many chat interfaces. On the surface, it’s just: I want to talk to this document. But what you’re really doing is giving the AI more context to reason with.

Modes of interaction

You’ll also see more modes of interaction. If you’ve used the ChatGPT app, you’ve probably noticed options like “deep research,” “Google search,” or “study with me.” Each of these is just giving the model context—telling it how to behave in that scenario, even if it’s still a single-agent architecture under the hood.

Multi-tenancy

Another huge challenge is context in multi-tenant environments. This is especially relevant to our customers who are SaaS platforms or enterprise software providers. They often serve thousands of different organizations, each with its own data, users, and rules. So how do you manage context across all of that—securely, privately, and effectively?

We’ve been investing a lot of time into solving exactly that. Imagine a sales platform that connects to a rep’s contacts, CRM, and inbox. When that rep asks the AI to draft an email, it pulls everything it needs from across those systems—like magic. But that “magic” only works because there’s a secure, well-designed context layer under the hood.
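As a toy sketch of the core idea, not how any particular platform implements it, the key property is that every context lookup is scoped to the tenant making the request, so one organization's data can never end up in another organization's prompt.

```python
# Toy sketch of tenant-scoped context retrieval: every lookup is filtered
# by the requesting organization's ID. Illustrative only.

def retrieve_context(tenant_id: str, query: str, store: dict[str, list[str]]) -> list[str]:
    # Only search documents that belong to this tenant.
    documents = store.get(tenant_id, [])
    return [doc for doc in documents if query.lower() in doc.lower()]
```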

Privacy and trust

And finally, privacy and trust. This is going to matter more and more. These systems are capturing huge amounts of data—and we should all be asking: where is it going? Who sees it? How is it used?

I’ll be honest—I use AI like a life coach. For almost any question I have, I go to ChatGPT or another system first. But that also means I care deeply about how my data is handled.

Closing thoughts

So to wrap things up: early on, we were all obsessed with the model itself—and how to talk to it using prompt engineering. But now, we’re realizing that an AI agent isn’t just the model. It’s the entire system around it—the scaffolding that feeds it context, interprets its outputs, and uses those outputs to decide what to do next.

We’re moving up a layer of abstraction.

It’s no longer just: how do I prompt the model to give me X?

It’s now: what information and guidance does the model need to generate something great?

Anyway, I hope you enjoyed this episode. I certainly did. And I’ll see you next week.
