Build a Star Wars Copilot in C# - Lesson 2: Chat History and System Prompts

Jim Bennett | Feb 10, 2026

In lesson 1, we got a basic prompt/response flow working.

Now we make it feel like an actual copilot by adding:

  • chat history
  • message roles
  • a system prompt

This is where things get fun.

Lessons in this series

Lesson 0: Self-Setup
Lesson 1: Chat with an LLM
Lesson 2: Chat History and System Prompts
Lesson 3: Model Choice and Local Models
Lesson 4: Tool Calling
Lesson 5: MCP (Model Context Protocol)
Lesson 6: RAG from a Database
Lesson 7: Multimodal Image Generation
Lesson 8: Agents and Orchestration

Before you start (self-setup)

If you’re following along on your own, complete lesson 0 and lesson 1 first.

Lesson 2 reuses the same Azure OpenAI endpoint, API key, and model deployment from lesson 1.

Why follow-up questions failed

LLMs don’t “remember” by default. Every call is stateless unless you pass prior messages.

So in lesson 2 we move from:

await chatClient.GetResponseAsync(userInput);

to:

await chatClient.GetResponseAsync(history);

where history includes both user and assistant messages.

Build chat memory

We start with:

var history = new List<ChatMessage>();

Then for each turn:

  1. add user message to history
  2. get model response using full history
  3. add assistant response back to history

That loop gives the conversation continuity, so a follow-up like “What is the worst?” can refer back to the previous question.
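Assuming the Microsoft.Extensions.AI `IChatClient` set up in lesson 1 (the `chatClient` variable name is carried over from there), the three steps above look roughly like this sketch:

```csharp
using Microsoft.Extensions.AI;

// history accumulates every turn of the conversation
var history = new List<ChatMessage>();

while (true)
{
    Console.Write("> ");
    var userInput = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(userInput)) break;

    // 1. add the user message to history
    history.Add(new ChatMessage(ChatRole.User, userInput));

    // 2. get the model response using the full history, not just the latest input
    var response = await chatClient.GetResponseAsync(history);
    Console.WriteLine(response.Text);

    // 3. add the assistant response back to history for the next turn
    history.Add(new ChatMessage(ChatRole.Assistant, response.Text));
}
```

The key change from lesson 1 is that the whole `history` list goes over the wire on every call; the model itself stays stateless.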

Message roles matter

By this point we have two roles:

  • User
  • Assistant

Then we add a third:

  • System

System messages are the highest-priority behavior instructions. This is where tone, format rules, and constraints live.
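In Microsoft.Extensions.AI terms, a system message is just a `ChatMessage` with `ChatRole.System`, added once at the front of the history before the loop starts (the prompt text here is a placeholder):

```csharp
using Microsoft.Extensions.AI;

var history = new List<ChatMessage>
{
    // The system message sets behavior for the whole conversation;
    // user and assistant messages get appended after it each turn.
    new ChatMessage(ChatRole.System,
        "You are a helpful copilot. Keep responses concise.")
};
```

Because it sits in the same list as everything else, the system message is sent on every call, which is what keeps the behavior consistent across turns.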

System prompt: from generic bot to Star Wars copilot

The workshop evolves the prompt to:

  • keep responses concise (optional)
  • answer in Yoda style
  • warn about the dark side
  • respond to “hello there” with only “General Kenobi!”

It sounds playful (because it is), but this is a great pattern for real apps: keep core behavior in one explicit, testable prompt.
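A system prompt covering those rules might look like the following. This is an illustrative wording, not necessarily the workshop's exact prompt; see the Lesson 2 README for the real one:

```csharp
using Microsoft.Extensions.AI;

const string SystemPrompt =
    """
    You are a Star Wars copilot.
    Answer in the style of Yoda, speak you must.
    Keep your responses concise.
    When relevant, warn the user about the dangers of the dark side.
    If the user says "hello there", respond with exactly "General Kenobi!"
    and nothing else.
    """;

history.Add(new ChatMessage(ChatRole.System, SystemPrompt));
```

Keeping every behavior rule in one string like this is what makes the prompt easy to review, version, and test against expected outputs.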

Tradeoff: memory vs tokens

Sending history improves quality, but increases token usage and cost.

That tradeoff is unavoidable in chat apps. The practical takeaway: keep enough context for quality, but not so much that costs or latency spike unnecessarily.
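One simple mitigation is to cap what you send: keep the system message plus only the most recent messages. A minimal sketch (the window size of 10 is an arbitrary budget, and `TrimHistory` is a hypothetical helper, not part of the workshop code):

```csharp
using System.Linq;
using Microsoft.Extensions.AI;

// Arbitrary budget; tune it for your model's context window and cost targets.
const int MaxMessages = 10;

static IList<ChatMessage> TrimHistory(IList<ChatMessage> history)
{
    // Always keep the system message so behavior rules survive trimming.
    var system = history.Where(m => m.Role == ChatRole.System).Take(1);

    // Keep only the most recent user/assistant turns.
    var recent = history
        .Where(m => m.Role != ChatRole.System)
        .TakeLast(MaxMessages);

    return system.Concat(recent).ToList();
}

// Send the trimmed view, but keep the full history locally:
// var response = await chatClient.GetResponseAsync(TrimHistory(history));
```

Fancier strategies exist (summarizing older turns, token-based budgets), but a sliding window like this is usually the first thing to reach for.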

Suggested banner prompt

A stylized sci-fi scene showing layered chat bubbles orbiting around a glowing holographic AI mentor in a spaceship command room, one bubble labeled by icon only for system rules, one for user, one for assistant. Warm cinematic lighting, high detail, no text, no logos.

Follow along

Workshop source for this lesson: Lesson 2 README.

Next up: swapping model providers, including local models with Foundry Local.

Note: Original workshop repository: jimbobbennett/StarWarsCopilot.