Step 4: Multi-Turn Conversation

Maintain context across multiple exchanges

1 ExplorePlay below
2 ReadUnderstand
3 BuildHands-on lab
4 CompareSolution
💡 ReflectThink deeper

Stateless API, Stateful Conversation

The LLM API is stateless -- every call is independent. The model has zero memory of previous calls. To create a conversation, you must send the full history every time:

TurnMessages you sendWhat the model sees
Turn 1[user: "What is lateral movement?"]Just the question
Turn 2[user: "What is lateral movement?", assistant: "Lateral movement is...", user: "How do I detect it?"]The entire conversation so far
Turn 3[user, assistant, user, assistant, user: "What tools?"]All 5 messages -- growing every turn

The conversation list grows with every exchange. You are responsible for maintaining this state.

Building a Multi-Turn Conversation

system = "You are a threat intelligence analyst. Be concise and technical."
messages = []

def chat(user_input):
    """Send a message and get a response, maintaining history."""
    messages.append({"role": "user", "content": user_input})

    response = client.chat(
        system=system,
        messages=messages,
        max_tokens=300,
    )

    messages.append({"role": "assistant", "content": response})
    return response

# Turn 1
print(chat("What is lateral movement?"))

# Turn 2 -- model remembers the context from turn 1
print(chat("How do I detect it in a Windows environment?"))

# Turn 3 -- model has full conversation history
print(chat("What Sigma rules should I deploy?"))

Context Window and Token Limits

As conversations grow, so does the token count. Every model has a maximum context window:

ModelContext windowApproximate word limit
Claude Sonnet200K tokens~150,000 words
GPT-4o128K tokens~96,000 words
Gemini Flash1M tokens~750,000 words
Ollama (small models)4K-32K tokens~3,000-24,000 words

When the conversation exceeds the context window, you must truncate older messages. Common strategies: drop the oldest turns, summarise the conversation so far, or keep only the system prompt and last N turns.

Building an Interactive Security Assistant

system = """You are a security incident response assistant.
When given an incident description, help the analyst through triage:
1. Classify the incident type
2. Suggest containment actions
3. Identify evidence to preserve
4. Recommend escalation criteria
Maintain context across the conversation."""

messages = []

print("Security Assistant (type 'quit' to exit)")
print("-" * 45)

while True:
    user_input = input("\nYou: ").strip()
    if user_input.lower() == "quit":
        break

    messages.append({"role": "user", "content": user_input})
    response = client.chat(system=system, messages=messages, max_tokens=400)
    messages.append({"role": "assistant", "content": response})

    print(f"\nAssistant: {response}")

This is the skeleton of every AI-powered security tool: a system prompt that defines the role, a message loop that maintains context, and an LLM that generates responses.

Loading...
Loading...
Loading...

Think Deeper

Start a conversation about a suspicious IP. After 3 turns, ask 'what IP were we discussing?' Does the model remember? Now start a new conversation and ask the same question without history. What happens?

With the full conversation history, the model correctly recalls the IP because you are sending the entire conversation every time. Without history, the model has no idea -- it is stateless. This has security implications: if an attacker can manipulate the conversation history (prompt injection via earlier messages), they can change the model's understanding of the entire context. Conversation history is an attack surface.
Cybersecurity tie-in: Conversation history is an attack surface. In a multi-turn security assistant, an attacker who can inject content into earlier messages (e.g., via a malicious log entry that becomes part of the conversation) can influence all subsequent responses. This is indirect prompt injection. Defence: sanitise all external data before inserting it into the message history, and never let untrusted content appear in the assistant role.

Loading...