Quiz — Working with LLM APIs

1 of 5

What does the max_tokens parameter actually control on an LLM API call?

max_tokens caps the output length. Set it too low and answers get truncated mid-sentence; set it too high and you pay for unused capacity. In a security pipeline processing thousands of alerts/hour, this is a primary cost and latency control.

2 of 5

What is the purpose of the system prompt?

The system prompt tells the model who it is and how it should respond. The same log entry will get a step-by-step technical reply for a 'junior SOC analyst' persona and a high-level risk briefing for a 'CISO' persona — same model, same input, completely different output.

3 of 5

You ask the LLM for JSON output and feed it garbage input. What's the most dangerous thing the model is likely to do?

LLMs almost always return something, even for nonsensical input. The output may be valid JSON but contain fabricated data. Always validate field values, not just JSON structure, before passing model output to automation. Never trust raw LLM output blindly.

4 of 5

How does an LLM API maintain a 'multi-turn conversation' if every API call is stateless?

LLM APIs are stateless. To create the illusion of conversation, you append the user's new message to a list of all prior messages and send the whole list every turn. This is why long conversations get expensive — you're paying for the full history on every call.

5 of 5

Why is the conversation history itself an attack surface?

If the model sees the full conversation every turn and you only filter the latest user message, an attacker can inject malicious instructions into earlier turns (via tool outputs, RAG context, or compromised history) and bypass your filters. Treat the entire context window as untrusted input.

End-of-lesson Quiz

Quiz complete