End-of-lesson Quiz
5 questions · Retrieval-Augmented Generation
Question 1 of 5
What is the fundamental advantage of a vector database over a SQL LIKE query?
A SQL LIKE '%ransomware%' query finds only the literal word. A vector database encodes meaning as vectors and finds documents about 'encryption malware demanding bitcoin' even though the word 'ransomware' never appears. Semantic similarity is the entire point.
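A minimal sketch of that ranking step, using invented 3-dimensional vectors in place of real embeddings (production models emit hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" -- the numbers are made up for illustration.
docs = {
    "encryption malware demanding bitcoin": [0.85, 0.75, 0.20],
    "quarterly sales figures":              [0.10, 0.05, 0.90],
}
query_vec = [0.88, 0.79, 0.15]  # pretend embedding of the query "ransomware"

best = max(docs, key=lambda d: cosine_similarity(query_vec, docs[d]))
print(best)  # the malware doc wins despite never containing "ransomware"
```

A LIKE query on the same corpus would return nothing, since the literal string is absent from both documents.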
Question 2 of 5
Why must documents be chunked before embedding?
Embedding models collapse an entire input into one vector. If you embed a 10-page document, the vector averages everything — too vague to match specific questions. Chunking produces focused vectors that match specific queries.
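A naive chunker, sketched for illustration; real splitters also respect sentence and paragraph boundaries:

```python
def chunk(text, size=40):
    # Fixed-size character chunking (no overlap). Each piece embeds to a
    # focused vector instead of one blurry document-level average.
    return [text[i:i + size] for i in range(0, len(text), size)]

report = "The attacker gained access via phishing. " * 5
chunks = chunk(report)
```

Each chunk is embedded and indexed separately, so a specific question can match the one passage that answers it.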
Question 3 of 5
You chunk a document with overlap=0. A question about content right at a chunk boundary fails to retrieve a useful answer. What went wrong?
Without overlap, a sentence spanning a chunk boundary appears in neither chunk completely. Overlap duplicates boundary content so that both chunks contain the full sentence. For security documents where mitigation steps often follow vulnerability descriptions, this is critical.
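The overlap fix can be sketched as follows (the CVE text is invented for the example):

```python
def chunk_with_overlap(text, size=60, overlap=15):
    # Each chunk starts `size - overlap` characters after the previous one,
    # so the last `overlap` characters of a chunk reappear at the start of
    # the next. A sentence cut by one boundary survives whole in a neighbor.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

doc = ("CVE-2024-0001 affects the VPN gateway. "
       "Mitigation: disable the legacy endpoint and rotate credentials.")
chunks = chunk_with_overlap(doc)
```

With overlap=0 the mitigation sentence could be split cleanly in two; here the duplicated boundary region keeps vulnerability and mitigation text together in at least one chunk.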
Question 4 of 5
What does the instruction 'answer based ONLY on the provided context' do in a RAG prompt?
Without this instruction, the model freely mixes its general knowledge with your retrieved chunks. The answer might sound correct but contain fabricated details. In a security context, hallucinated remediation advice during an active incident could cause real damage. Grounding is a safety control.
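A hypothetical prompt builder showing where that grounding instruction sits; the exact wording varies by system:

```python
def build_grounded_prompt(question, chunks):
    # Assemble retrieved chunks into a context block, then pin the model
    # to that context. The refusal clause matters: without an allowed
    # "I don't know", the model is pushed toward guessing.
    context = "\n\n".join(chunks)
    return (
        "Answer based ONLY on the provided context. If the context does not "
        "contain the answer, say \"I don't know\" instead of guessing.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt(
    "How do we isolate an infected host?",
    ["Step 3 of the IR runbook: disconnect the host from the network."],
)
```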
Question 5 of 5
Compared to a pure LLM call, what is the key attribution advantage of RAG?
A pure LLM gives you an answer with no source trail. RAG returns the retrieved chunks alongside the answer, so you can verify every claim against the original document. This is essential when the answer is 'your incident response procedure says to do X' — you need to know it actually says that.
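One way to carry that source trail through the pipeline; the dictionary shape and the `ir-procedure.md` name are illustrative, not a specific framework's API:

```python
def package_answer(llm_answer, retrieved):
    # `retrieved` is a list of (chunk_text, source_id) pairs that were fed
    # to the model. Returning them alongside the answer lets a reviewer
    # check every claim against the original document.
    return {
        "answer": llm_answer,
        "sources": [
            {"doc": source_id, "excerpt": text[:80]}
            for text, source_id in retrieved
        ],
    }

result = package_answer(
    "Isolate the host, then notify the IR lead.",
    [("Containment: isolate the host from the network.", "ir-procedure.md")],
)
```

A pure LLM call returns only the first field; the second is what makes the answer auditable.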