Generate Synthetic Q&A Pairs — RAG (Retrieval-Augmented Generation) Prep Advanced Task | Graduates Hub

The Scenario

Embedding raw handbook text often retrieves poorly because user queries ("How do I claim overtime?") don't look like formal handbook text ("Overtime remuneration is subject to clause 4..."). To fix this, you want an LLM to read each handbook chunk and generate 3 hypothetical questions a user might ask about it. You will then embed those questions, keeping each one linked to the chunk it came from.
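The flow described above can be sketched as a small indexing loop. This is a minimal sketch, not a reference solution: `build_question_index`, `IndexEntry`, the chunk id `"overtime-4"`, and the two `fake_*` functions are hypothetical names introduced for illustration. In a real pipeline the stand-ins would be replaced by calls to whichever LLM and embedding model you use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class IndexEntry:
    chunk_id: str        # which handbook chunk the question points back to
    question: str        # the synthetic, user-style question
    vector: List[float]  # embedding of the question, not of the chunk text


def build_question_index(
    chunks: Dict[str, str],
    generate_questions: Callable[[str, int], Sequence[str]],
    embed: Callable[[str], List[float]],
    n_questions: int = 3,
) -> List[IndexEntry]:
    """Create one index entry per synthetic question, linked to its source chunk."""
    entries: List[IndexEntry] = []
    for chunk_id, text in chunks.items():
        # Keep at most n_questions in case the generator returns extras.
        questions = list(generate_questions(text, n_questions))[:n_questions]
        for question in questions:
            entries.append(IndexEntry(chunk_id, question, embed(question)))
    return entries


# Stand-ins (assumptions): replace with a real LLM call and a real embedding model.
def fake_generate_questions(text: str, n: int) -> List[str]:
    return [f"Question {i + 1} about: {text[:20]}" for i in range(n)]


def fake_embed(text: str) -> List[float]:
    return [float(len(text)), float(text.count(" "))]


index = build_question_index(
    {"overtime-4": "Overtime remuneration is subject to clause 4..."},
    fake_generate_questions,
    fake_embed,
)
```

At query time, the user's question is embedded with the same model, the nearest question vectors are found, and the chunks they point to are returned.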

The Brief

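Write the "Synthetic Q&A" prompt pipeline. This is an advanced technique closely related to Hypothetical Document Embeddings (HyDE). HyDE generates a hypothetical answer from the user's query at search time; here you work in the opposite direction and generate hypothetical questions from each document chunk at indexing time.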

Deliverables

The Prompt: Takes a chunk of text and generates 3 diverse user queries that the text answers.
Instructions on Tone: The questions must sound like they come from a real, frustrated employee, not a robot.
A short explanation of why embedding user-style questions improves vector similarity search compared to embedding formal document text.
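One possible shape for the first two deliverables is a prompt template plus a strict parser for the model's reply. The wording below is an illustrative draft, not the rubric's expected answer, and `PROMPT_TEMPLATE`, `build_prompt`, and `parse_questions` are hypothetical names. Asking for a JSON array is an assumption made here so the reply can be validated before anything is embedded.

```python
import json
from typing import List

# Illustrative wording only; tune it against real model output.
PROMPT_TEMPLATE = """You are helping to index an employee handbook for search.

Read the excerpt between the <excerpt> tags and write {n} questions that this
excerpt answers.

Rules:
- Write as a real, frustrated employee typing into a search box, not as a robot.
- Use everyday words; do not copy the formal wording of the excerpt.
- Make the {n} questions clearly different from one another.
- Return only a JSON array of {n} strings, with no other text.

<excerpt>
{chunk}
</excerpt>
"""


def build_prompt(chunk: str, n: int = 3) -> str:
    """Fill the template with one handbook chunk."""
    return PROMPT_TEMPLATE.format(n=n, chunk=chunk.strip())


def parse_questions(raw: str, n: int = 3) -> List[str]:
    """Validate the model's reply: exactly n non-empty strings in a JSON array."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model reply was not valid JSON") from exc
    if not isinstance(data, list) or len(data) != n:
        raise ValueError(f"expected a JSON array of {n} questions")
    if not all(isinstance(q, str) and q.strip() for q in data):
        raise ValueError("every question must be a non-empty string")
    return [q.strip() for q in data]
```

Rejecting malformed replies up front keeps broken or partial questions out of the vector index; a caller could retry the generation when `parse_questions` raises `ValueError`.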

Submission Guidance

A user searches in terms of symptoms ("My app keeps crashing"). Documentation is written in terms of solutions ("Troubleshooting memory leaks"). Your synthetic questions bridge this vocabulary gap.
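The vocabulary gap can be illustrated with a toy similarity measure. The sketch below uses bag-of-words cosine similarity as a crude stand-in for a learned embedding model, which is an assumption made to keep the example self-contained. Real dense embeddings capture some meaning beyond shared words, so the gap is usually smaller in practice, but it points in the same direction. The synthetic question string is a made-up example.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = bag_of_words("My app keeps crashing")
doc_title = bag_of_words("Troubleshooting memory leaks")
# Hypothetical synthetic question generated for the same documentation page.
synthetic_question = bag_of_words("Why does my app keep crashing?")

sim_to_doc = cosine_similarity(query, doc_title)
sim_to_question = cosine_similarity(query, synthetic_question)
```

Here the query shares no words with the documentation title but shares several with the synthetic question, so the question is the closer match and leads the search back to the right page.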

Submit Your Work

Your submission is graded against the rubric on the right. If you pass, you get a public Badge URL you can share on LinkedIn. There is no draft save, so work offline first and paste your finished response here.