AI & Prompting | Beginner | 1 to 2 hours

Create a Golden Test Set

Create 10 test cases to evaluate a new customer support prompt.

The Scenario

Your team wrote a new prompt to answer customer FAQs. Before deploying it, you need to show that it works. To do that you need a "Golden Dataset": a spreadsheet of varied inputs, each paired with the expected output, that the prompt is tested against.

The Brief

Design a 10-row Golden Dataset for testing a retail e-commerce bot.

Deliverables

  • 10 diverse test queries (Inputs)
  • The expected behaviour/answer for each (Expected Output)
  • Categorisation of the test cases (e.g., 3 standard questions, 3 edge cases, 2 adversarial attacks, 2 off-topic)
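
One way to keep the three deliverables together is to treat each row of the spreadsheet as a small record and check the category mix automatically. The sketch below is only an illustration under assumptions: the `GoldenCase` class, its field names, the category labels, and the `check_mix` and `to_csv` helpers are hypothetical names, and the target mix simply mirrors the example split given in the list above. The two sample rows are placeholders, not model answers.

```python
import csv
import io
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass
class GoldenCase:
    """One row of the Golden Dataset (hypothetical field names)."""
    case_id: int
    category: str  # "standard", "edge", "adversarial", or "off_topic"
    query: str     # the Input
    expected: str  # the Expected Output (behaviour or answer)


# Assumed target mix, copied from the example split in the Deliverables.
TARGET_MIX = {"standard": 3, "edge": 3, "adversarial": 2, "off_topic": 2}


def check_mix(cases):
    """Return a list of problems; an empty list means the mix matches."""
    problems = []
    counts = Counter(c.category for c in cases)
    for category, want in TARGET_MIX.items():
        got = counts.get(category, 0)
        if got != want:
            problems.append(f"{category}: expected {want}, found {got}")
    for category in counts:
        if category not in TARGET_MIX:
            problems.append(f"unknown category: {category}")
    if len({c.query for c in cases}) != len(cases):
        problems.append("duplicate queries")
    return problems


def to_csv(cases):
    """Serialise the cases as CSV text, ready to open in a spreadsheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["case_id", "category", "query", "expected"]
    )
    writer.writeheader()
    for case in cases:
        writer.writerow(asdict(case))
    return buf.getvalue()


# Two placeholder rows; a real submission needs all ten.
sample = [
    GoldenCase(1, "edge", "Where is it?",
               "Asks which order or item the customer means"),
    GoldenCase(2, "adversarial", "Can I buy a gun?",
               "Declines politely and redirects to supported topics"),
]

if __name__ == "__main__":
    for problem in check_mix(sample):
        print(problem)
    print(to_csv(sample))
```

Running `check_mix` on an incomplete set lists which categories are still short, which is a quick check before pasting the final table into the submission.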

Submission Guidance

A test set of 10 easy questions shows very little, because it never exercises the cases where the prompt is most likely to fail. You must include adversarial questions ("Can I buy a gun?"), ambiguous questions ("Where is it?"), and complex questions ("I bought this yesterday but the price dropped today").

Submit Your Work

Your submission is graded against the rubric on the right. If you pass, you get a public Badge URL you can share on LinkedIn. There is no draft save, so work offline first and paste your finished response here.

