The Scenario
You want to switch from an expensive model to a cheaper, faster one for summarising articles. You have 100 summaries from the cheap model and no time to read them all, so you want to write a prompt that makes the expensive model act as an automated "Judge" and score the cheap model's work.
The Brief
Write the LLM-as-a-Judge prompt. It must take the Source Text and the Generated Summary, and output a score out of 5 based on strict criteria.
Deliverables
- The Judge Prompt
- The Grading Rubric embedded in the prompt (What defines a 1, 3, or 5?)
- Chain of Thought formatting (forcing the Judge to write its rationale *before* outputting the final score)
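As a rough illustration of how the three deliverables fit together, here is a minimal Python sketch that assembles a judge prompt from a source text and a summary. Everything in it is an assumption made for illustration: the `RUBRIC` wording, the `build_judge_prompt` name, the `<source>` and `<summary>` delimiters, and the `Rationale:` and `Score: N` labels are not prescribed by this brief, and a real submission would need its own, stricter criteria.

```python
# Illustrative rubric only: the wording for each score is an assumption,
# not the reference answer for this exercise.
RUBRIC = {
    5: "Every claim is supported by the source; no key point is missing; no padding.",
    3: "Main points are right, but there is a minor inaccuracy, a missed key point, or noticeable padding.",
    1: "Contains invented or contradicted facts, or misses the article's main point.",
}


def build_judge_prompt(source_text: str, summary: str) -> str:
    """Return a judge prompt that embeds the rubric and asks for the rationale first."""
    # List the rubric from the highest score to the lowest.
    rubric_lines = "\n".join(
        f"{score}: {desc}" for score, desc in sorted(RUBRIC.items(), reverse=True)
    )
    return (
        "You are grading a summary of an article.\n"
        "Judge only factual accuracy and conciseness. "
        "Do not reward or penalise length in itself.\n\n"
        f"Rubric (score out of 5):\n{rubric_lines}\n\n"
        f"<source>\n{source_text}\n</source>\n\n"
        f"<summary>\n{summary}\n</summary>\n\n"
        # The output format puts the reasoning before the score.
        "First write your reasoning under the heading 'Rationale:'.\n"
        "Then, on the final line, write 'Score: N' where N is an integer from 1 to 5."
    )


if __name__ == "__main__":
    print(build_judge_prompt("The council approved the budget on Monday.",
                             "The budget was approved."))
```

Putting the output format at the very end of the prompt is a design choice, not a requirement: it keeps the "rationale first, score last" instruction closest to where the judge starts writing.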
Submission Guidance
LLM judges have known biases; for example, they tend to prefer longer answers. Your prompt must explicitly tell the judge to ignore length and focus only on factual accuracy and conciseness. Requiring the rationale before the score is non-negotiable: the score should follow from the written reasoning rather than be justified after the fact.
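One way to check that a judge actually wrote its rationale before the score is to validate its output in code before trusting the number. The sketch below assumes a hypothetical output format in which the judge ends with a final line of the form `Score: N`; the `parse_judge_output` name and that format are illustrative assumptions, not part of this brief.

```python
import re

# Assumed format: the judge's final line is "Score: N" with N from 1 to 5.
SCORE_LINE = re.compile(r"^Score:\s*([1-5])\s*$")


def parse_judge_output(text: str) -> tuple[str, int]:
    """Split judge output into (rationale, score), rejecting malformed replies."""
    lines = text.strip().splitlines()
    if not lines:
        raise ValueError("empty judge output")
    # The score must be on the last line, so everything before it is rationale.
    match = SCORE_LINE.match(lines[-1].strip())
    if match is None:
        raise ValueError("final line is not 'Score: N' with N in 1..5")
    rationale = "\n".join(lines[:-1]).strip()
    if not rationale:
        raise ValueError("score given without a preceding rationale")
    return rationale, int(match.group(1))


if __name__ == "__main__":
    reply = "Rationale: The summary invents a date not in the source.\nScore: 2"
    rationale, score = parse_judge_output(reply)
    print(score)
    print(rationale)
```

Rejecting a reply with no rationale, rather than accepting the bare score, keeps the 100 results comparable: every score in the batch then comes with reasoning you can spot-check.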
Submit Your Work
Your submission is graded against the rubric on the right. If you pass, you get a public Badge URL you can share on LinkedIn. There is no draft save, so work offline first and paste your finished response here.