AI & Prompting · Intermediate · 2 to 3 hours

Write an LLM-as-a-Judge Prompt

Use GPT-4 to grade the outputs of a smaller, cheaper model.

The Scenario

You are trying to switch from an expensive model to a cheaper, faster model for summarising articles. You have 100 summaries from the cheap model. You don't have time to read them all. You want to write a prompt for the expensive model to act as an automated "Judge" and score the cheap model's work.

The Brief

Write the LLM-as-a-Judge prompt. It must take the Source Text and the Generated Summary, and output a score out of 5 based on strict criteria.
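One way to picture the input side of this brief is a small template that slots the two texts into clearly delimited sections. This is only a sketch in Python: the names `JUDGE_TEMPLATE` and `build_judge_prompt`, and the `<source_text>` / `<generated_summary>` tags, are hypothetical choices for illustration and are not prescribed by the challenge.

```python
# Hypothetical template: the tag names and wording are illustrative assumptions.
JUDGE_TEMPLATE = """You are grading a summary against its source article.

<source_text>
{source_text}
</source_text>

<generated_summary>
{summary}
</generated_summary>

Score the summary from 1 to 5 using the criteria you were given."""


def build_judge_prompt(source_text: str, summary: str) -> str:
    """Fill the template with one (source, summary) pair."""
    source_text = source_text.strip()
    summary = summary.strip()
    if not source_text or not summary:
        raise ValueError("source_text and summary must both be non-empty")
    # str.format only interprets braces in the template itself,
    # so braces inside the article or summary are passed through unchanged.
    return JUDGE_TEMPLATE.format(source_text=source_text, summary=summary)
```

Delimiting the two inputs makes it harder for the judge to confuse article text with summary text, which matters when the summary quotes the source.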

Deliverables

  • The Judge Prompt
  • The Grading Rubric embedded in the prompt (What defines a 1, 3, or 5?)
  • Chain of Thought formatting (forcing the Judge to write its rationale *before* outputting the final score)
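The rubric and the rationale-first format from the list above could be sketched as follows. The rubric wording, the `Rationale:` / `Score:` labels, and the names `RUBRIC_AND_FORMAT` and `parse_judgement` are all assumptions made for illustration; your own rubric should reflect the criteria you actually care about.

```python
import re
from typing import Tuple

# Hypothetical rubric and output format; the wording is an illustrative assumption.
RUBRIC_AND_FORMAT = """Rubric:
5 = every claim is supported by the source, no key point is missing, no padding.
3 = the main point is right, but a key point is missing or one detail is unsupported.
1 = the summary contradicts the source or invents facts.

Respond in exactly this format:
Rationale: <your reasoning, citing the source>
Score: <integer from 1 to 5>"""

# A line of the form "Score: N" where N is a single digit from 1 to 5.
_SCORE_LINE = re.compile(r"^Score:[ \t]*([1-5])[ \t]*$", re.MULTILINE)


def parse_judgement(reply: str) -> Tuple[str, int]:
    """Return (rationale, score); reject replies that score before reasoning."""
    matches = list(_SCORE_LINE.finditer(reply))
    if not matches:
        raise ValueError("no 'Score: N' line with N between 1 and 5")
    last = matches[-1]  # use the final score line as the verdict
    before = reply[: last.start()]
    label = "Rationale:"
    start = before.find(label)
    rationale = before[start + len(label):].strip() if start != -1 else ""
    if not rationale:
        raise ValueError("a rationale must appear before the score")
    return rationale, int(last.group(1))
```

Rejecting replies where the score comes first gives you an automatic check that the judge followed the Chain of Thought format across all 100 summaries.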

Submission Guidance

LLM judges suffer from known biases; for example, they tend to rate longer answers higher regardless of quality. Your prompt must explicitly tell the judge to ignore length and to score only factual accuracy and conciseness. Requiring the rationale before the score is non-negotiable for accuracy: the judge must reason first and only then commit to a number.
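Once the judge has scored a batch, one rough way to check whether length is still leaking into the scores is to correlate summary length with score. The sketch below assumes Python and uses hypothetical helper names (`pearson`, `length_score_correlation`); word count is just one possible proxy for length.

```python
from math import sqrt
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def length_score_correlation(summaries: List[str], scores: List[int]) -> float:
    """Correlate word count (an assumed proxy for length) with judge scores."""
    lengths = [len(s.split()) for s in summaries]
    return pearson(lengths, scores)
```

A strongly positive value is a prompt to spot-check by hand, not proof of bias, since longer summaries can be genuinely more complete.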

Submit Your Work

Your submission is graded against the rubric on the right. If you pass, you get a public Badge URL you can share on LinkedIn. There is no draft save, so work offline first and paste your finished response here.
