The Scenario
A company discovered that 5% of its revenue reports were wrong last quarter because duplicate rows, null values, and out-of-range values passed through its data pipeline undetected. The CTO wants automated quality gates.
The Brief
Design a comprehensive set of data quality checks for a sales data pipeline. The checks should run automatically after each data load and block bad data from reaching the reporting layer.
Deliverables
- A framework of check categories: completeness, accuracy, consistency, timeliness, and uniqueness
- At least 10 specific check rules with SQL or pseudo-code (e.g., "SELECT COUNT(*) FROM orders WHERE total < 0")
- A severity classification: which failures block the pipeline (critical) vs which generate warnings (non-critical)
- A reporting mechanism: how the data team is notified and what the quality dashboard looks like
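To make the check rules and severity classification concrete, here is a minimal sketch of one possible shape for them, not a reference solution. It assumes a hypothetical `orders(order_id, customer_id, total, order_date)` table in SQLite. The `Check` class, the rule names, and the severity assignments are all illustrative assumptions.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    category: str  # completeness, accuracy, consistency, timeliness, uniqueness
    severity: str  # "critical" blocks the load; "warning" only notifies
    sql: str       # must return a single number: the count of violating rows


# Hypothetical rules against an assumed orders table; adapt to the real schema.
CHECKS = [
    Check("null_customer", "completeness", "critical",
          "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL"),
    Check("negative_total", "accuracy", "critical",
          "SELECT COUNT(*) FROM orders WHERE total < 0"),
    Check("duplicate_order_id", "uniqueness", "critical",
          "SELECT COUNT(*) FROM (SELECT order_id FROM orders "
          "GROUP BY order_id HAVING COUNT(*) > 1)"),
    Check("future_order_date", "timeliness", "warning",
          "SELECT COUNT(*) FROM orders WHERE order_date > DATE('now')"),
]


def run_checks(conn, checks=CHECKS):
    """Run every check; return (name, severity, violations) for each failure."""
    failures = []
    for check in checks:
        violations = conn.execute(check.sql).fetchone()[0]
        if violations > 0:
            failures.append((check.name, check.severity, violations))
    return failures


def should_block(failures):
    """The load is blocked if at least one critical check failed."""
    return any(severity == "critical" for _, severity, _ in failures)
```

In a pipeline, `run_checks` would be called right after each load, and the reporting layer would be refreshed only when `should_block` returns `False`. Warning-level failures would still be sent to the data team.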
Submission Guidance
Quality checks that nobody looks at are worse than no checks, because they create a false sense of security. Design the alerting so that failures cannot be ignored.
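One way to make alerts hard to ignore is to keep escalating until someone acknowledges them. The sketch below illustrates that idea only. The recipient names in `ESCALATION` and the `Alert` class are hypothetical, and a real system would send notifications instead of just recording them.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    check_name: str
    severity: str                      # "critical" or "warning"
    acknowledged: bool = False
    notified: list = field(default_factory=list)  # recipients contacted so far


# Hypothetical escalation ladders: each unacknowledged round moves one step up.
ESCALATION = {
    "critical": ["on_call_pager", "team_channel", "engineering_manager"],
    "warning": ["team_channel"],
}


def escalate(alert):
    """Notify the next recipient on the ladder; return it, or None if done.

    Nothing is sent once the alert is acknowledged or the ladder is exhausted.
    """
    if alert.acknowledged:
        return None
    ladder = ESCALATION[alert.severity]
    if len(alert.notified) >= len(ladder):
        return None
    recipient = ladder[len(alert.notified)]
    alert.notified.append(recipient)  # a real system would send a message here
    return recipient
```

Calling `escalate` on a fixed schedule (for example, every 15 minutes, an assumed interval) means a critical failure keeps reaching new people until someone sets `acknowledged`.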