Design a Chunking Strategy — RAG (Retrieval-Augmented Generation) Prep Beginner Task | Graduates Hub

The Scenario

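You are building an AI HR bot using RAG. You have a 200-page PDF of the company handbook. If you embed whole pages, each embedding mixes several unrelated topics and retrieval becomes imprecise. If you embed single sentences, each chunk loses the surrounding context the AI needs to answer correctly.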

The Brief

Write a strategy document detailing exactly how you will chunk the handbook text before sending it to the embedding model.

Deliverables

The Chunk Size (e.g., 500 tokens) and Overlap Size (e.g., 50 tokens)
The Chunking Method (e.g., Fixed-size, Sentence-aware, or Header-based) and why you chose it
One example of a "bad" chunk (where context is lost) and how your strategy prevents it
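
The chunk size and overlap in the first deliverable can be made concrete with a sliding window. The following is a minimal sketch, not a reference implementation. It assumes whitespace-separated words stand in for tokens (a real embedding model uses its own tokenizer, so counts will differ), and `chunk_fixed` is a hypothetical helper name.

```python
def chunk_fixed(tokens, size=500, overlap=50):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk made only of overlap is produced.
        if start + size >= len(tokens):
            break
    return chunks


# Assumption: words approximate tokens; swap in the model's tokenizer for real counts.
words = ("word " * 1200).split()
chunks = chunk_fixed(words, size=500, overlap=50)
```

With these example values each window advances by 450 tokens, so the last 50 tokens of one chunk are repeated at the start of the next. That repetition is what keeps a sentence that straddles a boundary readable in at least one chunk.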

Submission Guidance

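For structured documents like handbooks, header-based chunking (splitting on section headers such as `### Leave Policy`, once the PDF text has been converted to Markdown) usually retrieves better than fixed character-count chunking, because each chunk stays within a single topic and carries its heading as context.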
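
Splitting on headers can be sketched in a few lines of Python. This is an illustration under stated assumptions: the handbook PDF has already been converted to Markdown, headers use the `#` syntax, and the sample text below is invented for the example. `chunk_by_headers` is a hypothetical helper name.

```python
import re

# Matches Markdown ATX headers such as "### Leave Policy".
HEADER = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_headers(markdown_text):
    """Return one chunk per header section, keeping the header title as context."""
    chunks = []
    title, body = None, []

    def flush():
        # Emit the current section unless it is an empty preamble.
        if title is not None or any(line.strip() for line in body):
            chunks.append({"title": title, "text": "\n".join(body).strip()})

    for line in markdown_text.splitlines():
        match = HEADER.match(line)
        if match:
            flush()
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return chunks


# Invented sample text, not taken from any real handbook.
handbook = """### Leave Policy
Employees receive 25 days of annual leave.

### Expenses
Submit receipts within 30 days.
"""
sections = chunk_by_headers(handbook)
```

Each chunk keeps its section title, so a rule such as "within 30 days" is never embedded without the heading that says what it applies to. Sections longer than the chosen chunk size would still need a second, size-based split.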

Submit Your Work

Your submission is graded against the rubric on the right. If you pass, you get a public Badge URL you can share on LinkedIn. There is no draft save, so work offline first and paste your finished response here.