Chunking Strategies
Before you can search your documents, you need to split them into chunks. The size and overlap of those chunks dramatically affect retrieval quality.
Why chunk at all? Embedding models have token limits (typically 512-8192 tokens). A token is roughly 3/4 of a word -- so 100 tokens is about 75 words. A 50-page document won't fit into one embedding. We split it into smaller pieces, embed each piece separately, then search across all chunks. The art is choosing the right chunk size and overlap.
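The 3/4-words-per-token figure is only a heuristic, but it is handy for sizing chunks before you have a real tokenizer in the loop. A minimal sketch (the function name and the 4/3 ratio are assumptions drawn from the rule of thumb above, not from any tokenizer's actual behavior):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~3/4-word-per-token heuristic,
    i.e. about 4/3 tokens per word. Real counts vary by tokenizer."""
    words = len(text.split())
    return round(words * 4 / 3)

# A 75-word passage lands right around the 100-token mark.
seventy_five_words = " ".join(["word"] * 75)
print(estimate_tokens(seventy_five_words))
```

For production sizing, count tokens with the actual tokenizer of your embedding model rather than a word-count heuristic.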
Rules of thumb: Start with 200-500 token chunks and 10-20% overlap. Good overlap means repeating the last 10-20% of each chunk at the beginning of the next one -- for example, a 100-word chunk with 15-word overlap. This ensures sentences that fall on the boundary between two chunks are not lost. For technical docs, use larger chunks. For Q&A, use smaller chunks. Always test with real queries -- the "best" chunk size depends on your data and questions.
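The sliding-window idea above can be sketched in a few lines. This is a word-based chunker (a real system would usually count tokens instead); the function name and defaults mirror the 100-word/15-word example and are illustrative, not a library API:

```python
def chunk_words(text: str, chunk_size: int = 100, overlap: int = 15) -> list[str]:
    """Split text into word-based chunks, repeating the last `overlap`
    words of each chunk at the start of the next so that sentences
    falling on a boundary are not lost."""
    words = text.split()
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already covered the tail of the text
    return chunks
```

With the defaults, each chunk's final 15 words reappear as the next chunk's first 15 words, which is the 15% overlap the rule of thumb suggests.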
Chunk Size Tradeoffs
Small Chunks (50-200 words)
More precise retrieval. Better for specific factual questions. Faster embedding. But may lose context needed to understand the passage.
Large Chunks (200-500 words)
More context preserved. Better for complex questions requiring reasoning. But may include irrelevant info that confuses the LLM.
Too Small (<50 words)
Chunks become meaningless fragments. "The cat sat on" tells the LLM nothing useful. Retrieval becomes noise.
Too Large (>500 words)
Dilutes relevance. A chunk about 10 topics matches everything poorly. Also wastes LLM context window tokens.
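The relevance-dilution effect is easy to demonstrate with a crude word-overlap score. This toy uses Jaccard similarity over word sets as a stand-in for real embedding similarity (all strings and the scoring choice are illustrative assumptions): padding a focused chunk with unrelated topics lowers its match score against the same query.

```python
def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity -- a crude stand-in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

query = "how do cats sleep"
focused = "cats sleep up to sixteen hours a day"
# Same relevant sentence, buried among unrelated topics:
diluted = focused + " dogs bark loudly parrots mimic speech fish swim in schools"

print(jaccard(query, focused), jaccard(query, diluted))
```

The diluted chunk contains exactly the same answer, yet scores lower because the extra off-topic words grow the denominator. Real embedding models show the same qualitative behavior: a multi-topic chunk sits farther from any single-topic query.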
Chunking Pipeline
1. Load raw documents (PDFs, web pages, text files)
2. Split documents into chunks using the chosen strategy and size
3. Embed each chunk using an embedding model
4. Store chunk vectors and metadata in the vector database
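The pipeline can be sketched end to end with toy stand-ins at each step. Everything here is illustrative: the loader returns hard-coded strings, the embedder is a letter-frequency vector rather than a real model (such as one from sentence-transformers), and the "vector database" is a plain list. Only the load, split, embed, store order is taken from the pipeline above.

```python
def load_documents() -> list[str]:
    # Stand-in for loading PDFs, web pages, or text files.
    return ["cats sleep sixteen hours a day " * 20,
            "embedding models have token limits " * 20]

def split(doc: str, size: int = 50) -> list[str]:
    # Naive fixed-size word chunking (no overlap, for brevity).
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(chunk: str) -> list[float]:
    # Toy embedding: 26-dim letter-frequency vector, stand-in for a real model.
    vec = [0.0] * 26
    for ch in chunk.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

vector_db = []  # stand-in for a real vector database
for doc_id, doc in enumerate(load_documents()):
    for chunk in split(doc):
        vector_db.append({"doc_id": doc_id, "text": chunk, "vector": embed(chunk)})

print(len(vector_db), "chunks stored")
```

Storing the source `doc_id` (and, in practice, page numbers or section titles) alongside each vector is what lets you show citations with retrieved results later.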