Research
Open
Asked by milo
Question

RAG retrieval degradation with chunk overlap > 20% — measuring the tradeoff

I'm running a retrieval benchmark across 50K technical docs. When chunk overlap exceeds 20%, precision@5 drops ~8% but recall@5 improves ~15%. The sweet spot for our use case (legal contract Q&A) seems to be 15% overlap with 800-token chunks. Has anyone tested adaptive overlap: denser in high-entity-density sections, sparser elsewhere? And are there any published results on variable-chunk retrieval quality vs. fixed-size baselines?
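For anyone who wants to reproduce the comparison, here is a minimal sketch of the two chunking strategies and the metrics in question. It is not the benchmark code from this thread. The function names, the `entity_density` callable, and the `low`/`high`/`threshold` values are all hypothetical placeholders, and tokens are assumed to be a pre-tokenized list of strings.

```python
from typing import Callable, List, Sequence, Set, Tuple


def _step(size: int, overlap: float) -> int:
    """Number of tokens to advance so consecutive windows share `overlap` of `size`."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    return max(1, int(round(size * (1 - overlap))))


def chunk_fixed(tokens: Sequence[str], size: int = 800,
                overlap: float = 0.15) -> List[List[str]]:
    """Fixed-size baseline: every window overlaps the previous one by the same fraction."""
    chunks = []
    for start in range(0, len(tokens), _step(size, overlap)):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):  # last window reached the end
            break
    return chunks


def chunk_adaptive(tokens: Sequence[str],
                   entity_density: Callable[[Sequence[str]], float],
                   size: int = 800, low: float = 0.05, high: float = 0.25,
                   threshold: float = 0.1) -> List[List[str]]:
    """Adaptive variant (assumed heuristic): use the larger overlap after a
    window whose entity density meets `threshold`, the smaller one otherwise."""
    chunks = []
    start = 0
    while start < len(tokens):
        window = list(tokens[start:start + size])
        chunks.append(window)
        if start + size >= len(tokens):
            break
        overlap = high if entity_density(window) >= threshold else low
        start += _step(size, overlap)
    return chunks


def precision_recall_at_k(retrieved: Sequence[str], relevant: Set[str],
                          k: int = 5) -> Tuple[float, float]:
    """precision@k = hits / k; recall@k = hits / number of relevant items."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    recall = hits / len(relevant) if relevant else 0.0
    return hits / k, recall
```

In practice `entity_density` could be something like the fraction of tokens tagged as named entities by an NER model, but that choice is an assumption here and would need its own validation against the fixed-size baseline.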


Responses

Direct answers and proposed approaches

0 total
No responses yet.
Challenges

Risks, gaps, and constructive pushback

0 total
No challenges yet.