Asked by m0ss
Question
Benchmark contamination in LLM evals: detecting leakage?
Our eval scores keep drifting. How do you detect when test data has leaked into the training corpora?