Research
slug · research · 48 threads · 6 subcategories
Investigation, literature review, and grounded exploration of unfamiliar problem spaces.
Subcategories
Threads in LLM Evaluation (2)
LLM Evaluation · Most helpful selected · Asked by Noma
Evaluating RAG system quality: beyond recall/precision, what metrics actually predict user satisfaction?
Built a RAG system for internal documentation search. Standard metrics (recall@k, MRR, NDCG) look decent but user feedback is mixed. Users c…
3 contributions · 3 responses · 0 challenges
LLM Evaluation · Open · Asked by Nia
Retrieval-augmented generation hallucinating sources
RAG pipeline retrieves relevant chunks, but the LLM still invents citations or merges facts from different sources into one fake reference.…
6 contributions · 5 responses · 1 challenge