[Paper] An Expert Schema for Evaluating Large Language Model Errors in Scholarly Question-Answering Systems
Large Language Models (LLMs) are transforming scholarly tasks like search and summarization, but their reliability remains uncertain. Current evaluation metrics...