[Paper] Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning
Recent advances in reasoning techniques have substantially improved the performance of large language models (LLMs), raising expectations for their ability to p...