Scaling Latent Reasoning via Looped Language Models

Published: January 3, 2026 at 04:34 PM EST
Source: Hacker News

Abstract

Modern LLMs are trained to “think” primarily via explicit text generation, such as chain‑of‑thought (CoT), which defers reasoning to post‑training and under‑leverages pre‑training data. We present and open‑source Ouro, named after the recursive Ouroboros, a family of pre‑trained Looped Language Models (LoopLM) that instead build reasoning into the pre‑training phase through

  1. iterative computation in latent space,
  2. an entropy‑regularized objective for learned depth allocation, and
  3. scaling to 7.7T tokens.
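The first two ingredients can be illustrated with a toy sketch. This is not the Ouro implementation: the function names (`looped_forward`, `exit_distribution`, `entropy_regularized_loss`), the per-step halting probabilities, and the weight `beta` are all assumptions made for illustration. The sketch reuses one shared step function for every loop iteration, converts per-step halting probabilities into a distribution over exit depths, and subtracts an entropy bonus from the expected loss so that the learned exit distribution does not collapse onto a single depth.

```python
import math


def looped_forward(state, shared_step, num_loops):
    """Apply the SAME step function num_loops times; return every intermediate state."""
    states = []
    for _ in range(num_loops):
        state = shared_step(state)  # weights are shared across iterations
        states.append(state)
    return states


def exit_distribution(halt_probs):
    """Convert per-step halting probabilities into a distribution over exit steps.

    The final step absorbs all remaining probability mass, so the result sums to 1
    (the last entry of halt_probs is therefore ignored).
    """
    dist, remaining = [], 1.0
    for lam in halt_probs[:-1]:
        dist.append(remaining * lam)
        remaining *= 1.0 - lam
    dist.append(remaining)
    return dist


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def entropy_regularized_loss(step_losses, halt_probs, beta):
    """Expected per-step loss under the exit distribution, minus beta * entropy.

    beta is a hypothetical regularization weight: larger values push the
    exit distribution toward spreading mass over more depths.
    """
    dist = exit_distribution(halt_probs)
    expected = sum(p * loss for p, loss in zip(dist, step_losses))
    return expected - beta * entropy(dist)


if __name__ == "__main__":
    # Toy "latent state": a scalar refined toward the fixed point 2.0.
    states = looped_forward(0.0, lambda x: 0.5 * x + 1.0, num_loops=4)
    step_losses = [(2.0 - s) ** 2 for s in states]
    halt_probs = [0.1, 0.2, 0.5, 1.0]
    print(states)
    print(exit_distribution(halt_probs))
    print(entropy_regularized_loss(step_losses, halt_probs, beta=0.05))
```

In this toy setting each extra loop moves the state closer to its target, so later exits have lower loss; the entropy term is what keeps some probability on earlier, cheaper exits instead of always running the maximum depth.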

The Ouro 1.4B and 2.6B models match the results of SOTA LLMs of up to 12B parameters across a wide range of benchmarks. Through controlled experiments, we show that this advantage stems not from increased knowledge capacity, but from superior knowledge manipulation capabilities. We also show that LoopLM yields reasoning traces more aligned with final outputs than explicit CoT.

We hope our results demonstrate the potential of LoopLM as a novel scaling direction in the reasoning era. Our model is available here: http://ouro-llm.github.io
