[Paper] Enhancing Retrieval-Augmented Generation with Entity Linking for Educational Platforms

Published: December 5, 2025 at 01:59 PM EST
Source: arXiv - 2512.05967v1

Overview

The paper proposes a Retrieval‑Augmented Generation (RAG) pipeline for AI‑driven tutoring platforms, with a focus on those delivering content in Italian. By adding entity linking (mapping mentions to Wikidata IDs) to the retrieval step, the authors report improved factual accuracy on domain‑specific educational questions.

Key Contributions

  • Entity‑aware retrieval: Introduces a Wikidata‑based entity linking module that supplies a factual signal alongside traditional semantic similarity.
  • Hybrid re‑ranking strategies: Implements three ways to merge the semantic and entity signals:
    1. Weighted hybrid score,
    2. Reciprocal Rank Fusion (RRF),
    3. Cross‑encoder re‑ranker.
  • Domain‑focused evaluation: Tests on a custom Italian academic QA set and the public SQuAD‑it benchmark, revealing how domain mismatch influences performance.
  • Empirical insight: Shows that RRF‑based hybrid ranking outperforms the baseline RAG on the specialized educational dataset, while the cross‑encoder performs best on the general‑domain set.
  • Practical roadmap: Highlights the importance of domain adaptation and entity‑aware retrieval for building reliable AI tutors.

Methodology

  1. Baseline RAG: A standard pipeline that encodes the user query, retrieves the top‑k passages from an indexed knowledge base using dense semantic similarity (e.g., DPR or SBERT), and feeds the retrieved text to a large language model (LLM) for answer generation.
  2. Entity Linking Layer:
    • The query is passed through an off‑the‑shelf Italian entity linker that maps surface forms to Wikidata Q‑IDs.
    • The same linking is performed on every candidate passage in the retrieval corpus, producing a set of entity IDs per passage.
  3. Hybrid Scoring & Re‑ranking:
    • Weighted hybrid: Combines the semantic similarity score and an entity overlap score via a linear weight (tuned on validation data).

    • Reciprocal Rank Fusion (RRF): Treats the semantic rank list and the entity‑overlap rank list as independent, then merges them using the RRF formula:

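      \[ \text{score}(d) = \sum_{r \in R} \frac{1}{k + \text{rank}_r(d)} \]

      where R is the set of rank lists being fused (here the semantic list and the entity‑overlap list), \text{rank}_r(d) is the position of passage d in list r, and k is a smoothing constant (unrelated to the top‑k retrieval cutoff).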

    • Cross‑encoder re‑ranker: A BERT‑style model that jointly encodes query + passage + entity IDs and outputs a relevance score; fine‑tuned on the QA datasets.

  4. Answer Generation: The top‑N re‑ranked passages are concatenated and supplied to an LLM (e.g., GPT‑3.5‑turbo) that generates the final answer, optionally with a “grounding” prompt encouraging citation of retrieved facts.
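
The two lighter re‑ranking strategies in step 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The summary does not say how entity overlap is computed, so Jaccard overlap is used here as an assumption. The weight `alpha = 0.7` and the constant `k = 60` are likewise illustrative defaults rather than values reported in the paper, and the passage IDs and Q‑IDs are placeholders.

```python
from typing import Dict, List, Set


def entity_overlap(query_entities: Set[str], passage_entities: Set[str]) -> float:
    """Jaccard overlap between two sets of entity IDs (assumed overlap measure)."""
    if not query_entities or not passage_entities:
        return 0.0
    shared = query_entities & passage_entities
    return len(shared) / len(query_entities | passage_entities)


def weighted_hybrid(semantic: float, overlap: float, alpha: float = 0.7) -> float:
    """Linear mix of both signals; alpha is a hypothetical weight to tune on validation data."""
    return alpha * semantic + (1.0 - alpha) * overlap


def rank_positions(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each passage ID to its 1-based rank, best score first."""
    ordered = sorted(scores, key=lambda pid: scores[pid], reverse=True)
    return {pid: position for position, pid in enumerate(ordered, start=1)}


def reciprocal_rank_fusion(rank_lists: List[Dict[str, int]], k: int = 60) -> Dict[str, float]:
    """Sum 1 / (k + rank) over every rank list in which a passage appears."""
    fused: Dict[str, float] = {}
    for ranks in rank_lists:
        for pid, rank in ranks.items():
            fused[pid] = fused.get(pid, 0.0) + 1.0 / (k + rank)
    return fused


# Toy data: placeholder Q-IDs, not real Wikidata lookups.
query_entities = {"Q900001", "Q900002"}
semantic_scores = {"p1": 0.82, "p2": 0.80, "p3": 0.40}
passage_entities = {
    "p1": {"Q900099"},
    "p2": {"Q900001", "Q900002"},
    "p3": {"Q900001"},
}

overlap_scores = {
    pid: entity_overlap(query_entities, ents) for pid, ents in passage_entities.items()
}
hybrid_scores = {
    pid: weighted_hybrid(semantic_scores[pid], overlap_scores[pid]) for pid in semantic_scores
}
fused_scores = reciprocal_rank_fusion(
    [rank_positions(semantic_scores), rank_positions(overlap_scores)]
)

hybrid_order = sorted(hybrid_scores, key=hybrid_scores.get, reverse=True)
rrf_order = sorted(fused_scores, key=fused_scores.get, reverse=True)
print("weighted hybrid:", hybrid_order)
print("RRF:", rrf_order)
```

In this toy setup, passage `p2` is second by semantic score alone but moves to the top under both strategies because it shares every query entity. RRF uses only rank positions, so it needs no score normalization between the two signals, whereas the weighted hybrid assumes both scores are on comparable scales.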

Results & Findings

Each cell reports BLEU / F1.

Dataset                      Baseline RAG   Hybrid‑Weighted   RRF (Hybrid)   Cross‑Encoder
Custom Italian Academic QA   62.3 / 58.7    64.1 / 60.2       68.5 / 64.9    66.2 / 62.8
SQuAD‑it (general)           71.4 / 68.9    72.0 / 69.5       71.8 / 69.1    74.3 / 71.6

  • Reciprocal Rank Fusion yields the biggest boost on the domain‑specific academic set, confirming that entity overlap compensates for semantic drift in specialized vocabularies.
  • Cross‑encoder excels on the broader SQuAD‑it benchmark, where richer contextual modeling outweighs the simpler entity signal.
  • The experiments expose a domain mismatch effect: a model tuned for generic text may underperform on niche educational material unless it receives entity‑level grounding.
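
To make the size of these effects concrete, the short script below computes each strategy's gain over the baseline from the figures in the results table. It assumes every cell is a BLEU / F1 pair; the function name `gains_over_baseline` is just an illustrative helper.

```python
from typing import Dict, Tuple

Scores = Dict[str, Tuple[float, float]]  # strategy -> (BLEU, F1)

# Figures copied from the results table.
results: Dict[str, Scores] = {
    "Custom Italian Academic QA": {
        "Baseline RAG": (62.3, 58.7),
        "Hybrid-Weighted": (64.1, 60.2),
        "RRF (Hybrid)": (68.5, 64.9),
        "Cross-Encoder": (66.2, 62.8),
    },
    "SQuAD-it (general)": {
        "Baseline RAG": (71.4, 68.9),
        "Hybrid-Weighted": (72.0, 69.5),
        "RRF (Hybrid)": (71.8, 69.1),
        "Cross-Encoder": (74.3, 71.6),
    },
}


def gains_over_baseline(scores: Scores) -> Scores:
    """Absolute (BLEU, F1) improvement of each strategy over Baseline RAG."""
    base_bleu, base_f1 = scores["Baseline RAG"]
    return {
        name: (round(bleu - base_bleu, 1), round(f1 - base_f1, 1))
        for name, (bleu, f1) in scores.items()
        if name != "Baseline RAG"
    }


for dataset, scores in results.items():
    gains = gains_over_baseline(scores)
    best = max(gains, key=lambda name: gains[name][1])  # pick the largest F1 gain
    print(f"{dataset}: best = {best}, gain (BLEU, F1) = {gains[best]}")
```

On the academic set RRF gains 6.2 points on both metrics, while on SQuAD‑it the cross‑encoder gains 2.9 BLEU and 2.7 F1 and RRF gains only 0.4 BLEU and 0.2 F1.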

Practical Implications

  • More reliable AI tutors: Favoring retrieved passages that contain the entities referenced in a student’s question helps reduce hallucinations and keeps explanations grounded in the source material.
  • Plug‑and‑play component: The entity linking module can be pointed at a different knowledge graph (e.g., DBpedia, ConceptNet), making the approach adaptable to other curricula or languages.
  • Scalable hybrid ranking: RRF is computationally cheap (no extra neural inference) and can be added on top of existing vector‑search pipelines, offering an immediate accuracy bump for production systems.
  • Domain‑aware fine‑tuning: Developers building educational chatbots should consider a two‑stage retrieval (semantic first, entity‑aware second) to handle terminology‑heavy subjects such as medicine, law, or engineering.
  • Auditability: Because entity IDs are explicit, developers can trace which knowledge‑graph entries contributed to an answer, simplifying compliance with educational standards and data‑privacy regulations.
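
The two‑stage retrieval and auditability points can be combined in a small sketch: a semantic shortlist is reordered by shared entities, and each returned hit records which entity IDs matched. Everything here is hypothetical scaffolding (the `Candidate` and `AuditedHit` types, the `two_stage_retrieve` function, its cutoffs, and the placeholder Q‑IDs); the paper's actual pipeline may differ.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Candidate:
    passage_id: str
    semantic_score: float
    entities: Set[str]  # entity IDs linked in the passage


@dataclass
class AuditedHit:
    passage_id: str
    semantic_score: float
    matched_entities: List[str]  # audit trail: IDs shared with the query


def two_stage_retrieve(
    query_entities: Set[str],
    candidates: List[Candidate],
    first_stage_k: int = 3,
    final_n: int = 2,
) -> List[AuditedHit]:
    # Stage 1: keep the top first_stage_k candidates by semantic score.
    shortlist = sorted(candidates, key=lambda c: c.semantic_score, reverse=True)
    shortlist = shortlist[:first_stage_k]

    # Stage 2: reorder by number of shared entities, semantic score as tie-break.
    hits = [
        AuditedHit(c.passage_id, c.semantic_score, sorted(query_entities & c.entities))
        for c in shortlist
    ]
    hits.sort(key=lambda h: (len(h.matched_entities), h.semantic_score), reverse=True)
    return hits[:final_n]


# Toy data: placeholder Q-IDs, not real Wikidata lookups.
query_entities = {"Q900001", "Q900002"}
candidates = [
    Candidate("a", 0.90, {"Q900099"}),
    Candidate("b", 0.85, {"Q900001", "Q900002"}),
    Candidate("c", 0.80, {"Q900001"}),
    Candidate("d", 0.30, {"Q900001", "Q900002"}),
]

hits = two_stage_retrieve(query_entities, candidates)
for hit in hits:
    print(hit.passage_id, hit.matched_entities)
```

Passage `d` matches both query entities but never reaches the second stage because its semantic score falls outside the shortlist. This shows one trade‑off of a strict two‑stage design: the first‑stage cutoff bounds what entity‑aware re‑ranking can recover.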

Limitations & Future Work

  • Entity linker quality: The current Italian linker struggles with ambiguous or misspelled terms, which can propagate errors into the re‑ranking stage.
  • Knowledge‑graph coverage: Wikidata’s Italian coverage is uneven; niche academic concepts may lack entries, limiting the entity signal’s usefulness.
  • Scalability of cross‑encoder: While accurate, the cross‑encoder re‑ranker adds latency that may be prohibitive for real‑time tutoring scenarios.
  • Future directions: The authors suggest (1) training a domain‑specific entity linker, (2) enriching the knowledge graph with curriculum‑aligned entities, and (3) exploring lightweight neural re‑rankers that retain speed while leveraging entity information.

Authors

  • Francesco Granata
  • Francesco Poggi
  • Misael Mongiovì

Paper Information

  • arXiv ID: 2512.05967v1
  • Categories: cs.IR, cs.AI, cs.CL, cs.LG
  • Published: December 5, 2025