Introduction to RAG

Published: (April 27, 2026 at 03:45 PM EDT)
2 min read
Source: Dev.to

What is a Model?

A model is essentially an equation whose unknown values are learned from data.

Example

y = mx + c

During training, pairs of x and y values are provided. The model learns the values of m and c that best fit the data. The learned values of m and c differ from one dataset and use case to another.
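As a minimal sketch of what "learning m and c" can look like, the snippet below fits a line to a few points using the closed-form least-squares formulas. The data and the function name `fit_line` are made up for illustration, and larger models are usually trained iteratively rather than with a closed-form solution.

```python
def fit_line(xs, ys):
    """Return (m, c) that minimize squared error for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, c = fit_line(xs, ys)
print(m, c)  # prints: 2.0 1.0
```

Feeding in different x and y values would produce different m and c, which is the sense in which these values are learned rather than fixed.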

What is a Parameter?

A parameter is a variable that is learned during training.

  • m is a parameter
  • c is a parameter

More parameters generally allow the model to learn more complex patterns, at the cost of more training data and compute.

What is Temperature?

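Temperature controls how much randomness the model uses when choosing its output, which is often described as its creativity. It is commonly set between 0 and 1, although some APIs accept higher values.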

  • Lower temperature → more predictable, focused answers
  • Higher temperature → more varied, imaginative answers

Temperature is passed along with the prompt input and is typically set around 0.5 for balanced output.
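One common way temperature is applied is to divide the model's raw scores by the temperature before turning them into probabilities. The sketch below assumes that approach; the scores and the function name `softmax_with_temperature` are hypothetical, and real APIs may handle edge cases such as a temperature of 0 differently.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumption for this sketch: only positive temperatures are supported.
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate words
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.0)
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the lower temperature almost all of the probability goes to the top-scoring word, so the output is more predictable. At the higher temperature the probabilities are spread more evenly, so less likely words are picked more often.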

SLM

SLM stands for Small Language Model.

  • Typically has fewer parameters than an LLM, often a few billion or fewer.
  • Trained for a particular domain or specific tasks.
  • Training cost can still be high, similar to LLMs, depending on the use case.

Example: smallest ai, which provides smaller voice‑based AI models.

LLM

LLM stands for Large Language Model.

  • Usually has billions of parameters and contains knowledge from many domains.
  • Considered a generalized model.

Example: gpt‑oss‑120b.

How an LLM Works

The primary function of an LLM is to predict the next word (more precisely, the next token, which may be a word or part of a word). It generates text by predicting one word after another, each time based on the words that came before.
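The idea of generating text one word at a time can be illustrated with a toy model that only counts which word tends to follow which. This is a deliberately simplified sketch and not how an LLM is built internally; the corpus and the function names `train_bigrams` and `generate` are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count which word follows each word in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, max_words=5):
    """Append the most frequent next word, one word at a time."""
    output = [start]
    for _ in range(max_words):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # no known continuation for the last word
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)


corpus = "the cat sat on the mat and the cat sat down"
model = train_bigrams(corpus)
print(generate(model, "the", max_words=2))  # prints: the cat sat
```

The toy model can only continue with words it saw during training, which hints at why a model with little data on a topic tends to produce poor output for it.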

Sometimes LLMs generate incorrect information confidently; this phenomenon is called hallucination.

Example: If the model knows a lot about cats and dogs but has limited knowledge about lions, a question about lions may produce irrelevant or incorrect content that still sounds confident.

Hallucination can be reduced by writing clear, specific prompts and providing relevant, accurate context.

What is RAG?

RAG stands for Retrieval‑Augmented Generation.
It is a method for giving an LLM access to private or external knowledge, such as:

  • Company policies
  • HR policy documents
  • Internal business documents

The relevant parts of this information are retrieved and supplied to the LLM along with the question, so it can generate human‑readable answers based on that content.
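A minimal sketch of the "supply it to the LLM" step is shown below: retrieved text is placed into the prompt next to the user's question. The prompt wording, the sample policy text, and the function name `build_rag_prompt` are assumptions for illustration, and the call to an actual LLM is left out.

```python
def build_rag_prompt(question, retrieved_chunks):
    """Combine retrieved text and the user question into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer the question using only the context below.\n"
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical chunks, standing in for the results of a retrieval step.
chunks = [
    "Employees accrue 1.5 days of paid leave per month.",
    "Leave requests must be submitted through the HR portal.",
]
prompt = build_rag_prompt("How much paid leave do I earn each month?", chunks)
print(prompt)
```

Because the answer is expected to come from the supplied context rather than from the model's memory, this is also one way of providing the correct context that helps reduce hallucination.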

Where is Private Data Stored?

Private data is usually stored in a vector database.

How Documents are Stored

Documents are split into smaller parts called chunks.
Each chunk is converted into a numerical vector (an embedding) and stored in the vector database.
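The chunk-then-vectorize flow can be sketched as follows. The word-count vector used here is only a toy stand-in for a real embedding model, a plain Python list stands in for the vector database, and the chunk size, overlap, and function names are assumptions for illustration.

```python
def split_into_chunks(text, chunk_size=8, overlap=2):
    """Split text into word-based chunks that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


def embed(chunk, vocabulary):
    """Toy embedding: count how often each vocabulary word appears."""
    words = chunk.lower().split()
    return [words.count(term) for term in vocabulary]


document = (
    "Employees accrue paid leave every month. "
    "Leave requests go through the HR portal. "
    "Unused leave carries over to the next year."
)
chunks = split_into_chunks(document)
vocabulary = sorted(set(document.lower().split()))
# A plain list stands in for the vector database here.
vector_store = [(chunk, embed(chunk, vocabulary)) for chunk in chunks]
print(len(vector_store), len(vocabulary))
```

The small overlap between neighbouring chunks is a design choice that keeps a sentence cut at a chunk boundary partly visible in both chunks.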

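To find relevant chunks quickly, nearest‑neighbor search algorithms are commonly used:

  • KNN (K‑Nearest Neighbors): compares the query vector with every stored vector and returns the k closest ones
  • ANN (Approximate Nearest Neighbors): trades a little accuracy for much faster search on large collections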
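As a rough sketch of the exact KNN case, the snippet below scores every stored vector against a query vector using cosine similarity and keeps the top k. The vectors, labels, and function names are hypothetical. An ANN index would avoid comparing against every vector, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def k_nearest(query, stored, k=2):
    """Exact KNN: score every stored vector and return the top-k labels."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [label for label, _ in ranked[:k]]


# Hypothetical 3-dimensional vectors; real embeddings have many more dimensions.
stored = [
    ("leave policy", [0.9, 0.1, 0.0]),
    ("expense policy", [0.1, 0.9, 0.0]),
    ("office map", [0.0, 0.1, 0.9]),
]
print(k_nearest([1.0, 0.2, 0.0], stored, k=2))
# prints: ['leave policy', 'expense policy']
```

Scoring every vector is simple and exact, but the cost grows with the number of stored chunks, which is why approximate methods are attractive for large collections.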
