Distilling Knowledge into Tiny LLMs

Published: January 15, 2026 at 12:35 PM EST
3 min read
Source: Dev.to

Large Language Models (LLMs) are the technology behind most of today's AI applications. These massive billion- and trillion-parameter models generalize well when trained on enough data.

The problem is that these models are expensive and hard to run yourself, so many developers call them through hosted APIs such as OpenAI or Claude. In practice, developers also spend a lot of time crafting complex prompt logic to cover edge cases, assuming they need a huge model to handle all the rules.

If you truly want control over your business processes, running a local model is a better choice. The good news is that it doesn’t have to be a multi‑billion‑parameter beast. By fine‑tuning a smaller LLM you can handle specific business logic, reduce prompt complexity, and keep everything in‑house.

This article shows how to distill knowledge into tiny LLMs.

Install dependencies

Install txtai and the required libraries:

pip install "txtai[pipeline-train]" datasets
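
To confirm the optional training dependencies are available, a quick import check is enough. A minimal sanity check; both imports are used later in this article:

# Sanity check: these imports fail if the pipeline-train extra or datasets is missing
from txtai.pipeline import HFTrainer
from datasets import load_dataset

print("Training dependencies are ready")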

The LLM

We'll use the 600M-parameter Qwen3 model (Qwen/Qwen3-0.6B) for this example. The target task is translating user requests into Linux commands.

from txtai import LLM

llm = LLM("Qwen/Qwen3-0.6B")

Test the base model

llm("""
Translate the following request into a linux command. Only print the command.

Find number of logged in users
""", maxlength=1024)

Output

ps -e

The model understands the request but the command isn’t correct. Let’s improve it through fine‑tuning.

Fine-tuning the LLM with knowledge

Even a 600M-parameter model can be enhanced by distilling domain-specific knowledge into it. We'll use the Linux commands dataset from Hugging Face (mecha-org/linux-command-dataset) and txtai's training pipeline.
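
Before formatting anything, it's worth a quick look at the raw data. A minimal sketch that loads the same split used below and prints the first request/command pair:

from datasets import load_dataset

# Quick peek at the raw dataset used for fine-tuning
dataset = load_dataset("mecha-org/linux-command-dataset", split="train")

# Each row pairs a natural-language request ("input") with a shell command ("output")
print(dataset)
print(dataset[0])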

Create the training dataset

Each training example is built from the following prompt template, with the dataset's request substituted for {user request}:

"""
Translate the following request into a linux command. Only print the command.

{user request}
"""
from datasets import load_dataset
from transformers import AutoTokenizer

# Model path
path = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(path)

# Load the training dataset
dataset = load_dataset("mecha-org/linux-command-dataset", split="train")

def prompt(row):
    text = tokenizer.apply_chat_template([
        {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
        {"role": "user", "content": row["input"]},
        {"role": "assistant", "content": row["output"]}
    ], tokenize=False, enable_thinking=False)

    return {"text": text}

# Map to training prompts
train = dataset.map(prompt, remove_columns=["input", "output"])
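
Before training, it helps to print one formatted example and confirm the chat template looks as expected:

# Inspect the first formatted training prompt
print(train[0]["text"])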

Train the model

from txtai.pipeline import HFTrainer

trainer = HFTrainer()

# Fine-tune the base model on the formatted training prompts
model = trainer(
    "Qwen/Qwen3-0.6B",
    train,
    task="language-generation",
    maxlength=512,
    bf16=True,
    per_device_train_batch_size=4,
    num_train_epochs=1,
    logging_steps=50,
)
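
If you want to reuse the fine-tuned weights later without retraining, they can be written to disk with the standard Hugging Face save_pretrained calls. A sketch that assumes trainer() returns a (model, tokenizer) pair, as described in txtai's documentation; the qwen3-linux-commands directory name is just an example:

# Assumption: trainer(...) returns a (model, tokenizer) tuple per txtai's docs
ftmodel, fttokenizer = model

# Persist the fine-tuned weights and tokenizer for later reuse
ftmodel.save_pretrained("qwen3-linux-commands")
fttokenizer.save_pretrained("qwen3-linux-commands")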

Evaluate the fine‑tuned model

from txtai import LLM

llm = LLM(model)

# Example 1
llm([
    {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
    {"role": "user", "content": "Find number of logged in users"}
])

Output

who | wc -l

# Example 2
llm([
    {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
    {"role": "user", "content": "List the files in my home directory"}
])

Output

ls ~/

# Example 3
llm([
    {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
    {"role": "user", "content": "Zip the data directory with all its contents"}
])

Output

zip -r data.zip data

The model also works without the explicit system prompt:

llm("Calculate the total amount of disk space used for my home directory. Only print the total.")

Output

du -sh ~
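
To use this in an application, a small helper can wrap the system prompt and return the command text. A minimal sketch reusing the llm instance from above; the translate name is just for illustration:

def translate(request):
    # Wrap the user request with the same system prompt used during training
    return llm([
        {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
        {"role": "user", "content": request}
    ])

print(translate("Show the 10 largest files in the current directory"))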

Wrapping up

This article demonstrated how straightforward it is to distill knowledge into a tiny LLM using txtai. You don't always need a giant model; a little time spent fine-tuning a small one can be well worth the effort.
