Distilling Knowledge into Tiny LLMs
Source: Dev.to

Large Language Models (LLMs) are the engine behind most modern AI applications. These massive billion- and trillion-parameter models generalize well when trained on enough data.
The catch is that they are expensive and hard to run yourself, so many developers call LLMs through hosted APIs such as OpenAI or Anthropic’s Claude. In practice, developers also spend a lot of time crafting complex prompt logic to cover edge cases, believing they need a huge model to handle all the rules.
If you truly want control over your business processes, running a local model is a better choice. The good news is that it doesn’t have to be a multi‑billion‑parameter beast. By fine‑tuning a smaller LLM you can handle specific business logic, reduce prompt complexity, and keep everything in‑house.
This article shows how to distill knowledge into tiny LLMs.
Install dependencies
Install txtai and the required libraries:
pip install txtai[pipeline-train] datasets
The LLM
We’ll use Qwen3-0.6B, a 600M-parameter model, for this example. The target task is translating user requests into Linux commands.
from txtai import LLM
llm = LLM("Qwen/Qwen3-0.6B")
Test the base model
llm("""
Translate the following request into a linux command. Only print the command.
Find number of logged in users
""", maxlength=1024)
Output
ps -e
The model understands the request but the command isn’t correct. Let’s improve it through fine‑tuning.
Fine-tuning the LLM with knowledge
Even a 600 M model can be enhanced by distilling domain‑specific knowledge. We’ll use the Linux commands dataset from Hugging Face and txtai’s training pipeline.
Create the training dataset
Each training example follows the same prompt format used to test the base model:
"""
Translate the following request into a linux command. Only print the command.
{user request}
"""
from datasets import load_dataset
from transformers import AutoTokenizer

# Model path
path = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(path)

# Load the training dataset
dataset = load_dataset("mecha-org/linux-command-dataset", split="train")

def prompt(row):
    # Format each record with the same chat template used at inference time
    text = tokenizer.apply_chat_template([
        {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
        {"role": "user", "content": row["input"]},
        {"role": "assistant", "content": row["output"]}
    ], tokenize=False, enable_thinking=False)

    return {"text": text}

# Map to training prompts
train = dataset.map(prompt, remove_columns=["input", "output"])
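To sanity-check the mapping, print one formatted example. The exact layout comes from Qwen3’s chat template, but it should contain the system instruction, the user request and the target command.
# Inspect a single formatted training prompt
print(train[0]["text"])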
Train the model
from txtai.pipeline import HFTrainer

trainer = HFTrainer()
model = trainer(
    "Qwen/Qwen3-0.6B",
    train,
    task="language-generation",
    maxlength=512,
    bf16=True,
    per_device_train_batch_size=4,
    num_train_epochs=1,
    logging_steps=50,
)
Evaluate the fine‑tuned model
from txtai import LLM

llm = LLM(model)

# Example 1
llm([
    {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
    {"role": "user", "content": "Find number of logged in users"}
])
Output
who | wc -l
# Example 2
llm([
    {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
    {"role": "user", "content": "List the files in my home directory"}
])
Output
ls ~/
# Example 3
llm([
    {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
    {"role": "user", "content": "Zip the data directory with all its contents"}
])
Output
zip -r data.zip data
The model also works without the explicit system prompt:
llm("Calculate the total amount of disk space used for my home directory. Only print the total.")
Output
du -sh ~
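In an application, you would likely wrap the call in a small helper so callers only pass the request. A minimal sketch (the helper name is illustrative) that reuses the system prompt from training:
def to_command(request):
    # Reuse the system prompt the model was fine-tuned on
    return llm([
        {"role": "system", "content": "Translate the following request into a linux command. Only print the command."},
        {"role": "user", "content": request}
    ])

print(to_command("Show the current disk usage of /var/log"))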
Wrapping up
This article demonstrated how straightforward it is to distill knowledge into tiny LLMs using txtai. You don’t always need a giant model; spending a little time fine-tuning a small one can be well worth the effort.