I trained my own LLM and published it on HuggingFace
Source: Dev.to
Overview
This post documents the process of fine‑tuning a language model on medical data and publishing it to Hugging Face.
Model Choice
- Base model: `facebook/opt-1.3b` – 1.3 billion parameters, open‑source, no usage restrictions.
Technique: LoRA (Low‑Rank Adaptation)
LoRA freezes the base model and injects small trainable low‑rank adapter matrices into selected weight matrices, cutting the trainable parameters from 1.3 B to roughly 4 M – well under 1 % of the full model – which makes the fine‑tune dramatically cheaper.
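As a rough conceptual sketch of the idea (not the actual peft internals), the frozen weight W never changes; training only touches two small factors A and B whose product is added to W. The dimensions below are illustrative, not taken from OPT‑1.3b's config.

import torch

d, r, alpha = 2048, 8, 16            # hidden size (illustrative), LoRA rank, scaling factor
W = torch.randn(d, d)                # frozen base weight: never updated
A = torch.randn(r, d) * 0.01         # trainable low-rank factor
B = torch.zeros(d, r)                # trainable, zero-initialized so the update starts at 0

def lora_linear(x):
    # Effective weight is W + (alpha / r) * B @ A; only A and B would receive gradients
    return x @ (W + (alpha / r) * (B @ A)).T

x = torch.randn(1, d)
print(lora_linear(x).shape)          # torch.Size([1, 2048])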
Training Environment
- Hardware: Free Google Colab Tesla T4 GPU (15 GB VRAM), 30 hours/week.
- Constraints: No local GPU; CPU training would take days.
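Before kicking off a long run, a quick sanity check in the Colab notebook confirms the free T4 is actually attached:

import torch

# Confirm a GPU runtime is attached before starting training
print(torch.cuda.is_available())          # True on a GPU runtime
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "Tesla T4"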
Key Code
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from trl import SFTTrainer, SFTConfig

# Load the base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

# Add LoRA adapters: only the low-rank matrices injected into the
# attention projections (q_proj, v_proj) are trained; everything else stays frozen
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Train on the prepared medical instruction dataset (train_dataset)
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    args=SFTConfig(num_train_epochs=3, learning_rate=2e-4),
)
trainer.train()
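To sanity‑check the parameter count mentioned earlier, peft models expose print_trainable_parameters(); the exact numbers depend on the rank and target modules chosen.

# Report how many parameters actually receive gradients vs. the frozen total
model.print_trainable_parameters()
# prints something roughly of the form:
# trainable params: ... || all params: ... || trainable%: ...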
Training Results
Training completed in 1.5 hours on the free T4 GPU. Loss progression:
- Step 100: 1.163
- Step 500: 0.994
- Step 1000: 0.967
- Step 1700: 0.944 ← training complete
Both training and validation loss decreased together, indicating genuine learning rather than memorization.
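If you want to pull these numbers out of the run itself rather than copy them from the progress bar, the Trainer keeps its logged metrics in trainer.state.log_history. This sketch assumes the trainer object from the training snippet is still in scope.

# Collect (step, loss) pairs recorded during training
losses = [(e["step"], e["loss"]) for e in trainer.state.log_history if "loss" in e]
for step, loss in losses:
    print(f"step {step}: loss {loss:.3f}")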
Publishing to Hugging Face
model.push_to_hub("Yakhilesh/medmind-opt-medical")
tokenizer.push_to_hub("Yakhilesh/medmind-opt-medical")
The model (adapter weights only ≈ 12.6 MB) is now publicly available at Yakhilesh/medmind-opt-medical. Anyone can download and use it.
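Here is a minimal sketch of how someone else could load the adapter on top of the base model and query it; the prompt is just an illustration.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model, then attach the published LoRA adapter
base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
model = PeftModel.from_pretrained(base, "Yakhilesh/medmind-opt-medical")
tokenizer = AutoTokenizer.from_pretrained("Yakhilesh/medmind-opt-medical")

prompt = "What are the common symptoms of iron deficiency?"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))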
Takeaways
- Fine‑tuning is more dependent on data quality than on model size.
- LoRA enables efficient adaptation of large models with minimal compute cost.
- Even a short 1.5‑hour fine‑tune can capture meaningful medical patterns, as reflected by the loss curve.