I trained my own LLM and published it on HuggingFace
Source: Dev.to
Overview
This post documents the process of fine‑tuning a language model on medical data and publishing it to Hugging Face.
Model Choice
- Base model: `facebook/opt-1.3b` – 1.3 billion parameters, open‑source, no usage restrictions.
Technique: LoRA (Low‑Rank Adaptation)
LoRA freezes the base model and injects small trainable low‑rank adapter matrices into selected weight matrices, cutting the trainable parameters from 1.3 B to roughly 4 M – well under 1 % of the full model – which makes the fine‑tune dramatically cheaper.
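As a rough conceptual sketch of the idea (not the actual peft internals), the frozen weight W never changes; training only touches two small factors A and B whose product is added to W. The dimensions below are illustrative, not taken from OPT‑1.3b's config.

import torch

d, r, alpha = 2048, 8, 16            # hidden size (illustrative), LoRA rank, scaling factor
W = torch.randn(d, d)                # frozen base weight: never updated
A = torch.randn(r, d) * 0.01         # trainable low-rank factor
B = torch.zeros(d, r)                # trainable, zero-initialized so the update starts at 0

def lora_linear(x):
    # Effective weight is W + (alpha / r) * B @ A; only A and B would receive gradients
    return x @ (W + (alpha / r) * (B @ A)).T

x = torch.randn(1, d)
print(lora_linear(x).shape)          # torch.Size([1, 2048])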
Training Environment
- Hardware: Free Google Colab Tesla T4 GPU (15 GB VRAM), 30 hours/week.
- Constraints: No local GPU; CPU training would take days.
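Before kicking off a long run, a quick sanity check in the Colab notebook confirms the free T4 is actually attached:

import torch

# Confirm a GPU runtime is attached before starting training
print(torch.cuda.is_available())          # True on a GPU runtime
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "Tesla T4"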
Key Code
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from trl import SFTTrainer, SFTConfig

# Load the base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

# Add LoRA adapters: only the low-rank matrices injected into the
# attention projections (q_proj, v_proj) are trained; everything else stays frozen
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Train on the prepared medical instruction dataset (train_dataset)
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    args=SFTConfig(num_train_epochs=3, learning_rate=2e-4),
)
trainer.train()
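To sanity‑check the parameter count mentioned earlier, peft models expose print_trainable_parameters(); the exact numbers depend on the rank and target modules chosen.

# Report how many parameters actually receive gradients vs. the frozen total
model.print_trainable_parameters()
# prints something roughly of the form:
# trainable params: ... || all params: ... || trainable%: ...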
Training Results
Training completed in 1.5 hours on the free T4 GPU. Loss progression:
- Step 100: 1.163
- Step 500: 0.994
- Step 1000: 0.967
- Step 1700: 0.944 ← training complete
Both training and validation loss decreased together, indicating genuine learning rather than memorization.
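If you want to pull these numbers out of the run itself rather than copy them from the progress bar, the Trainer keeps its logged metrics in trainer.state.log_history. This sketch assumes the trainer object from the training snippet is still in scope.

# Collect (step, loss) pairs recorded during training
losses = [(e["step"], e["loss"]) for e in trainer.state.log_history if "loss" in e]
for step, loss in losses:
    print(f"step {step}: loss {loss:.3f}")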
Publishing to Hugging Face
model.push_to_hub("Yakhilesh/medmind-opt-medical")
tokenizer.push_to_hub("Yakhilesh/medmind-opt-medical")
The model (adapter weights only ≈ 12.6 MB) is now publicly available at Yakhilesh/medmind-opt-medical. Anyone can download and use it.
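Here is a minimal sketch of how someone else could load the adapter on top of the base model and query it; the prompt is just an illustration.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model, then attach the published LoRA adapter
base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
model = PeftModel.from_pretrained(base, "Yakhilesh/medmind-opt-medical")
tokenizer = AutoTokenizer.from_pretrained("Yakhilesh/medmind-opt-medical")

prompt = "What are the common symptoms of iron deficiency?"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))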
Takeaways
- Fine‑tuning is more dependent on data quality than on model size.
- LoRA enables efficient adaptation of large models with minimal compute cost.
- Even a short 1.5‑hour fine‑tune can capture meaningful medical patterns, as reflected by the loss curve.