Serving a deep learning model with Django
Source: Dev.to
Introduction
Deep learning did not suddenly appear with large language models. The field has been evolving for decades, starting with early neural network research in the late 20th century and gradually improving as compute, data, and algorithms advanced. Today, deep learning systems are used in image recognition, natural language processing, recommendation systems, and many other real‑world applications. Training a model, however, is only part of the workflow. To make a model useful, it must be served. It needs to accept inputs and return predictions in a reliable way. In this article, we will walk through a practical approach to serving a deep learning model using Django and PyTorch.
Note: The focus here is clarity and correctness rather than production‑scale optimization (we will see this in a future article).
Part 1: Set up a Django project
- Create and activate a virtual environment, then install Django:

  ```bash
  python -m venv venv
  source venv/bin/activate
  pip install django
  ```

- Create a new Django project:

  ```bash
  django-admin startproject sample_project
  cd sample_project
  ```

- Create an application that will contain the model‑serving logic:

  ```bash
  python manage.py startapp my_app
  ```

- Project directory layout:

  ```
  sample_project/
  ├── sample_project/
  │   ├── settings.py
  │   ├── urls.py
  │   └── wsgi.py
  ├── my_app/
  │   ├── views.py
  │   ├── apps.py
  │   └── templates/
  └── manage.py
  ```

- Add `my_app` to `INSTALLED_APPS` in `sample_project/settings.py`.
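For reference, registering the app in `sample_project/settings.py` looks roughly like this (the default apps shown are the ones `startproject` generates):

```python
# sample_project/settings.py (excerpt)
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "my_app",  # the app that will serve the model
]
```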
At this stage, no deep‑learning code is involved yet.
Part 2: Train a deep learning model
Model training is intentionally kept outside Django. Training is usually compute‑heavy and better handled in a separate script or notebook. Django’s responsibility will later be inference, not optimization.
Below is a training function that performs supervised learning with validation and tracks common metrics. For this example we use a PyTorch model.
from torch import nn

def train_model(
model: nn.Module,
loss_fn,
optim,
lr: float,
train_dl,
val_dl,
num_classes: int,
epochs: int = 20,
device: str = "cuda",
) -> dict[str, list[float]]:
"""
Train a PyTorch model.
- Moves data and the model to the chosen device (CPU or GPU)
- Computes loss using cross‑entropy (appropriate for multi‑class classification)
- Tracks accuracy and macro‑averaged F1 score using torchmetrics
- Separates training and validation phases
"""
# implementation goes here …
Key points
- The function is completely independent of Django, ensuring a clean separation between training and serving.
- You may train your own model or adapt a pre‑trained one depending on your task.
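As a concrete reference, here is a minimal sketch of the loop that docstring describes. To keep it dependency-free it tracks plain accuracy rather than torchmetrics' macro‑F1, and the name `train_model_sketch` is illustrative, not from the original project:

```python
import torch
from torch import nn

def train_model_sketch(model, loss_fn, optim, train_dl, val_dl,
                       epochs=2, device="cpu"):
    """Minimal training loop: cross-entropy-style supervised training
    with separate train/validation phases and per-epoch metrics."""
    model.to(device)
    history = {"train_loss": [], "val_loss": [], "val_acc": []}
    for _ in range(epochs):
        # ---- Training phase ----
        model.train()
        total = 0.0
        for x, y in train_dl:
            x, y = x.to(device), y.to(device)
            optim.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optim.step()
            total += loss.item()
        history["train_loss"].append(total / len(train_dl))

        # ---- Validation phase (no gradients) ----
        model.eval()
        vloss, correct, seen = 0.0, 0, 0
        with torch.no_grad():
            for x, y in val_dl:
                x, y = x.to(device), y.to(device)
                logits = model(x)
                vloss += loss_fn(logits, y).item()
                correct += (logits.argmax(dim=1) == y).sum().item()
                seen += y.numel()
        history["val_loss"].append(vloss / len(val_dl))
        history["val_acc"].append(correct / seen)
    return history
```

Swapping the manual accuracy computation for `torchmetrics` objects, as the docstring above suggests, only changes the bookkeeping, not the structure of the loop.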
Part 3: Saving and loading model weights
After training, save the model’s learned parameters so Django can load them for inference.
torch.save(
{"model_state_dict": model.state_dict()},
"resnet18_model.pth"
)
When serving the model, reconstruct the architecture exactly as it was during training, then load the saved weights:
model = ResNet18_CustomHead(num_classes=5).to(device)
ckpt = torch.load("resnet18_model.pth", map_location=device)
model.load_state_dict(ckpt["model_state_dict"])
model.eval() # disable dropout, batch‑norm updates, etc.
This explicit approach is less error‑prone across environments.
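As a self-contained sanity check of that round trip, the same pattern works with any module; a tiny `nn.Linear` stands in here for `ResNet18_CustomHead`, and `demo_model.pth` is just a throwaway file name:

```python
import torch
from torch import nn

# Tiny stand-in model, just to demonstrate the save/load round trip
model = nn.Linear(4, 5)
torch.save({"model_state_dict": model.state_dict()}, "demo_model.pth")

# "Serving side": rebuild the same architecture, then load the weights
restored = nn.Linear(4, 5)
ckpt = torch.load("demo_model.pth", map_location="cpu")
restored.load_state_dict(ckpt["model_state_dict"])
restored.eval()
```

After loading, both models produce identical outputs for the same input, which is exactly the property inference depends on.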
Part 4: Creating a Django view for inference
Now we connect everything together. The view below accepts uploaded images, loads the trained model, runs inference, and returns predictions as JSON.
import os
import uuid
import shutil
from pathlib import Path
from django.http import JsonResponse
from django.shortcuts import render
from django.conf import settings
import torch
from torchvision import transforms
# Assume these utilities are defined elsewhere in the project
# from .utils import load_images_from_path, get_default_test_transforms, infer_on_unknown_data
# from .models import ResNet18_CustomHead
TEMP_DIR = Path(settings.BASE_DIR) / "temp"
MODEL_WEIGHTS_PATH = Path(settings.BASE_DIR) / "model_weights"
def analysisPageView(request):
if request.method == "POST":
# ---- Prepare a clean temporary directory ----
os.makedirs(TEMP_DIR, exist_ok=True)
# Remove any leftover sub‑directories
for p in TEMP_DIR.iterdir():
if p.is_dir():
shutil.rmtree(p)
req_dir = TEMP_DIR / uuid.uuid4().hex
req_dir.mkdir(parents=True, exist_ok=True)
# ---- Save uploaded images to the temp folder ----
uploaded_files = request.FILES.getlist("images")
for f in uploaded_files:
file_name = Path(f.name).name
ext = Path(file_name).suffix.lower()
final_file_name = f"image-{uuid.uuid4().hex[:8]}{ext}"
with open(req_dir / final_file_name, "wb") as dest:
for chunk in f.chunks():
dest.write(chunk)
# ---- Load images and preprocessing transforms ----
images = load_images_from_path(req_dir)
test_tfms = get_default_test_transforms()
device = "cuda" if torch.cuda.is_available() else "cpu"
# ---- Load the model (once per request – OK for demos) ----
model = ResNet18_CustomHead(num_classes=5).to(device)
ckpt = torch.load(
MODEL_WEIGHTS_PATH / "resnet18_model.pth",
map_location=device,
)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()
# ---- Run inference ----
pred_labels = infer_on_unknown_data(
images, model, device, test_tfms
)
return JsonResponse({"labels": pred_labels})
# GET request – render the upload page
return render(request, "app/analysis.html")
Important notes
- Loading the model inside the request handler is acceptable for demonstrations but inefficient for high‑traffic systems. In production you would load the model once (e.g., at server start) and reuse it across requests.
- `model.eval()` disables training‑specific layers such as dropout and batch‑norm updates.
- Pre‑processing must exactly match the transforms used during training; otherwise predictions will be unreliable.
With the steps above, you now have a minimal yet functional pipeline for training a PyTorch model, saving its weights, and serving it through a Django web application.
Part 5: Testing the API
Once the view is wired into urls.py, you can test it by:
- Submitting images via an HTML form
- Sending a POST request using Postman or curl
- Calling the endpoint from a frontend application
If everything is set up correctly, Django will return predictions as structured JSON, making it easy to integrate with other services.
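The `urls.py` wiring itself might look like the following; the `"analysis/"` path is an assumption, chosen to match the view name used earlier:

```python
# sample_project/urls.py -- hypothetical wiring for the view above
from django.contrib import admin
from django.urls import path

from my_app.views import analysisPageView

urlpatterns = [
    path("admin/", admin.site.urls),
    path("analysis/", analysisPageView, name="analysis"),
]
```

With that in place, a quick smoke test from the command line could be `curl -X POST -F "images=@photo.jpg" http://127.0.0.1:8000/analysis/`.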
Example view for model inference
(The previous view already covers this; the snippet below is provided for reference.)
def ModelInferenceView(request):
...
images = load_images_from_path(req_dir)
# ⚠️ Warning: not recommended for production
model = ResNet18_CustomHead(num_classes=5).to(device)
ckpt = torch.load(MODEL_WEIGHTS_PATH / "resnet18_model.pth",
map_location=device)
state_dict = ckpt["model_state_dict"]
model.load_state_dict(state_dict)
...
pred_labels = infer_on_unknown_data(images, model, device, test_tfms)
return JsonResponse({'labels': pred_labels})
Recap
In this article we walked through the full lifecycle of serving a deep learning model using Django:
- Created a Django project and application
- Trained a PyTorch model outside the web layer
- Saved and loaded model weights correctly
- Exposed a Django view for inference
- Tested the model via HTTP requests
Django is not designed for large‑scale model serving, but it is a practical choice for prototypes, internal tools, and workloads where simplicity and flexibility matter.
That’s it for today. The reference code (related, but for a different project) is available here. If you found this useful, consider following me on LinkedIn and starring the repository.