Mithridatium: An Open-Source Toolkit for Verifying the Integrity of Pretrained Machine Learning Models

Published: December 2, 2025 at 09:53 PM EST
Source: Dev.to

Why Mithridatium?

Today’s ML ecosystem largely assumes that pretrained models are safe. In reality, the model file itself can be a silent attack vector. Risks include:

  • weights trained on poisoned data
  • hidden triggers that activate only on specific inputs
  • manipulated weights
  • malformed checkpoints that cause unexpected runtime behavior

Mithridatium provides a command‑line workflow to evaluate these risks through model‑centric defenses, inspired by academic research but simplified for real‑world use.

Offline Usage

Once installed, Mithridatium can run entirely offline.

You only need:

  1. Your .pth model file
  2. A local dataset directory (optional for STRIP; MMBD may require it, depending on configuration)

This makes the tool suitable for restricted environments, air‑gapped machines, or secure internal ML pipelines.

Installation

pip install mithridatium

Upgrade to the latest release:

pip install --upgrade mithridatium

Implemented Defenses

MMBD (Maximum Margin Backdoor Detection)

MMBD evaluates synthetic class‑optimized images to detect anomalous activation patterns commonly associated with backdoored models.

Features

  • per‑class eigenvalue scores
  • normalized anomaly distributions
  • classical hypothesis testing (p‑value)
  • deterministic verdict
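
As a rough illustration of how per-class scores can be turned into a normalized anomaly score, a p-value, and a deterministic verdict, here is a minimal Python sketch. It is not Mithridatium's implementation: the function name `mad_anomaly_verdict`, the median/MAD normalization, the normal-tail p-value, and the 0.05 threshold are all assumptions chosen for illustration.

```python
import math
from statistics import median


def mad_anomaly_verdict(scores, alpha=0.05):
    """Flag a class whose score is an outlier relative to the others.

    Assumption: `scores` holds one detection statistic per class.
    How those statistics are produced is outside this sketch.
    """
    med = median(scores)
    # Median absolute deviation, scaled so that it estimates the
    # standard deviation when the scores are normally distributed.
    mad = 1.4826 * median(abs(s - med) for s in scores)
    if mad == 0:
        # No spread at all: nothing can be called anomalous.
        return {"suspect_class": None, "p_value": 1.0, "backdoored": False}

    normalized = [(s - med) / mad for s in scores]
    suspect = max(range(len(scores)), key=lambda i: normalized[i])

    # One-sided normal tail probability for the largest normalized score.
    p_single = 0.5 * math.erfc(normalized[suspect] / math.sqrt(2))
    # Account for having picked the maximum over all classes.
    p_value = 1.0 - (1.0 - p_single) ** len(scores)

    return {
        "suspect_class": suspect,
        "p_value": p_value,
        "backdoored": p_value < alpha,
    }


# Made-up scores: class 9 stands out sharply from the rest.
suspicious = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98, 1.03, 6.0]
print(mad_anomaly_verdict(suspicious))
```

Because the verdict is a fixed threshold on a computed p-value, the same scores always produce the same answer, which is one way to read "deterministic verdict" in the list above.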

Example invocation

mithridatium detect --model model.pth --defense mmbd --arch resnet18 --data cifar10

STRIP (Strong Intentional Perturbation)

STRIP is a black‑box defense that does not rely on internal architectural details. It evaluates prediction entropy when the model is exposed to heavily perturbed variants of the same input. Backdoored models typically exhibit abnormally low entropy under perturbation, because an input that carries the trigger keeps being forced to the attacker's target class.

Features

  • entropy computation on perturbed samples
  • sampling and perturbation utilities
  • summary metrics (mean, min, max entropy)
  • integration into a unified reporting schema
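
The entropy test and summary metrics described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Mithridatium's code: `strip_entropy_summary`, `blend`, and `toy_backdoored_predict` are hypothetical names, inputs are flat lists of floats rather than image tensors, and the "model" is a hand-written function with an artificial trigger on its first feature.

```python
import math
import random


def shannon_entropy(probs):
    """Shannon entropy, in bits, of one probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def blend(x, overlay, alpha=0.5):
    """Perturb an input by linearly superimposing another sample on it."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(x, overlay)]


def strip_entropy_summary(predict, x, clean_pool, n_perturb=20, seed=0):
    """Entropy statistics of `predict` over perturbed copies of `x`.

    `predict` is any callable that maps an input to class probabilities,
    so the model is treated as a black box.
    """
    rng = random.Random(seed)
    entropies = [
        shannon_entropy(predict(blend(x, rng.choice(clean_pool))))
        for _ in range(n_perturb)
    ]
    return {
        "mean": sum(entropies) / len(entropies),
        "min": min(entropies),
        "max": max(entropies),
    }


def toy_backdoored_predict(x):
    """Hypothetical stand-in for a backdoored 3-class model."""
    if x[0] > 0.45:
        # Artificial trigger: a bright first feature forces class 0.
        return [0.98, 0.01, 0.01]
    exps = [math.exp(2 * v) for v in x[1:4]]
    total = sum(exps)
    return [e / total for e in exps]


pool_rng = random.Random(1)
clean_pool = [[0.0] + [pool_rng.random() for _ in range(3)] for _ in range(50)]
clean_input = clean_pool[0]
triggered_input = [1.0] + clean_input[1:]

clean_stats = strip_entropy_summary(toy_backdoored_predict, clean_input, clean_pool)
triggered_stats = strip_entropy_summary(toy_backdoored_predict, triggered_input, clean_pool)
print("clean:", clean_stats)
print("triggered:", triggered_stats)
```

In this toy setup the trigger survives blending, so every perturbed copy of the triggered input gets the same confident prediction and the entropy stays low, while perturbed clean inputs produce more varied, higher-entropy predictions.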

Example invocation

mithridatium detect --model model.pth --defense strip --arch resnet18 --data cifar10

Recent Advancements

  • STRIP Core Utility – modular implementation handling entropy scoring, perturbation generation, and device‑safe execution (CPU/MPS/CUDA).
  • CLI Integration – STRIP can now be invoked like MMBD, with unified reporting and JSON output.
  • Output Schema Normalization – standardizing all defenses toward a single report format for ecosystem integration.
  • End‑to‑End CLI Tests – end‑to‑end tests invoke the CLI in a subprocess and confirm that STRIP runs to completion without crashing.

What’s Next

  • Improving documentation
  • Adding developer notes
  • Refining report summaries
  • Strengthening validation and error messaging

No new defenses are planned until next year; the focus is on polishing the tool for maintainability and accessibility.

Try it Yourself

The project is open‑source and available here: mithridatium

Contributions, issues, and feedback are welcome.

If you’re working with pretrained models, whether in research, deployment, or security, you should not assume integrity. Mithridatium helps you verify it. Detailed explanations, defense theory, and usage examples are in the repository’s README.
