What is the DIETClassifier?

Published: February 7, 2026 at 08:29 PM EST
3 min read
Source: Dev.to


DIET stands for Dual Intent and Entity Transformer.
It is a single neural network that performs:

  • Intent classification
  • Entity extraction

Unlike CRFEntityExtractor, which focuses only on entities, DIET jointly learns:

  • The meaning of the full sentence (intent)
  • The role of each token (entity labels)

This shared learning allows the model to use intent‑level context to improve entity prediction, and vice versa.


Why was DIET introduced?

Traditional pipelines looked like this:

  1. Intent classifier → predicts intent
  2. Entity extractor → predicts entities independently

Drawbacks of this separation:

  • Duplicate feature computation
  • No shared understanding between intent and entities
  • More models to train, tune, and maintain

DIET solves this by using one model to learn shared embeddings and optimize both tasks together, leading to better performance, especially when training data is limited.


How DIET works

DIET is based on a Transformer architecture. At a high level, it:

  1. Tokenizes the input text
  2. Converts tokens into embeddings
  3. Applies transformer layers to model context

and predicts:

  • Sentence embedding → intent
  • Token‑level labels → entities

Instead of hand‑engineered features (as in CRF), DIET learns features automatically.
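The three steps above, plus the two prediction heads, can be sketched as a toy forward pass. This is an illustration of the shared‑encoder, dual‑head idea only, not Rasa's actual implementation; all dimensions and weights below are made up:

```python
import numpy as np

# Toy sketch of DIET's dual-output idea (illustrative, not Rasa's code).
# A shared encoder produces one vector per token; the pooled sentence
# vector feeds the intent head, each token vector feeds the entity head.

rng = np.random.default_rng(0)
EMB_DIM, N_INTENTS, N_ENTITY_TAGS = 8, 3, 4  # hypothetical sizes

tokens = ["book", "a", "flight", "to", "paris"]

# Stand-in for "tokenize + embed + transformer layers":
# random contextual vectors, one per token.
token_vecs = rng.normal(size=(len(tokens), EMB_DIM))

# Sentence embedding: mean-pool the token vectors (one common choice).
sentence_vec = token_vecs.mean(axis=0)

# Two task heads sharing the same encoder output.
intent_head = rng.normal(size=(EMB_DIM, N_INTENTS))
entity_head = rng.normal(size=(EMB_DIM, N_ENTITY_TAGS))

intent_scores = sentence_vec @ intent_head   # one score per intent
entity_scores = token_vecs @ entity_head     # one score row per token

print(intent_scores.shape)  # (3,)
print(entity_scores.shape)  # (5, 4)
```

The point is the sharing: both heads read the same contextual vectors, so gradients from the intent loss and the entity loss update the same encoder.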


Intent classification with DIET

For intent classification, DIET:

  • Embeds the entire sentence
  • Compares it against learned intent embeddings
  • Uses similarity scoring to choose the best intent

Example

“Book a flight to Paris.”

The model learns that this sentence embedding is closest to the book_flight intent, allowing DIET to generalize well to paraphrases and unseen phrasing.
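The similarity scoring step can be sketched in a few lines. The embeddings below are hypothetical 2‑D vectors chosen for readability; in practice DIET learns dense embeddings for both sentences and intent labels:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical learned embeddings (2-D only for readability).
sentence = np.array([0.9, 0.1])          # "Book a flight to Paris."
intents = {
    "book_flight": np.array([1.0, 0.0]),
    "greet":       np.array([0.0, 1.0]),
}

# Score the sentence against every intent embedding, pick the best.
scores = {name: cosine(sentence, vec) for name, vec in intents.items()}
best = max(scores, key=scores.get)
print(best)  # book_flight
```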


Entity extraction with DIET

DIET performs token‑level classification, similar to CRF. Each token receives labels like B-entity, I-entity, O, etc.

Book    O
a       O
flight  O
from    O
New     B-location
York    I-location
to      O
Paris   B-location

The difference is that DIET uses contextual embeddings produced by transformers instead of manually designed features.
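Turning per‑token B/I/O labels like the ones above into entity values is a standard post‑processing step. A minimal sketch (not Rasa's internal implementation):

```python
def bio_to_entities(tokens, tags):
    """Group B-/I- tagged tokens into entity dicts."""
    entities, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):          # start of a new entity
            if current:
                entities.append(current)
            current = {"entity": tag[2:], "value": token}
        elif tag.startswith("I-") and current and current["entity"] == tag[2:]:
            current["value"] += " " + token  # continuation of the entity
        else:                              # "O" tag or broken sequence
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    return entities

tokens = ["Book", "a", "flight", "from", "New", "York", "to", "Paris"]
tags   = ["O", "O", "O", "O", "B-location", "I-location", "O", "B-location"]
print(bio_to_entities(tokens, tags))
# [{'entity': 'location', 'value': 'New York'},
#  {'entity': 'location', 'value': 'Paris'}]
```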


Training data format

DIET uses the same annotated NLU data as CRF.

version: "3.1"

nlu:
  - intent: book_flight
    examples: |
      - Book a flight from [New York](location) to [Paris](location)
      - Fly from [Berlin](location) to [London](location)

There is no separate configuration for intent vs. entity training; DIET learns both from the same data.
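DIET is enabled in the pipeline section of config.yml. A typical setup looks like the following; the featurizer choices and the epochs value are common starting points, not requirements:

```yaml
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
```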


Internal working (simplified)

At runtime, DIET:

  1. Tokenizes the message
  2. Generates embeddings
  3. Applies transformer layers

Then it predicts:

  • An intent, with a confidence score
  • An entity label for each token

and finally groups consecutive tagged tokens into entity values.

Example output

{
  "intent": {
    "name": "book_flight",
    "confidence": 0.92
  },
  "entities": [
    {
      "entity": "location",
      "value": "Paris",
      "start": 23,
      "end": 28
    }
  ]
}
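Downstream code can consume this payload with plain dictionary access. A small sketch using the example output above (field names taken from that example):

```python
import json

# The prediction payload from the example above, as a JSON string.
payload = json.loads("""{
  "intent": {"name": "book_flight", "confidence": 0.92},
  "entities": [
    {"entity": "location", "value": "Paris", "start": 23, "end": 28}
  ]
}""")

intent = payload["intent"]["name"]
locations = [e["value"] for e in payload["entities"]
             if e["entity"] == "location"]
print(intent, locations)  # book_flight ['Paris']
```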

When should you use DIETClassifier?

DIETClassifier is the default choice when you want a single model for intents and entities, especially when:

  • The language is flexible and conversational
  • You need long‑term scalability or are building production‑grade assistants

CRFEntityExtractor and RegexEntityExtractor still have value for highly structured or deterministic entities, but DIET is the backbone of modern Rasa NLU pipelines.
