What is the DIETClassifier?
Source: Dev.to
DIET stands for Dual Intent and Entity Transformer.
It is a single neural network that performs:
- Intent classification
- Entity extraction
Unlike CRFEntityExtractor, which focuses only on entities, DIET jointly learns:
- The meaning of the full sentence (intent)
- The role of each token (entity labels)
This shared learning allows the model to use intent‑level context to improve entity prediction, and vice versa.
Why was DIET introduced?
Traditional pipelines looked like this:
- Intent classifier → predicts intent
- Entity extractor → predicts entities independently
Drawbacks of this separation:
- Duplicate feature computation
- No shared understanding between intent and entities
- More models to train, tune, and maintain
DIET solves this by using one model to learn shared embeddings and optimise both tasks together, leading to better performance, especially when training data is limited.
How DIET works
DIET is based on a Transformer architecture. At a high level, it:
- Tokenizes the input text
- Converts tokens into embeddings
- Applies transformer layers to model context
From this shared contextual representation, it predicts:
- Sentence embedding → intent
- Token‑level labels → entities
Instead of hand‑engineered features (as in CRF), DIET learns features automatically.
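The steps above can be sketched in a few lines. This is a toy illustration only: the random vectors stand in for the contextual embeddings a trained transformer would produce, and the pooling step stands in for DIET's sentence representation.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 4

# Stand-in token embeddings; in DIET these come out of the transformer layers.
vocab = {w: rng.normal(size=EMB_DIM) for w in ["book", "a", "flight", "to", "paris"]}

def encode(text):
    """Tokenize the message and look up one vector per token."""
    tokens = text.lower().rstrip(".").split()
    return tokens, np.stack([vocab[t] for t in tokens])

tokens, token_vecs = encode("Book a flight to Paris.")

# Intent head: pool the token vectors into a single sentence embedding.
sentence_vec = token_vecs.mean(axis=0)
# Entity head: each row of token_vecs would be classified into a BIO label.
print(sentence_vec.shape, token_vecs.shape)
```

The key point is that both the intent head and the entity head read the same encoded representation, which is what "joint" learning means here.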
Intent classification with DIET
For intent classification, DIET:
- Embeds the entire sentence
- Compares it against learned intent embeddings
- Uses similarity scoring to choose the best intent
Example
“Book a flight to Paris.”
The model learns that this sentence embedding is closest to the book_flight intent, allowing DIET to generalize well to paraphrases and unseen phrasing.
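The similarity scoring can be sketched as follows. The embedding values and intent names below are made up for illustration; in DIET both the sentence embedding and the intent embeddings are learned during training.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical learned intent embeddings.
intent_embeddings = {
    "book_flight": np.array([0.9, 0.1, 0.0]),
    "cancel_flight": np.array([0.1, 0.9, 0.0]),
    "greet": np.array([0.0, 0.0, 1.0]),
}

# Pretend this is the embedding of "Book a flight to Paris."
sentence_embedding = np.array([0.8, 0.2, 0.1])

scores = {name: cosine(sentence_embedding, emb)
          for name, emb in intent_embeddings.items()}
best = max(scores, key=scores.get)
print(best)  # book_flight
```

Because classification is done by comparing embeddings rather than by a fixed output layer, sentences the model has never seen can still land close to the right intent.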
Entity extraction with DIET
DIET performs token‑level classification, similar to CRF. Each token receives one label in the BIO scheme, such as B-location (beginning of an entity), I-location (inside an entity), or O (outside any entity).
```
Book    O
a       O
flight  O
from    O
New     B-location
York    I-location
to      O
Paris   B-location
```
The difference is that DIET uses contextual embeddings produced by transformers instead of manually designed features.
Training data format
DIET uses the same annotated NLU data as CRF.
```yaml
version: "3.1"
nlu:
  - intent: book_flight
    examples: |
      - Book a flight from [New York](location) to [Paris](location)
      - Fly from [Berlin](location) to [London](location)
```
There is no separate configuration for intent vs. entity training; DIET learns both from the same data.
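In a Rasa project, DIET is enabled as a component in the `config.yml` pipeline. A minimal sketch; the choice of featurizers and the `epochs` value here are illustrative, not prescriptive:

```yaml
recipe: default.v1
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: DIETClassifier
    epochs: 100
```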
Internal working (simplified)
At runtime, DIET:
- Tokenizes the message
- Generates embeddings
- Applies transformer layers
and then predicts:
- The intent, with a confidence score
- An entity label for each token
Finally, it groups contiguous entity tokens into complete entity values.
Example output
```json
{
  "intent": {
    "name": "book_flight",
    "confidence": 0.92
  },
  "entities": [
    {
      "entity": "location",
      "value": "Paris",
      "start": 23,
      "end": 28
    }
  ]
}
```
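A downstream component might consume a payload with this shape as follows. This is a hypothetical handler, and the confidence threshold is purely illustrative:

```python
# Hypothetical consumer of a DIET parse result (same shape as the example above).
result = {
    "intent": {"name": "book_flight", "confidence": 0.92},
    "entities": [
        {"entity": "location", "value": "Paris", "start": 23, "end": 28},
    ],
}

CONFIDENCE_THRESHOLD = 0.7  # illustrative cut-off for triggering a fallback

intent = result["intent"]
if intent["confidence"] >= CONFIDENCE_THRESHOLD:
    locations = [e["value"] for e in result["entities"] if e["entity"] == "location"]
    print(f"intent={intent['name']}, locations={locations}")
else:
    print("low confidence -- hand off to a fallback policy")
```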
When should you use DIETClassifier?
DIETClassifier is the default choice when you want a single model for intents and entities, especially when:
- The language is flexible and conversational
- You need long‑term scalability or are building production‑grade assistants
CRFEntityExtractor and RegexEntityExtractor still have value for highly structured or deterministic entities, but DIET is the backbone of modern Rasa NLU pipelines.