Setting Reasoning Strength in OpenWebUI with `chat_template_kwargs`
Source: Dev.to
When you run a model through llama.cpp and access it from OpenWebUI via an OpenAI‑compatible API, you can control how "strongly" the model reasons by sending a custom parameter called `chat_template_kwargs`. This parameter can include a `reasoning_effort` setting such as `low`, `medium`, or `high`.
In many llama.cpp‑based deployments, the model's reasoning behavior is influenced by values passed into the chat template. Rather than trying to force reasoning strength through prompts, passing `reasoning_effort` via `chat_template_kwargs` provides a more direct and predictable control mechanism. OpenWebUI supports sending such custom parameters in its model configuration, and this approach is also demonstrated in official integration guidance (see the OpenVINO documentation).
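Concretely, the idea is that the parameter ends up as a top‑level field in the request body sent to the server's `/v1/chat/completions` endpoint. A request carrying it might look roughly like this (the model name is a placeholder; the exact shape depends on your llama.cpp server version):

```json
{
  "model": "local-model",
  "messages": [
    {"role": "user", "content": "Explain quicksort briefly."}
  ],
  "chat_template_kwargs": {"reasoning_effort": "high"}
}
```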
Configuration Steps
1. Open the Admin Panel → Settings → Models.
2. Select the model you want to configure and open Advanced Params.
3. Click + Add Custom Parameter.
4. Set the fields as follows:
   - Parameter name: `chat_template_kwargs`
   - Value: `{"reasoning_effort": "high"}`
     (Replace `"high"` with `"medium"` or `"low"` depending on your needs.)
5. Save the changes.
Once saved, OpenWebUI will include this parameter in requests sent to your llama.cpp OpenAI‑compatible endpoint, applying the configuration consistently without requiring users to adjust prompts manually.
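If you want to reproduce outside OpenWebUI what it sends on your behalf, you can build the same request yourself. The sketch below constructs the payload and prints it; the endpoint URL, model name, and the assumption that your llama.cpp server accepts a top‑level `chat_template_kwargs` field are all specific to your deployment:

```python
import json

# Hypothetical endpoint; llama.cpp's llama-server listens on
# http://localhost:8080 by default (adjust host/port for your setup).
LLAMA_SERVER = "http://localhost:8080/v1/chat/completions"

def build_request(prompt: str, effort: str = "high") -> dict:
    """Build an OpenAI-compatible chat request that passes
    reasoning_effort through chat_template_kwargs, mirroring what
    OpenWebUI sends once the custom parameter is configured."""
    assert effort in ("low", "medium", "high")
    return {
        "model": "local-model",  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
        "chat_template_kwargs": {"reasoning_effort": effort},
    }

payload = build_request("Explain quicksort briefly.", effort="medium")
print(json.dumps(payload, indent=2))

# To actually send it, you could use the standard library:
#   import urllib.request
#   req = urllib.request.Request(
#       LLAMA_SERVER,
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```

This is also a quick way to verify, independently of the UI, whether changing `reasoning_effort` actually alters your model's output.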
Reasoning Levels
| Level | Description |
|---|---|
| `low` | Faster responses and less deep multi‑step reasoning. |
| `medium` | Balanced performance and reasoning depth. |
| `high` | More thorough reasoning, often slower and more verbose internally. |
Practical tip: Start with `medium` and move to `high` only when you truly need deeper reasoning for complex tasks.