Verdict — When Policies Collide
Source: Dev.to
Algolia Agent Studio Challenge – Consumer‑Facing Non‑Conversational Experiences
Submission Overview
Most support tools simply fetch a policy and hand it to an agent. In reality, tickets often sit at the intersection of multiple, sometimes contradictory, policies.
Example: A treadmill motor dies after 6 weeks.
- 30‑day return window → No
- 2‑year motor warranty → Yes
Verdict is a decision engine that resolves these conflicts. An agent clicks a ticket, Verdict pulls the relevant policy clauses from Algolia, detects contradictions, and applies a resolution hierarchy:
- Product‑specific overrides general
- Situational overrides everything
The result is a structured verdict with full citations – no chatbot, no back‑and‑forth.
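The hierarchy can be sketched as a plain comparison. The snippet below is an illustration only, since in the demo the LLM performs this reasoning. It assumes each clause carries a numeric layer where a higher value means a more specific policy (for example 1 = general, 3 = product-specific, 4 = situational) plus a tie-breaking priority. The layer mapping, the priority values, and the `RET-1.1` ID are hypothetical.

```typescript
// Assumed shape: a higher `layer` means a more specific policy.
// The mapping 1 = general, 3 = product-specific, 4 = situational is hypothetical.
interface Clause {
  id: string;
  layer: number;
  priority: number;
  effect: "approve" | "deny";
}

// Pick the governing clause: highest layer wins, priority breaks ties.
function resolve(clauses: Clause[]): Clause | undefined {
  return clauses
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.layer - a.layer || b.priority - a.priority)[0];
}

// Treadmill example: the return window says no, the motor warranty says yes.
const treadmill: Clause[] = [
  { id: "RET-1.1", layer: 1, priority: 10, effect: "deny" }, // hypothetical ID
  { id: "WAR-3.1", layer: 3, priority: 10, effect: "approve" },
];
const governing = resolve(treadmill);
```

Here the product-specific warranty clause outranks the general return clause, which matches the treadmill outcome in the demo.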
Demo Highlights
| Scenario | Outcome | Note |
|---|---|---|
| Treadmill X500 (Warranty vs. Return) | ✅ APPROVED (green) | Motor warranty overrides expired return window. |
| SoundPro Earbuds (Hygiene Override) | ❌ DENIED (red) | Hygiene exception blocks return despite being within the 30‑day window. |
| Alpine Hiking Boots (Damage Override) | ✅ APPROVED (green) | Shipping‑damage situational override wins over expired return window. |
| TrailBlazer Daypack (Standard Return) | ✅ APPROVED (green) | No conflict – simple case. |
| Custom ticket | Live analysis | Paste any text and watch Verdict reason in real time. |
| Policy Index | Browse 26 records | Includes 3 red‑herring decoy policies that are correctly ignored. |
- Live URL: (link omitted)
- Demo video: (link omitted)
Policy Index Details
- Index name: apex_gear_policies
- Records: 26 individual policy clauses (one clause per record) for the fictional retailer Apex Gear.
- Metadata per record:
{
"policy_layer": 1‑4,
"priority_score": number,
"policy_type": "string",
"product_tags": ["string"],
"conditions": "string",
"effect": "string"
}
- Why clause‑level indexing? Conflict resolution needs to compare individual clauses, not whole documents. Clause‑sized records also stay well under Algolia’s 10 KB free‑tier limit.
- Red herrings (3 records):
  - Expired holiday return extension
  - Loyalty‑member perk
  - Bulk‑discount policy

The red herrings share product categories with the demo scenarios but should never be cited. The agent consistently ignores them.
Algolia Configuration
| Feature | Description |
|---|---|
| Custom ranking | desc(policy_layer), desc(priority_score), desc(specificity_score) – ensures the most authoritative clause appears first. |
| Index‑level Rules (3) | Promote critical override policies when trigger words appear. Example: a query containing “hygiene” bumps HYG‑4.1 (in‑ear audio hygiene exception) to position 1. |
| Synonym groups (6) | Expand vocabulary: • “defective” ↔ “broken”, “malfunction”, “stopped working” • “earbuds” ↔ “in‑ear audio”, “personal audio” |
| Result flow | Retrieval‑time features (ranking, rules, synonyms) shape what the LLM sees; reasoning‑time features (condition matching, explanation) are handled by the LLM. |
These layers work together: Algolia delivers an ordered, vocabulary‑normalized set of clauses; the LLM then reasons over them to produce the final verdict.
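For readers who prefer configuration as code, the dashboard settings in the table correspond roughly to the objects below. This is a sketch under assumptions: the `customRanking` key, the rule shape (`conditions`, `consequence.promote`), and the synonym shape are intended to follow Algolia's documented formats, but the rule and synonym `objectID` values are hypothetical, and the rule assumes the promoted record's `objectID` equals the clause ID HYG-4.1.

```typescript
// Index settings: custom ranking attributes in the order given in the table.
const indexSettings = {
  customRanking: [
    "desc(policy_layer)",
    "desc(priority_score)",
    "desc(specificity_score)",
  ],
};

// One of the three rules: when the query contains "hygiene",
// pin the hygiene exception to the first slot (promote positions are 0-based).
const hygieneRule = {
  objectID: "promote-hygiene-exception", // hypothetical rule ID
  conditions: [{ pattern: "hygiene", anchoring: "contains" }],
  consequence: {
    promote: [{ objectID: "HYG-4.1", position: 0 }], // assumes objectID == clause ID
  },
};

// Two of the six synonym groups, written as regular (two-way) synonyms.
const synonymGroups = [
  {
    objectID: "syn-defective", // hypothetical ID
    type: "synonym",
    synonyms: ["defective", "broken", "malfunction", "stopped working"],
  },
  {
    objectID: "syn-earbuds", // hypothetical ID
    type: "synonym",
    synonyms: ["earbuds", "in-ear audio", "personal audio"],
  },
];
```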
Agent Studio Loop
- System prompt defines a multi‑step protocol:
  - Extract key information from the ticket.
  - Perform three targeted Algolia searches (general return, product‑specific warranty, situational overrides).
  - Analyze all retrieved policies, detect conflicts, resolve using the hierarchy, and output a structured XML verdict.
- Dynamic search selection: the agent decides which queries to run based on ticket content.
  - Earbuds ticket: "hygiene earbuds in‑ear audio return"
  - Treadmill ticket: "Pro‑Treadmill X500 warranty"
  - Hiking boots ticket: "shipping damage carrier report override"
- Anti‑hallucination guard: every clause_id in the verdict must be copied verbatim from the search results. The agent cannot invent policies or ask the customer for more information; it must either decide or escalate.
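The post describes this guard as a prompt-level instruction. A complementary server-side check, which the text does not claim is part of the submission, could confirm that every cited ID really appeared in the search results. The function name and the `RET-1.1` and `WAR-9.9` IDs below are hypothetical.

```typescript
// Return the cited clause IDs that do not appear in any search result.
// An empty result means every citation is backed by a retrieved record.
function findUnknownCitations(
  citedIds: string[],
  retrievedIds: string[]
): string[] {
  const known = new Set(retrievedIds);
  return citedIds.filter((id) => !known.has(id));
}

const retrievedIds = ["RET-1.1", "WAR-3.1"]; // RET-1.1 is a hypothetical ID
const validCheck = findUnknownCitations(["WAR-3.1"], retrievedIds);
const invalidCheck = findUnknownCitations(["WAR-3.1", "WAR-9.9"], retrievedIds);
```

A non-empty result would be a reasonable trigger for escalating the ticket instead of showing the verdict.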
How to Explore
- Click a ticket in the demo UI → see the verdict card, policy comparison panel, and conflict trace.
- Hover over citations to view the exact clause text.
- Browse the Policy Index to see all 26 records, including the decoys.
Takeaway
Verdict demonstrates that non‑conversational, proactive decision making can be built on top of Algolia’s retrieval capabilities and LLM reasoning. The agent’s workflow is simple:
See ticket → Click → Read structured verdict → Act
No chat bubbles, no back‑and‑forth, just a reliable, explainable ruling powered by Algolia and LLMs.
Overview
Agent Studio’s /completions endpoint returns a Server‑Sent Events (SSE) stream. I built a custom SSE parser in the API route that captures every event in the agent’s reasoning chain – not just the final text output.
The parser:
- Correlates tool-input-start events (which carry the search query and a toolCallId) with tool-output-available events (which carry the actual Algolia hits for that toolCallId).
- Gives a full pipeline view:
  - What the agent searched for
  - What Algolia returned
  - Which records the LLM ultimately cited
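The correlation step might look like the sketch below. Only the two event type names and `toolCallId` come from the description above. The `query` and `hits` field names, the one-JSON-object-per-`data:`-line framing, and the `[DONE]` terminator in the sample are assumptions made for illustration.

```typescript
// Hypothetical event shape; `query` and `hits` are assumed field names.
interface ToolEvent {
  type: string;
  toolCallId?: string;
  query?: string;
  hits?: unknown[];
}

interface SearchStep {
  toolCallId: string;
  query: string;
  hits: unknown[];
}

// Parse raw SSE text, assuming each `data:` line holds one JSON object.
function parseSse(raw: string): ToolEvent[] {
  const events: ToolEvent[] = [];
  for (const line of raw.split(/\r?\n/)) {
    if (line.indexOf("data:") !== 0) continue;
    try {
      const parsed = JSON.parse(line.slice(5).trim());
      if (parsed !== null && typeof parsed === "object" && typeof parsed.type === "string") {
        events.push(parsed as ToolEvent);
      }
    } catch {
      // Skip payloads that are not JSON, such as terminators or keep-alives.
    }
  }
  return events;
}

// Pair each search input with its output via the shared toolCallId.
function correlate(events: ToolEvent[]): SearchStep[] {
  const steps: SearchStep[] = [];
  const byId = new Map<string, SearchStep>();
  for (const ev of events) {
    const id = ev.toolCallId;
    if (id === undefined) continue;
    if (ev.type === "tool-input-start") {
      const step: SearchStep = { toolCallId: id, query: ev.query ?? "", hits: [] };
      byId.set(id, step);
      steps.push(step);
    } else if (ev.type === "tool-output-available") {
      const step = byId.get(id);
      if (step !== undefined) step.hits = ev.hits ?? [];
    }
  }
  return steps;
}

const sampleStream = [
  'data: {"type":"tool-input-start","toolCallId":"call_1","query":"Pro-Treadmill X500 warranty"}',
  'data: {"type":"tool-output-available","toolCallId":"call_1","hits":[{"objectID":"WAR-3.1"}]}',
  "data: [DONE]",
].join("\n");
const trace = correlate(parseSse(sampleStream));
```

Keeping the steps in arrival order means the UI can render searches in the sequence the agent issued them.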
Front‑end Rendering
The UI renders this pipeline trace visibly:
- Each Algolia search step shows the query text, hit count, and the individual policy records returned.
- Records that end up cited in the final verdict receive a “Cited in verdict” badge, so you can see exactly which retrieved clauses influenced the decision.
If the LLM deviates from the expected format, the UI falls back to displaying the raw response with a warning banner – it never crashes.
System Prompt – XML‑Tagged Output
<verdict>
<status>APPROVED</status>
<type>warranty_claim</type>
<explanation>Motor warranty overrides expired return window...</explanation>
<clause>
<id>WAR-3.1</id>
<title>Pro-Treadmill Motor Warranty</title>
<valid>true</valid>
<outcome>warranty_approved</outcome>
<details>Motor failed within 2-year warranty period</details>
</clause>
<rule>
<id>WAR-3.1</id>
<description>Product-specific warranty overrides general return</description>
</rule>
</verdict>
- The front‑end parses this into typed components using a regex‑based tag extractor.
- Across 20+ test runs per scenario at temperature=0, the XML has been well‑formed every time.
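A regex-based extractor of this kind can be quite small. The following is a minimal sketch, not the submission's actual code: the function and type names are hypothetical, and it assumes tags carry no attributes and are not nested inside a tag of the same name. Returning `null` gives the caller a clear signal to fall back to the raw response.

```typescript
interface ParsedVerdict {
  status: string;
  type: string;
  explanation: string;
  clauseIds: string[];
}

// Return the trimmed text of the first <tag>...</tag> pair, or null if absent.
function extractTag(xml: string, tag: string): string | null {
  const match = new RegExp("<" + tag + ">([\\s\\S]*?)</" + tag + ">").exec(xml);
  const inner = match ? match[1] : undefined;
  return inner === undefined ? null : inner.trim();
}

// Return the trimmed text of every <tag>...</tag> pair, in document order.
function extractAll(xml: string, tag: string): string[] {
  const re = new RegExp("<" + tag + ">([\\s\\S]*?)</" + tag + ">", "g");
  const out: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(xml)) !== null) {
    const inner = m[1];
    if (inner !== undefined) out.push(inner.trim());
  }
  return out;
}

// Parse a verdict; null means "show the raw response with a warning".
function parseVerdict(raw: string): ParsedVerdict | null {
  const body = extractTag(raw, "verdict");
  if (body === null) return null;
  const status = extractTag(body, "status");
  const type = extractTag(body, "type");
  const explanation = extractTag(body, "explanation");
  if (status === null || type === null || explanation === null) return null;
  const clauseIds: string[] = [];
  for (const clause of extractAll(body, "clause")) {
    const id = extractTag(clause, "id");
    if (id !== null) clauseIds.push(id);
  }
  return { status, type, explanation, clauseIds };
}

const sampleXml =
  "<verdict><status>APPROVED</status><type>warranty_claim</type>" +
  "<explanation>Motor warranty overrides expired return window</explanation>" +
  "<clause><id>WAR-3.1</id><valid>true</valid></clause>" +
  "<rule><id>WAR-3.1</id></rule></verdict>";
const parsedVerdict = parseVerdict(sampleXml);
const fallback = parseVerdict("The model replied in plain prose.");
```

Clause IDs are read only from inside `<clause>` elements, so the `<id>` under `<rule>` is not mistaken for a citation.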
Algolia Features Shaping Retrieval
| Feature | What it does |
|---|---|
| Custom ranking | Every search result arrives pre‑sorted by policy authority. |
| Index‑level Rules | Promote critical override clauses to the top when trigger conditions are met. |
| Synonyms | Normalize vocabulary (e.g., “motor stopped working” → “mechanical defect”) so the LLM doesn’t need to guess. |
All of these are configured in the Algolia dashboard and work transparently through Agent Studio – no extra API code is required.
Why Not a Vector Database?
- A vector DB would retrieve “semantically similar” policies, which isn’t what we need for compliance data.
- When a customer reports a treadmill motor failure, we need the exact motor‑warranty clause for that product model (policy_layer:3, applies_to:Pro‑Treadmill X500), not a handful of loosely related fitness‑equipment policies ranked by embedding distance.
- Structured metadata with precise filtering is the correct retrieval model for this use‑case.
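To make the contrast concrete, an exact-match lookup can be expressed as an Algolia `filters` string. This is a hedged sketch: the post does not say whether the demo passes filters explicitly, the helper name is hypothetical, and it assumes `policy_layer` is stored as a number and `applies_to` is declared in `attributesForFaceting`.

```typescript
// Build a filter selecting clauses for one product at one policy layer.
// Numeric attributes use comparison syntax; string attributes use facet syntax.
function buildPolicyFilter(layer: number, product: string): string {
  if (Math.floor(layer) !== layer) {
    throw new Error("layer must be an integer");
  }
  if (product.indexOf('"') !== -1) {
    // Quote escaping is out of scope for this sketch.
    throw new Error("product name must not contain double quotes");
  }
  return "policy_layer = " + layer + ' AND applies_to:"' + product + '"';
}

// Search parameters for the treadmill scenario; the query text is illustrative.
const treadmillSearch = {
  query: "motor warranty",
  filters: buildPolicyFilter(3, "Pro-Treadmill X500"),
};
```

A filter like this either matches a record or it does not, which is the behavior compliance lookups need; an embedding search would instead return the nearest neighbors regardless of product model.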
Performance
- Total analysis time (shown in the UI after each verdict): 5‑10 seconds – almost entirely LLM reasoning.
- Algolia retrieval: < 50 ms across all three searches.
In a support workflow where agents triage dozens of tickets, this retrieval speed keeps the bottleneck on reasoning, not on waiting for data.