Verdict — When Policies Collide

Published: (February 8, 2026 at 07:18 PM EST)
6 min read
Source: Dev.to

Algolia Agent Studio Challenge – Consumer‑Facing Non‑Conversational Experiences

Submission Overview

Most support tools simply fetch a policy and hand it to an agent. In reality, tickets often sit at the intersection of multiple, sometimes contradictory, policies.

Example: A treadmill motor dies after 6 weeks.

  • 30‑day return window → No
  • 2‑year motor warranty → Yes

Verdict is a decision engine that resolves these conflicts. When a support agent clicks a ticket, Verdict pulls the relevant policy clauses from Algolia, detects contradictions, and applies a resolution hierarchy:

  • Product‑specific overrides general
  • Situational overrides everything

The result is a structured verdict with full citations – no chatbot, no back‑and‑forth.
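As a rough illustration of the hierarchy above, the sketch below picks the winning clause by scope. The `Clause` type, the `PRECEDENCE` ranks, and the clause IDs are hypothetical names for this example; in Verdict itself the LLM performs this reasoning over retrieved clauses, so this is not the project's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical precedence ranks for illustration only; these are not the
# index's actual policy_layer values.
PRECEDENCE = {"general": 1, "product_specific": 2, "situational": 3}


@dataclass(frozen=True)
class Clause:
    clause_id: str  # illustrative ID, not a real index record
    scope: str      # one of the PRECEDENCE keys
    allows: bool    # True if the clause permits the customer's request


def resolve(clauses: list[Clause]) -> Clause:
    """Return the clause whose scope ranks highest in the hierarchy."""
    if not clauses:
        raise ValueError("no applicable clauses; escalate instead of deciding")
    return max(clauses, key=lambda c: PRECEDENCE[c.scope])


# The treadmill example: the general return window says no,
# the product-specific motor warranty says yes.
treadmill = [
    Clause("return-window", "general", allows=False),
    Clause("motor-warranty", "product_specific", allows=True),
]
winner = resolve(treadmill)
print(winner.clause_id, "APPROVED" if winner.allows else "DENIED")
```

A situational clause added to the same list would outrank both, which matches the "situational overrides everything" rule.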

Demo Highlights

| Scenario | Outcome | Note |
| --- | --- | --- |
| Treadmill X500 (Warranty vs. Return) | APPROVED (green) | Motor warranty overrides expired return window. |
| SoundPro Earbuds (Hygiene Override) | DENIED (red) | Hygiene exception blocks return despite being within the 30‑day window. |
| Alpine Hiking Boots (Damage Override) | APPROVED (green) | Shipping‑damage situational override wins over expired return window. |
| TrailBlazer Daypack (Standard Return) | APPROVED (green) | No conflict; simple case. |
| Custom ticket | Live analysis | Paste any text and watch Verdict reason in real time. |
| Policy Index | Browse 26 records | Includes 3 red‑herring decoy policies that are correctly ignored. |

  • Live URL: (link omitted)
  • Demo video: (link omitted)

Policy Index Details

  • Index name: apex_gear_policies
  • Records: 26 individual policy clauses (one clause per record) for the fictional retailer Apex Gear.
  • Metadata per record:
{
  "policy_layer": number,
  "priority_score": number,
  "policy_type": "string",
  "product_tags": ["string"],
  "conditions": "string",
  "effect": "string"
}
  • Why clause‑level indexing?
    Conflict resolution needs to compare individual clauses, not whole documents. Clause‑level records also stay well under Algolia’s 10 KB free‑tier limit.

  • Red herrings (3 records):

    1. Expired holiday return extension
    2. Loyalty‑member perk
    3. Bulk‑discount policy

    They share product categories with the demo scenarios but should never be cited. The agent consistently ignores them.

Algolia Configuration

| Feature | Description |
| --- | --- |
| Custom ranking | desc(policy_layer), desc(priority_score), desc(specificity_score): ensures the most authoritative clause appears first. |
| Index‑level Rules (3) | Promote critical override policies when trigger words appear. Example: a query containing “hygiene” bumps HYG‑4.1 (in‑ear audio hygiene exception) to position 1. |
| Synonym groups (6) | Expand vocabulary, e.g. “defective” ↔ “broken”, “malfunction”, “stopped working”; “earbuds” ↔ “in‑ear audio”, “personal audio”. |
| Result flow | Retrieval‑time features (ranking, rules, synonyms) shape what the LLM sees; reasoning‑time features (condition matching, explanation) are handled by the LLM. |

These layers work together: Algolia delivers an ordered, vocabulary‑normalized set of clauses; the LLM then reasons over them to produce the final verdict.
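For readers who prefer configuration as data, the three retrieval‑time features could be expressed roughly as the payloads below. This is a sketch, not the project's configuration (the post sets everything up in the dashboard). The `customRanking` setting and the synonym and rule shapes follow Algolia's documented formats, but the `objectID` values are made up, and the rule assumes the clause ID `HYG-4.1` doubles as the record's `objectID`.

```python
import json

# Index settings: attribute names are taken from the custom ranking above.
index_settings = {
    "customRanking": [
        "desc(policy_layer)",
        "desc(priority_score)",
        "desc(specificity_score)",
    ],
}

# Two-way synonym group: every listed term is treated as equivalent.
defect_synonyms = {
    "objectID": "syn-defective",  # hypothetical ID
    "type": "synonym",
    "synonyms": ["defective", "broken", "malfunction", "stopped working"],
}

# Rule: when the query contains "hygiene", pin one record to the top.
hygiene_rule = {
    "objectID": "rule-hygiene-override",  # hypothetical ID
    "conditions": [{"pattern": "hygiene", "anchoring": "contains"}],
    "consequence": {
        # Assumption: the clause ID is also the record's objectID.
        # Promoted positions are zero-based, so 0 is the first result.
        "promote": [{"objectID": "HYG-4.1", "position": 0}],
    },
}

print(json.dumps(hygiene_rule, indent=2))
```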

Agent Studio Loop

  1. System prompt defines a multi‑step protocol:

    • Extract key information from the ticket.
    • Perform three targeted Algolia searches (general return, product‑specific warranty, situational overrides).
    • Analyze all retrieved policies, detect conflicts, resolve using the hierarchy, and output a structured XML verdict.
  2. Dynamic search selection – the agent decides which queries to run based on ticket content.

    • Earbuds ticket: "hygiene earbuds in‑ear audio return"
    • Treadmill ticket: "Pro‑Treadmill X500 warranty"
    • Hiking boots ticket: "shipping damage carrier report override"
  3. Anti‑hallucination guard – every clause_id in the verdict must be copied verbatim from the search results. The agent cannot invent policies or ask the customer for more information; it must either decide or escalate.
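The post enforces the citation guard through the system prompt. A server‑side check could complement it; the sketch below is one possible shape, with `find_uncited_sources` as a hypothetical helper name and `WAR-9.9` as a deliberately invented ID standing in for a hallucinated clause.

```python
def find_uncited_sources(cited_ids, retrieved_ids):
    """Return cited clause IDs that never appeared in any search result."""
    retrieved = set(retrieved_ids)
    return [cid for cid in cited_ids if cid not in retrieved]


# IDs returned by the searches for this ticket (taken from the post's examples).
retrieved = ["WAR-3.1", "HYG-4.1"]

# A verdict that cites one real clause and one invented clause.
invented = find_uncited_sources(["WAR-3.1", "WAR-9.9"], retrieved)
if invented:
    print("Reject or escalate; unknown clause IDs:", invented)
```

Comparing exact strings mirrors the "copied verbatim" requirement: a near‑miss such as a different case or an extra space is treated as an invented citation.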

How to Explore

  • Click a ticket in the demo UI → see the verdict card, policy comparison panel, and conflict trace.
  • Hover over citations to view the exact clause text.
  • Browse the Policy Index to see all 26 records, including the decoys.

Takeaway

Verdict demonstrates that non‑conversational, proactive decision making can be built on top of Algolia’s retrieval capabilities and LLM reasoning. The agent’s workflow is simple:

See ticket → Click → Read structured verdict → Act

No chat bubbles, no back‑and‑forth, just a reliable, explainable ruling powered by Algolia and LLMs.

Overview

Agent Studio’s /completions endpoint returns a Server‑Sent Events (SSE) stream. I built a custom SSE parser in the API route that captures every event in the agent’s reasoning chain – not just the final text output.

The parser:

  • Correlates tool-input-start events (which carry the search query and a toolCallId) with tool-output-available events (which carry the actual Algolia hits for that toolCallId).
  • Gives a full pipeline view:
    1. What the agent searched for
    2. What Algolia returned
    3. Which records the LLM ultimately cited
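The correlation step might look like the sketch below. Only the event names and `toolCallId` come from the description above; the `type`, `query`, and `hits` field names, the `[DONE]` sentinel, and the sample stream are assumptions for illustration, and the real Agent Studio payloads may differ. The sketch also assumes each event fits on a single `data:` line.

```python
import json


def correlate_tool_calls(sse_text: str) -> list[dict]:
    """Pair each search query with its hits, keyed by toolCallId."""
    calls: dict[str, dict] = {}
    for line in sse_text.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore non-JSON payloads such as a "[DONE]" sentinel
        if not isinstance(event, dict) or "toolCallId" not in event:
            continue
        call_id = event["toolCallId"]
        entry = calls.setdefault(
            call_id, {"toolCallId": call_id, "query": None, "hits": []}
        )
        if event.get("type") == "tool-input-start":
            entry["query"] = event.get("query")  # assumed field name
        elif event.get("type") == "tool-output-available":
            entry["hits"] = event.get("hits", [])  # assumed field name
    return list(calls.values())


sample = "\n".join([
    'data: {"type": "tool-input-start", "toolCallId": "c1", "query": "Pro-Treadmill X500 warranty"}',
    "",
    'data: {"type": "tool-output-available", "toolCallId": "c1", "hits": [{"clause_id": "WAR-3.1"}]}',
    "",
    "data: [DONE]",
])
trace = correlate_tool_calls(sample)
print(trace)
```

Keying on `toolCallId` means the pairing still works if the events for several searches arrive interleaved.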

Front‑end Rendering

The UI renders this pipeline trace visibly:

  • Each Algolia search step shows the query text, hit count, and the individual policy records returned.
  • Records that end up cited in the final verdict receive a “Cited in verdict” badge, so you can see exactly which retrieved clauses influenced the decision.

If the LLM deviates from the expected format, the UI falls back to displaying the raw response with a warning banner – it never crashes.

System Prompt – XML‑Tagged Output

<verdict>
  <status>APPROVED</status>
  <type>warranty_claim</type>
  <explanation>Motor warranty overrides expired return window...</explanation>
  <clause>
    <id>WAR-3.1</id>
    <title>Pro-Treadmill Motor Warranty</title>
    <valid>true</valid>
    <outcome>warranty_approved</outcome>
    <details>Motor failed within 2-year warranty period</details>
  </clause>
  <rule>
    <id>WAR-3.1</id>
    <description>Product-specific warranty overrides general return</description>
  </rule>
</verdict>
  • The front‑end parses this into typed components using a regex‑based tag extractor.
  • Across 20+ test runs per scenario at temperature=0, the XML has been well‑formed every time.
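A regex‑based extractor with a raw‑text fallback could be sketched as follows. The function names and the returned dictionary shape are hypothetical, and the sketch is in Python for brevity; the actual front‑end code is not shown in the post.

```python
import re


def tag(text: str, name: str):
    """Return the trimmed body of the first <name>...</name> pair, or None."""
    match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
    return match.group(1).strip() if match else None


def parse_verdict(raw: str) -> dict:
    """Parse the tagged verdict; fall back to the raw text if it is malformed."""
    body = tag(raw, "verdict")
    status = tag(body, "status") if body else None
    if status is None:
        # The UI would show this raw text under a warning banner.
        return {"ok": False, "raw": raw}
    clauses = re.findall(r"<clause>(.*?)</clause>", body, re.DOTALL)
    return {
        "ok": True,
        "status": status,
        "type": tag(body, "type"),
        "explanation": tag(body, "explanation"),
        "clause_ids": [tag(clause, "id") for clause in clauses],
    }


sample = """<verdict>
  <status>APPROVED</status>
  <type>warranty_claim</type>
  <clause><id>WAR-3.1</id><valid>true</valid></clause>
  <rule><id>WAR-3.1</id></rule>
</verdict>"""
parsed = parse_verdict(sample)
fallback = parse_verdict("I could not reach a decision.")
print(parsed["status"], parsed["clause_ids"], fallback["ok"])
```

Returning a result object instead of raising keeps the rendering path simple: the caller checks `ok` and chooses between the verdict card and the warning banner.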

Algolia Features Shaping Retrieval

| Feature | What it does |
| --- | --- |
| Custom ranking | Every search result arrives pre‑sorted by policy authority. |
| Index‑level Rules | Promote critical override clauses to the top when trigger conditions are met. |
| Synonyms | Normalize vocabulary (e.g., “motor stopped working” → “mechanical defect”) so the LLM doesn’t need to guess. |

All of these are configured in the Algolia dashboard and work transparently through Agent Studio – no extra API code is required.

Why Not a Vector Database?

  • A vector DB would retrieve “semantically similar” policies, which isn’t what we need for compliance data.
  • When a customer reports a treadmill motor failure, we need the exact motor‑warranty clause for that product model (policy_layer:3, applies_to:Pro‑Treadmill X500), not a handful of loosely related fitness‑equipment policies ranked by embedding distance.
  • Structured metadata with precise filtering is the correct retrieval model for this use‑case.
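To make the contrast concrete, an exact‑match lookup can be expressed as an Algolia `filters` string. The helper below is a hypothetical sketch: it assumes `policy_layer` is stored as a number and that `applies_to` is declared in the index's `attributesForFaceting`, neither of which the post confirms.

```python
def build_filters(policy_layer: int, applies_to: str) -> str:
    """Build a filters string that matches one layer and one product exactly."""
    if '"' in applies_to:
        # Keep the sketch simple: quoted values would need escaping.
        raise ValueError("applies_to must not contain double quotes")
    # Numeric comparison for the layer, quoted facet match for the product.
    return f'policy_layer = {policy_layer} AND applies_to:"{applies_to}"'


filters = build_filters(3, "Pro-Treadmill X500")
print(filters)
```

A filter like this either matches a record or it does not; there is no similarity threshold to tune, which is the property the compliance use case relies on.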

Performance

  • Total analysis time (shown in the UI after each verdict): 5‑10 seconds – almost entirely LLM reasoning.
  • Algolia retrieval: < 50 ms across all three searches.

In a support workflow where agents triage dozens of tickets, this retrieval speed keeps the bottleneck on reasoning, not on waiting for data.
