Reducing False Positives in WAF: Combining OWASP Rules with AI Context

The Problem
Not every API request carries the same weight.
- A product‑catalog service handles tens of thousands of requests per second and cannot tolerate 100 ms latency.
- A highly‑sensitive admin panel processes ≈10 requests/s and can easily tolerate that latency.
Most Web Application Firewalls (WAFs) – ModSecurity, Cloudflare, Fastly – apply a single default rule set across the entire application. Although they support custom rules, they don’t provide per‑route security profiles without a lot of manual configuration.
Speed vs. Accuracy
- Rule‑based WAFs detect SQLi/XSS attacks in microseconds but generate many false positives because they lack context.
- LLM‑based security understands context but adds unacceptable latency to each request.
Solution: a hybrid approach that blends deterministic pattern matching with probabilistic AI.
Argus – A Hybrid Open‑Source WAF (Go)
Argus combines:
- Coraza (OWASP‑compatible rule engine) for fast pattern matching.
- Gemini LLM for contextual analysis.
It offers three risk profiles so developers can pick the right balance of latency and context per route.
Core Challenges
- How do we merge speed and context awareness without forcing a binary choice?
- How does the WAF degrade when external dependencies fail?
- How can we make adoption painless in production?
Merging Speed and Context
Deterministic regex‑based rules catch the majority of attacks. Adding AI context eliminates the false positives where legitimate requests merely look suspicious (e.g., a tutorial page that mentions DROP TABLE).
Because each endpoint has a different latency and security budget, Argus lets developers choose the mode per route instead of a global setting.
The Solution: Three Modes
| Mode | Behaviour |
|---|---|
| Latency First | Coraza decides allow / block; no AI involvement. |
| Paranoid | Every request is validated by Gemini after Coraza’s check. |
| Smart Shield | Only when Coraza blocks a request does Gemini run to eliminate false positives. |
Each request’s Gemini verdict is logged to a database for later admin analysis.
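To make the three modes concrete, here is a minimal sketch of the dispatch logic in Go. Note that corazaCheck and geminiVerdict are illustrative stubs, not Argus's actual internals:

```go
package main

import (
	"fmt"
	"net/http"
)

// Mode mirrors the three risk profiles described above.
type Mode int

const (
	LatencyFirst Mode = iota
	SmartShield
	Paranoid
)

func corazaCheck(r *http.Request) (blocked bool) { return false } // fast OWASP rule match (stub)
func geminiVerdict(r *http.Request) (allow bool) { return true }  // slow contextual LLM check (stub)

// decide reports whether the request should be allowed through.
func decide(mode Mode, r *http.Request) bool {
	blocked := corazaCheck(r)

	switch mode {
	case LatencyFirst:
		return !blocked // Coraza alone decides; no AI call ever
	case Paranoid:
		if blocked {
			return false // rules already rejected it
		}
		return geminiVerdict(r) // Gemini validates every allowed request
	case SmartShield:
		if !blocked {
			return true // clean requests skip the LLM entirely
		}
		return geminiVerdict(r) // Gemini re-checks blocks to rescue false positives
	}
	return !blocked
}

func main() {
	req, _ := http.NewRequest("GET", "/api/products", nil)
	fmt.Println(decide(SmartShield, req)) // true: clean request, zero AI latency
}
```

The key property of Smart Shield is that the LLM only sits on the path of requests the rules already flagged, so clean traffic pays no AI latency at all.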

Resilience: What Happens When AI Fails?
Argus depends on Gemini’s API for Smart Shield and Paranoid modes. If Google’s service experiences an outage:
- OWASP rules continue to block obvious threats (SQLi, XSS, etc.).
- The AI layer temporarily drops out, but the core protection remains.
Circuit Breaker
A 3‑state circuit breaker ensures graceful degradation:
| State | Condition | Action |
|---|---|---|
| Closed (Normal) | Gemini responding | Requests flow through selected mode. |
| Open | 3 consecutive Gemini failures | Stay open for 30 s; all modes fall back to Coraza only. |
| Half‑Open | After 30 s timeout | Send a single test request to Gemini: success → Closed; failure → Open for another 30 s. |
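Here is an illustrative implementation of that state machine in Go, using sync/atomic. This is a sketch of the behaviour in the table with the thresholds above (3 consecutive failures, 30 s cooldown), not Argus's actual code:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// Breaker states, matching the table above.
const (
	stateClosed int32 = iota
	stateOpen
	stateHalfOpen
)

// Breaker is an illustrative 3-state circuit breaker.
type Breaker struct {
	state    atomic.Int32
	failures atomic.Int32
	openedAt atomic.Int64 // unix nanoseconds when the breaker last opened
}

// Allow reports whether a Gemini call should be attempted right now.
func (b *Breaker) Allow() bool {
	switch b.state.Load() {
	case stateClosed:
		return true
	case stateOpen:
		if time.Since(time.Unix(0, b.openedAt.Load())) >= 30*time.Second {
			// Cooldown elapsed: exactly one caller wins the probe slot.
			return b.state.CompareAndSwap(stateOpen, stateHalfOpen)
		}
		return false // still open: fall back to Coraza only
	default: // half-open: a probe is already in flight
		return false
	}
}

// Record feeds the outcome of a Gemini call back into the breaker.
func (b *Breaker) Record(success bool) {
	if success {
		b.failures.Store(0)
		b.state.Store(stateClosed)
		return
	}
	if b.state.Load() == stateHalfOpen || b.failures.Add(1) >= 3 {
		b.openedAt.Store(time.Now().UnixNano())
		b.state.Store(stateOpen)
	}
}

func main() {
	var b Breaker
	for i := 0; i < 3; i++ {
		b.Record(false) // three consecutive failures...
	}
	fmt.Println(b.Allow()) // ...so the breaker is open: false for the next 30 s
}
```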

Drop‑in Adoption in Production
Argus hides its internal complexity behind two simple integration paths.
1. Go SDK
For native Go applications, Argus ships as middleware that wraps your http.Handler:
```go
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/priyansh-dimri/argus"
)

func main() {
	// Local Coraza-backed rule engine.
	waf, err := argus.NewWAF()
	if err != nil {
		log.Fatal(err)
	}

	// Client for the hosted AI-analysis API.
	client := argus.NewClient(
		"https://argus-5qai.onrender.com",
		"api-key",
		20*time.Second,
	)

	cfg := argus.Config{
		Mode: argus.SmartShield, // or argus.LatencyFirst / argus.Paranoid
	}

	// Placeholder handler; replace with your real application handler.
	yourHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})

	shield := argus.NewMiddleware(client, waf, cfg)
	http.Handle("/api/", shield.Protect(yourHandler))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```
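Because the mode lives in the middleware config rather than a global setting, per‑route profiles fall out naturally. Continuing the example above inside main (catalogHandler and adminHandler are hypothetical stand‑ins):

```go
// Per-route profiles via separate middleware instances.
catalogShield := argus.NewMiddleware(client, waf, argus.Config{Mode: argus.LatencyFirst})
adminShield := argus.NewMiddleware(client, waf, argus.Config{Mode: argus.Paranoid})

http.Handle("/products/", catalogShield.Protect(catalogHandler)) // hot path: rules only
http.Handle("/admin/", adminShield.Protect(adminHandler))        // sensitive: full AI review
```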
2. Docker Sidecar
For services written in Node, Python, Ruby, PHP, etc., run Argus as a lightweight sidecar reverse proxy. No code changes are required.
```bash
docker run -d \
  --name argus-sidecar \
  -p 8000:8000 \
  -e TARGET_URL=http://host.docker.internal:3000 \
  -e ARGUS_API_KEY=api-key \
  -e ARGUS_API_URL=https://argus-5qai.onrender.com/ \
  ghcr.io/priyansh-dimri/argus-sidecar:latest
```
All traffic is routed through the sidecar before reaching your application.
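If you deploy with Compose, the equivalent might look like this (a sketch: the service names and the app's port 3000 are assumptions for illustration):

```yaml
# Hypothetical docker-compose.yml mirroring the `docker run` above.
services:
  app:
    image: your-app:latest          # your Node/Python/Ruby/PHP service
    expose:
      - "3000"

  argus-sidecar:
    image: ghcr.io/priyansh-dimri/argus-sidecar:latest
    ports:
      - "8000:8000"                 # clients hit the sidecar, never the app directly
    environment:
      TARGET_URL: http://app:3000
      ARGUS_API_KEY: api-key
      ARGUS_API_URL: https://argus-5qai.onrender.com/
```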
Optimizing the Hot Path
Because the middleware sits on the critical path of every request, performance is paramount. Argus:
- Caches recent Gemini responses where appropriate.
- Minimizes allocations in the Coraza path.
- Runs the circuit‑breaker logic with lock‑free atomic operations.
These optimizations keep the added latency well within the tolerances of latency‑sensitive services while still delivering the contextual safety net of an LLM.
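As a sketch of the first point, a minimal verdict cache keyed on a request fingerprint might look like this. This is an assumption about the approach; Argus's actual caching strategy (TTLs, eviction, key shape) may differ:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"net/http"
	"sync"
)

// verdictCache memoizes recent Gemini verdicts so identical suspicious
// requests don't trigger repeated LLM calls. Illustrative sketch only.
type verdictCache struct {
	mu sync.RWMutex
	m  map[[32]byte]bool
}

func newVerdictCache() *verdictCache {
	return &verdictCache{m: make(map[[32]byte]bool)}
}

// key fingerprints the parts of the request the verdict depends on.
// A production key would also need to cover headers and body.
func key(r *http.Request) [32]byte {
	return sha256.Sum256([]byte(r.Method + " " + r.URL.RequestURI()))
}

func (c *verdictCache) get(r *http.Request) (allow, ok bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	allow, ok = c.m[key(r)]
	return
}

func (c *verdictCache) put(r *http.Request, allow bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[key(r)] = allow
}

func main() {
	c := newVerdictCache()
	req, _ := http.NewRequest("POST", "/api/search?q=drop+table", nil)
	c.put(req, true) // Gemini judged this a false positive: allow it
	allow, ok := c.get(req)
	fmt.Println(allow, ok) // true true: an identical request now skips the LLM
}
```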
During development we focused on optimizing this hot path, achieving:
- **262 µs** processing time for clean requests.
- **151 ns** overhead for the circuit breaker (atomic state checks instead of mutexes).
- **56 %** parallel efficiency when scaling up to 4 cores.
You can check out the source code and contribute at [github.com/priyansh-dimri/argus/](https://github.com/priyansh-dimri/argus/).
Happy hacking!