Reducing False Positives in WAF: Combining OWASP Rules with AI Context

The Problem
Not every API request carries the same weight.
- A product‑catalog service handles tens of thousands of requests per second and cannot tolerate 100 ms latency.
- A highly‑sensitive admin panel processes ≈10 requests/s and can easily tolerate that latency.
Most Web Application Firewalls (WAFs) – ModSecurity, Cloudflare, Fastly – apply a single default rule set across the entire application. Although they support custom rules, they don’t provide per‑route security profiles without a lot of manual configuration.
Speed vs. Accuracy
- Rule‑based WAFs detect SQLi/XSS attacks in microseconds but generate many false positives because they lack context.
- LLM‑based security understands context but adds unacceptable latency to each request.
Solution: a hybrid approach that blends deterministic pattern matching with probabilistic AI.
Argus – A Hybrid Open‑Source WAF (Go)
Argus combines:
- Coraza (OWASP‑compatible rule engine) for fast pattern matching.
- Gemini LLM for contextual analysis.
It offers three risk profiles so developers can pick the right balance of latency and context per route.
Core Challenges
- How do we merge speed and context awareness without forcing a binary choice?
- How does the WAF degrade when external dependencies fail?
- How can we make adoption painless in production?
Merging Speed and Context
Deterministic regex‑based rules catch the majority of attacks. Adding AI context eliminates the false positives where legitimate requests merely look suspicious (e.g., a tutorial page that mentions DROP TABLE).
Because each endpoint has a different latency and security budget, Argus lets developers choose the mode per route instead of a global setting.
The Solution: Three Modes
| Mode | Behaviour |
|---|---|
| Latency First | Coraza decides allow / block; no AI involvement. |
| Paranoid | Every request is validated by Gemini after Coraza’s check. |
| Smart Shield | Only when Coraza blocks a request does Gemini run to eliminate false positives. |
Each request’s Gemini verdict is logged to a database for later admin analysis.
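To make the three modes concrete, here is a minimal sketch of the dispatch logic in Go. Note that corazaCheck and geminiVerdict are illustrative stubs, not Argus's actual internals:

```go
package main

import (
	"fmt"
	"net/http"
)

// Mode mirrors the three risk profiles described above.
type Mode int

const (
	LatencyFirst Mode = iota
	SmartShield
	Paranoid
)

func corazaCheck(r *http.Request) (blocked bool) { return false } // fast OWASP rule match (stub)
func geminiVerdict(r *http.Request) (allow bool) { return true }  // slow contextual LLM check (stub)

// decide reports whether the request should be allowed through.
func decide(mode Mode, r *http.Request) bool {
	blocked := corazaCheck(r)

	switch mode {
	case LatencyFirst:
		return !blocked // Coraza alone decides; no AI call ever
	case Paranoid:
		if blocked {
			return false // rules already rejected it
		}
		return geminiVerdict(r) // Gemini validates every allowed request
	case SmartShield:
		if !blocked {
			return true // clean requests skip the LLM entirely
		}
		return geminiVerdict(r) // Gemini re-checks blocks to rescue false positives
	}
	return !blocked
}

func main() {
	req, _ := http.NewRequest("GET", "/api/products", nil)
	fmt.Println(decide(SmartShield, req)) // true: clean request, zero AI latency
}
```

The key property of Smart Shield is that the LLM only sits on the path of requests the rules already flagged, so clean traffic pays no AI latency at all.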

Resilience: What Happens When AI Fails?
Argus depends on Gemini’s API for Smart Shield and Paranoid modes. If Google’s service experiences an outage:
- OWASP rules continue to block obvious threats (SQLi, XSS, etc.).
- The AI layer temporarily drops out, but the core protection remains.
Circuit Breaker
A 3‑state circuit breaker ensures graceful degradation:
| State | Condition | Action |
|---|---|---|
| Closed (Normal) | Gemini responding | Requests flow through selected mode. |
| Open | 3 consecutive Gemini failures | Stay open for 30 s; all modes fall back to Coraza only. |
| Half‑Open | After 30 s timeout | Send a single test request to Gemini: success → Closed; failure → Open for another 30 s. |
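Here is an illustrative implementation of that state machine in Go, using sync/atomic. This is a sketch of the behaviour in the table with the thresholds above (3 consecutive failures, 30 s cooldown), not Argus's actual code:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// Breaker states, matching the table above.
const (
	stateClosed int32 = iota
	stateOpen
	stateHalfOpen
)

// Breaker is an illustrative 3-state circuit breaker.
type Breaker struct {
	state    atomic.Int32
	failures atomic.Int32
	openedAt atomic.Int64 // unix nanoseconds when the breaker last opened
}

// Allow reports whether a Gemini call should be attempted right now.
func (b *Breaker) Allow() bool {
	switch b.state.Load() {
	case stateClosed:
		return true
	case stateOpen:
		if time.Since(time.Unix(0, b.openedAt.Load())) >= 30*time.Second {
			// Cooldown elapsed: exactly one caller wins the probe slot.
			return b.state.CompareAndSwap(stateOpen, stateHalfOpen)
		}
		return false // still open: fall back to Coraza only
	default: // half-open: a probe is already in flight
		return false
	}
}

// Record feeds the outcome of a Gemini call back into the breaker.
func (b *Breaker) Record(success bool) {
	if success {
		b.failures.Store(0)
		b.state.Store(stateClosed)
		return
	}
	if b.state.Load() == stateHalfOpen || b.failures.Add(1) >= 3 {
		b.openedAt.Store(time.Now().UnixNano())
		b.state.Store(stateOpen)
	}
}

func main() {
	var b Breaker
	for i := 0; i < 3; i++ {
		b.Record(false) // three consecutive failures...
	}
	fmt.Println(b.Allow()) // ...so the breaker is open: false for the next 30 s
}
```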

Drop‑in Adoption in Production
Argus hides its internal complexity behind two simple integration paths.
1. Go SDK
For native Go applications, Argus ships as middleware that wraps your http.Handler:
```go
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/priyansh-dimri/argus"
)

func main() {
	// Local Coraza-backed rule engine.
	waf, err := argus.NewWAF()
	if err != nil {
		log.Fatal(err)
	}

	// Client for the hosted AI-analysis API.
	client := argus.NewClient(
		"https://argus-5qai.onrender.com",
		"api-key",
		20*time.Second,
	)

	cfg := argus.Config{
		Mode: argus.SmartShield, // or argus.LatencyFirst / argus.Paranoid
	}

	// Placeholder handler; replace with your real application handler.
	yourHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})

	shield := argus.NewMiddleware(client, waf, cfg)
	http.Handle("/api/", shield.Protect(yourHandler))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```
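Because the mode lives in the middleware config rather than a global setting, per‑route profiles fall out naturally. Continuing the example above inside main (catalogHandler and adminHandler are hypothetical stand‑ins):

```go
// Per-route profiles via separate middleware instances.
catalogShield := argus.NewMiddleware(client, waf, argus.Config{Mode: argus.LatencyFirst})
adminShield := argus.NewMiddleware(client, waf, argus.Config{Mode: argus.Paranoid})

http.Handle("/products/", catalogShield.Protect(catalogHandler)) // hot path: rules only
http.Handle("/admin/", adminShield.Protect(adminHandler))        // sensitive: full AI review
```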
2. Docker Sidecar
For services written in Node, Python, Ruby, PHP, etc., run Argus as a lightweight sidecar reverse proxy. No code changes are required.
```bash
docker run -d \
  --name argus-sidecar \
  -p 8000:8000 \
  -e TARGET_URL=http://host.docker.internal:3000 \
  -e ARGUS_API_KEY=api-key \
  -e ARGUS_API_URL=https://argus-5qai.onrender.com/ \
  ghcr.io/priyansh-dimri/argus-sidecar:latest
```
All traffic is routed through the sidecar before reaching your application.
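If you deploy with Compose, the equivalent might look like this (a sketch: the service names and the app's port 3000 are assumptions for illustration):

```yaml
# Hypothetical docker-compose.yml mirroring the `docker run` above.
services:
  app:
    image: your-app:latest          # your Node/Python/Ruby/PHP service
    expose:
      - "3000"

  argus-sidecar:
    image: ghcr.io/priyansh-dimri/argus-sidecar:latest
    ports:
      - "8000:8000"                 # clients hit the sidecar, never the app directly
    environment:
      TARGET_URL: http://app:3000
      ARGUS_API_KEY: api-key
      ARGUS_API_URL: https://argus-5qai.onrender.com/
```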
Optimizing the Hot Path
Because the middleware sits on the critical path of every request, performance is paramount. Argus:
- Caches recent Gemini responses where appropriate.
- Minimizes allocations in the Coraza path.
- Runs the circuit‑breaker logic with lock‑free atomic operations.
These optimizations keep the added latency well within the tolerances of latency‑sensitive services while still delivering the contextual safety net of an LLM.
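As a sketch of the first point, a minimal verdict cache keyed on a request fingerprint might look like this. This is an assumption about the approach; Argus's actual caching strategy (TTLs, eviction, key shape) may differ:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"net/http"
	"sync"
)

// verdictCache memoizes recent Gemini verdicts so identical suspicious
// requests don't trigger repeated LLM calls. Illustrative sketch only.
type verdictCache struct {
	mu sync.RWMutex
	m  map[[32]byte]bool
}

func newVerdictCache() *verdictCache {
	return &verdictCache{m: make(map[[32]byte]bool)}
}

// key fingerprints the parts of the request the verdict depends on.
// A production key would also need to cover headers and body.
func key(r *http.Request) [32]byte {
	return sha256.Sum256([]byte(r.Method + " " + r.URL.RequestURI()))
}

func (c *verdictCache) get(r *http.Request) (allow, ok bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	allow, ok = c.m[key(r)]
	return
}

func (c *verdictCache) put(r *http.Request, allow bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[key(r)] = allow
}

func main() {
	c := newVerdictCache()
	req, _ := http.NewRequest("POST", "/api/search?q=drop+table", nil)
	c.put(req, true) // Gemini judged this a false positive: allow it
	allow, ok := c.get(req)
	fmt.Println(allow, ok) // true true: an identical request now skips the LLM
}
```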
During development we focused on optimizing this hot path, achieving:
- **262 µs** processing time for clean requests.
- **151 ns** overhead for the circuit breaker (atomic state checks instead of mutexes).
- **56 %** parallel efficiency when scaling up to 4 cores.
You can check out the source code and contribute at [github.com/priyansh-dimri/argus/](https://github.com/priyansh-dimri/argus/).
Happy hacking!