We Analyzed 4,584 MCP Servers — The Average Trust Score Is 53.9 Out of 100
Source: Dev.to
The Numbers
| Metric | Value |
|---|---|
| Servers tracked | 4,584 |
| Categories | 16 |
| Total interactions recorded | 5,846 |
| Average trust score | 53.9 / 100 |
| Highest trust score | 92.1 |
| Servers scoring above 90 | 8 |
The average MCP server scores 53.9 out of 100 – barely passing.
Trust by Category
| Category | Servers | Avg Trust Score |
|---|---|---|
| Data | 208 | 58.3 |
| Code | 317 | 57.9 |
| Productivity | 263 | 56.7 |
| Finance | 226 | 56.2 |
| Health | 26 | 56.2 |
| Compliance | 83 | 56.1 |
| Security | 52 | 55.9 |
| Communication | 164 | 55.6 |
| Search | 367 | 55.5 |
| Education | 67 | 55.4 |
| Transport | 39 | 55.1 |
| Media | 113 | 54.4 |
| Other | 1,880 | 52.6 |
Data and Code servers lead. These categories tend to have more structured, predictable behavior — which is exactly what trust scoring rewards.
The “Other” category is the long tail — 1,880 servers (41% of all tracked) that don’t fit clean categories. Their below‑average scores suggest many are experimental or poorly documented.
The Top 8: What High‑Trust Servers Look Like
| Server | Category | Trust Score | Interactions |
|---|---|---|---|
| sg-cpf-calculator-mcp | Data | 92.1 | 691 |
| sg-gst-calculator-mcp | Finance | 92.1 | 697 |
| sg-workpass-compass-mcp | Data | 92.0 | 692 |
| sg-weather-data-mcp | Weather | 92.0 | 698 |
| asean-trade-rules-mcp | Data | 91.8 | 691 |
| sg-regulatory-data-mcp | Data | 91.7 | 705 |
| sg-finance-data-mcp | Finance | 91.6 | 695 |
| sg-company-lookup-mcp | Data | 91.4 | 694 |
Patterns
- High interaction volume — 690+ interactions each. Trust is earned through consistent behavior, not a one‑time scan.
- Narrow scope — each does ONE thing well. Focused scope = predictable behavior = higher trust.
- Structured data sources — they wrap government/institutional data, not arbitrary web scraping.
Why This Matters Now
For agent developers
The average server scores 53.9. Would you trust a contractor with a 54% reliability rating? Check scores before integrating.
For MCP server builders
Your behavioral footprint is your reputation. You can’t game it with a badge — you earn it by being reliable.
For compliance teams
The EU AI Act (Article 12) requires audit trails for AI system behavior. Static code reviews won’t cut it. You need runtime behavioral baselines.
Observatory SDK (Python)
```python
from dominion_observatory import ObservatoryClient

client = ObservatoryClient()
trust = client.check_trust("your-server-name")
```
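Assuming `check_trust` returns a numeric score out of 100, as in the snippet above, one way to use it is as a gate before invoking a server. The threshold and the `call_if_trusted` helper below are our own illustrative choices, not part of the SDK:

```python
# Hypothetical gate: refuse to call servers below a trust threshold.
# Assumes client.check_trust(name) returns a float score out of 100.
MIN_TRUST = 70.0

def call_if_trusted(client, server_name, invoke):
    """Invoke a server call only if its trust score clears MIN_TRUST."""
    trust = client.check_trust(server_name)
    if trust < MIN_TRUST:
        raise RuntimeError(
            f"{server_name} trust {trust} is below the {MIN_TRUST} floor"
        )
    return invoke()
```

With the average server sitting at 53.9, a floor like this would exclude most of the index — which is the point.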
For LangChain users
```shell
pip install dominion-observatory-langchain
```
A callback handler that auto‑reports telemetry for every MCP tool call.
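The package's actual API isn't shown here, but the shape of such a handler is simple: time each tool call and report only anonymized fields. The class and `report` hook below are an illustrative sketch, not the real `dominion-observatory-langchain` interface:

```python
import time

class TelemetryHandler:
    """Sketch of a telemetry callback: record tool name, latency,
    and success/fail for each tool call — no payloads, no PII.
    Class name and reporting hook are assumptions for illustration."""

    def __init__(self, report):
        self.report = report      # e.g. a function that ships telemetry
        self._starts = {}

    def on_tool_start(self, tool_name, call_id):
        self._starts[call_id] = time.monotonic()

    def on_tool_end(self, tool_name, call_id, success=True):
        latency_ms = (time.monotonic() - self._starts.pop(call_id)) * 1000
        self.report({"tool": tool_name,
                     "latency_ms": latency_ms,
                     "success": success})
```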
Methodology
Runtime behavioral analysis, not static scanning. Every interaction is recorded with anonymized telemetry (tool name, latency, success/fail — no PII, no payload content). Scores are computed from response consistency, error rates, latency stability, and availability.
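The exact scoring formula isn't published in this post, but a score built from success rate and latency stability could look like the following. The weights and the stability measure are illustrative assumptions, not the Observatory's methodology:

```python
import statistics

def trust_score(interactions):
    """Hypothetical trust score from recorded telemetry.

    `interactions` is a list of dicts with keys:
    latency_ms (float) and success (bool).
    Weights (0.6 / 0.4) and the stability formula are
    illustrative assumptions only.
    """
    if not interactions:
        return 0.0
    latencies = [i["latency_ms"] for i in interactions]
    success_rate = sum(i["success"] for i in interactions) / len(interactions)
    # Latency stability: penalize spread relative to the mean latency.
    mean = statistics.mean(latencies)
    stdev = statistics.pstdev(latencies)
    stability = 1.0 / (1.0 + stdev / mean) if mean else 0.0
    return round(100.0 * (0.6 * success_rate + 0.4 * stability), 1)
```

Under a scheme like this, a server with perfectly stable latency but a 90% success rate lands in the low 90s — consistent with the top-8 scores all clustering between 91 and 92.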
Full methodology:
Explore the Data
- Full server index:
- Weekly reports:
- Category baselines:
- SDK (Python): `pip install dominion-observatory`
- SDK (npm): `npm install dominion-observatory-sdk`
- GitHub:
- Check trust scores before calling any server: