We Analyzed 4,584 MCP Servers — The Average Trust Score Is 53.9 Out of 100

Published: April 16, 2026 at 08:49 PM EDT
3 min read
Source: Dev.to

The Numbers

| Metric | Value |
| --- | --- |
| Servers tracked | 4,584 |
| Categories | 16 |
| Total interactions recorded | 5,846 |
| Average trust score | 53.9 / 100 |
| Highest trust score | 92.1 |
| Servers scoring above 90 | 8 |

The average MCP server scores 53.9 out of 100 – barely passing.

Trust by Category

| Category | Servers | Avg Trust Score |
| --- | --- | --- |
| Data | 208 | 58.3 |
| Code | 317 | 57.9 |
| Productivity | 263 | 56.7 |
| Finance | 226 | 56.2 |
| Health | 26 | 56.2 |
| Compliance | 83 | 56.1 |
| Security | 52 | 55.9 |
| Communication | 164 | 55.6 |
| Search | 367 | 55.5 |
| Education | 67 | 55.4 |
| Transport | 39 | 55.1 |
| Media | 113 | 54.4 |
| Other | 1,880 | 52.6 |

Data and Code servers lead. These categories tend to have more structured, predictable behavior — which is exactly what trust scoring rewards.

The “Other” category is the long tail — 1,880 servers (41% of all tracked) that don’t fit clean categories. Their below‑average scores suggest many are experimental or poorly documented.

The Top 8: What High‑Trust Servers Look Like

| Server | Category | Trust Score | Interactions |
| --- | --- | --- | --- |
| sg-cpf-calculator-mcp | Data | 92.1 | 691 |
| sg-gst-calculator-mcp | Finance | 92.1 | 697 |
| sg-workpass-compass-mcp | Data | 92.0 | 692 |
| sg-weather-data-mcp | Weather | 92.0 | 698 |
| asean-trade-rules-mcp | Data | 91.8 | 691 |
| sg-regulatory-data-mcp | Data | 91.7 | 705 |
| sg-finance-data-mcp | Finance | 91.6 | 695 |
| sg-company-lookup-mcp | Data | 91.4 | 694 |

Patterns

  • High interaction volume — 690+ interactions each. Trust is earned through consistent behavior, not a one‑time scan.
  • Narrow scope — each does ONE thing well. Focused scope = predictable behavior = higher trust.
  • Structured data sources — they wrap government/institutional data, not arbitrary web scraping.

Why This Matters Now

For agent developers

The average server scores 53.9. Would you trust a contractor with a 54% reliability rating? Check scores before integrating.
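
The article only shows a single `check_trust` call, so here is a minimal sketch of the "check before integrating" idea using a plain score map as a stand-in for the Observatory index — the server names and scores below are illustrative, not real index data:

```python
# Hypothetical trust scores for candidate MCP servers (illustrative values,
# NOT taken from the Observatory index).
TRUST_SCORES = {
    "sg-cpf-calculator-mcp": 92.1,
    "random-scraper-mcp": 48.0,
    "notes-helper-mcp": 55.2,
}

# A threshold well above the 53.9 average reported in the article.
MIN_TRUST = 70.0

def integratable(scores: dict[str, float], min_trust: float) -> list[str]:
    """Return only the servers whose trust score clears the threshold."""
    return sorted(name for name, score in scores.items() if score >= min_trust)

print(integratable(TRUST_SCORES, MIN_TRUST))  # ['sg-cpf-calculator-mcp']
```

In practice the score map would be populated from whatever lookup the Observatory SDK exposes; the gating logic stays the same.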

For MCP server builders

Your behavioral footprint is your reputation. You can’t game it with a badge — you earn it by being reliable.

For compliance teams

The EU AI Act (Article 12) requires audit trails for AI system behavior. Static code reviews won’t cut it. You need runtime behavioral baselines.

Observatory SDK (Python)

```python
from dominion_observatory import ObservatoryClient

client = ObservatoryClient()
trust = client.check_trust("your-server-name")
```

For LangChain users

```shell
pip install dominion-observatory-langchain
```

A callback handler that auto‑reports telemetry for every MCP tool call.
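
The package's actual handler API isn't shown in the article, so here is a dependency-free sketch of the kind of telemetry such a callback would capture per tool call — tool name, latency, success/fail, no payload content (the `TelemetryRecorder` class and the GST lambda are illustrative stand-ins):

```python
import time

class TelemetryRecorder:
    """Stand-in for a telemetry callback: records tool name, latency,
    and success/fail for each tool call -- never the payload content."""

    def __init__(self):
        self.events = []

    def record(self, tool_name, fn, *args, **kwargs):
        """Run a tool function and append one telemetry event for the call."""
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
            ok = True
            return result
        except Exception:
            ok = False
            raise
        finally:
            self.events.append({
                "tool": tool_name,
                "latency_ms": (time.perf_counter() - start) * 1000,
                "success": ok,
            })

recorder = TelemetryRecorder()
# Illustrative tool: a 9% GST calculation wrapped as a callable.
recorder.record("gst_calculator", lambda amount: round(amount * 0.09, 2), 100.0)
print(recorder.events[0]["tool"], recorder.events[0]["success"])  # gst_calculator True
```

A real LangChain integration would hook the same bookkeeping into the framework's tool-start/tool-end callbacks instead of wrapping the call directly.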

Methodology

Runtime behavioral analysis, not static scanning. Every interaction is recorded with anonymized telemetry (tool name, latency, success/fail — no PII, no payload content). Scores are computed from response consistency, error rates, latency stability, and availability.
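
The actual weighting is not published, but the scoring idea can be sketched by combining a few of the factors the methodology names — error rate, latency stability, and availability. The weights and the coefficient-of-variation stability proxy below are assumptions for illustration only:

```python
from statistics import pstdev

def trust_score(successes, latencies_ms, uptime_fraction):
    """Illustrative composite trust score on a 0-100 scale.

    Combines success rate, latency stability, and availability with
    made-up weights; the real Observatory formula is not published.
    """
    success_rate = sum(successes) / len(successes)
    mean_latency = sum(latencies_ms) / len(latencies_ms)
    # Coefficient of variation as a latency-stability proxy (lower = steadier).
    cv = pstdev(latencies_ms) / mean_latency if mean_latency else 0.0
    stability = max(0.0, 1.0 - cv)
    score = 100 * (0.4 * success_rate + 0.3 * stability + 0.3 * uptime_fraction)
    return round(score, 1)

# One failed call and one latency spike drag the score well below 90.
print(trust_score([True, True, True, False], [120, 130, 125, 500], 0.99))
```

Note how a single latency spike hurts the score even when availability is high — consistency is rewarded, which matches why narrow-scope servers dominate the top 8.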

Full methodology:

Explore the Data

  • Full server index:
  • Weekly reports:
  • Category baselines:
  • SDK (Python): pip install dominion-observatory
  • SDK (npm): npm install dominion-observatory-sdk
  • GitHub:
  • Check trust scores before calling any server: