CI/CD Observability Powered by OpenTelemetry
Source: Dev.to
Modern Engineering Teams & CI/CD Observability
Modern engineering teams spend a lot of time and resources setting up monitoring of their production systems—tracking uptime, catching errors, and responding to incidents before customers ever notice.
But what about the journey before code reaches production? For most teams, observing the CI/CD pipeline is either an afterthought or overlooked entirely.
Key questions
- Do we truly understand how well our CI/CD process is functioning?
- Are we delivering value to users as quickly and reliably as possible?
What is CI/CD Observability?
CI/CD observability shines a light on the entire software development lifecycle, giving you the insights needed to optimize not just your code, but the whole development process. It moves teams beyond gut feelings and anecdotal evidence, enabling data‑driven decisions that accelerate delivery and improve development practices.
Video: “CI/CD Observability Powered by OpenTelemetry and SigNoz”
Why CI/CD Observability Matters
Your CI/CD pipeline is the heartbeat of your software organization. Every feature, bug‑fix, and improvement flows through it. Without observability you’re left guessing about critical questions such as:
- How often do our changes introduce production issues?
- How long does it take for a change to reach production?
- How quickly are pull requests reviewed and merged?
- Where are the slowdowns in our workflow?
- Which pipelines are reliable, and which are flaky?
- Who are our most active contributors, and how are they performing?
Without answers, bottlenecks go unnoticed, pipeline failures become recurring headaches, and engineering teams can’t measure the impact of their efforts. You can’t improve what you can’t see.
Standardized Telemetry with OpenTelemetry
- The CI/CD semantic conventions in OpenTelemetry (OTel) standardize telemetry collection from CI/CD systems.
- The `githubreceiver` component in the OTel Contrib repository makes it straightforward to ingest GitHub Actions telemetry.
- Combined with SigNoz, teams can visualize and analyze CI/CD telemetry within minutes.
Getting Started
1. Deploy SigNoz
Set up SigNoz by following the [installation guide] for your preferred environment (cloud or self‑hosted).
2. Configure the OpenTelemetry Collector
Add the `githubreceiver` component to your collector configuration.

```yaml
receivers:
  githubreceiver:
    # Example configuration
    endpoint: https://api.github.com
    token: ${GITHUB_TOKEN}
    repositories:
      - owner: your-org
        name: your-repo
```
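For the receiver to actually deliver data to SigNoz, it must be wired into a collector pipeline with an exporter. A minimal sketch of that wiring, assuming a SigNoz OTLP endpoint on `localhost:4317` (adjust the endpoint and TLS settings for your deployment):

```yaml
exporters:
  otlp:
    endpoint: localhost:4317  # assumed local SigNoz OTLP endpoint
    tls:
      insecure: true          # assumption: plain-text local traffic; use TLS in production

service:
  pipelines:
    metrics:
      receivers: [githubreceiver]
      exporters: [otlp]
```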
Refer to the [CI/CD GitHub Metrics documentation] for detailed, environment‑specific setup instructions.
3. Explore Your CI/CD Metrics in SigNoz
Once configured, use SigNoz dashboards to monitor builds, deployments, and pipeline health in real time.
Sample Insights
Average PR Open, Merge, and Approval Times
- Average time a Pull Request remains open: 1.07 years – many PRs are being ignored.
- Average merge time (for reviewed PRs): 1.8 days – the team is efficient once they engage.
- Average approval time: 5.07 days (longer than merge time) – some approved PRs aren’t merged promptly.
Action items
- Create an automated system to flag approved PRs awaiting merge.
- Establish SLAs for PR review timeframes.
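The first action item can be prototyped with a few lines of code. The sketch below assumes PR records have already been fetched (e.g., from the GitHub API) into dictionaries with hypothetical `approved_at` and `merged_at` fields; the field names and the one-day threshold are illustrative, not part of any real API:

```python
from datetime import datetime, timedelta, timezone

def stale_approved_prs(prs, max_wait=timedelta(days=1), now=None):
    """Return PRs that were approved but still not merged after max_wait.

    Each PR is a dict with hypothetical fields:
      number, approved_at (datetime or None), merged_at (datetime or None).
    """
    now = now or datetime.now(timezone.utc)
    return [
        pr for pr in prs
        if pr["approved_at"] is not None       # a review approval exists
        and pr["merged_at"] is None            # but the PR was never merged
        and now - pr["approved_at"] > max_wait # and it has been waiting too long
    ]
```

A scheduled job could run this against the day's PR data and post the flagged PR numbers to a team channel.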
DORA Metrics
| Metric | Description | How SigNoz Helps |
|---|---|---|
| Deployment Frequency | How often you deploy to production. | Filter CI/CD data for production pipelines (e.g., “release charts”) and count deployments over a chosen period. |
| Lead Time for Changes | Time from code change (merge/PR close) to production deployment. | Tracks both waiting time after merge and pipeline execution time. Example: PR merged in 1.8 days + 13 min pipeline → ~2 days lead time. |
| Change Failure Rate | Percentage of deployments that require a hotfix or rollback. | Detects deployments quickly followed by corrective actions (hotfixes). If none → 0 % failure rate. |
| Mean Time to Recovery (MTTR) | Time from a failed deployment to the next successful one that resolves the issue. | Example: Deployment fails, hotfix deployed 13 min later → MTTR = 13 min. |
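The MTTR row of the table can be computed mechanically from an ordered deployment log. A minimal sketch, assuming each deployment record is a timestamp plus a success flag (the data shape is an assumption, not a SigNoz API):

```python
from datetime import datetime, timedelta

def mean_time_to_recovery(deploys):
    """Average time from a failed deployment to the next successful one.

    deploys: chronologically ordered list of (timestamp, succeeded) pairs.
    Returns a timedelta, or None if no failure was ever recovered from.
    """
    recoveries = []
    failed_at = None
    for ts, ok in deploys:
        if not ok and failed_at is None:
            failed_at = ts                      # start of an outage window
        elif ok and failed_at is not None:
            recoveries.append(ts - failed_at)   # a success closes the window
            failed_at = None
    if not recoveries:
        return None
    return sum(recoveries, timedelta()) / len(recoveries)
```

Fed the table's example (a failure followed by a hotfix 13 minutes later), this returns a 13-minute MTTR.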
Alerts & Hotspots
- High Failure Rate Alert: 50 of 151 pipeline runs failed – indicates potential issues with test reliability or configuration.
- Performance Hotspots: The `build-staging` pipeline consistently took around 8 minutes (longer than expected).
Use SigNoz visualizations to pinpoint failing stages, set alerts, and drive remediation.
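The failure-rate alert above reduces to a simple ratio check. A sketch of that logic, with an assumed 25% alert threshold (pick a threshold that fits your team):

```python
def failure_rate(runs):
    """Fraction of pipeline runs that failed; runs is a list of booleans (True = success)."""
    if not runs:
        return 0.0
    return sum(1 for ok in runs if not ok) / len(runs)

def should_alert(runs, threshold=0.25):
    """Fire an alert when the failure rate exceeds the (assumed) threshold."""
    return failure_rate(runs) > threshold
```

For the numbers in the alert above, 50 failures out of 151 runs is a rate of about 33%, well past the example threshold.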
References
- OpenTelemetry CI/CD Semantic Conventions – https://opentelemetry.io/docs/specs/semconv/ci-cd/
- GitHub Receiver (OTel Contrib) – https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/githubreceiver
- SigNoz Documentation – https://signoz.io/docs/
Flakiness Identification
We detected a pattern of failures immediately followed by successes in the `integrationci` pipeline—a classic indicator of test flakiness that needed addressing.
Example of Flakiness Identification
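The fail-then-pass pattern described above is easy to detect from a run log. A sketch, assuming each run record carries a pipeline name, a commit SHA, and a success flag (an illustrative data shape, not a SigNoz API):

```python
def flaky_runs(runs):
    """Flag commits whose pipeline run failed and later passed unchanged.

    runs: chronologically ordered list of (pipeline, commit_sha, succeeded) tuples.
    Returns the set of (pipeline, commit_sha) pairs showing the fail-then-pass pattern.
    """
    failed = set()
    flaky = set()
    for pipeline, sha, ok in runs:
        key = (pipeline, sha)
        if not ok:
            failed.add(key)        # remember that this commit failed at least once
        elif key in failed:
            flaky.add(key)         # same commit now passes: likely a flaky test
    return flaky
```

Pairing this with a dashboard panel of the flagged pipelines makes recurring flakiness visible instead of anecdotal.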
As software delivery continues to accelerate, teams can no longer afford to treat CI/CD pipelines as mysterious black boxes. The combination of OpenTelemetry’s standardized telemetry collection and SigNoz’s powerful visualization capabilities provides a comprehensive solution to this challenge.
Implementing this observability stack grants unprecedented visibility into delivery pipelines and lays the groundwork for continuous improvement.
- Join the SigNoz community on Slack or GitHub, share your experience, and help us shape the future of observability.
- Your feedback directly drives what we build next.