How We Built a GA4-Compatible Analytics Pipeline to Escape US Tech Lock-in
Source: Dev.to

Google Analytics is everywhere. It’s also a deal‑breaker for a growing number of teams.
Under GDPR and the post‑Schrems II landscape, sending EU visitor data to Google’s US infrastructure is legally murky at best. For healthcare organizations under HIPAA or government sites under FedRAMP, it’s a non‑starter.
The usual answer is to switch to a privacy‑friendly alternative. The problem: most of them require you to throw away your existing tracking plan and start over. If you’ve invested in a GA4 setup — event taxonomy, GTM configuration, custom dimensions — that’s a real switching cost.
One hard requirement
We built d8a around a single constraint: it had to speak GA4’s protocol natively. Same /g/collect endpoint, same parameters. If you’re already sending data to Google, you’re already sending it in the right format for d8a. No rewrites, no migration weekend.
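To make the "same endpoint, same parameters" claim concrete, here is a minimal sketch of what a GA4 web hit looks like on the wire and how a compatible backend can pick it apart. The field names (`tid`, `cid`, `sid`, `en`, `dl`) are the ones gtag.js actually sends to `/g/collect`; the parsing code itself is illustrative, not d8a's implementation.

```python
from urllib.parse import urlsplit, parse_qs

# gtag.js sends the payload as query parameters on /g/collect:
#   v   - protocol version (2 for GA4)
#   tid - measurement ID (G-XXXXXXX)
#   cid - client ID
#   sid - session ID
#   en  - event name
#   dl  - document location (page URL)
def parse_ga4_hit(url: str) -> dict:
    parts = urlsplit(url)
    if parts.path != "/g/collect":
        raise ValueError(f"not a GA4 collect request: {parts.path}")
    q = parse_qs(parts.query)
    return {
        "measurement_id": q["tid"][0],
        "client_id": q["cid"][0],
        "session_id": q.get("sid", [None])[0],
        "event_name": q.get("en", [None])[0],
        "page_location": q.get("dl", [None])[0],
    }

hit = parse_ga4_hit(
    "https://example.com/g/collect?v=2&tid=G-ABC123&cid=555.666"
    "&sid=1700000000&en=page_view&dl=https%3A%2F%2Fexample.com%2F"
)
```

Point your existing tag at a d8a host instead of Google's and the payload above is already in the right shape.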
How it’s put together
The pipeline has three moving parts:
- a tracking component that turns HTTP requests into hits,
- a queue that buffers them, and
- a processing component that closes sessions and writes to your warehouse.
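The three stages can be sketched in a few lines of in-process Python (not d8a's code, just the shape of the data flow): a tracker normalizes requests into hits, a queue buffers them, and a processor drains the queue into warehouse rows.

```python
from queue import SimpleQueue

# Tracker stage: turn a raw request (here a dict of GA4-style
# parameters) into a normalized hit.
def tracker(request: dict) -> dict:
    return {"client_id": request["cid"], "event": request["en"]}

# Processing stage: drain buffered hits and write them to the
# "warehouse" (a plain list in this sketch).
def processor(q: SimpleQueue, warehouse: list) -> None:
    while not q.empty():
        warehouse.append(q.get())

q: SimpleQueue = SimpleQueue()
warehouse: list = []
for req in [{"cid": "1", "en": "page_view"}, {"cid": "1", "en": "click"}]:
    q.put(tracker(req))   # queue stage buffers between the two components
processor(q, warehouse)
```

In the real pipeline the queue in the middle is the pluggable transport described next, which is what lets the two components run on separate machines.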
Transport layer (pluggable)
- Default: filesystem‑based communication – cheap, simple, works on a single VPS.
- HA deployments: swap in object storage (e.g., S3/MinIO) to enable horizontal scaling.
- Low‑latency, high‑throughput: a RabbitMQ driver can be used, at the cost of additional maintenance or a larger cloud bill.
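A pluggable transport comes down to a small contract that every driver satisfies. The sketch below uses hypothetical names (`publish`/`consume` are not d8a's real interface) and shows the filesystem default: one JSON file per batch, deleted on consumption. An S3/MinIO or RabbitMQ driver would implement the same two methods.

```python
import json
import os
import tempfile
from typing import Iterator, Protocol

class Transport(Protocol):
    """Illustrative transport contract between tracker and processor."""
    def publish(self, batch: list[dict]) -> None: ...
    def consume(self) -> Iterator[list[dict]]: ...

class FilesystemTransport:
    """Default driver: batches as JSON files in a spool directory."""
    def __init__(self, directory: str) -> None:
        self.directory = directory

    def publish(self, batch: list[dict]) -> None:
        fd, _ = tempfile.mkstemp(dir=self.directory, suffix=".json")
        with os.fdopen(fd, "w") as f:
            json.dump(batch, f)

    def consume(self) -> Iterator[list[dict]]:
        for name in sorted(os.listdir(self.directory)):
            path = os.path.join(self.directory, name)
            with open(path) as f:
                yield json.load(f)
            os.remove(path)  # "ack" a batch by deleting its file

spool = tempfile.mkdtemp()
t = FilesystemTransport(spool)
t.publish([{"en": "page_view", "cid": "555.666"}])
batches = list(t.consume())
```

The design point is that the processor never knows which driver is behind the interface, so scaling from one VPS to object storage is a configuration change, not a code change.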
Session engine (pluggable KV interface)
- Session state (grouping hits by visitor, tracking inactivity windows) lives behind an interface with a full black‑box test suite.
- Default implementation: BoltDB – embedded, no external process, runs anywhere.
- Alternative backends: Redis, Cassandra, or any store that satisfies the same contract.
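"Behind an interface with a full black-box test suite" means any backend that passes the same assertions is a valid session store. A sketch of that idea, with illustrative method names (not d8a's actual contract) and an in-memory stand-in where the BoltDB, Redis, or Cassandra driver would go:

```python
from typing import Optional, Protocol

class SessionStore(Protocol):
    """Illustrative KV contract for session state."""
    def get(self, visitor_id: str) -> Optional[dict]: ...
    def put(self, visitor_id: str, session: dict) -> None: ...
    def delete(self, visitor_id: str) -> None: ...

class MemoryStore:
    """Toy backend; BoltDB/Redis/Cassandra drivers satisfy the same contract."""
    def __init__(self) -> None:
        self._data: dict[str, dict] = {}
    def get(self, visitor_id: str) -> Optional[dict]:
        return self._data.get(visitor_id)
    def put(self, visitor_id: str, session: dict) -> None:
        self._data[visitor_id] = session
    def delete(self, visitor_id: str) -> None:
        self._data.pop(visitor_id, None)

def check_store(store: SessionStore) -> None:
    # Black-box suite: the same assertions run against every backend,
    # so a new driver is correct as soon as this passes.
    store.put("v1", {"hits": 3, "last_seen": 1700000000})
    assert store.get("v1") == {"hits": 3, "last_seen": 1700000000}
    store.delete("v1")
    assert store.get("v1") is None

store = MemoryStore()
check_store(store)
```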
Tracking component (protocol‑agnostic)
- GA4 is the default, but the HTTP path‑to‑protocol mapping is an abstraction.
- Adding a new ingest protocol is just a matter of implementing an interface.
- Possible drop‑ins: Matomo, Amplitude, or any platform with a defined HTTP tracking format.
- Can act as a self‑hosted mirror for teams already on those platforms, intercepting existing tracking calls without touching the client.
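The path-to-protocol abstraction can be pictured as a registry: each ingest protocol claims an HTTP path and supplies a parser that normalizes its parameters into a common hit shape. The registry mechanics below are a sketch, not d8a's API; the endpoint paths and the Matomo parameter names (`_id` for visitor ID, `e_a` for event action) follow those platforms' documented tracking formats.

```python
from typing import Callable

# path -> parser registry; adding a protocol is one registration.
PARSERS: dict[str, Callable[[dict], dict]] = {}

def register(path: str):
    def wrap(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        PARSERS[path] = fn
        return fn
    return wrap

@register("/g/collect")  # GA4 web protocol (the default)
def parse_ga4(params: dict) -> dict:
    return {"event": params["en"], "client_id": params["cid"]}

@register("/matomo.php")  # Matomo's tracking endpoint
def parse_matomo(params: dict) -> dict:
    return {"event": params.get("e_a", "pageview"), "client_id": params["_id"]}

def ingest(path: str, params: dict) -> dict:
    """Dispatch an incoming request to whichever protocol owns the path."""
    return PARSERS[path](params)

hit = ingest("/g/collect", {"en": "click", "cid": "555.666"})
```

Because dispatch happens on the path, the same server can mirror several platforms at once, which is what makes the drop-in interception of existing tracking calls possible.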
Warehouse destinations
- ClickHouse (fully self‑hosted)
- BigQuery
- CSV files written to S3/MinIO, GCS, or local disk
The CSV route works with Snowflake Snowpipe, Redshift Spectrum, Databricks Auto Loader, and DuckDB – if you already have a warehouse, you can pipe into it.
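The CSV destination is deliberately boring, which is why every loader can consume it. A sketch of what a hit file looks like (the column names are illustrative, not d8a's actual schema):

```python
import csv
import io

# Write normalized hits as a plain CSV with a header row. A file like
# this dropped on S3/MinIO is directly loadable by Snowpipe, Redshift
# Spectrum, or Auto Loader, and queryable in DuckDB with e.g.:
#   SELECT event_name, count(*) FROM read_csv_auto('hits/*.csv') GROUP BY 1;
def write_hits_csv(hits: list[dict]) -> str:
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["client_id", "session_id", "event_name"]
    )
    writer.writeheader()
    writer.writerows(hits)
    return buf.getvalue()

csv_text = write_hits_csv([
    {"client_id": "555.666", "session_id": "1700000000", "event_name": "page_view"},
])
```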
Deployment: a single VPS is enough to get started. Our own cloud runs it on Kubernetes with the object storage transport between the tracking and processing components.
Get started
d8a is open source, MIT licensed.
- Getting started guide:
- GitHub repository:
- Free cloud instance (for now):