Building Reliable Product Analytics From Scratch
Most product analytics implementations follow the same path. Install a JavaScript SDK. Add tracking calls to the frontend. Build dashboards. Trust the numbers.
Then the numbers start disagreeing with the payment system. Then the marketing team cannot reconcile campaign spend. Then the ML team discovers that 30% of users are invisible to the analytics system because of ad blockers.
Building reliable product analytics requires starting from the server, not the client.
Server-Side Event Emission
The most reliable event is one emitted by the server that processed the business logic. When the order service creates an order, it emits an order.created event. This event cannot be blocked by ad blockers, lost to client network errors, or duplicated by UI retries.
from datetime import datetime

class OrderService:
    def create_order(self, user_id, items, attribution):
        order = Order(
            id=generate_id(),
            user_id=user_id,
            items=items,
            total=self.calculate_total(items),
            attribution=attribution,
            created_at=datetime.utcnow(),
        )
        self.db.save(order)
        # Server-side event: reliable, complete, auditable
        self.events.emit("order.created", {
            "order_id": order.id,
            "user_id": user_id,
            "total": order.total,
            "item_count": len(items),
            "attribution": attribution,
            "timestamp": order.created_at.isoformat(),
        })
        return order
Client-side events still have value for behavioral signals - page views, scroll depth, click patterns, feature usage. But the business metrics that drive decisions (signups, revenue, conversions) should come from server-side events.
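Combining the two sources works best when every event, client- or server-emitted, shares a common envelope with a unique event ID the pipeline can later deduplicate on. A minimal sketch; the envelope fields here are illustrative, not a fixed schema:

```python
import uuid
from datetime import datetime, timezone

def make_event(event_type, payload, source):
    """Wrap a payload in a shared envelope. The event_id lets the
    pipeline deduplicate retries; source distinguishes client/server."""
    return {
        "event_id": str(uuid.uuid4()),
        "type": event_type,
        "source": source,  # "server" or "client"
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }

event = make_event("order.created", {"order_id": "o_123", "total": 49.99}, "server")
```

Both the order service above and the browser SDK would emit through a wrapper like this, so downstream consumers see one shape regardless of origin.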
UTM Persistence
UTM parameters carry marketing attribution data. They arrive on the first page load and disappear on the next navigation. Most analytics implementations lose this data because they only capture UTMs at the moment of the page view event, not at the moment of conversion.
The fix is to persist UTM parameters server-side at the earliest opportunity:
// Middleware: capture and persist UTMs on first visit
app.use((req, res, next) => {
  const utms = {
    source: req.query.utm_source as string || null,
    medium: req.query.utm_medium as string || null,
    campaign: req.query.utm_campaign as string || null,
    content: req.query.utm_content as string || null,
    term: req.query.utm_term as string || null,
  };
  const hasUtms = Object.values(utms).some(Boolean);
  if (hasUtms) {
    // Store in session - survives page navigations
    req.session.attribution = {
      ...utms,
      referrer: req.headers.referer || null,
      landing_page: req.path,
      captured_at: new Date().toISOString(),
    };
  }
  next();
});
// At signup: attach persisted attribution to user record
app.post("/api/signup", async (req, res) => {
  const user = await createUser({
    email: req.body.email,
    attribution: req.session.attribution || {
      source: "unknown",
    },
  });
  events.emit("user.signup_completed", {
    user_id: user.id,
    attribution: user.attribution,
  });
  res.status(201).json({ user_id: user.id });
});
UTMs captured at signup are imperfect - they miss users who arrive without UTMs and return later, or users who arrive via one campaign and convert via another. But they are significantly better than relying on client-side analytics, which loses UTMs entirely for returning visitors.
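One common mitigation for the multi-campaign case is to keep both first-touch and last-touch attribution in the session: the first captured UTMs are preserved for reporting, and the most recent campaign is recorded alongside them. A sketch under hypothetical names and shapes:

```python
def merge_attribution(session, incoming_utms):
    """Record first-touch attribution once and keep updating last-touch.
    The session dict and UTM shapes here are illustrative."""
    if incoming_utms:
        session.setdefault("first_touch", incoming_utms)  # never overwritten
        session["last_touch"] = incoming_utms             # always updated
    return session

s = {}
merge_attribution(s, {"source": "google", "campaign": "spring"})
merge_attribution(s, {"source": "newsletter", "campaign": "april"})
# s["first_touch"]["source"] == "google"; s["last_touch"]["source"] == "newsletter"
```

Which touch "counts" is then a reporting decision rather than a data-loss accident.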
The Event Pipeline
Events from both client and server sources flow through a shared pipeline: collection, validation, queuing, processing, and warehousing.
from datetime import datetime

class EventPipeline:
    def __init__(self, validator, queue, warehouse,
                 quarantine, dedup_store, metrics, version="1.0"):
        self.validator = validator
        self.queue = queue
        self.warehouse = warehouse
        self.quarantine = quarantine    # store for invalid events
        self.dedup_store = dedup_store  # tracks seen event_ids
        self.metrics = metrics
        self.version = version

    def ingest(self, event):
        # 1. Validate schema
        if not self.validator.validate(event):
            self.metrics.increment("events.invalid",
                tags={"type": event.get("type", "unknown")})
            self.quarantine.store(event)
            return
        # 2. Enrich with server metadata
        event["ingested_at"] = datetime.utcnow().isoformat()
        event["pipeline_version"] = self.version
        # 3. Queue for processing
        self.queue.produce(
            topic=event["type"],
            key=event.get("user_id", event.get("session_id")),
            value=event,
        )

    def process(self, event):
        # 4. Deduplicate
        if self.dedup_store.exists(event["event_id"]):
            self.metrics.increment("events.duplicate")
            return
        # 5. Transform and load
        transformed = self.transform(event)
        self.warehouse.insert(transformed)
        self.dedup_store.mark(event["event_id"])
Invalid events are quarantined, not dropped. Duplicates are counted and skipped. Every event is enriched with pipeline metadata for debugging.
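The validator the pipeline calls is not shown above. A minimal stand-in might check that required envelope fields exist and have the expected types; this is a sketch, and real systems typically validate against a per-event-type schema (e.g. JSON Schema from a registry):

```python
REQUIRED_FIELDS = {"event_id": str, "type": str, "timestamp": str}

class EventValidator:
    """Minimal stand-in for the pipeline's validator: checks that the
    required envelope fields are present with the expected types."""
    def validate(self, event):
        if not isinstance(event, dict):
            return False
        return all(
            field in event and isinstance(event[field], expected)
            for field, expected in REQUIRED_FIELDS.items()
        )

v = EventValidator()
v.validate({"event_id": "e1", "type": "order.created",
            "timestamp": "2024-01-01T00:00:00Z"})  # True
v.validate({"type": "order.created"})              # False: missing fields
```

The payoff of quarantining rather than dropping is that every False here leaves an inspectable record instead of a silent gap.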
Reconciliation Systems
The most important component of a reliable analytics system is the reconciliation job. It compares analytics output against the source of truth and publishes the gap.
-- Daily reconciliation: analytics vs reality
WITH analytics_orders AS (
    SELECT DATE(timestamp) AS day,
           COUNT(*) AS count,
           SUM(amount) AS revenue
    FROM analytics.events
    WHERE type = 'order.created'
    GROUP BY DATE(timestamp)
),
actual_orders AS (
    SELECT DATE(created_at) AS day,
           COUNT(*) AS count,
           SUM(total) AS revenue
    FROM orders
    GROUP BY DATE(created_at)
)
SELECT
    a.day,
    a.count AS actual_orders,
    COALESCE(e.count, 0) AS tracked_orders,
    a.count - COALESCE(e.count, 0) AS order_gap,
    a.revenue AS actual_revenue,
    COALESCE(e.revenue, 0) AS tracked_revenue,
    a.revenue - COALESCE(e.revenue, 0) AS revenue_gap,
    ROUND(100.0 * (a.revenue - COALESCE(e.revenue, 0))
          / NULLIF(a.revenue, 0), 1) AS revenue_gap_pct
FROM actual_orders a
LEFT JOIN analytics_orders e ON a.day = e.day
ORDER BY a.day DESC;
This query runs daily. The gap percentage is tracked as a metric and alerted on. When the gap grows, something changed in the pipeline. When the gap shrinks after a fix, the fix worked.
A healthy analytics system has a small, stable, explainable gap. A gap that fluctuates wildly indicates pipeline instability. A gap that grows steadily indicates data loss that nobody is catching.
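Alerting on the gap can be as simple as classifying the daily percentage against two thresholds: a warning level and a paging level. A sketch; the threshold values are placeholders to be tuned against the system's known, stable baseline gap:

```python
def gap_status(actual_revenue, tracked_revenue, warn_pct=2.0, page_pct=5.0):
    """Classify the daily reconciliation gap. Thresholds are
    illustrative defaults, not recommendations."""
    if actual_revenue == 0:
        return "ok"  # nothing to reconcile
    gap_pct = 100.0 * (actual_revenue - tracked_revenue) / actual_revenue
    if abs(gap_pct) >= page_pct:
        return "page"
    if abs(gap_pct) >= warn_pct:
        return "warn"
    return "ok"

gap_status(100.0, 99.0)  # "ok": 1% gap, within baseline
gap_status(100.0, 90.0)  # "page": 10% gap, likely data loss
```

Using the absolute value also catches the suspicious case where analytics reports more revenue than the orders table, which usually means duplicates slipped past deduplication.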
Putting It Together
A reliable analytics system has these properties:
- Server-side events for business metrics. Revenue, signups, and conversions come from the server, not the browser.
- Client-side events for behavioral data. Page views and feature usage come from the client, acknowledged as lossy.
- UTM persistence at session level. Attribution data is captured on arrival and attached to the user record at conversion.
- Schema validation at ingestion. Malformed events are quarantined, not dropped or silently corrupted.
- Deduplication at processing. Duplicate events from retries are detected and counted.
- Daily reconciliation. The gap between analytics and the source of truth is measured, tracked, and alerted on.
This is more infrastructure than installing a JavaScript SDK. It is also the difference between analytics you can trust and analytics you learn to ignore.
References
- PostHog Documentation: Open-source product analytics
- Apache Kafka Documentation
- The Log: What every software engineer should know (LinkedIn Engineering)
- Keystone Real-time Stream Processing Platform (Netflix Technology Blog)