Most Product Analytics Dashboards Are Fiction
Your product analytics dashboard says you had 12,847 signups last month. Your billing system says you processed 9,214 new subscriptions. The CFO asks which number is real. You already know the answer, but nobody wants to hear it.
The dashboard is wrong.
Not slightly wrong. Not rounding-error wrong. Wrong in ways that compound silently until every decision built on that data is suspect. I have spent years building checkout flows, event pipelines, and financial infrastructure, and the pattern is always the same: the analytics layer drifts from reality, and nobody notices until the numbers matter.
How Analytics Drift from Reality
The problem is not bad tooling. The problem is that client-side analytics is fundamentally unreliable, and most companies treat it as ground truth.
Here is what actually happens to your data:
UTM Parameters Disappear
A user clicks your Google ad. The URL carries utm_source=google&utm_medium=cpc&utm_campaign=q1_launch. They land on your site. Everything looks clean.
Then they bookmark the page. Come back two days later. The UTMs are gone. They sign up. Your analytics attributes this to "direct" traffic. Your paid marketing team just lost credit for a conversion they actually drove.
# Day 1: User clicks ad
https://yourapp.com/pricing?utm_source=google&utm_medium=cpc&utm_campaign=q1
# Day 3: User returns via bookmark
https://yourapp.com/pricing
# Analytics records: source = "direct"
# Reality: source = "google/cpc"
This is not an edge case. For B2B products with multi-day consideration cycles, this is the majority of conversions.
Cross-Device Journeys Break Attribution
A user researches your product on their phone during lunch. Clicks a LinkedIn post. Browses your docs. That evening, they sign up on their laptop. Your analytics sees two separate anonymous users. The mobile session gets attributed to LinkedIn. The desktop conversion gets attributed to direct.
No amount of client-side JavaScript fixes this. The two sessions exist on different devices, different browsers, different IP addresses. They are fundamentally unlinkable until the user authenticates.
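Authentication is the one reliable join point. When the user logs in, you can retroactively link each device's anonymous session to the account, which is roughly how identity stitching works in most warehouses. A toy sketch with in-memory dicts standing in for your session and identity tables (all names illustrative):

```python
# Anonymous sessions recorded before the user ever authenticated
anonymous_sessions = {
    "anon-mobile-1": {"source": "linkedin", "device": "phone"},
    "anon-desktop-2": {"source": "direct", "device": "laptop"},
}

identity_map: dict[str, str] = {}  # anonymous_id -> user_id

def link_on_login(anonymous_id: str, user_id: str) -> None:
    """At login/signup, bind this device's anonymous id to the account."""
    identity_map[anonymous_id] = user_id

def sessions_for_user(user_id: str) -> list[dict]:
    """Reassemble the cross-device journey after the fact."""
    return [anonymous_sessions[a] for a, u in identity_map.items() if u == user_id]

link_on_login("anon-mobile-1", "user-42")   # phone, at lunch
link_on_login("anon-desktop-2", "user-42")  # laptop, that evening
journey = sessions_for_user("user-42")      # LinkedIn touch is now recoverable
```

The stitching can only happen server-side, after authentication, which is another argument for anchoring attribution there.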
Ad Blockers Silently Drop Events
Roughly 30 to 40 percent of technical audiences use ad blockers. Most ad blockers block analytics scripts. Your analytics.js never loads. Your tracking pixel never fires. The user signs up, pays, and becomes a customer. Your dashboard never knew they existed.
# What your analytics pipeline expects:
page_view → signup_started → signup_completed → first_purchase
# What actually arrives (user with ad blocker):
first_purchase
# (via server-side payment webhook)
# Events 1-3 never fired. This user does not exist
# in your product analytics. They exist in Stripe.
Your conversion funnel shows a 3.2% signup-to-purchase rate. But that denominator is missing 35% of your technical users. The real rate is different, and you cannot know by how much from the analytics layer alone.
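You can, however, bound the damage. Because purchases arrive via the server-side webhook, the numerator is complete; only the analytics-based signup denominator is undercounted. A small sketch showing how sensitive the "true" rate is to the assumed ad-block rate (all figures illustrative):

```python
tracked_signups = 10_000   # signup events that actually reached analytics
purchases = 320            # complete count, from the payment webhook

# The observed dashboard rate assumes block_rate = 0; vary the assumption:
for block_rate in (0.0, 0.2, 0.35):
    true_signups = tracked_signups / (1 - block_rate)
    print(f"block rate {block_rate:.0%}: "
          f"true conversion ~ {purchases / true_signups:.2%}")

# prints:
# block rate 0%: true conversion ~ 3.20%
# block rate 20%: true conversion ~ 2.56%
# block rate 35%: true conversion ~ 2.08%
```

A third of the answer disappears depending on a parameter the analytics layer cannot measure about itself.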
Client-Side Events Fail Silently
Even without ad blockers, client-side events are unreliable. The user clicks "Complete Purchase." Your analytics fires a purchase_completed event. But the user's connection is flaky. The event POST request fails. No retry. No error handler. The event vanishes.
Meanwhile, the actual purchase request hits your server, gets processed, and succeeds. The payment goes through. The analytics event does not.
// Typical client-side tracking implementation
async function handlePurchase() {
  // This can fail silently: network error, tab closed,
  // browser throttling, ad blocker, script error
  analytics.track("purchase_completed", {
    orderId: order.id,
    amount: order.total,
  });

  // This hits your backend with retries, timeouts,
  // and error handling
  await api.post("/checkout/complete", {
    orderId: order.id,
    paymentMethodId: selectedPayment.id,
  });
}
The analytics call and the business logic call have completely different reliability guarantees. One is fire-and-forget over an unreliable channel. The other is a retried, authenticated API call to your own infrastructure. Yet companies treat them as equally trustworthy data sources.
Checkout Retries Create Phantom Events
Payment processing involves retries. A checkout request times out. The client retries. Both requests eventually succeed on the server, but only one results in a charge because your payment system is idempotent. However, your analytics layer fired checkout_started twice.
Your funnel now shows 100 checkout starts and 95 completions, a 95% completion rate. The real number is 98 unique checkout attempts and 95 completions, a 96.9% rate. Small difference in this example, but at scale with aggressive retry logic, the distortion compounds.
Event Stream (what analytics sees):
checkout_started (user A, attempt 1)
checkout_started (user A, attempt 2, retry)
checkout_started (user B, attempt 1)
checkout_completed (user A, deduplicated by payment system)
checkout_completed (user B)
Analytics dashboard: 3 starts → 2 completions = 66.7% rate
Reality: 2 users → 2 completions = 100% rate
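The fix is to deduplicate funnel events with the same idempotency key the payment layer already uses, so a client retry cannot inflate the start count. A sketch (event shape and key names are illustrative):

```python
def unique_starts(events: list[dict]) -> int:
    """Count checkout starts, collapsing retries that share an idempotency key."""
    seen: set[tuple[str, str]] = set()
    for e in events:
        if e["event"] == "checkout_started":
            seen.add((e["user_id"], e["idempotency_key"]))
    return len(seen)

stream = [
    {"event": "checkout_started", "user_id": "A", "idempotency_key": "k1"},
    {"event": "checkout_started", "user_id": "A", "idempotency_key": "k1"},  # retry
    {"event": "checkout_started", "user_id": "B", "idempotency_key": "k2"},
]
assert unique_starts(stream) == 2  # not 3
```

If the payment system already deduplicates completions by this key, counting starts by the same key makes both ends of the funnel consistent.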
Data Pipelines Lag and Deduplicate Incorrectly
Your analytics events flow through a pipeline: client SDK, collection endpoint, message queue, processing workers, data warehouse. Each stage introduces latency and potential data loss.
Late-arriving events get dropped or double-counted depending on your windowing strategy. Events that arrive out of order get attributed to the wrong session. Schema changes in the client SDK break downstream processing silently, and the pipeline keeps running, just producing wrong numbers.
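If you do run events through such a pipeline, at minimum make the loss observable. A sketch of a windowed deduplicator that counts late arrivals instead of silently dropping them; the one-hour lateness window and all names are illustrative:

```python
from datetime import datetime, timedelta

ALLOWED_LATENESS = timedelta(hours=1)

class WindowedDedup:
    def __init__(self):
        self.seen: set[str] = set()
        self.accepted = 0
        self.dropped_late = 0
        self.watermark = datetime.min

    def process(self, event_id: str, event_time: datetime, arrival: datetime):
        # Advance the watermark as events arrive
        self.watermark = max(self.watermark, arrival - ALLOWED_LATENESS)
        if event_time < self.watermark:
            self.dropped_late += 1  # counted, not silently discarded
            return
        if event_id in self.seen:
            return                  # duplicate delivery
        self.seen.add(event_id)
        self.accepted += 1

d = WindowedDedup()
t0 = datetime(2024, 1, 1, 12, 0)
d.process("e1", t0, arrival=t0)
d.process("e1", t0, arrival=t0 + timedelta(minutes=5))                        # duplicate
d.process("e0", t0 - timedelta(hours=3), arrival=t0 + timedelta(minutes=10))  # too late
```

Publishing `dropped_late` as a metric turns "the pipeline keeps running, just producing wrong numbers" into an alert.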
The Source of Truth Is Not Where You Think
If client-side analytics is unreliable, what do you actually trust?
Financial Systems
Your payment processor knows exactly how many transactions succeeded. Stripe, Square, Adyen: they have no incentive to miscount. Their numbers are audited. They reconcile daily.
-- This number is real.
SELECT
COUNT(DISTINCT customer_id) AS new_customers,
SUM(amount) AS revenue
FROM payments
WHERE status = 'succeeded'
AND created_at >= '2024-01-01'
AND created_at < '2024-02-01'
AND customer_id NOT IN (
SELECT customer_id FROM payments
WHERE created_at < '2024-01-01'
AND status = 'succeeded'
);
This query hits your payment system. It returns the exact count of new paying customers and exact revenue. No sampling. No dropped events. No ad blocker interference.
Order Systems
Your order database knows every order that was placed, fulfilled, or cancelled. It does not rely on a JavaScript snippet loading correctly in a browser.
-- Orders by acquisition channel (server-side attribution)
SELECT
COALESCE(o.attributed_source, 'unknown') AS source,
COUNT(*) AS orders,
SUM(o.total) AS revenue
FROM orders o
WHERE o.status IN ('completed', 'fulfilled')
AND o.created_at >= '2024-01-01'
GROUP BY COALESCE(o.attributed_source, 'unknown')
ORDER BY revenue DESC;
Server-Side Events
The most reliable analytics come from server-side event emission. When your backend processes a signup, it emits an event. When your backend processes a payment, it emits an event. These events are generated by the system of record, not by a script running in an environment you do not control.
# Server-side event emission at the point of business logic
class CheckoutService:
    def complete_checkout(self, order_id: str, user_id: str):
        # Process the actual business logic
        order = self.order_repo.get(order_id)
        payment = self.payment_service.charge(order)
        if payment.succeeded:
            order.mark_completed()
            self.order_repo.save(order)
            # Emit event from the server, not the browser
            self.events.emit("order.completed", {
                "order_id": order.id,
                "user_id": user_id,
                "amount": order.total,
                "source": order.attributed_source,
                "timestamp": datetime.utcnow().isoformat(),
            })
            return order
        raise PaymentFailedError(payment.error)
This event is emitted only when the order actually completes. It cannot be blocked by an ad blocker. It cannot be lost to a flaky mobile connection. It carries the server's attribution data, not whatever the browser decided to send.
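One caveat: the naive version can still lose the event if the process crashes between saving the order and emitting. The standard fix is the transactional outbox pattern: write the event row in the same database transaction as the order, and let a separate relay publish it. A minimal sketch using sqlite3 as a stand-in database (schema and names are illustrative):

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY, topic TEXT, payload TEXT);
""")

def complete_order(order_id: str, amount: float) -> None:
    # One atomic transaction: the order row and the event row commit
    # together, or neither does.
    with db:
        db.execute("INSERT INTO orders VALUES (?, 'completed')", (order_id,))
        db.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order.completed",
             json.dumps({"order_id": order_id, "amount": amount})),
        )

complete_order("ord_1", 49.0)
# A separate relay process reads the outbox table and publishes
# each row to the event bus, deleting it only after the publish succeeds.
```

With this in place, the event stream inherits the durability of the database rather than the reliability of an in-process emit call.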
The Architecture That Actually Works
The reliable approach is to treat client-side analytics as a supplementary signal, not the source of truth. Build your core metrics from server-side systems.
Server-Side Attribution
The hardest problem in this architecture is attribution. If you move measurement to the server, how do you know which marketing channel drove the conversion?
The answer is to capture attribution data at the earliest authenticated touchpoint and persist it server-side.
// Capture attribution on first meaningful server interaction
app.post("/api/signup", async (req, res) => {
  const attribution = {
    source: req.body.utm_source || null,
    medium: req.body.utm_medium || null,
    campaign: req.body.utm_campaign || null,
    referrer: req.headers.referer || null,
    landing_page: req.body.landing_page || null,
    signup_timestamp: new Date().toISOString(),
  };

  const user = await createUser({
    email: req.body.email,
    attribution, // Stored with the user record permanently
  });

  events.emit("user.created", {
    user_id: user.id,
    ...attribution,
  });

  return res.json({ user });
});
This is not perfect. You still lose attribution for users who visit without UTMs and sign up later. But it is a known, quantifiable gap, not a silent black hole.
Reconciliation
Run daily reconciliation between your analytics layer and your server-side systems. The gap between them is your measurement error.
-- Daily reconciliation query
WITH analytics_signups AS (
  SELECT DATE(timestamp) AS day, COUNT(*) AS count
  FROM analytics.events
  WHERE event = 'signup_completed'
    AND timestamp >= CURRENT_DATE - INTERVAL '30 days'
  GROUP BY DATE(timestamp)
),
server_signups AS (
SELECT DATE(created_at) AS day, COUNT(*) AS count
FROM users
WHERE created_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY DATE(created_at)
)
SELECT
s.day,
s.count AS actual_signups,
COALESCE(a.count, 0) AS tracked_signups,
s.count - COALESCE(a.count, 0) AS gap,
ROUND(
100.0 * (s.count - COALESCE(a.count, 0)) / s.count, 1
) AS gap_pct
FROM server_signups s
LEFT JOIN analytics_signups a ON a.day = s.day
ORDER BY s.day DESC;
When I first ran this query at a previous company, the gap was 23%. Nearly a quarter of real signups were invisible to the analytics dashboard. Every decision made from that dashboard (marketing spend, funnel optimization, feature prioritization) was based on incomplete data.
What To Do About It
You probably cannot rip out your analytics stack tomorrow. But you can stop making critical decisions based on it alone.
- Use server-side systems as the source of truth for revenue, signups, and conversions
- Use client-side analytics for behavioral data only: page views, feature usage, UX research
- Run reconciliation queries weekly and publish the gap as a metric
- Capture attribution server-side at signup, not just in the browser
- Instrument your backend services to emit business events
- Accept that some attribution will be unknown and budget for it
The dashboard is a useful lens for understanding user behavior. It is a terrible system of record. The sooner your organization internalizes this distinction, the sooner you stop making decisions based on fiction.
References
- Designing robust and predictable APIs with idempotency (Stripe Engineering Blog)
- Kafka: a Distributed Messaging System for Log Processing (Apache Kafka Documentation)
- Keystone Real-time Stream Processing Platform (Netflix Technology Blog)
- Exactly-once Semantics are Possible (Confluent Blog)
- PostHog: Open-source product analytics (PostHog Blog)