A practical introduction to statistics for software developers — how to calculate and interpret the metrics that matter for performance monitoring, A/B testing, and data analysis.
You don't need a statistics degree to use statistics effectively as a developer. But you do need to understand the basics — especially when analyzing latency data, running A/B tests, or reporting on user behavior. This guide covers the essential concepts with practical code examples.
The average (mean) is the most-used statistic and often the most misleading. Consider this response time data (milliseconds):
[50, 52, 48, 55, 51, 49, 53, 47, 1200, 52]
The mean of this data is 165.7 ms. If you reported that as server performance, you would think there is a problem, even though 9 of 10 requests finished in 55 ms or less. A single 1,200 ms outlier is doing all the damage.
const mean = (data) => data.reduce((a, b) => a + b, 0) / data.length;
mean([50, 52, 48, 55, 51]); // 51.2
When to use: Data without outliers, normal distributions (heights, measurement errors)
The median is the middle value of the sorted data (or the average of the two middle values when the count is even), so a single outlier barely moves it:
function median(data) {
const sorted = [...data].sort((a, b) => a - b);
const mid = Math.floor(sorted.length / 2);
return sorted.length % 2 !== 0
? sorted[mid]
: (sorted[mid - 1] + sorted[mid]) / 2;
}
median([50, 52, 48, 55, 51, 49, 53, 47, 1200, 52]); // 51.5
When to use: Data with outliers, skewed distributions (response times, salaries, page sizes)
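To make the contrast concrete, here is a small self-contained sketch that runs both statistics on the response-time data from the start of this guide, with and without the outlier. The helper names `meanOf` and `medianOf` are illustrative, chosen only so the snippet runs on its own.

```javascript
// Illustrative helpers (names are assumptions, not from the text above).
const meanOf = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;

const medianOf = (xs) => {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 !== 0
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
};

const responseTimes = [50, 52, 48, 55, 51, 49, 53, 47, 1200, 52];
const withoutOutlier = responseTimes.filter((t) => t !== 1200);

console.log(meanOf(responseTimes).toFixed(1));  // "165.7"
console.log(medianOf(responseTimes));           // 51.5
console.log(meanOf(withoutOutlier).toFixed(1)); // "50.8"
console.log(medianOf(withoutOutlier));          // 51
```

Removing one value moves the mean by more than 100 ms, while the median shifts by half a millisecond.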
Standard deviation measures how spread out values are around the mean. The function below computes the population form (it divides by n):
function stdDev(data) {
const avg = mean(data);
const squareDiffs = data.map(x => (x - avg) ** 2);
return Math.sqrt(mean(squareDiffs));
}
// Small std dev = values cluster near the mean (consistent)
// Large std dev = values spread out (inconsistent)
stdDev([50, 52, 48, 55, 51]); // ~2.3 (very consistent)
stdDev([10, 100, 30, 90, 50]); // ~34.4 (highly variable)
When to use: Measuring consistency, detecting anomalies, setting alert thresholds
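The `stdDev` function above divides by n, which is the population standard deviation. When the data is a sample drawn from a larger population (for example, a subset of requests), a common alternative is to divide by n - 1 instead (Bessel's correction). The sketch below is one way to write that; `sampleStdDev` is an illustrative name, not something defined elsewhere in this guide.

```javascript
// Sample standard deviation: divides by (n - 1) instead of n.
// `sampleStdDev` is an illustrative name, not part of the original code.
function sampleStdDev(data) {
  if (data.length < 2) return NaN; // not defined for fewer than two points
  const avg = data.reduce((a, b) => a + b, 0) / data.length;
  const sumSq = data.reduce((acc, x) => acc + (x - avg) ** 2, 0);
  return Math.sqrt(sumSq / (data.length - 1));
}

console.log(sampleStdDev([50, 52, 48, 55, 51]).toFixed(2));  // "2.59"
console.log(sampleStdDev([10, 100, 30, 90, 50]).toFixed(2)); // "38.47"
```

The gap between the two forms is noticeable with five values and shrinks as the dataset grows.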
The Nth percentile is the value below which N% of data falls:
function percentile(data, p) {
const sorted = [...data].sort((a, b) => a - b);
const index = (p / 100) * (sorted.length - 1);
const lower = Math.floor(index);
const fraction = index - lower;
if (lower + 1 < sorted.length) {
return sorted[lower] + fraction * (sorted[lower + 1] - sorted[lower]);
}
return sorted[lower];
}
const latencies = [50, 52, 48, 55, 51, 49, 53, 47, 1200, 52];
percentile(latencies, 50); // 51.5 (median = P50)
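percentile(latencies, 90); // ~169.5 (P90)
percentile(latencies, 95); // ~684.75 (P95)
percentile(latencies, 99); // ~1096.95 (P99, the threshold the slowest 1% exceed)
// With only 10 samples, the upper percentiles are interpolated between 55 and 1200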
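The function above interpolates linearly between neighboring values, so with a small dataset it can return a number that no request actually produced. Another common definition is the nearest-rank method, which always returns an observed value. The following is a minimal sketch of that alternative; the name `percentileNearestRank` is an assumption made for this example.

```javascript
// Nearest-rank percentile: returns an actual observation, no interpolation.
// `percentileNearestRank` is an illustrative name.
function percentileNearestRank(data, p) {
  if (data.length === 0) return NaN;
  const sorted = [...data].sort((a, b) => a - b);
  // Rank is 1-based: the smallest rank that covers at least p% of the values.
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

const sampleLatencies = [50, 52, 48, 55, 51, 49, 53, 47, 1200, 52];
console.log(percentileNearestRank(sampleLatencies, 50)); // 51
console.log(percentileNearestRank(sampleLatencies, 90)); // 55
console.log(percentileNearestRank(sampleLatencies, 99)); // 1200
```

Monitoring tools differ in which definition they use, so the same data can yield different P95 or P99 values depending on the method.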
For latency, report P50 (the typical request), P95, and P99 (the tail). Your SLA is probably written against P95 or P99. Never report the mean alone.
function latencyReport(times) {
return {
p50: percentile(times, 50),
p95: percentile(times, 95),
p99: percentile(times, 99),
max: Math.max(...times),
mean: mean(times),
};
}
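One way to put such a report to work is to compare each percentile against a latency budget. The sketch below is self-contained, so it repeats a linear-interpolation percentile under the name `pct`. The budget values and the `budgetViolations` helper are hypothetical, made up for illustration.

```javascript
// Linear-interpolation percentile, mirroring the function defined earlier.
function pct(data, p) {
  const sorted = [...data].sort((a, b) => a - b);
  const index = (p / 100) * (sorted.length - 1);
  const lower = Math.floor(index);
  const upper = Math.min(lower + 1, sorted.length - 1);
  return sorted[lower] + (index - lower) * (sorted[upper] - sorted[lower]);
}

// Hypothetical latency budget in milliseconds (made-up numbers).
const budgetMs = { p50: 100, p95: 500, p99: 1000 };

// Returns the percentiles whose measured value exceeds the budget.
function budgetViolations(times, budget) {
  return Object.entries(budget)
    .map(([name, limit]) => ({
      name,
      limit,
      actual: pct(times, Number(name.slice(1))), // "p95" -> 95
    }))
    .filter(({ actual, limit }) => actual > limit);
}

const times = [50, 52, 48, 55, 51, 49, 53, 47, 1200, 52];
for (const v of budgetViolations(times, budgetMs)) {
  console.log(`${v.name}: ${Math.round(v.actual)} ms exceeds ${v.limit} ms`);
}
// p95: 685 ms exceeds 500 ms
// p99: 1097 ms exceeds 1000 ms
```

The median passes comfortably here while both tail percentiles fail, which is the pattern a mean-only report would hide.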
For A/B tests, you need to know if a difference is statistically significant or just random chance. The key concepts:
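A quick rule: the sample size you need depends heavily on the baseline conversion rate and on how small an effect you want to detect. With a 5% baseline, detecting a 10% relative improvement (5% to 5.5%) at 95% confidence and 80% power takes roughly 31,000 users per variant. A test with 1,000 users per variant can only detect much larger effects.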
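The arithmetic behind sample-size estimates and significance checks can be sketched with the standard normal-approximation formulas for two proportions. This is a rough planning aid, not a replacement for a proper statistics library. The function names are illustrative, and the default z-values (1.96 for 95% confidence, 0.8416 for 80% power) are assumptions you may want to change.

```javascript
// Approximate users needed per variant to detect a change from rate p1 to p2
// with a two-sided test. Defaults: z = 1.96 (95% confidence) and
// z = 0.8416 (80% power); both are assumptions, adjust as needed.
function sampleSizePerVariant(p1, p2, zAlpha = 1.96, zBeta = 0.8416) {
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / (p1 - p2) ** 2);
}

// z statistic for the difference between two observed conversion rates,
// using the pooled standard error. |z| > 1.96 is significant at the 5% level.
function twoProportionZ(convA, usersA, convB, usersB) {
  const pA = convA / usersA;
  const pB = convB / usersB;
  const pooled = (convA + convB) / (usersA + usersB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / usersA + 1 / usersB));
  return (pB - pA) / se;
}

// 5% baseline, 10% relative lift (5% -> 5.5%): roughly 31,000 per variant.
console.log(sampleSizePerVariant(0.05, 0.055));

// Made-up counts: 1,000 users per variant, 50 vs 55 conversions.
const z = twoProportionZ(50, 1000, 55, 1000);
console.log(Math.abs(z) > 1.96); // false: the difference could easily be chance
```

With the made-up counts, the observed 10% relative lift gives a z statistic of about 0.5, far below the 1.96 needed for significance at the 5% level.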
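// Format an error rate as a percentage string
function errorRate(total, errors) {
if (total === 0) return '0.00%'; // avoid dividing by zero when there is no traffic
return (errors / total * 100).toFixed(2) + '%';
}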
// Track a rolling window (last 5 minutes)
class RollingWindow {
constructor(windowMs) {
this.windowMs = windowMs;
this.events = [];
}
record(success) {
const now = Date.now();
this.events.push({ time: now, success });
this.events = this.events.filter(e => now - e.time < this.windowMs);
}
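errorRate() {
// Prune here as well, so stale events are not counted when nothing was recorded recently
const now = Date.now();
this.events = this.events.filter(e => now - e.time < this.windowMs);
if (this.events.length === 0) return 0;
const errors = this.events.filter(e => !e.success).length;
return errors / this.events.length;
}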
}
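A time-based window is awkward to try out because it depends on the real clock. The sketch below is a variant of the class above that accepts a clock function, so the behavior can be exercised deterministically. The class name `ClockedRollingWindow` and its `now` parameter are assumptions added for this example.

```javascript
// Variant of the rolling window with an injectable clock. The `now`
// constructor parameter is an assumption added for deterministic testing.
class ClockedRollingWindow {
  constructor(windowMs, now = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
    this.events = [];
  }

  // Drop events that have aged out of the window.
  prune() {
    const cutoff = this.now() - this.windowMs;
    this.events = this.events.filter((e) => e.time > cutoff);
  }

  record(success) {
    this.events.push({ time: this.now(), success });
    this.prune();
  }

  errorRate() {
    this.prune();
    if (this.events.length === 0) return 0;
    const errors = this.events.filter((e) => !e.success).length;
    return errors / this.events.length;
  }
}

// Simulated clock so the example does not depend on real time.
let fakeTime = 0;
const win = new ClockedRollingWindow(5 * 60 * 1000, () => fakeTime);

win.record(true);
win.record(false);            // 1 error out of 2 events
console.log(win.errorRate()); // 0.5

fakeTime += 6 * 60 * 1000;    // six minutes later, both events have expired
console.log(win.errorRate()); // 0
```

Filtering the whole array on every call is fine for modest traffic. A high-volume service would more likely keep per-second counters in a ring buffer.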
The "3-sigma rule": values more than 3 standard deviations from the mean are likely anomalies (~0.3% of normal data):
function detectAnomalies(data, threshold = 3) {
const avg = mean(data);
const sd = stdDev(data);
return data.filter(x => Math.abs(x - avg) > threshold * sd);
}
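detectAnomalies([50, 52, 48, 55, 51, 49, 53, 47, 1200, 52]);
// [] (the outlier inflates the mean to 165.7 and the std dev to ~344.8, so 1200 scores just under 3 sigma)
// With n points, no value can lie more than sqrt(n - 1) population std devs from the mean,
// so a 3-sigma threshold needs well over 10 samples to be useful.
detectAnomalies([50, 52, 48, 55, 51, 49, 53, 47, 1200, 52], 2.5);
// [1200] (flagged once the threshold is lowered)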
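One weakness of computing the mean and standard deviation from the same data you are screening is that a large outlier inflates both, which makes it harder to flag. A common workaround is to compute the statistics from a baseline sample believed to be normal and then score new values against it. The sketch below assumes such a baseline exists; `anomaliesAgainstBaseline` is an illustrative name.

```javascript
// Score new values against statistics from a separate baseline sample.
// `anomaliesAgainstBaseline` is an illustrative name.
function anomaliesAgainstBaseline(baseline, candidates, threshold = 3) {
  const avg = baseline.reduce((a, b) => a + b, 0) / baseline.length;
  const variance =
    baseline.reduce((acc, x) => acc + (x - avg) ** 2, 0) / baseline.length;
  const sd = Math.sqrt(variance);
  return candidates.filter((x) => Math.abs(x - avg) > threshold * sd);
}

// Assumed "normal" traffic: the earlier latency data without the 1200 ms spike.
const baseline = [50, 52, 48, 55, 51, 49, 53, 47, 52];
console.log(anomaliesAgainstBaseline(baseline, [54, 1200, 46])); // [1200]
```

The baseline has a mean near 50.8 and a standard deviation near 2.4, so the 3-sigma band is roughly 43.6 to 58.0 and only the 1200 ms value falls outside it.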
Moving averages smooth out short-term fluctuations to reveal trends:
function movingAverage(data, windowSize) {
return data.map((_, i) => {
if (i < windowSize - 1) return null;
const window = data.slice(i - windowSize + 1, i + 1);
return mean(window);
});
}
// 7-day moving average of daily active users
const dau = [120, 135, 128, 142, 150, 138, 145, 155, 162, 158];
movingAverage(dau, 7);
// [null, null, null, null, null, null, ~136.9, ~141.9, ~145.7, 150]
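The version above re-sums every window, which is fine for short series. For longer ones, a running sum gives the same result in a single pass. This is a sketch of that variation; `movingAverageRunning` is an illustrative name.

```javascript
// Moving average with a running sum: one pass instead of re-summing each window.
// `movingAverageRunning` is an illustrative name.
function movingAverageRunning(data, windowSize) {
  const result = [];
  let sum = 0;
  for (let i = 0; i < data.length; i++) {
    sum += data[i];
    if (i >= windowSize) sum -= data[i - windowSize]; // value that left the window
    result.push(i >= windowSize - 1 ? sum / windowSize : null);
  }
  return result;
}

const dailyUsers = [120, 135, 128, 142, 150, 138, 145, 155, 162, 158];
const smoothed = movingAverageRunning(dailyUsers, 7);
console.log(smoothed.map((v) => (v === null ? null : Number(v.toFixed(1)))));
// [null, null, null, null, null, null, 136.9, 141.9, 145.7, 150]
```

With non-integer inputs, repeated adding and subtracting can accumulate small floating-point drift over very long series, so re-summing periodically is a reasonable safeguard.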
Use HeoLab's Statistics Calculator to compute mean, median, mode, standard deviation, variance, and percentiles for any dataset — paste in your numbers and get full analysis instantly.