Statistics for Developers: Mean, Median, Standard Deviation, and Percentiles

You don't need a statistics degree to use statistics effectively as a developer. But you do need to understand the basics — especially when analyzing latency data, running A/B tests, or reporting on user behavior. This guide covers the essential concepts with practical code examples.

Why Average Is Usually the Wrong Metric

The average (mean) is the most-used statistic and often the most misleading. Consider this response time data (milliseconds):

[50, 52, 48, 55, 51, 49, 53, 47, 1200, 52]

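Mean: 165.7ms (misleading: one outlier skews it)
Median: 51.5ms (the "typical" experience)
P99: 1200ms by the nearest-rank method, or about 1097ms with the linear interpolation used later in this guide (what the slowest 1% of requests experience)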

If you were reporting server performance, the mean would make you think there's a problem when 9 of 10 requests are under 56ms.
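
To see how much a single value moves the mean, the sketch below computes it with and without the slow request. The 1000ms cutoff used to drop the outlier is an assumption chosen for this data set only, not a general rule.

```javascript
const times = [50, 52, 48, 55, 51, 49, 53, 47, 1200, 52];

// Arithmetic mean of an array of numbers
const mean = (data) => data.reduce((a, b) => a + b, 0) / data.length;

// Mean over all ten requests, including the 1200ms outlier
const withOutlier = mean(times); // 165.7

// Mean over the nine fast requests (hypothetical 1000ms cutoff)
const withoutOutlier = mean(times.filter((t) => t < 1000)); // ~50.8
```

One request out of ten more than triples the mean, while the median barely changes.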

Core Statistical Measures

Mean (Average)

const mean = (data) => data.reduce((a, b) => a + b, 0) / data.length;

mean([50, 52, 48, 55, 51]); // 51.2

When to use: Data without outliers, normal distributions (heights, measurement errors)

Median

function median(data) {
  const sorted = [...data].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 !== 0
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

median([50, 52, 48, 55, 51, 49, 53, 47, 1200, 52]); // 51.5

When to use: Data with outliers, skewed distributions (response times, salaries, page sizes)

Standard Deviation

Measures how spread out values are from the mean. The version below divides by n, which gives the population standard deviation:

function stdDev(data) {
  const avg = mean(data);
  const squareDiffs = data.map(x => (x - avg) ** 2);
  return Math.sqrt(mean(squareDiffs));
}

// Small std dev = values cluster near the mean (consistent)
// Large std dev = values spread out (inconsistent)
stdDev([50, 52, 48, 55, 51]); // ~2.3, very consistent
stdDev([10, 100, 30, 90, 50]); // ~34.4, highly variable

When to use: Measuring consistency, detecting anomalies, setting alert thresholds
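
The stdDev function above treats the data as the whole population. When the numbers are a sample from a larger stream (for example, a subset of requests), dividing by n - 1 (Bessel's correction) is the usual choice. A minimal sketch; the name sampleStdDev is an assumption for illustration and not part of the original code:

```javascript
// Arithmetic mean, repeated here so the block runs on its own
const mean = (data) => data.reduce((a, b) => a + b, 0) / data.length;

// Sample standard deviation: divides by (n - 1) instead of n.
// Hypothetical helper; returns NaN when fewer than two values are given.
function sampleStdDev(data) {
  if (data.length < 2) return NaN;
  const avg = mean(data);
  const sumSquares = data.reduce((sum, x) => sum + (x - avg) ** 2, 0);
  return Math.sqrt(sumSquares / (data.length - 1));
}

sampleStdDev([50, 52, 48, 55, 51]); // ~2.59 (dividing by n gives ~2.32)
```

The gap between the two versions shrinks as the data set grows, so the choice matters mostly for small samples.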

Percentiles

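The Nth percentile is the value below which N% of the data falls. The function below interpolates linearly between neighboring values when the rank is not a whole number: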

function percentile(data, p) {
  const sorted = [...data].sort((a, b) => a - b);
  const index = (p / 100) * (sorted.length - 1);
  const lower = Math.floor(index);
  const fraction = index - lower;
  if (lower + 1 < sorted.length) {
    return sorted[lower] + fraction * (sorted[lower + 1] - sorted[lower]);
  }
  return sorted[lower];
}

const latencies = [50, 52, 48, 55, 51, 49, 53, 47, 1200, 52];
percentile(latencies, 50);  // 51.5 (median = P50)
percentile(latencies, 90);  // ~169.5 (P90)
percentile(latencies, 95);  // ~684.75 (P95)
percentile(latencies, 99);  // ~1096.95 (P99, the slowest 1%)
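
Percentile definitions differ between tools. The interpolating function above is one common choice; another is the nearest-rank method, which always returns a value that was actually observed. A minimal sketch, assuming the hypothetical name percentileNearestRank and a p in the range (0, 100]:

```javascript
// Nearest-rank percentile: returns an observed value, no interpolation.
// Hypothetical helper; p is expected in the range (0, 100].
function percentileNearestRank(data, p) {
  if (data.length === 0) return NaN;
  const sorted = [...data].sort((a, b) => a - b);
  // 1-based rank; multiplying before dividing avoids rounding surprises such as 0.7 * 10
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

const latencies = [50, 52, 48, 55, 51, 49, 53, 47, 1200, 52];
percentileNearestRank(latencies, 50); // 51
percentileNearestRank(latencies, 90); // 55
percentileNearestRank(latencies, 99); // 1200
```

With only ten data points the two methods disagree sharply at the tail, which is one reason to state which definition a dashboard or SLA uses.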

Which Metric to Use for What

API Performance Monitoring

Report P50 (typical), P95, and P99. Your SLA probably uses P99 or P95. Never use mean alone.

function latencyReport(times) {
  return {
    p50:  percentile(times, 50),
    p95:  percentile(times, 95),
    p99:  percentile(times, 99),
    max:  Math.max(...times),
    mean: mean(times),
  };
}
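
The report above sorts the data once per percentile and spreads the whole array into Math.max, which can throw a RangeError on very large arrays in some engines. The sketch below is one possible variant that sorts once and reads every value from the sorted copy. The names percentileSorted and latencyReportSorted are assumptions for illustration.

```javascript
// Linear-interpolation percentile on an array that is already sorted ascending
function percentileSorted(sorted, p) {
  const index = (p / 100) * (sorted.length - 1);
  const lower = Math.floor(index);
  const upper = Math.min(lower + 1, sorted.length - 1);
  return sorted[lower] + (index - lower) * (sorted[upper] - sorted[lower]);
}

// Hypothetical variant of the latency report that sorts only once
function latencyReportSorted(times) {
  if (times.length === 0) return null; // nothing to report
  const sorted = [...times].sort((a, b) => a - b);
  const sum = sorted.reduce((a, b) => a + b, 0);
  return {
    p50: percentileSorted(sorted, 50),
    p95: percentileSorted(sorted, 95),
    p99: percentileSorted(sorted, 99),
    max: sorted[sorted.length - 1], // last element of the sorted copy
    mean: sum / sorted.length,
  };
}

latencyReportSorted([50, 52, 48, 55, 51, 49, 53, 47, 1200, 52]);
// p50 is 51.5, max is 1200, mean is 165.7; p95 and p99 fall between 55 and 1200
```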

A/B Testing

For A/B tests, you need to know if a difference is statistically significant or just random chance. The key concepts:

Sample size: More data = more confidence. Small samples can show large differences purely by chance.
P-value: Probability of seeing a difference at least this large if there were no real difference between the variants (a common threshold is p < 0.05).
Statistical significance: The result is unlikely to be explained by random variation alone.
Practical significance: The effect is large enough to matter.

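A quick rule: the required sample size depends heavily on the baseline conversion rate. At a 5% significance level and 80% power, detecting a 10% relative improvement takes roughly 30,000 users per variant at a 5% baseline and roughly 6,500 at a 20% baseline. A test with 1,000 users per variant can only detect much larger effects.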
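
The concepts above can be turned into rough numbers. The sketch below uses the standard normal approximation for comparing two proportions, with the conventional defaults of a two-sided 5% significance level (z = 1.96) and 80% power (z = 0.84) hard-coded as assumptions. The function names are hypothetical, and a real experiment platform may use a different method.

```javascript
// Approximate users needed per variant to detect a relative lift in conversion rate.
// Assumes alpha = 0.05 two-sided (z = 1.96) and 80% power (z = 0.84).
function sampleSizePerVariant(baselineRate, relativeLift) {
  const zAlpha = 1.96;
  const zBeta = 0.84;
  const p1 = baselineRate;
  const p2 = baselineRate * (1 + relativeLift);
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / (p2 - p1) ** 2);
}

// Two-proportion z-statistic using a pooled standard error.
// |z| > 1.96 corresponds to p < 0.05 for a two-sided test.
function twoProportionZ(convA, totalA, convB, totalB) {
  const pA = convA / totalA;
  const pB = convB / totalB;
  const pooled = (convA + convB) / (totalA + totalB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / totalA + 1 / totalB));
  return (pB - pA) / se;
}

sampleSizePerVariant(0.05, 0.10); // roughly 31,000 per variant
sampleSizePerVariant(0.20, 0.10); // roughly 6,500 per variant

// 5.0% vs 5.5% conversion with 10,000 users each
twoProportionZ(500, 10000, 550, 10000); // ~1.59, below 1.96, so not significant
```

The last call shows why sample size matters: a genuine 10% relative lift is still indistinguishable from noise at 10,000 users per variant when the baseline is 5%.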

Error Rate Tracking

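// Format an overall error rate as a percentage string
function errorRate(total, errors) {
  if (total === 0) return '0.00%'; // avoid dividing by zero when nothing was recorded
  return (errors / total * 100).toFixed(2) + '%';
}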

// Track the error rate over a rolling time window,
// e.g. new RollingWindow(5 * 60 * 1000) for the last 5 minutes
class RollingWindow {
  constructor(windowMs) {
    this.windowMs = windowMs;
    this.events = [];
  }

  record(success) {
    const now = Date.now();
    this.events.push({ time: now, success });
    this.events = this.events.filter(e => now - e.time < this.windowMs);
  }

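  errorRate() {
    // Drop events that have aged out since the last record() call
    const now = Date.now();
    this.events = this.events.filter(e => now - e.time < this.windowMs);
    if (this.events.length === 0) return 0;
    const errors = this.events.filter(e => !e.success).length;
    return errors / this.events.length;
  }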
}

Anomaly Detection with Standard Deviations

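The "3-sigma rule": values more than 3 standard deviations from the mean are likely anomalies (only about 0.3% of normally distributed data falls that far out). Because an outlier also inflates the mean and standard deviation it is measured against, the rule works best on large samples or against a baseline computed from known-good data: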

function detectAnomalies(data, threshold = 3) {
  const avg = mean(data);
  const sd = stdDev(data);
  return data.filter(x => Math.abs(x - avg) > threshold * sd);
}

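detectAnomalies([50, 52, 48, 55, 51, 49, 53, 47, 1200, 52]);
// [] because the outlier inflates the mean (165.7) and std dev (~344.8),
// so 1200 sits just under 3 standard deviations away

detectAnomalies([50, 52, 48, 55, 51, 49, 53, 47, 1200, 52], 2.5);
// [1200] with a lower threshold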
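
A known weakness of mean-based detection is that a large outlier inflates both the mean and the standard deviation, which can hide it in small samples. One robust alternative, swapped in here as a different technique rather than the method above, is the modified z-score based on the median absolute deviation (MAD). The function name is hypothetical; 0.6745 and the 3.5 cutoff are the commonly cited constants for this score.

```javascript
// Median of an array of numbers, repeated here so the block runs on its own
function median(data) {
  const sorted = [...data].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 !== 0
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Modified z-score: 0.6745 * |x - median| / MAD.
// 0.6745 makes MAD comparable to a standard deviation for normal data.
function detectAnomaliesMad(data, threshold = 3.5) {
  const med = median(data);
  const mad = median(data.map((x) => Math.abs(x - med)));
  if (mad === 0) return []; // at least half the values equal the median; score is undefined
  return data.filter((x) => (0.6745 * Math.abs(x - med)) / mad > threshold);
}

detectAnomaliesMad([50, 52, 48, 55, 51, 49, 53, 47, 1200, 52]); // [1200]
```

Because the median and MAD barely move when one value is extreme, the 1200ms request stands out clearly even in a ten-point sample.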

Moving Averages for Time Series

Moving averages smooth out short-term fluctuations to reveal trends:

function movingAverage(data, windowSize) {
  return data.map((_, i) => {
    if (i < windowSize - 1) return null;
    const window = data.slice(i - windowSize + 1, i + 1);
    return mean(window);
  });
}

// 7-day moving average of daily active users
const dau = [120, 135, 128, 142, 150, 138, 145, 155, 162, 158];
movingAverage(dau, 7);
// [null, null, null, null, null, null, ~136.9, ~141.9, ~145.7, 150]
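
The version above re-slices and re-sums the window for every point, which is fine for short series. For longer ones, a running sum gives the same result with constant work per step. A minimal sketch under the assumption that the input contains only finite numbers; the name movingAverageRunning is hypothetical.

```javascript
// Moving average with a running sum: add the newest value, drop the oldest.
// Returns null until a full window is available.
function movingAverageRunning(data, windowSize) {
  const result = [];
  let sum = 0;
  for (let i = 0; i < data.length; i++) {
    sum += data[i];
    if (i >= windowSize) sum -= data[i - windowSize]; // value that left the window
    result.push(i >= windowSize - 1 ? sum / windowSize : null);
  }
  return result;
}

const dau = [120, 135, 128, 142, 150, 138, 145, 155, 162, 158];
movingAverageRunning(dau, 7).map((v) => (v === null ? null : Number(v.toFixed(1))));
// [null, null, null, null, null, null, 136.9, 141.9, 145.7, 150]
```

With non-integer inputs the running sum can accumulate small floating-point drift over very long series, so recomputing the sum periodically is a reasonable safeguard.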

Calculate Statistics Instantly

Use HeoLab's Statistics Calculator to compute mean, median, mode, standard deviation, variance, and percentiles for any dataset — paste in your numbers and get full analysis instantly.

