Hassan Ijaz

Ai, Web & Design


A/B testing simulator

Website optimization game where users run tests, see results accumulate, and learn about statistical significance

Concept Overview

A/B testing is a controlled experiment comparing two versions to determine which performs better. It's widely used in product development, marketing, and user experience optimization.

Experimental Design

Random Assignment

Users randomly assigned to control (A) or treatment (B)
Eliminates selection bias and balances confounders, known and unknown, across groups in expectation
Ensures groups are comparable on average
Foundation for causal inference
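One common way to implement random assignment is to hash a user identifier, so the same user always sees the same variant. The sketch below is a minimal illustration under assumptions: the function name assign_variant, the experiment label "checkout_test", and the 50/50 split are all hypothetical, not taken from the simulator.

```python
import hashlib


def assign_variant(user_id: str, experiment: str = "checkout_test") -> str:
    """Deterministically map a user to 'A' or 'B' with a roughly 50/50 split.

    The experiment name is a hypothetical label; mixing it into the hash
    keeps assignments independent across different experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100  # bucket in 0..99
    return "A" if bucket < 50 else "B"


# Check that the split is close to even over many synthetic user IDs.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"user-{i}")] += 1
print(counts)
```

Hashing rather than calling a random number generator on every visit keeps the assignment stable across sessions without storing it.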

Key Metrics

Primary metric: Main outcome of interest
Secondary metrics: Additional insights
Guardrail metrics: Ensure no negative side effects
Choose metrics before running experiment

Statistical Analysis

Two-Sample Tests

Proportions: Z-test or χ² test
Means: Two-sample t-test
Non-parametric: Mann-Whitney U test
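For conversion rates, the two-proportion z-test can be written in a few lines. This is a sketch using only the standard library; the function name and the example counts (200 of 2,000 versus 250 of 2,000) are made up for illustration.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: p_A == p_B, using the pooled rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Illustrative counts only: 10.0% vs 12.5% conversion.
z, p = two_proportion_z_test(200, 2000, 250, 2000)
print(f"z = {z:.3f}, p = {p:.4f}")
```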

Effect Size

Absolute difference: p_B - p_A (signed, in percentage points)
Relative lift: (p_B - p_A) / p_A
Practical significance vs statistical significance
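The effect-size quantities can be computed together with a confidence interval, which helps when judging practical significance. The sketch below reports the signed difference p_B - p_A so the direction of the effect stays visible, and uses a simple unpooled Wald interval; the function name and example counts are illustrative assumptions.

```python
from math import sqrt


def effect_size(conv_a, n_a, conv_b, n_b, z_crit=1.96):
    """Signed absolute difference, relative lift, and an approximate 95% CI."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a
    lift = diff / p_a
    # Unpooled standard error for the difference of two proportions.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return {
        "abs_diff": diff,
        "rel_lift": lift,
        "ci": (diff - z_crit * se, diff + z_crit * se),
    }


# Illustrative counts only: 10.0% vs 12.5% conversion.
result = effect_size(200, 2000, 250, 2000)
print(result)
```

If the whole interval lies below the smallest change worth shipping, the result may be statistically significant without being practically significant.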

Sample Size & Duration

Power analysis determines the required sample size from four inputs:

Minimum detectable effect (MDE)
Statistical power (typically 80%)
Significance level (typically 5%)
Baseline conversion rate

Run the test until it reaches the planned sample size, not until the result first looks significant!
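The four inputs combine in the standard normal-approximation formula for comparing two proportions. The following is a rough sketch, assuming a two-sided test, equal group sizes, and an MDE expressed as an absolute difference; the function name and defaults are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(baseline, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect an absolute lift of mde_abs."""
    p1 = baseline
    p2 = baseline + mde_abs
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / mde_abs ** 2)


# Illustrative inputs: 10% baseline, detect a 2 percentage point lift.
n = sample_size_per_group(baseline=0.10, mde_abs=0.02)
print(n)
```

Halving the MDE roughly quadruples the required sample size, which is why the MDE is usually the input with the largest influence on test duration.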

Common Pitfalls

Peeking Problem

Stopping early as soon as results look significant inflates the Type I error rate
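A small simulation makes the inflation visible. The sketch below runs A/A tests, where both arms share the same true conversion rate, so every "significant" result is a false positive. All parameters (400 simulations, 1,000 users per arm, a look every 100 users, a 10% rate) are arbitrary illustrative choices.

```python
import random
from math import erfc, sqrt


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return erfc(abs(z) / sqrt(2))


def simulate_aa_tests(n_sims=400, n_per_arm=1000, look_every=100,
                      rate=0.10, alpha=0.05, seed=0):
    """Return (false-positive rate with peeking, false-positive rate at the end)."""
    rng = random.Random(seed)
    peek_hits = 0
    final_hits = 0
    for _ in range(n_sims):
        conv_a = conv_b = 0
        peeked_significant = False
        for n in range(1, n_per_arm + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            # A peeker would stop the first time any interim look is significant.
            if n % look_every == 0 and p_value(conv_a, n, conv_b, n) < alpha:
                peeked_significant = True
        peek_hits += peeked_significant
        final_hits += p_value(conv_a, n_per_arm, conv_b, n_per_arm) < alpha
    return peek_hits / n_sims, final_hits / n_sims


peek_rate, final_rate = simulate_aa_tests()
print(f"with peeking: {peek_rate:.3f}, single final look: {final_rate:.3f}")
```

The single final look should land near the nominal 5%, while stopping at the first significant look produces a noticeably higher false-positive rate.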

Multiple Comparisons

Testing many metrics without correction increases false positives
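Two standard corrections are Bonferroni and Holm. The sketch below implements both for a list of p-values; the example p-values are invented to show a case where the two methods disagree.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down procedure: controls family-wise error, never rejects fewer than Bonferroni."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # Threshold loosens as smaller p-values are rejected: alpha/m, alpha/(m-1), ...
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Invented p-values for four metrics.
p_values = [0.001, 0.015, 0.02, 0.30]
print(bonferroni(p_values))
print(holm(p_values))
```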

Novelty Effects

Users may behave differently at first, so early effects may not persist

Simpson's Paradox

Aggregate results may reverse when segmented by subgroups
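A numeric example shows how the reversal can happen. The counts below are invented: variant B converts better within each device segment, yet looks worse overall because most of B's traffic came from the low-converting segment. An imbalance like this usually signals that allocation differed across segments, for instance after a mid-test ramp change.

```python
# Invented (conversions, users) per segment and variant.
segments = {
    "desktop": {"A": (180, 900), "B": (22, 100)},
    "mobile": {"A": (5, 100), "B": (54, 900)},
}


def segment_rate(segment, variant):
    conv, n = segments[segment][variant]
    return conv / n


def aggregate_rate(variant):
    conv = sum(seg[variant][0] for seg in segments.values())
    n = sum(seg[variant][1] for seg in segments.values())
    return conv / n


for name in segments:
    print(name, f"A={segment_rate(name, 'A'):.3f}", f"B={segment_rate(name, 'B'):.3f}")
print("overall", f"A={aggregate_rate('A'):.3f}", f"B={aggregate_rate('B'):.3f}")
```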

Advanced Techniques

Sequential Testing

Allows results to be monitored at planned interim looks while keeping the overall Type I error rate controlled
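As a deliberately simple illustration, the sketch below splits alpha evenly across a fixed number of planned looks (a Bonferroni split). This is more conservative than the group-sequential boundaries used in practice, such as Pocock or O'Brien-Fleming, and is swapped in here only because it fits in a few lines. The function names and the cumulative counts are invented.

```python
from math import erfc, sqrt


def two_sided_p(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return erfc(abs(z) / sqrt(2))


def first_significant_look(looks, alpha=0.05):
    """Return the 1-based look at which to stop, or None if no look qualifies.

    Each look is a tuple of cumulative counts: (conv_a, n_a, conv_b, n_b).
    The per-look threshold alpha / K is a conservative Bonferroni split.
    """
    per_look_alpha = alpha / len(looks)
    for look_number, (conv_a, n_a, conv_b, n_b) in enumerate(looks, start=1):
        if two_sided_p(conv_a, n_a, conv_b, n_b) <= per_look_alpha:
            return look_number
    return None


# Invented cumulative data for three planned looks.
looks = [(20, 200, 30, 200), (45, 400, 65, 400), (60, 600, 100, 600)]
print(first_significant_look(looks))
```

In this invented data, the second look has a p-value below 0.05 but above 0.05 / 3, so a naive peeker would stop there while the corrected rule keeps collecting data.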

Bayesian A/B Testing

Yields direct probability statements about treatment effects, such as the posterior probability that B beats A
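For conversion data, one simple Bayesian model puts a Beta prior on each variant's rate and estimates the probability that B beats A by sampling from the two posteriors. The sketch below assumes uniform Beta(1, 1) priors and Monte Carlo estimation; the function name, number of draws, and example counts are illustrative.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for a binomial rate with a Beta(1, 1) prior is
        # Beta(1 + conversions, 1 + non-conversions).
        theta_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        theta_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += theta_b > theta_a
    return wins / draws


# Illustrative counts only: 10.0% vs 12.5% conversion.
prob = prob_b_beats_a(200, 2000, 250, 2000)
print(f"P(B > A) is approximately {prob:.3f}")
```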

Multi-Armed Bandits

Adaptive allocation that shifts traffic toward better-performing variants based on interim results
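Thompson sampling is one widely used bandit strategy: each visitor is sent to the variant whose sampled conversion rate is highest under the current posterior. The sketch below simulates it with Beta(1, 1) priors; the true rates, user count, and function name are invented for illustration.

```python
import random


def thompson_sampling(true_rates, n_users=5000, seed=0):
    """Simulate Thompson sampling and return how many users each arm received."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_users):
        # Draw one plausible conversion rate per arm from its Beta posterior.
        samples = [
            rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)
        ]
        arm = samples.index(max(samples))
        # Simulate the visitor's response using the (unknown to the algorithm) true rate.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


# Invented true rates: arm 1 is clearly better.
pulls = thompson_sampling([0.05, 0.15])
print(pulls)
```

Traffic concentrates on the better arm as evidence accumulates, which reduces the cost of exploration but makes classical fixed-sample inference on the final counts harder.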

Stratification

Control for known covariates to reduce variance
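One simple form is a post-stratified estimate: compute the treatment effect within each stratum, then average the per-stratum effects weighted by stratum size. The sketch below assumes strata are defined by a pre-experiment covariate such as device type; the function name, dictionary keys, and counts are invented.

```python
def stratified_difference(strata):
    """Weighted average of per-stratum differences (p_B - p_A).

    Each stratum is a dict with keys conv_a, n_a, conv_b, n_b; weights are
    the stratum's share of all users in the experiment.
    """
    total = sum(s["n_a"] + s["n_b"] for s in strata)
    estimate = 0.0
    for s in strata:
        weight = (s["n_a"] + s["n_b"]) / total
        diff = s["conv_b"] / s["n_b"] - s["conv_a"] / s["n_a"]
        estimate += weight * diff
    return estimate


# Invented counts for two device strata.
strata = [
    {"conv_a": 180, "n_a": 900, "conv_b": 22, "n_b": 100},  # desktop
    {"conv_a": 5, "n_a": 100, "conv_b": 54, "n_b": 900},    # mobile
]
estimate = stratified_difference(strata)
print(f"stratified difference: {estimate:.4f}")
```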

Best Practice: Focus on business impact, not just statistical significance. A statistically significant 0.1% improvement might not justify implementation costs.

The website optimization game below lets you run A/B tests and see results accumulate. Learn about statistical significance while optimizing conversion rates!

Interactive Visualization
