Mathematics
Statistical Test Selection Guide
May 14, 2020
Jul 20, 2025

DECISION FRAMEWORK

Step 1: Identify Data Type
  • Binary/Categorical: 0/1, Yes/No, A/B/C/D
  • Continuous: Heights, prices, times, scores
Step 2: Check Sample Size
  • Large: n ≥ 30 (general rule)
  • For proportions: np ≥ 10 AND n(1-p) ≥ 10
  • For chi-square: Expected frequency ≥ 5 in each cell
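The rules of thumb in Step 2 can be written as small checks. This is a minimal sketch in plain Python; the function names are hypothetical, the thresholds are the ones listed above, and the `(1000, 0.05)` input is an illustrative value rather than data from this guide.

```python
def is_large_sample(n, threshold=30):
    """General rule of thumb: n >= 30 counts as a large sample."""
    return n >= threshold


def proportion_approx_ok(n, p, minimum=10):
    """Normal approximation for a proportion: np >= 10 and n(1-p) >= 10."""
    return n * p >= minimum and n * (1 - p) >= minimum


def chi_square_ok(expected, minimum=5):
    """Chi-square rule: every expected cell frequency must be >= 5."""
    return all(cell >= minimum for row in expected for cell in row)


print(is_large_sample(10))                  # False
print(proportion_approx_ok(10, 0.05))       # False (np = 0.5)
print(proportion_approx_ok(1000, 0.05))     # True  (illustrative input)
print(chi_square_ok([[75, 25], [75, 25]]))  # True
print(chi_square_ok([[2, 2], [2, 2]]))      # False
```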
Step 3: Test Assumptions
  • Normality: Shapiro-Wilk test, Q-Q plots
  • Equal variance: Levene's test, F-test
  • Independence: Study design consideration
Step 4: Choose Test
Follow the tables below based on your data characteristics.
 

DISCRETE/CATEGORICAL DATA

| Scenario | Sample Size | Conditions | Test Method | Use Case |
| --- | --- | --- | --- | --- |
| Independence Test | Large | Expected freq ≥ 5 | Pearson's Chi-square | Gender vs Product preference |
| Independence Test | Small | Expected freq < 5 | Fisher's Exact Test | Drug effectiveness (small trial) |
| 2x2 Table | Any | Alternative to chi-square | Fisher's Exact Test | Always valid for 2x2 |

Pearson's Chi-square

"Is there a significant association between gender and product preference?"

Step 1: Create the observed contingency table
| | Like | Dislike | Total |
| --- | --- | --- | --- |
| Male | 60 | 40 | 100 |
| Female | 90 | 10 | 100 |
| Total | 150 | 50 | 200 |

Step 2: Calculate the expected frequencies
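Use the formula:
$$E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}}$$
Expected values:
  • Male–Like: (100 × 150) / 200 = 75
  • Male–Dislike: (100 × 50) / 200 = 25
  • Female–Like: (100 × 150) / 200 = 75
  • Female–Dislike: (100 × 50) / 200 = 25

| | Like | Dislike |
| --- | --- | --- |
| Male (E) | 75 | 25 |
| Female (E) | 75 | 25 |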
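The expected frequencies can be reproduced from the observed table alone. A minimal sketch in plain Python, where `expected_frequencies` is a hypothetical helper name:

```python
observed = [
    [60, 40],  # Male:   Like, Dislike
    [90, 10],  # Female: Like, Dislike
]


def expected_frequencies(table):
    """E[i][j] = (row total i * column total j) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
print(expected)  # [[75.0, 25.0], [75.0, 25.0]]
```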

Step 3: Compute the Chi-square statistic
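Use the formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where:
  • O: Observed value
  • E: Expected value
Calculate for each cell:
  • Male–Like: (60 − 75)² / 75 = 225 / 75 = 3.0
  • Male–Dislike: (40 − 25)² / 25 = 225 / 25 = 9.0
  • Female–Like: (90 − 75)² / 75 = 225 / 75 = 3.0
  • Female–Dislike: (10 − 25)² / 25 = 225 / 25 = 9.0
Total Chi-square statistic:
χ² = 3.0 + 9.0 + 3.0 + 9.0 = 24.0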
 

Step 4: Degrees of Freedom
df = (rows − 1) × (columns − 1) = (2 − 1) × (2 − 1) = 1

Step 5: Look up the p-value
With χ² = 24.0 and df = 1, we can look up the p-value using a chi-square distribution table or calculator.
  • The critical value for α = 0.05 and df = 1 is 3.841.
  • Since 24.0 > 3.841, the p-value is much less than 0.05 (actually < 0.0001).

There is a significant association between gender and product preference.
We reject the null hypothesis (which states they are independent).
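The whole calculation can be checked with a short script. This sketch uses only the standard library and relies on the fact that, for df = 1, the chi-square upper tail equals `erfc(sqrt(x / 2))`, so it is not a general chi-square routine. The function name is hypothetical.

```python
import math

observed = [[60, 40], [90, 10]]  # rows: Male, Female; columns: Like, Dislike


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (df = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency
            stat += (o - e) ** 2 / e
    # Valid for df = 1 only: P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


stat, p_value = chi_square_2x2(observed)
print(round(stat, 4))    # 24.0
print(p_value < 0.0001)  # True
```

If you use `scipy.stats.chi2_contingency` instead, note that it applies Yates' continuity correction to 2x2 tables by default; pass `correction=False` to reproduce the uncorrected statistic of 24.0.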

Fisher's Exact Test

"Is there a significant association between treatment (drug vs. placebo) and recovery (cured vs. not cured)?"

Step 1: Construct the contingency table
| | Cured | Not Cured | Total |
| --- | --- | --- | --- |
| Drug Group | 3 | 1 | 4 |
| Placebo Group | 1 | 3 | 4 |
| Total | 4 | 4 | 8 |
This is a 2x2 contingency table with very small counts → Chi-square test is not valid, so we use Fisher’s Exact Test.

Step 2: Understanding Fisher’s Exact Test
Fisher's Exact Test calculates the exact probability of observing a table as extreme or more extreme than the one observed, assuming the null hypothesis of independence.

Step 3: Fisher's Exact Test Formula
For a 2x2 table like this:
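| | Success | Failure | Total |
| --- | --- | --- | --- |
| Group A | a | b | a+b |
| Group B | c | d | c+d |
| Total | a+c | b+d | n |

The exact probability of observing that configuration is:
$$P = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}} = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$
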
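Apply to the data:

| | Cured (a) | Not Cured (b) |
| --- | --- | --- |
| Drug | 3 | 1 |
| Placebo | 1 | 3 |

So:
  • a = 3, b = 1, c = 1, d = 3
  • n = 8
Let's break this down numerically, using binomial coefficients C(n, k):
  • Numerator: C(4, 3) × C(4, 1) = 4 × 4 = 16
  • Denominator: C(8, 4) = 70
  • P(observed table) = 16 / 70 ≈ 0.2286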

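Step 4: Interpret the p-value
  • The probability of the observed table alone is 16/70 ≈ 0.2286. The p-value also counts the tables that are more extreme. In the same direction there is only one: 4 cured on the drug and 0 on placebo, with probability 1/70 ≈ 0.0143.
  • One-sided p = 17/70 ≈ 0.2429; two-sided p = 34/70 ≈ 0.4857
  • At α = 0.05, we fail to reject the null hypothesis either way.
Conclusion: There is not enough evidence to say the drug is more effective than the placebo in this small sample.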

  • Use Case: Small sample, 2x2 table
  • Why Fisher?: Expected cell values < 5
  • p-value: Exact, not approximate like in chi-square
  • Result: No significant association in this case
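For a table this small, the exact test can be enumerated by hand. The sketch below, in plain Python with hypothetical function names, lists every table that shares the observed margins and sums the relevant probabilities. The two-sided rule used here (sum all tables no more probable than the observed one) is one common convention and is assumed rather than taken from the text above.

```python
from math import comb


def table_probability(a, row1, row2, col1):
    """Hypergeometric probability of a 2x2 table with top-left cell a,
    given fixed row totals (row1, row2) and first-column total col1."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)


def fisher_exact_2x2(a, b, c, d):
    row1, row2, col1 = a + b, c + d, a + c
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probs = {k: table_probability(k, row1, row2, col1)
             for k in range(lowest, highest + 1)}
    p_observed = probs[a]
    # One-sided: tables with at least as many successes in group A
    p_greater = sum(p for k, p in probs.items() if k >= a)
    # Two-sided: tables no more probable than the observed one
    p_two_sided = sum(p for p in probs.values() if p <= p_observed * (1 + 1e-9))
    return p_observed, p_greater, p_two_sided


p_observed, p_greater, p_two_sided = fisher_exact_2x2(3, 1, 1, 3)
print(round(p_observed, 4))   # 0.2286
print(round(p_greater, 4))    # 0.2429
print(round(p_two_sided, 4))  # 0.4857
```

The first number is the probability of the single observed table; the other two are p-values, which is why they are larger.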

PROPORTION COMPARISON

| Scenario | Sample Size | Conditions | Test Method | Use Case |
| --- | --- | --- | --- | --- |
| Two Proportions | Large | np ≥ 10 & n(1-p) ≥ 10 | Z-test for Proportions | A/B testing |
| Two Proportions | Small | Conditions above not met | Binomial Test | Rare event testing |
| One Proportion | Any | Against known value | Binomial Test | Is coin fair? |

Z-test for Proportions

Research Question
"Is the 10% conversion rate of Page B significantly higher than the 8% of Page A, or could this difference be due to random chance?"

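Step 1: Define the observed values
  • Page A: n_A = 500 visitors, x_A = 40 conversions, p̂_A = 40/500 = 0.08
  • Page B: n_B = 500 visitors, x_B = 50 conversions, p̂_B = 50/500 = 0.10
(Sample sizes of 500 per page are the values consistent with the 8% and 10% rates and the Z-statistic reported in Step 5.)

Step 2: Check Z-test assumptions
  • Page A: n_A × p̂_A = 40 ≥ 10 and n_A × (1 − p̂_A) = 460 ≥ 10
  • Page B: n_B × p̂_B = 50 ≥ 10 and n_B × (1 − p̂_B) = 450 ≥ 10
→ All conditions satisfied, we can use the Z-test.

Step 3: Compute the pooled proportion
$$\hat{p} = \frac{x_A + x_B}{n_A + n_B} = \frac{40 + 50}{500 + 500} = 0.09$$

Step 4: Calculate the Z-statistic
Use the formula:
$$Z = \frac{\hat{p}_A - \hat{p}_B}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}}$$
Plug in the values:
$$Z = \frac{0.08 - 0.10}{\sqrt{0.09 \times 0.91 \times \left(\frac{1}{500} + \frac{1}{500}\right)}} = \frac{-0.02}{0.0181} \approx -1.105$$

Step 5: Find the p-value
  • Z ≈ -1.105
  • For a two-tailed test, we look up the p-value for Z = ±1.105
Using a Z-table or calculator:
  • p = 2 × (1 − Φ(1.105)) ≈ 2 × 0.1346 ≈ 0.27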

Step 6: Conclusion
  • p = 0.27 > 0.05, so we fail to reject the null hypothesis.
  • → There is no statistically significant difference between the two conversion rates at the 5% level.
Even though Page B has a higher conversion rate (10% vs. 8%), this difference is not statistically significant with the given sample size.
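The pooled two-proportion Z-test is short enough to compute directly. The sketch below uses only the standard library. The counts are an assumption: 40 of 500 conversions for Page A and 50 of 500 for Page B are the equal-sized samples that match the 8% and 10% rates and give Z ≈ -1.105. The function name is hypothetical.

```python
import math


def two_proportion_z_test(x_a, n_a, x_b, n_b):
    """Pooled two-proportion Z-test; returns (z, two-tailed p-value)."""
    p_a, p_b = x_a / n_a, x_b / n_b
    p_pool = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-tailed p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed counts (see lead-in): Page A 40/500, Page B 50/500
z, p_value = two_proportion_z_test(40, 500, 50, 500)
print(round(z, 3))        # -1.105
print(round(p_value, 2))  # 0.27
```

The sign of Z only reflects the order of subtraction (A minus B); the two-tailed p-value is the same either way.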

Binomial Test for Two Proportions (Small Sample)

A new product is tested by 10 users, and 1 person makes a purchase.
You want to test if the observed 10% (1/10) conversion rate is significantly higher than the expected 5% baseline.

Research Question:
Is the observed purchase rate (10%) significantly higher than a known or assumed baseline rate (5%)?
Strictly speaking, this is a one-proportion test: it compares one sample rate (10%) against a known rate (5%) rather than two samples against each other. With a sample this small, we use a binomial test instead of a Z-test.

Hypotheses:
  • Null (H₀): p = 0.05 (conversion rate is 5%)
  • Alternative (H₁): p > 0.05 (conversion rate is higher than 5%) → one-tailed

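Step 1: Binomial test formula
$$P(X \ge x) = \sum_{k=x}^{n} \binom{n}{k} p^k (1 - p)^{n-k}$$
  • n = 10 (trials)
  • x = 1 (success)
  • p = 0.05 (expected rate)
With x = 1 the sum simplifies to P(X ≥ 1) = 1 − P(X = 0) = 1 − 0.95¹⁰ ≈ 0.4013.
In practice, we use a binomial test calculator or Python.

Result:
  • P-value ≈ 0.4013
  • At α = 0.05 → Not significant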

The observed 10% conversion rate is not significantly greater than the expected 5%.
You fail to reject the null hypothesis.
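The one-tailed binomial p-value is the upper tail P(X ≥ 1) under p = 0.05, which equals 1 − 0.95¹⁰ ≈ 0.4013. A minimal standard-library sketch, with a hypothetical function name:

```python
from math import comb


def binomial_upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


p_value = binomial_upper_tail(1, 10, 0.05)
print(round(p_value, 4))  # 0.4013
```

With SciPy 1.7 or later, `scipy.stats.binomtest(1, 10, 0.05, alternative="greater")` should report the same tail probability.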

Binomial Test for One Proportion

You flip a coin 10 times and get 9 heads. You suspect the coin isn’t fair.

Research Question:
Is the observed result (9 heads out of 10) significantly different from what we'd expect with a fair coin (p = 0.5)?

Hypotheses:
  • H₀: p = 0.5 (coin is fair)
  • H₁: p ≠ 0.5 (coin is biased) → two-tailed

Result:
  • P-value = 2 × P(X ≥ 9) = 2 × (10 + 1) / 1024 ≈ 0.0215
  • < 0.05 → Significant

9 heads out of 10 is extreme enough to be statistically significant at the 5% level, so we reject the null hypothesis that the coin is fair.
For comparison, 8 heads out of 10 would give p ≈ 0.1094, which is not significant.
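The two-tailed binomial p-value can be computed by summing every outcome that is no more probable than the observed one. This is a standard-library sketch with a hypothetical function name. For 9 heads in 10 flips the tail sums to 22/1024 ≈ 0.0215, and for 8 heads it is 112/1024 ≈ 0.1094.

```python
from math import comb


def binomial_two_sided(x, n, p=0.5):
    """Two-sided exact binomial p-value: sum of all outcomes whose
    probability does not exceed that of the observed count x."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    cutoff = pmf[x] * (1 + 1e-9)  # small tolerance for float ties
    return sum(q for q in pmf if q <= cutoff)


print(round(binomial_two_sided(9, 10), 6))  # 0.021484
print(round(binomial_two_sided(8, 10), 6))  # 0.109375
```

One extra head moves the result across the 5% threshold, which shows how coarse exact tests are at n = 10.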
 

CONTINUOUS DATA

Large Sample (n ≥ 30)

| Variance Known? | Distribution | Test | Use Case |
| --- | --- | --- | --- |
| Known | Any | Z-test | Population std known |
| Unknown | Normal or Any | t-test | Most common |
| Unknown | Non-normal | Mann-Whitney U | Skewed data |

Small Sample (n < 30)

| Distribution | Variance | Test | Use Case |
| --- | --- | --- | --- |
| Normal | Known | Z-test | Rare in practice |
| Normal | Equal & Unknown | Student's t-test | Classic |
| Normal | Unequal & Unknown | Welch's t-test | Unequal variance |
| Non-normal | Any | Mann-Whitney U | Non-parametric |
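The two continuous-data tables can be read as a small decision function. This is a sketch of one possible encoding, not a definitive rule: the function name and parameters are hypothetical. For large samples the table allows a t-test on any distribution, so the sketch picks Mann-Whitney U only when the caller flags the data as non-normal.

```python
def choose_test(n, normal, variance_known=False, equal_variance=True):
    """Suggest a test for continuous data, following the tables above.

    n: sample size per group; normal: whether the data look normal.
    """
    large = n >= 30
    if not large and not normal:
        return "Mann-Whitney U"       # small sample, non-normal
    if variance_known:
        return "Z-test"               # population variance known
    if large:
        return "t-test" if normal else "Mann-Whitney U"
    # Small sample, normal, variance unknown
    return "Student's t-test" if equal_variance else "Welch's t-test"


print(choose_test(100, normal=True, variance_known=True))  # Z-test
print(choose_test(100, normal=False))                      # Mann-Whitney U
print(choose_test(12, normal=True))                        # Student's t-test
print(choose_test(12, normal=True, equal_variance=False))  # Welch's t-test
```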
 