Statistics - Hypothesis Testing in Data Science
Source: Dev.to
Hypothesis testing is a systematic procedure used in statistics and data science to decide whether sample data support a claim about a population.

Steps of Hypothesis Testing
Step 1 – State the Problem Clearly
Identify what you want to test.
Example question:
Is the average score of students equal to 70?
Step 2 – Formulate the Hypotheses
(a) Null Hypothesis (H₀)
- Assumes no change / no effect
- Always contains equality (=, ≤, ≥)
- Example: H₀: μ = 70
(b) Alternative Hypothesis (H₁)
- Complement of H₀; the claim we are looking for evidence to support
- Example: H₁: μ ≠ 70 (two‑tailed test)
Step 3 – Choose the Significance Level (α)
Probability of rejecting a true null hypothesis. Common values:
- α = 0.05 (5%)
- α = 0.01 (1%)
Meaning: with α = 0.05, there is a 5% risk of rejecting H₀ when it is actually true (a Type I error).
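To make α concrete, the sketch below converts it into the critical value for a two‑tailed Z‑test, using `NormalDist` from Python's standard library. The helper name `two_tailed_critical_z` is an illustrative assumption, not part of the original article.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha: float) -> float:
    """Return the z value that leaves alpha/2 in each tail of N(0, 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.05, 0.01):
    z_crit = two_tailed_critical_z(alpha)
    print(f"alpha = {alpha}: reject H0 if |Z| > {z_crit:.3f}")
```

A smaller α pushes the critical value further into the tails, so stronger evidence is needed before H₀ is rejected.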
Step 4 – Select the Appropriate Test
Choose based on sample size, data type, and knowledge of population variance.
| Situation | Test Used |
|---|---|
| Large sample, known variance | Z‑test |
| Small sample, unknown variance | t‑test |
| Categorical data | Chi‑square |
| More than two means | ANOVA |
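The table can be read as a small decision procedure. The following is only a rough sketch of that reading: the function `suggest_test`, its parameters, and the `n >= 30` cutoff for a "large" sample are assumptions made for illustration (the cutoff is a common rule of thumb, not something stated above).

```python
def suggest_test(data_type: str, n_groups: int = 1,
                 variance_known: bool = False, n: int = 30) -> str:
    """Map a situation from the table to a test name (hypothetical helper)."""
    if data_type == "categorical":
        return "Chi-square"
    if n_groups > 2:
        return "ANOVA"
    # Assumption: treat n >= 30 as a "large" sample.
    if variance_known and n >= 30:
        return "Z-test"
    # Fallback for cases the table does not list explicitly.
    return "t-test"


print(suggest_test("numeric", variance_known=True, n=40))
print(suggest_test("numeric", variance_known=False, n=12))
print(suggest_test("categorical"))
print(suggest_test("numeric", n_groups=3))
```

Real test selection also depends on the assumptions of each test, so a lookup like this is a starting point rather than a rule.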
Step 5 – Collect Sample Data
Gather data randomly from the population.
Example: Sample of 40 students’ scores.
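As a minimal sketch of random sampling, the snippet below draws 40 scores without replacement from a synthetic population. The population itself (1,000 scores with mean 72 and standard deviation 10) is made up purely for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of 1,000 student scores (synthetic data).
population = [random.gauss(72, 10) for _ in range(1000)]

# Simple random sample of 40 scores, drawn without replacement.
sample = random.sample(population, k=40)

print(len(sample))
```

Sampling without replacement gives every student the same chance of being selected, which is what most of the tests listed earlier assume.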
Step 6 – Compute the Test Statistic
Shows how far the sample result is from the assumed population value. Common statistics: Z, t, χ².
Formula example – Z‑test
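$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where x̄ is the sample mean, μ is the population mean assumed under H₀, σ is the known population standard deviation, and n is the sample size.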
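The Z formula translates directly into code. In this sketch the inputs (a sample mean of 73, σ = 10, n = 40) are hypothetical values chosen to match the running example of testing μ = 70; the function name `z_statistic` is likewise an assumption.

```python
from math import sqrt


def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """Z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Hypothetical inputs: sample mean 73, H0 mean 70, known sigma 10, n = 40.
z = z_statistic(sample_mean=73.0, mu0=70.0, sigma=10.0, n=40)
print(round(z, 3))  # 1.897
```

The result says the sample mean lies about 1.9 standard errors above the value assumed under H₀.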
Step 7 – Determine the p‑Value
p‑value = probability of observing a result at least as extreme as the sample result, assuming H₀ is true.
- Small p‑value: strong evidence against H₀
- Large p‑value: weak evidence against H₀
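For a Z statistic, a two‑tailed p‑value can be computed from the standard normal CDF. This is a sketch using only the standard library; `two_tailed_p_value` is an illustrative name, and the approach assumes the test statistic follows N(0, 1) under H₀.

```python
from statistics import NormalDist


def two_tailed_p_value(z: float) -> float:
    """Probability of a |Z| at least this large under a standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


print(round(two_tailed_p_value(1.96), 3))  # 0.05
print(round(two_tailed_p_value(0.5), 3))
```

Taking `abs(z)` and doubling the upper tail is what makes the test two‑tailed: deviations in either direction count as evidence against H₀.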
Step 8 – Make the Decision
| Decision Rule | Outcome |
|---|---|
| If p‑value ≤ α | Reject H₀ |
| If p‑value > α | Fail to reject H₀ |
Example: p‑value = 0.03, α = 0.05 → Reject H₀
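The decision rule is a single comparison. A minimal sketch, with `decide` as a hypothetical helper name:

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 when p-value <= alpha."""
    return "Reject H0" if p_value <= alpha else "Fail to reject H0"


print(decide(0.03, alpha=0.05))  # Reject H0
print(decide(0.03, alpha=0.01))  # Fail to reject H0
```

The same p‑value leads to different decisions under different α values, which is why α should be fixed before looking at the data.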
Step 9 – Draw a Statistical Conclusion
State the result in words, not symbols.
Example: “There is sufficient statistical evidence that the average score is different from 70.”
Step 10 – Interpret the Result in Context
Relate the conclusion to the real‑world problem.
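Example: If these scores came from a class taught with a particular method, the result suggests that average performance under that method differs from the benchmark of 70. The test alone does not show that the method caused the difference.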
Flow Summary
- Define the problem
- State H₀ and H₁
- Choose α
- Select the test
- Collect data
- Calculate test statistic
- Find p‑value
- Decision (Reject / Fail to reject H₀)
- Conclusion
- Real‑world interpretation
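The flow can be tied together in one short function. This is a sketch under stated assumptions, not a reference implementation: the scores are hypothetical, σ = 10 is assumed to be known, and the scores are assumed to be roughly normal so that a Z‑test is defensible with so few observations.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-tailed one-sample Z-test for H0: mu = mu0 with known sigma."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))      # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-tailed p-value
    return {"z": z, "p_value": p_value, "reject_h0": p_value <= alpha}


# Hypothetical scores; H0: mu = 70, assumed known sigma = 10.
scores = [68, 74, 71, 77, 69, 73, 75, 70, 72, 76]
result = one_sample_z_test(scores, mu0=70, sigma=10)

print(result)
if result["reject_h0"]:
    print("Sufficient evidence that the average score differs from 70.")
else:
    print("Not enough evidence that the average score differs from 70.")
```

With these made‑up numbers the sample mean is 72.5, which is less than one standard error from 70, so the test fails to reject H₀.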
Important Notes
- “Fail to reject H₀” ≠ “Accept H₀”
- Statistical significance ≠ Practical importance
- Always check the assumptions of the chosen test