[screen 1]
“Average income rose 5% last year!” Sounds good - but if billionaires got massive raises while most people saw no change, the “average” hides the real story.
Statistics summarize complex information, but they can also obscure important details. Understanding basic statistical concepts helps you evaluate data claims critically without needing advanced math.
[screen 2]
Mean, Median, and Mode
These three “averages” tell different stories:
Mean: Add all values and divide by count (the arithmetic average)
Median: The middle value when data is sorted
Mode: The most common value
Example: Incomes of 1, 2, 3, 4, 100
Mean: 22 (distorted by the outlier)
Median: 3 (the typical middle person)
Mode: none (all values different)
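A quick way to check this example is with Python’s built-in statistics module; a minimal sketch, using the hypothetical incomes above:

```python
from collections import Counter
from statistics import mean, median

incomes = [1, 2, 3, 4, 100]  # the hypothetical incomes from the example

print(mean(incomes))    # 22 - pulled far upward by the single outlier
print(median(incomes))  # 3  - the middle value once the list is sorted

# Mode: the most common value. Every income here appears exactly once,
# so there is no meaningful mode.
value, count = Counter(incomes).most_common(1)[0]
print(value if count > 1 else "no mode")
```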
When someone says “average,” ask which type - it matters.
[screen 3]
Why This Matters
The mean is easily skewed by extreme values. If nine people earn €30,000 and one earns €300,000, the mean is €57,000 - but it doesn’t represent anyone’s actual experience.
The median (€30,000) better represents the typical person.
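A short check of that arithmetic, using the same hypothetical salaries:

```python
from statistics import mean, median

salaries = [30_000] * 9 + [300_000]  # nine people at €30,000, one at €300,000

print(mean(salaries))    # 57000   - the mean, which matches nobody's experience
print(median(salaries))  # 30000.0 - the median, what the typical person earns
```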
Political and economic data often use mean when median would be more honest - or vice versa, depending on the desired narrative.
[screen 4]
Understanding Distributions
Data isn’t just about averages - how spread out are the values?
Normal distribution: Bell curve, most values near the middle
Skewed distribution: Data bunched on one side with long tail
Bimodal: Two common value ranges (e.g., incomes in highly unequal societies)
Outliers: Extreme values that don’t fit the pattern
The shape of a distribution reveals stories that averages hide.
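One way to see how shape changes the story is to simulate a roughly normal and a right-skewed dataset and compare mean and median; a sketch using only Python’s standard library (the distributions and parameters are arbitrary choices for illustration):

```python
import random
from statistics import mean, median

random.seed(0)

# Roughly normal: values cluster symmetrically around 50.
normal = [random.gauss(50, 10) for _ in range(10_000)]

# Right-skewed: lots of small values plus a long tail of large ones.
skewed = [random.expovariate(1 / 50) for _ in range(10_000)]

print(round(mean(normal), 1), round(median(normal), 1))  # roughly equal
print(round(mean(skewed), 1), round(median(skewed), 1))  # mean well above median
```

In the skewed case the long tail drags the mean above the median - which is exactly why a single “average” can mislead.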
[screen 5]
Standard Deviation and Variability
Standard deviation measures how spread out data is:
- Low standard deviation: Values cluster tightly around the mean
- High standard deviation: Values vary widely
Two cities might have the same average temperature, but one varies from -10°C to 40°C (high standard deviation) while the other stays 15-25°C (low standard deviation).
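The two-city comparison can be made concrete with made-up monthly temperatures that share the same mean but differ in spread:

```python
from statistics import mean, pstdev

# Hypothetical average monthly temperatures (°C); both cities average 20 °C.
continental = [-10, 0, 10, 20, 30, 40, 40, 35, 30, 25, 15, 5]
coastal = [15, 16, 17, 19, 21, 23, 25, 25, 23, 21, 19, 16]

for name, temps in [("continental", continental), ("coastal", coastal)]:
    print(name, round(mean(temps), 1), round(pstdev(temps), 1))
# continental: mean 20.0, standard deviation about 15.5
# coastal:     mean 20.0, standard deviation about 3.4
```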
Understanding variability prevents misleading comparisons based on averages alone.
[screen 6]
Probability Basics
Probability quantifies likelihood, from 0 (impossible) to 1 (certain). Key concepts:
Independent events: One doesn’t affect the other (coin flips)
Conditional probability: Likelihood given other information
Rare events: Low probability doesn’t mean impossible
Gambler’s fallacy: Believing that past random events affect future ones (a coin doesn’t “remember” previous flips)
Understanding probability helps evaluate risk claims and predictions.
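The gambler’s fallacy is easy to test with a simulation: flip a virtual coin many times and check what happens right after a streak of heads. A minimal sketch (the streak length and number of flips are arbitrary):

```python
import random

random.seed(1)
flips = [random.choice("HT") for _ in range(200_000)]

# After three heads in a row, is tails now "due"?
next_flips = [flips[i + 3] for i in range(len(flips) - 3)
              if flips[i:i + 3] == ["H", "H", "H"]]

print(round(next_flips.count("T") / len(next_flips), 3))  # ~0.5 - no memory
```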
[screen 7]
Sample Size Matters
Larger samples generally produce more reliable results:
- 3 people isn’t representative of a country
- 100 people gives a rough indication
- 1,000+ people enables meaningful conclusions
- 10,000+ people allows analysis of subgroups
But sampling method matters as much as size. A self-selected online poll of 10,000 people is less reliable than a random sample of 1,000.
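A simulation makes the point about sample size: draw repeated polls of different sizes from a population whose true support we set ourselves (52% here, an arbitrary choice), and watch how much the estimates bounce around.

```python
import random
from statistics import pstdev

random.seed(2)
TRUE_SUPPORT = 0.52  # the "true" value the polls are trying to estimate

def poll(n):
    """Share of n randomly sampled respondents who say yes."""
    return sum(random.random() < TRUE_SUPPORT for _ in range(n)) / n

for n in (100, 1_000, 10_000):
    estimates = [poll(n) for _ in range(200)]
    print(n, round(pstdev(estimates), 3))  # the spread shrinks as n grows
```

The spread falls roughly with the square root of the sample size - but none of this helps if the sample is self-selected rather than random.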
[screen 8]
Margins of Error
Surveys include margins of error - the expected variation if the survey were repeated:
“Candidate leads 52% to 48%, margin of error ±3%”
This means the leader’s true support could be anywhere from 49% to 55% and the opponent’s from 45% to 51% - the ranges overlap, so the race is actually too close to call.
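For a simple random sample, one common approximation for the 95% margin of error is 1.96 × √(p(1−p)/n). A sketch of the 52%-to-48% example (the sample size is an assumption, chosen because roughly 1,000 respondents gives about ±3%):

```python
from math import sqrt

n = 1_067                        # assumed number of respondents
p_lead, p_trail = 0.52, 0.48

moe = 1.96 * sqrt(0.5 * 0.5 / n)  # worst-case margin of error at p = 0.5
print(round(moe * 100, 1))        # ~3.0 percentage points

print(p_lead - moe, p_lead + moe)    # roughly 0.49 to 0.55
print(p_trail - moe, p_trail + moe)  # roughly 0.45 to 0.51 - the ranges overlap
```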
Many headlines ignore margins of error, treating small differences as meaningful when they’re within statistical noise.
[screen 9]
Statistical Significance Explained
“Statistically significant” means a result that large would be unlikely if random chance alone were at work - typically defined as less than a 5% probability of occurring by chance (p < 0.05).
But statistical significance ≠ practical importance:
- A drug might have “statistically significant” effects that are too small to matter clinically
- Large sample sizes can make tiny, meaningless differences “significant”
- Publication bias means significant results are far more likely to get published
Always ask: significant AND important, or just significant?
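To see how a large sample can make a trivial difference “significant”, here is a sketch of a two-sample test of proportions done by hand with the normal approximation (the 50.5% vs. 50.0% split and the sample sizes are invented for illustration):

```python
from math import erfc, sqrt

def p_value(success_a, n_a, success_b, n_b):
    """Two-sided p-value for the difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return erfc(abs(p_a - p_b) / se / sqrt(2))

# The same half-point difference, practically negligible either way:
print(p_value(505, 1_000, 500, 1_000))                   # ~0.82 - not significant
print(p_value(505_000, 1_000_000, 500_000, 1_000_000))   # far below 0.05 - "significant"
```

The effect did not get any bigger; only the sample did.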
[screen 10]
The Replication Crisis
Many published studies fail to replicate when repeated. Reasons include:
- Publication bias (positive results far more likely to be published)
- p-hacking (manipulating analysis until significance appears)
- Underpowered studies (too small to detect real effects)
- Researcher degrees of freedom (many analysis choices)
Single studies, especially with surprising results, should be treated as preliminary. Look for replication and meta-analyses.
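A simulation also shows where many false positives come from: run lots of “studies” in which nothing real is happening, and about 5% of them will still clear the p < 0.05 bar. A sketch (group sizes and the number of studies are arbitrary):

```python
import random
from math import erfc, sqrt

random.seed(3)

def fake_study(n=50):
    """Compare two groups drawn from the SAME distribution, so any
    'effect' found is pure chance. Returns a two-sided p-value."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    diff = sum(a) / n - sum(b) / n
    z = abs(diff) / sqrt(2 / n)  # both groups have a known spread of 1
    return erfc(z / sqrt(2))

p_values = [fake_study() for _ in range(1_000)]
print(sum(p < 0.05 for p in p_values))  # ~50 out of 1,000 - all spurious
```

If only those ~50 “significant” studies get written up and published, the literature fills with effects that do not exist - which is what publication bias and p-hacking produce.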
[screen 11]
Common Statistical Fallacies
Regression to the mean: Extreme values tend to move toward the average on retest (no special explanation required)
Texas sharpshooter: Finding patterns after collecting data (like drawing target around bullet holes)
Base rate neglect: Ignoring how common something is overall
Ecological fallacy: Assuming group statistics apply to individuals
Simpson’s Paradox: Trend reverses when data is disaggregated
Awareness of these patterns helps spot manipulation.
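Simpson’s Paradox from the list above is easiest to believe with numbers in front of you. A sketch with entirely made-up success counts for two treatments across easy and hard cases:

```python
# Hypothetical (successes, attempts) for each treatment and case type.
treatment_a = {"easy cases": (90, 100), "hard cases": (300, 1_000)}
treatment_b = {"easy cases": (800, 1_000), "hard cases": (20, 100)}

for group in ("easy cases", "hard cases"):
    rate_a = treatment_a[group][0] / treatment_a[group][1]
    rate_b = treatment_b[group][0] / treatment_b[group][1]
    print(group, f"A: {rate_a:.0%}", f"B: {rate_b:.0%}")  # A wins in both groups

total_a = sum(s for s, _ in treatment_a.values()) / sum(n for _, n in treatment_a.values())
total_b = sum(s for s, _ in treatment_b.values()) / sum(n for _, n in treatment_b.values())
print(f"overall A: {total_a:.0%}, B: {total_b:.0%}")      # ...yet B wins overall
```

The reversal happens because A handled mostly hard cases and B mostly easy ones - the mix of subgroups, not the treatment itself, drives the overall rates.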
[screen 12]
Evaluating Statistical Claims
When encountering statistics, ask:
- Which average is being used - mean, median, or mode?
- What’s the sample size and selection method?
- What’s the margin of error?
- What does the distribution look like?
- Is statistical significance meaningful in practical terms?
- Has this been replicated?
- What data is being omitted?
- Who benefits from this interpretation?
[screen 13]
Statistics as Rhetoric
Remember: Statistics are used rhetorically, not just scientifically. The same data can be presented to support opposite conclusions through:
- Choosing which average to report
- Selecting timeframes
- Using absolute vs. relative numbers
- Showing or hiding variability
- Cherry-picking data points
Critical statistical literacy isn’t about advanced math - it’s about asking which choices were made in presenting data and why.