Module 3 · Distributions
Two classes, the same average — and nothing alike.
Two school classes sit the same exam. Both have an average score of 70%. Sounds like the same class twice, doesn't it?
In the first class, almost everyone scored between 65 and 75 — a tight, predictable bunch. In the second, half the class scored above 90 and half below 50 — brilliant and struggling students, and almost no one near 70 at all.
Same average of 70%. Does the average tell you what the class looks like?
From one measurement to a shape
One measurement is a dot. But measure many — every patient's blood pressure, every student's score — and something appears: a shape.
Stack the measurements up by value, and you get a histogram — tall where lots of values pile up, short where few do. That shape, the pattern of how often each value occurs, is a distribution. It answers a richer question than any single number: not just "what's typical?" but "how do the values spread around typical?"
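If you like to see ideas as code, the stacking step can be sketched in a few lines of Python. This is only an illustration: the scores are invented, the bin width of 10 is an assumption, and `histogram` is a name made up for this sketch.

```python
from collections import Counter

# Hypothetical exam scores, invented purely for illustration.
scores = [52, 58, 61, 64, 66, 67, 68, 69, 70, 70, 71, 72, 73, 74, 76, 79, 83, 88]
BIN_WIDTH = 10  # assumed bin size; any width works

def histogram(values, bin_width=BIN_WIDTH):
    """Count how many values fall in each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

bins = histogram(scores)
for start, count in bins.items():
    # One '#' per score: long rows are where values pile up.
    print(f"{start}-{start + BIN_WIDTH - 1}: {'#' * count}")
```

The printout is a histogram turned on its side: the longest rows sit in the 60s and 70s, and the rows at either end are short.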
Read a distribution like a crowd photo: where are people bunched together, and where are the lonely outliers at the edges?
Why so often a bell?
Here's something strange and wonderful: measure enough natural things — heights, blood pressures, measurement errors — and the same shape keeps appearing. A symmetric hump, fat in the middle, thinning out evenly on both sides. The normal distribution — the bell curve.
Why this shape, again and again? Because when lots of small, independent influences add up, they mostly cancel one another out and only rarely all pull the same way. Most people land near average; extremes need many influences lining up at once, so they're rare. (Why this happens so reliably is a deeper result we'll meet next lesson. For now, just notice that it does.)
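You can watch this cancelling happen in a small simulation. The sketch below is a toy model, not a claim about any real measurement: it assumes each outcome is the sum of 20 independent nudges of −1 or +1, and the cut-offs of 4 and 12 are arbitrary choices for illustration.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the run is repeatable

def one_outcome(n_influences=20):
    """Add up many small independent nudges, each -1 or +1."""
    return sum(random.choice((-1, 1)) for _ in range(n_influences))

outcomes = [one_outcome() for _ in range(10_000)]
counts = Counter(outcomes)

# Totals close to zero mean the nudges mostly cancelled out.
near_middle = sum(c for value, c in counts.items() if abs(value) <= 4)
# Totals far from zero need most nudges to pull the same way.
extreme = sum(c for value, c in counts.items() if abs(value) >= 12)

print(f"within 4 of the centre: {near_middle / len(outcomes):.1%}")
print(f"12 or more from the centre: {extreme / len(outcomes):.1%}")
```

Most outcomes land near the centre, and only a small fraction end up far out: the bell shape in miniature.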
The bell matters because it's so common — and because, remarkably, just two numbers describe the whole thing.
The two numbers that say everything
A normal curve is pinned down completely by two values:
- The mean — where the centre sits. Slide it, and the whole bell shifts left or right along the scale.
- The standard deviation (SD) — how wide the bell is. A small SD makes a tall, narrow peak (everyone close to average); a large SD makes a low, wide spread (lots of variation between individuals).
That's it. Centre and width. Give me those two numbers and I can draw you the exact curve — every bump and tail of it. Let's prove it.
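Here is one way to check that claim in Python, using the standard library's `statistics.NormalDist`. The specific means and SDs (120, 130, 5, 15) are assumed values for illustration, and `curve_heights` is a helper name invented for this sketch.

```python
from statistics import NormalDist

def curve_heights(mean, sd, xs):
    """Height of the normal curve at each x, given only the mean and SD."""
    dist = NormalDist(mu=mean, sigma=sd)
    return [dist.pdf(x) for x in xs]

# Same centre, different widths (assumed illustrative values).
narrow = NormalDist(mu=120, sigma=5)
wide = NormalDist(mu=120, sigma=15)
# Same width as `wide`, centre moved 10 units to the right.
shifted = NormalDist(mu=130, sigma=15)

# A smaller SD gives a taller peak at the centre.
print(narrow.pdf(120) > wide.pdf(120))
# Moving the mean moves the peak but leaves its height unchanged.
print(wide.pdf(120), shifted.pdf(130))

# Three points on one curve: the middle is highest, the sides are equal.
print(curve_heights(120, 15, [105, 120, 135]))
```

Nothing beyond the two numbers was supplied, yet the height of the curve at any point is fully determined.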
Where the spread comes from
SD measures, on average, how far each patient sits from the mean. You build it in four moves:
Five readings: 117, 119, 121, 121, 122 mmHg.
1. Mean = 600 / 5 = 120
2. Distance from the mean for each: −3, −1, +1, +1, +2
3. Square each and add up (negatives can't cancel positives): 9 + 1 + 1 + 1 + 4 = 16
4. Divide by n − 1, then take the root: 16 / 4 = 4, and √4 = 2 mmHg
The middle quantity (4, before the square root) is the variance, and it is measured in squared units (mmHg²). The square root brings it back to the original units (mmHg), which is why we report SD, not variance.
Why n − 1 and not n? Because you have a sample, not everyone. Dividing by n − 1 corrects a slight tendency to underestimate the true spread. (Most stats packages do this for you by default, but check: some tools divide by n unless told otherwise.)
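The four moves translate almost line for line into Python. This is a minimal sketch using the five readings from the text; the variable names are made up for the example, and the last two lines use the standard library's `statistics.stdev` (divides by n − 1) and `statistics.pstdev` (divides by n) for comparison.

```python
import math
import statistics

readings = [117, 119, 121, 121, 122]  # mmHg, the five readings from the text

mean = sum(readings) / len(readings)               # move 1: the mean
distances = [x - mean for x in readings]           # move 2: distance from the mean
sum_of_squares = sum(d ** 2 for d in distances)    # move 3: square and add up
variance = sum_of_squares / (len(readings) - 1)    # move 4a: divide by n - 1
sd = math.sqrt(variance)                           # move 4b: take the root

print(mean, sum_of_squares, variance, sd)

# The standard library's sample SD also divides by n - 1.
print(statistics.stdev(readings))
# Dividing by n instead gives a slightly smaller number.
print(statistics.pstdev(readings))
```

The hand-built value and `statistics.stdev` agree, while `statistics.pstdev` comes out a little lower, which is the underestimate that n − 1 corrects.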
Your turn. Five readings with mean 30. Their squared distances from the mean are: 9, 9, 0, 9, 9.
Add the squared distances: ?
Divide by n − 1 = 4: ? (variance)
Take the square root: ?
Sum = 36, variance = 9, SD = 3. Three steps, one number out, and now "spread" isn't a vibe: it's a quantity you can compute.
Sculpt the curve
Below is the distribution of blood pressure in a population. You have two dials: the average (where the bell centres) and the spread (how wide it is). Move them and watch the whole shape obey.
Try this: slide the average and watch the bell glide sideways without changing shape. Then slide the spread and watch it stretch from a tall spike to a low, wide mound.
Notice what you just did: with only two dials, you can produce any normal curve there is. Nothing else is needed. That's the quiet power of the bell — an entire ocean of data, captured by a centre and a width.
The 68–95–99.7 rule
Now look at the shaded bands on your curve. They reveal the bell's most useful trick — a rule that turns "spread" into a ruler.
For any normal distribution:
- About 68% of values fall within 1 SD of the mean.
- About 95% fall within 2 SD.
- About 99.7% fall within 3 SD.
This is why the SD is so powerful: it's a unit of distance from the average. Being "1 SD out" is ordinary: about two-thirds of everyone is within that. Being "2 SD out" is uncommon: only about 1 in 20 sits further out. Being "3 SD out" is genuinely rare: fewer than 3 in 1,000.
Hold onto this. "How many SDs from the mean?" becomes the universal way to ask "how typical, or how surprising, is this value?" — and it's the seed of how we'll later decide whether a result is more than chance.
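The three percentages aren't magic numbers; they can be recomputed from the normal curve itself. A minimal sketch, assuming Python's `statistics.NormalDist`, with `share_within` as a helper name invented here:

```python
from statistics import NormalDist

def share_within(k, mean=0.0, sd=1.0):
    """Fraction of a normal distribution lying within k SDs of its mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")

# The shares do not depend on which mean and SD you pick.
print(abs(share_within(2, mean=120, sd=15) - share_within(2)) < 1e-9)
```

The results round to the 68, 95 and 99.7 of the rule, whatever the mean and SD happen to be.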
Turn the rule into a range
The 95% band is just mean ± 2 SD. With mean = 120 and SD = 15:
lower = 120 − 2 × 15, upper = 120 + 2 × 15.
Lower bound: 120 − 2 × 15 = ?
Upper bound: 120 + 2 × 15 = ?
About 95% of patients have a systolic BP between 90 and 150. You just turned two numbers into a range — exactly the move you'll repeat with SE to build a confidence interval next lesson.
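The same arithmetic as a tiny Python sketch, in case you want to reuse it with other numbers. The function name `central_range` is made up for this example, and the default of 2 SDs follows the rounded rule used in this lesson.

```python
def central_range(mean, sd, k=2):
    """Return (lower, upper) = mean minus and plus k SDs."""
    return mean - k * sd, mean + k * sd

lower, upper = central_range(120, 15)
print(lower, upper)  # 90 150
```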
Typical or extreme?
Let's use the ruler. For each measurement below, where does it fall on the bell — and how surprising is that? (Average = 120, SD = 15.)
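One way to apply the ruler in code is to convert each reading into "SDs from the mean" and see which band of the 68–95–99.7 rule it lands in. The sketch below is illustrative: the four readings are hypothetical (not taken from this lesson), and `sds_from_mean` and `band` are names invented for the example.

```python
MEAN, SD = 120, 15  # average and SD given in the lesson

def sds_from_mean(value, mean=MEAN, sd=SD):
    """Signed distance from the mean, measured in SDs."""
    return (value - mean) / sd

def band(value, mean=MEAN, sd=SD):
    """Name the narrowest band of the 68-95-99.7 rule that contains the value."""
    distance = abs(sds_from_mean(value, mean, sd))
    for k in (1, 2, 3):
        if distance <= k:
            return f"within {k} SD"
    return "beyond 3 SD"

# Hypothetical readings, chosen for this sketch (not from the lesson).
for reading in (125, 145, 155, 170):
    print(reading, round(sds_from_mean(reading), 2), band(reading))
```

A reading inside the first band is routine; one beyond the third band is the kind of value the rule says you should almost never see.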
When the bell doesn't fit
One honest warning, because it matters enormously in health economics: not everything is normal.
Some of the most important things in HTA are skewed — lopsided, with a long tail on one side. Healthcare costs are the classic case: most patients cost a little, but a few cost a fortune, stretching a long tail to the right. Length of hospital stay and survival time behave the same way.
[Figure: two curves side by side, "Symmetric" and "Right-skewed (e.g. costs)". Caption: When the shape is skewed, the average gets dragged by the tail and can mislead.]
Why care? Because when a distribution is skewed, the mean gets dragged toward the tail — a handful of hugely expensive patients pulls the "average cost" upward, until it no longer describes a typical patient at all. That's why, for skewed data, the median (the middle value) is often more honest than the mean — a distinction that returns when we measure survival, and when we cost out treatments.
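A handful of made-up numbers shows how strong the drag can be. The costs below are hypothetical, invented only to illustrate the effect: nine modest patients and one very expensive one.

```python
import statistics

# Hypothetical annual costs for ten patients, invented for illustration.
costs = [400, 450, 500, 500, 550, 600, 650, 700, 900, 45_000]

mean_cost = statistics.mean(costs)
median_cost = statistics.median(costs)
below_mean = sum(1 for c in costs if c < mean_cost)

print(f"mean:   {mean_cost}")
print(f"median: {median_cost}")
print(f"patients costing less than the mean: {below_mean} of {len(costs)}")
```

In this toy data set the single expensive patient pulls the mean far above what nine out of ten patients actually cost, while the median stays among the typical values.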
Always ask what shape the data is. A mean on skewed data can quietly mislead.
Why this matters for HTA
Distributions are the floor everything else stands on. Treatment effects, lab results, patient costs, survival times — all of them are distributions, not single numbers, and reading them means reading shape, centre, and spread together.
Three instincts you now carry onto your desk:
- A mean alone is half a story — always ask about the spread beside it.
- "How many SDs from average?" is how you judge whether any value is routine or remarkable.
- When data is skewed — costs, survival — the mean can deceive; look for the median.
Before you trust a summary number, picture the distribution behind it. The shape often tells you more than the number.
Distributions & the normal curve, in one breath
- Many measurements form a distribution — a shape showing where values pile up.
- The normal (bell) curve appears again and again in natural measurements and is described entirely by two numbers: the mean (centre) and SD (spread).
- SD isn't a vibe — it's the root of the average squared distance from the mean, a number you can compute.
- The 68–95–99.7 rule turns SD into a ruler: within 1 SD is typical, 2 SD uncommon, 3 SD rare.
- "How many SDs from the mean?" measures how surprising a value is — the seed of statistical significance.
- Not all data is normal: skewed quantities like costs and survival are often better summarised by the median.
Two numbers — a centre and a spread — can describe an entire ocean of data. Learn to see the shape behind the summary.
A quick heads-up for next lesson: the SD you just met describes how individual people vary. There's a closely-named cousin — the standard error — that describes something different: how much a study's estimate varies. Same word "standard," very different job. We'll separate them carefully next.
So far, "spread" has meant how individual people vary. Next, the subtle leap that all of statistics turns on: how much your estimate itself varies from study to study — the standard error — and why it shrinks as your sample grows.