Module 3 · Measures of Effect
The drug halves your risk. Are you impressed?
A pharmaceutical company sends you this headline: "New drug cuts the risk of serious events by 50%." The trial was well-run. The p-value is 0.001. Should you be impressed?
Your instinct might be yes — halving a risk sounds dramatic. But the number that matters isn't the ratio. It's the gap. And the gap depends entirely on what the original risk was.
A new drug reduces the risk of a serious event by 50%. Is that automatically a large, clinically important effect?
Four ways to describe one effect
Any trial comparing two groups produces four standard measures of effect — and every one of them answers a slightly different question. To read a study properly, you need to know what question each measure is asking.
Relative risk: the ratio of the event rate in the treatment group to the rate in the control group. How many times more (or less) likely is the event on treatment?
Risk difference: the control risk minus the treatment risk, expressed in percentage points. How many fewer (or more) events per 100 people?
Number needed to treat: how many patients must receive treatment to prevent one event? Derived from the risk difference expressed as a proportion: NNT = 1 ÷ RD.
Odds ratio: the ratio of the odds of an event in the treatment group to the odds in the control group. Standard in observational studies and meta-analyses; close to RR when events are rare.
All four come from the same raw data — a 2×2 table of events and non-events in each group. Let's build one.
The 2×2 table — all four measures live
Below is a trial of 2,000 patients — 1,000 on treatment, 1,000 on control. The default shows 10 events in the treatment group and 20 in the control group. Every cell is editable. Change any number and watch all four measures recalculate instantly.
That is the core tension: scale the events up tenfold, to 100 on treatment and 200 on control, and RR stays at 0.5 while NNT drops from 100 to 10. An RR of 0.5 means something very different when events are rare (NNT 100) than when they are common (NNT 10). The table has the same structure; the interpretation does not.
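As a rough sketch of what the table computes, the four measures can be derived from the four counts in a few lines of Python. The function name `effect_measures` and its argument names are illustrative choices, not part of the module, and RD is taken as control risk minus treatment risk to match the worked examples.

```python
def effect_measures(events_trt, n_trt, events_ctl, n_ctl):
    """Return RR, RD, NNT and OR from the counts of a 2x2 table."""
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl

    rr = risk_trt / risk_ctl
    rd = risk_ctl - risk_trt  # control minus treatment, as a proportion
    nnt = 1 / rd if rd != 0 else float("inf")

    # Odds are events divided by non-events within each group.
    odds_trt = events_trt / (n_trt - events_trt)
    odds_ctl = events_ctl / (n_ctl - events_ctl)
    odds_ratio = odds_trt / odds_ctl

    return {"RR": rr, "RD": rd, "NNT": nnt, "OR": odds_ratio}


# Default trial (rare events) versus the same trial with ten times the events.
rare = effect_measures(10, 1000, 20, 1000)
common = effect_measures(100, 1000, 200, 1000)

for label, m in (("rare", rare), ("common", common)):
    print(
        f"{label}: RR={m['RR']:.2f}  RD={m['RD'] * 100:.1f} pp  "
        f"NNT={m['NNT']:.0f}  OR={m['OR']:.3f}"
    )
```

Both scenarios share an RR of 0.50, but the NNT falls from 100 to 10 and the OR drifts away from the RR as events become common.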
Relative risk: the ratio of two risks
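The relative risk is the simplest ratio you can form from the table: RR = treatment risk ÷ control risk, where each risk is the number of events divided by the number of patients in that group.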
Our trial: Treatment risk = 10/1000 = 1.0%. Control risk = 20/1000 = 2.0%.
RR = 1.0% ÷ 2.0% = 0.50 — treatment halves the risk.
How to read an RR:
- RR = 1.0: identical event rate in both groups; no effect.
- RR < 1.0: the treatment group has a lower event rate; protective effect. RR = 0.50 means the event rate on treatment is half the control rate.
- RR > 1.0: the treatment group has a higher event rate; harmful effect. RR = 2.0 means the event rate on treatment is twice the control rate.
RR is a ratio — it says nothing about how common the events were in the first place. A "dramatic" RR of 0.10 on a risk of 0.001% is still an extremely small absolute benefit. That's what risk difference captures.
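To put a number on that example, the sketch below applies an RR of 0.10 to a control risk of 0.001% and reports the absolute benefit per million patients treated. The helper name `events_prevented_per_million` is hypothetical, and the calculation assumes the RR applies unchanged at that baseline.

```python
def events_prevented_per_million(control_risk, rr):
    """Absolute benefit implied by a relative risk at a given control risk."""
    # Treatment risk is control_risk * rr, so the gap is control_risk * (1 - rr).
    risk_difference = control_risk * (1 - rr)
    return risk_difference * 1_000_000


control_risk = 0.001 / 100  # 0.001% written as a proportion
prevented = events_prevented_per_million(control_risk, rr=0.10)
nnt = 1_000_000 / prevented

print(f"{prevented:.0f} events prevented per million treated")
print(f"NNT is roughly {nnt:,.0f}")
```

A 90% relative reduction here prevents about 9 events per million patients treated.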
Risk difference: the absolute gap
The risk difference (RD) is the raw subtraction — control risk minus treatment risk — and it grounds the relative ratio in something real: how many fewer events per 100 patients?
Our trial: Control risk = 2.0%. Treatment risk = 1.0%.
RD = 2.0% − 1.0% = 1 percentage point
Treat 100 patients and, on average, 1 extra event is prevented.
Your turn. Same trial, same numbers — work it out.
RD = 2% − 1% = ? (percentage points)
One percentage point: that's the honest answer. The RR said "50% reduction" and sounds large. The RD says "1 extra event prevented per 100 patients" and tells you the true scale. Neither is wrong — but for decision-making, for budget impact, for cost-effectiveness, the absolute difference is the one that matters.
NNT: how many to treat?
The number needed to treat (NNT) takes the risk difference one step further — it flips it upside down to answer the clinician's question: how many patients do I need to treat to prevent one event?
Our trial: RD = 1 percentage point = 0.01 as a proportion.
NNT = 1 ÷ 0.01 = 100
You must treat 100 patients to prevent 1 event. The other 99 receive treatment — along with any costs and side-effects — for no benefit.
Same numbers. Calculate the NNT.
NNT = 1 ÷ RD = 1 ÷ 0.01 = ?
NNT = 100. Now you can evaluate cost-effectiveness properly: if the treatment costs £500 per patient, you spend £50,000 to prevent one event. Whether that's worth it depends on the event you're preventing, but at least now you're asking the right question. Always insist on NNT alongside RR: both describe the same effect, but NNT is much harder to misread as dramatic when it isn't.
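The arithmetic behind that £50,000 figure is simply NNT multiplied by the cost per patient. A minimal sketch follows; the function name `cost_per_event_prevented` is illustrative, and the calculation deliberately ignores side-effect costs, discounting and anything else a full economic model would include.

```python
def cost_per_event_prevented(risk_difference, cost_per_patient):
    """Treatment spend needed to prevent one event: NNT x cost per patient."""
    if risk_difference <= 0:
        raise ValueError("risk_difference must be positive for a benefit")
    nnt = 1 / risk_difference
    return nnt * cost_per_patient


# Worked example from the text: RD = 0.01 (1 percentage point), £500 per patient.
rare_cost = cost_per_event_prevented(0.01, 500)

# Same price and RR, but common events: RD = 0.10 (10 percentage points).
common_cost = cost_per_event_prevented(0.10, 500)

print(f"Rare events:   £{rare_cost:,.0f} per event prevented")
print(f"Common events: £{common_cost:,.0f} per event prevented")
```

The same drug at the same price costs ten times more per event prevented when the baseline risk is ten times lower.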
Odds: a different way of counting
The odds ratio is ubiquitous in meta-analyses and observational research, but it's easy to misread as a relative risk. The confusion stems from what "odds" means — which is different from "risk".
Odds are events divided by non-events, so a risk of 10% is an odds of 10:90 ≈ 0.111. The OR is then the ratio of the treatment odds to the control odds: OR = treatment odds ÷ control odds.
Our trial: Treatment odds = 10/990 ≈ 0.0101. Control odds = 20/980 ≈ 0.0204.
OR = 0.0101 / 0.0204 ≈ 0.495
The RR was 0.50. The OR is 0.495 — almost identical, because events are rare.
As a rule of thumb, OR ≈ RR when events are rare (baseline risk below about 10%). As events become more common, the OR moves further from 1 than the RR, so reading it as an RR overstates the effect. This "rare disease assumption" is the most important practical caveat when reading an OR.
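The divergence is easy to see by holding the RR at 0.50 and raising the control risk. In this sketch the control risks of 2%, 20% and 60% are illustrative values (only the 2% case comes from the trial above), and the helper names are hypothetical.

```python
def odds(risk):
    """Convert a risk (a proportion between 0 and 1) into odds."""
    return risk / (1 - risk)


def odds_ratio_for(control_risk, rr):
    """Odds ratio implied by a given relative risk and control risk."""
    treatment_risk = control_risk * rr
    return odds(treatment_risk) / odds(control_risk)


results = {}
for control_risk in (0.02, 0.20, 0.60):
    results[control_risk] = odds_ratio_for(control_risk, rr=0.5)
    print(f"control risk {control_risk:.0%}: RR=0.50  OR={results[control_risk]:.3f}")
```

At a 2% control risk the OR is about 0.495; at 60% it is about 0.286, which would badly exaggerate the benefit if read as a relative risk.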
Relative vs absolute: the same result, two stories
This is where effect sizes become a rhetorical tool. The same trial result can be reported truthfully, but with very different emotional impact, depending on which measure leads the headline.
Manufacturers tend to lead with relative measures — they're larger and more impressive. Cost-effectiveness analyses, clinical guidelines, and HTA bodies require absolute measures because they reflect what actually happens to a realistic population.
The number that sounds most impressive is almost never the number that matters most for a decision. Whenever you see a relative measure, ask immediately: what was the absolute risk difference, and what is the NNT?
Why this matters for HTA
Measures of effect sit at the very centre of every appraisal dossier — they determine the treatment effect that flows into every economic model. Knowing how to read them is not optional.
- When a submission leads with RR or OR, convert it immediately. Calculate the RD and NNT using the baseline risk for the population in question — not the trial population, not the headline-chosen subgroup. The decision is about patients on the national formulary, not trial centres.
- Check which point estimate the model uses, and whether it is accompanied by a plausible confidence interval. A narrow CI around a large RD is a strong finding; a wide CI around a small RD may not support approval at any threshold.
- Watch for OR/RR conflation. When a meta-analysis pools ORs from observational studies and applies them to a population with common outcomes, the resulting absolute benefit is systematically overstated — a source of optimistic bias baked into the model before any analysis begins.
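The first and third checks can be sketched together: convert a reported OR into an RR at the baseline risk of the decision population, then derive RD and NNT. The conversion follows algebraically from the definition of odds and assumes the OR applies unchanged at that baseline. The pooled OR of 0.50 and the baseline risk of 30% are hypothetical numbers chosen for illustration, as are the function names.

```python
def rr_from_or(odds_ratio, baseline_risk):
    """Relative risk implied by an odds ratio at a given baseline risk."""
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)


def absolute_effect(rr, baseline_risk):
    """Risk difference (as a proportion) and NNT for an RR at a baseline risk."""
    rd = baseline_risk * (1 - rr)
    nnt = 1 / rd if rd > 0 else float("inf")
    return rd, nnt


pooled_or = 0.50  # hypothetical pooled odds ratio from a submission
baseline = 0.30   # hypothetical baseline risk in the decision population

# Mistake: treating the OR as if it were an RR.
naive_rd, naive_nnt = absolute_effect(pooled_or, baseline)

# Better: convert the OR to an RR at this baseline first.
rr = rr_from_or(pooled_or, baseline)
rd, nnt = absolute_effect(rr, baseline)

print(f"OR read as RR: RD={naive_rd * 100:.1f} pp, NNT={naive_nnt:.1f}")
print(f"OR converted:  RR={rr:.3f}, RD={rd * 100:.1f} pp, NNT={nnt:.1f}")
```

With these inputs, reading the OR as an RR gives a 15 percentage point benefit, while the converted RR of about 0.59 gives roughly 12.4 percentage points.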
A relative effect tells you the ratio. An absolute effect tells you the reality. The ICER divides the incremental cost by the incremental health effect, and that effect is driven by the absolute risk difference: get the measure wrong and the cost-effectiveness conclusion is wrong before you've written a single equation.
Measures of effect, in one breath
- The same trial result can be presented as RR, RD, NNT, or OR — all correct, all different in how large they make the effect appear.
- RR is a ratio: it says how many times more or less likely the event is on treatment. It says nothing about how common the event was.
- RD is the absolute gap in percentage points: the measure that drives real-world decision-making, budget impact, and cost-effectiveness.
- NNT = 1 ÷ RD: how many patients to treat to prevent one event. Hard to spin as large when the effect is small.
- OR ≈ RR when events are rare; diverges — and overstates the effect — as events become common. Know the baseline risk before trusting an OR.
- In HTA appraisal, always insist on absolute measures. The model is built on absolute risk differences, and the decision is made on cost per event prevented.
Every trial reports an effect. Your job is to ask which measure they chose, whether it's the most flattering one, and what the NNT looks like once you do the subtraction yourself.
You now have the full statistical toolkit for appraising a trial: you can judge variation and uncertainty, read a confidence interval, understand what a p-value does and doesn't say, spot multiplicity traps, and convert any effect size into the measure that actually matters. The next step is to take that evidence and ask: how does it survive when we move outside the controlled trial? The next module turns to systematic reviews — how to pool evidence, when that pooling makes sense, and what the forest plot is really telling you.