M6 · MEASURING HEALTH OUTCOMES
Two treatments. One budget. Completely different kinds of good.
Treatment A extends life: it gives a patient two extra years they wouldn't otherwise have. But those years are spent seriously ill — in pain, dependent, low quality.
Treatment B doesn't extend life at all. It changes nothing about how long the patient lives. What it does is transform the years they already have — from years of suffering into years of decent, functional health.
You can fund one. Which does more good?
Notice you can't answer — not because you lack information, but because A and B produce different currencies of benefit. A produces time. B produces quality. There's no obvious exchange rate between "two more years, badly" and "the same years, much better." Every health system faces this comparison constantly — a cancer drug that adds months against a therapy that eases a chronic condition — and to make it, you need something that reduces both kinds of good to a single number.
That number is the QALY, and it does something quietly astonishing: it declares an exchange rate between length of life and quality of life. This lesson is about how it works, why it made modern health economics possible, and why that exchange rate is the most contested idea in the field.
Every treatment touches two things: how long you live, and how well. A measure of health has to capture both.
Think about what medicine actually changes. Some interventions add time — they postpone death: a cancer drug, a transplant, a statin preventing a fatal heart attack. Others add quality — they don't extend life but make it better: a hip replacement, an antidepressant, pain relief, cataract surgery. Most do some of each.
Now watch what happens if your measure of health captures only one dimension. Measure only length of life, and a hip replacement — which adds no years — scores zero benefit, which is absurd. Measure only quality, and a drug that keeps someone alive for five extra years registers as nothing unless it also improves their day-to-day state. Either one-eyed measure produces nonsense for half of all medicine.
So a serious measure of health must fuse both dimensions: it has to say how much quality over how much time. The QALY is exactly that fusion — and the whole trick is how you combine two things measured in completely different units. You do it by turning quality into a number.
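The failure of the two one-eyed measures can be sketched in a few lines. This is only an illustration: the `Intervention` class, the two scoring functions, and every number below are made up for the example and are not clinical estimates.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    name: str
    extra_years: float     # life-years added by the intervention
    quality_change: float  # change in day-to-day quality on a 0-to-1 scale


# Hypothetical values chosen only to mirror the two examples in the text.
hip = Intervention("hip replacement", extra_years=0.0, quality_change=0.25)
drug = Intervention("life-extending drug", extra_years=5.0, quality_change=0.0)


def length_only_score(x: Intervention) -> float:
    """A measure that sees only added time."""
    return x.extra_years


def quality_only_score(x: Intervention) -> float:
    """A measure that sees only the change in quality."""
    return x.quality_change


for x in (hip, drug):
    print(x.name, length_only_score(x), quality_only_score(x))
```

Each measure assigns zero benefit to one of the two interventions, which is the nonsense a fused measure has to avoid.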
To combine quality with time, quality needs a scale. That scale is called utility.
The QALY measures quality of life on a scale anchored at two points:
- 1.0 = full health. A year lived in perfect health.
- 0 = death.
A health state in between gets a fraction. A state of, say, moderate disability with some pain might sit at 0.5 — meaning, roughly, that a year lived in that state is valued as half as good as a year in full health. This number is called a utility (or a health-state value).
Two things to flag now and settle later. First: where does a specific number like 0.5 come from? It's not read off an instrument — it comes from asking people to value health states, and how you ask changes the answer. That's the entire next lesson; for now, take the utility as given. Second, and stranger: the scale doesn't stop at 0. Some health states are rated worse than death — utilities below zero — because, offered the choice, people would rather die than endure them. Hold that thought; it matters later.
For now, the one thing you need: quality of life is expressed as a utility between 0 and 1 (occasionally below 0), where 1 is full health and 0 is death.
Now the fusion. A QALY is just time multiplied by the utility of that time: QALYs = years × utility.
That's the whole engine. Run it:
- One year in full health: 1 year × 1.0 = 1 QALY.
- One year at utility 0.5: 1 year × 0.5 = 0.5 QALYs. The year still passes, but it counts as half a year of full health.
- Four years at utility 0.5: 4 years × 0.5 = 2 QALYs.
- Two years in full health: 2 years × 1.0 = 2 QALYs.
Look at those last two. Four years at half-quality, and two years at full quality, both come to 2 QALYs — the QALY says they're equivalent. That's the exchange rate from the hook, made concrete.
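The four cases above can be run as a minimal sketch. The function name `qalys` is an assumption made for this example, not a standard library call.

```python
def qalys(years: float, utility: float) -> float:
    """QALYs for a period spent at one constant utility."""
    return years * utility


print(qalys(1, 1.0))  # 1.0  one year in full health
print(qalys(1, 0.5))  # 0.5  one year at utility 0.5
print(qalys(4, 0.5))  # 2.0  four years at utility 0.5
print(qalys(2, 1.0))  # 2.0  two years in full health
```

The last two calls return the same value, which is the exchange rate in action.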
Geometrically, this is beautifully simple. Put time on the horizontal axis and utility on the vertical. A person's health over their life traces a line, and the QALYs are the area underneath it — quality (height) accumulated over time (width). More area, more health. Every treatment is, in the end, an attempt to enlarge that area — and there are exactly two ways to do it.
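In symbols, the area can be written as follows. The notation is an assumption introduced here: $u_i$ is the utility of period $i$, $t_i$ its length in years, $u(t)$ the utility at time $t$, and $T$ the total time lived. The sum and the integral agree when utility is constant within each period.

```latex
\mathrm{QALYs} \;=\; \sum_{i} u_i \, t_i \;=\; \int_{0}^{T} u(t)\, dt
```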
The area under the curve.
Two treatments, two completely different shapes — and the QALY says they're worth exactly the same.
Below is a patient's health over time: utility on the vertical axis, years on the horizontal. Untreated, they live at utility 0.5 for 4 years, then die — the shaded area is their QALYs: 0.5 × 4 = 2 QALYs. Now apply each treatment and watch the area grow.
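[Interactive chart: utility on the vertical axis, years on the horizontal axis, with "Extend life" and "Improve quality" options. Untreated: utility 0.5 for 4 years = 2.0 QALYs.]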
Sit with this. "Extend life" stretches the curve rightward: four more years at the same modest quality, so 8 years × 0.5 = 4 QALYs. "Improve quality" lifts the curve upward: the same four years, but lived at full health, so 4 years × 1.0 = 4 QALYs. Two utterly different clinical stories, one a life-prolonging drug, the other a life-transforming one, and they produce the identical gain over the untreated 2 QALYs: +2 QALYs each. The QALY looks at a wider-but-lower rectangle and a taller-but-narrower one, computes the same area, and declares them equal. That equation, in which time and quality trade off one-for-one, is what lets a health system compare a cancer drug against a hip replacement on a single scale. It's the QALY's superpower. It's also, as we'll see, exactly what people argue about.
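The two rectangles can be checked with a short sketch. A profile is written here as a list of (years, utility) segments, which is a representation chosen for this example. The treated profiles are assumptions inferred from the +2 QALY gain: four extra years at utility 0.5 for "Extend life", and the same four years at utility 1.0 for "Improve quality".

```python
def total_qalys(profile):
    """Area under a stepwise utility curve given as (years, utility) segments."""
    return sum(years * utility for years, utility in profile)


untreated = [(4, 0.5)]
extend_life = [(8, 0.5)]      # assumed: 4 extra years at the same utility
improve_quality = [(4, 1.0)]  # assumed: the same 4 years at full health

base = total_qalys(untreated)
print(base)                                 # 2.0
print(total_qalys(extend_life) - base)      # 2.0
print(total_qalys(improve_quality) - base)  # 2.0
```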
Compute QALYs from a health profile.
Real patients move through several health states. You just add up the pieces.
A patient's life rarely sits at one utility. Take this profile: they live 3 years at utility 0.6, then 2 years at utility 0.8, then die. Total QALYs = each period's time × its utility, summed.
3 years at utility 0.6, then 2 years at utility 0.8: (3 × 0.6) + (2 × 0.8) = ?
3.4 QALYs. You just computed the area under a two-step curve: 1.8 from the first stretch, 1.6 from the second.
Treatment lifts the last 2 years to utility 1.0: new total (3 × 0.6) + (2 × 1.0) = 3.8. Incremental QALYs = 3.8 − 3.4 = ?
0.4 incremental QALYs, the extra health the treatment produces. And this is exactly the number that goes in the denominator of the ICER you met in Module 5: cost per QALY = extra cost ÷ extra QALYs. This is where QALYs come from before they're ever divided into a cost. The whole economic edifice rests on this small multiplication.
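The same profile arithmetic, carried through to a cost per QALY, might look like the sketch below. The helper `total_qalys` and the `extra_cost` figure are hypothetical: the lesson gives no cost, so 8,000 is a placeholder in an unspecified currency. Results are rounded because binary floating point does not store 0.6 or 0.8 exactly.

```python
def total_qalys(profile):
    """Sum of years x utility over (years, utility) segments."""
    return sum(years * utility for years, utility in profile)


without_treatment = [(3, 0.6), (2, 0.8)]
with_treatment = [(3, 0.6), (2, 1.0)]  # last 2 years lifted to utility 1.0

q_without = total_qalys(without_treatment)
q_with = total_qalys(with_treatment)
incremental_qalys = q_with - q_without

extra_cost = 8000.0  # hypothetical extra cost, not from the lesson
cost_per_qaly = extra_cost / incremental_qalys

print(round(q_without, 2))          # 3.4
print(round(q_with, 2))             # 3.8
print(round(incremental_qalys, 2))  # 0.4
print(round(cost_per_qaly))         # 20000
```

Note how a small denominator magnifies the ratio: 0.4 QALYs turns a modest extra cost into a much larger cost per QALY.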
The QALY's one-for-one trade between time and quality feels like arithmetic. It's actually an assumption — a strong one.
When the QALY says "2 years at utility 0.5 = 1 year at utility 1.0," it's assuming that quality and time multiply linearly and independently: that a year is a year is a year, that each unit of quality is worth the same regardless of how many years you have, and that whose years they are doesn't enter the sum. Each of these is a genuine assumption about human value, not a mathematical necessity.
Consider just one crack. Is 10 years at utility 0.5 really worth the same as 5 years at full health? Many people say no — they'd take fewer years lived well over more years lived poorly, or vice versa, in ways the clean multiplication doesn't capture. The QALY imposes a constant exchange rate where real preferences might curve. It's a workable simplification — but a simplification, chosen because it makes health comparable across every disease, not because it's how humans actually weigh their lives.
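Written out, the assumption is that the value of a health profile is linear in both time and utility. The symbol $V(t, u)$ is notation introduced here for "the QALY value of $t$ years at utility $u$"; it does not come from the lesson.

```latex
V(t, u) = u \cdot t
\quad\Longrightarrow\quad
V(10,\ 0.5) = 0.5 \times 10 = 5 = 1.0 \times 5 = V(5,\ 1.0)
```

Under this form, halving quality and doubling time always leaves the value unchanged. A preference that "curves" is one for which that equality fails.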
When the QALY treats "2 years at utility 0.5" as exactly equal to "1 year at full health," what is it fundamentally doing?
The same equation that makes the QALY powerful makes it, to many, unjust. Three serious objections you have to hold honestly.
- It can disadvantage the already-unwell. Utility is capped at 1.0, so a person who can be restored to full health has more "room" to gain than a person with a permanent underlying condition who can never reach 1.0. Give the same excellent treatment to both, and the otherwise-healthy person can register more QALYs gained, which can make treatments for chronically ill or disabled people look systematically less cost-effective. Critics call this a built-in discrimination; it's a real structural feature, not a bug you can code away.
- Some states are worse than death. Because 0 is death, not the floor, states people rate below death get negative utilities. This is philosophically and emotionally heavy — the QALY is quietly asserting that some ways of being alive are worse than not being alive — and it has real consequences for how end-of-life and severe-disability treatments score.
- It's blind to distribution. A QALY is a QALY whoever gets it: one QALY each for 100 people equals 100 QALYs for one person. The QALY sums health without asking how it's spread — whether it lands on the worst-off or the best-off, on one desperate case or a hundred mild ones. Pure QALY-maximisation can, in principle, ignore fairness entirely.
None of these is a reason to discard the QALY — but every one is a reason to use it with your eyes open. An assessor who quotes a cost-per-QALY without knowing what the QALY quietly assumes is quoting a number they don't fully understand.
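Each of the three objections above shows up directly in the arithmetic. The sketch below uses invented numbers throughout: the utility ceilings, the negative utility of -0.25, and the group sizes are illustrative assumptions, not values from any valuation study.

```python
def qalys(years: float, utility: float) -> float:
    """QALYs for a period spent at one constant utility."""
    return years * utility


# 1. Less room to gain (hypothetical numbers).
# Both patients spend 4 years at utility 0.5 untreated. Treatment lifts each
# to their own ceiling: 1.0 for one patient, 0.75 for a patient with a
# permanent underlying condition.
years = 4
gain_full_ceiling = qalys(years, 1.0) - qalys(years, 0.5)
gain_lower_ceiling = qalys(years, 0.75) - qalys(years, 0.5)
print(gain_full_ceiling, gain_lower_ceiling)  # 2.0 1.0

# 2. Worse than death: a negative utility makes time in that state
# subtract from the total (the -0.25 is an assumed value).
print(qalys(2, -0.25))  # -0.5

# 3. Blind to distribution: the sum is the same however gains are spread.
gains_spread = [1.0] * 100                  # 1 QALY each for 100 people
gains_concentrated = [100.0] + [0.0] * 99   # 100 QALYs for one person
print(sum(gains_spread) == sum(gains_concentrated))  # True
```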
Given all that, why is the QALY the near-universal currency of HTA? Because every alternative is worse for the one job that matters.
The job is comparability across diseases. A health system must choose between a cancer drug, a hip replacement, a mental-health service, and a diabetes programme — all at once, from one budget. To do that you need a single unit of "health produced" that means the same thing everywhere. Look at the options:
- Natural units (life-years, mmHg, events avoided) — you met these in Module 5. Precise within one disease, mute across diseases. mmHg can't be compared to a depression-free month.
- Life-years alone — captures time but ignores quality, so it's blind to everything palliative and quality-improving.
- DALYs (disability-adjusted life years) — a close cousin used heavily in global health, built to measure burden rather than to compare treatments, with its own value assumptions baked in differently. (You'll meet DALYs specifically later in this module.)
The QALY, for all its flaws, is the only widely-usable unit that combines length and quality into one number comparable across every condition. That's why cost-utility analysis — cost per QALY — is the reference-case method for NICE, AOTMiT, and most agencies: not because the QALY is beyond criticism, but because it's the only tool that lets a payer allocate a fixed budget across the whole of medicine. It's the worst measure of health, except for all the others.
The other chair
Reading a submission: a cost-per-QALY is never just a number; it's built on utility values and a health-state model you can interrogate. Ask where the QALY gain actually comes from: is it length of life or quality, and does the split match the clinical reality? A large QALY gain resting entirely on an optimistic quality improvement in a late, uncertain health state is softer than one built on survival. Watch for the disadvantaged-baseline effect in conditions where patients can't reach full health, and be alert when a small utility change is multiplied across many years into a large, load-bearing QALY gain.
Building one: show your working (the health states, their utilities, the time in each) so the QALY isn't a black box. If your treatment's value is in quality rather than years, present it as such rather than dressing it as survival. Where your patients start from a low baseline, anticipate the "less room to gain" effect and address it head-on. A transparent QALY that an assessor can rebuild from your health-state model is far more persuasive than a headline figure they have to take on trust.
Same skill from both chairs — reading a QALY not as a fact but as an area under a curve: a stack of assumptions about quality, time, and how they trade, every one of which can be examined.
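One way to ask "where does the gain come from?" is to split an incremental QALY figure into a quality part and a survival part. The sketch below is a single possible convention, not a standard method: it assumes one constant utility in each arm, credits the utility change over the comparator's survival to quality, and credits the extra years (valued at the treated utility) to survival. The function name and the example numbers are hypothetical.

```python
def decompose_gain(years_before, utility_before, years_after, utility_after):
    """Split a QALY gain into (quality_part, survival_part).

    Convention assumed here:
      quality_part  = utility change applied over the comparator's years
      survival_part = extra years valued at the treated utility
    The two parts always add up to the total incremental QALYs.
    """
    quality_part = (utility_after - utility_before) * years_before
    survival_part = utility_after * (years_after - years_before)
    return quality_part, survival_part


# Hypothetical case: 4 years at utility 0.5 becomes 6 years at utility 0.75.
quality_part, survival_part = decompose_gain(4, 0.5, 6, 0.75)
total_gain = 6 * 0.75 - 4 * 0.5

print(quality_part, survival_part)  # 1.0 1.5
print(total_gain)                   # 2.5
```

A different convention, such as valuing the extra years at the comparator's utility, would shift some of the gain between the two parts, so any such split should state the rule it uses.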
Why this matters for HTA
When it lands on your desk: the QALY is the unit almost every reimbursement decision is ultimately denominated in. Cost-effectiveness, the threshold, the whole apparatus of Module 5 — all of it runs on QALYs. Understanding what a QALY is — and quietly assumes — is understanding the currency the entire decision is priced in.
- You read a QALY gain as a structure, not a scalar. Every QALY figure decomposes into health states, utilities, and durations. Knowing that lets you find where the gain really lives — and whether it rests on solid survival data or a fragile assumption about future quality of life.
- You keep the construct's limits in view when you interpret the result. A cost-per-QALY that looks unfavourable for a severely ill population may partly reflect the QALY's built-in baseline effect, not the treatment's true worth — a reason some systems add severity or end-of-life modifiers on top of raw QALYs.
- You remember the QALY sums but doesn't distribute. Cost-effectiveness maximises total health; it doesn't, by itself, ask who gets it. Equity considerations — who the worst-off are, whether a rare severe condition deserves weight a mild common one doesn't — sit alongside the QALY, not inside it. The number informs the decision; it was never designed to be the whole of it.
The QALY is the most powerful idea in health economics and the most quietly value-laden. Use it — but never mistake it for a thermometer.
The QALY, in one breath.
- Health has two dimensions — how long and how well — and a real measure must fuse both. The QALY does: QALYs = time × utility.
- Utility scores quality of life: 1.0 = full health, 0 = death, with some states rated below zero (worse than death).
- Geometrically, QALYs are the area under the quality-time curve — and you can grow that area two ways: add years (extend) or lift quality (improve).
- The QALY treats those two as equivalent — a constant one-for-one trade between length and quality. That equation is its power (universal comparability across diseases) and a value assumption, not a fact.
- It carries real controversies: it can disadvantage those who can't reach full health, it admits states worse than death, and it's blind to distribution — a QALY counts the same whoever receives it.
- It won anyway because it's the only practical common currency of health across all diseases — which is why cost-per-QALY is HTA's reference case, flaws and all.
The QALY doesn't measure health. It measures how much health, in a currency we agreed to invent.
You now know what a QALY is and why everything in health economics is priced in it. But we've been quietly taking one thing for granted the whole time: the utility number. We wrote "0.5" as if it fell from the sky. It didn't — someone had to decide that a particular state of illness is worth half a year of full health. How do you extract that number from human beings? Do you ask patients or the public? What question do you even pose? That's the machinery behind every QALY — measuring utilities — and it's where we go next.