M6 · MEASURING HEALTH OUTCOMES

One patient. One death. Two wildly different numbers — and both are correct.

A 70-year-old dies of a chronic disease they'd had for a decade. Two analysts look at exactly the same life.

The reimbursement economist asks: how much health did the available treatment add? Given where this patient actually was — already ill, limited life expectancy — the treatment bought maybe a few quality-adjusted years. A modest QALY gain. On that basis, an expensive drug might not clear the threshold.

The WHO epidemiologist asks a different question: how much health did this disease destroy? Measured against a full, healthy lifespan this person might have had, the disease stole a decade of life plus years of disability before it. A large DALY burden. On that basis, this is a serious public-health problem demanding investment.

Same patient. Same disease. Same death. One number says "modest," the other says "enormous" — and neither is wrong. They're answers to opposite questions, counted from different starting points. This lesson is about the DALY: what it measures, how it mirrors the QALY, and — crucially — the three deep ways it doesn't, each of which hides a value judgement you need to see.

The QALY looks up. The DALY looks down. That single reversal changes everything downstream.

You've spent two lessons on the QALY, which measures health gained: start from where a patient is, apply a treatment, count the extra quality-adjusted years it produces. More QALYs, more good. It looks upward — from the patient's actual situation toward a better one.

The DALY — disability-adjusted life year — measures health lost: start from an ideal of full health and full lifespan, and count how far a disease drags a person below it. More DALYs, more damage. It looks downward — from an ideal toward the patient's diminished reality.

This reverses the direction of "good." A high QALY gain is a success; a high DALY count is a catastrophe. An effective intervention raises QALYs but lowers DALYs — it claws back health that disease would otherwise have taken. So "cost per QALY gained" and "cost per DALY averted" point the same way (both reward good interventions), but the underlying quantities are mirror images: one is health added, the other health rescued from loss.

It's tempting to stop there and conclude they're just the same thing with the sign flipped. They're not — and the differences are where the real content lives.

A DALY has two parts: years lost to early death, plus years lost to living unwell.

Disease takes health in two ways, and the DALY adds them up:

Years of Life Lost (YLL) — the toll of premature death. If a standard life expectancy is 80 and the person dies at 70, that's 10 years of life lost. Death before your time is pure lost years.
Years Lived with Disability (YLD) — the toll of living in poor health. Take the years spent with the condition and weight them by how bad the condition is: a disability weight from 0 (full health) to 1 (equivalent to death). Ten years lived at a disability weight of 0.4 counts as 10 × 0.4 = 4 years of healthy life lost.

Add them: DALY = YLL + YLD. Our patient — 10 years of disease at weight 0.4, then death at 70 against a standard of 80 — carries 4 (YLD) + 10 (YLL) = 14 DALYs of burden from this disease.
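The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than the Global Burden of Disease method: the function names are hypothetical, and the flat standard of 80 is this lesson's simplifying assumption.

```python
def years_of_life_lost(age_at_death, standard_life_expectancy=80.0):
    """YLL against a single flat standard (the lesson's simplification)."""
    # A death at or beyond the standard contributes no lost years.
    return max(standard_life_expectancy - age_at_death, 0.0)


def years_lived_with_disability(years_with_condition, disability_weight):
    """YLD: years in the condition, weighted from 0 (full health) to 1 (death)."""
    if not 0.0 <= disability_weight <= 1.0:
        raise ValueError("disability weight must lie between 0 and 1")
    return years_with_condition * disability_weight


def daly(age_at_death, years_with_condition, disability_weight,
         standard_life_expectancy=80.0):
    """DALY = YLL + YLD."""
    yll = years_of_life_lost(age_at_death, standard_life_expectancy)
    yld = years_lived_with_disability(years_with_condition, disability_weight)
    return yll + yld


# The lesson's patient: 10 years at weight 0.4, death at 70, standard 80.
patient_daly = daly(age_at_death=70, years_with_condition=10,
                    disability_weight=0.4)
print(patient_daly)  # 14.0
```

Keeping YLL and YLD as separate functions mirrors the two-part definition, so either half can be inspected or swapped out on its own.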

Notice the shape already differs from a QALY. The YLL isn't measured against what this patient would realistically have lived — it's measured against a standard. That choice of standard is the first of three deep differences, and it's doing more work than it looks.

Same person, two zeros.

Watch the QALY and the DALY count the same life — from different starting points — and refuse to mirror each other.

Here's our patient's life on a timeline: born at 0, disease onset at 60, death at 70, against a standard life expectancy of 80. Two counters run side by side.

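[Interactive timeline with slider. Age axis from 0 to 80+, marking disease onset (60), age at death (adjustable, shown at 70) and standard life expectancy (80). Readout at the shown setting: QALY gained (by treatment) 5.0; DALY (vs. standard 80) 14.0.]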

Same death. Two numbers. Because they count from different zeros.

Drag the slider and watch the gap. When the patient had lots of realistic life left, both metrics agree the loss is big. But push "age at death" up toward where they'd have died anyway, and the QALY gain from treating them shrinks toward nothing — while the DALY stays substantial, because it's measured against the standard 80, not against this person's real prognosis. The QALY counts up from where the patient actually is. The DALY counts down from where the standard says they should have been. Those two zeros are not the same point — and the space between them is a judgement about how much life a person was owed.

Compute a DALY.

Put the two parts together yourself.

Our patient: lives with the disease for 10 years at a disability weight of 0.4, then dies at 70, against a standard life expectancy of 80.

Years lived with disability: 10 years of disease × disability weight 0.4 = ?
Years of life lost: standard life expectancy 80 − age at death 70 = ?
DALY = YLD + YLL = 4 + 10 = ?

The YLL isn't measured against this patient's real life expectancy. It's measured against an ideal standard — and that bakes in a judgement.

When we computed YLL as 80 − 70, the 80 wasn't this patient's realistic prognosis. It was a standard life expectancy — an aspirational reference for how long a human could live in full health. The Global Burden of Disease study uses a single high standard for everyone (built from the lowest observed mortality worldwide), precisely so that a life lost is valued the same regardless of where the person lived.

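This is a profound choice, and it has consequences. Because YLL counts against a fixed high standard, the death of a young person generates far more DALYs than the death of an old one: with the flat standard of 80 used in this lesson, a 20-year-old's death is 60 years lost and an 80-year-old's is zero. (The published Global Burden of Disease reference is a life table giving remaining life expectancy at each age, so late deaths still count for something there, but the gradient runs the same way.) The DALY thus builds in a view that early death is a greater loss: defensible, but a value judgement, not a neutral fact.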
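To make the age gradient concrete, here is a small sketch that computes YLL for several ages at death. The flat standard of 80 and the names used are assumptions carried over from this lesson's simplified example.

```python
STANDARD_LIFE_EXPECTANCY = 80  # assumed flat standard, as in this lesson


def yll(age_at_death, standard=STANDARD_LIFE_EXPECTANCY):
    """Years of life lost against a flat standard; never negative."""
    return max(standard - age_at_death, 0)


ages_at_death = [20, 50, 70, 75, 80]
yll_by_age = {age: yll(age) for age in ages_at_death}

for age, lost in yll_by_age.items():
    print(f"death at {age}: {lost} years of life lost")
```

The disease is the same on every row; only the age at death changes, and that alone drives the result.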

And it used to go further. The original DALY applied explicit age-weighting — valuing years of young and middle adulthood more than childhood or old age, on the theory that those years are socially most productive. That was a naked value judgement, widely criticised, and the Global Burden of Disease study removed it in 2010. The lesson isn't that DALYs are broken — it's that a metric measuring loss against an ideal has already decided what the ideal is, and those decisions were, for years, hiding in plain sight.

Under the DALY, why does a 20-year-old's death generate more DALYs than a 75-year-old's from the same disease?

A disability weight looks like "1 minus a utility." It isn't — and the difference matters.

Both the QALY and the DALY put a number on how bad a health state is. It's natural to assume they're the same number flipped: utility 0.6 ↔ disability weight 0.4. Sometimes they're close. But they come from different processes, and they systematically diverge.

A utility (QALY-world) is a preference, extracted by making people trade — years in Time Trade-Off, risk in Standard Gamble. It answers: what would you sacrifice to avoid this state?

A disability weight (DALY-world) has historically come from a different question, put to different people. Rather than a personal trade-off, the Global Burden of Disease derives weights from large surveys asking respondents to judge which of two health states represents greater loss of health — paired comparisons about health itself, deliberately stripped of considerations like income, care, or adaptation. It answers: how much health does this state destroy? — not what would you personally give up?

Because the questions differ, disability weight ≠ 1 − utility. A state might carry a utility of 0.6 (in a Time Trade-Off, people would give up four of ten remaining years to escape it) yet a disability weight that doesn't equal 0.4. So even the quality halves of QALYs and DALYs, the parts that look most interchangeable, aren't. Converting one to the other by subtraction is a common, quiet error.
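A minimal sketch of how the subtraction shortcut can mislead. The utility of 0.6 comes from the example above; the disability weight of 0.3 is a purely hypothetical value chosen for illustration, not a published weight.

```python
def naive_disability_weight(utility):
    """The shortcut this lesson warns against: assume weight = 1 - utility."""
    return 1.0 - utility


years_in_state = 10
utility = 0.6                      # from the example in the text
surveyed_disability_weight = 0.3   # HYPOTHETICAL, for illustration only

# Healthy years lost under each weight.
yld_naive = years_in_state * naive_disability_weight(utility)
yld_surveyed = years_in_state * surveyed_disability_weight

# Difference introduced purely by the conversion, not by the disease.
conversion_gap = yld_naive - yld_surveyed
print(round(yld_naive, 2), round(yld_surveyed, 2), round(conversion_gap, 2))
```

With these made-up inputs the shortcut overstates the burden by a full healthy year per patient; with other weights it could just as easily understate it.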

The QALY and the DALY were built for different jobs. Using one for the other's job is where people go wrong.

This is the difference that ties the other two together. The two metrics were designed to answer different questions:

The DALY was built for the Global Burden of Disease project — a vast effort to measure how much illness there is in a population and where. Its job is burden and prioritisation: which diseases, in which regions, destroy the most health? It answers "where is the fire biggest?" It's the native currency of the WHO, global health funders, and epidemiology — and it's built to be comparable across countries and decades, which is exactly why it uses a universal standard rather than local prognoses.
The QALY was built for cost-effectiveness analysis — comparing specific interventions to allocate a specific budget. Its job is allocation: given this money, which treatment buys the most health? It answers "where should the next pound go?" It's the native currency of HTA agencies deciding reimbursement.

That's why you meet DALYs in global-health and infectious-disease contexts, in low- and middle-income settings, in vaccination and public-health programmes — and QALYs in NICE and AOTMiT reimbursement dossiers. Neither is "better." They're tools shaped to different questions, and the skill is knowing which question is actually being asked.

You're a reimbursement person — QALY country. So why must you understand DALYs? Because they're increasingly on your desk, and mixing the two is a real error.

DALYs turn up in HTA more than they used to: infectious-disease technologies, vaccines, therapies aimed at low- and middle-income populations, and burden-of-disease arguments manufacturers use to frame a condition as serious. When they appear, three traps await:

Don't convert 1:1. A submission that computes DALYs averted and then treats them as if they were QALYs gained is making the error this whole lesson has been dismantling — different reference, different weights, different meaning. One DALY averted is not interchangeably one QALY gained.
The threshold is different too. Cost-effectiveness against DALYs uses its own benchmarks — historically the WHO-CHOICE rule of thumb (an intervention costing under 1× GDP per capita per DALY averted was "highly cost-effective," under 3× "cost-effective"). That rule has been heavily criticised for having no real opportunity-cost basis — but the point stands: a cost-per-DALY-averted threshold is not a cost-per-QALY threshold, and you can't judge one against the other's yardstick.
Know which currency you're in. The single most important move is simply to identify whether a submission's outcomes are QALYs or DALYs, and to hold every downstream number — the ratio, the threshold, the verdict — to the matching framework. The metrics are close cousins, but they are not the same, and an appraisal that blurs them is quietly comparing incompatible things.
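The historical rule of thumb in the second trap can be written out as below. This is a sketch of the benchmark as described above, with hypothetical function names and made-up inputs. It is not an endorsement of that rule, and it applies only to costs per DALY averted, never to costs per QALY.

```python
def cost_per_daly_averted(incremental_cost, dalys_averted):
    """Extra cost divided by burden averted; needs a positive health effect."""
    if dalys_averted <= 0:
        raise ValueError("dalys_averted must be positive")
    return incremental_cost / dalys_averted


def who_choice_label(cost_per_daly, gdp_per_capita):
    """Historical GDP-based rule of thumb, as described in the text."""
    if cost_per_daly < gdp_per_capita:
        return "highly cost-effective"
    if cost_per_daly < 3 * gdp_per_capita:
        return "cost-effective"
    # Anything at or above 3x GDP per capita fails the rule of thumb.
    return "not cost-effective"


# Made-up inputs for illustration only (cost and GDP in the same currency).
ratio = cost_per_daly_averted(incremental_cost=1_200_000, dalys_averted=400)
label = who_choice_label(ratio, gdp_per_capita=2_000)
print(ratio, label)
```

Tying the label to GDP per capita makes the verdict depend on the country, which is one reason a cost-per-QALY threshold cannot simply be substituted.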

The other chair

Reading a submission: first, identify the currency (QALYs or DALYs) before you read a single ratio, because everything downstream must match it. If DALYs appear, check they aren't being silently converted to QALYs or judged against a QALY threshold. Interrogate the standard used for YLL and the source of the disability weights; a burden case built on an aspirational standard can make a condition look more devastating than a local prognosis would. And treat a DALY-averted figure with its own threshold logic, not NICE's £20–30k.

Building one: if your value genuinely lives in disease burden (a vaccine, an infectious disease, a neglected condition), a DALY framing may be legitimate and powerful, but present it as a DALY case, with its own threshold, not smuggled in as QALY-equivalent. Be transparent about the standard life expectancy and the disability weights you've used, since both carry assumptions an assessor will probe. Where the reference case wants QALYs, give QALYs, and use DALYs, if at all, as a clearly-labelled supporting burden argument, never as a substitute.

Same skill from both chairs — knowing that "health lost" and "health gained" are different questions with different machinery, and never letting a number from one framework be judged by the rules of the other.

Why this matters for HTA

When it lands on your desk: most reimbursement work is denominated in QALYs — but the DALY is the currency of a whole adjacent world (global health, burden of disease, infectious and neglected conditions) that increasingly overlaps with HTA. Reading both, and never confusing them, is now part of the toolkit.

You identify the metric before you trust the ratio. QALY gained and DALY averted look similar and behave differently. The first thing to establish about any economic outcome is which of the two it is — because the threshold, the interpretation, and the comparators all follow from that.
You refuse the 1:1 conversion. A submission that translates DALYs to QALYs (or back) by simple equivalence is making a category error: the reference points, the weights, and the purposes differ. Where a conversion is unavoidable, it demands explicit, defended assumptions, not a silent swap.
You read the built-in judgements. The DALY's normative life-expectancy standard, its disability weights, its history of age-weighting — each is a value choice, not a neutral measurement. Just as with the QALY, understanding what the metric quietly assumes is understanding what the "objective" number is really claiming.

The QALY and the DALY are two answers to the question "how much does health matter here" — one counting what we can add, the other what disease has taken. An assessor's job is to know which question is on the table, and to keep the answer honest to it.

The DALY, in one breath.

The DALY measures health lost to disease — the mirror of the QALY's health gained. Low DALY = good; an effective intervention lowers it.
DALY = YLL + YLD: years of life lost to premature death, plus years lived with disability weighted by a disability weight (0 = full health, 1 = death).
It counts from a different zero. YLL is measured against a normative standard life expectancy, not the patient's real prognosis — so the QALY (counting up from the patient's actual state) and the DALY (counting down from an ideal) don't mirror each other 1:1.
That standard builds in judgements: early death weighs more, and the original DALY even added explicit age-weighting (removed in 2010).
Disability weight ≠ 1 − utility. They come from different questions (health-loss judgements vs preference trade-offs), so even the quality halves don't convert cleanly.
Different tools for different jobs: DALYs for measuring burden and prioritising (global health, GBD, WHO); QALYs for comparing interventions and allocating a budget (HTA reimbursement). Don't convert one to the other, or judge one by the other's threshold.

Same health, opposite question. Choose the metric and you've chosen what you're really asking.

You've now completed the toolkit for measuring the effect — the health side of every cost-effectiveness ratio. You can tell a real endpoint from a surrogate, build a QALY, trace where its utilities come from, and read the DALY that mirrors it. But a measured effect and a measured cost still have to be combined into a decision — and that combination has machinery of its own: the incremental cost-effectiveness ratio (the ICER), the threshold, the plane on which every technology is plotted. Assembling the effect and the cost into a verdict is where the course goes next.