M9 · UNCERTAINTY

One threshold was a choice.

Last lesson ended on a single, satisfying number: 85% of the PSA cloud sat below the £30,000 threshold, so the technology had an 85% probability of being cost-effective. Clean. But look hard at where that number came from — it depended entirely on drawing the threshold line at £30,000.

And we already know, from Module 7, that £30,000 isn't a law of nature. The threshold is a contested, movable policy choice — £20,000 to £30,000 for two decades, £25,000 to £35,000 since April 2026, and arguably nearer £13,000 on opportunity-cost grounds. Pick a different threshold and that 85% changes. So computing the probability at one threshold quietly smuggles in a decision that isn't ours to fix.

The fix is obvious once you say it out loud: don't pick one threshold. Compute the probability at every threshold, and plot the result. That plot is the cost-effectiveness acceptability curve — the CEAC.

Sweep the threshold.

Go back to the cloud. Each point is one simulated run — a ΔEffect and a ΔCost. The threshold line runs through the origin, and its steepness is the threshold: a £10,000 threshold is a shallow line, a £50,000 threshold a steep one. A point counts as cost-effective if it sits below the line — if its own ICER is under the threshold.
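The "below the line" test can be written as a single inequality. As a sketch, write ΔE_i and ΔC_i for the ΔEffect and ΔCost of run i and λ for the threshold (the subscripted notation is an assumption; the lesson itself uses words):

```latex
\[
\Delta C_i < \lambda \, \Delta E_i
\quad\Longleftrightarrow\quad
\lambda \, \Delta E_i - \Delta C_i > 0 ,
\]
\[
\text{and, when } \Delta E_i > 0:\qquad
\frac{\Delta C_i}{\Delta E_i} < \lambda .
\]
```

The ratio form, "ICER under the threshold", is only safe for runs that gain health; the first form works anywhere on the plane.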

Now, instead of fixing that line, sweep it. Start it almost flat (a very low threshold) and only a sliver of the cloud sits beneath it — few runs are cost-effective when you'll pay almost nothing per QALY. Tilt it steeper (raise the threshold) and the line sweeps upward across the cloud, catching more and more points beneath it. At each angle — each threshold λ — you pause and count: what fraction of the cloud is now below the line?

One threshold gives you one fraction. A £10,000 threshold might catch 8% of the cloud; £30,000 catches 85%; £50,000 catches 97%. Sweep smoothly through every threshold and you get a fraction for each one.
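The counting step can be sketched in a few lines of Python. The function name and the five-run cloud below are hypothetical, chosen only so the arithmetic is easy to check by hand; they are not the course's data.

```python
def fraction_cost_effective(cloud, threshold):
    """Share of PSA runs lying below the threshold line.

    cloud: list of (delta_effect, delta_cost) pairs, one per simulated run.
    threshold: willingness to pay per QALY, i.e. the slope of the line.
    """
    below = sum(1 for d_effect, d_cost in cloud if d_cost < threshold * d_effect)
    return below / len(cloud)


# Hypothetical five-run cloud: (QALYs gained, extra cost in pounds).
toy_cloud = [(0.50, 4_000), (0.40, 10_000), (0.30, 8_000), (0.20, 7_000), (0.10, 6_500)]

for lam in (10_000, 30_000, 50_000):
    print(lam, fraction_cost_effective(toy_cloud, lam))
```

On this toy cloud the three thresholds catch 0.2, 0.6 and 0.8 of the runs: one fraction per threshold, rising as the line steepens.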

Plot the curve.

Now just plot those fractions. Put the threshold (λ) on the horizontal axis and the probability of being cost-effective on the vertical axis, running 0 to 1. Each threshold contributes one point — its fraction of the cloud — and joining them traces a curve that climbs from bottom-left to top-right. That's the cost-effectiveness acceptability curve.
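Building the whole curve is the same count repeated over a grid of thresholds. A minimal sketch, assuming a hypothetical `ceac_curve` helper and a made-up five-run cloud (a real PSA would have thousands of runs and a finer grid):

```python
def ceac_curve(cloud, thresholds):
    """Return (threshold, probability cost-effective) pairs, one per threshold."""
    curve = []
    for lam in thresholds:
        # Count the runs that sit below the line with slope lam.
        below = sum(1 for d_effect, d_cost in cloud if d_cost < lam * d_effect)
        curve.append((lam, below / len(cloud)))
    return curve


# Hypothetical cloud of (delta_effect, delta_cost) pairs; illustrative only.
toy_cloud = [(0.50, 4_000), (0.40, 10_000), (0.30, 8_000), (0.20, 7_000), (0.10, 6_500)]
thresholds = range(0, 70_001, 10_000)  # x-axis grid: 0 to 70,000 per QALY

for lam, prob in ceac_curve(toy_cloud, thresholds):
    print(f"{lam:>6}  {prob:.2f}")
```

Each printed row is one point on the curve: the threshold is its x-coordinate and the probability its height.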

Why does it (almost always) rise? Because for any run that gains health, raising the threshold can only ever move it onto the cost-effective side: a steeper line never un-catches a point a shallower one already caught. So as λ increases, the fraction below the line can only grow; the rare exceptions come from runs that lose health, which a higher threshold counts against. At a very low threshold almost nothing qualifies; at a very high one almost everything does; and the interesting action, where the probability climbs through the middle, is exactly the range of thresholds a real decision-maker cares about.
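One way to see both the rise and the "almost" is to write each run's incremental net benefit. The symbols below (INB_i, ΔE_i, ΔC_i) are assumed notation for this sketch, not the lesson's:

```latex
\[
\mathrm{INB}_i(\lambda) = \lambda \, \Delta E_i - \Delta C_i ,
\qquad
\frac{\partial \, \mathrm{INB}_i}{\partial \lambda} = \Delta E_i .
\]
```

A run counts as cost-effective when INB_i(λ) > 0. If ΔE_i > 0, its net benefit only grows as λ rises, so it can switch from not cost-effective to cost-effective but never back. A run with ΔE_i < 0 moves the other way, which is why a cloud that strays into health-losing territory can produce a curve that dips.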

The single number from last lesson was one point on this curve. The CEAC is that calculation done everywhere at once — so that whatever threshold the decision-maker lands on, the answer is already read off.

Build the CEAC from the cloud.

On the left is the PSA cloud from last lesson. On the right, an empty axis. Drag the threshold slider: watch the line sweep across the cloud on the left, and — for each threshold — a point drop onto the right, tracing the CEAC. The two views are the same calculation, seen two ways.

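[Interactive figure: two linked panels, the cost-effectiveness plane (left) and the cost-effectiveness acceptability curve (right), driven by a threshold slider λ, shown here at £30,000. The readout gives the share of the cloud that is cost-effective at the chosen threshold, which is the CEAC's height at that threshold.]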

The CEAC isn't a new analysis; it's the same cloud, interrogated at every threshold instead of one. Notice where it crosses 0.50: that's the threshold at which the technology becomes more likely than not to be cost-effective, and its line cuts through the middle of the cloud, with half the runs on either side. And notice that the number you're actually handing a decision-maker isn't "yes" or "no"; it's "at your threshold, here's how likely."

Reading the curve.

A CEAC repays a few specific readings:

Read off the decision threshold. Find the current threshold on the x-axis, go up to the curve, read the probability. "At £30,000, an 85% chance of being cost-effective." That's the headline the curve exists to deliver.
Note where it crosses 0.50. That's the threshold at which being cost-effective becomes more likely than not. It corresponds roughly to the median of the ICER distribution and, because the model is non-linear, it need not equal the base-case ICER from the deterministic run. A curve crossing 0.50 far above the relevant threshold is a warning.
Read its steepness. A curve that rockets from 0.2 to 0.9 across a narrow band of thresholds means the verdict is exquisitely sensitive to exactly where the threshold is set — precisely the range where the Module 7 threshold debate becomes decisive. A gently sloping curve means the threshold choice barely matters.
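The second and third readings can be approximated from a handful of points on the curve. This is a sketch under assumptions: the helper names are hypothetical, the crossing is found by straight-line interpolation between grid points, and the value at £20,000 is invented for illustration (the other three echo the figures quoted earlier).

```python
def crossing_threshold(curve, level=0.5):
    """Linearly interpolate the first threshold at which the curve rises through `level`.

    curve: list of (threshold, probability) pairs sorted by threshold.
    Returns None if the curve never rises through the level on this grid.
    """
    for (lam0, p0), (lam1, p1) in zip(curve, curve[1:]):
        if p0 < level <= p1:
            return lam0 + (level - p0) * (lam1 - lam0) / (p1 - p0)
    return None


def steepest_segment(curve):
    """Return (lam0, lam1, slope) for the grid segment with the largest rise per pound."""
    segments = (
        (lam0, lam1, (p1 - p0) / (lam1 - lam0))
        for (lam0, p0), (lam1, p1) in zip(curve, curve[1:])
    )
    return max(segments, key=lambda seg: seg[2])


# CEAC heights at a few thresholds; the 0.40 at 20,000 is a made-up value.
toy_curve = [(10_000, 0.08), (20_000, 0.40), (30_000, 0.85), (50_000, 0.97)]

print(crossing_threshold(toy_curve))  # threshold where the curve passes 0.50
print(steepest_segment(toy_curve))    # where the verdict is most threshold-sensitive
```

On these numbers the curve passes 0.50 a little above £22,000 and climbs fastest between £20,000 and £30,000, so the verdict is most sensitive to the threshold exactly where the policy debate sits.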

One reading to resist: the CEAC does not have a "pass mark." There's no rule that a technology must clear 0.5, or 0.95, to be funded — that's the next screen's trap.

Now you.

A PSA produces 4,000 simulated runs. At a threshold of £25,000 per QALY, 2,400 of those runs fall on the cost-effective side of the line.

What is the height of the CEAC at £25,000, that is, the probability of being cost-effective at that threshold? (Enter it as a decimal, e.g. 0.35.)

What the y-axis does not mean.

This is the screen that separates people who can read a CEAC from people who think they can. The vertical axis is a probability of being cost-effective — and three tempting misreadings all get it wrong.

It is not the probability the drug works, or is clinically better. The y-axis folds together effectiveness, cost, and the threshold. A drug that certainly improves survival can have a low CEAC because it's expensive; a modest drug can have a high CEAC because it's cheap. "Cost-effective" is not "effective."

It is not the size of the benefit. Height is certainty, not magnitude. Consider two drugs at a £30,000 threshold. Drug A adds a tiny 0.05 QALYs but with very little uncertainty — its CEAC reads 0.99. Drug B adds a large 0.60 QALYs but with wide uncertainty — its CEAC reads 0.62. A is more likely to be cost-effective; B delivers far more health. Read only the CEAC and you'd rank A above B, which is exactly backwards for a decision about health gained.
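The gap between height and magnitude can be made concrete with two toy clouds. Everything here is hypothetical: the lesson gives no costs for Drug A or Drug B, so the five runs per drug below are invented to roughly mirror the story (A small and near-certain, B averaging 0.60 QALYs with wide spread).

```python
def summarise(cloud, threshold):
    """Return (probability cost-effective, mean incremental net benefit) for a cloud."""
    net_benefits = [threshold * d_effect - d_cost for d_effect, d_cost in cloud]
    prob = sum(nb > 0 for nb in net_benefits) / len(net_benefits)
    mean_nb = sum(net_benefits) / len(net_benefits)
    return prob, mean_nb


# Hypothetical runs: (QALYs gained, extra cost in pounds) versus standard care.
drug_a = [(0.05, 1_000), (0.05, 1_100), (0.04, 900), (0.06, 1_200), (0.05, 1_000)]
drug_b = [(0.90, 12_000), (0.80, 12_000), (0.70, 12_000), (0.35, 12_000), (0.25, 12_000)]

for name, cloud in (("Drug A", drug_a), ("Drug B", drug_b)):
    prob, mean_nb = summarise(cloud, 30_000)
    print(f"{name}: probability {prob:.2f}, mean net benefit {mean_nb:,.0f}")
```

On these made-up numbers Drug A is cost-effective in every run but averages only about £460 of net benefit per patient, while Drug B is cost-effective in 60% of runs and averages about £6,000. The two columns rank the drugs in opposite orders.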

It is not a significance test. There is no "must exceed 0.95" rule, and no "0.50 = the decision flips." The actual decision is made on expected value — is the mean net benefit of the cloud positive at the relevant threshold? — which is driven by where the cloud's centre sits, not by whether its probability clears some bar. The CEAC describes the risk of being wrong around that decision; it doesn't make the decision. Treating 0.5 or 0.95 as a hurdle imports the logic of p-values into a place it doesn't belong.

So: decide on the expected result; use the CEAC to see how confident that decision is. Never the other way round.
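Written side by side, the two quantities are different summaries of the same cloud. As a sketch in assumed notation (ΔE and ΔC for a run's ΔEffect and ΔCost, λ for the threshold):

```latex
\[
\text{Decision: adopt if}\quad
\mathbb{E}\big[\lambda \, \Delta E - \Delta C\big] > 0 ,
\qquad
\text{CEAC height:}\quad
\Pr\big(\lambda \, \Delta E - \Delta C > 0\big) .
\]
```

The first is an average over the cloud and drives the decision; the second is a share of the cloud and describes the confidence around it. A skewed cloud can make one large while the other is modest.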

What the CEAC still can't tell you.

Even read correctly, the CEAC has a hole in it — and naming it sets up the rest of the module.

The CEAC tells you the probability of making the wrong call, but nothing about the cost of making it. A 20% chance of being wrong is trivial if being wrong wastes a few pounds and terrifying if it commits the health system to hundreds of millions. Probability of error and consequence of error are different things, and the CEAC shows only the first. Quantifying the second — how much a wrong decision actually costs, and therefore how much it would be worth to reduce the uncertainty — is the job of value of information, two lessons on.
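A small sketch makes the distinction visible: two sets of runs with the same chance of a wrong call and very different stakes. The helper and both sets of net-benefit figures are hypothetical, and this is only the raw comparison, not the value-of-information calculation itself.

```python
def error_profile(net_benefits):
    """Return (probability of a wrong call, mean loss when wrong) after adopting.

    net_benefits: incremental net benefit of adopting, one value per PSA run.
    A run with negative net benefit is one where adopting was the wrong call.
    """
    losses = [-nb for nb in net_benefits if nb < 0]
    prob_wrong = len(losses) / len(net_benefits)
    mean_loss_when_wrong = sum(losses) / len(losses) if losses else 0.0
    return prob_wrong, mean_loss_when_wrong


# Hypothetical net benefits in pounds; both have a positive mean, so both are adopted.
small_stakes = [40, 50, 60, 50, -5]
large_stakes = [200_000_000, 250_000_000, 300_000_000, 250_000_000, -300_000_000]

print(error_profile(small_stakes))
print(error_profile(large_stakes))
```

Both come back with a 0.2 probability of error, which is all the CEAC would show; the loss when wrong is £5 in one case and £300 million in the other.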

Two narrower limits round it out. The CEAC reflects parameter uncertainty only, the same garbage-in, garbage-out (GIGO) caveat as the whole PSA: it says nothing about whether the model's structure is right, so a confident-looking curve can sit atop a shaky model (structural uncertainty is the next lesson). And with more than two options, a single CEAC per option can mislead about which is optimal at each threshold; a related construct, the cost-effectiveness acceptability frontier, handles that case. For the two-way comparisons that dominate appraisals, though, the CEAC is exactly the right tool, provided you remember what its y-axis does and doesn't say.
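For the multi-option case, a rough sketch of the idea behind the frontier, as it is commonly described: at each threshold, find the option with the highest expected net benefit, then report how often that option comes out best. The function, option names, and data are all hypothetical, and this is one threshold only, not a full frontier.

```python
def acceptability_at(options, threshold):
    """For one threshold, return (prob_best, optimal).

    options: dict of option name -> list of (effect, cost) per run, with runs
             aligned across options (run k uses the same parameter draw for all).
    prob_best: option name -> share of runs in which it has the highest net benefit.
    optimal: the option with the highest mean net benefit.
    """
    names = list(options)
    n_runs = len(options[names[0]])
    nb = {name: [threshold * e - c for e, c in options[name]] for name in names}
    wins = {name: 0 for name in names}
    for run in range(n_runs):
        best = max(names, key=lambda name: nb[name][run])
        wins[best] += 1
    prob_best = {name: wins[name] / n_runs for name in names}
    optimal = max(names, key=lambda name: sum(nb[name]) / n_runs)
    return prob_best, optimal


# Hypothetical total (QALYs, cost) per run for three options.
toy_options = {
    "Standard": [(1.0, 0)] * 5,
    "X": [(1.2, 3_000)] * 3 + [(0.6, 3_000)] * 2,
    "Y": [(1.4, 10_000)] * 3 + [(1.7, 11_000)] * 2,
}

prob_best, optimal = acceptability_at(toy_options, 30_000)
print(prob_best, optimal)
```

Here X is the best option in 60% of runs, yet Y has the highest expected net benefit and is best in only 40%. Reading the tallest per-option curve would pick X; the frontier would report Y at 0.4.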

What's the flaw?

Two cancer drugs are each compared with standard care at a £30,000 threshold. Drug A's CEAC reads 0.98; Drug B's reads 0.65. A reviewer concludes: "Drug A is clearly the better use of NHS money — it's far more likely to be cost-effective." What's the flaw?

Why this matters for HTA

The CEAC is a fixture of appraisal documents, and it's as often misread as read. A few disciplines keep it honest:

Read the height at the relevant threshold, and know that threshold is contested. The single most useful number is the CEAC's value at the decision-maker's threshold. Because that threshold moved to £25,000–£35,000 in 2026 and is argued to be lower still, glance at the whole curve across that band — a technology comfortably cost-effective at £30,000 may not be at £20,000.
Never confuse the curve's height with the size of the prize. A high CEAC is confidence, not benefit. When comparing options, go back to expected net benefit; use the CEAC to gauge how much risk sits around that expectation, not to rank the options themselves.
Remember what it's silent on. The CEAC shows the probability of a wrong decision, not its cost, and nothing about structural uncertainty. A steep, high curve is reassuring about parameter uncertainty and says nothing about whether the model was built on a plausible extrapolation. Keep those questions separate.

A CEAC answers "how likely is cost-effective?" at every threshold at once — which is genuinely useful, and genuinely not the same question as "is this worth doing?" The first is about confidence; the second is about expected value. Good appraisal keeps them apart.

The CEAC, in one breath.

A CEAC is built by sweeping the threshold across the PSA cloud and, at each threshold, plotting the fraction of the cloud that's cost-effective. Threshold on the x-axis, probability of cost-effectiveness on the y-axis.
It answers "how likely is this cost-effective?" for every threshold at once — so whatever threshold a decision-maker uses (and Module 7 showed it's contested), the answer is already on the curve. It (almost always) rises with the threshold and crosses 0.5 near the middle of the ICER distribution.
The y-axis is certainty, not magnitude: a small near-certain benefit can outscore a large uncertain one. The decision is made on expected net benefit — the cloud's centre — while the CEAC describes the risk of being wrong. There is no 0.5 or 0.95 "pass mark."
It shows the probability of a wrong decision, never its cost, and only parameter (not structural) uncertainty.

One threshold gave one answer. The CEAC gives the answer for every threshold — and quietly reminds you that "cost-effective" was always a question with the threshold built in.

The CEAC captured how uncertainty in the inputs ripples through to the verdict. But there's a deeper uncertainty it can't touch: what if the model itself — its structure, its extrapolation, its very assumptions — is wrong? That's not something a distribution on a parameter can express. Handling it needs a different move: scenario analysis and structural uncertainty, next.