M9 · UNCERTAINTY

The uncertainty PSA couldn't touch.

The last two lessons built a powerful picture of uncertainty: give every input a distribution, run a PSA thousands of times, and read off, from the cloud and from the CEAC, exactly how confident we can be in the verdict. It feels complete. It isn't.

Look at what that entire apparatus quietly assumed: one model. The same structure, the same extrapolation, the same set of health states, run ten thousand times with different numbers plugged in. Every point in that cloud came from the same machine. So the cloud tells you how the answer wobbles when the inputs are uncertain, and says absolutely nothing about the possibility that the machine itself is built wrong.

That's the uncertainty PSA cannot reach, and it's often the biggest one in the room. This lesson is about the kinds of doubt that live not in the values a model uses, but in the shape of the model itself — and the one tool that can probe them.

Three kinds of uncertainty.

It helps to name the three distinct kinds of uncertainty in any evaluation, because they need completely different tools:

Parameter uncertainty — we don't know the exact values of the inputs. The utility might be 0.6 or 0.7; the hazard ratio has a confidence interval. This is what PSA and the CEAC handle: put a distribution on each value and sample.
Structural uncertainty — we don't know if the model itself is built the right way. Is the survival curve exponential or Weibull? Should the Markov model have three health states or five? Is progression even the right way to carve up the disease? These aren't values — they're choices about the model's shape.
Methodological uncertainty — we don't know which analytical conventions to apply. Payer perspective or societal? Discount at 1.5% or 3.5%? A lifetime horizon or ten years? Which comparator? These are defensible-but-contested choices about how the analysis is framed.

PSA addresses only the first. The second and third are where a model's real fragility usually hides — and neither can be sampled from a distribution.
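To make that limitation concrete, here is a minimal sketch of what a PSA loop looks like in code. Everything in it is an illustrative assumption rather than this module's actual model: the function name icer_one_structure, the uniform ranges, the base rate, the extra cost, and the absence of discounting are all invented for the example. The loop redraws the utility and the hazard ratio on every pass, but the exponential structure inside the function is never questioned.

```python
import math
import random


def mean_survival_exponential(rate, horizon):
    """Life years under an exponential curve, truncated at the horizon."""
    return (1 - math.exp(-rate * horizon)) / rate


def icer_one_structure(utility, hazard_ratio, base_rate=0.20,
                       horizon=30.0, extra_cost=20_000.0):
    """ICER from ONE fixed structure: exponential survival, one health state.

    All default values are hypothetical and there is no discounting.
    """
    ly_control = mean_survival_exponential(base_rate, horizon)
    ly_treated = mean_survival_exponential(base_rate * hazard_ratio, horizon)
    incremental_qalys = utility * (ly_treated - ly_control)
    return extra_cost / incremental_qalys


random.seed(1)
threshold = 30_000

draws = []
for _ in range(10_000):
    utility = random.uniform(0.6, 0.7)       # parameter uncertainty
    hazard_ratio = random.uniform(0.6, 0.8)  # parameter uncertainty
    # The structure itself (exponential, one state) is identical on every draw.
    draws.append(icer_one_structure(utility, hazard_ratio))

prob_ce = sum(d <= threshold for d in draws) / len(draws)
print(f"Share of draws at or below the threshold: {prob_ce:.2f}")
```

However tight the resulting share looks, it describes only this one structure. Nothing in the loop can report what a Weibull curve or a five-state model would have said.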

Why structure won't fit in a distribution.

Here's the crux. To put something in a PSA, you need a parameter — a quantity with possible values and a distribution over them. That works beautifully for a utility or a cost. But try it on structure.

What distribution describes "exponential versus Weibull"? There's no number that means "the shape of my extrapolation," no continuous scale running from one model to another. The choice between them isn't a value you're unsure of — it's a fork in the road, and the two branches lead to different models entirely. Recall Module 8: an exponential and a Weibull curve can fit the observed data almost identically and then imply wildly different survival tails. That wasn't one parameter taking different values; it was two different machines producing two different answers.

Because structural choices are discrete forks rather than uncertain quantities, they simply don't fit the distribution-and-sample machinery. (There are advanced workarounds, such as averaging across models or a switch parameter that selects a structure on each simulation run, but at heart this is discrete uncertainty, and it needs a discrete tool.) You can't sample your way out of having chosen the wrong model. You can only build the other one and look.
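The fork can be sketched numerically. The example below is a simplification under stated assumptions: the follow-up length, the observed survival, and the Weibull shape are hypothetical, and each curve is simply forced through one shared observed point rather than fitted to patient-level data as a real analysis would do. It shows two curves that agree during follow-up and then part ways in the tail.

```python
import math

FOLLOW_UP_YEARS = 2.0     # hypothetical end of trial follow-up
OBSERVED_SURVIVAL = 0.60  # hypothetical survival at the end of follow-up
WEIBULL_SHAPE = 1.2       # hypothetical shape; above 1 means a rising hazard

# Calibrate each curve so that it passes through the same observed point.
exp_rate = -math.log(OBSERVED_SURVIVAL) / FOLLOW_UP_YEARS
weib_lambda = -math.log(OBSERVED_SURVIVAL) / FOLLOW_UP_YEARS ** WEIBULL_SHAPE


def exponential_survival(t):
    """S(t) = exp(-rate * t)."""
    return math.exp(-exp_rate * t)


def weibull_survival(t):
    """S(t) = exp(-lambda * t ** shape)."""
    return math.exp(-weib_lambda * t ** WEIBULL_SHAPE)


for year in (1, 2, 5, 10, 20):
    print(f"year {year:>2}: exponential {exponential_survival(year):.3f}  "
          f"Weibull {weibull_survival(year):.3f}")
```

Inside the follow-up window the two columns stay close. Beyond it they diverge, and no value of exp_rate turns the first function into the second. That is what it means for the choice to be a fork rather than a parameter.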

The only tool: rebuild and compare.

That's exactly what scenario analysis does. You take an alternative, defensible assumption — a different extrapolation, a different perspective, a different horizon — you rebuild the model under it, recompute from scratch, and compare the verdict to the base case.

Note how different this is from one-way sensitivity analysis. One-way analysis slides a value up and down within the same model. Scenario analysis swaps a piece of the model itself and reruns. One asks "what if this number is higher?"; the other asks "what if I built this differently?" You're not turning a dial — you're replacing a part and seeing whether the machine still gives the same answer.

And that reframes the whole question. It's no longer "what is the ICER?" but "is the verdict robust to the assumptions I can't be sure of, or does it flip depending on which defensible model I happen to build?" If ten reasonable structures all say cost-effective, the decision is solid. If it hangs on picking one particular extrapolation, you don't have a result — you have a preference dressed as one.
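As a rough sketch of "rebuild and compare", the code below runs the same toy evaluation twice, once per extrapolation, and reports the verdict each time. It is not the model behind this lesson's figures: the function names, the 2-year survival values, the utility, the extra cost, and the Weibull shape are hypothetical, and discounting is left out to keep it short.

```python
import math

THRESHOLD = 30_000     # the £30,000 threshold used in this lesson
HORIZON_YEARS = 30     # hypothetical horizon
STEPS_PER_YEAR = 100   # resolution of the numerical integration


def survival(t, curve, surv_at_2y, shape=1.5):
    """Survival at time t for a curve forced through the 2-year survival."""
    k = 1.0 if curve == "exponential" else shape
    lam = -math.log(surv_at_2y) / 2.0 ** k
    return math.exp(-lam * t ** k)


def life_years(curve, surv_at_2y):
    """Area under the survival curve up to the horizon (trapezoid rule)."""
    step = 1.0 / STEPS_PER_YEAR
    total = 0.0
    for i in range(HORIZON_YEARS * STEPS_PER_YEAR):
        left = survival(i * step, curve, surv_at_2y)
        right = survival((i + 1) * step, curve, surv_at_2y)
        total += 0.5 * (left + right) * step
    return total


def build_and_run(curve, utility=0.65, extra_cost=27_000.0):
    """Rebuild the whole toy model under one extrapolation; return its ICER."""
    # Hypothetical 2-year survival: 0.65 treated, 0.50 control.
    gain = life_years(curve, 0.65) - life_years(curve, 0.50)
    return extra_cost / (utility * gain)


results = {curve: build_and_run(curve) for curve in ("exponential", "weibull")}
for curve, icer in results.items():
    verdict = "cost-effective" if icer <= THRESHOLD else "not cost-effective"
    print(f"{curve:<12} ICER £{icer:,.0f}  ->  {verdict}")
```

The inputs are identical in both runs. Only the curve argument changes, yet under these assumptions the verdict lands on opposite sides of the threshold, which is the situation the paragraph above calls a preference dressed as a result.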

Switch the assumptions.

Below is our base-case model: an ICER of £24,000, cost-effective at a £30,000 threshold. But four of its assumptions are genuinely contested: the survival extrapolation, the time horizon, the perspective, and the comparator. Flip any of them to a different, equally defensible choice and watch the ICER, and the verdict, respond. The goal isn't a "right" setting; it's to see which assumptions the verdict depends on.

[Interactive panel: four switches, one each for survival extrapolation (flagged ⚠ fragile driver), time horizon, perspective, and comparator.]

ICER £24,000 · Verdict at £30,000: Cost-effective · Assumptions changed from base: 0

Verdict is robust to 3 of 4 assumptions; fragile to: survival extrapolation.

Read what the panel just told you. The verdict is robust to three of these four assumptions — you can change perspective, horizon, and comparator in any combination and it stays cost-effective. But it's fragile to one: the survival extrapolation. The entire decision rests on choosing exponential over Weibull — a structural assumption about the unobserved tail, exactly where Module 8 warned there's no data to adjudicate. That single fork, not the whole model, is where the appraisal should concentrate.
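The panel's bookkeeping can be approximated in a few lines. In the sketch below, scenario_icer is a stand-in for rerunning the full model: it scales the £24,000 base case by a multiplier per option, and every multiplier and option label is hypothetical, chosen only so the example behaves like the panel. A real scenario analysis would rebuild and recompute the model for each combination instead.

```python
from itertools import product

BASE_ICER = 24_000
THRESHOLD = 30_000

# Hypothetical multipliers; the first option in each group is the base case.
OPTIONS = {
    "survival extrapolation": {"exponential": 1.00, "weibull": 1.55},
    "time horizon": {"lifetime": 1.00, "ten years": 1.08},
    "perspective": {"payer": 1.00, "societal": 0.95},
    "comparator": {"base comparator": 1.00, "alternative": 1.10},
}


def scenario_icer(choice):
    """Stand-in for a full model rerun under one set of assumptions."""
    icer = BASE_ICER
    for assumption, option in choice.items():
        icer *= OPTIONS[assumption][option]
    return icer


def cost_effective(choice):
    return scenario_icer(choice) <= THRESHOLD


names = list(OPTIONS)
fragile, robust = [], []
for name in names:
    others = [n for n in names if n != name]
    flips = False
    # Hold the other assumptions at every possible combination in turn.
    for combo in product(*(OPTIONS[n] for n in others)):
        fixed = dict(zip(others, combo))
        verdicts = {cost_effective({**fixed, name: option})
                    for option in OPTIONS[name]}
        if len(verdicts) > 1:  # switching this assumption changed the verdict
            flips = True
    (fragile if flips else robust).append(name)

print("Robust to:", ", ".join(robust))
print("Fragile to:", ", ".join(fragile))
```

An assumption counts as fragile here if switching it changes the verdict for at least one setting of the others. That is one reasonable definition among several, and it is the one that matches how the panel's summary line reads.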

Now you.

Classify each source of uncertainty — and pick the tool that handles it.

1. "The progressed-state utility is estimated at 0.5, with a 95% CI of 0.42–0.58."

2. "Whether the survival curve should be extrapolated as exponential or Weibull."

3. "Whether to adopt a payer or a societal perspective."

4. "Whether the Markov model needs three health states or five."

Scenario ≠ one-way sensitivity.

It's worth pinning down the difference between scenario analysis and the one-way sensitivity analysis from earlier in this module, because conflating them is how uncertainty gets understated.

One-way sensitivity analysis varies a value across its range inside a fixed model: utility from 0.42 to 0.58, cost ±20%. The model's architecture never changes — only the number in one cell. It produces the smooth swings of a tornado.

Scenario analysis changes the architecture (a different extrapolation, a different set of states, a different perspective) and then recomputes the whole model. It's not a slide along a scale; it's a jump to a different model, so its results are discrete (base case £24,000; Weibull scenario £37,200), not a continuous swing.

The distinction matters because a submission can run a lavish PSA and a thorough tornado — exhausting parameter uncertainty — while quietly leaving the biggest structural fork untested. All that probabilistic machinery, however impressive, still lives inside one architecture. If that architecture is the thing in doubt, no amount of sampling within it will reveal the problem. Only rebuilding will.
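A small sketch shows why a thorough tornado can still miss the fork. The one-line model below is hypothetical: its inputs (utility 0.50, extra cost £12,000, one life year gained) are chosen only so that the base case reproduces the £24,000 figure, and the function name icer is an assumption for illustration. The sweep ranges are the ones quoted above: utility from 0.42 to 0.58 and cost ±20%.

```python
THRESHOLD = 30_000


def icer(utility=0.50, extra_cost=12_000.0, life_years_gained=1.0):
    """Toy fixed-structure model; every default is a hypothetical value."""
    return extra_cost / (utility * life_years_gained)


base = icer()

# One-way sensitivity: move one value at a time inside the SAME model.
# Each entry holds the (lowest ICER, highest ICER) produced by the sweep.
one_way = {
    "utility 0.42 to 0.58": (icer(utility=0.58), icer(utility=0.42)),
    "cost -20% to +20%": (icer(extra_cost=12_000.0 * 0.8),
                          icer(extra_cost=12_000.0 * 1.2)),
}

print(f"base case: £{base:,.0f}")
for label, (low, high) in one_way.items():
    print(f"{label:<22} £{low:,.0f} to £{high:,.0f}")

crosses_threshold = any(high > THRESHOLD for _, high in one_way.values())
print("Any one-way swing crosses the threshold:", crosses_threshold)
```

In this sketch every bar stays under £30,000, so the tornado alone would suggest a stable verdict. No value of utility or extra_cost changes the extrapolation, so a structural result such as the £37,200 Weibull scenario cannot appear anywhere in these sweeps. It only shows up when the model is rebuilt.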

The base case is a choice, not the truth.

Which leads to the most important habit in reading any model. The base case is presented as the answer — the central estimate, the headline ICER, with everything else demoted to "sensitivity analyses" or "scenarios." But that label is a choice by the modeller, not a fact about the world.

A base case is simply the set of assumptions the analyst selected as their reference. Nothing makes it more true than an equally defensible alternative that happens to sit in the "scenario" section. And that's precisely where results get engineered: pick the base-case assumptions that deliver a comfortable ICER, and relegate the assumption that breaks it (the Weibull extrapolation, the societal perspective, the alternative comparator, the shorter horizon) to a scenario buried on page 180, or omit it altogether.

So the assessor's question is never "do the scenario analyses confirm the base case?" It's the sharper one: "why is this the base case, and not the equally reasonable alternative under which the verdict flips?" A result that only survives under the manufacturer's chosen structure isn't robust — and a set of scenarios carefully arranged to avoid the one fork that matters isn't reassurance, it's misdirection. Robustness is only demonstrated by testing the alternatives you'd least like to be true.

What's the right response?

A submission includes an extensive PSA (10,000 runs, a tight CEAC showing 92% probability of cost-effectiveness at £30,000) but models survival with a single extrapolation — an exponential curve — with no alternative explored. A clinician on the committee believes a Weibull curve, implying a shorter survival tail, is at least as plausible. What is the right response?

Why this matters for HTA.

Structural and methodological uncertainty are where the hardest appraisal fights happen, precisely because they can't be reduced to a tidy probability. A few disciplines:

Ask what the base case chose, and why. Every base case is a stack of assumptions someone selected. Identify the two or three that most affect the result — usually an extrapolation, a comparator, a perspective — and ask why each was set as it was, and what the equally defensible alternative does to the verdict.
Distinguish "robust to parameters" from "robust to structure." A glowing PSA and CEAC establish only the former. Before you trust a result, confirm the key structural forks were tested by scenario, not just the values sampled by PSA. A submission heavy on probabilistic analysis and silent on structural scenarios is a familiar warning sign.
Treat the scenario table as an argument, not a formality. Read it for what's missing: the plausible alternative that isn't there is often the one that breaks the result. Robustness is a claim about the assumptions you'd least like to be true — so those are the ones that have to appear.

Parameter uncertainty asks whether we've measured the model's inputs well enough. Structural uncertainty asks whether we've built the right model at all — and no probability, however precise, can answer a question about the machine from inside the machine.

Scenario analysis & structural uncertainty, in one breath.

PSA and the CEAC handle parameter uncertainty (uncertain values) but are blind to structural uncertainty (is the model's shape right?) and methodological uncertainty (perspective, discount rate, horizon, comparator). Every point in a PSA cloud came from the same model.
Structure can't be sampled — there's no distribution for "exponential vs Weibull." The only tool is scenario analysis: rebuild the model under a different defensible assumption and compare verdicts.
This differs from one-way SA, which slides a value inside a fixed model. A scenario swaps part of the model and recomputes — discrete jumps, not smooth swings.
The base case is a choice, not the truth. The real question is "why this base case, and not the equally reasonable alternative under which the verdict flips?" Robustness means surviving the assumptions you'd least like to be true.

A PSA turns every dial inside the machine and tells you how much the answer moves. Scenario analysis asks the question the dials can't: is this even the right machine?

We've now seen the two ways a decision can be wrong: through its parameters (quantified by the CEAC) and through its structure (probed by scenarios). But one thing is still missing, and it's the one a budget-holder cares about most: how costly is it to be wrong, and therefore how much would it be worth to buy better evidence before deciding? That question, the value of information, closes the module.