M13 · SPECIAL TOPICS

A perfect test that changes nothing.

Here's a thought experiment to start somewhere uncomfortable. Imagine a diagnostic test that is perfect: 100% sensitivity, 100% specificity, never wrong, instant, cheap. It detects a particular condition flawlessly. Now suppose one of two things is true — either there is no treatment for that condition, or the clinician would do exactly the same thing whether the test came back positive or negative.

What is that perfect test worth to the health system? Sit with it for a second, because the answer is genuinely jarring: nothing. Zero health value. A flawless test that changes no decision and enables no treatment produces not one extra day of health for anyone. You've measured something with perfect precision and improved nobody's life.

That single example breaks the intuition almost everyone brings to diagnostics — that a better test is self-evidently worth more, and that a test's value is its accuracy. In Module 3 you learned to measure how good a test is: sensitivity, specificity, PPV, the ROC curve. That was essential, and it was only half the story. This module is about the other half — not "how good is this test?" but "what is this test worth?" — and those turn out to be startlingly different questions.

A test treats nothing.

Here is the principle the thought experiment was pointing at, and it's the foundation of everything that follows: a diagnostic test, by itself, treats nothing. It doesn't cure, shrink, slow, or heal anything. A test only ever produces one thing — information. A number, a positive or negative, an image.

And information has a peculiar property: it is worth something only when it changes what happens next. A test result creates health value if, and only if, it changes a decision, which changes a treatment, which changes a health outcome. Break that chain at any link and the value collapses. If the result doesn't change the decision (you'd treat anyway, or you'd never treat), no value. If the changed decision doesn't change treatment, no value. If the changed treatment doesn't improve the outcome, no value. The test's worth is entirely downstream of the test itself.
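To see the chain break in numbers, here is a minimal sketch. It assumes a perfect test and invented health payoffs on a 0 to 1 scale; every name in it (`value_of_test`, `effective`, `no_treatment`, and so on) is hypothetical and introduced only for illustration. The value of the test is the expected outcome when you act on the result, minus the expected outcome when everyone simply gets the default action.

```python
def outcome_with_test(prevalence, policy, outcome):
    """Expected health per patient when acting on a perfect test's result."""
    # Perfect test: every diseased patient is positive, every healthy one negative.
    diseased = prevalence * outcome[(True, policy["positive"])]
    healthy = (1 - prevalence) * outcome[(False, policy["negative"])]
    return diseased + healthy


def outcome_without_test(prevalence, action, outcome):
    """Expected health per patient when everyone gets the same action."""
    return (prevalence * outcome[(True, action)]
            + (1 - prevalence) * outcome[(False, action)])


def value_of_test(prevalence, policy, default_action, outcome):
    """Health gained by testing, relative to the untested default."""
    return (outcome_with_test(prevalence, policy, outcome)
            - outcome_without_test(prevalence, default_action, outcome))


# Hypothetical payoffs keyed by (has_disease, action).
effective = {(True, "treat"): 0.8, (True, "wait"): 0.5,
             (False, "treat"): 0.9, (False, "wait"): 1.0}
# Same structure, but the treatment does nothing for anyone.
no_treatment = {(True, "treat"): 0.5, (True, "wait"): 0.5,
                (False, "treat"): 1.0, (False, "wait"): 1.0}

act_on_result = {"positive": "treat", "negative": "wait"}
treat_regardless = {"positive": "treat", "negative": "treat"}

# The result changes the decision and the treatment works: positive value.
changes_care = value_of_test(0.2, act_on_result, "treat", effective)
# The clinician treats either way: the result changes nothing.
same_decision = value_of_test(0.2, treat_regardless, "treat", effective)
# No working treatment: acting on the result changes no outcome.
no_therapy = value_of_test(0.2, act_on_result, "wait", no_treatment)

print(changes_care, same_decision, no_therapy)
```

Only the first scenario yields a value above zero. The other two are the thought experiment's two escape routes, and in both the perfect test adds nothing.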

This is why diagnostics are so easy to misjudge. We instinctively locate a test's value in the test — in its cleverness, its accuracy, its technology. HTA locates it somewhere else entirely: in the cascade the test sets off. The test is the first domino; the value is in the last one falling. Assess the domino in your hand, and you've assessed nothing that matters.

The test-treatment pathway.

So to value a test, you have to follow the whole chain it triggers — what HTA calls the test-treatment pathway. It runs like this:

A patient is tested. The result is one of four things from Module 3's world — a true positive, false positive, true negative, or false negative. That result drives a decision (treat, don't treat, investigate further). The decision drives a treatment (or its absence). And the treatment — finally, several steps down — produces a health outcome: better, worse, or unchanged. The value of the test is the difference in outcome between the world where you used the test-and-treat strategy and the world where you didn't.

Notice what this does to Module 3's accuracy measures. Sensitivity and specificity don't disappear — they become the first link in the chain, governing how many patients land in each of the four boxes. But they're no longer the answer; they're an input to a much longer calculation. A true positive only creates value if the treatment it unlocks actually works. A true negative only creates value if it spares the patient something (needless treatment, anxiety, cost). The accuracy sets up the pathway; the pathway — all the way to the outcome — determines the value. To assess a diagnostic is to model that entire journey, not to read the top of it.
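The first link of the pathway, accuracy sorting patients into the four boxes, can be sketched in a few lines. The inputs below match the worked cohort used in this lesson (1,000 tested, 200 with the disease); the 90% sensitivity and specificity are what those counts imply. The function name `pathway_counts` is hypothetical.

```python
def pathway_counts(cohort, prevalence, sensitivity, specificity):
    """Split a tested cohort into the four result boxes (whole patients)."""
    diseased = round(cohort * prevalence)
    healthy = cohort - diseased
    tp = round(diseased * sensitivity)   # diseased, correctly flagged
    fn = diseased - tp                   # diseased, missed
    tn = round(healthy * specificity)    # healthy, correctly cleared
    fp = healthy - tn                    # healthy, wrongly flagged
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


counts = pathway_counts(1000, 0.20, 0.90, 0.90)
print(counts)  # {'TP': 180, 'FN': 20, 'FP': 80, 'TN': 720}

# The decision step: every positive result, true or false, goes on to treatment.
treated = counts["TP"] + counts["FP"]
print(treated)  # 260
```

Nothing here says what the test is worth. It only tells you how many patients enter each branch of the pathway; the value comes from what happens to them afterwards.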

The cost of being wrong lives downstream.

Module 3 taught you to count false positives and false negatives. This module asks a different question: what do they actually cost? The answer is never in the 2×2 table. It sits several steps downstream, in the treatment the error wrongly triggered or wrongly withheld.

A false positive isn't a cell in a matrix — it's a healthy patient told they may be sick. They go on to the next step: a confirmatory biopsy with its own complications, or a course of treatment they never needed, with its side effects, its cost, its anxiety. The harm of a false positive is the harm of everything it sets in motion.

A false negative does its harm in a different way. It's a patient with disease sent home reassured. The cancer keeps growing; the treatable becomes untreatable; the outcome that a timely diagnosis would have secured is quietly lost. The cost of a false negative is the whole trajectory of a disease left to run.

This is why two tests with identical sensitivity and specificity can have completely different value. Put the same accuracy in front of a disease where a missed case is quickly fatal, versus one where a missed case is caught easily next month, and the false negatives cost wildly different amounts. Put it in front of a treatment that's brutal versus gentle, and the false positives differ just as much. The accuracy is the same; the downstream consequences — which is to say, the value — are not. You cannot price a test's errors without following them all the way down.
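A rough sketch of that comparison: the same error counts priced in two clinical contexts. The counts (80 false positives, 20 false negatives) come from this lesson's worked cohort; the per-patient harms are invented purely for illustration, and `error_cost` is a hypothetical helper.

```python
def error_cost(fp, fn, harm_per_fp, harm_per_fn):
    """Total health lost to test errors, in QALYs (hypothetical unit harms)."""
    return fp * harm_per_fp + fn * harm_per_fn


# Same test, same cohort, so the same number of errors in both contexts.
FP, FN = 80, 20

# Context A: a missed case is quickly fatal; follow-up of positives is gentle.
fatal_if_missed = error_cost(FP, FN, harm_per_fp=0.01, harm_per_fn=5.0)

# Context B: a missed case is caught next month; the treatment is brutal.
harsh_treatment = error_cost(FP, FN, harm_per_fp=0.5, harm_per_fn=0.02)

print(round(fatal_if_missed, 1), round(harsh_treatment, 1))
```

Under these assumed harms, Context A loses most of its health through the 20 missed cases and Context B through the 80 overtreated ones. The 2×2 table is identical in both.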

Trace the value down the pathway.

Here's a cohort of 1,000 patients run through a test-and-treat pathway. Set the test's accuracy, the effectiveness of the treatment the test triggers, and the disease prevalence — and watch the net health benefit emerge at the far end. The lesson is in what makes it collapse. (An illustrative model of the mechanism, not a real cost-effectiveness calculation.)

Accuracy is simplified to one axis here; Module 3 treats sensitivity and specificity separately.

1,000 tested · 200 have the disease

True positive: 180 · False negative: 20 · False positive: 80 · True negative: 720

Treat the positives (TP + FP).

Net health benefit: +76.0 QALYs

Gain from 180 treated TPs (+90.0) minus FP overtreatment (−8.0) minus FN progression (−6.0).

Watch what happened when you set treatment effectiveness to zero. The test didn't change — still 90%, still 99% if you pushed it — but its value fell off a cliff, because the thing at the end of the pathway stopped working. That's the whole lesson made visible: the value was never in the test. It was in the treatment the test unlocks. Turn the test into a genius and the treatment into a dud, and you have a worthless pathway. The accuracy sliders feel like they should control the value — and they barely matter the moment the treatment can't deliver.
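The arithmetic behind the +76.0 figure can be reproduced in a short sketch. The per-patient values are inferred from the totals shown above (90.0 / 180, 8.0 / 80, 6.0 / 20) rather than stated in the lesson, and the way effectiveness enters is an assumption: here only the gain per treated true positive changes, while the two harms are held fixed. The interactive model may scale things differently.

```python
def net_health_benefit(tp, fp, fn, gain_per_tp, loss_per_fp, loss_per_fn):
    """Net QALYs: benefit to treated true positives minus the cost of errors."""
    return tp * gain_per_tp - fp * loss_per_fp - fn * loss_per_fn


# Per-patient values inferred from the lesson's totals (assumptions, see above).
LOSS_PER_FP = 0.1  # 8.0 QALYs lost across 80 false positives
LOSS_PER_FN = 0.3  # 6.0 QALYs lost across 20 false negatives

# 90% accuracy, working treatment (0.5 QALYs per treated true positive).
working = net_health_benefit(180, 80, 20, 0.5, LOSS_PER_FP, LOSS_PER_FN)

# Same test, treatment effectiveness set to zero.
dud = net_health_benefit(180, 80, 20, 0.0, LOSS_PER_FP, LOSS_PER_FN)

# 99% accuracy (198 TP, 8 FP, 2 FN), treatment still a dud.
sharper_dud = net_health_benefit(198, 8, 2, 0.0, LOSS_PER_FP, LOSS_PER_FN)

print(round(working, 1), round(dud, 1), round(sharper_dud, 1))
```

With the gain at zero, pushing accuracy from 90% to 99% only shrinks the harm from errors. Under these assumptions it can never lift the net benefit above zero, because there is no benefit at the end of the pathway to collect.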

Now you.

For each statement, is it a property of the test (accuracy, Module 3), a value or cost that lives downstream on the pathway, or something that zeroes the test's value entirely?

1. The test's sensitivity is 95%.

2. A false positive leads to an unnecessary biopsy and its complications.

3. There is no treatment for the condition the test detects.

4. A missed cancer progresses to an inoperable stage.

5. The test result would not change what the clinician does next.

6. The test's specificity is 88%.

Linked evidence: borrowing the treatment's proof.

Now the hard part, the reason valuing a diagnostic is often harder than valuing a drug. To know a test's value, you need to know the outcome at the end of the pathway — but tests are almost never studied that far.

The ideal would be a test-treatment RCT: randomise patients not to a test, but to a whole strategy — "test with this and treat by the result" versus "the current strategy" — and measure the health outcome at the end. That's the gold standard, because it captures the entire pathway in one experiment. But it's rare, and for a reason: it's large, long, and expensive, because the outcome you care about sits several steps beyond the test, and you have to follow every patient all the way there. Most tests never get one.

So HTA does something cleverer and messier: linked evidence. It takes the evidence that does exist — studies of the test's accuracy (Module 3) — and links it to separate evidence of how well the treatment works (the therapy's own RCTs), stitching them together with a model of the pathway. Accuracy tells you how many true and false positives you'll produce; the treatment trials tell you what happens to each once treated; the model carries them to a health outcome. You are, in effect, borrowing the treatment's proof to value the test, because no single study runs the whole length of the chain. It's ingenious, and it's fragile — every join between two evidence sources is an assumption — which is exactly why diagnostic assessments are modelling-heavy and why a "simple, cheap" test can be far harder to evaluate than an expensive drug.
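One way to picture the stitching, as a sketch rather than a real assessment model: two separate evidence objects feed one pathway function. All class names, field names, and numbers are hypothetical. The `transferability` parameter stands in for one of the joins the text calls fragile, namely the assumption that patients picked up by the test benefit as much as the patients enrolled in the treatment trials.

```python
from dataclasses import dataclass


@dataclass
class AccuracyEvidence:
    """From diagnostic accuracy studies."""
    sensitivity: float
    specificity: float


@dataclass
class TreatmentEvidence:
    """From the therapy's own trials (QALYs per treated patient)."""
    gain_if_diseased: float
    loss_if_healthy: float


def linked_value(accuracy, treatment, prevalence, cohort=1000,
                 transferability=1.0):
    """QALYs gained by test-and-treat versus treating no one.

    transferability scales the trial benefit for test-detected cases;
    1.0 assumes the trial result carries over unchanged.
    """
    diseased = cohort * prevalence
    healthy = cohort - diseased
    true_pos = diseased * accuracy.sensitivity
    false_pos = healthy * (1 - accuracy.specificity)
    gain = true_pos * treatment.gain_if_diseased * transferability
    harm = false_pos * treatment.loss_if_healthy
    return gain - harm


accuracy = AccuracyEvidence(sensitivity=0.9, specificity=0.9)
treatment = TreatmentEvidence(gain_if_diseased=0.5, loss_if_healthy=0.1)

as_in_trial = linked_value(accuracy, treatment, prevalence=0.2)
milder_cases = linked_value(accuracy, treatment, prevalence=0.2,
                            transferability=0.6)

print(round(as_in_trial, 1), round(milder_cases, 1))
```

Neither evidence source changed between the two runs. Only the assumption linking them did, and the estimated value moved substantially, which is why the joins deserve as much scrutiny as the studies themselves.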

Companion diagnostics: test and drug as one.

There's a case where test and treatment are so entangled you cannot assess them apart at all: the companion diagnostic. Here a test — usually for a biomarker — exists precisely to decide who should get a specific drug, and the drug works only in the patients the test identifies.

Think of a targeted cancer therapy that only helps patients whose tumour carries a particular mutation. The drug without the test is dangerous or useless — you'd give an expensive, toxic therapy to everyone, most of whom can't benefit. The test without the drug is pointless — why identify the mutation if nothing turns on it? Each is worthless alone; together they're valuable. So HTA assesses them as a pair: the value of the "test-plus-drug" strategy against the alternative, with the test's job being to concentrate the drug on the patients who can respond, raising the whole package's cost-effectiveness.
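A back-of-envelope sketch of why the pair outperforms either part. Every number is invented for illustration, the test is assumed perfect, and the comparator is giving no drug at all (zero cost, zero QALYs gained). Costs and QALYs are per patient in the eligible population.

```python
# Hypothetical inputs, per patient.
BIOMARKER_RATE = 0.2      # share of patients whose tumour carries the mutation
DRUG_COST = 50_000        # cost of a course of the targeted therapy
TEST_COST = 500           # cost of the companion diagnostic
GAIN_IF_POSITIVE = 1.0    # QALYs gained by a treated biomarker-positive patient
TOXICITY_LOSS = 0.05      # QALYs lost to side effects by anyone treated

# Drug without the test: everyone is treated, most cannot benefit.
drug_only_cost = DRUG_COST
drug_only_qalys = BIOMARKER_RATE * GAIN_IF_POSITIVE - TOXICITY_LOSS

# Test without the drug: money spent, nothing turns on the result.
test_only_cost = TEST_COST
test_only_qalys = 0.0

# Test plus drug: everyone is tested, only biomarker-positives are treated.
pair_cost = TEST_COST + BIOMARKER_RATE * DRUG_COST
pair_qalys = BIOMARKER_RATE * (GAIN_IF_POSITIVE - TOXICITY_LOSS)

drug_only_ratio = drug_only_cost / drug_only_qalys   # cost per QALY gained
pair_ratio = pair_cost / pair_qalys

print(round(drug_only_ratio), round(pair_ratio))
```

Under these assumptions the pair delivers more health for roughly a fifth of the spend, so its cost per QALY is several times lower than the drug's alone. The test on its own has no ratio to report, since it buys no health at all.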

This is the test-treatment pathway in its purest form — the test's entire value is the treatment it directs, made explicit and inseparable. It's also increasingly the norm in oncology and precision medicine, where more and more drugs arrive married to a diagnostic. And it sharpens every point in this lesson: you literally cannot state the test's value without naming the drug, or the drug's without naming the test. Assessed separately, both look absurd; assessed together, they make sense. The pair is the unit of assessment.

What's the first question?

A manufacturer submits a new diagnostic test with outstanding accuracy: 98% sensitivity and 97% specificity, both well above the current test's. An assessor reviewing it for reimbursement asks one question first, before considering the accuracy at all. Which question, and why?

Why this matters for HTA

Diagnostics are among the most commonly misjudged technologies in HTA, precisely because their accuracy is so easy to see and their value so easy to misplace.

A test earns its value not by being right, but by changing what happens next. Follow the pathway from result to decision to treatment to outcome, and a diagnostic's worth appears where you'd least expect it — never in the test, always in the care it redirects.

HTA of diagnostics, in one breath.

Assess a drug and you assess the thing itself; assess a test and you assess everything it sets in motion. The diagnostic is never the point — the point is the care that flows, or fails to flow, from what it tells you.

Diagnostics were our first "special topic" — a technology that standard drug-HTA can't assess without rethinking the whole approach. The next is stranger still: medical devices, which don't behave like pills in almost any respect — they're operator-dependent, they evolve through versions, they have learning curves, and the evidence looks nothing like a drug trial. How HTA bends to assess them is the next lesson.