M11 · REAL-WORLD EVIDENCE

The wrong question about RWE.

Three lessons have taken real-world evidence apart. We've seen that it answers a different question than a trial (effectiveness, not efficacy), that it loses randomisation's protection, and that no amount of clever adjustment can remove the unmeasured confounding that loss leaves behind. A reasonable person, having absorbed all that, might conclude: don't trust real-world evidence.

That conclusion is a trap — because it's built on the wrong question. "Should we trust RWE?" treats trustworthiness as a fixed property of the evidence, like a grade stamped on it. It isn't. Everything we criticised in the last lesson — confounding by indication, residual bias — attaches to one particular job: proving that one treatment works better than another. But that is only one of the many questions HTA needs answered. Ask a different question, and those weaknesses may not apply at all.

So the real question isn't "is this RWE any good?" It's "what is this RWE for — and does its known weakness even matter for that task?" This lesson is about answering that question well, and about the powerful thing RWE does once you do: it turns "too uncertain to decide" into "decide, and keep learning."

Where RWE is weak: the treatment-effect question.

Let's be precise about the weakness, so we can then set it aside where it doesn't apply.

RWE is at its weakest on the comparative treatment-effect question: does drug A produce better outcomes than drug B (or than no treatment)? This is exactly where last lesson's villain lives. In observational data, patients weren't randomised to A or B — they were steered there for reasons tied to prognosis, so confounding by indication contaminates the comparison. Adjustment and propensity scores help with the measured part; residual confounding remains, invisible and irremovable. So when a submission uses RWE to claim "our drug beats the comparator," that claim carries all the fragility of the last lesson, and demands every caution we discussed — sensitivity analysis, negative controls, triangulation with trials.
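To make the mechanism concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption, not data from any real study: a single binary prognostic factor (severe), made-up treatment and survival probabilities, and a drug whose true effect is set to zero. It shows how steering sicker patients towards drug A makes the naive comparison look bad for A even though the drug does nothing.

```python
import random


def simulate(n=20000, true_effect=0.0, seed=0):
    """Simulate an observational cohort in which sicker patients are
    more likely to receive drug A (confounding by indication).
    All probabilities are hypothetical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severe = rng.random() < 0.5           # prognostic factor
        p_treat = 0.8 if severe else 0.2      # severe patients are steered to drug A
        treated = rng.random() < p_treat
        # One-year survival depends on severity, plus the true drug effect if treated
        p_survive = (0.4 if severe else 0.8) + (true_effect if treated else 0.0)
        survived = rng.random() < p_survive
        rows.append((severe, treated, survived))
    return rows


def survival_rate(rows):
    return sum(survived for _, _, survived in rows) / len(rows)


def naive_difference(rows):
    """Treated minus untreated survival, ignoring severity."""
    treated = [r for r in rows if r[1]]
    untreated = [r for r in rows if not r[1]]
    return survival_rate(treated) - survival_rate(untreated)


def stratified_difference(rows):
    """Adjust for severity: average the within-stratum differences,
    weighted by stratum size."""
    total = 0.0
    for level in (True, False):
        stratum = [r for r in rows if r[0] == level]
        total += naive_difference(stratum) * len(stratum) / len(rows)
    return total


rows = simulate(true_effect=0.0)
naive = naive_difference(rows)
adjusted = stratified_difference(rows)
print(f"naive difference:    {naive:+.3f}")
print(f"adjusted difference: {adjusted:+.3f}")
```

With these made-up numbers the naive difference comes out strongly negative (about -0.24 in expectation) while the severity-adjusted difference sits near zero, the true effect. The adjustment only works here because severity was recorded. Delete that column from the data and the bias is still there, with nothing left to adjust for: that is residual confounding.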

This is real, and it's why the randomised trial remains the gold standard for causal claims about treatment effects. If the treatment effect is the question, RWE is the junior partner, admitted only with heavy caveats.

But hold that phrase — for treatment effects. Because the moment you stop asking a comparative-effect question, the whole objection evaporates.

Where RWE is strong (often the only source).

Look at what HTA actually needs to build an assessment, and notice how much of it isn't a treatment-effect comparison at all:

See the common thread: most of these are descriptive or epidemiological, not causal comparisons of treatments. Confounding by indication — which needs two treatment arms to contaminate — simply doesn't bite on "how many patients have the disease?" or "how long do untreated patients survive?" For these questions, RWE isn't the junior partner. It's the best evidence available — and frequently the only evidence that exists.

The right question: match the role to the task.

Put the last two sections together and the principle falls out. The strength of a piece of real-world evidence is not a property of the data; it's a property of the data-question pair. The same database is weak for one job and gold for another.

A claims dataset can't credibly tell you drug A beats drug B (confounding), but it can tell you exactly how many patients were treated, what it cost, and how long they survived — brilliantly. A disease registry can't randomise anyone, but it may hold the only reliable picture of the natural history of a rare condition anywhere in the world. So the assessor's question is never the blunt "do we trust this RWE?" It's the precise one:

What role is this RWE playing in the argument — and does its weakness matter for that role?

RWE used to establish a treatment effect: weakness matters enormously, apply full scrutiny. The same RWE used to estimate the eligible population, or long-term survival, or real-world cost: weakness largely irrelevant, and it may be the best source there is. One dataset, many possible roles, and the verdict changes with each. Reading RWE well means asking what job it's doing before asking how much to trust it.

Assign RWE to the job.

Here are six questions a real HTA has to answer. For each, decide how strong real-world evidence is for that specific question, and why. Watch for the pattern in what makes RWE weak versus strong.

1. Does drug A extend survival more than drug B?

2. How long do untreated patients with this disease survive? (natural history)

3. How many patients in the country are eligible? (epidemiology)

4. What is 10-year survival for patients on this therapy? (long-term)

5. What does real-world care actually cost per patient?

6. In a disease too rare for any RCT, how does treated survival compare to historical untreated patients? (external control)


There's the pattern. Out of six real HTA questions, RWE is weak on exactly one — the head-to-head treatment-effect comparison, where confounding by indication lives. On the other five — natural history, epidemiology, long-term survival, real costs, rare-disease external control — RWE ranges from strong to the only evidence in existence. So a submission leaning on RWE isn't automatically weak; it depends entirely on which question the RWE is answering. Judge the role, never the data alone.
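The pattern can be written down as a tiny rule of thumb. The sketch below is a deliberate simplification: the two flags (comparative, rct_feasible) and the three labels are assumptions made for illustration, not an official grading scheme, and a real appraisal would also look at data quality, completeness and relevance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    comparative: bool          # does it compare outcomes between treatments?
    rct_feasible: bool = True  # only matters for comparative questions


def rwe_strength(q: Question) -> str:
    """Rule of thumb: confounding by indication only bites on
    comparative treatment-effect questions."""
    if not q.comparative:
        return "strong"
    if q.rct_feasible:
        return "weak"
    return "only option (with caution)"


questions = [
    Question("Does drug A extend survival more than drug B?", comparative=True),
    Question("How long do untreated patients survive?", comparative=False),
    Question("How many patients in the country are eligible?", comparative=False),
    Question("What is 10-year survival on this therapy?", comparative=False),
    Question("What does real-world care cost per patient?", comparative=False),
    Question("Rare disease: treated vs historical untreated survival?",
             comparative=True, rct_feasible=False),
]

for q in questions:
    print(f"{rwe_strength(q):<28} {q.text}")
```

Note that the verdict is computed from the question, never from the dataset: the same registry could sit behind all six rows.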

Now you.

For each question, how strong is real-world evidence — strong, weak, or the only option (with caution)?

1. "Which of two active drugs works better?"

2. "How many people in the population have the disease?"

3. "What is the untreated natural course of this disease?"

4. "What does treating a patient actually cost in routine practice?"

5. "In an ultra-rare disease where no RCT is feasible, does the drug beat historical controls?"

6. "Does adding drug X to standard care improve outcomes versus standard care alone?"

RWE as the engine of conditional decisions.

Now the payoff that ties three modules together. Real-world evidence doesn't just fill gaps in a static assessment — it enables a fundamentally different kind of decision.

Recall two earlier threads. Module 9 showed that when a decision is uncertain, information has value — sometimes it's worth paying to reduce the uncertainty before committing (the value of information). Module 10 showed that a payer's decision needn't be a blunt yes/no — it can be a conditional deal, a managed entry. Real-world evidence is the missing piece that makes those ideas operational: it's the mechanism that supplies the information, in routine practice, after a conditional yes.

The result is coverage with evidence development (a family of schemes that includes "only with research" arrangements and managed access agreements): instead of rejecting a promising-but-uncertain technology, or approving it blindly, the payer says "yes, provisionally, and you will collect real-world evidence as we go." Concretely: a promising rare-disease drug, with an encouraging but uncertain effect and a high value of information, enters through a managed-entry agreement with a mandatory registry. For three years, every treated patient's outcomes are recorded. Then the decision is revisited: if the RWE confirms the benefit, it converts to full reimbursement; if it doesn't, the price is renegotiated or the drug withdrawn.

Look at what that loop does. It closes the circle from Module 9 to Module 11: uncertainty (M9) → a conditional deal to buy time and information (M10) → real-world evidence that actually resolves it (M11). The decision stops being a single shot in the dark and becomes a process of learning — commit provisionally, watch what happens in the real world, then confirm or reverse. RWE is the engine that makes "decide and keep learning" possible.
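The loop can be sketched as a pre-specified decision rule. This is a hypothetical illustration, not how any particular agency implements managed entry: the names (ManagedEntryAgreement, min_confirmed_benefit, benefit_lower_bound) and the idea of comparing a lower bound on the registry's benefit estimate against a threshold fixed at the outset are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    KEEP_COLLECTING = "keep collecting registry data"
    FULL_REIMBURSEMENT = "convert to full reimbursement"
    RENEGOTIATE_OR_WITHDRAW = "renegotiate the price or withdraw"


@dataclass(frozen=True)
class ManagedEntryAgreement:
    review_after_years: int       # when the decision is revisited
    min_confirmed_benefit: float  # threshold agreed in advance (hypothetical units)


def reassess(agreement: ManagedEntryAgreement,
             years_elapsed: int,
             benefit_lower_bound: float) -> Decision:
    """Apply the rule agreed at entry to the evidence collected so far."""
    if years_elapsed < agreement.review_after_years:
        return Decision.KEEP_COLLECTING
    if benefit_lower_bound >= agreement.min_confirmed_benefit:
        return Decision.FULL_REIMBURSEMENT
    return Decision.RENEGOTIATE_OR_WITHDRAW


# Hypothetical agreement: review at three years, benefit threshold of 0.5
agreement = ManagedEntryAgreement(review_after_years=3, min_confirmed_benefit=0.5)
print(reassess(agreement, years_elapsed=1, benefit_lower_bound=0.9).value)
print(reassess(agreement, years_elapsed=3, benefit_lower_bound=0.7).value)
print(reassess(agreement, years_elapsed=3, benefit_lower_bound=0.2).value)
```

The design point is that the threshold lives in the agreement, written down before any data arrive, so the reassessment is a check against a rule rather than a fresh negotiation.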

The catches (why "conditional" often isn't).

That loop is elegant on a slide. In practice it's harder, and a clear-eyed assessor knows where it strains:

None of this is a reason to abandon conditional decisions — they're often the right call when a technology is promising but the evidence is thin. It's a reason to design them honestly: specify in advance what evidence would confirm or reverse the decision, make the reversal credible, and only demand the data whose answer could move the outcome. Otherwise "managed entry" is just approval wearing a lab coat.

What's the soundest assessment?

A manufacturer submits a single-arm study of a new drug for a very rare cancer (too rare for a randomised trial), using real-world data on historical untreated patients as the comparison. They also provide registry data on how many patients have the disease and what current care costs. A reviewer must weigh the real-world evidence. What's the soundest assessment?

Why this matters for HTA

Reading RWE well is one of the defining skills of modern HTA, precisely because real-world evidence now appears in almost every submission, doing many different jobs at once:

Real-world evidence isn't strong or weak in the abstract — it's strong or weak for a job. Ask what job it's doing before you ask whether to trust it, and the same messy data becomes either a liability or the only window you have onto the world the decision actually lives in.

Real-world evidence in HTA decisions, in one breath.

An RCT tells you whether a drug can work; the rest of what a decision needs — who has the disease, what it costs, how long people really live, whether the promise held up — lives in the real world. The craft is knowing which questions to bring to which evidence, and turning uncertainty into something a health system can learn its way out of.

That closes Module 11 — and with it, the evidence and analysis half of HTA is complete: you can appraise a trial, synthesise the literature, value health, model it, quantify its uncertainty, judge affordability, and read real-world evidence for what it's worth. What remains is the world these analyses actually live in: the agencies that run them, the laws that bind them, the reimbursement processes that turn an assessment into a decision. Who are NICE, IQWiG, AOTMiT? What is the EU's new HTA Regulation? How does the money actually get approved? That's the regulatory and reimbursement context of Module 12.