Tomorrow morning I’m giving a talk on how to subject fewer patients to ineffective treatment in clinical trials. I should have used something like the title of this post as the title of my talk, but instead my talk is called “Clinical Trial Monitoring With Bayesian Hypothesis Testing.” Classic sales mistake: emphasizing features rather than benefits. But the talk is at a statistical conference, so maybe the feature-oriented title isn’t so bad.
Ethical concerns are the main consideration that makes biostatistics a separate branch of statistics. You can’t test experimental drugs on people the way you test experimental fertilizers on crops. In human trials, you want to stop the trial early if it looks like the experimental treatment is not as effective as a comparable established treatment, but you want to keep going if it looks like the new treatment might be better. You need to establish rules before the trial starts that quantify exactly what it means for one treatment to look better or worse than another. There are a lot of ways of doing this quantification, and some work better than others. Within its context (single-arm phase II trials with binary or time-to-event endpoints), the method I’m presenting stops ineffective trials sooner than the methods we compare it to, while stopping no more often in situations where you’d want the trial to continue.
If you’re not familiar with statistics, this may sound strange. Why not always stop when a treatment is worse and never stop when it’s better? Because you never know with certainty that one treatment is better than another. The more patients you test, the more sure you can be of your decision, but some uncertainty always remains. So you face a trade-off between being more confident of your conclusion and experimenting on more patients. If you think a drug is bad, you don’t want to treat thousands more patients with it in order to be extra confident that it’s bad, so you stop. But you run the risk of shutting down a trial of a treatment that really is an improvement but by chance appeared to be worse at the time you made the decision to stop. Statistics is all about such trade-offs.
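To make the trade-off concrete, here is a minimal sketch of a toy stopping rule. It is not the method from the talk. It assumes a single-arm trial with a binary endpoint, a uniform prior on the response rate, and a rule that stops the trial as soon as the posterior probability that the response rate is below a standard rate `p0` exceeds `cutoff`. The function names and the values `p0=0.3`, `cutoff=0.9`, and `max_n=40` are all made up for illustration.

```python
from math import comb

def prob_rate_below(p0, successes, failures):
    """P(rate < p0) under a Beta(1 + successes, 1 + failures) posterior.

    Assumes a uniform prior, so both Beta parameters are integers and the
    Beta CDF can be written as a binomial tail sum.
    """
    a, b = 1 + successes, 1 + failures
    n = a + b - 1
    return sum(comb(n, j) * p0**j * (1 - p0)**(n - j) for j in range(a, n + 1))

def operating_characteristics(true_rate, p0=0.3, cutoff=0.9, max_n=40):
    """Return (probability of stopping early, expected number of patients).

    Computed exactly by tracking the probability of every path that has
    not yet hit the stopping rule. All default values are illustrative.
    """
    alive = {0: 1.0}  # alive[s]: probability trial is running with s responses
    p_stop = 0.0
    expected_n = 0.0
    for n in range(1, max_n + 1):
        # Treat one more patient on every path that is still running.
        nxt = {}
        for s, pr in alive.items():
            nxt[s + 1] = nxt.get(s + 1, 0.0) + pr * true_rate
            nxt[s] = nxt.get(s, 0.0) + pr * (1 - true_rate)
        # Apply the stopping rule (no early stop at the final patient).
        alive = {}
        for s, pr in nxt.items():
            if n < max_n and prob_rate_below(p0, s, n - s) > cutoff:
                p_stop += pr
                expected_n += n * pr
            else:
                alive[s] = pr
    expected_n += max_n * sum(alive.values())
    return p_stop, expected_n

if __name__ == "__main__":
    for rate in (0.1, 0.3, 0.5):
        for cutoff in (0.8, 0.9, 0.95):
            p_stop, n_avg = operating_characteristics(rate, cutoff=cutoff)
            print(f"true rate {rate:.1f}, cutoff {cutoff:.2f}: "
                  f"P(stop early) = {p_stop:.3f}, expected patients = {n_avg:.1f}")
```

Under this toy rule, a stricter cutoff makes it less likely that you shut down a treatment that really is better, but it also means more patients receive a treatment that really is worse before the trial stops. Even a good treatment has some chance of being stopped: with these defaults, six failures in a row are enough to end the trial, and that can happen by chance at any true response rate below 1.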