Adaptive clinical trials and machine learning

Arguments over the difference between statistics and machine learning are often pointless. There is a huge overlap between the two approaches to analyzing data, sometimes obscured by differences in vocabulary. However, there is one distinction that is helpful. Statistics aims to build accurate models of phenomena, implicitly leaving the exploitation of these models to others. Machine learning aims to solve problems more directly, and sees its models as intermediate artifacts; if an unrealistic model leads to good solutions, it’s good enough.

This distinction is valid in broad strokes, though things are fuzzier than it admits. Some statisticians are content with constructing models, while others look further down the road to how the models are used. And machine learning experts vary in their interest in creating accurate models.

Clinical trial design usually comes under the heading of statistics, though in spirit it’s more like machine learning. The goal of a clinical trial is to answer some question, such as whether a treatment is safe or effective, while also having safeguards in place to stop the trial early if necessary. There is an underlying model—implicit in some methods, more often explicit in newer methods—that guides the conduct of the trial, but the accuracy of this model per se is not the primary goal. Some designs have been shown to be fairly robust, leading to good decisions even when the underlying probability model does not fit well. For example, I’ve done some work with clinical trial methods that model survival times with an exponential distribution. No one believes that an exponential distribution, i.e. one with constant hazard, accurately models survival times in cancer trials, and yet methods using these models do a good job of stopping trials early that should stop early and letting trials continue that should be allowed to continue.
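To make the robustness claim concrete, here is a toy sketch, not any of the actual trial methods mentioned above. It assumes a hypothetical two-arm trial whose true survival times are Weibull with rising hazard, then fits the deliberately wrong constant-hazard (exponential) model to each arm. The arm sizes, scales, shape, censoring time, and function names are all made up for illustration.

```python
import random

def exponential_rate_mle(times, events):
    """MLE of a constant hazard: observed events / total follow-up time."""
    return sum(events) / sum(times)

def simulate_arm(n, scale, shape, censor_time, rng):
    """Draw Weibull survival times and censor them at a fixed follow-up time."""
    times, events = [], []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)  # (scale, shape) parameterization
        times.append(min(t, censor_time))
        events.append(1 if t <= censor_time else 0)
    return times, events

rng = random.Random(0)

# Hypothetical truth: shape=2 means the hazard increases with time,
# so the exponential model is misspecified for both arms.
control = simulate_arm(200, scale=10.0, shape=2.0, censor_time=8.0, rng=rng)
treatment = simulate_arm(200, scale=14.0, shape=2.0, censor_time=8.0, rng=rng)

rate_control = exponential_rate_mle(*control)
rate_treatment = exponential_rate_mle(*treatment)

# The fitted hazards are wrong as descriptions of the data, but the
# comparison between arms still points in the right direction.
treatment_looks_better = rate_treatment < rate_control
print(rate_control, rate_treatment, treatment_looks_better)
```

Neither fitted rate describes the simulated data well, since the true hazard is not constant. The misspecification distorts both arms in the same direction, though, so the comparison that drives the decision survives.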

Experts in machine learning are more accustomed to the idea of inaccurate models sometimes producing good results. The best example may be naive Bayes classifiers. The word “naive” in the name is a frank admission that these classifiers treat events as independent even though they are known not to be. These methods can do well at their ultimate goal, such as distinguishing spam from legitimate email, even though they make a drastic simplifying assumption.

There have been papers that look at why naive Bayes works surprisingly well. Naive Bayes classifiers work well when the errors due to wrongly assuming independence affect positive and negative examples roughly equally. The inaccuracies of the model largely wash out when its output is reduced to a binary decision, classifying as positive or negative. Something similar happens with the clinical trial methods mentioned above. The ultimate goal is to make correct go/no-go decisions, not to accurately model survival times. The naive exponential assumption affects both trials that should stop and trials that should not, and the model predictions are reduced to a binary decision.
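A small worked example shows how a wrong independence assumption can distort the probabilities while leaving the decision alone. This is a minimal sketch under made-up numbers: one binary feature that is mildly more common in class A than in class B, copied ten times so the copies are perfectly dependent. The function name and the probabilities 0.6 and 0.4 are assumptions chosen for illustration.

```python
def naive_bayes_posterior(p_x_given_a, p_x_given_b, copies, prior_a=0.5):
    """P(A | feature present), multiplying the likelihood once per copy
    as if every copy were an independent piece of evidence."""
    like_a = prior_a * p_x_given_a ** copies
    like_b = (1.0 - prior_a) * p_x_given_b ** copies
    return like_a / (like_a + like_b)

# With a single copy there is no independence assumption to get wrong,
# so this is the correct posterior.
true_posterior = naive_bayes_posterior(0.6, 0.4, copies=1)

# Ten identical copies carry no extra information, but naive Bayes
# counts each one as fresh evidence.
naive_posterior = naive_bayes_posterior(0.6, 0.4, copies=10)

# Both posteriors fall on the same side of 0.5, so the classification agrees.
same_decision = (true_posterior > 0.5) == (naive_posterior > 0.5)
print(true_posterior, naive_posterior, same_decision)
```

The naive posterior is far more confident than the correct one, yet both lead to class A. The error shows up in the stated probability, not in the binary call.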


4 thoughts on “Adaptive clinical trials and machine learning”

  1. A quibble, but I think that naive Bayes works well because the error tends to express itself in model overconfidence rather than prediction error. That is, an NB model will generally say that it’s 99.9% confident that a certain example is of class A, when the “true” model would only say 60%, but, in any case, the decision is the same.

  2. This is a great characterization of the difference between ML and stats. My only quibble is with the use of “effect” where “affect” is grammatically correct.

  3. This is a sweeping generalization and seems founded in a misunderstanding of how problems are approached in statistics. It may be the case that this is a problem in some cases, but overall I think the trend in statistics for at least the past 30 years has been toward simultaneously building “accurate” and “direct” (in your language) models in order both to predict and to answer scientific questions via statistical inference. Try doing that with naive Bayes.
