Discussion
In this study, we developed a strategy for incorporating machine learning into the prediction of real-world time-on-treatment curves. To this end, we generalized the problem to that of predicting the expected future time on treatment and then stratified the distribution of the predicted time. Using simulated data, we showed that this approach performs strongly in predicting rwTTD across a variety of influencing factors. We showed that it is flexible enough to be applied with any machine learning base learner, and that it remains robust when trained and tested on different populations. Lastly, we demonstrated its robust performance on real-world data from lung cancer and head and neck cancer patients treated with pembrolizumab.
Although rwTTD is a critical metric for monitoring the efficacy of a treatment in real-world patient populations, to our knowledge no study has yet attempted to establish machine learning models to predict it. The key obstacle is that, rather than predicting individual scores, we are required to predict a curve. This notion and strategy are new and may spur work on curve prediction in many other research fields. Of note, we demonstrated that aggregating individual predictions does not reflect the overall profile of the population, which is an important rationale for the approach presented in this study.
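The gap between aggregated individual predictions and the population-level curve can be illustrated with a small simulation. The sketch below is not the model used in this study; it assumes a hypothetical population of two equally sized groups with exponentially distributed times on treatment, no censoring, and a "perfect" point predictor that returns each patient's expected time. The function name `curve`, the group means, and the time grid are all illustrative assumptions.

```python
import random


def curve(times, grid):
    """Fraction of patients still on treatment at each time point in grid.

    Assumes every discontinuation time is fully observed (no censoring).
    """
    n = len(times)
    return [sum(t > g for t in times) / n for g in grid]


random.seed(0)

# Hypothetical population: half the patients have a mean time on treatment
# of 3 months, the other half a mean of 12 months.
means = [3.0 if i % 2 == 0 else 12.0 for i in range(10000)]

# Observed times: one exponential draw per patient around that patient's mean.
observed = [random.expovariate(1.0 / m) for m in means]

# An idealized point predictor returns exactly each patient's expected time.
predicted = means

grid = [1.0, 6.0, 24.0]           # months
true_curve = curve(observed, grid)    # smooth decay across the grid
naive_curve = curve(predicted, grid)  # step function: 1.0, 0.5, 0.0

for g, t, p in zip(grid, true_curve, naive_curve):
    print(f"month {g:>4}: population {t:.3f}  aggregated predictions {p:.3f}")
```

Even though the point predictions in this toy setting are correct in expectation for every patient, the curve built from them is a step function that overstates early retention and drops to zero before the long tail of the observed curve. This is one way to see why a curve-level prediction target can be preferable to aggregating individual scores.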
This study opens up many follow-up directions. For example, can such models be trained on clinical trial data and then used to predict outcomes in real-world populations? Can models generalize well from one demographic group to another? While we touched on these aspects using simulated data and real-world pembrolizumab data, it will be of interest to test them in other diseases and drugs as well. How does the interpolation function affect the performance of the model? How do other base learners, such as deep learning or Gaussian process regression, work with this model? Our approach allows the incorporation of any supervised base learner, which can be tested in future studies of other diseases and therapeutics. Finally, this study opens the possibility of population-wise prediction, as distinct from individual-wise prediction. This could have wide-ranging applications in research areas whose current focus is on individual predictions.