Automated, efficient and model-free inference for randomized clinical trials via data-driven covariate adjustment
Topic: Automated, efficient and model-free inference for randomized clinical trials via data-driven covariate adjustment
Datetime: September 13th Friday, 11am-12pm EDT.
Presenter: Kelly Van Lancker from Ghent University
Zoom link:: https://umich.zoom.us/j/7573650566
Summary: In May 2023, the U.S. Food and Drug Administration (FDA) released guidance for industry on “Adjustment for Covariates in Randomized Clinical Trials for Drugs and Biological Products”. Covariate adjustment is a statistical analysis method for improving precision and power in clinical trials by adjusting for pre-specified, prognostic baseline variables. Though recommended by the FDA and the European Medicines Agency (EMA), many trials do not exploit the available information in baseline variables or make use only of the baseline measurement of the outcome. This is likely (partly) due to the regulatory mandate to pre-specify baseline covariates for adjustment, leading to challenges in determining appropriate covariates and their functional forms. We will explore the potential of automated data-adaptive methods, such as machine learning algorithms, for covariate adjustment, addressing the challenge of pre-specification. Specifically, our approach allows the use of complex models or machine learning algorithms without compromising the interpretation or validity of the treatment effect estimate and its corresponding standard error, even in the presence of misspecified outcome working models. This contrasts the majority of competing works which assume correct model specification for the validity of standard errors. Our proposed estimators either necessitate ultra-sparsity in the outcome model (which can be relaxed by limiting the number of predictors in the model) or necessitate integration with sample splitting to enhance their performance. As such, we will arrive at simple estimators and standard errors for the marginal treatment effect in randomized clinical trials, which exploit data-adaptive outcome predictions based on prognostic baseline covariates, and have low (or no) bias in finite samples even when those predictions are themselves biased.