
Methods

Clinical Trials Methodology

  • Berger VW, Bour LJ, Carter K, Chipman JJ, Everett CC, Heussen N, Hewitt C, Hilgers RD, Luo YA, Renteria J, Ryeznik Y, Sverdlov O, Uschner D. A roadmap to using randomization in clinical trials. BMC Med Res Methodol. 2021. [link]
  • Chipman JJ, Mayberry L, Greevy RA. Rematching on-the-fly: Sequential matched randomization and a case for covariate-adjusted randomization. Stat Med. 2023. [link]
  • Sverdlov O, Carter K, Hilgers RD, Everett CC, Berger VW, Luo YA, Chipman JJ, Ryeznik Y, Ross J, Knight R, Yamada K. Which Randomization Methods Are Used Most Frequently in Clinical Trials? Stat Biopharm Res. 2023. [link]
  • Uschner D, Sverdlov O, Carter K, Chipman J, Kuznetsova O, Renteria J, Lane A, Barker C, Geller N, Proschan M, Posch M. Using Randomization Tests to Address Disruptions in Clinical Trials. Stat Biopharm Res. 2023. [link]
  • Wei J, Mozumder SI, Li L, Xi D, Xu J, Lin R, Sverdlov O, Chipman JJ. Current practice on covariate adjustment and stratified analysis —based on survey results by ASA oncology estimand working group conditional and marginal effect task force. BMC Med Res Methodol. 2025. [link]
  • Kuznetsova O, Ross J, Bodden D, Cooner F, Chipman J, Jacko P, Krisam J, Luo YA, Mielke T, Robertson DS, Ryeznik Y, Villar SS, Zhao W, Sverdlov O. Randomization in the age of platform trials: unexplored challenges and some potential solutions. BMC Med Res Methodol. 2025. [link]
  • Chipman JJ, Rosenberger WF, Sverdlov O, Uschner D. Special Issue on Randomization Methods to Design and Analyze Trial Estimands, Adjust for Covariates, and Implement Efficient Designs. Stat Biopharm Res. 2024. [link]
  • Del Fiol G, Orleans B, Kuzmenko TV, Chipman J, Greene T, Martinez A, Wirth J, Meads R, Kaphingst KK, Gibson B, Kawamoto K, King AJ, Siaperas T, Hughes S, Pruhs A, Dinkins CP, Lam CY, Pierce JH, Benson R, Borsato EP, Cornia R, Stevens L, Bradshaw RL, Schlechter CR, Wetter DW. SCALE-UP II: protocol for a pragmatic randomised trial examining population health management interventions to increase the uptake of at-home COVID-19 testing in community health centres. BMJ Open. 2024. [link]
  • Han G, Zhao B, Pye K, Zhao H. The Piecewise Exponential Distribution. Signif (Oxf). 2017. [link]
  • Kukhareva PV, Li H, Balbin C, Stevens ER, Mann DM, Butler JM, Caverly TJ, Del Fiol G, Kaphingst KA, Schlechter CR, Tiase VL, Fagerlin A, Zhang Y, Hess R, Flynn MC, Reddy C, Martin D, Warner PB, Nanjo C, Choi J, Ngo-Metzger Q, Kawamoto K. Enhancement of Patient-Centered Lung Cancer Screening: The MyLungHealth Randomized Clinical Trial. JAMA Oncol. 2026. [link]
  • Chipman JJ, Greevy RA Jr, Mayberry L, Blume JD. Sequential monitoring using the Second Generation P-Value with Type I error controlled by monitoring frequency. The American Statistician. 2025. [link]
  • Chipman JJ, Sanda MG, Dunn RL, Wei JT, Litwin MS, Crociani CM, Regan MM, Chang P; PROST-QA Consortium. Measuring and predicting prostate cancer related quality of life changes using EPIC for clinical practice. J Urol. 2014. [link]
  • Greene T, Ying J, Vonesh EF, Tighiouart H, Levey AS, Coresh J, Herrick JS, Imai E, Jafar TH, Maes BD, Perrone RD, Del Vecchio L, Wetzels JFM, Heerspink HJL, Inker LA. Performance of GFR Slope as a Surrogate End Point for Kidney Disease Progression in Clinical Trials: A Statistical Simulation. J Am Soc Nephrol. 2019. [link]

Causal Inference & Decision Making

  • Wang X, Lee H, Haaland B, Kerrigan K, Puri S, Akerley W, Shen J. A matching-based machine learning approach to estimating optimal dynamic treatment regimes with time-to-event outcomes. Stat Methods Med Res. 2024. [link]
  • Chen S, Wu H, Zhao H. A comparison of causal inference methods for evaluating multiple treatment groups. J Nonparametr Stat. 2025. [link]
  • Xu Y, Yadlowsky S. Calibration Error for Heterogeneous Treatment Effects. Proceedings of Machine Learning Research. 2022. [link]
  • Xu Y, Greene TH, Bress AP, Bellows BK, Zhang Y, Zhang Z, Kolm P, Weintraub WS, Moran AS, Shen J. An efficient approach for optimizing the cost-effective individualized treatment rule using conditional random forest. Stat Methods Med Res. 2022. [link]
  • Xu Y, Greene TH, Bress AP, Sauer BC, Bellows BK, Zhang Y, Weintraub WS, Moran AE, Shen J. Estimating the optimal individualized treatment rule from a cost-effectiveness perspective. Biometrics. 2022. [link]
  • Shen J, Schwartz J, Baccarelli AA, Lin X. Testing for the causal mediation effects of multiple mediators using the kernel machine difference method in genome-wide epigenetic studies. Ann Appl Stat. 2024. [link]

Data Science, Prediction & Biomarker/AI Evaluation

  • Knudsen BS, Jadhav A, Perry LJ, Thagaard J, Deftereos G, Ying J, Brintz BJ, Zhang W. A pipeline for evaluation of machine learning/artificial intelligence models to quantify PD-L1 immunohistochemistry. Lab Invest. 2024. [link]
  • Jo Y, Puri S, Haaland B, Coletta AM, Chipman JJ, Embrey K, Kerrigan KC, Patel SB, Moynahan K, Gumbleton M, Akerley WL. Utilizing activity trackers to enhance survival prediction in metastatic non-small cell lung cancer. Clin Lung Cancer. 2025. [link]
  • Jo Y, Chipman JJ, Haaland B, Greene T, Kohli M. Multigene Copy Number Alteration Risk Score Biomarker–Based Enrichment Study Designs in Metastatic Castrate-Resistant Prostate Cancer. JCO Precis Oncol. 2024. [link]
  • Li Q, Fisher K, Meng W, Fang B, Welsh E, Haura EB, Koomen JM, Eschrich SA, Fridley BL, Chen YA. GMSimpute: a generalized two-step Lasso approach to impute missing values in label-free mass spectrum analysis. Bioinformatics. 2020. [link]
  • Chen YA, Almeida JS, Richards AJ, Müller P, Carroll RJ, Rohrer B. A nonparametric approach to detect nonlinear correlation in gene expression. J Comput Graph Stat. 2010. [link]

Omics Methods

  • Qiao X, Ngo D, Straight B, Needham BL, Hilton CE, Naugle A. A Bayesian high-dimensional mediation analysis for multilevel genome-wide epigenetic data. J Appl Stat. 2025. [link]
  • Brintz B, Fuentes C, Madsen L. An asymptotic approximation to the N-mixture model for the estimation of disease prevalence. Biometrics. 2018;74(4):1512–1518. doi:10.1111/biom.12913. [link]
  • Sweeney C, Boucher KM, Samowitz WS, Wolff RK, Albertsen H, Curtin K, Caan BJ, Slattery ML. Oncogenetic tree model of somatic mutations and DNA methylation in colon tumors. Genes Chromosomes Cancer. 2009. [link]
  • Nix DA, Courdy SJ, Boucher KM. Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks. BMC Bioinformatics. 2008. [link]
  • Tavtigian SV, Harrison SM, Boucher KM, Biesecker LG. Fitting a naturally scaled point system to the ACMG/AMP variant classification guidelines. Hum Mutat. 2020. [link]
  • Clark KA, Paquette A, Tao K, Bell R, Boyle JL, Rosenthal J, Snow AK, Stark AW, Thompson BA, Unger J, Gertz J, Varley KE, Boucher KM, Goldgar DE, Foulkes WD, Thomas A, Tavtigian SV. Comprehensive evaluation and efficient classification of BRCA1 RING domain missense substitutions. Am J Hum Genet. 2022. [link]

Code

Sequential Monitoring - by Dr. Jonathan Chipman

  • SeqSGPV: The SeqSGPV package is used to design a study with sequential monitoring of scientifically meaningful hypotheses using the second generation p-value (SGPV). (link)
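As a rough illustration of the quantity this package monitors, the second generation p-value compares an interval estimate with an interval null hypothesis and reports how much of the estimate falls inside the null. The sketch below computes it from the published SGPV definition. The function name `sgpv` and its argument names are hypothetical and are not the SeqSGPV package's interface; the sequential monitoring rules the package handles are not reproduced here.

```python
def sgpv(est_lo, est_hi, null_lo, null_hi):
    """Second generation p-value for an interval estimate [est_lo, est_hi]
    against an interval null hypothesis [null_lo, null_hi].

    Hypothetical helper for illustration only; not the SeqSGPV API.
    """
    if est_lo > est_hi or null_lo >= null_hi:
        raise ValueError("intervals must be ordered and the null must have positive length")

    est_len = est_hi - est_lo
    null_len = null_hi - null_lo

    # Degenerate case: a point estimate is either inside the null or not.
    if est_len == 0:
        return 1.0 if null_lo <= est_lo <= null_hi else 0.0

    # Length of the overlap between the interval estimate and the null.
    overlap = max(0.0, min(est_hi, null_hi) - max(est_lo, null_lo))

    # Fraction of the estimate inside the null, with a correction that caps
    # the value at 1/2 when the estimate is more than twice as wide as the null.
    correction = max(est_len / (2.0 * null_len), 1.0)
    return (overlap / est_len) * correction


if __name__ == "__main__":
    null = (-0.2, 0.2)  # assumed zone of trivially small effects
    print(sgpv(0.5, 1.5, *null))   # estimate entirely outside the null
    print(sgpv(-0.1, 0.1, *null))  # estimate entirely inside the null
    print(sgpv(-1.0, 1.0, *null))  # wide, inconclusive estimate
```

A value of 0 means the data support only effects outside the null zone, 1 means they support only null effects, and values in between are inconclusive.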

Surrogacy evaluation - by Dr. Xuan Wang

  • OptimalSurrogate: Model Free Approach to Quantifying Surrogacy. Identifies an optimal transformation of a surrogate marker such that the proportion of treatment effect explained can be inferred based on the transformation of the surrogate and nonparametrically estimates two model-free quantities of this proportion. (link)
  • CMFsurrogate: Calibrated Model Fusion Approach to Combine Surrogate Markers. Uses a calibrated model fusion approach to optimally combine multiple surrogate markers. Specifically, two initial estimates of optimal composite scores of the markers are obtained; the optimal calibrated combination of the two estimated scores is then constructed, which ensures both the validity of the final combined score and its optimality with respect to the proportion of treatment effect explained (PTE). (link)
  • OSsurvival: Assessing Surrogacy with a Censored Outcome. Identifies the optimal transformation of a surrogate marker and estimates the proportion of treatment effect explained (PTE) by the optimally-transformed surrogate at an earlier time point when the primary outcome of interest is a censored time-to-event outcome. (link)
  • SurrogateOutcome: Estimation of the Proportion of Treatment Effect Explained by Surrogate Outcome Information. Estimates the proportion of treatment effect on a censored primary outcome that is explained by the treatment effect on a censored surrogate outcome/event. (link)
  • PTERP: PTE and RP for Optimally-Transformed Surrogate. Evaluates the strength of a surrogate marker by estimating the proportion of treatment effect explained (PTE) and relative power (RP) for the optimally-transformed version of the surrogate. (link)
  • longsurr: Longitudinal Surrogate Marker Analysis. (link)

Omics - by Dr. Ann Chen

  • ISCVAM: fast, interactive visual analytics for single-cell RNA-seq and multimodal datasets. (link)
  • Drepmel: A Multi-Omics Melanoma Drug Repurposing Resource for Prioritizing Drug Combinations and Understanding Tumor Microenvironment. (link)
  • GMSimpute: a generalized two-step Lasso approach to impute missing values in label-free mass spectrum analysis. (link)

Microbiome and Epigenetics Computational Tools - by Dr. Xi Qiao

  • ZIMMA: a dual mediation modeling framework designed to identify microbial biomarkers that mediate between exposures or treatments and health outcomes. By separating microbial prevalence from abundance, the method supports a more nuanced biological interpretation of microbiome composition and dominance and clarifies the role of microbes in health outcomes. (link)
  • CoMPaSS: a computational pipeline designed to systematically compare the performance of microbiome sequencing platforms and to guide study design. It quantitatively evaluates cross-platform concordance, from community diversity to taxonomic profiling, and provides integrated support for study design through power analysis, sample size estimation, and cost assessment. (link)
  • BHMM: a Bayesian hierarchical mediation framework designed to identify epigenetic biomarkers that mediate between exposures and outcomes in multilevel data. It captures within- and between-subject correlations through hierarchical modeling and selects among high-dimensional mediators using spike-and-slab priors, supporting reliable identification of epigenetic pathways. (link)

Metalearners for estimating heterogeneous treatment effects - by Dr. Yizhe (Crystal) Xu

  • Survlearners: Extended implementations of five state-of-the-art metalearners (S-, T-, X-, M-, and R-learners) for survival outcomes. Metalearners are meta-algorithms that leverage predictive models, e.g., machine learning models, to solve the causal task of estimating treatment effect heterogeneity. (link)
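To make the metalearner idea concrete, here is a minimal sketch of a T-learner, the simplest of the five: fit one outcome model per treatment arm, then take the difference of their predictions as the estimated conditional treatment effect. It assumes a fully observed continuous outcome and a single covariate, so it ignores the censoring that survival metalearners must handle. The names `fit_line` and `t_learner` are hypothetical and are not part of the Survlearners package, and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def t_learner(x, treated, y):
    """Fit separate outcome models for control (0) and treated (1) units and
    return a function estimating the conditional average treatment effect."""
    arms = {0: ([], []), 1: ([], [])}
    for xi, ti, yi in zip(x, treated, y):
        arms[ti][0].append(xi)
        arms[ti][1].append(yi)

    b0_ctrl, b1_ctrl = fit_line(*arms[0])
    b0_trt, b1_trt = fit_line(*arms[1])

    def cate(x_new):
        # Predicted outcome under treatment minus predicted outcome under control.
        return (b0_trt + b1_trt * x_new) - (b0_ctrl + b1_ctrl * x_new)

    return cate


if __name__ == "__main__":
    # Made-up, noiseless data: control y = 1 + 2x, treated y = 1.5 + 3x,
    # so the true effect at covariate value x is 0.5 + x.
    x = [0, 1, 2, 3, 4] * 2
    treated = [0] * 5 + [1] * 5
    y = [1 + 2 * v for v in range(5)] + [1.5 + 3 * v for v in range(5)]

    cate = t_learner(x, treated, y)
    print(cate(2.0))
```

In practice the two linear fits would be replaced by flexible learners, and with survival outcomes the base models must account for censoring.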

Target trial emulation (TTE) - by Dr. Yizhe (Crystal) Xu

  • TTECausalR: R tutorials for implementing target trial emulation (TTE) and causal inference methods to estimate treatment effects using observational data. These tutorials introduce important concepts in TTE and provide step-by-step instructions for deploying the procedures with examples. (link)

Other useful R packages or resources

  • tableone: An R package for creating “Table 1”, the summary of baseline characteristics. (link)