Assistant Professor of Biostatistics

Harvard T.H. Chan School of Public Health

Department of Biostatistics

655 Huntington Avenue, Building 1, Room 419

Boston, Massachusetts 02115

The goal of my research is to develop statistical methods to help understand diseases of aging at the cellular and molecular level. I am interested in using high-throughput genomic data to develop mathematical models of the key biological processes and molecular mechanisms that underlie age-related decline in general, and cancer in particular. In addition to advancing our scientific understanding, this work will enable the development of accurate prognostic and diagnostic tools for precision medicine.

My methodological research focuses on:

- robustness to model misspecification,
- nonparametric Bayesian models,
- frequentist analysis of Bayesian methods, and
- efficient algorithms for inference in complex models.

My applied research includes:

- models for de-biasing high-throughput sequencing data,
- inferring cancer tumor phylogenetic trees (clonal evolution),
- biostatistical analysis of X-linked Dystonia-Parkinsonism (XDP), and
- models for tuberculosis risk assessment.

Bayesian data selection, E. N. Weinstein and J. W. Miller, 2021. (pdf) (arXiv) (Congratulations to Eli on receiving the IBM Student Paper Award at NESS 2021 for this work.)

Age-dependent regulation of SARS-CoV-2 cell entry genes and cell death programs correlates with COVID-19 severity, Z. Inde, B. A. Croker, and 29 others including J. W. Miller, Science Advances, 7(34): eabf8609, 2021. (pub) (pdf)

Trends, mechanisms, and racial/ethnic differences of tuberculosis incidence in the US-born population aged 50 years or older in the United States, S. Kim, T. Cohen, C. R. Horsburgh, Jr, J. W. Miller, A. N. Hill, S. M. Marks, R. Li, J. S. Kammerer, J. A. Salomon, and N. A. Menzies, Clinical Infectious Diseases, ciab668, July 2021. (pub) (pdf)

Asymptotic normality, concentration, and coverage of generalized posteriors, J. W. Miller, Journal of Machine Learning Research, 22(168):1-53, 2021. (pub) (pdf) (arXiv - preprint of earlier version)

Bayesian optimal experimental design for inferring causal structure, M. Zemplenyi and J. W. Miller, 2021. (pdf) (arXiv)

Reproducible model selection using bagged posteriors, J. H. Huggins and J. W. Miller, 2021. (pdf) (arXiv)

Inference in generalized bilinear models, J. W. Miller and S. L. Carter, 2020. (pdf) (arXiv)

Robust inference and model criticism using bagged posteriors, J. H. Huggins and J. W. Miller, 2020. (arXiv)

Identifying longevity associated genes by integrating gene expression and curated annotations, F. W. Townes, K. Carr, and J. W. Miller, PLOS Computational Biology, 16(11): e1008429, 2020. (pub) (pdf) (bioRxiv)

Fast and accurate approximation of the full conditional for gamma shape parameters, J. W. Miller, Journal of Computational and Graphical Statistics (JCGS), Vol. 28, 2019, pp. 476-480. (pub) (pdf) (arXiv) (source code)

An elementary derivation of the Chinese restaurant process from Sethuraman's stick-breaking process, J. W. Miller, Statistics & Probability Letters, Vol. 146, 2019, pp. 112-117. (pub) (pdf) (arXiv)

Robust Bayesian inference via coarsening, J. W. Miller and D. B. Dunson, Journal of the American Statistical Association (JASA), Vol. 114, 2019, pp. 1113-1125. (pub) (extended version) (older version on arXiv) (code and data) (Recognized publication for the 2021 COPSS George W. Snedecor Award, received by David B. Dunson.)

Real-time genomic characterization of advanced pancreatic cancer to enable precision medicine, A. J. Aguirre, J. A. Nowak, N. D. Camarda, R. A. Moffitt, and 57 others including J. W. Miller, Cancer Discovery, CD-18-0275, 2018. (pub)

A detailed treatment of Doob's theorem, J. W. Miller, 2018. (pdf) (arXiv)

Mixture models with a prior on the number of components, J. W. Miller and M. T. Harrison, Journal of the American Statistical Association (JASA), Vol. 113, 2018, pp. 340-356. (pub) (pdf) (arXiv) (code)

Flexible models for microclustering with application to entity resolution, B. Betancourt, G. Zanella, J. W. Miller, H. Wallach, A. Zaidi, and R. C. Steorts, Advances in Neural Information Processing Systems (NIPS), Vol. 29, 2016, pp. 1417-1425. (pub) (pdf) (arXiv)

Microclustering: When the cluster sizes grow sublinearly with the size of the data set, J. W. Miller, B. Betancourt, A. Zaidi, H. Wallach, and R. C. Steorts, Bayesian Nonparametrics: The Next Generation workshop, NIPS 2015. (pdf) (arXiv)

Inconsistency of Pitman-Yor process mixtures for the number of components, J. W. Miller and M. T. Harrison, Journal of Machine Learning Research, Vol. 15, 2014, pp. 3333-3370. (pub) (pdf) (arXiv) (Received the IBM Student Paper Award at NESS 2013.)

A simple example of Dirichlet process mixture inconsistency for the number of components, J. W. Miller and M. T. Harrison, Advances in Neural Information Processing Systems (NIPS), Vol. 26, 2013, pp. 199-206. (pub) (pdf) (arXiv)

Importance sampling for weighted binary random matrices with specified margins, M. T. Harrison and J. W. Miller. (pdf) (arXiv)

Exact sampling and counting for fixed-margin matrices, J. W. Miller and M. T. Harrison, The Annals of Statistics, Vol. 41, No. 3, 2013, pp. 1569-1592. (pub) (pdf) (arXiv)

Reduced criteria for degree sequences, J. W. Miller, Discrete Mathematics, Vol. 313, Issue 4, 2013, pp. 550-562. (pub) (pdf) (arXiv)

Nonparametric and Variable-Dimension Bayesian Mixture Models: Analysis, Comparison, and New Methods, J. W. Miller, Brown University, Division of Applied Mathematics, 2014. (pdf)

(Received the Brown University Outstanding Dissertation Award in the Physical Sciences, generously sponsored by the Joukowsky Family Foundation.)

Exact enumeration and sampling of matrices with specified margins, J. W. Miller and M. T. Harrison, Unpublished report (2011). (pdf) (arXiv)

A practical algorithm for exact inference on tables, J. W. Miller and M. T. Harrison, Proceedings of the Joint Statistical Meetings 2010, Statistical Computing Section.