Statistical and scientific interests
Bayesian inference
Bayesian inference is a powerful statistical tool for hierarchical models, a natural way to incorporate prior understanding into models, and a convenient framework for jointly accounting for and propagating uncertainty. Beyond the development of models to answer scientific questions, I am interested in efficient and scalable inference, model adequacy, and diagnosing MCMC convergence. I have investigated the usefulness of a variety of approaches for scalable inference, including gradient-based MCMC techniques (both with and without further approximations) and more ad hoc approaches. I have devised model adequacy diagnostics for genomic data to examine correlated evolution and for trees to investigate evolutionary rates through time. For model parameters that aren’t simple numeric quantities, I have worked to extend notions of the effective sample size, so that we can still assess whether our MCMC is giving us trustworthy output.
Branching processes
Branching process models are useful tools in both evolution and epidemiology. Evolutionary trees can be modeled as partially observed branching processes, most commonly birth-death models. In studies of the tree of life, speciation is modeled by births and extinction by deaths. In epidemiological contexts where genomic sequence data are available, we can also apply these evolutionary methods, with infection and recovery replacing speciation and extinction. Branching processes can also be useful approximations early in an outbreak or in subcritical regimes. I have worked on statistical models for phylogenetic trees and final size data, as well as tools for understanding models via simulation. I have devised Bayesian nonparametric models for time-varying birth and death rates, and examined the limits of what can be learned from data.
Computational evolutionary biology
Evolution is both the science of the history of life and a lens through which we can better study life as we know it. One particularly useful toolkit for studying evolution centers on the phylogeny, an evolutionary tree that describes relationships between, say, species or viral lineages. Phylogenies can be used as the backbone of systems of taxonomy, to investigate mass extinctions, to study rates of viral spread, and to examine the geospatial dispersal patterns of viruses. (For a statistical perspective on questions you can ask with a phylogeny, check out this review paper.) I am also interested in the process of phylogenetic inference itself. Trees are challenging objects to infer, which can make it difficult to usefully ask questions like “what happens when our model’s assumptions are bad?” or “how can we tell if our model output is trustworthy?” I study these questions with computational experiments and, where possible, mathematical extensions of existing theory.