Phylogenetically informative data from next-generation sequencing

We have long expected that complete genomic information will give us the tools to determine relationships among organisms, from strains of pathogens to the whole tree of life. However, while whole-genome sequencing is now possible, extracting information from these data remains challenging. I have developed software (SISRS) that identifies phylogenetically informative data directly from raw next-generation sequence data. We have used SISRS data to show that non-coding loci provide more overall signal, and a higher proportion of phylogenetic signal, than coding loci, and that different types of loci (e.g., coding loci vs. introns) have surprisingly consistent levels of information across time scales.

Estimating diversification rates from phylogenies

Copy number variation in genomes

Evolution of plants in the Andes

This research is funded by NSF grant DEB-2100217 to R. Schwartz.