Co-instructors: Matt Lavin (email)

This seminar series focuses on data analysis and visualization. In addition to phylogenetic programs such as PAUP, MrBayes, and BEAST2, the importance of using R (https://cran.r-project.org/) will be demonstrated. Students enrolled in PSPP 594 can sign up for DataCamp (https://www.datacamp.com) to learn about data science. 

Overall goal

This semester, we will work through Cadotte & Davies' Phylogenies in Ecology and its associated R code. (Normally, participants develop and enhance their own phylogenetic data sets in order to address specific questions and hypotheses related to their thesis research.)

Learning outcomes

1) Assemble phylogenetic data sets from morphological and genetic data.
2) Analyze a phylogenetic data set using parsimony, likelihood, and Bayesian approaches.
3) Analyze phylogenetic data by combining different types of data and applying alternative assumptions to each data partition.
4) Conduct clade support analyses and analyze potential conflict among data partitions.
5) Use likelihood and information criteria to manually select DNA substitution models and test for clock-like rates of evolution.
6) Estimate evolutionary rates of substitution and absolute ages for specified clades.
7) Analyze ecological and geographical constraints that potentially shape character evolution and the shape of phylogenetic trees.

In addition to R libraries, we will use programs such as MUSCLE and PhyDE that facilitate data processing. Tree generation will focus on parsimony using PAUP and on Bayesian inference using BEAST2, RevBayes, and MrBayes. Tree visualization will include FigTree and various R libraries. Databanking will include presentations on GenBank, TreeBase, and Datadryad. If time permits, we can address community phylogenetic approaches (e.g., with Phylocom and Phylomatic, programs whose functions are now mostly implemented in R).

Data and command blocks for parsimony analysis:

For importing text files of morphological data sets: Ian Foley's Acropini data; general import.txt.
Workflow from GenBank downloads to model selection: Chondrichthyes COI mock data set.
PAUP parsimony analysis of DNA data: Chance Noffsinger (including MUSCLE alignment commands).
PAUP parsimony analysis of morphological data: Richard Carr (from Grogan & Lund 2008).
Data partitioning and testing for congruence: data set 1.
Invoking topological constraints and testing for congruence: data set 2.
Invoking character assumptions (e.g., Dollo, stepmatrices): data set 3.

Data and command blocks for basic likelihood analysis:

Heuristic searches: data set 4.
Basic manual selection of DNA models: non-coding and coding sequence data.
Manual selection of strict clock rate model using LRT: data set 5.
Manual selection of DNA substitution models using AIC: data set 6 and data set 7.
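
The arithmetic behind the manual LRT and AIC comparisons listed above can be sketched in a few lines. This is a minimal illustration in Python rather than the course's own PAUP or R command blocks. All log-likelihood values, the taxon count, and the function names are hypothetical and chosen only to show the calculation. It assumes the usual conventions: AIC = 2k - 2lnL, and a clock test with n - 2 degrees of freedom for n taxa. Branch lengths are left out of k because they are shared by every substitution model compared on the same tree, so AIC differences are unaffected.

```python
import math

def aic(ln_l, k):
    """Akaike information criterion from a log-likelihood and k free parameters."""
    return 2 * k - 2 * ln_l

def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the incomplete gamma."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return 1.0 - lower

def clock_lrt(ln_l_clock, ln_l_free, n_taxa):
    """Likelihood ratio test of a strict clock against unconstrained branch lengths."""
    stat = 2.0 * (ln_l_free - ln_l_clock)
    df = n_taxa - 2  # assumed: the clock tree has n - 2 fewer free branch parameters
    return stat, df, chi2_sf(stat, df)

# Hypothetical scores: (log-likelihood, free substitution-model parameters).
models = {"JC69": (-2500.0, 0), "HKY85": (-2450.0, 4), "GTR": (-2448.0, 8)}
scores = {name: aic(ln_l, k) for name, (ln_l, k) in models.items()}
best = min(scores, key=scores.get)  # lowest AIC is preferred

# Hypothetical clock test for a 12-taxon data set.
stat, df, p = clock_lrt(ln_l_clock=-2460.0, ln_l_free=-2450.0, n_taxa=12)

if __name__ == "__main__":
    for name in sorted(scores, key=scores.get):
        print(f"{name}: AIC = {scores[name]:.1f}")
    print(f"Preferred model: {best}")
    print(f"Clock LRT: statistic = {stat:.2f}, df = {df}, p = {p:.4f}")
```

A small p-value rejects the strict clock in favor of unconstrained rates. With real data, the log-likelihoods would come from the likelihood scores reported by the analysis program for each model on the same tree.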

Data and command blocks and miscellany for Bayesian analysis:

A basic MrBayes command block: data set 8.
MrBayes analysis of morphological data: Jacob Gardner; of combined data: Matt Lavin.
BEAST2 protocol.
Analysis of *.log and *.trees files output from BEAST2 analyses using R: Luetzelburgia, Robinieae examples.
RevBayes (MSU workshop) and the fossilized birth-death time (tree) model (Tracy Heath).
Phyloseminar.org YouTube presentations including those by Paul O. Lewis (76, 77, 78, 79).

Comparative phylogenetic methods:

Joe Felsenstein's presentation addressing the same topics presented by Jacob Gardner.
BayesTraits presentation by Jacob Gardner.