TreeFit
Software for estimating how well evolutionary trees fit genetic distance data.
Description
Evolutionary trees are frequently used to describe genetic relationships between populations. Hierarchical, bifurcating trees are a reasonable model for the evolution of DNA sequences and species, but may or may not be appropriate for describing patterns of genetic similarity and difference in populations connected by gene flow. For example, if populations are arranged in a stepping-stone pattern (either one- or two-dimensional), the genetic relationships between populations will not follow a hierarchical pattern, and traditional neighbor-joining or UPGMA trees may not be appropriate tools for describing the structure of such populations. The computer program TreeFit was written to analyze how well a tree fits the genetic data from which it was calculated. TreeFit creates neighbor-joining and UPGMA trees from a genetic distance matrix, and then compares the observed genetic distance between each pair of populations with the corresponding distance in the tree. The similarity between these distances is expressed as R-squared, the familiar statistic used to summarize the scatter of points around a least-squares regression line.
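The comparison described above can be illustrated with a short Python sketch. It assumes that R-squared takes the usual coefficient-of-determination form, one minus the ratio of the residual sum of squares to the total sum of squares, with the tree distances playing the role of fitted values. The exact formula TreeFit uses is given in the program's manual; the function name and the numbers below are made up for illustration.

```python
def r_squared(observed, fitted):
    """Return 1 - SS_residual / SS_total for paired distances.

    observed: pairwise genetic distances from the data
    fitted:   the corresponding pairwise distances measured along the tree
    (Assumed form of the statistic; not taken from the TreeFit source.)
    """
    if len(observed) != len(fitted) or len(observed) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((d - mean_obs) ** 2 for d in observed)
    ss_resid = sum((d - t) ** 2 for d, t in zip(observed, fitted))
    if ss_total == 0:
        raise ValueError("observed distances are all identical")
    return 1.0 - ss_resid / ss_total


# Hypothetical distances for three population pairs (not real data).
observed_distances = [0.10, 0.20, 0.15]
tree_distances = [0.11, 0.19, 0.15]
print(r_squared(observed_distances, tree_distances))
```

A value near 1 means the tree reproduces the observed distances closely. Lower values suggest that a bifurcating tree is a poor summary of the data.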
Input file format
TreeFit can read two file formats: a GENEPOP file of genotypes or a text file of genetic distances. If a GENEPOP file is used, TreeFit will calculate a matrix of pairwise Fst and use those genetic distances to construct evolutionary trees. The GENEPOP file format is described on the program's webpage.
If genetic distances are used as input, the distance file must have the following format:
1. The first line of the file contains the title of the data set.
2. Each subsequent line starts with the name of a population.
3. Population names cannot contain spaces.
4. The distance matrix must be in lower-left format (not upper-right).
5. Genetic distances are delimited by spaces or tabs.
6. The genetic distance from each population to itself is omitted.
7. There cannot be any extra text after the last line.
8. Note that the second line of the file has the name of the first population (e.g. "Cabin" below) but does not have a genetic distance after it.
Here is an example:
Pairwise FST for six populations of bighorn sheep
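The format rules above can be sketched as a small Python parser. This is an illustrative reader written from the listed rules, not TreeFit's own code. The function name is hypothetical, and the sample title, the population names other than "Cabin", and all distance values are made up.

```python
def parse_distance_file(text):
    """Parse a lower-left distance matrix in the format described above.

    Returns (title, names, matrix), where matrix is a full symmetric
    list of lists with 0.0 on the diagonal.
    """
    lines = text.strip().splitlines()
    if len(lines) < 2:
        raise ValueError("need a title line and at least one population")
    title = lines[0].strip()
    names, rows = [], []
    for i, line in enumerate(lines[1:]):
        fields = line.split()  # splits on spaces or tabs
        if not fields:
            raise ValueError("blank line inside the matrix")
        values = [float(x) for x in fields[1:]]
        # Row i of a lower-left matrix without a diagonal has i distances.
        if len(values) != i:
            raise ValueError(
                f"population {fields[0]!r}: expected {i} distances, "
                f"found {len(values)}"
            )
        names.append(fields[0])
        rows.append(values)
    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for i, values in enumerate(rows):
        for j, d in enumerate(values):
            matrix[i][j] = matrix[j][i] = d
    return title, names, matrix


# Made-up sample: three populations, values are not real data.
sample = """Hypothetical example data set
Cabin
PopB 0.10
PopC 0.20 0.15
"""
title, names, matrix = parse_distance_file(sample)
print(names)
print(matrix[2][0])
```

The per-row length check enforces the lower-left layout: the first population has no distances after its name, the second has one, and so on.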
Download
TreeFit runs on Microsoft Windows with the .NET platform installed. See my Software page for instructions on how to install .NET on your computer (it is probably already there). Click here to download a ZIP file containing TreeFit and a library of functions (kalinowski_library.dll) that the program needs.
A sample distance matrix file is available here. A GENEPOP file for the same data is available here.
A manual for the program is available in pdf format here.
Installation / Uninstallation
To "install," place TreeFit.exe and kalinowski_library.dll in the same folder. Click on TreeFit.exe to run. Delete both files to "uninstall."