Software for estimating reproductive success from genetic data
Mykiss is a computer program for estimating the reproductive success of individuals based on genetic data (codominant genotypes).
The program has three noteworthy features.
- It estimates reproductive success for each possible pair of parents using the maximum-likelihood method of Roeder et al. (1989). This method estimates the proportion of offspring descended from each possible pair of parents, and should produce more accurate estimates than methods that attempt to identify the parents of each offspring one at a time.
- The program was specifically developed to accommodate unsampled parents. Mykiss will estimate the proportion of offspring that have unsampled mothers and fathers.
- The program uses the genotyping error model of Wang (2004). This model assumes there are two types of errors: allelic dropout and misscoring of alleles. The error rate is assumed to be the same at every locus. Users can either supply an error rate or estimate it from the data.
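The first feature can be illustrated with a small sketch. If the likelihood is treated as a mixture over candidate parent pairs, the proportion of offspring produced by each pair can be estimated with an EM iteration. This is only an illustration under that assumption, not the code used by Mykiss: the function name `estimate_pair_proportions`, the use of EM, and the toy probabilities are all hypothetical.

```python
def estimate_pair_proportions(trans_probs, n_iter=200):
    """Estimate mixture proportions for candidate parent pairs.

    trans_probs[i][k] is assumed to hold P(genotype of offspring i | parent pair k).
    Returns (proportions, posteriors from the final E-step).
    """
    n_off = len(trans_probs)
    n_pairs = len(trans_probs[0])
    props = [1.0 / n_pairs] * n_pairs  # start from equal proportions
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability of each pair for each offspring.
        post = []
        for row in trans_probs:
            weights = [p * t for p, t in zip(props, row)]
            total = sum(weights)
            if total == 0.0:
                raise ValueError("an offspring is incompatible with every pair")
            post.append([w / total for w in weights])
        # M-step: each proportion becomes the mean posterior across offspring.
        props = [sum(post[i][k] for i in range(n_off)) / n_off
                 for k in range(n_pairs)]
    return props, post


# Hypothetical probabilities for 4 offspring and 2 candidate pairs.
toy = [[1e-4, 1e-8], [2e-4, 1e-9], [1e-9, 3e-5], [5e-5, 1e-8]]
props, post = estimate_pair_proportions(toy)
print([round(p, 3) for p in props])
```

In this toy example three of the four offspring are far more likely under the first pair, so its estimated proportion comes out close to 0.75. An unsampled parent could be handled in such a scheme by adding a column for an "unknown" parent, although how Mykiss does this is not described here.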
Input file format: Data analysis
Mykiss reads genetic data stored in GENEPOP files having three POPs. The first POP is for potential mothers, the second is for potential fathers, and the third is for all of the offspring sampled.
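For readers unfamiliar with the layout, the following sketch reads a small 3-POP GENEPOP file in Python. It assumes the usual GENEPOP conventions (a title line, locus names, `POP` separators, and `ID , genotypes` lines with two- or three-digit allele codes). The function `read_genepop` and the sample data are hypothetical and are not part of Mykiss.

```python
def read_genepop(text):
    """Parse GENEPOP text into (title, locus names, list of populations)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0]
    loci, pops = [], []
    for ln in lines[1:]:
        if ln.upper() == "POP":
            pops.append([])  # start a new population
        elif not pops:
            # Before the first POP: locus names, one per line or comma-separated.
            loci.extend(name.strip() for name in ln.split(",") if name.strip())
        else:
            # Individual line: "ID , 0102 0303 ..."
            name, _, genos = ln.rpartition(",")
            alleles = []
            for code in genos.split():
                half = len(code) // 2  # 2 or 3 digits per allele
                alleles.append((int(code[:half]), int(code[half:])))
            pops[-1].append((name.strip(), alleles))
    return title, loci, pops


sample = """Example data (hypothetical)
Loc1, Loc2
POP
F1 , 0102 0303
F2 , 0202 0103
POP
M1 , 0101 0304
POP
1__F1xM1 , 0101 0303
"""

title, loci, pops = read_genepop(sample)
if len(pops) != 3:
    raise ValueError("expected three POPs: mothers, fathers, offspring")
mothers, fathers, offspring = pops
print(len(mothers), len(fathers), len(offspring))
```

The order of the three POPs carries the meaning, so the sketch unpacks them positionally as mothers, fathers, and offspring.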
Input file format: Simulation
Mykiss can simulate genotypes for parents and offspring. The 3-POP input file format described above can be used to specify the allele frequencies in a population for running simulations. If this format is used, the allele frequencies in the adults will be used to simulate new data (i.e., the genotypes of the offspring in the file will be ignored). Alternatively, a 1-POP file can be used, in which there is a single GENEPOP population.
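The general idea of simulating from allele frequencies can be sketched as follows. The sketch assumes that alleles are drawn independently (Hardy-Weinberg proportions, unlinked loci), that there are no missing genotypes, and that each parent passes on one randomly chosen allele per locus. The function names and genotypes are hypothetical, and the simulation options offered by Mykiss may differ.

```python
import random
from collections import Counter


def allele_frequencies(adults, n_loci):
    """Estimate per-locus allele frequencies by counting alleles in the adults."""
    freqs = []
    for locus in range(n_loci):
        counts = Counter(a for genotype in adults for a in genotype[locus])
        total = sum(counts.values())
        freqs.append({allele: n / total for allele, n in counts.items()})
    return freqs


def draw_parent(freqs, rng):
    """Draw a multilocus genotype: two independent alleles per locus."""
    parent = []
    for locus_freqs in freqs:
        alleles = list(locus_freqs)
        weights = [locus_freqs[a] for a in alleles]
        parent.append(tuple(rng.choices(alleles, weights=weights, k=2)))
    return parent


def make_offspring(mother, father, rng):
    """Each parent contributes one randomly chosen allele per locus."""
    return [(rng.choice(m), rng.choice(f)) for m, f in zip(mother, father)]


rng = random.Random(1)
# Hypothetical adult genotypes at two loci.
adults = [[(1, 2), (3, 3)], [(2, 2), (1, 3)], [(1, 1), (3, 4)]]
freqs = allele_frequencies(adults, n_loci=2)
mother, father = draw_parent(freqs, rng), draw_parent(freqs, rng)
child = make_offspring(mother, father, rng)
print(mother, father, child)
```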
How to test the program
The easiest way to learn how the program works is to simulate a data set and then analyze it. Follow the instructions below.
- Download the program and a sample GENEPOP file of genotypes (here).
- Click on the "Simulate genotypes" option in the [Simulation] menu.
- Open a GENEPOP file having allele frequencies to use in the simulation (e.g. the GENEPOP file above).
- Specify the parameters to use in the simulation (e.g., how many offspring to simulate).
- Click on "Estimate reproductive success using maximum likelihood" in the [Analysis] menu.
- Look at the output.
Sample output is shown below for 10 offspring simulated from the allele frequencies contained in the example data file.
OFFSPRING    MOTHER   PROB.    FATHER   PROB.    LOG10(T)
1__F4xM6     F4       0.9937   M6       1.0000   -3.95
2__F4xM6     F4       0.9974   M6       1.0000   -3.70
3__F3xUnk    F3       0.9988   Unk      0.9989   -5.22
4__F7xM5     F7       1.0000   M5       1.0000   -5.05
5__F1xM6     F1       0.9937   M6       1.0000   -5.04
6__F4xM6     F4       0.9848   M6       1.0000   -4.00
7__F10xM3    F10      1.0000   M3       1.0000   -4.15
8__F1xM6     F1       0.9854   M6       1.0000   -5.05
9__F2xM3     *F6      0.8402   M3       0.9763   -4.29
10__F4xM6    F4       0.9989   M6       1.0000   -3.72
The left-most column of this output contains IDs for the ten simulated offspring. Because
these are simulated data, the parents of each offspring are known, and they are recorded
in the ID of each offspring. For example, the mother of the first offspring is F4 and
the father is M6. The next two columns list the estimated mother for each offspring
and the posterior probability for that estimate. The next two columns report similar
results for fathers. The last column lists the base-10 logarithm of the estimated Mendelian
transition probability for the parent/offspring triplet. An example illustrates what
this number represents. For the triplet of offspring #1, F4, and M6, the log10 of the
probability that F4 and M6 produce an offspring with the genotype of offspring #1 is
-3.95. In the data shown, all of the log10 values fall in the same range, but
if there are anomalous genotypes among the offspring, their transition probabilities
may be very low, and this statistic can be useful for identifying them.
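To make the transition probability concrete, the sketch below computes it for a triplet under plain Mendelian inheritance with no genotyping error. This is a simplification: Mykiss uses a genotyping error model, so its values will differ from these, and an incompatible triplet would not necessarily get a probability of zero. The function names and genotypes are hypothetical.

```python
import math


def locus_transition_prob(offspring, mother, father):
    """P(offspring genotype | parents) at one locus, assuming no genotyping error."""
    target = tuple(sorted(offspring))
    # Count the 4 equally likely (maternal allele, paternal allele) combinations
    # that reproduce the offspring genotype.
    hits = sum(1 for m in mother for f in father
               if tuple(sorted((m, f))) == target)
    return hits / 4.0


def log10_transition_prob(offspring, mother, father):
    """Sum of per-locus log10 probabilities, assuming independent loci."""
    total = 0.0
    for o, m, f in zip(offspring, mother, father):
        p = locus_transition_prob(o, m, f)
        if p == 0.0:
            return float("-inf")  # Mendelian incompatibility under this model
        total += math.log10(p)
    return total


# Hypothetical three-locus triplet.
child = [(2, 2), (1, 4), (3, 4)]
mom = [(2, 2), (1, 3), (2, 3)]
dad = [(2, 2), (2, 4), (2, 4)]
print(round(log10_transition_prob(child, mom, dad), 2))  # 1 * 0.25 * 0.25 -> -1.2
```

Each additional locus can only lower the probability, which is why multilocus values such as those in the sample output are strongly negative.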
Note a few other characteristics of the output. Offspring #3 has an unsampled father, which is indicated by "Unk". Lastly, note the asterisk before the assigned mother of offspring #9. It indicates that the parentage assignment was incorrect (the true mother, F2, is recorded in the offspring ID). Such flags are possible only for simulated data, where the true parents are known.
In addition to the results shown above, Mykiss writes the multilocus genotypes of the offspring and their assigned parents to an output file. A selection from one of these files is shown below. It includes the same information described above: IDs for the offspring, mother, and father, the log10 transition probability, and the posterior probability of parentage for the mother and father. In addition, the multilocus genotypes of the individuals are shown. In the example below, genotypes for seven loci are shown; alleles are indicated with the numbers 1-7. A Mendelian incompatibility is indicated by brackets < >.
Offspring   15__F8xM3   -6.75    2,2   1,4   3,4   1,3   4,4   <3,4>   3,7
Mother      F8          0.9885   2,2   1,3   2,3   1,3   1,4   3,4     3,6
Father      M3          1.0000   2,2   2,4   2,4   1,1   1,4   2,3     3,6
Download and installation
Mykiss runs on Microsoft Windows with the .NET platform installed. If you have a new computer, .NET is probably already installed (see my Software page for additional information).
Click HERE to download a ZIP file containing Mykiss and a library of functions (kalinowski_library.dll) that the program needs.
To "install" Mykiss, place Mykiss.exe and kalinowski_library.dll in a folder. Click on Mykiss.exe to run. Delete both files to "uninstall."
I am working on a manuscript describing the statistical approach used by Mykiss, and hope to submit it soon. Please contact me if you would like a copy.
Funding for the development of this software was provided by the United States Fish and Wildlife Service Abernathy Fish Technology Center and the National Science Foundation.