Sequence Viewer for Students
Educational software for viewing and comparing DNA sequences
In order for students to understand how DNA stores information, how this information
is used, and how DNA sequences evolve, they need to study DNA sequences of actual
genes. Unfortunately, the DNA sequences of most genes are inconveniently long, and
this makes them difficult to view and compare. Specialized software is helpful, but
most programs for manipulating DNA sequences have been written for researchers, and
are difficult for students to use. Sequence Viewer for Students (Sequence Viewer)
is an easy-to-use computer program for viewing and comparing DNA sequences. The program
is intended for introductory biology students with no previous genetics training.
Sequence Viewer performs the following functions:
- Displays aligned DNA sequences
- Identifies and displays variable sites
- Translates DNA sequences into amino acid sequences
- Counts the number of nucleotide differences between DNA sequences
- Displays labels marking active sites, exons, or other regions of a gene selected by the user
Sequence Viewer does not search online databases, align sequences, or construct phylogenies. All of these functions will need to be performed by an instructor using other software.
Sequence Viewer runs on the Microsoft Windows operating system. It also requires the
Microsoft .NET Framework. The .NET Framework is a component of the Microsoft Windows
operating system used to build and run Windows-based applications. If your computer
has a recent version of Windows, it probably already has .NET installed. You can check
by clicking 'Start' on your Windows desktop, selecting 'Control Panel', and then double-clicking
the 'Add or Remove Programs' icon. When that window appears, scroll through the list
of applications. If you see Microsoft .NET Framework 2.0 listed (or a more recent
version), .NET is already installed and you do not need to install it again.
If .NET is not already installed on your computer, the easiest way to install it is to update your operating system. To do this, open Internet Explorer, go to the <Tools> menu, and select <Windows Update>. Then find Microsoft .NET Framework 2.0 (it will be listed under "Pick updates to install") and install it.
Sequence Viewer requires no formal installation. To “install” the program, download this file onto your computer. Once you have downloaded the file, unzip it, and click on SequeceViewer.exe to run. Delete the file to “uninstall” the program.
Viewing DNA sequences
Sequence Viewer is a tool for viewing and comparing DNA sequences. The sequences to display are stored in text files, which Sequence Viewer opens and reads. To open one of these text files, start Sequence Viewer, go to the <File> menu, and select <Open text file containing DNA sequences>. Figure 1 shows what the program should look like once the DNA sequence file is opened.
Once a DNA sequence file has been opened, the DNA sequences can be viewed in several ways. Perhaps the most useful function offered by the program is highlighting variable sites. Sequence Viewer does this by displaying the first sequence listed in the input file in full, and then displaying only the variable nucleotides for the remaining sequences. Nucleotides that are the same as in the first sequence are shown as a dot. Figure 2 shows DNA sequences for cytochrome c displayed in this manner.
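The dot display described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Sequence Viewer's own code. The function name `dot_display` is hypothetical, and the sketch assumes the sequences are already aligned and of equal length.

```python
def dot_display(sequences):
    """Return the first sequence in full and the others with '.' at matching sites."""
    reference = sequences[0]
    rows = [reference]
    for seq in sequences[1:]:
        # Keep a nucleotide only where it differs from the first sequence.
        rows.append("".join("." if base == ref else base
                            for base, ref in zip(seq, reference)))
    return rows


# First 20 nucleotides of the human and mouse cytochrome c sequences.
human = "ATGGGTGATGTTGAGAAAGG"
mouse = "ATGGGTGATGTTGAAAAAGG"
for row in dot_display([human, mouse]):
    print(row)
```

In this short example, the mouse row contains a single letter, at the one site where it differs from the human sequence.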
Formatting text files to store DNA sequences
The text files containing DNA sequences to view must be formatted in a special way. The following example shows DNA sequences for cytochrome c in humans, chimpanzees, and mice.
#Human
ATGGGTGATGTTGAGAAAGGCAAGAAGATTTTTATTATGAAGTGTTCCCAGTGCCACACC
GTTGAAAAGGGAGGCAAGCACAAGACTGGGCCAAATCTCCATGGTCTCTTTGGGCGGAAG
ACAGGTCAGGCCCCTGGATACTCTTACACAGCCGCCAATAAGAACAAAGGCATCATCTGG
GGAGAGGATACACTGATGGAGTATTTGGAGAATCCCAAGAAGTACATCCCTGGAACAAAA
ATGATCTTTGTCGGCATTAAGAAGAAGGAAGAAAGGGCAGACTTAATAGCTTATCTCAAA
AAAGCTACTAATGAGTAA
#Chimpanzee
ATGGGTGATGTTGAGAAAGGCAAGAAGATTTTTATTATGAAGTGTTCCCAGTGCCATACC
GTTGAAAAGGGAGGCAAGCACAAGACTGGGCCAAATCTCCATGGTCTCTTCGGGCGGAAG
ACAGGTCAGGCCCCTGGATATTCTTACACAGCCGCCAATAAGAACAAAGGCATCATCTGG
GGAGAGGATACACTGATGGAGTATTTGGAGAATCCCAAGAAGTACATCCCTGGAACAAAA
ATGATATTTGTCGGCATTAAGAAGAAGGAAGAAAGGGCAGACTTAATAGCTTATCTCAAA
AAAGCTACTAATGAGTAA
#Mouse
ATGGGTGATGTTGAAAAAGGCAAGAAGATTTTTGTTCAGAAGTGTGCCCAGTGCCACACT
GTGGAAAAGGGAGGCAAGCATAAGACTGGACCAAATCTCCACGGTCTGTTCGGGCGGAAG
ACAGGCCAGGCTGCTGGATTCTCTTACACAGATGCCAACAAGAACAAAGGCATCACCTGG
GGAGAGGATACCCTGATGGAGTATTTGGAGAATCCCAAAAAGTACATCCCTGGAACAAAA
ATGATCTTCGCTGGAATTAAGAAGAAGGGAGAAAGGGCAGACCTAATAGCTTATCTTAAA
AAGGCTACTAATGAGTAA
The input file must have the following characteristics. Most importantly, the DNA sequences must be aligned, and each sequence must be the same length. Second, each DNA sequence must be preceded by a line labeling the sequence, and this line must begin with a “#” symbol. The DNA sequences do not have to be divided into segments, and if they are divided into segments, the segments do not have to be the same length. Deletions in sequences must be indicated with a “-” symbol so that all of the sequences are aligned.
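Instructors preparing their own files may find it useful to check them before handing them to students. The following Python sketch reads text in the format described above and reports an error if the sequences differ in length. It is not part of Sequence Viewer. The function name `parse_sequences` is hypothetical, and the sketch assumes that segments may be separated by spaces or line breaks and that any lines before the first “#” line can be skipped.

```python
def parse_sequences(text):
    """Parse '#'-labelled sequences and return a list of (label, sequence) pairs."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A '#' line starts a new sequence; the rest of the line is its label.
            records.append([line[1:].strip(), ""])
        elif records:
            # Join segments, dropping any spaces between them (an assumption).
            records[-1][1] += "".join(line.split())
        # Lines before the first '#' line are ignored in this sketch.
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        raise ValueError("sequences are not all the same length")
    return [(label, seq) for label, seq in records]


example = """#Human
ATGGGT GATGTT
GAG
#Mouse
ATGGGTGATGTTGAA
"""
for label, seq in parse_sequences(example):
    print(label, len(seq), seq)
```

A gap character (“-”) is treated like any other character here, so it counts toward the sequence length.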
Notes and labels
Sequence Viewer can display notes to the user and can label sections of DNA sequences. A “note” in this context is text that is displayed to the user before the first DNA sequence (see Figure 3). A “label” is text that is shown above the DNA sequences at a particular location in the sequence. Text for notes and labels is placed at the top of the text file, before the first DNA sequence. Here is an example:
NOTE *** = Active site of protein
NOTE XXX = Warfarin binding site
LABEL 394 "*********"
LABEL 412 "XXXXXXXXX"
The number following “LABEL” is the number of the nucleotide to place the label over. Figure 3 shows how these notes and labels are displayed.
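To make the NOTE and LABEL syntax concrete, here is a rough Python sketch that reads such lines and builds a row of text with each label positioned above its nucleotide. It is an illustration only. The names `parse_header` and `label_row` are hypothetical, and the sketch assumes that nucleotides are numbered from 1 and that a label begins directly above the numbered nucleotide; Sequence Viewer itself may position labels differently.

```python
import re


def parse_header(lines):
    """Collect NOTE texts and (position, text) LABEL pairs from header lines."""
    notes, labels = [], []
    for line in lines:
        line = line.strip()
        if line.startswith("NOTE "):
            notes.append(line[len("NOTE "):])
            continue
        match = re.match(r'LABEL\s+(\d+)\s+"(.*)"$', line)
        if match:
            labels.append((int(match.group(1)), match.group(2)))
    return notes, labels


def label_row(labels, width):
    """Build a row of the given width with each label starting above its nucleotide."""
    row = [" "] * width
    for position, text in labels:
        for offset, char in enumerate(text):
            index = position - 1 + offset  # assumes 1-based nucleotide numbers
            if index < width:
                row[index] = char
    return "".join(row)


header = [
    "NOTE *** = Active site of protein",
    'LABEL 394 "*********"',
]
notes, labels = parse_header(header)
print(notes)
print(labels)
```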
Translating DNA sequences
Sequence Viewer will translate DNA sequences into amino acid sequences (if possible). Genetic codes vary among organisms, and Sequence Viewer will translate DNA sequences using one of three codes: standard, vertebrate mitochondrial, or invertebrate mitochondrial. If the user does not specify a code, the standard code is used. The genetic code to use is specified in the DNA sequence file using one of the following lines:
GENETIC_CODE = STANDARD
GENETIC_CODE = VERTEBRATE_MITOCHONDRIAL
GENETIC_CODE = INVERTEBRATE_MITOCHONDRIAL
GENETIC_CODE = NONE
The last option indicates that the DNA sequences should not be translated. One of these lines should be placed at the beginning of the file, before the first DNA sequence.
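For readers curious about how translation under different genetic codes works, the following Python sketch may help. It is not Sequence Viewer's implementation. The amino acid strings are the NCBI translation tables 1, 2, and 5, written in the usual TCAG codon order. The function names are hypothetical, and the choice to show "?" for any codon containing a gap or unknown character is an assumption of this sketch.

```python
BASES = "TCAG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]

# One amino acid letter per codon, in the same order as CODONS ('*' = stop).
AMINO_ACIDS = {
    "STANDARD":
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    "VERTEBRATE_MITOCHONDRIAL":
        "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG",
    "INVERTEBRATE_MITOCHONDRIAL":
        "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG",
}


def read_genetic_code(lines):
    """Return the code named on a GENETIC_CODE line, or STANDARD if there is none."""
    for line in lines:
        if line.strip().startswith("GENETIC_CODE"):
            return line.split("=", 1)[1].strip()
    return "STANDARD"


def translate(dna, code="STANDARD"):
    """Translate complete codons; return None when the code is NONE."""
    if code == "NONE":
        return None
    table = dict(zip(CODONS, AMINO_ACIDS[code]))
    # Codons with gaps or unknown letters become '?' (an assumption).
    return "".join(table.get(dna[i:i + 3], "?")
                   for i in range(0, len(dna) - 2, 3))


code = read_genetic_code(["GENETIC_CODE = STANDARD"])
print(translate("ATGGGTGATGTTGAGAAA", code))
```

The three codes differ at only a few codons. For example, TGA is a stop codon in the standard code but codes for tryptophan (W) in both mitochondrial codes.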
Counting the number of differences between sequences
Sequence Viewer will count the number of nucleotide differences between sequences and output the number of differences between all pairs of sequences as a matrix. This function is password protected (the password is “dna”) in order to reduce student tendencies to “click” first and think later.
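The pairwise count is simple enough to work through by hand for short sequences, and the following Python sketch shows the same calculation. It is an illustration rather than Sequence Viewer's own code. The function name `difference_matrix` is hypothetical, and the sketch assumes aligned sequences of equal length and counts a gap (“-”) opposite a nucleotide as a difference, which may not match how Sequence Viewer treats gaps.

```python
def difference_matrix(sequences):
    """Return a symmetric matrix of pairwise nucleotide difference counts."""
    n = len(sequences)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Count aligned sites where the two sequences differ.
            diff = sum(a != b for a, b in zip(sequences[i], sequences[j]))
            matrix[i][j] = matrix[j][i] = diff
    return matrix


# Three short made-up sequences, used only to show the shape of the output.
for row in difference_matrix(["ATGC", "ATGA", "TTGA"]):
    print(row)
```

Each row and column corresponds to one sequence, in the order the sequences were given, and the diagonal is always zero.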
Example data files
Four example data files are available here. These include vertebrate cytochrome c sequences, primate NADH1 sequences, human mitochondrial DNA sequences, and VKORC1 sequences in rats. The cytochrome c and NADH1 sequences are useful for showing evolutionary relationships among vertebrates and primates, respectively. The human mitochondrial sequences are useful for illustrating genetic diversity within species, and the VKORC1 sequences are useful for discussing natural selection.
Please use the following citation to cite this program:
- Kalinowski ST, MJ Leonard, TM Andrews (2010) Sequence Viewer for Students: An easy to use computer program for viewing and comparing DNA sequences. Genetics (In review).