Sequence Viewer for Students

Introduction

In order for students to understand how DNA stores information, how this information is used, and how DNA sequences evolve, they need to study DNA sequences of actual genes. Unfortunately, the DNA sequences of most genes are inconveniently long, which makes them difficult to view and compare. Specialized software is helpful, but most programs for manipulating DNA sequences have been written for researchers and are difficult for students to use. Sequence Viewer for Students (Sequence Viewer) is an easy-to-use computer program for viewing and comparing DNA sequences. The program is intended for introductory biology students with no previous genetics training.
Sequence Viewer performs the following functions:

Displays aligned DNA sequences
Identifies and displays variable sites
Translates DNA sequences into amino acid sequences
Counts the number of nucleotide differences between DNA sequences
Displays labels marking active sites, exons, or other regions of a gene selected by the user
Sequence Viewer does not search online databases, align sequences, or construct phylogenies. All of these functions will need to be performed by an instructor using other software.

System Requirements

Sequence Viewer runs on the Microsoft Windows operating system. It also requires the Microsoft .NET Framework, a component of Windows used to build and run Windows-based applications. If your computer has a recent version of Windows, it probably already has .NET installed. You can check by clicking 'Start' on your Windows desktop, selecting 'Control Panel', and then double-clicking the 'Add or Remove Programs' icon. When that window appears, scroll through the list of applications. If you see Microsoft .NET Framework 2.0 (or a more recent version) listed, .NET is already installed and you do not need to install it again.
If .NET is not already installed on your computer, the easiest way to install it is to update your operating system. To do this, open Internet Explorer, go to the <Tools> menu, and select <Windows Update>. Then find Microsoft .NET Framework 2.0 and install it. (It will be listed under "Pick updates to install.")

Installation

Sequence Viewer requires no formal installation. To “install” the program, download this file onto your computer. Once you have downloaded the file, unzip it and double-click SequenceViewer.exe to run the program. Delete the file to “uninstall” the program.

Viewing DNA sequences

Sequence Viewer is a tool for viewing and comparing DNA sequences. The sequences to display are stored in text files, which Sequence Viewer opens and reads. To open one of these text files, start Sequence Viewer, go to the <File> menu, and select <Open text file containing DNA sequences>. Figure 1 shows what the program should look like once the DNA sequence file is opened.

Once a DNA sequence file has been opened, the DNA sequence can be viewed in several ways. Perhaps the most useful function offered by the program is to highlight variable sites. Sequence Viewer does this by displaying the entire DNA sequence of the first sequence listed in the input file, and then displaying only variable nucleotides for the remaining sequences. Nucleotides that are the same as the first sequence are depicted with a dot. Figure 2 shows DNA sequences for cytochrome c displayed in this manner.
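The dot display described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Sequence Viewer's own code; the function name dot_display is hypothetical, and the two sequences are the first 20 nucleotides of the human and mouse cytochrome c sequences used elsewhere in this document.

```python
def dot_display(sequences):
    """Return display rows: the first sequence in full, and every other
    sequence with '.' wherever it matches the first sequence."""
    reference = sequences[0]
    rows = [reference]
    for seq in sequences[1:]:
        # Keep a nucleotide only where it differs from the reference.
        rows.append("".join("." if a == b else b for a, b in zip(reference, seq)))
    return rows


human = "ATGGGTGATGTTGAGAAAGG"
mouse = "ATGGGTGATGTTGAAAAAGG"

for row in dot_display([human, mouse]):
    print(row)
```

The second row printed contains a single letter at the one site where the mouse sequence differs from the human sequence, and dots everywhere else.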

Formatting text files to store DNA sequences

The text files containing DNA sequences to view must be formatted in a special way. The following example shows DNA sequences for cytochrome c in humans, chimpanzees, and mice.

      #Human
      ATGGGTGATGTTGAGAAAGGCAAGAAGATTTTTATTATGAAGTGTTCCCAGTGCCACACC
      GTTGAAAAGGGAGGCAAGCACAAGACTGGGCCAAATCTCCATGGTCTCTTTGGGCGGAAG
      ACAGGTCAGGCCCCTGGATACTCTTACACAGCCGCCAATAAGAACAAAGGCATCATCTGG
      GGAGAGGATACACTGATGGAGTATTTGGAGAATCCCAAGAAGTACATCCCTGGAACAAAA
      ATGATCTTTGTCGGCATTAAGAAGAAGGAAGAAAGGGCAGACTTAATAGCTTATCTCAAA
      AAAGCTACTAATGAGTAA

      #Chimpanzee
      ATGGGTGATGTTGAGAAAGGCAAGAAGATTTTTATTATGAAGTGTTCCCAGTGCCATACC
      GTTGAAAAGGGAGGCAAGCACAAGACTGGGCCAAATCTCCATGGTCTCTTCGGGCGGAAG
      ACAGGTCAGGCCCCTGGATATTCTTACACAGCCGCCAATAAGAACAAAGGCATCATCTGG
      GGAGAGGATACACTGATGGAGTATTTGGAGAATCCCAAGAAGTACATCCCTGGAACAAAA
      ATGATATTTGTCGGCATTAAGAAGAAGGAAGAAAGGGCAGACTTAATAGCTTATCTCAAA
      AAAGCTACTAATGAGTAA

      #Mouse
      ATGGGTGATGTTGAAAAAGGCAAGAAGATTTTTGTTCAGAAGTGTGCCCAGTGCCACACT
      GTGGAAAAGGGAGGCAAGCATAAGACTGGACCAAATCTCCACGGTCTGTTCGGGCGGAAG
      ACAGGCCAGGCTGCTGGATTCTCTTACACAGATGCCAACAAGAACAAAGGCATCACCTGG
      GGAGAGGATACCCTGATGGAGTATTTGGAGAATCCCAAAAAGTACATCCCTGGAACAAAA
      ATGATCTTCGCTGGAATTAAGAAGAAGGGAGAAAGGGCAGACCTAATAGCTTATCTTAAA
      AAGGCTACTAATGAGTAA

The input file must have the following characteristics. Most importantly, the DNA sequences must be aligned, and each sequence must be the same length. Second, each DNA sequence must be preceded by a line labeling the sequence, and this line must begin with a “#” symbol. The DNA sequences do not have to be divided into segments, and if they are divided into segments, the segments do not have to be the same length. Deletions in sequences must be indicated with a “-” symbol so that all of the sequences are aligned.
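For instructors who want to check a file before handing it to students, the rules above can be sketched as a short Python reader. This is a minimal sketch under stated assumptions, not the parser Sequence Viewer uses: the function name parse_sequences is hypothetical, lines that appear before the first “#” line are simply skipped, and the gap in the Mouse sequence below is made up to illustrate the “-” symbol.

```python
def parse_sequences(text):
    """Collect '#'-labelled sequences into a {label: sequence} dict,
    joining segments that are split across several lines."""
    sequences = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A label line starts a new sequence.
            current = line[1:].strip()
            sequences[current] = ""
        elif current is not None:
            # Segment lines are appended to the sequence being read.
            sequences[current] += line.upper()
    # Aligned sequences must all have the same length.
    if len({len(seq) for seq in sequences.values()}) > 1:
        raise ValueError("sequences differ in length; is the file aligned?")
    return sequences


example = """
#Human
ATGGGTGATG
TTGAG
#Mouse
ATGGGT-ATGTTGAA
"""

print(parse_sequences(example))
```

Segments of different lengths are accepted, as the format allows, but sequences whose total lengths differ raise an error.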

Notes and labels

Sequence Viewer can display notes to the user, and can label sections of DNA sequences. A “note” in this context is text that is displayed to the user before the first DNA sequence (see Figure 3). A “label” is text that is shown above the DNA sequences at a particular location in the sequence. Text for notes and labels is placed at the top of the text file, before the first DNA sequence. Here is an example:

      NOTE *** = Active site of protein 
      NOTE XXX = Warfarin binding site

      LABEL 394 "*********"
      LABEL 412 "XXXXXXXXX"

The number following “LABEL” is the number of the nucleotide to place the label over. Figure 3 shows how these notes and labels are displayed.
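As a rough illustration of how these header lines could be read, the Python sketch below collects notes and labels from the top of a file. The function name parse_notes_and_labels is hypothetical, and the sketch assumes that each label is written as LABEL, a number, and a double-quoted string on one line, as in the example above. It does not decide how the number is counted along the sequence; it only stores it.

```python
import re


def parse_notes_and_labels(text):
    """Return (notes, labels) found before the first '#' sequence line.
    Each label is a (nucleotide_number, label_text) tuple."""
    notes, labels = [], []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            break  # the header ends where the first sequence begins
        if line.startswith("NOTE "):
            notes.append(line[5:].strip())
            continue
        match = re.match(r'LABEL\s+(\d+)\s+"(.*)"$', line)
        if match:
            labels.append((int(match.group(1)), match.group(2)))
    return notes, labels


header = '''NOTE *** = Active site of protein
NOTE XXX = Warfarin binding site

LABEL 394 "*********"
LABEL 412 "XXXXXXXXX"
'''

notes, labels = parse_notes_and_labels(header)
print(notes)
print(labels)
```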

Translating DNA sequences

Sequence Viewer will translate DNA sequences into amino acid sequences (if possible). Genetic codes vary among organisms, and Sequence Viewer will translate DNA sequences using one of three codes: standard, vertebrate mitochondrial, or invertebrate mitochondrial. If the user does not specify a code, the standard code is used. The genetic code to use is specified in the DNA sequence file using one of the following lines:

      GENETIC_CODE = STANDARD
      GENETIC_CODE = VERTEBRATE_MITOCHONDRIAL
      GENETIC_CODE = INVERTEBRATE_MITOCHONDRIAL
      GENETIC_CODE = NONE

The last option indicates that the DNA sequences should not be translated. One of these lines should be placed at the beginning of the file, before the first DNA sequence.
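To show what translation with a selectable genetic code involves, here is a minimal Python sketch. It is not Sequence Viewer's implementation. It assumes the three codes match the commonly published standard, vertebrate mitochondrial, and invertebrate mitochondrial tables, and the names read_genetic_code and translate are hypothetical. Codons that cannot be translated (for example, codons containing “-”) are shown as “?”, which is also an assumption.

```python
from itertools import product

# Standard code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
STANDARD_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
)
STANDARD = dict(
    zip(("".join(c) for c in product(BASES, repeat=3)), STANDARD_AMINO_ACIDS)
)

# The mitochondrial codes differ from the standard code at a few codons.
CODES = {
    "STANDARD": STANDARD,
    "VERTEBRATE_MITOCHONDRIAL": {
        **STANDARD, "TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"},
    "INVERTEBRATE_MITOCHONDRIAL": {
        **STANDARD, "TGA": "W", "ATA": "M", "AGA": "S", "AGG": "S"},
}


def read_genetic_code(text):
    """Return the GENETIC_CODE setting found before the first sequence,
    or 'STANDARD' if the file does not specify one."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            break
        if line.startswith("GENETIC_CODE") and "=" in line:
            return line.split("=", 1)[1].strip()
    return "STANDARD"


def translate(dna, code_name="STANDARD"):
    """Translate complete codons; return None when the code is NONE."""
    if code_name == "NONE":
        return None
    table = CODES[code_name]
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(table.get(codon, "?") for codon in codons)


# First 24 nucleotides of the human cytochrome c sequence shown earlier.
print(translate("ATGGGTGATGTTGAGAAAGGCAAG"))
```

Stop codons appear as “*” in the output.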

Counting the number of differences between sequences

Sequence Viewer will count the number of nucleotide differences between sequences and output the number of differences between all pairs of sequences as a matrix. This menu is protected with the password “dna” in order to reduce students' tendency to “click” first and think later.
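The pairwise counts can be illustrated with a short Python sketch. It is only an approximation of what the program reports: the function names are hypothetical, and a gap symbol paired with a nucleotide is counted as a difference here, which may not match how Sequence Viewer treats gaps. The data are the first 60 nucleotides of the three cytochrome c sequences listed earlier.

```python
def count_differences(a, b):
    """Count the aligned sites at which two sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def difference_matrix(sequences):
    """Return a square matrix of pairwise difference counts, with rows
    and columns in the order the sequences were given."""
    labels = list(sequences)
    return [
        [count_differences(sequences[row], sequences[col]) for col in labels]
        for row in labels
    ]


sequences = {
    "Human": "ATGGGTGATGTTGAGAAAGGCAAGAAGATTTTTATTATGAAGTGTTCCCAGTGCCACACC",
    "Chimpanzee": "ATGGGTGATGTTGAGAAAGGCAAGAAGATTTTTATTATGAAGTGTTCCCAGTGCCATACC",
    "Mouse": "ATGGGTGATGTTGAAAAAGGCAAGAAGATTTTTGTTCAGAAGTGTGCCCAGTGCCACACT",
}

for label, row in zip(sequences, difference_matrix(sequences)):
    print(f"{label:<12}", *row)
```

The matrix is symmetric with zeros on the diagonal, so students only need to read the values on one side of the diagonal.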

Example data files

Four example data files are available here. These include: vertebrate cytochrome c sequences, primate NADH1 sequences, human mitochondrial DNA sequences, and VKORC1 sequences in rats. The cytochrome c and NADH1 sequences are useful for showing evolutionary relationships among vertebrates and primates, respectively. The human mitochondrial sequences are useful for illustrating genetic diversity within species, and the VKORC1 sequences are useful for discussing natural selection.

Citation

Please use the following citation to cite this program:

Kalinowski ST, MJ Leonard, TM Andrews (2010) Sequence Viewer for Students: An easy to use computer program for viewing and comparing DNA sequences. Genetics (In review).
