Due date: Sep. 29, 2003
In this assignment you will use operations on arrays that are relevant to
work in computational biology.
As you no doubt know, genetic information is encoded in DNA using an
alphabet of four characters: A, C, G and T. A strand of DNA consists of
thousands or millions of these letters. Within this very long string,
certain combinations of letters have a special meaning (for example, some
act as punctuation marks that indicate where a gene begins and ends).
For this assignment, the input consists of two strings: one long string
that corresponds to a DNA fragment, and one short string that describes a
marker. You can assume that the length of the marker is less than 10.
The output of the program must include the following:
a) The percentages of A, C, G, and T in the original strand (i.e. count the
number of occurrences of each, divide by the length of the strand, and
multiply by 100).
b) The positions at which the marker occurs in the original strand.
c) Assuming that every piece that appears between two markers is a gene,
indicate the position of the longest gene in the strand, and output the
We will discuss in class how to read the input and go from strings to arrays.
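
As a starting point, the three outputs above can be sketched as follows. This is only one possible approach, written in Python for illustration; the assignment does not prescribe a language. The function names (base_percentages, marker_positions, longest_gene) and the sample strand are made up for this sketch. It assumes that positions are counted from 0, that overlapping marker occurrences are all reported, and that a gene is the piece strictly between the end of one marker and the start of the next. Reporting the longest gene as a (start, length) pair is also an assumption.

```python
def base_percentages(strand):
    """Return the percentage of each of A, C, G, T in strand."""
    bases = list(strand)  # go from a string to an array of single characters
    total = len(bases)
    if total == 0:
        return {b: 0.0 for b in "ACGT"}
    return {b: 100.0 * bases.count(b) / total for b in "ACGT"}


def marker_positions(strand, marker):
    """Return every 0-based index at which marker starts in strand."""
    m = len(marker)
    if m == 0:
        return []
    # Compare the marker against each window of the same length.
    return [i for i in range(len(strand) - m + 1) if strand[i:i + m] == marker]


def longest_gene(strand, marker):
    """Return (start, length) of the longest piece between two
    consecutive markers, or None if there is no such piece.
    Assumption: ties are resolved in favour of the earliest gene."""
    positions = marker_positions(strand, marker)
    m = len(marker)
    best = None
    for left, right in zip(positions, positions[1:]):
        start = left + m        # gene starts just after the left marker
        length = right - start  # and ends just before the right marker
        if length > 0 and (best is None or length > best[1]):
            best = (start, length)
    return best


if __name__ == "__main__":
    strand = "ACGTTAGACGTAACCGGTTACGT"  # made-up sample data
    marker = "ACGT"
    print(base_percentages(strand))
    print(marker_positions(strand, marker))  # [0, 7, 19]
    print(longest_gene(strand, marker))      # (11, 8)
```

In the sample run the marker occurs at positions 0, 7 and 19, so the two candidate genes are "TAG" (starting at position 4) and "AACCGGTT" (starting at position 11); the second is the longest.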