# Homework 7 Solution

For this assignment you will turn in:

In class (10 pts):

1. A statement of the problem (typed)

2. An explanation of your solution (typed)

3. A flowchart (hand-drawn or computer generated)

4. Pseudocode (typed)

Via BlackBoard (40 pts):

1. C program named <username>_gc.c

Assignment:

Follow the algorithm development steps that we have outlined in class to write a program that reads DNA sequences from a file and determines the A, T, C, and G content of each sequence. Specifically, I am interested in the GC content (the percentage of the sequence that is G or C). The first line of the file will be an integer that tells you how many sequences are in the file. Each following line will contain a single sequence. You will need to store the percentages of A, T, C, and G in a 2D array, because you need the average GC content of the genome to determine whether a bacterial gene is pathogenic. If a bacterial gene has a higher GC content than the genome as a whole, then that gene is likely pathogenic.
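The per-sequence percentages described above could be computed along the following lines. This is only a minimal sketch: `baseContent` is a hypothetical helper (not one of the required functions), and both the column order A, T, C, G and the choice to skip characters other than A, T, C, and G (such as a trailing newline) are assumptions.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: fills out[0..3] with the percentages of
 * A, T, C, and G (in that assumed order) for one sequence. */
void baseContent(const char *seq, float out[4])
{
    size_t counts[4] = {0, 0, 0, 0};
    size_t total = 0;

    for (size_t i = 0; seq[i] != '\0'; i++) {
        switch (toupper((unsigned char)seq[i])) {
        case 'A': counts[0]++; total++; break;
        case 'T': counts[1]++; total++; break;
        case 'C': counts[2]++; total++; break;
        case 'G': counts[3]++; total++; break;
        default: break; /* assumption: skip newlines and unexpected characters */
        }
    }

    /* An empty sequence yields all zeros rather than dividing by zero. */
    for (int b = 0; b < 4; b++)
        out[b] = (total == 0) ? 0.0f : 100.0f * (float)counts[b] / (float)total;
}
```

Calling this once per line of the input file, with `out` pointing at one row of the 2D array, would fill the array one sequence at a time.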

The Wikipedia page on GC content gives additional explanation: https://en.wikipedia.org/wiki/GC-content

Specifications:

Inputs:

- File called sequences.txt (contains a plasmid of Yersinia pestis)

Outputs:

- File called content.txt containing the A, T, C, G, and GC content of each sequence along with a pathogenicity prediction:

EX:

| %A | %T | %C | %G | %GC | pathogenic? |
|----|----|----|----|-----|-------------|
| 10 | 20 | 40 | 30 | 70  | Y           |
| 20 | 50 | 10 | 20 | 30  | N           |
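One way a single row of the example above might be written is sketched here. The assignment does not fix a delimiter or number of decimal places, so the tab-separated layout and two decimals are assumptions, and `writeRow` is a hypothetical helper rather than a required function.

```c
#include <stdio.h>

/* Hypothetical helper: writes one line of the output file.
 * row holds the percentages in the assumed order A, T, C, G;
 * flag is 'Y' or 'N'. Returns 0 on success, -1 on a write error. */
int writeRow(FILE *out, const float row[4], char flag)
{
    float gc = row[2] + row[3]; /* %GC is the sum of %C and %G */
    int n = fprintf(out, "%.2f\t%.2f\t%.2f\t%.2f\t%.2f\t%c\n",
                    row[0], row[1], row[2], row[3], gc, flag);
    return (n < 0) ? -1 : 0;
}
```

Checking the return value of each write is one way to meet the error-checking requirement for output.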

Functions:

1. void printToFile(int seq, float content[seq][4], float avgGC)

a. prints the results out to a file

b. You should open and close your file in this function

2. float averageGC(int seq, float content[seq][4])

a. calculates the average GC content for the whole genome

3. char isPathogenic(float avgGC, float seqGC)

a. returns 'Y' if pathogenic, 'N' if not

* These are the minimum functions that you must use. You may use others if you like.
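A minimal sketch of two of the functions listed above, using the given signatures. It assumes the columns of `content` are ordered A, T, C, G, and that the genome average is the plain mean of the per-sequence GC percentages (not weighted by sequence length); neither detail is stated in the assignment. The variable-length array parameter requires a C99 or later compiler.

```c
/* Mean GC percentage over all sequences.
 * Assumes columns 2 and 3 of content hold %C and %G. */
float averageGC(int seq, float content[seq][4])
{
    if (seq <= 0)
        return 0.0f; /* no sequences: avoid dividing by zero */

    float sum = 0.0f;
    for (int i = 0; i < seq; i++)
        sum += content[i][2] + content[i][3];

    return sum / (float)seq;
}

/* 'Y' when the sequence's GC content is strictly higher than
 * the genome average, 'N' otherwise. */
char isPathogenic(float avgGC, float seqGC)
{
    return (seqGC > avgGC) ? 'Y' : 'N';
}
```

With the two example rows from the output specification (GC contents of 70 and 30), the average is 50, so the first sequence is flagged Y and the second N.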

Other:

1. This is individual work. You may NOT work in groups.

2. Please staple all work together.

3. You are expected to error check.

4. For code: No compile = No points, no exceptions!

5. You must use the functions exactly as described.

6. You must have your input file called “sequences.txt” and output file called “content.txt”.