## Homework 03 - Identifying All Open Reading Frames

Need it completed on Google Colab.


### Due: 10/10/22 at 11:59pm EST

Make sure your code is well documented with comments. Clearly mark different sections of your code corresponding to the questions. Follow the instructions carefully before answering the questions.

REMEMBER: You are not allowed to use any modules that have not been used in class! Before submission, double-check that your Python notebook, or the PDF of your notebook, clearly contains all the answers and that all of the code has been properly executed (the executed answers should be clearly present).

Imagine you have sequenced a new species. The first question you may ask is, “Where are all the protein-coding genes in the sequence?” To discover regions of the sequence that encode proteins, we will start by identifying ORFs (Open Reading Frames).

An Open Reading Frame is defined as a segment of DNA that potentially encodes a protein-coding region. It starts with the start codon and ends with one of the stop codons. Remember that the DNA sequence is often provided as one strand and may not be in the correct phase for coding. For this reason, we normally check all 6 phases of the sequence. The first three phases are on the strand that is provided. In the first phase, counting of codons starts at position [0]. The second phase starts at position [1], and the third phase starts at position [2]. The last three phases are on the reverse complement of the DNA sequence provided. The fourth phase starts on the reverse complement at position [0], and so on.

Images taken from: https://www.khanacademy.org/science/high-school-biology/hs-molecular-genetics/hs-rna-and-protein-synthesis/a/the-genetic-code

NOTE: You are allowed to use functions from previous homeworks and lectures to answer these questions.

Part 1: Create your own module seq_tools.py to identify open reading frames.
Write a function reverse_complement() which will take a DNA sequence and provide the opposite strand in 5' to 3' order. For example, the reverse complement of ACGTGCT is AGCACGT. I had to first reverse it (TCGTGCA), then write its complement (AGCACGT).

Write a function get_orf() that returns all ORFs as protein sequences. The input to the function is one sequence and the output is a list of all open reading frames as amino acids. Use the genetic_code dictionary from the previous homework to answer this question.

a. We will assume that it is a DNA molecule, so make sure to check all 6 frames. Use the reverse_complement() function from the previous question to get the three phases on the other strand of the DNA.
b. An open reading frame does not have to start with ATG, but it ends with a stop codon in frame.
c. There can be more than one open reading frame in one frame. Once a reading frame ends with a stop codon, a new reading frame starts right after it.

I should be able to test your solution like this:

    import seq_tools
    sequence = 'ATGTCGTA'
    allorfs = seq_tools.get_orf(sequence)
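For orientation, here is a minimal sketch of what seq_tools.py might look like under rules a to c above. It is not the posted solution. Several things are assumptions: the genetic_code dictionary from the previous homework is not shown, so a standard codon table is rebuilt here as a stand-in; the input is assumed to contain only A, C, G and T; the stop codon is left out of each returned protein; empty ORFs (two stop codons in a row) are skipped; and a trailing stretch with no in-frame stop codon is not reported. The names BASES, AMINO_ACIDS and COMPLEMENT are illustrative only.

```python
# Stand-in for the course's genetic_code dictionary (an assumption):
# the standard genetic code, with '*' marking the three stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
genetic_code = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna):
    """Return the opposite strand of dna, read 5' to 3'."""
    # Walk the strand backwards and complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def get_orf(sequence):
    """Return every ORF in all 6 phases as a list of protein strings."""
    orfs = []
    # Phases 1-3 use the given strand, phases 4-6 its reverse complement.
    for strand in (sequence.upper(), reverse_complement(sequence)):
        for phase in range(3):
            protein = ""
            # Step through complete codons only; leftover bases are ignored.
            for i in range(phase, len(strand) - 2, 3):
                amino_acid = genetic_code[strand[i:i + 3]]
                if amino_acid == "*":
                    # A stop codon closes the current ORF (rule b) and a
                    # new reading frame starts right after it (rule c).
                    if protein:
                        orfs.append(protein)
                    protein = ""
                else:
                    protein += amino_acid
            # Any residues left in `protein` never reached a stop codon,
            # so under the assumptions above they are not reported.
    return orfs


print(reverse_complement("ACGTGCT"))   # AGCACGT
print(get_orf("ATGAAATAGCCCTGA"))      # ['MK', 'P']
```

With these assumptions, the assignment's own test sequence 'ATGTCGTA' contains no in-frame stop codon in any of the 6 phases, so get_orf returns an empty list for it. If the grader expects unterminated stretches to be reported as well, the trailing `protein` would need to be appended after the inner loop.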
Answered 1 day after Oct 10, 2022

Answer To: Homework 03 - Identifying All Open Reading Frames

Sathishkumar answered on Oct 11 2022
