Bioinformatics Why Can’t It Tell Us Everything?
Bioinformatics: What are our Data Sets? • Interested in information flow within cells • Currently, the key information is mostly a matter of biological macromolecules • Eventually, information of interest will also include the flow of nutrients and energy, and the impact of small molecules on macromolecular function
Bioinformatics: What are our Questions? • What is in there? • What does it do? • How similar is it to something else? • How does it fold? • Where does it go in a cell? • What does it interact with? • How is it regulated? • Level of confidence?
Bioinformatics: Logical Reasoning Behind Data Sets • Function of an organism is determined by the function of its cells • Function of cells is determined by the chemical reactions that take place within them • Chemical reactions occur or not according to the presence and activity of enzymes • Enzymes are proteins • Proteins are determined by genes • Therefore, genes determine organismal function
Genomics Proteomics
Central Dogma: DNA → RNA → Protein • Genes & proteins are different molecular languages, but they are colinear
Double helical DNA • Basic Unit (alphabet): Nucleotide (base) • Only 4: A, T, G, and C • Double-stranded: A<>T and G<>C
5’..AGCTGCATGCTAGCTGACGTCA….3’
3’..TCGACGTACGATCGACTGCAGT….5’
• “Words” (genes) to encode proteins, RNA
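The pairing rule on this slide (A<>T, G<>C) can be sketched in a few lines of Python. The names `PAIRS` and `complement` are illustrative choices, not something from the slides; the two strands are the ones shown above.

```python
# Watson-Crick pairing rule from the slide: A<>T and G<>C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the pairing partner of each base, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand)

top = "AGCTGCATGCTAGCTGACGTCA"     # strand written 5'->3' on the slide
bottom = "TCGACGTACGATCGACTGCAGT"  # strand written 3'->5' on the slide

# Each strand fully determines the other.
print(complement(top) == bottom)  # True
```

Because the lower strand is written 3’ to 5’, no reversal is needed here; to write the partner strand in the usual 5’ to 3’ direction, the result would also have to be reversed.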
DNA: Structure Connected to Information (image: DNA Tower in Perth, AUS)
DNA: Replication & Transcription as Algorithms • With rare exceptions, all DNA is replicated • The crucial tool is the ability to go from one strand to the other • Transcription uses the same base-pairing rules, with U instead of T, but occurs in packets (discrete stretches rather than the whole genome)
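Treating transcription as an algorithm, a minimal sketch looks like the following. It assumes, purely for illustration, that the 3’ to 5’ strand from the double-helix slide is the template; the function name `transcribe` is hypothetical.

```python
# Same pairing rules as DNA, except that RNA uses U where DNA would use T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5: str) -> str:
    """Build the RNA that pairs with a template strand written 3'->5'.

    The returned RNA reads 5'->3'.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3to5)

# Assumption: this strand (from the double-helix slide) is the template.
template = "TCGACGTACGATCGACTGCAGT"
print(transcribe(template))  # AGCUGCAUGCUAGCUGACGUCA
```

The output matches the other DNA strand with every T swapped for U, which is why the non-template strand is often called the coding strand.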
Transcription = DNA to RNA • Where to Start is a Big Question
Protein Alphabet: amino acids • There are 20 amino acids • Met Cys Ser Leu Ala Ala Val
Proteins: Number of Possible 100-mer Peptides? • 20 possible residues at each position • For 2-mers, 20 possible at position 1 and 20 possible at position 2, so $20 \times 20 = 20^{2} = 400$ • Same logic for 100-mers: $20^{100} = 2^{100} \times 10^{100} = (2^{10})^{10} \times 10^{100} \approx (10^{3})^{10} \times 10^{100} = 10^{130}$
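The estimate can be checked directly, since Python integers have arbitrary precision. This is only a quick sanity check; the names `ALPHABET_SIZE` and `count_peptides` are illustrative.

```python
import math

ALPHABET_SIZE = 20  # number of amino acids

def count_peptides(length: int) -> int:
    """Number of distinct peptides of the given length: 20 choices per position."""
    return ALPHABET_SIZE ** length

print(count_peptides(2))                 # 400
exact = count_peptides(100)
print(len(str(exact)))                   # 131 (digits in the exact value)
print(round(100 * math.log10(20), 1))    # 130.1, so 20^100 is roughly 10^130
```

The exact exponent is slightly above 130 because the slide rounds $2^{10} = 1024$ down to $10^{3}$.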
Protein Alphabet: amino acids • There are 20 amino acids • Encoded by codons (triplets of nucleotides) • ATG TGC AGC CTA GCT GCC GTC • Met Cys Ser Leu Ala Ala Val
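The codon-to-amino-acid lookup on this slide can be sketched as a table lookup. The table below is the standard genetic code written compactly (first, second, and third base each cycling through T, C, A, G); the names `CODON_TABLE` and `translate` are illustrative, and the output uses one-letter codes (M = Met, C = Cys, S = Ser, L = Leu, A = Ala, V = Val).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon, starting at position 0."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# The seven codons from the slide.
print(translate("ATGTGCAGCCTAGCTGCCGTC"))  # MCSLAAV
```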
Genetic Code Found on Earth: How Does It Work? 5’-UCGACCAUGGUUGACCAUUGAUUACCACG-3’
Genetic Code • Triplet • Nonoverlapping • Comma-less • Redundant
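The four properties can be illustrated on the mRNA from the previous slide. This is a simplified sketch that assumes translation begins at the first AUG and ends at the first in-frame stop codon; real start-site selection is more involved. The names `CODE` and `translate_from_start` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_from_start(mrna: str) -> str:
    """Read triplets from the first AUG until a stop codon or the end of the RNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Step of 3 with no gaps: triplet, nonoverlapping, comma-less.
    for i in range(start, len(mrna) - 2, 3):
        aa = CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)

mrna = "UCGACCAUGGUUGACCAUUGAUUACCACG"
print(translate_from_start(mrna))  # MVDH (AUG GUU GAC CAU, then stop at UGA)

# Redundant: 61 sense codons encode only 20 amino acids.
print(sum(aa != "*" for aa in CODE.values()))       # 61
print(sum(aa == "L" for aa in CODE.values()))       # 6 codons for leucine
```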
Bioinformatics: Mining a Mountain of Data • Where are the putative genes?