Comp. Sci. 4322 Algorithms in Bioinformatics (生物資訊演算法) Arthur W. Chou (周維中)
Please feel free to ask questions! • Help me know what people are not understanding. • Slow down the slides. (We do have a lot of material.)
Stationary Hand: You have a question or comment of any nature. Wiggling Hand: You have something relevant to say to what is being spoken now.
Explaining: We are going to test you on your ability to explain the material. Hence, the best way of studying is to explain the material over and over again out loud to yourself, to each other, and to your stuffed bear.
Day Dream: While going along with your day, daydream. Mathematics is not all linear thinking. Allow the essence of the material to seep into your subconscious. Pursue ideas that percolate up and flashes of inspiration that appear.
Be Creative: Ask questions. Why is it done this way and not that way?
Guesses and Counter Examples • Guess at potential algorithms for solving a problem. • Look for input instances for which your algorithm gives the wrong answer. • Treat it as a game between these two players.
Refinement: The best solution comes from a process of repeatedly refining and inventing alternative solutions.
What is Bioinformatics? • “the research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.” (http://www.bisti.nih.gov/)
There once lived a man who learned how to slay dragons and gave all he possessed to mastering the art. After three years he was fully prepared but, alas, he found no opportunity to practice his skills. - Zhuan Zhi
As a result he began to teach how to slay dragons. - Rene Thom
Sequence Analysis of gp120 on HIV-1 Virus Highly variable levels of N-linked glycosylation on the V1 loop of HIV-1 envelope glycoproteins and their relationship to the immunogenicity of HIV Env of primary viral isolates by Arthur W. Chou and Laboratory of Nucleic Acid Vaccines at UMASS Medical School ( submitted to Journal of Virology )
[Figure: HIV particle and genome. Labels: MHC, gp120, gp41, envelope lipid bilayer, matrix p17, core p24, RT, RNA]
HIV vaccine approaches • recombinant protein (gp120) • synthetic peptides (V3) • naked DNA • live-recombinant vectors (viral, bacterial) • whole-inactivated virus • live-attenuated virus
HIV env genetic subtypes Adapted from J. Mullins
HIV-1 Envelope Glycoprotein Domains HIV sequences
Observation • The length of the V1 loop varies with the number of N-glycosylation sites; i.e., the longer the V1 loop, the greater the number of sites.
V1 loop:
Length | No. of glycosylation sites
15 - 25 | 2 - 4
26 - 45 | 4 - 7
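The slides do not say how the glycosylation sites were identified. A minimal sketch, assuming the standard N-linked sequon N-X-S/T (asparagine, any residue except proline, then serine or threonine), could count sites like this. The function name and the toy sequence are hypothetical and are not taken from the paper.

```python
import re

# Assumption: a site is the standard N-linked sequon N-X-S/T with X != Pro.
# The lookahead lets overlapping sequons (e.g. "NNTS") be counted separately.
SEQUON = re.compile(r"N(?=[^P][ST])")

def glycosylation_sites(protein):
    """Return 0-based positions of the Asn in each N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(protein.upper())]

# Hypothetical toy sequence, not a real V1 loop.
toy_loop = "CTNVTNATNPSGNNTSC"
sites = glycosylation_sites(toy_loop)
print(len(toy_loop), len(sites), sites)
```

Comparing `len(sequence)` with `len(glycosylation_sites(sequence))` over a set of loops is one way to tabulate length against site count.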
Questions • Does this correlation between the length of the sequences and the number of glycosylation sites only occur in V1? • How are the glycosylation sites distributed throughout the sequences?
Results of Sequence Analysis • This high correlation rarely occurs at other places: the average number of glycosylation sites for any stretch of length 45 is ~2.5. • Analysis of the distribution of glycosylation sites shows that they are mostly concentrated on the V1 loop.
Results of the paper • Mutated Env antigen with selected removal of N-glycosylation sites at the tip of the V1 loop on one of the primary HIV-1 Env, 92US715, induced higher antibody responses against the autologous HIV virus. [ Contribution ] These data improve our understanding of the structure-function relationship of HIV-1 Env antigens, and facilitate better design of HIV vaccines with the aim of inducing broad neutralizing antibody responses.
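The "average number of sites for any stretch of length 45" can be read as a sliding-window average. The sketch below is one way to compute it, again assuming the N-X-S/T sequon (X not proline) as the site definition. The function names are hypothetical, and this is not necessarily the procedure used in the paper.

```python
import re

# Assumption: a site is the standard N-linked sequon N-X-S/T with X != Pro.
SEQUON = re.compile(r"N(?=[^P][ST])")

def window_site_counts(protein, window=45):
    """Count sequons whose Asn falls inside each length-`window` stretch."""
    starts = [m.start() for m in SEQUON.finditer(protein.upper())]
    n_windows = len(protein) - window + 1
    if n_windows < 1:
        return []
    return [sum(1 for s in starts if i <= s < i + window)
            for i in range(n_windows)]

def mean_sites_per_window(protein, window=45):
    """Average site count over all windows; 0.0 if the sequence is too short."""
    counts = window_site_counts(protein, window)
    return sum(counts) / len(counts) if counts else 0.0
```

Plotting `window_site_counts` along the sequence also shows where the sites are concentrated.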
Algorithm • Distance-Based method • Dynamic Programming Time: 16:00 Wednesday September 28 Place: Department of Mathematics ST 516
DNA Sequence (Gene) Protein Amino Acid Sequence Protein Function Protein 3D Structure Interactions
Outline: • 1. What Is Life Made Of? • 2. What Molecule Codes For Genes? • 3. What Is the Structure Of DNA? • 4. What Carries Information between DNA and Proteins? • 5. How are Proteins Made?
Outline For Section 1: • All living things are made of Cells • Prokaryote, Eukaryote • Cell Signaling • What is Inside the cell: From DNA, to RNA, to Proteins
Cells • Fundamental working units of every living system. • Every organism is composed of one of two radically different types of cells: prokaryotic cells or eukaryotic cells. • Prokaryotes and Eukaryotes are descended from the same primitive cell. • All extant prokaryotic and eukaryotic cells are the result of a total of 3.5 billion years of evolution.
Cells • Chemical composition by weight: • 70% water • 7% small molecules (salts, lipids, amino acids, nucleotides) • 23% macromolecules (proteins, polysaccharides, lipids) • Biochemical (metabolic) pathways • Translation of mRNA into proteins
Life begins with the Cell • A cell is the smallest structural unit of an organism that is capable of independent functioning. • All cells have some common features.
Prokaryotes and Eukaryotes • According to the most recent evidence, there are three main branches to the tree of life. • Prokaryotes include Archaea (“ancient ones”) and Bacteria. • Eukaryotes form the domain Eukarya, which includes plants, animals, fungi and certain algae.
Cells: Information and Machinery • Cells store all the information needed to replicate themselves • The human genome is around 3 billion base pairs long • Almost every cell in the human body contains the same set of genes • But not all genes are used or expressed by those cells • Machinery: • Collect and manufacture components • Carry out replication • Kick-start its new offspring (A cell is like a car factory)
Overview of organizations of life • Nucleus = library • Chromosomes = bookshelves • Genes = books • Almost every cell in an organism contains the same libraries and the same sets of books. • Books represent all the information (DNA) that every cell in the body needs so it can grow and carry out its various functions.
Some Terminology • Genome: an organism’s genetic material • Gene: a discrete unit of hereditary information located on the chromosomes and consisting of DNA • Genotype: the genetic makeup of an organism • Phenotype: the physically expressed traits of an organism • Nucleic acid: biological molecules (RNA and DNA) that allow organisms to reproduce
DNA: the Genetic Makeup • Genes are inherited and are expressed • genotype (genetic makeup) • phenotype (physical expression) • On the left are the eye phenotypes produced by the green-eye and black-eye genes.
More Terminology • The genome is an organism’s complete set of DNA. • A small bacterial genome contains about 600,000 DNA base pairs. • Human and mouse genomes have some 3 billion. • The human genome has 24 distinct chromosomes. • Each chromosome contains many genes. • Gene: the basic physical and functional unit of heredity; a specific sequence of DNA bases that encodes instructions on how to make proteins. • Proteins: large, complex molecules made up of smaller subunits called amino acids; they make up the cellular structure.
Genome Sizes • E. coli (bacteria) 4.6 x 10^6 bases • Yeast (simple fungi) 15 x 10^6 bases • Smallest human chromosome 50 x 10^6 bases • Entire human genome 3 x 10^9 bases
All Life depends on 3 critical molecules • DNAs: hold information on how the cell works • RNAs: act to transfer short pieces of information to different parts of the cell; provide templates for protein synthesis • Proteins: form enzymes that send signals to other cells and regulate gene activity; form the body’s major components (e.g. hair, skin, etc.)
DNA: The Code of Life • The structure and the four genomic letters code for all living organisms • Adenine, Guanine, Thymine, and Cytosine, which pair A-T and C-G on complementary strands.
DNA, continued • DNA has a double helix structure, which is composed of • a sugar molecule • a phosphate group • and a base (A, C, G, T) • DNA is always read from the 5’ end to the 3’ end for transcription and replication 5’ ATTTAGGCC 3’ 3’ TAAATCCGG 5’
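The strand pair on this slide can be reproduced from the A-T and C-G pairing rules. A minimal sketch follows; the function names are illustrative.

```python
# Watson-Crick pairing: A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(strand):
    """Base-by-base partner strand, in the same left-to-right order."""
    return strand.upper().translate(COMPLEMENT)

def reverse_complement(strand):
    """Partner strand rewritten in its own 5'->3' direction."""
    return complement(strand)[::-1]

top = "ATTTAGGCC"               # 5' ATTTAGGCC 3'
print(complement(top))          # TAAATCCGG, i.e. the slide's 3'->5' strand
print(reverse_complement(top))  # GGCCTAAAT, the same strand read 5'->3'
```

Because both strands are conventionally written 5' to 3', sequence tools usually work with the reverse complement rather than the plain complement.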
DNA, RNA, and the Flow of Information Replication Transcription Translation
Overview of DNA to RNA to Protein • A gene is expressed in two steps • Transcription: RNA synthesis • Translation: Protein synthesis
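As a rough illustration of the transcription step, treating sequences as plain strings, the mRNA has the same letters as the coding strand with U in place of T. The function names below are hypothetical, and the sketch ignores promoters, splicing, and other biological detail.

```python
def transcribe(coding_strand):
    """mRNA from the coding strand (5'->3'): same sequence with U for T."""
    return coding_strand.upper().replace("T", "U")

def transcribe_template(template_strand):
    """mRNA from the template strand written 5'->3':
    pair each base (A->U, C->G, G->C, T->A), then reverse."""
    pairs = str.maketrans("ACGT", "UGCA")
    return template_strand.upper().translate(pairs)[::-1]

print(transcribe("ATGCTG"))           # AUGCUG
print(transcribe_template("CAGCAT"))  # AUGCUG
```

Both calls give the same mRNA because `CAGCAT` is the reverse complement of `ATGCTG`.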
Cell Information: Instruction book of Life • DNA, RNA, and Proteins are examples of strings written in either the four-letter nucleotide alphabet of DNA and RNA (A, C, G, T/U) • or the twenty-letter amino acid alphabet of proteins. Each amino acid (Leu, Arg, Met, etc.) is coded by 3 nucleotides called a codon.
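The codon idea can be sketched as a lookup from 3-letter RNA strings to one-letter amino acid codes. The example below assumes the standard genetic code, reads from the first base, and stops at the first stop codon. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; "*" marks stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Translate mRNA codon by codon until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("AUGCUGCGUUAA"))  # MLR, i.e. Met-Leu-Arg, then stop
```

With 4 letters and codons of length 3 there are 4^3 = 64 codons for 20 amino acids, which is why several codons map to the same amino acid.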