
Bioinformatics Principles

Bioinformatics Principles. 박종화 Jong Bhak 朴鍾和, TGI, UNIST, Ulsan, Korea. jongbhak@gmail.com. 20160509. Acknowledgements: researchers who are honest and passionate in doing science; people who support scientific research by paying tax.


Presentation Transcript


  1. Bioinformatics Principles 박종화 Jong Bhak 朴鍾和 TGI UNIST Ulsan Korea jongbhak@gmail.com 20160509

  2. Acknowledgements • Researchers who are honest and passionate in doing science • People who support scientific research by paying tax • MRC, Harvard, KAIST, KOBIC, TBI & Genome Research Foundation; Theragen (CEO 고진업) • NCC (Dr. 이연수, Dr. 이진수) • National Center for Standard Reference Data (Dr. 채균식, Dr. 김창근) • 해양연 (ocean research institute): Dr. 이정현 and colleagues • Hanyang University (Prof. 류성언, Prof. 김덕수, Prof. 고인송) • UNIST (President 조무제), BME professors, and local supporters • UNIST students

  3. World Map: Land Area

  4. What map? No. of PCs

  5. What map? Infant death rate per

  6. Bioinformatics is about mapping • X is IEP, Y is size • An old map is not accurate. However, it helps people to explore. • Data, Information, Knowledge • Gangnido (1402)

  7. Lesson • Depending on the parameters we use, the world and problems can be interpreted differently. • Bioinformatics is mapping the biological universe using certain bioinformatics parameters

  8. First principles first • Biological systems are within the universe • It is necessary to understand the universe to understand life

  9. The origin of the universe

  10. What is the essence of the universe? Information. jongbhak@genomics.org CopyLeft Under BioLicense http://biolicense.org

  11. Assumption/Hypothesis: • The essence of the universe is switching. Existences are simultaneous instances of that essence.

  12. Assumption/Hypothesis: • The fundamental elements of the universe are informational. • Physical objects are representations, reflections, or virtual forms of information. • Basic entities are informational: energy/matter, space, and time are derivatives of information.

  13. Assumption/Hypothesis: • Physical reality is the product of information. • The physical world is a numerical and mathematical representation of information. • Information precedes the three-dimensional physical world. • The universe is perfectly computable, as it is an instance of computing.

  14. So, what is the universe? • The universe is the largest set of switches

  15. “Para-Programmed” Meta-Programmed Universe?

  16. Physical universe is an instance of the informational universe

  17. Schematic representation of para-determined information universe

  18. What is Life? Life is a set of switches

  19. Switching State Change • Switching is the result of state change • The concept of time invented (Dynamics) jongbhak@genomics.org CopyLeft Under BioLicense

  20. IFE: Infinitely Fractal Encapsulation

  21. Information Hierarchy

  22. Reproducing chemicals → Human • We could all be computers • The Earth is a gigantic computer

  23. Fundamentals for bioinformatics • Required: Thoughts on Philosophy, Society, Science, and Biology in general.

  24. Life is Complex(?) • Complex Homo sapiens live in layers of complex systems: multi-cellular, multi-organismal, multi-societal and multi-cultural. • Do similar patterns occur again and again in different layers? • The main problem is to find the general patterns across the infinite number of biological layers. • Bioinformatics is the study of complexity

  25. The core of Biology • Biology is an information science over energy metabolism • Two important things: ENERGY and INFORMATION • Johann G. Mendel’s (1822-1884) genetic work on peas (Pisum) → bioinformatic analysis, modelling and prediction. • Perhaps he is the first well-known classical bioinformatician in history.

  26. Evolution • Charles Darwin’s (1809-1882) theory of evolution is one of the very few general principles biology has. The process of evolution is often applied to technology nowadays. Virtually every aspect of biological information processing is concerned with evolution. Evolutionary theories also provide the third element in biology: time. • Bioinformatics deals with evolution in biology all the time.

  27. Long Definition of Bioinformatics • Bioinformatics is a discipline of science that analyses, seeks to understand, and models life as an information processing phenomenon over energy, with methods derived from philosophy, mathematics and computer science, using biological experimental data. - Jong Bhak, 2000

  28. Short Definition of Bioinformatics • Bioinformatics is Biology and Biology is Bioinformatics - Jong Bhak, 2000

  29. Brief history of Biology • Darwin • Mendel • 3D Proteins • DNA model • Sequencing (Sanger) • Cloning, Recombination • Amplification Technologies • Human Reference Genome • Next Gen. Sequencing/Personal Genomics • Diagnostics using Genomic data • Synthetic Biology • Genome Engineering • Cancer Cure → 2022 • Aging Cure → 2042

  30. Darwin (evolution) • Mendel (genetic analysis) • DNA, codon, anticodon, peptide: concepts established • DNA modelling (Watson & Crick, 1953): molecular biology • Hemoglobin, myoglobin (Max Perutz, John Kendrew): structure and sequence relationship • Computational methodology (Chris Sander, Arthur Lesk, …) • Genome sequencing (F. Sanger); full genome sequencing • Dynamic programming; sequence comparison application modules developed (Needleman & Wunsch) • Database construction begins (GenBank, PIR, ...) • Southern blot; hybridization methodology; DNA chip & microarray technology • Computer; Internet • Bioinformatics: structural genomics, comparative genomics, sequencing, functional genomics, interactomics, SNP, SAP, proteomics (mass spec., protein chip), databases, computational analysis methodology • Functions
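The dynamic-programming sequence comparison attributed to Needleman and Wunsch on this slide can be illustrated with a small sketch. This is not taken from the presentation: the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration, and only the optimal global alignment score is computed, not the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    The scoring scheme (match/mismatch/gap) is an illustrative assumption.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diagonal, up, left)

    return score[-1][-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("ACGT", "AGT"))
```

The table is filled row by row, so each cell depends only on cells that are already computed; that reuse of sub-results is what makes the comparison a dynamic-programming method.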

  31. Bioinformatics in time • Last decades: heavily driven by structural studies, such as the protein folding problem and structural comparisons/classifications/molecular analyses. • A recent shift toward sequence, databases, software, computation, commercialization and functions of proteins (mid-1990s). • A leap of life: BioInternet (early 1990s). Life managed to connect humans as neurons did some time ago.

  32. Bioinformatics in time • The most important contemporary problem: explaining complex systems of biology functionally and evolutionarily. • Major fields of bioinformatics: see the next slide.

  33. Major Domains of Bioinformatics • Sequence • Structure • Expression • Interaction • Function

  34. Bioinformatics • Sequence: Genomics, Comparative Genomics • Structure: Structural Genomics, Structural Proteomics, Biophysics • Expression: Functional Genomics, Proteomics • Interaction: Proteomics, Interactomics • Function: Physiomics, Metabolomics

  35. Major Parts of Bioinformatics: • (1) Structural studies, (2) sequence analysis, (3) molecular interactions, (4) functional analysis of genes, proteins and their ligands (large-scale expression analysis: DNA chips, microarray) • (5) Algorithm development (mathematical and physical calculation programs; BioPerl, BioJava, BioXML, BioPython, BioCPP, CGI programming), network and middleware programs, BioInfrastructure • (6) Database construction (relational databases, object-oriented databases), medical informatics • (7) Large-scale data mining (artificial intelligence approach) • (8) Complex systems and network analysis • (9) Various prediction methods • (10) Visualization of large and complex data • (11) Large computer systems construction (hardware) and administration • (12) OS, compiler, microprocessor optimization for bioanalyses • (13) Socio-economic modelling of life • (14) Neuronal and psychological description of complex organisms • (15) Designing and engineering cells and organisms

  36. Applied Fields of Bioinformatics • Sequencing related: gene prediction, gene mapping, annotation, visualization • Genomics: Structural Genomics, Functional Genomics (proteomics, interactomics), Comparative Genomics • SNP (single nucleotide polymorphism), SAP (single amino…) • Proteomics: Mass spec, Protein Chip, Protein Interaction • Interactomics (Network Biology) • Complex systems (Network Biology) approach • Neuroinformatics (neurological informatics) • Medinformatics (medical informatics)

  37. Adding one more dimension? How to map/compute RNA expression in relation to bio-function? • 6 billion persons • 6 billion bases • 1,000,000 RNA expression

  38. Adding even more dimensions? How to map/compute the phenome? • 6 billion persons • 6 billion bases • 1,000,000 phenotypes • 1,000,000 RNA expression

  39. How to map/compute the epigenome? • 6 billion persons • 1,000,000 epigenetic variation • 6 billion bases • 1,000,000 phenotypes • 1,000,000 RNA expression

  40. How to map/compute the microbiome? • 6 billion persons • 100,000 microbes • 1,000,000 epigenetic variation • 6 billion bases • 1,000,000 phenotypes • 1,000,000 RNA expression

  41. How to map/compute the proteome? • 6 billion persons • 10,000,000 epigenetic variation • 1,000,000 microbes • 100,000 proteins • 6 billion bases • 1,000,000 phenotypes • 1,000,000 RNA expression
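To get a feel for the scale these slides hint at, the dimension counts can simply be multiplied out. The sketch below is an illustration only: it assumes (the slides do not say this) that each layer is stored as a dense persons × features matrix, and it uses the counts as listed on slide 41. The names `PERSONS`, `FEATURES`, `matrix_cells` and `total_cells` are hypothetical.

```python
# Counts copied from slide 41; the dense-matrix model is an assumption.
PERSONS = 6_000_000_000

FEATURES = {
    "genome bases": 6_000_000_000,
    "RNA expression": 1_000_000,
    "phenotypes": 1_000_000,
    "epigenetic variation": 10_000_000,
    "microbes": 1_000_000,
    "proteins": 100_000,
}


def matrix_cells(persons, n_features):
    """Number of cells in one dense persons x features matrix."""
    return persons * n_features


def total_cells(persons, feature_counts):
    """Total cells when every layer is stored as its own dense matrix."""
    return sum(matrix_cells(persons, n) for n in feature_counts.values())


if __name__ == "__main__":
    for name, n in FEATURES.items():
        print(f"{name:>22}: {matrix_cells(PERSONS, n):.3e} cells")
    print(f"{'all layers':>22}: {total_cells(PERSONS, FEATURES):.3e} cells")
```

Under these assumptions the genome layer alone is 6 billion × 6 billion = 3.6 × 10^19 cells, which dominates every other layer combined.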

  42. Bioinformatic problems boil down to: • Representation of data.

  43. Ways of representing BioEntities • Sequence • Structure • Expression levels • Pathways • Function • Networks

  44. Very Basic information for non-biologists. • Elementary biological information on proteins etc. • Only for non-biologists!

  45. Proteins • Proteins: the central processing molecules of life (15% of the mass of the average person). • Minimum of 20 different kinds of amino acids:
      Name           3-letter  1-letter  Formula
      Alanine        ala       a         CH3-CH(NH2)-COOH
      Arginine       arg       r         HN=C(NH2)-NH-(CH2)3-CH(NH2)-COOH
      Asparagine     asn       n         H2N-CO-CH2-CH(NH2)-COOH
      Aspartic acid  asp       d         HOOC-CH2-CH(NH2)-COOH
      Cysteine       cys       c         HS-CH2-CH(NH2)-COOH
      Glutamine      gln       q         H2N-CO-(CH2)2-CH(NH2)-COOH
      Glutamic acid  glu       e         HOOC-(CH2)2-CH(NH2)-COOH
      Glycine        gly       g         NH2-CH2-COOH
      Histidine      his       h         NH-CH=N-CH=C-CH2-CH(NH2)-COOH
      Isoleucine     ile       i         CH3-CH2-CH(CH3)-CH(NH2)-COOH
      Leucine        leu       l         (CH3)2-CH-CH2-CH(NH2)-COOH
      Lysine         lys       k         H2N-(CH2)4-CH(NH2)-COOH
      Methionine     met       m         CH3-S-(CH2)2-CH(NH2)-COOH
      Phenylalanine  phe       f         Ph-CH2-CH(NH2)-COOH
      Proline        pro       p         NH-(CH2)3-CH-COOH
      Serine         ser       s         HO-CH2-CH(NH2)-COOH
      Threonine      thr       t         CH3-CH(OH)-CH(NH2)-COOH
      Tryptophan     trp       w         Ph-NH-CH=C-CH2-CH(NH2)-COOH
      Tyrosine       tyr       y         HO-p-Ph-CH2-CH(NH2)-COOH
      Valine         val       v         (CH3)2-CH-CH(NH2)-COOH
      Source: http://www.nyu.edu/pages/mathmol/library/life/life1.html
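The three-letter and one-letter codes listed on this slide are the form in which protein sequences are usually handled in software. As a minimal sketch (the dictionary name `THREE_TO_ONE` and the helper `to_one_letter` are hypothetical, not from the presentation), the table can be held as a lookup and used to convert a residue list into a one-letter sequence string:

```python
# Three-letter to one-letter amino acid codes, as listed on the slide.
THREE_TO_ONE = {
    "ala": "a", "arg": "r", "asn": "n", "asp": "d", "cys": "c",
    "gln": "q", "glu": "e", "gly": "g", "his": "h", "ile": "i",
    "leu": "l", "lys": "k", "met": "m", "phe": "f", "pro": "p",
    "ser": "s", "thr": "t", "trp": "w", "tyr": "y", "val": "v",
}


def to_one_letter(residues):
    """Convert an iterable of three-letter codes to an upper-case one-letter string.

    Raises KeyError for a code that is not one of the 20 listed amino acids.
    """
    return "".join(THREE_TO_ONE[r.lower()] for r in residues).upper()


if __name__ == "__main__":
    print(to_one_letter(["Met", "Ala", "Gly", "Trp"]))
```

Lookups are case-insensitive on input, and the result is upper-cased because one-letter sequences are conventionally written in capitals.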

  46. Amino Acids (L-form)

  47. Types of Amino Acids • Amino acids can be grouped into 4-5 different groups for bioinformatic analysis. • Most important distinctions: hydrophobic vs. hydrophilic groups; big vs. small side chain groups. • Cysteine → disulphide bonding (well conserved) • Proline → structurally important • Histidine → important for switching • Aliphatic: alanine, glycine, isoleucine, leucine, proline, valine • Aromatic: phenylalanine, tryptophan, tyrosine • Acidic: aspartic acid, glutamic acid • Basic: arginine, histidine, lysine • Hydroxylic: serine, threonine • Amidic (containing amide group): asparagine, glutamine • http://chemistry.gsu.edu/glactone/PDB/Amino_Acids/aa.html
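The grouping on this slide can be written down as a small lookup, which is how such classes are typically used when reducing a sequence to a coarser alphabet. The sketch below is illustrative: the names `GROUPS` and `classify` are hypothetical, residues are given as one-letter codes, and because the slide's six groups do not list cysteine or methionine, those two are reported as "unclassified" rather than assigned a group the slide does not give.

```python
# Side-chain groups exactly as listed on the slide, in one-letter codes.
GROUPS = {
    "aliphatic": "AGILPV",   # alanine glycine isoleucine leucine proline valine
    "aromatic": "FWY",       # phenylalanine tryptophan tyrosine
    "acidic": "DE",          # aspartic acid, glutamic acid
    "basic": "RHK",          # arginine histidine lysine
    "hydroxylic": "ST",      # serine threonine
    "amidic": "NQ",          # asparagine glutamine
}

# Invert the table once so each residue maps directly to its group.
RESIDUE_TO_GROUP = {res: name for name, members in GROUPS.items() for res in members}


def classify(residue):
    """Return the slide's group name for a one-letter residue code.

    Residues the slide does not place in a group (C and M) and unknown
    symbols return "unclassified"; that fallback is an assumption.
    """
    return RESIDUE_TO_GROUP.get(residue.upper(), "unclassified")


if __name__ == "__main__":
    for res in "MKWDC":
        print(res, classify(res))
```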

  48. Amino Acids • CH – COO – R – NH3 (CORN law: clockwise) • Zwitterions remain when the α-amino acid is dissolved in water at pH 7. Addition of an acid, supplying more protons, produces ions with a surplus positive charge:

  49. Peptide Bond

  50. Planes of peptide bonds
