Bioinformatics ApS

Presentation Transcript


  1. Bioinformatics ApS

  2. Bioinformatics ApS • Founded in February 2002 with investment from Biovision • Founders: Staff working at the Bioinformatics Research Center, Aarhus University • Employees: 2 software developers

  3. Founders • Leif Schauser, Ph.D. (CEO): Associate Professor, Molecular Biology, Bioinformatics Research Institute • Jotun Hein, Ph.D. (Member of BoD): Professor in Bioinformatics at Oxford University • Mikkel Schierup, Ph.D. (CSO): Associate Professor, Biology, Bioinformatics Research Institute • Christian Storm, Ph.D. (CTO): Associate Professor, Computer Science, Bioinformatics Research Institute

  4. Strategy • Develop leading bioinformatics tools designed for association studies • Branding the software • Collaborations with pharmacogenomics industry • Disease genes, adverse drug reactions • Extend product suite to include other bioinformatics solutions (databases, comparative genomics)

  5. Drug target hunting • Microarray analysis: 100s of targets, not necessarily relevant. Problems: tissues, controls, limited by array • Disease gene mapping: Mb of genomic sequence, power dependent, requires high resolution. Problems: sampling, modeling, penetrance

  6. GeneRecon • Association mapping using all markers at the same time and all other available information • Fully probabilistic approach • Input: SNPs or microsatellites; disease and control group • Output: Localisation of disease

  7. GeneRecon - implementation • C++ program (>10,000 lines of code) • Multiplatform (Unix, Windows, Mac OS X) • Amenable to parallelization • Bayesian MCMC approach

  8. GeneRecon - implementation • 10 million recalculations/hour • Different models of disease transmission • Diploid data with unknown phase • Thorough tests for calculations

  9. GeneRecon – output • Disease gene location (full distribution) • Disease-causing haplotypes • Estimation of phenocopies • Penetrance • Date the origin of disease

  10. GeneRecon - collaborations • Scandinavian medium sized biotechnological companies • Proprietary dataset under analysis • Danish University Hospital • Schizophrenia data are currently being collected

  11. Disease mapping • Pedigree analysis: pedigree known; few meioses (max 100s); resolution: cMorgans (Mbases) • Association mapping: pedigree sampled; many generations; many meioses (>10^4); resolution: 10^-4 Morgans (Kbases) [Figure: pedigree diagrams along a time axis, labels D, r, M]

  12. Linkage Disequilibrium (LD)
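
The slide gives only the title, so the following is a minimal sketch of how LD between two biallelic markers is commonly quantified, assuming the haplotype and allele frequencies are already known. The function name `ld_stats` and the example frequencies are hypothetical and do not come from the presentation. D, |D'| and r^2 are the standard textbook definitions, not necessarily the measures used in GeneRecon.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")

    # D: deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b

    # D is bounded by the allele frequencies; the bound depends on its sign.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical example: allele A always travels with allele B (complete LD).
d, d_prime, r2 = ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5)
```

With independent loci (for example `p_ab=0.25, p_a=0.5, p_b=0.5`) all three values are zero.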

  13. Haplotypes • SNPs: {A,T} {C,G} {A,C} • Haplotypes: 2^(m-1) [Figure: two haplotypes, letters A T C and A G C]

  14. The Human Genome (http://www.sanger.ac.uk/HGP/) • 3 billion base pairs per haploid genome • 30,000-40,000 genes [Figure: chromosomes 1-22, X and Y]

  15. SNP facts • For 2 complete haploid genomes, there are about 3 million SNP differences (>1 SNP / kb). • Currently about 3 million SNPs in the database (http://www.ncbi.nlm.nih.gov/SNP/): RefSNP 3,079,086; with frequency 196,054; with genotype 32,101

  16. Large scale survey of LD (Reich et al., 2001)

  17. Recent LD studies • LD extends over considerable distance in most populations • African populations show less LD than European populations • Small, isolated populations (e.g. Saami, Evenki) show increased LD • Founder populations (e.g. Finland, Sardinia) do not always show increased LD • Evidence for heterogeneity in LD along chromosomes • Haplotype blocks • Recombination hotspots

  18. Genetic Basis for Disease • Monogenic: Cystic Fibrosis, Huntington’s Disease, Sickle Cell Anemia • Polygenic: Alzheimer’s Disease, Schizophrenia, Hereditary Heart Disease, Asthma

  19. Cystic fibrosis: a case study • Traditional analysis • Bayesian MCMC sampling

  20. The market • All major pharmaceutical and many biotech companies conduct genetic studies • Disease association (drug target identification) • Adverse drug response (pharmacogenomics) • Tailored drug administration • Outsourcing of non-core activities

  21. Timeline for drug discovery (# targets) • Discovery (5 yrs): 5000 • Population study I • Pre-Clinical (1 yr): 50 • Clinical (6 yrs): 5 • Population study II • Review (2 yrs): 1 • Marketed

  22. Cambridge Healthtech Institute: SNP-research market could reach $1.2 billion by 2005 • Annual expenditures on SNP research: $158 million in 2001; $1.2 billion in 2005 (estimated): 7 fold growth • Increasing interest in pharmacogenomics (tailoring treatment to patients based on their genomic profiles) by pharmaceutical, biotechnology, and genomic tools companies.

  23. Factors influencing SNP research

  24. Example • DeCode typed 10,000 markers in all Icelanders (250,000)

  25. Needs of the market • Detailed understanding of population biology • Extract signals from noisy data (power) • Efficient algorithms that provide quick and precise answers

  26. Comparative Genomics • Gene finding (Correct annotation is crucial) • Identifying important residues in drug targets (HIV, proteins etc.) • Identifying regulatory sequences, networks

  27. Future • Disease gene finding: GeneRecon • Database solutions • Comparative Genomics

  28. Haplotypes • SNPs: {A,T} {C,G} {A,C} • Haplotypes: 2^(m-1) [Figure: two haplotypes, letters A T C and A G C] • Experimental methods of determining haplotypes: egg and sperm sequencing; cell lines with lost chromosomes; sequencing clones spanning SNPs • These methods are very expensive, so computational reconstruction of haplotypes from SNPs is preferable.
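
A genotype that is heterozygous at m SNPs is consistent with 2^(m-1) distinct haplotype pairs, which is why phase has to be inferred. The sketch below simply enumerates those pairs for the three SNPs shown on the slide. The function name `phasings` is hypothetical, alleles are assumed to be single-character strings, and this is only an illustration of the combinatorics, not the reconstruction method used by GeneRecon.

```python
from itertools import product


def phasings(genotypes):
    """Enumerate the unordered haplotype pairs consistent with unphased genotypes.

    genotypes: list of (allele, allele) tuples, one per SNP,
               e.g. [("A", "T"), ("C", "G"), ("A", "C")].
    """
    het = [i for i, (a, b) in enumerate(genotypes) if a != b]

    # Keep the first heterozygous site fixed so each pair is listed once;
    # every other heterozygous site can be oriented in two ways.
    free = het[1:]

    pairs = []
    for flips in product((False, True), repeat=len(free)):
        flip_at = dict(zip(free, flips))
        h1, h2 = [], []
        for i, (a, b) in enumerate(genotypes):
            if flip_at.get(i, False):
                a, b = b, a
            h1.append(a)
            h2.append(b)
        pairs.append(("".join(h1), "".join(h2)))
    return pairs


# The three SNPs from the slide: m = 3 heterozygous sites, so 2^(3-1) = 4 pairs.
candidates = phasings([("A", "T"), ("C", "G"), ("A", "C")])
```

The number of candidates doubles with every additional heterozygous SNP, so exhaustive enumeration is only practical for short marker sets.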

  29. Parameters • Bayesian analysis, i.e. all parameters have assigned distributions. • Markov Chain Monte Carlo allows the calculation of posterior (post-data) distributions of parameters and quantities of interest.
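
To illustrate the idea of sampling a posterior with MCMC, here is a minimal Metropolis sketch on a toy problem: estimating a single proportion from k successes in n trials under a uniform prior. The model, the function names (`log_posterior`, `metropolis`) and the numbers are assumptions chosen for illustration. GeneRecon's actual model and sampler are not described in the slides and are certainly more elaborate.

```python
import math
import random


def log_posterior(theta, k, n):
    """Log posterior (up to a constant) of a binomial proportion, uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # outside the support of the prior
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)


def metropolis(k, n, steps=20000, step_size=0.1, seed=1):
    """Random-walk Metropolis sampler; returns the list of visited theta values."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, k, n)
    samples = []
    for _ in range(steps):
        # Symmetric proposal: a uniform step around the current value.
        proposal = theta + rng.uniform(-step_size, step_size)
        candidate = log_posterior(proposal, k, n)
        # Accept with probability min(1, posterior ratio).
        log_ratio = candidate - current
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


# Hypothetical data: 30 successes in 100 trials.
samples = metropolis(k=30, n=100)
kept = samples[2000:]  # discard an initial burn-in period
posterior_mean = sum(kept) / len(kept)
```

For this toy model the exact posterior is Beta(k+1, n-k+1) with mean (k+1)/(n+2), so the sampled mean can be checked against a known answer. The retained samples approximate the full posterior distribution, not just a point estimate.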

  30. The Shattered Coalescent (Morris, Whittaker & Balding, 2002) • Advantages: allows for multiple origins of the disease mutant, plus sporadic occurrences of the disease without the mutation (phenocopies)
