
ACE & RACE: annotation of complex/combinatorial expressions

Self-introduction: Andrey Zinovyev. M.Sc. in theoretical physics (1997). Programming, industrial information systems (C++, Delphi). Ph.D. in computer science (2001), "Method of elastic maps and applications in bioinformatics".


Presentation Transcript


  1. ACE & RACE: annotation of complex/combinatorial expressions

  2. Self-introduction: Andrey Zinovyev
     • M.Sc. in theoretical physics (1997)
     • Programming, industrial information systems (C++, Delphi)
     • Ph.D. in computer science (2001), "Method of elastic maps and applications in bioinformatics"
     • Web-services development (Java, JSP)
     • Senior postdoctoral fellow at IHES, France
     • http://www.ihes.fr/~zinovyev or type "zinovyev" in Google

  3. Plan of the talk
     • ACE framework introduction: what we have
     • What will be in RACE?
     • ACE software: C++ code, web-application
     • Plans for ACE and RACE
     • Computational environment

  4. Genome as database: everything is annotation
     • Genomes: human, chimp, mouse, rat (ATGCGTGCAAATGCTCTTTGTGTAACGTGTCGACGTACGTGTGTAACGTGCGACGTACGT)
     • Gene annotation
     • TF1, TF2 probability profiles (b.ace)
     • RNA structures (r.ace)
     • Microarrays (m.ace)
     • Common format for annotation files (binary p-files)

  5. Genome preprocessing: compile once, run everywhere
     • Input sequence: ATGCGTGCAAATGCTCTTTGTGTAACGTGTCGACGTACGTGTGTAACGTGCGACGTACGT
     • Preprocessing tools: ace.annotate, ace.RNAtools, arc, ace.map
     • r.ace: potential RNA structures, splicing sites
     • b.ace: potential TF binding sites
     • m.ace: gene expression data
     • c.ace: chromatin structure and dynamics
     • Tools working on the databases: ace.enhance, ace.cluster, ace.display, ace.dyCr, ace.stat

  6. Structure space: the truth is out there. Set of annotations. Multidimensional combinatorial space of all possible structures appearing in a scanning window.

  7. ace.enhance: be more abstract. Accessing and masking structure space.
     Method_01, Method_02, …, Method_11: ace.enhance expression (heuristic mask) => ace.enhance annotation
     • view in genome browser (ace.display)
     • compare with experiment (cross-annotation) (ace.dyCr)
     • construct more abstract space and apply ace.enhance further

  8. b.ace: Transfac release (TF1, TF2) and genome release, processed by ace.annotate into b.ace, ~1.2 Tbyte
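The annotation step above (probability profiles scanned against a genome release) can be illustrated with a minimal sketch. This is not the ace.annotate code: the profile `TOY_PROFILE`, the uniform background, the threshold, and the function names are all assumptions made for illustration, and only the forward strand is scanned. The sequence is the one shown on the slides.

```python
import math

# Hypothetical 4-column position probability matrix (NOT a Transfac entry).
# Each column gives P(base) at that motif position; consensus is ACGT.
TOY_PROFILE = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform base composition


def log_odds(site, profile=TOY_PROFILE):
    """Sum of log2(P(base | column) / background) over the site."""
    return sum(math.log2(col[base] / BACKGROUND)
               for col, base in zip(profile, site))


def annotate(sequence, profile=TOY_PROFILE, threshold=4.0):
    """Return (position, score) for every window scoring >= threshold."""
    width = len(profile)
    hits = []
    for pos in range(len(sequence) - width + 1):
        score = log_odds(sequence[pos:pos + width], profile)
        if score >= threshold:
            hits.append((pos, round(score, 3)))
    return hits


# Example sequence shown on the slides.
GENOME = "ATGCGTGCAAATGCTCTTTGTGTAACGTGTCGACGTACGTGTGTAACGTGCGACGTACGT"
hits = annotate(GENOME)
hit_positions = [pos for pos, _ in hits]
```

With this toy profile and threshold only exact ACGT sites pass; a lower threshold would admit sites with one mismatch, which is the kind of trade-off that produces the large hit counts quoted in the examples.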

  9. ace.enhance
     Enhance methods:
     • Fixed spacing of sites
     • Fixed order of sites
     • Fixed strand orientation of sites
     • Multiple copies of site
     • Minimal spacing of sites
     • Maximal spacing of sites
     • Variable, defined spacing between sites
     • Minimal p-value for weight matrix
     • Maximal p-value for weight matrix
     • Bias weight-matrix
     Example expression: M1&&M2||M3||M4||M5 …
     + ace.cluster: simplified version of enhance for detecting clusters of repetitions of one motif
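A few of the constraints listed above (order, strand orientation, minimal and maximal spacing) can be sketched as a filter over annotated hits. This is only an illustration under stated assumptions: the `Hit` record, the `pairs_within` function, and its parameters are hypothetical and do not reproduce the ace.enhance query language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    motif: str   # motif name, e.g. "M1" (hypothetical record layout)
    pos: int     # start coordinate on the chromosome
    strand: str  # "+" or "-"


def pairs_within(hits, first, second, min_gap=0, max_gap=50,
                 ordered=True, same_strand=False):
    """Yield (a, b) pairs of `first` and `second` hits whose start
    positions differ by min_gap..max_gap bases.

    ordered=True     -> `first` must start at or before `second`
    same_strand=True -> both hits must lie on the same strand
    """
    firsts = [h for h in hits if h.motif == first]
    seconds = [h for h in hits if h.motif == second]
    for a in firsts:
        for b in seconds:
            gap = b.pos - a.pos
            if not ordered:
                gap = abs(gap)
            if not (min_gap <= gap <= max_gap):
                continue
            if same_strand and a.strand != b.strand:
                continue
            yield a, b


hits = [Hit("M1", 100, "+"), Hit("M2", 130, "+"),
        Hit("M2", 90, "-"), Hit("M1", 400, "-")]
ordered_pairs = list(pairs_within(hits, "M1", "M2"))
any_order = list(pairs_within(hits, "M1", "M2", ordered=False))
same_strand_pairs = list(pairs_within(hits, "M1", "M2",
                                      ordered=False, same_strand=True))
```

The nested loop is quadratic in the number of hits; with hit counts in the hundreds of thousands per chromosome, a real tool would presumably sort hits and sweep a window instead.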

  10. Example 1: 4 transcription factors, chr14 of UCSC_HG15
      ace.annotate hits: rarHS 659,631; cMyb 1,647,505; CEBP 1,189,196; PU.1 472,383
      => ace.enhance expression, window 50bp: PU.1 && rarHS — rarHS || rarHS — rarHS && CEBP < cMyb (diagram labels: 8 **, 11 **)
      Result: 102 hits
      [Figure: labels 14.1, 14.2, 14.3 with 5'/3' strand markers]

  11. Example 2: clusters of motifs, chr14
      jfl_im = TAGAGA TAGAGT TAGGGA TAGGGT
      ace.annotate hits: 183,389
      => ace.enhance expression, window 300bp: jfl_im ≥ 10 copies
      Result: 51 hits in 5 groups
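The cluster query in this example (at least 10 copies of a motif within a 300 bp window) can be sketched as follows. The motif set is taken from the slide; the function names, the forward-strand-only scan, the merging rule for overlapping windows, and the toy sequence are assumptions, not the ace.cluster implementation.

```python
import re

JFL_IM = ("TAGAGA", "TAGAGT", "TAGGGA", "TAGGGT")  # motif set from the slide


def motif_positions(sequence, motifs=JFL_IM):
    """Start positions of every motif occurrence (forward strand only)."""
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + "|".join(motifs) + "))")
    return [m.start() for m in pattern.finditer(sequence)]


def clusters(positions, window=300, min_copies=10):
    """Merge overlapping dense runs into groups.

    A run is `min_copies` consecutive hits whose first and last start
    positions are less than `window` bases apart.
    """
    groups = []
    for i in range(len(positions) - min_copies + 1):
        run = positions[i:i + min_copies]
        if run[-1] - run[0] >= window:
            continue  # these copies do not fit in one window
        if groups and run[0] <= groups[-1][-1]:
            # Overlaps the previous dense run: extend that group.
            last = groups[-1][-1]
            groups[-1].extend(p for p in run if p > last)
        else:
            groups.append(list(run))
    return groups


# Toy sequence: 10 tightly spaced copies, a long spacer, then 3 copies.
toy = ("TAGAGA" + "CCCC") * 10 + "C" * 500 + ("TAGGGT" + "CC") * 3
toy_positions = motif_positions(toy)
found = clusters(toy_positions)
```

On the toy sequence the first ten copies form one group and the three trailing copies are ignored because they never reach ten copies within any 300 bp window.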

  12. ACE C++ tools
      • aceLib, wraps system-dependent code
      • generic programming for code reusability
      ace.annotate: probability-based annotations and motif search
      ace.enhance: accessing (masking) structure space; combinatorial query language
      ace.cluster: extracting clusters of repetitions; simplified version of enhance
      ace.dyCr: first step in structure space analysis; dynamic cross-annotation
      ace.stat: statistical significance analysis

  13. ACE web-application (JSP): ace.uit

  14. database layout: .ace

  15. modules layout: ace.rte/ace.annotate

  16. modules layout: ace.rte/ace.enhance

  17. data layout: my.ace

  18. documentation layout: ace.doc

  19. Plans with ACE: principal problem
      • false-positive rate
      • ace.stat: statistical model of random noise, maximum entropy principle, significance analysis
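To make the false-positive problem concrete, here is a minimal sketch of the simplest possible noise model: independent, uniformly distributed bases, with a Poisson approximation for the number of chance hits. This is an assumed stand-in for illustration only; the slide does not describe the actual ace.stat model, and the approximation ignores dependence between overlapping occurrences.

```python
import math


def expected_hits(seq_len, motif_len, n_variants):
    """Expected forward-strand matches of a motif set under an i.i.d.
    uniform-base null model (assumption: each base has probability 1/4)."""
    positions = seq_len - motif_len + 1
    return positions * n_variants / 4 ** motif_len


def poisson_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean): chance of k or more random hits."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i)
                for i in range(k))
    return 1.0 - below


# Four distinct hexamers (as in the jfl_im motif set) in a 300 bp window.
mean_in_window = expected_hits(300, 6, 4)
p_ten_or_more = poisson_tail(10, mean_in_window)
```

Under this crude model a single hexamer set is expected to match about once per kilobase by chance, while ten copies in one 300 bp window would be very unlikely; real genomes are far from uniform, which is why a more careful noise model is listed as the principal problem.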

  20. Plans with ACE: visualizing structure space
      • creating 2D maps of structure space
      • data visualization, dimension reduction

  21. Plans with ACE: integrating m.ace (ace.map, m.ace, ace.eva, ace.net)

  22. Plans with ACE: model of chromatin structure and dynamics
      • silencing structures in space
      • chromatin state profiles
      • arc, c.ace
      • immunoprecipitation experiments

  23. Plans with ACE: comparative genomics (genome1, genome2)

  24. Installation of b.ace in Lille: http://ace.ibl.fr
      • 1.2 Tbyte PowerVault storage
      • PowerEdge Dell server

  25. Installation of RACE in Sherbrooke (golf)
      [Diagram labels: LISA DB; UCSC local ace; UCSC browser; r.ace DB; G browser]

  26. Distributed environment: database synchronization protocol
      • new genome release: where?
      • b.ace: Lille, France
      • LISA: Sherbrooke, Canada
      • r.ace: Sherbrooke, Canada
      • m.ace: INSERM, Paris
      • c.ace: IHES, Paris
      • public dbs

  27. RACE: platform for integration
      • ace.annotate: find simple motifs (loops, hairpins)
      • ace.RNAtools: pluggable algorithms
      • p-files (r.ace database)
      • ace.enhance: pluggable methods
      • ace.display, ace.stat, ace.dyCr

  28. ACE team
      • ace team leader: Arndt Benecke, IHES
      • ace.uit, ace C++: Andrey Zinovyev, IHES
      • aceLib, ace C++: Thomas Bücher, Inst.Neur.
      • ace.map: Sebastian Noth, INSERM
      • ace.stat: Richard Madden, UdSh
      • arc: Graham Smith, IHES
