
Using Network Processors in Genomics

H. Bos – Leiden University, 13/02/2004. Herbert Bos*† and Kaiming Huang* ({herbertb,khuang}@liacs.nl). *Leiden Universiteit, Netherlands; †Vrije Universiteit, Netherlands. http://www.liacs.nl/~herbertb/projects/biocomp/


Presentation Transcript


  1. Using Network Processors in Genomics
     Herbert Bos*† and Kaiming Huang*, {herbertb,khuang}@liacs.nl
     *Leiden Universiteit, Netherlands; †Vrije Universiteit, Netherlands
     http://www.liacs.nl/~herbertb/projects/biocomp/

  2. Case study: BLAST
     • search a nucleotide/protein database for a query
     • BLAST discovers similarity rather than exact matches
     • two main phases:
         • scoring (registering where the query and the DNA database match)
         • alignment (dynamic programming)
     • only the first phase runs on NPUs

  3-6. Window matching
     [Figure: four animation frames illustrating window matching; images not recoverable]

  7. Window matching
     • naïve approach: roughly W*N*M comparisons
         • does not scale
     • string search algorithms: Aho-Corasick
         • all windows are matched at the same time
         • the genome is shifted one nucleotide at a time
         • the matching algorithm is transformed into a DFA
         • the DFA may be quite large
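To make the W*N*M estimate concrete, here is a minimal sketch of the naïve approach. The slides do not define the symbols; the sketch assumes W is the window size, N the genome length, and M the number of query windows. The function name `naive_match` and its return format are illustrative only, not taken from IXPBlast.

```python
def naive_match(genome, query, w):
    """Compare every query window against every genome position.

    Returns (hits, comparisons), where hits is a list of
    (genome_position, window_index) pairs. Hypothetical helper, not IXPBlast code.
    """
    wins = [query[i:i + w] for i in range(len(query) - w + 1)]  # M windows
    hits = []
    comparisons = 0
    for g in range(len(genome) - w + 1):        # about N genome positions
        for q, win in enumerate(wins):          # M windows per position
            k = 0
            while k < w:                        # up to W character comparisons
                comparisons += 1
                if genome[g + k] != win[k]:
                    break
                k += 1
            if k == w:
                hits.append((g, q))
    return hits, comparisons


if __name__ == "__main__":
    hits, comparisons = naive_match("tacgcga", "acgccga", 3)
    print(hits, comparisons)
```

The three nested loops are what make the cost grow with the product of the three quantities; Aho-Corasick removes the inner two by consuming each genome character exactly once.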

  8. Aho-Corasick
     • Alphabet: acgt
     • Window size: 3
     • Query: acgccga
     • Windows: {acg,cgc,gcc,ccg,cga}

  9-10. Aho-Corasick
     • Alphabet: acgt
     • Window size: 3
     • Query: acgccga
     • Windows: {acg,cgc,gcc,ccg,cga}
     [Figure: DFA with states 0-12. Edges recoverable from the labels:
      0 -a-> 1 -c-> 2 -g-> 3 (acg); 0 -c-> 4 -g-> 5 -c-> 6 (cgc); 5 -a-> 12 (cga);
      4 -c-> 10 -g-> 11 (ccg); 0 -g-> 7 -c-> 8 -c-> 9 (gcc); a "t" label appears near state 0]

  11. Aho-Corasick
     • Alphabet: acgt
     • Window size: 3
     • Query: acgccga
     • Windows: {acg,cgc,gcc,ccg,cga}
     [Figure: the DFA for these windows (states 0-12), scanning the input tacgcga]
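The example on these slides can be reproduced with a small textbook Aho-Corasick construction. This is a sketch under stated assumptions, not the IXPBlast implementation: the function names `windows`, `build_dfa`, and `scan` are hypothetical, and the DFA is stored as one Python dict per state rather than in whatever layout the NPU code uses. Inserting the windows in query order appears to reproduce the slide's state numbering (0-12).

```python
from collections import deque

ALPHABET = "acgt"


def windows(query, w):
    """Distinct length-w substrings of query, in order of first appearance."""
    seen = []
    for i in range(len(query) - w + 1):
        win = query[i:i + w]
        if win not in seen:
            seen.append(win)
    return seen


def build_dfa(patterns, alphabet=ALPHABET):
    """Build a complete Aho-Corasick DFA: goto[state][char] -> next state."""
    goto, out = [{}], [[]]
    # Phase 1: insert every pattern into a trie.
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Phase 2: breadth-first pass that turns failure links into DFA edges.
    fail = [0] * len(goto)
    queue = deque()
    for ch in alphabet:
        if ch in goto[0]:
            queue.append(goto[0][ch])
        else:
            goto[0][ch] = 0              # unmatched characters stay in state 0
    while queue:
        s = queue.popleft()
        out[s] = out[s] + out[fail[s]]   # inherit matches ending at the fallback
        for ch in alphabet:
            if ch in goto[s]:
                t = goto[s][ch]
                fail[t] = goto[fail[s]][ch]
                queue.append(t)
            else:
                goto[s][ch] = goto[fail[s]][ch]
    return goto, out


def scan(text, goto, out):
    """Feed text through the DFA one character at a time; return (position, window) hits."""
    state, hits = 0, []
    for i, ch in enumerate(text):
        state = goto[state][ch]
        for p in out[state]:
            hits.append((i - len(p) + 1, p))
    return hits


if __name__ == "__main__":
    goto, out = build_dfa(windows("acgccga", 3))
    print(len(goto), scan("tacgcga", goto, out))
```

On the slide's input tacgcga the scan reports acg at position 1, cgc at position 2, and cga at position 4, using one table lookup per nucleotide regardless of how many windows there are.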

  12-14. IXPBlast Architecture
     [Figure: block diagram, three animation frames. Labels: Gbps ports; NPU (IXP1200); microengines (6 × ME); StrongARM; control processor; Pentium; scratch; SRAM; DRAM; PCI bus]

  15-17. IXPBlast Architecture
     [Figure: the same block diagram, three animation frames, with the Aho-Corasick DFA (states 0-12) drawn at the SRAM label. Labels: Gbps ports; NPU (IXP1200); microengines (6 × ME); StrongARM; control processor; Pentium; scratch; SRAM; DRAM; PCI bus]

  18. IXPBlast: packet handling
     [Figure: slots numbered 0-31]
     • packets are read and processed in batches of 100,000
     • “spilling” must be taken into account
     • currently no feedback
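The slide does not define “spilling”. One plausible reading, assumed here and not confirmed by the source, is that a window can straddle the boundary between two consecutive packets of genome data. The sketch below handles that case by carrying the last W-1 nucleotides of each packet over to the next; `match_packets` is a hypothetical name, and a simple set lookup stands in for the DFA.

```python
def match_packets(packets, windows, w):
    """Find window hits in a genome delivered as a sequence of packets.

    Assumption: "spilling" means windows that cross packet boundaries.
    Returns (genome_position, window) pairs. Illustrative only.
    """
    wanted = set(windows)
    hits = []
    carry = ""      # tail of the data seen so far (at most w-1 nucleotides)
    offset = 0      # genome position of the first character of carry + packet
    for pkt in packets:
        buf = carry + pkt
        for i in range(len(buf) - w + 1):
            if buf[i:i + w] in wanted:
                hits.append((offset + i, buf[i:i + w]))
        keep = min(w - 1, len(buf))          # too short to contain a full window
        carry = buf[len(buf) - keep:]
        offset += len(buf) - keep
    return hits


if __name__ == "__main__":
    wins = ["acg", "cgc", "gcc", "ccg", "cga"]
    print(match_packets(["tac", "gcg", "a"], wins, 3))
```

Because the carried tail is shorter than a window, no hit is ever reported twice, and the result is the same however the genome is cut into packets. With a DFA-based matcher the equivalent trick would be to keep the current state across packet boundaries.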

  19. Results
     • 232 MHz IXP1200 ~ 1.8 GHz Pentium 4
     • 1611-nucleotide query (MyD88)
     • 1.4 GB genome (Zebrafish)
     • IXP1200: 90 sec with DFA
     • IXP1200: 129 sec with “trie”
     • P4: 132 sec with “trie”
     • number of matches: 524856
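The timings above can be turned into rough throughput figures. The sketch assumes that 1.4 GB means 1.4 × 10^9 bytes and that the genome is stored at one byte per nucleotide; neither is stated on the slide. The names `RUNS`, `throughput`, and `speedup` are illustrative.

```python
GENOME_BYTES = 1.4e9  # assumption: 1.4 GB = 1.4e9 bytes, one byte per nucleotide

# Run times in seconds, as reported on the slide.
RUNS = {
    "IXP1200 (DFA)": 90,
    "IXP1200 (trie)": 129,
    "P4 (trie)": 132,
}


def throughput(seconds, size=GENOME_BYTES):
    """Bytes of genome scanned per second."""
    return size / seconds


def speedup(baseline_seconds, seconds):
    """How many times faster a run is than the baseline run."""
    return baseline_seconds / seconds


if __name__ == "__main__":
    for name, secs in RUNS.items():
        mb_s = throughput(secs) / 1e6
        ratio = speedup(RUNS["P4 (trie)"], secs)
        print(f"{name}: {mb_s:.1f} MB/s, {ratio:.2f}x vs P4 (trie)")
```

Under these assumptions the DFA run scans roughly 15.6 MB/s, about 1.5 times the rate of the Pentium 4 trie run.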

  20. Results
     [Figure: results chart; image not recoverable]

  21. Conclusions
     • NPUs are useful in other application domains
     • Newer hardware is expected to perform much better
     • “Throughput processors”
     • Adapting our current approach to use BLAST tricks/heuristics

  22. Network processors
     • geared for high throughput
     • used exclusively in network systems
     • example: intrusion detection
         • similar to looking for genes in genomes
         • differences
     [Photo: Radisys IXP1200 board]

  23. Application domain: “Genomics”
     • example: search a genome for occurrences of “patterns”
     • similar problems as IDS; poor performance on GPPs (cannot exploit parallelism)
     • throughput-driven
     • how about FPGAs?
     • how about clusters?
     • NPU
         • easier to program than FPGAs
         • cheaper than cluster computing
         • “on the desktop”: IP never leaves the room
