
More Specialized Data Structures


adria-ewing

Presentation Transcript


  1. More Specialized Data Structures • String data structures • Spatial data structures Cpt S 223. School of EECS, WSU

  2. String Data Structures

  3. String Operations • String indexing • Pattern matching • Find pattern P in text T • Find common substrings among a set of strings • Application domains • Bioinformatics • Google search!

  4. A simplified hash table for strings • Step 0: Build a lookup table of size |Σ|^w for all w-length words in D • Σ = {A, C, G, T}, w = 2 ⇒ 4^2 (= 16) entries in the lookup table • S1 (positions 1-7): C A G T C C T • S2 (positions 1-7): C G T T C G C • Lookup table (word: occurrences as string,position): AA: none; AC: none; AG: S1,2; AT: none; CA: S1,1; CC: S1,5; CG: S2,1 and S2,5; CT: S1,6; GA: none; GC: S2,6; GG: none; GT: S1,3 and S2,2; TA: none; TC: S1,4 and S2,4; TT: S2,3
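
The lookup table on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `build_lookup_table` is made up here, and a dictionary stands in for the fixed array of |Σ|^w slots, so words that never occur simply have no entry.

```python
from collections import defaultdict

def build_lookup_table(strings, w):
    """Map each w-length word to the (string id, 1-based position) pairs where it occurs."""
    table = defaultdict(list)
    for sid, s in strings.items():
        # Slide a window of width w over the string.
        for i in range(len(s) - w + 1):
            table[s[i:i + w]].append((sid, i + 1))
    return table

# The two strings from the slide.
strings = {"S1": "CAGTCCT", "S2": "CGTTCGC"}
table = build_lookup_table(strings, w=2)

print(table["GT"])  # [('S1', 3), ('S2', 2)]
print(table["CG"])  # [('S2', 1), ('S2', 5)]
```

Looking up a word is then a single dictionary access, which is the point of the structure: any shared w-length word between two strings shows up as one table entry with occurrences from both.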

  5. PATRICIA trees • “Practical Algorithm to Retrieve Information Coded in Alphanumeric” • Compacted trie of a set of strings • Dictionary searches made easy
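
A minimal sketch of a compacted trie used as a dictionary, assuming character-level edges (classical PATRICIA trees are usually described on bit strings). The names `Node`, `insert`, and `contains` are illustrative, not taken from the slides.

```python
class Node:
    def __init__(self):
        self.children = {}    # first character -> (edge label, child Node)
        self.is_word = False  # True if a stored string ends at this node

def common_prefix_len(a, b):
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

def insert(root, word):
    node = root
    while True:
        if not word:
            node.is_word = True
            return
        entry = node.children.get(word[0])
        if entry is None:
            # No edge starts with this character: hang the rest on one edge.
            leaf = Node()
            leaf.is_word = True
            node.children[word[0]] = (word, leaf)
            return
        label, child = entry
        k = common_prefix_len(label, word)
        if k < len(label):
            # The word leaves the edge part-way: split the edge at position k.
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            node.children[word[0]] = (label[:k], mid)
            child = mid
        node, word = child, word[k:]

def contains(root, word):
    node = root
    while word:
        entry = node.children.get(word[0])
        if entry is None:
            return False
        label, child = entry
        if not word.startswith(label):
            return False
        node, word = child, word[len(label):]
    return node.is_word

root = Node()
for w in ["window", "wind", "indigo", "index"]:
    insert(root, w)

print(contains(root, "wind"))  # True
print(contains(root, "ind"))   # False: a shared prefix, but not a stored word
```

Because chains of single-child nodes are merged into one labeled edge, a search touches at most one node per branching point rather than one node per character.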

  6. Suffix Tree • Compacted trie of all suffixes of a string • Example string (positions 1-6): B A N A N A • Find pattern: “ANAN” • Think: how would you implement Google search?
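
To make the pattern search concrete, here is a simplified sketch that builds an uncompacted suffix trie (one node per character, so quadratic space) rather than a true compacted suffix tree. The search logic is the same: walk down from the root along the pattern. The function names and the dictionary-based node layout are assumptions made for illustration.

```python
def build_suffix_trie(text):
    """Insert every suffix of text; each node records the 1-based start
    positions of the suffixes that pass through it."""
    root = {"children": {}, "starts": []}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node["children"].setdefault(ch, {"children": {}, "starts": []})
            node["starts"].append(start + 1)
    return root

def find(root, pattern):
    """Return the 1-based positions where pattern occurs in the indexed text."""
    node = root
    for ch in pattern:
        node = node["children"].get(ch)
        if node is None:
            return []
    return node["starts"]

trie = build_suffix_trie("BANANA")
print(find(trie, "ANAN"))  # [2]
print(find(trie, "ANA"))   # [2, 4]
print(find(trie, "NAB"))   # []
```

Once the index is built, a query takes time proportional to the pattern length, independent of the text length.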

  7. Generalized Suffix Tree (GST) • String 1: WINDOW$ (positions 1-7) • String 2: INDIGO$ (positions 1-7) • [Figure: generalized suffix tree of the two strings, with one leaf per suffix labeled (string number, start position), from (1, 1) to (1, 7) and from (2, 1) to (2, 7)]
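
One use of a generalized suffix structure is finding substrings common to all input strings. The sketch below again uses an uncompacted trie instead of a compacted tree and omits the `$` terminators, both as simplifications; the function names are hypothetical.

```python
def build_generalized_suffix_trie(strings):
    """Insert all suffixes of all strings; each node records which strings
    (numbered from 1) contain the substring spelled by the path to it."""
    root = {"children": {}, "ids": set()}
    for sid, s in enumerate(strings, start=1):
        for start in range(len(s)):
            node = root
            for ch in s[start:]:
                node = node["children"].setdefault(ch, {"children": {}, "ids": set()})
                node["ids"].add(sid)
    return root

def longest_common_substring(strings):
    root = build_generalized_suffix_trie(strings)
    everyone = set(range(1, len(strings) + 1))
    best = ""
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if len(prefix) > len(best):
            best = prefix
        # Only descend into nodes whose substring occurs in every string.
        for ch, child in node["children"].items():
            if child["ids"] == everyone:
                stack.append((child, prefix + ch))
    return best

print(longest_common_substring(["WINDOW", "INDIGO"]))  # IND
```

If several common substrings share the maximum length, this sketch returns one of them, depending on traversal order.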

  8. Spatial Data Structures

  9. Spatial Data Structures • [Figure: points in 2-D enclosed by a bounding rectangle]

  10. Quad trees (4-way trees) • Technique for spatial domain decomposition • Recursive bisection • [Figure: recursive bisection of a 2-D region and the corresponding tree, starting from the root] Source: Handbook of Data Structures & Applications, Chapman & Hall/CRC Press, 2005

  11. Compacted Quad-trees (for 2D data) • Each internal node has exactly 4 children (for 4 quadrants) • Compact each path into a single edge • For 3D data, the corresponding tree is called an oct-tree • [Figure: 2D space with data and its quad-tree decomposition] Source: Handbook of Data Structures & Applications, Chapman & Hall/CRC Press, 2005

  12. Range Queries on Quad-trees • [Figure: a space with origin (0,0) and a query rectangle with corners (a1,b1) and (a2,b2); the range query result is the data inside the rectangle]
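
A rough sketch of a quad tree with a rectangular range query, assuming a region quad tree that splits a cell into four equal quadrants once it holds more than `capacity` points. The class name, the `capacity` and `max_depth` parameters, and the half-open cell convention are choices made for this example, not details from the slides.

```python
class QuadTree:
    def __init__(self, x0, y0, x1, y1, capacity=1, depth=0, max_depth=16):
        self.bounds = (x0, y0, x1, y1)  # cell covers x0 <= x < x1, y0 <= y < y1
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth      # guards against endless splitting on duplicates
        self.points = []
        self.children = None            # None for a leaf, else 4 sub-quadrants

    def insert(self, p):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= p[0] < x1 and y0 <= p[1] < y1):
            return False
        if self.children is None:
            self.points.append(p)
            if len(self.points) > self.capacity and self.depth < self.max_depth:
                self._split()
            return True
        return any(child.insert(p) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(x0, y0, mx, my, *args),
            QuadTree(mx, y0, x1, my, *args),
            QuadTree(x0, my, mx, y1, *args),
            QuadTree(mx, my, x1, y1, *args),
        ]
        old, self.points = self.points, []
        for p in old:
            self.insert(p)  # now routed to the matching child

    def range_query(self, a1, b1, a2, b2):
        """Return all points with a1 <= x <= a2 and b1 <= y <= b2."""
        x0, y0, x1, y1 = self.bounds
        if a2 < x0 or a1 >= x1 or b2 < y0 or b1 >= y1:
            return []  # query rectangle misses this cell: prune the subtree
        found = [p for p in self.points if a1 <= p[0] <= a2 and b1 <= p[1] <= b2]
        if self.children is not None:
            for child in self.children:
                found.extend(child.range_query(a1, b1, a2, b2))
        return found

tree = QuadTree(0, 0, 8, 8)
for p in [(1, 1), (2, 5), (6, 2), (7, 7), (3, 3)]:
    tree.insert(p)

print(sorted(tree.range_query(0, 0, 4, 4)))  # [(1, 1), (3, 3)]
```

The pruning test at the top of `range_query` is what makes the tree useful: whole quadrants that do not overlap the query rectangle are skipped without inspecting their points.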

  13. Oct-Trees (for 3D data) • Issue: What happens if the data is unevenly (i.e., non-uniformly) distributed? • Most of the levels in the tree will be empty • Solution: “Compacted Oct-trees”

  14. k-d trees (for k dimensions) • Maintain a combined binary search tree for all dimensions • Recursively bisect each dimension, alternating dimensions at each level of the tree
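
The alternating bisection can be sketched as follows, assuming the tree is built once from a static list of points by splitting at the median of the current dimension. The function names and the dictionary node layout are illustrative.

```python
def build_kdtree(points, depth=0):
    """Build a k-d tree by splitting on dimension (depth mod k) at the median."""
    if not points:
        return None
    k = len(points[0])
    axis = depth % k
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    # Move to the first point with the median key, so the left subtree is
    # strictly smaller on this axis and the right subtree is >=.
    while mid > 0 and pts[mid - 1][axis] == pts[mid][axis]:
        mid -= 1
    return {
        "point": pts[mid],
        "axis": axis,
        "left": build_kdtree(pts[:mid], depth + 1),
        "right": build_kdtree(pts[mid + 1:], depth + 1),
    }

def contains(node, target):
    """Search like a binary search tree, comparing on the node's own axis."""
    while node is not None:
        if node["point"] == target:
            return True
        axis = node["axis"]
        node = node["left"] if target[axis] < node["point"][axis] else node["right"]
    return False

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)

print(tree["point"])            # (7, 2): the median by x at the root
print(contains(tree, (4, 7)))   # True
print(contains(tree, (1, 1)))   # False
```

The root splits on x, its children on y, the next level on x again, and so on, which is the "alternating dimensions at each level" rule from the slide.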
