Word Up! PowerPoint Presentation

Presentation Transcript

  1. Word Up! Using Lucene for full-text search of your data set

  2. Full-text search • Review of full-text search options • Focus on Lucene • Integrating Lucene with JPA/Hibernate

  3. Full-text search options • ‘LIKE’ queries • SQL extensions • Kludge with web search engine • Kludge with web search appliance • Embeddable search library

  4. ‘LIKE’ queries

  5. ‘LIKE’ queries • Simple, straightforward • Fast, easy to implement • Large result set • Limited fuzziness (wildcard or regex)
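
A minimal sketch of the 'LIKE' approach over JDBC. The table and column names (`ARTICLE`, `ID`, `BODY`) and the choice of `!` as the escape character are assumptions made for illustration; none of them come from the slides. The helper escapes the LIKE wildcards in user input before wrapping the term in `%`, so the search behaves as a plain "contains" match.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class LikeSearch {

    // Escape the LIKE metacharacters (% and _) and the escape character
    // itself, then wrap the term in % wildcards for a "contains" match.
    static String containsPattern(String term) {
        String escaped = term
                .replace("!", "!!")
                .replace("%", "!%")
                .replace("_", "!_");
        return "%" + escaped + "%";
    }

    // ARTICLE, ID and BODY are hypothetical names, not from the slides.
    static List<Long> search(Connection conn, String term) throws SQLException {
        String sql = "SELECT ID FROM ARTICLE WHERE BODY LIKE ? ESCAPE '!'";
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, containsPattern(term));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
            }
        }
        return ids;
    }
}
```

Note that a pattern with a leading `%` typically cannot use an ordinary column index, which is one reason this approach tends to scale poorly on large tables.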

  6. Full-text search extensions • No standard syntax (Sybase, MSSQL, DB2, etc. all different) • Administrative overhead for text search indices • Other limitations

  7. Kludge with search engine • External indexing/search software (ht://Dig, mnoGoSearch, Sphinx, Xapian) • Not necessarily pure Java • Can be database-intensive • Lag in updating search index

  8. Kludge with search appliance • “Black-box” solutions (Thunderstone, Google Search Appliance) • Your data set mixes with public content • Doesn’t always work as advertised • Can’t fine-tune search

  9. Embeddable search library

  10. Search library • Example: Apache Lucene • Deploys as part of your application • 100% Java • Fuzzy full-text search (Levenshtein algorithm) • Searches against text, numeric, and boolean fields with multiple options • Can be integrated with JPA/Hibernate via Hibernate Search or Compass
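
To make the "fuzzy" part concrete, here is a textbook sketch of the Levenshtein edit distance: the number of single-character insertions, deletions, or substitutions needed to turn one string into another. This only illustrates the metric; Lucene's own fuzzy matching is implemented differently internally, and the class name `Levenshtein` is made up for this example.

```java
class Levenshtein {

    // Two-row dynamic-programming edit distance between a and b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty prefix of a into the first j chars of b costs j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i chars of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

A fuzzy search can then accept terms within a small distance of the query, for example treating "lucine" as a match for "lucene" because the distance is 1.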

  11. About Lucene • Search index stored on file system (also JDBC and BDB options) • Can store/retrieve data to/from search index (Lucene Projections) • Can index HTML, XML, Office docs, PDFs, Exchange mail with external tools • Supports extended and multi-byte character sets by default

  12. More about Lucene • Indexes records as Lucene Document objects • A Lucene Document doesn’t have to be a literal document – it can be any arbitrary object • A Document can have any number of name-value pairs • Synchronizing your data with the search index is someone else’s problem …
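
The "document as name-value pairs" idea can be sketched with a toy in-memory inverted index in plain Java. This is not Lucene's API: `ToyIndex` and its methods are hypothetical, and real analyzers, scoring, and on-disk storage are left out. It only shows how arbitrary records, flattened into fields, become searchable by term.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ToyIndex {

    // Each "document" is just a map of field name -> value.
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Inverted index: "field:term" -> ids of the documents containing that term.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Stores the document and indexes every field; returns the new document id.
    int add(Map<String, String> doc) {
        int id = docs.size();
        docs.add(new LinkedHashMap<>(doc));
        for (Map.Entry<String, String> field : doc.entrySet()) {
            // Lower-case the value and split on anything that is not a letter or digit.
            String[] terms = field.getValue()
                    .toLowerCase(Locale.ROOT)
                    .split("[^\\p{L}\\p{N}]+");
            for (String term : terms) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(field.getKey() + ":" + term,
                            k -> new TreeSet<>()).add(id);
                }
            }
        }
        return id;
    }

    // Returns the ids of documents whose given field contains the exact term.
    Set<Integer> search(String field, String term) {
        String key = field + ":" + term.toLowerCase(Locale.ROOT);
        return postings.getOrDefault(key, new TreeSet<>());
    }

    // Retrieves the stored name-value pairs for a document id.
    Map<String, String> get(int id) {
        return docs.get(id);
    }
}
```

Keeping this index in step with the underlying data is left entirely to the caller, which is the synchronization problem the last bullet points at.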

  13. Integrating with JPA / Hibernate • Most common method: Hibernate Search • Supports only Hibernate provider • Automatically updates search index when object persisted to database • Entity classes mapped to separate indexes • Entity fields mapped to Lucene index fields using Java annotations

  14. Integrating with JPA/Hibernate … • Alternate method: Compass Project • Supports Hibernate, OpenJPA, others • No release since 2009 – effectively unsupported

  15. Annotated class example …

      @Indexed
      @Entity
      @Cacheable(true)
      @Table(name="MARKER", schema="MAPLINK")
      public class Marker extends MarkerA implements Serializable {

          @Id
          @Column(name="MKR_MARKERID")
          @Field(store=Store.YES)
          private long mkrMarkerid;

          @Column(name="MKR_LAT", nullable = true)
          @Field(store=Store.YES)
          @NumericField
          private Double mkrLat;

          @Column(name="MKR_LONG", nullable = true)
          @Field(store=Store.YES)
          @NumericField
          private Double mkrLong;

      @Indexed – tells Hibernate that this entity class should be indexed

  16. Annotated class example …

      @Indexed
      @Entity
      @Cacheable(true)
      @Table(name="MARKER", schema="MAPLINK")
      public class Marker extends MarkerA implements Serializable {

          @Id
          @Column(name="MKR_MARKERID")
          @Field(store=Store.YES)
          private long mkrMarkerid;

          @Column(name="MKR_LAT", nullable = true)
          @Field(store=Store.YES)
          @NumericField
          private Double mkrLat;

          @Column(name="MKR_LONG", nullable = true)
          @Field(store=Store.YES)
          @NumericField
          private Double mkrLong;

      @Field – tells Hibernate to create a matching name-value pair in the search index for this entity class
      Store.YES – stores the value for retrieval directly from the index, without touching the database

  17. Annotated class example …

      @Indexed
      @Entity
      @Cacheable(true)
      @Table(name="MARKER", schema="MAPLINK")
      public class Marker extends MarkerA implements Serializable {

          @Id
          @Column(name="MKR_MARKERID")
          @Field(store=Store.YES)
          private long mkrMarkerid;

          @Column(name="MKR_LAT", nullable = true)
          @Field(store=Store.YES)
          @NumericField
          private Double mkrLat;

          @Column(name="MKR_LONG", nullable = true)
          @Field(store=Store.YES)
          @NumericField
          private Double mkrLong;

      @NumericField – index as a numeric value, enables greater than / less than / range searches
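
Why numeric indexing matters for range searches can be sketched in plain Java. This is not how Lucene or Hibernate Search store numeric fields; `NumericRangeDemo` is a hypothetical class that keeps values such as `mkrLat` in a sorted map, so that a "between" query becomes a simple slice of that map, and it shows how comparing the same numbers as text gives the wrong order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class NumericRangeDemo {

    // Sorted map from a numeric field value (e.g. a latitude) to the ids holding it.
    private final TreeMap<Double, List<Long>> byValue = new TreeMap<>();

    void add(long id, double value) {
        byValue.computeIfAbsent(value, k -> new ArrayList<>()).add(id);
    }

    // Ids whose value lies in [min, max], both ends inclusive, in ascending
    // value order. Requires min <= max (subMap rejects an inverted range).
    List<Long> range(double min, double max) {
        List<Long> hits = new ArrayList<>();
        for (List<Long> ids : byValue.subMap(min, true, max, true).values()) {
            hits.addAll(ids);
        }
        return hits;
    }

    // Compared as text, "9.5" sorts after "10.25" even though 9.5 < 10.25,
    // which is why a field indexed only as text cannot answer range queries reliably.
    static boolean textOrderIsMisleading() {
        return "9.5".compareTo("10.25") > 0 && 9.5 < 10.25;
    }
}
```

For example, after adding ids 1, 2 and 3 with latitudes 44.98, 45.01 and 46.5, `range(44.9, 45.5)` returns ids 1 and 2.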

  18. Let’s take a Luke at the index …

  19. Practical search exercise

  20. Questions!