
DB2 Net Search Extender IBM DB2 Data Management March 2003


Presentation Transcript


  1. DB2 Net Search Extender IBM DB2 Data Management March 2003

  2. Agenda • Overview of Search Products in IBM • Product Objectives: DB2 Net Search Extender • Product Overview • Key Features • Positioning of the Text Extender family • Customer Scenarios • Future direction

  3. Products with Text Search in IBM Today (columns: Category, Product, Positioning)
     • Brokered Search
       • IBM Lotus Extended Search: A brokered, index-free search for parallel, distributed, heterogeneous search of specific content sources. Bundled with DB2 II, II for Content, WP.
     • Index-Based Search
       • WebSphere Portal search: Portal product which includes a full-text search library designed for high-precision search on small and mid-sized collections.
       • DB2 Information Integrator for Content: Federated text and parametric search for content and data sources. Web crawler for indexing web sites.
       • Lotus Discovery Server: Knowledge management system for full-text search and expertise location.
       • DB2 Net Search Extender: DB2 extension for fast, scalable full-text search with a SQL/MM dialect. For text stored in DB2 and federated databases. Integrated with DB2 Content Manager.

  4. Objective of the DB2 Net Search Extender • DB2 Net Search Extender is recommended for applications that need: • Full-text search to handle, for example, the demands of e-Business applications with important textual content • A relational database and a rich data schema to support the application requirements • DB2 Net Search Extender is: • An extension to DB2 designed to provide excellent text search capabilities for e-business applications • Seamlessly integrated into SQL through SQL/MM (multimedia), an extension of Structured Query Language • The DB2 Net Search Extender is NOT: • An Internet search product like Google • A generalized free text search product like Verity or Autonomy

  5. Overview of DB2 Net Search Extender • Provides a parallel, scalable, full-text search • Delivers excellent performance and scalability • Tailored for e-Business applications such as e-Commerce and Content Management with text search requirements • Works seamlessly with text documents contained in DB2 and other federated databases • Extends existing DB2 applications easily by using standard extensions to SQL • Provides very fast indexing and dynamic index update, which are the basis for a high-speed search solution • Integrates with the DB2 Control Center for seamless, easy-to-use administration

  6. DB2 Net Search Extender Key Features • Search functions: Options to refine the search process • Boolean operations • Proximity search for words in the same sentence or paragraph • "Fuzzy" searches for words spelled similarly to the search term • Wildcard searches, using front, middle, and end masking • Thesaurus support to broaden the query • Search within sections of documents for more targeted search • Search on numeric attributes • Supports search in 37 languages • Highlight function

  7. DB2 Net Search Extender Key Features • Search Results: Presentation of search results and responsiveness • Set a result limit on queries where a high hit count is anticipated • Built-in SQL functionality works with the optimizer to automatically select the best plan for the expected search results • Order the results by document score • Returns results quickly: a high-performance search solution • Search methods: Programming mechanisms tailored to different e-Business requirements • SQL function for general text search applications: SQL scalar search function • General text search on views and presorted indexes: SQL table-valued function • High performance dedicated text search: Text Search Stored Procedure

  8. SQL Search: SQL scalar search function • The recommended search method, useful for most situations • Use where standard SQL would be used • Use when text search results are combined with other, different conditions • Integrated with the DB2 optimizer for excellent performance where a JOIN of data is needed [Diagram (arrows are data flows): SQL scalar search. A “CONTAINS” query reaches the DB2 Server, matching primary keys are extracted from the index and joined with the DB2 table, and results are returned.]
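The data flow on this slide (extract matching primary keys from the index, then join them with the table) can be illustrated with a small toy model. This is a sketch only, not the product's implementation: the `PRODUCTS` table, its columns, and the helper names `build_text_index`, `contains`, and `search` are all hypothetical, and the index is a plain word-to-keys dictionary.

```python
from collections import defaultdict

# Hypothetical table keyed by primary key; names and data are illustrative only.
PRODUCTS = {
    1: {"price": 20, "description": "waterproof hiking boots"},
    2: {"price": 95, "description": "leather hiking boots"},
    3: {"price": 15, "description": "cotton socks"},
}


def build_text_index(rows, column):
    """Map each lowercase word to the set of primary keys whose text contains it."""
    index = defaultdict(set)
    for key, row in rows.items():
        for word in row[column].lower().split():
            index[word].add(key)
    return index


def contains(index, term):
    """Step 1: the text index returns matching primary keys only."""
    return index.get(term.lower(), set())


def search(rows, index, term, max_price):
    """Step 2: join the keys back to the table and apply a non-text condition."""
    keys = contains(index, term)
    return sorted(k for k in keys if rows[k]["price"] <= max_price)


INDEX = build_text_index(PRODUCTS, "description")
print(search(PRODUCTS, INDEX, "boots", 50))  # [1]
```

The point of the split is that the text index never touches the table rows; only the join step does, which is where a query optimizer can decide how to combine the text condition with the other conditions.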

  9. Text Search on Views: SQL table-valued function for search • Use where you would normally use an SQL scalar function, but you want to exploit text indexes on views or presorted text indexes. [Diagram (arrows are data flows): TextSearch table-valued search function. A “db2ext.textsearch” call reaches the DB2 Server, matching primary keys are extracted from the index and joined with the DB2 table, and results are returned.]

  10. High Performance, Dedicated Text Search: Stored Procedures for Search • Use for high-performance, high-scalability applications that need text search-only queries • Use for queries that do not need to join text search results with the results of other complex SQL conditions. [Diagram (arrows are data flows): TextSearch stored procedure search. A “db2ext.textsearch” call reaches the DB2 Server, which uses the index and a cache; the columns held in the cache are defined at text index creation and come from the DB2 table.]

  11. DB2 Net Search Extender Key Features • Indexing: Very fast indexing and dynamic index update are the basis for a high-speed search solution • Provides fast indexing of large data volumes • Provides incremental updates of indexes • Indexes text documents stored in DB2 and federated databases • Provides a choice of command line or the DB2 Control Center interface for indexing • Supports language-specific stopword lists to reduce index size and improve search speed • Monitors the progress of indexing • Optional: supports presorted text indexes • Optional: provides caching of table columns in main memory at indexing time to avoid physical read operations at search time

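  12. Indexing • Very fast indexing and dynamic index update are the basis for a high-speed search solution [Diagram (arrows are data flows): Insert/Update/Delete operations on the RDBMS tables fire a trigger that writes to a log table. On “UPDATE INDEX…”, Net Search Extender Instance Services read the log table and the RDBMS tables and perform the index update on the DB2 Server.]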
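The elements named on this slide (Insert/Update/Delete, trigger, log table, “UPDATE INDEX…”) suggest an incremental scheme in which only logged changes are re-indexed. The following is a minimal sketch under that assumption. The `Table` and `TextIndex` classes and their methods are hypothetical stand-ins, with the trigger modeled as a plain append to a Python list.

```python
from collections import defaultdict


class Table:
    """Toy table whose write operations append to a log, standing in for triggers."""

    def __init__(self):
        self.rows = {}  # primary key -> text
        self.log = []   # (operation, primary key) entries awaiting indexing

    def insert(self, key, text):
        self.rows[key] = text
        self.log.append(("insert", key))

    def update(self, key, text):
        self.rows[key] = text
        self.log.append(("update", key))

    def delete(self, key):
        del self.rows[key]
        self.log.append(("delete", key))


class TextIndex:
    """Toy inverted index that is refreshed from the table's log."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> primary keys
        self.indexed = {}                 # primary key -> words currently indexed

    def _remove(self, key):
        for word in self.indexed.pop(key, set()):
            self.postings[word].discard(key)

    def update_index(self, table):
        """Process only the logged changes, then clear the log."""
        processed = len(table.log)
        for _, key in table.log:
            self._remove(key)
            if key in table.rows:  # row still exists: index its current text
                words = set(table.rows[key].lower().split())
                self.indexed[key] = words
                for word in words:
                    self.postings[word].add(key)
        table.log.clear()
        return processed

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


table, index = Table(), TextIndex()
table.insert(1, "DB2 text search")
table.insert(2, "federated search")
index.update_index(table)             # processes the two inserts
table.update(2, "federated databases")
table.delete(1)
index.update_index(table)             # processes only the two new log entries
print(index.search("federated"))      # [2]
print(index.search("search"))         # []
```

Because each update pass reads only the log, its cost follows the number of changes since the last pass rather than the size of the table.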

  13. Indexing Performance with DB2 Net Search Extender • DB2 Net Search Extender shows excellent scalability and performance when used with a partitioned database setup.

  14. Summary • Search and information mining is a complex problem • The amount of accessible data (petabytes) • Diversity of sources, types & formats • Despite heterogeneity, users would like seamless use of all kinds of information • Parametric & Text • Multilingual • Without syntax/protocol differences • And they want good results! • We have core technologies • Historic trends are toward integrating technologies • Key IBM products are being extended with search and mining capabilities • Search and mining technologies are evaluated as standalone products as well as embedded components • There is a search product available to solve your business problem

  15. Backup charts

  16. More info • DB2 V8.1 announcement at: http://www.ibmlink.ibm.com/usalets&parms=H_202-214 • found at: http://www-3.ibm.com/software/data/db2/udb/v8/ • "What's new in DB2" PDF document: http://www-3.ibm.com/software/data/db2/udb/pdfs/db2q0.pdf • DB2 Net Search Extender web site with Data Sheet: http://www-3.ibm.com/software/data/db2/extenders/netsearch/

  17. Positioning the three DB2 Text-based Extenders • DB2 Net Search Extender V8 – for use with DB2 UDB V8 • The strategic product going forward • Improvements over both TIE and NSE V7 capability • Merges the functionality of TIE and NSE V7 products • Backward compatibility for DB2 NSE V7 and TIE V7 applications • DB2 Net Search Extender V7 • Designed to support web site traffic • Uses faster underlying search engine than Text Extender • Caches all potential results • Data scalability only limited by physical memory • Less SQL functionality and flexibility than the Text Information Extender • DB2 Text Information Extender (TIE) V7.2 • Uses same underlying search engine as NSE • Has the SQL flexibility of Text Extender • DB2 Text Extender (TE) is the original text extender • Limited new investment in this Extender • High functionality but limited scalability

  18. DB2 Net Search Extender - Formats and Languages • The text document formats supported are: • HTML: Hypertext Markup Language (document models supported) • XML: Extensible Markup Language (document models supported) • GPP: General Purpose format (also known as flat text with user-defined tags, document models supported) • TEXT: Flat text • INSO: Plug-in for Outside-In filtering software by Stellent • Language support is defined as follows: • tokenization of textual data • applying language-specific processing where required (e.g. "new paragraph" indicator for Hindi) • support for DBCS languages using the proven bi-gram approach for tokenization • Language/Codesets as follows: • 19 Group One languages (English through Korean) • 15 Group Two languages (Arabic through Turkish) • 17 Group Three languages (Albanian through Vietnamese) • 5 Group Four languages (Indonesian through Telugu/India)
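The bi-gram approach mentioned for DBCS languages can be sketched in a few lines. This shows the general technique only, not the product's tokenizer: the function names `bigrams` and `bigram_match` are hypothetical, whitespace is simply dropped, and the match check ignores token positions, which a real engine would likely track.

```python
def bigrams(text):
    """Return overlapping two-character tokens, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def bigram_match(query, document):
    """True when every bigram of the query occurs among the document's bigrams."""
    doc_tokens = set(bigrams(document))
    return all(token in doc_tokens for token in bigrams(query))


print(bigrams("東京都庁"))               # ['東京', '京都', '都庁']
print(bigram_match("京都", "東京都庁"))  # True
```

Overlapping pairs let text without word delimiters be indexed without a dictionary. The cost is a larger index and occasional matches that straddle word boundaries, as the second call shows.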
