
Why tune relevance


bree-kemp


Presentation Transcript


  1. Why tune relevance: because we want to find the single best item among a large group of possible candidates…

  2. Multiple levels of control
     • Diagram labels: Relevancy Ranking, Precision, Recall
     • Levels of control: Application Model, Business Rules, InPerspective™, Core Algorithmic Model

  3. FAST Relevancy Framework: Multiple levels of control
     • Application Model. Accessible to: End Users. Control mechanisms: sorting order, navigation, relevance feedback.
     • Business Rules. Accessible to: Business Managers. Control mechanisms: query and document "boosting" (BMCP).
     • InPerspective™. Accessible to: Administrator. Control mechanisms: "Rank Profile".
     • Core Algorithmic Model. Accessible to: Developer. Control mechanisms: algorithm "weights".

  4. FAST Relevancy Framework: InPerspective™
     • Freshness: How fresh is the document compared to the time of the query?
     • Completeness: How well does the query match superior contexts like the title or the URL?
       - Example: query="Mexico". Is "Mexico" or "University of New Mexico" the best match?
     • Authority: Is the document considered an authority for this query?
       - Examples: Web link cardinality, article references, product revenue, page impressions, ...
     • Statistics: How well do the contents of this document match the query overall?
       - Examples: proximity, context weights, tf-idf, degree of linguistic normalization, and more
     • Quality: What is the quality of the document?
       - Examples: Homepage? Press release? ...
     • Distance: What is the distance from where I am?
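The "Completeness" question on slide 4 can be made concrete with a toy score. The sketch below is an illustration only, not FAST's actual formula: it assumes completeness is the fraction of a field's tokens that the query covers, and the function name `completeness` is hypothetical.

```python
def completeness(query: str, field_text: str) -> float:
    """Toy completeness: share of the field's tokens that appear in the query.

    This is an assumed definition for illustration, not the FAST ESP algorithm.
    """
    query_tokens = set(query.lower().split())
    field_tokens = field_text.lower().split()
    if not field_tokens:
        return 0.0
    matched = sum(1 for token in field_tokens if token in query_tokens)
    return matched / len(field_tokens)


# The slide's example: which title is the better match for the query "Mexico"?
titles = ["University of New Mexico", "Mexico"]
ranked = sorted(titles, key=lambda title: completeness("Mexico", title), reverse=True)
```

Under this assumption the title "Mexico" is fully covered by the query, while "University of New Mexico" is only one quarter covered, so the shorter title ranks first.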

  5. FAST Relevancy Framework: Rank Profile
     • Rank profiles shown: Default (Intranet), Financial News, Wealth Management
     • Settings listed for each profile: Authority, Freshness, Proximity, Context, Body, Description, URL, Keywords, Title
     • The per-profile values are not preserved in the transcript.
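To illustrate how a rank profile like those on slide 5 could change the ordering of results, here is a minimal sketch. It assumes a profile is a set of weights combined as a normalized weighted sum. The `RankProfile` class, the weight values, and the signal values are all hypothetical, since the slide's actual values are not in the transcript and the real FAST ESP scoring function is not described.

```python
from dataclasses import dataclass


@dataclass
class RankProfile:
    """Hypothetical rank profile: a name plus one weight per relevance signal."""
    name: str
    weights: dict[str, float]

    def score(self, signals: dict[str, float]) -> float:
        """Normalized weighted sum of signals (assumed formula, for illustration)."""
        total_weight = sum(self.weights.values())
        if total_weight == 0:
            return 0.0
        weighted = sum(w * signals.get(key, 0.0) for key, w in self.weights.items())
        return weighted / total_weight


# Made-up weights: the intranet profile favors authority, the news profile freshness.
default_profile = RankProfile(
    "Default (Intranet)",
    {"authority": 2.0, "freshness": 1.0, "proximity": 1.0, "title": 1.0},
)
news_profile = RankProfile(
    "Financial News",
    {"authority": 1.0, "freshness": 3.0, "proximity": 1.0, "title": 1.0},
)

# Two made-up documents with per-signal scores in the range 0..1.
old_reference = {"authority": 0.9, "freshness": 0.1, "proximity": 0.5, "title": 0.5}
fresh_article = {"authority": 0.3, "freshness": 0.9, "proximity": 0.5, "title": 0.5}
```

With these made-up numbers the default profile prefers the older, more authoritative document, while the news profile prefers the fresher one. The same two documents swap places purely because the profile changed.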

  6. Search Business Center (SBC)

  7. FAST Unity™: What It Does, How It Works, and What Value It Provides

  8. FAST ESP Federation: FAST Unity at a Glance
     • FAST sources: FAST ESP 5.x, FAST Data Search 4.x, FAST ImPulse, FAST AdMomentum, FAST RetrievalWare
     • External sources:
       - Microsoft SharePoint 2003 & 2007
       - Web search engines: Google, Yahoo, OpenSearch, Gigablast
       - Web services: Match.com, PriceGrabber, Google Image
       - Advertising services: Google Adsense
     • Diagram labels: Front-end Search Application; Search Index; Web Search Engine; Web Site; …; Internal Sources; External Sources (e.g. another ESP instance)

  9. Look and feel - Unity: Calls-to-Action, Featured Content, Ads, Multimedia, User-generated Content, Subscription Feeds, Third-party Content

  10. Example: Web 2.0 Model
      • One query - multiple result sets
      • Results are returned asynchronously
      • Delivered directly to the browser
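The "one query, multiple result sets, returned asynchronously" model on slide 10 can be sketched with Python's `asyncio`. This is not the FAST Unity API: the source names, delays, and the functions `search_source` and `federated_search` are hypothetical stand-ins, and the network calls are simulated with `asyncio.sleep`.

```python
import asyncio


async def search_source(name: str, query: str, delay: float) -> tuple[str, list[str]]:
    """Simulate one backend (hypothetical) that answers after `delay` seconds."""
    await asyncio.sleep(delay)
    return name, [f"{name} result for '{query}'"]


async def federated_search(query: str) -> list[tuple[str, list[str]]]:
    """Send one query to several sources and collect result sets as they arrive."""
    # Hypothetical sources with simulated response times in seconds.
    sources = {"search index": 0.01, "web search": 0.05, "ads": 0.03}
    tasks = [
        asyncio.create_task(search_source(name, query, delay))
        for name, delay in sources.items()
    ]
    arrived = []
    # as_completed yields each result set as soon as its source responds,
    # so a fast source is never held back by a slow one.
    for finished in asyncio.as_completed(tasks):
        arrived.append(await finished)
    return arrived


results = asyncio.run(federated_search("mexico"))
```

In a browser-facing application each result set would be pushed to the page as it arrives rather than collected into a list. The list here only keeps the sketch self-contained.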

  11. FAST ESP - Scalability
      • 3D Scalability: #Documents - #Users - Index Latency
      • Single Search Node Performance:
        - 20-50 million documents
        - Up to 1 TB of information
        - 100-500 queries per second
        - 20-50 ms query response time
        - Down to 50 ms indexing latency
        - Indexing 50+ documents per second while maintaining search performance
      • Hardware: Dual Pentium 4, 3 GHz; 4 GB RAM; 3 x SCSI 15K rpm; HW RAID-0 derivative
      • FAST Scalability Facts:
        - Deployments with >40 TB
        - Deployments with >3B documents
        - Deployments with 1 to 1000+ servers
        - Deployments with 1000s of queries per second
        - Deployments with >500 updates per second
        - 20-50 ms query response time
        - Sub-second indexing latency
        - Crawling >200 documents per second per server
      • Diagram labels: BUSINESS APPLICATIONS, BUSINESS MANAGERS, END-USERS, SITE SEARCH, eCOMMERCE, ANALYTICS, SCALABILITY, COMPLIANCE, ACCURACY, INTRANET, AVAILABILITY, SECURITY, FRAUD DETECTION, eDIRECTORIES, MARKET INTELLIGENCE, FLEXIBILITY, DEVELOPERS, SURVEILLANCE, IT MANAGERS, Document Freshness, SCALING
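The single-node figures on slide 11 allow a rough back-of-the-envelope sizing. The sketch below takes the lower bounds from the slide (20 million documents and 100 queries per second per node) and assumes perfectly linear scaling, with the index split into partitions and each partition replicated to absorb query load. The terms "partition" and "replica" and the function `estimate_nodes` are assumptions made for illustration, not FAST ESP terminology or tooling.

```python
import math

# Conservative per-node figures taken from the slide's lower bounds.
DOCS_PER_NODE = 20_000_000
QPS_PER_NODE = 100


def estimate_nodes(total_docs: int, peak_qps: int) -> tuple[int, int, int]:
    """Return (partitions, replicas, total nodes) under assumed linear scaling."""
    # Enough partitions so that no node holds more than DOCS_PER_NODE documents.
    partitions = max(1, math.ceil(total_docs / DOCS_PER_NODE))
    # Every query touches all partitions, so each full copy of the index
    # serves QPS_PER_NODE queries per second; replicate to reach peak_qps.
    replicas = max(1, math.ceil(peak_qps / QPS_PER_NODE))
    return partitions, replicas, partitions * replicas


# Example: 3 billion documents at 1000 queries per second.
partitions, replicas, nodes = estimate_nodes(3_000_000_000, 1_000)
```

For 3 billion documents at 1000 queries per second this gives 150 partitions, 10 replicas, and 1500 nodes, which is the same order of magnitude as the "1 to 1000+ servers" deployments the slide mentions. Real sizing would also depend on document size, query complexity, and hardware.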

  12. Query Performance of FAST Search vs. RDBMS: Proven High QPS, Low Latency Access - Database Offloading
      • Chart labels: QPS, Latency; series: ESP5, RDBMS
      • Structured data: 5 million records; 13 fields per record
      • Structured queries: 22 SQL queries (representative in ERP)
      • #1: FAST ESP4 w/ disk: mean = 99 ms, st. dev. = 36 ms
      • #2: Oracle w/ memory mapping: mean = 4,057 ms, st. dev. = 9,368 ms
      • Identical HW: single node, 2 CPUs, 4 GB RAM, 3 SCSI disks
      • Identical data: auction data from eBay, 3.6 million docs
      • Identical queries: 200 queries defined by Oracle

  13. ESP5 Scalability: Efficiency Per Server & Linear Scaling
      • Diagram labels: SEARCH, INDEX, Query, QUERY PROCESSING, CONTENT REFINEMENT, Pluggable Content Dispatcher, Query & Result Distribution, RESULT PROCESSING, Documents

  14. ESP5 - Raising the Bar: Enabling the Adaptive Information Warehouse
      • Quadrant headings: SCALABLE, HIGH PERFORMING, RELIABLE, FLEXIBLE
      • Linear scaling of feeding capacity
      • Archival solutions @ 40 PB
      • 14G Search solution (14X Google)
      • Feed @ >6000 updates/s
      • Querying @ >2000 QPS
      • 100M documents per server
      • >2X indexing throughput
      • Consistent low latency
      • Reduced disk footprint
      • Feeding architecture improved
      • Simplified state management
      • Improved fault-tolerance
      • Out-of-the-box monitoring
      • End2End SOA philosophy
      • Studio & Programmatic extensibility
      • Semantic index
      • SAN/NAS optimizations

  15. FAST ESP Competence Analysis
      • Performance & scalability with commodity servers
      • Support for 70+ languages
      • Easy-to-use management tool and security control
      • Relevancy/precision: find what users want
      • Navigation to quickly find what users want within a few clicks
      • Add-on applications including recommendation, advertising promotion, mobile access, DB cleansing/offloading, …
      • 200+ connectors to popular data silos on the market
      • Extensibility and integration with an open architecture
      • Market leader (#1)
      • Large R&D investment and commitment
