Searching E-Commerce Data
J. Shafer, R. Agrawal: WWW-9
Outline
• Current parametric-search paradigm
• New paradigm
• Implementation details
Current Paradigm
• User enters search criteria into HTML form
• User’s query is received by web-server and submitted to a server-side database (usually as SQL)
• Result set is returned to user as an HTML page (or a series of pages)
• Examples: Schwab, Expedia
Problems
• Users often don’t know exactly what they are looking for
  • Unfamiliar with domain and/or database
  • Unable to compose precise queries
• Too many/few results: try different query
• Each query change must be sent back to the server and evaluated (often as a new query)
Source of Problems
• Database query technology is targeted at reporting rather than exploration
  • The query itself is the goal
  • The results are “interesting” regardless of their size
• In user exploration, the end goal is typically to find one or two particular records of interest
  • Search is a process, not a single operation
  • An individual query is simply a means to an end
What is needed?
• Combine searching with browsing
• Replace submit/response metaphor with “continuous” querying that allows interactive exploration
Observations
• Data must be cached on client side
• There is always a notion of state
• There is only one mouse
• User can only see those records currently displayed on screen
Architecture Overview
[Diagram. Client: Eureka, ListRenderer, DataGroup, DataColumn #1 ... DataColumn #N, DataPump. Client and server communicate over HTTP. Server: Servlet, JDBC, Database.]
Example Dataset: Used Cars
[Figure labels: Make, Distance, rid, DataPump]
Numeric DataColumns
[Figure: Distance column, showing rid, RID List, and Data]
Categorical DataColumns
[Figure: Make column, showing rid, RID List, Data, and a hashtable with these entries]
• value: Ford, count: 3, index: 4
• value: Honda, count: 4, index: 7
• value: Chrysler, count: 2, index: 2
• value: BMW, count: 2, index: 0
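The hashtable entries in the figure are consistent with each value owning one contiguous run of the RID list when that list is ordered by value: BMW starts at index 0 with 2 records, Chrysler at 2, Ford at 4 with 3 records, and Honda at 7. The following is a minimal Java sketch under that reading. The class and field names (`CategoricalColumnSketch`, `Entry`, `table`) are hypothetical and are not taken from Eureka.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a categorical column; names are illustrative, not Eureka's. */
final class CategoricalColumnSketch {
    /** One hashtable entry: how many records hold the value, and where its run starts in the RID list. */
    static final class Entry {
        final int count;
        final int index;
        Entry(int count, int index) { this.count = count; this.index = index; }
    }

    final String[] data;      // data[rid] = value of record rid
    final Integer[] ridList;  // record ids ordered by value (assumed ordering)
    final Map<String, Entry> table = new HashMap<>();

    CategoricalColumnSketch(String[] input) {
        final String[] values = input.clone();
        this.data = values;
        ridList = new Integer[values.length];
        for (int rid = 0; rid < values.length; rid++) ridList[rid] = rid;
        // Stable sort: records with equal values keep their rid order.
        Arrays.sort(ridList, Comparator.comparing((Integer rid) -> values[rid]));

        // Walk the sorted list and record each value's run as (count, start index).
        int start = 0;
        while (start < ridList.length) {
            String value = values[ridList[start]];
            int end = start;
            while (end < ridList.length && values[ridList[end]].equals(value)) end++;
            table.put(value, new Entry(end - start, start));
            start = end;
        }
    }
}
```

With this layout, every record holding a given value can be reached with one hashtable lookup followed by a scan of `count` consecutive RID-list slots.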
ListRenderer
• Only paint as many rows as fit on the canvas
• Repaint canvas whenever:
  • scrollbar position changes
  • sort order changes
  • records appear/disappear from query results
• Restrictions array indicates whether or not a particular row should be painted (count != 0)
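A rough Java sketch of the row-selection step described on this slide. It assumes that the restrictions array holds, for each record, the number of active restrictions that exclude it, so a row is painted only when its count is zero; the slide's "(count != 0)" does not say which way the test goes, so this is one reading. The method name and parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of choosing which rows to paint; not Eureka's actual renderer. */
final class ListRendererSketch {
    /**
     * Walks the RID list of the current sort column, skips excluded records,
     * skips the first scrollOffset visible rows, and stops once the canvas is full.
     */
    static List<Integer> rowsToPaint(int[] ridList, int[] restrictions,
                                     int scrollOffset, int rowsThatFit) {
        List<Integer> rows = new ArrayList<>();
        int skipped = 0;
        for (int rid : ridList) {
            if (rows.size() == rowsThatFit) break;   // canvas is full
            if (restrictions[rid] != 0) continue;    // assumption: nonzero count = excluded
            if (skipped < scrollOffset) { skipped++; continue; }
            rows.add(rid);
        }
        return rows;
    }
}
```

Because the loop stops as soon as the canvas is full, the cost of a repaint depends on the scroll position and the number of rows that fit rather than on a full pass over the dataset in the common case.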
Numeric Restrictions: max(distance) = 100
[Figure: Restrictions array, RID List, and Data, with lowerIndex and upperIndex markers]
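One way to read this figure: the RID list is sorted by value, so a restriction such as max(distance) = 100 reduces to moving upperIndex, and only the records between the old and new positions need their restriction counts changed. The Java sketch below follows that reading. It assumes a per-record restriction count shared across columns, keeps lowerIndex fixed at 0 (a minimum bound would be symmetric), and uses hypothetical names throughout.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch of a numeric column with an upper-bound restriction; not Eureka's code. */
final class NumericColumnSketch {
    final double[] data;      // data[rid] = value of record rid
    final Integer[] ridList;  // record ids sorted by ascending value
    int lowerIndex;           // first RID-list position in range (unchanged in this sketch)
    int upperIndex;           // last RID-list position in range

    NumericColumnSketch(double[] input) {
        final double[] values = input.clone();
        this.data = values;
        ridList = new Integer[values.length];
        for (int rid = 0; rid < values.length; rid++) ridList[rid] = rid;
        Arrays.sort(ridList, Comparator.comparingDouble((Integer rid) -> values[rid]));
        lowerIndex = 0;
        upperIndex = values.length - 1;
    }

    /** Restricts the column to values <= max and updates the shared restriction counts. */
    void setMax(double max, int[] restrictions) {
        // Binary search for the first position whose value exceeds max.
        int lo = 0, hi = ridList.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (data[ridList[mid]] <= max) lo = mid + 1; else hi = mid;
        }
        int newUpper = lo - 1;
        // Records that fall out of range gain a restriction; records that re-enter lose one.
        for (int i = newUpper + 1; i <= upperIndex; i++) restrictions[ridList[i]]++;
        for (int i = upperIndex + 1; i <= newUpper; i++) restrictions[ridList[i]]--;
        upperIndex = newUpper;
    }
}
```

Dragging a slider from 100 to 150, for example, would touch only the records whose distance lies between those two values, which fits the "continuous" querying goal stated earlier.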
Categorical Restrictions: Make != { Ford }
[Figure: Restrictions array, RID List, and Data, with these hashtable entries]
• value: Ford, count: 3, index: 4
• value: Honda, count: 4, index: 7
• value: Chrysler, count: 2, index: 2
• value: BMW, count: 2, index: 0
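A possible reading of this figure, sketched in Java: excluding a value means looking up its hashtable entry and adjusting the restriction count of every record in that value's run of the RID list (for Ford, 3 records starting at index 4). The sketch assumes a shared per-record restriction count and takes the entry's index and count as arguments; all names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of a categorical exclusion such as Make != { Ford }; not Eureka's code. */
final class CategoricalRestrictionSketch {
    private final int[] ridList;       // record ids grouped by value
    private final int[] restrictions;  // shared per-record restriction counts
    private final Set<String> excluded = new HashSet<>();

    CategoricalRestrictionSketch(int[] ridList, int[] restrictions) {
        this.ridList = ridList;
        this.restrictions = restrictions;
    }

    /** Excludes one value; index and count come from that value's hashtable entry. */
    void exclude(String value, int index, int count) {
        if (!excluded.add(value)) return;  // already excluded, do not count twice
        for (int i = index; i < index + count; i++) restrictions[ridList[i]]++;
    }

    /** Re-admits a previously excluded value. */
    void include(String value, int index, int count) {
        if (!excluded.remove(value)) return;  // was not excluded
        for (int i = index; i < index + count; i++) restrictions[ridList[i]]--;
    }
}
```

The work per change is proportional to the number of records holding the toggled value, not to the size of the dataset.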
Rendering the List (sorted by Distance)
[Figure labels: Restrictions, RID List, Data, ListRenderer, rid]
Rendering the List (sorted by Make)
[Figure labels: Restrictions, RID List, Data, ListRenderer, rid, with these hashtable entries]
• value: Ford, count: 3, index: 4
• value: Honda, count: 4, index: 7
• value: Chrysler, count: 2, index: 2
• value: BMW, count: 2, index: 0
Additional features
• Restriction by example
• Ranking
• Fuzzy restrictions
Final Comments
• Used Eureka in several situations with very positive feedback
• Used Eureka on datasets with 100K records with no visible deterioration in performance
• Performance is excellent, even in Java