Search Stack Secrets

Ryan Gehring, Indiegogo

Practical Search for Rubyists
• Elasticsearch / Solr / alternatives roundup.
• Essential plugins you need to install today.
• Semi-SOA search design.
• Schemaless is for amateurs! Mappings = friend.
• Problem solving with analyzers.
• Avoiding the Tire DSL: the JSON ingredients of a query.

Elasticsearch vs. Solr vs. …
• Horizontal scalability.
• GREAT API.
• Developer support (analyzers, etc.).
• Downside: slightly less great Ruby client.

Awesome Plugins
• elasticsearch-head: a web front end for an Elasticsearch cluster. http://mobz.github.com/elasticsearch-head
• Elasticsearch Paramedic: a simple yet sexy tool to monitor and inspect Elasticsearch clusters.
• Elasticsearch JDBC river: https://github.com/jprante/elasticsearch-river-jdbc

One solid, service-y, Rails 4-approved design
• A web form in the view supplies GET parameters and submits to a search controller.
• The search controller permits the proper, permissioned parameters via strong parameters and instantiates a search object.
• The search model translates the parameters into a query, using either Tire (the Ruby client) or raw JSON.
• The query is fired and the results are served!
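The flow above can be sketched as a plain Ruby search object. This is only an illustration: the class name `ItemSearch`, the permitted keys `q` and `page`, and the page size are assumptions, not names from the talk. In a Rails app the whitelisting step would normally happen in the controller via strong parameters; here a plain Hash is filtered so the sketch runs on its own.

```ruby
require "json"

# Hypothetical search object: turns whitelisted request parameters into an
# Elasticsearch request body. All names here are illustrative assumptions.
class ItemSearch
  PERMITTED = %w[q page].freeze
  PER_PAGE = 20

  def initialize(params)
    # Stand-in for strong parameters: keep only the keys we explicitly allow.
    @params = params.select { |key, _| PERMITTED.include?(key.to_s) }
                    .transform_keys(&:to_s)
  end

  # Request body as a Hash; serialize with to_json before sending it.
  def to_query
    page = [@params.fetch("page", 1).to_i, 1].max
    {
      "query" => text_query,
      "from" => (page - 1) * PER_PAGE,
      "size" => PER_PAGE
    }
  end

  def to_json(*args)
    JSON.generate(to_query, *args)
  end

  private

  # Fall back to match_all when the user submitted an empty search box.
  def text_query
    text = @params["q"].to_s.strip
    return { "match_all" => {} } if text.empty?

    { "query_string" => { "query" => text } }
  end
end

search = ItemSearch.new("q" => "camera", "page" => "3", "admin" => "true")
puts search.to_json
```

Keeping the query construction in one object means the controller stays thin and the query model can be swapped without touching the view.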

Mappings + Analyzers: Ingredients for Success!
• Elasticsearch is schemaless by default, but you can optimize by providing a schema (a mapping).
• The mapping says which fields to index and how to analyze and tokenize each field.
• These analyzers help a lot!
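As a rough illustration of what providing a schema looks like, the sketch below builds index settings and a mapping as a Ruby Hash and prints them as JSON. It is written in the style of recent Elasticsearch releases (`text` fields with sub-`fields`); the versions contemporary with this talk used different type and filter names, so check every key against the documentation for your version. The field name `title`, the sub-field names, and the analyzer names are assumptions.

```ruby
require "json"

# Hypothetical index definition: one "title" field analyzed two ways.
# Key names follow recent Elasticsearch conventions; verify for your version.
def title_index_settings
  {
    "settings" => {
      "analysis" => {
        "filter" => {
          "title_edge_ngram" => { "type" => "edge_ngram", "min_gram" => 2, "max_gram" => 10 },
          "english_snowball" => { "type" => "snowball", "language" => "English" }
        },
        "analyzer" => {
          "edge_ngram_analyzer" => {
            "type" => "custom",
            "tokenizer" => "standard",
            "filter" => %w[lowercase title_edge_ngram]
          },
          "snowball_analyzer" => {
            "type" => "custom",
            "tokenizer" => "standard",
            "filter" => %w[lowercase english_snowball]
          }
        }
      }
    },
    "mappings" => {
      "properties" => {
        "title" => {
          "type" => "text",
          # The same source text is indexed once per analyzer.
          "fields" => {
            "ngram" => { "type" => "text", "analyzer" => "edge_ngram_analyzer" },
            "snowball" => { "type" => "text", "analyzer" => "snowball_analyzer" }
          }
        }
      }
    }
  }
end

puts JSON.pretty_generate(title_index_settings)
```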

Problem solving with analyzers
• My search isn’t robust to misspellings!
  • N-gram
  • Edge n-gram
• My search isn’t robust to plurals / caps / whitespace / etc.
  • Snowball (standard + lowercase + some English-language stemming + stopwording)
• I can only solve one of these at once!
  • Multi-field analysis.
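To see why n-gram analysis tolerates typos, it helps to generate the grams by hand. The pure-Ruby sketch below is not how Elasticsearch implements its tokenizers; the function names and the overlap measure are assumptions made for illustration only.

```ruby
# Prefixes of a token, like an edge n-gram filter would emit.
def edge_ngrams(token, min_gram: 2, max_gram: 10)
  upper = [max_gram, token.length].min
  (min_gram..upper).map { |n| token[0, n] }
end

# All substrings of length n, like a plain n-gram filter would emit.
def ngrams(token, n: 3)
  return [] if token.length < n

  (0..token.length - n).map { |i| token[i, n] }
end

# Share of the query's distinct n-grams that also occur in the indexed term.
def ngram_overlap(query, indexed, n: 3)
  query_grams = ngrams(query.downcase, n: n).uniq
  return 0.0 if query_grams.empty?

  indexed_grams = ngrams(indexed.downcase, n: n)
  (query_grams & indexed_grams).length.to_f / query_grams.length
end

p edge_ngrams("search", min_gram: 2, max_gram: 4)
p ngrams("camera")
p ngram_overlap("camrea", "camera")
```

The misspelled "camrea" never matches "camera" as a whole token, but the two still share a trigram, which is enough for an n-gram field to return the document at a lower score.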

Problem solving with boosts
• Boosts are a concept from Lucene; they are multipliers on scores.
• You can set the relative importance of matching fields, for example title -> 10 vs. free_text -> 1.
• You can set the relative importance of matching on differently ANALYZED versions of a field, for example ngram_title -> 6, snowball_title -> 10.
• Bonus for fields with exact token matches.
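One way to express such boosts is the `field^boost` notation accepted in the `fields` list of a `query_string` query. The sketch below only builds the request body; the helper name `boosted_query` is an assumption, and the field names and weights are the ones used as examples on the slide.

```ruby
require "json"

# Build a query_string query whose fields carry per-field boosts.
# boosts maps field name => multiplier, e.g. "snowball_title" => 10.
def boosted_query(text, boosts)
  {
    "query" => {
      "query_string" => {
        "query" => text,
        "fields" => boosts.map { |field, boost| "#{field}^#{boost}" }
      }
    }
  }
end

body = boosted_query("vintage camera",
                     "snowball_title" => 10, "ngram_title" => 6, "free_text" => 1)
puts JSON.pretty_generate(body)
```

Because the stemmed field outweighs the n-gram field, a document matching the real token ranks above one that only matches fragments of it.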

Key queries in Elasticsearch
• Filtered Query
  • Apply binary filters to an arbitrary query; try it with the query_string query type for full-text, analyzed search queries plus filters.
• Custom Score Query
  • Provide the exact equation for scoring; you can apply mathematical transforms to variables using MVEL, or even Python with the right plugin.
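A sketch of the filtered-query idea as request bodies built in Ruby. The first helper uses the `filtered` query shape from the Elasticsearch versions of this talk's era; later releases removed it in favor of a `bool` query with a `filter` clause, shown in the second helper. The helper names and the `category` field are assumptions; confirm the shape your cluster version expects.

```ruby
require "json"

# Legacy shape: a scored query wrapped with a non-scoring term filter.
def filtered_query(text, field, value)
  {
    "query" => {
      "filtered" => {
        "query" => { "query_string" => { "query" => text } },
        "filter" => { "term" => { field => value } }
      }
    }
  }
end

# Equivalent shape for releases that dropped the filtered query.
def bool_filter_query(text, field, value)
  {
    "query" => {
      "bool" => {
        "must" => { "query_string" => { "query" => text } },
        "filter" => { "term" => { field => value } }
      }
    }
  }
end

puts JSON.pretty_generate(filtered_query("vintage camera", "category", "film"))
puts JSON.pretty_generate(bool_filter_query("vintage camera", "category", "film"))
```

In both shapes the filter only includes or excludes documents; relevance scoring comes entirely from the query_string part.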

Theoretical Section
• Integrating models via custom scoring.
• Learning models: a qualitative, quantitative process.
• Data sources and paradigms.
• Key metrics for search.
• Monitoring statistical model performance.

Custom score queries are regression equations. You can use supervised learning methods to train them over time, as Google does.

Statistical learning & search
• Clickstream models
  • Logistic regression
  • Binary target: click / no click
  • Learn boosts, coefficients, etc.
• Paired comparison models
  • Logistic regression
  • Binary target: A > B
  • Learn boosts, coefficients, etc.
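A minimal sketch of the clickstream variant, assuming each training row holds per-field match scores for one displayed result plus whether it was clicked. The logistic regression is fitted with plain batch gradient descent; the data is a made-up toy set, and the function names, learning rate, and epoch count are arbitrary assumptions. The learned weights play the role of boosts.

```ruby
def sigmoid(z)
  1.0 / (1.0 + Math.exp(-z))
end

def predict(features, weights, bias)
  sigmoid(bias + features.zip(weights).sum { |x, w| x * w })
end

# rows: Array of [features, clicked], features an Array of Floats, clicked 0 or 1.
def fit_logistic(rows, learning_rate: 0.1, epochs: 2000)
  n_features = rows.first[0].length
  weights = Array.new(n_features, 0.0)
  bias = 0.0

  epochs.times do
    grad_w = Array.new(n_features, 0.0)
    grad_b = 0.0
    rows.each do |features, label|
      error = predict(features, weights, bias) - label
      features.each_with_index { |x, j| grad_w[j] += error * x }
      grad_b += error
    end
    # Step against the average gradient of the log loss.
    weights = weights.each_with_index.map { |w, j| w - learning_rate * grad_w[j] / rows.length }
    bias -= learning_rate * grad_b / rows.length
  end

  [weights, bias]
end

# Toy data (invented): [title_score, free_text_score], clicked?
TOY_CLICKS = [
  [[0.9, 0.1], 1], [[0.8, 0.3], 1], [[0.7, 0.2], 1], [[0.6, 0.6], 1],
  [[0.2, 0.8], 0], [[0.1, 0.9], 0], [[0.3, 0.7], 0], [[0.4, 0.4], 0]
].freeze

weights, bias = fit_logistic(TOY_CLICKS)
puts "title weight: #{weights[0].round(3)}, free_text weight: #{weights[1].round(3)}, bias: #{bias.round(3)}"
```

The paired comparison variant can reuse the same fitting routine by using the difference between the feature vectors of results A and B as the input and "A preferred over B" as the binary target.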

Search model training is a qualitative-first process.
• Review search algorithms before you push them.
• Have other people review search results before you push them.
• Make your app robust to new search query models: abstract the regression into a query model.
• Do side-by-side qualitative search QA.

Search success metrics… any Googlers here?
• Items consumed per session, for browse pages.
• 1 - abandoned search %, for search pages.
• Conversion rate originating from the search page.
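These metrics can be computed from simple per-session records. In the sketch below the `Session` fields are assumptions, and "abandoned" is taken to mean a search session with no result click, which is one possible definition rather than the one used in the talk.

```ruby
# Hypothetical per-session record; field names are illustrative.
Session = Struct.new(:searched, :clicked_result, :converted, :items_viewed,
                     keyword_init: true)

# Average number of items consumed per session.
def items_per_session(sessions)
  return 0.0 if sessions.empty?

  sessions.sum(&:items_viewed).to_f / sessions.length
end

# 1 - abandoned search share, among sessions that searched.
def search_success_rate(sessions)
  searches = sessions.select(&:searched)
  return 0.0 if searches.empty?

  abandoned = searches.count { |s| !s.clicked_result }
  1.0 - abandoned.to_f / searches.length
end

# Share of search sessions that ended in a conversion.
def search_conversion_rate(sessions)
  searches = sessions.select(&:searched)
  return 0.0 if searches.empty?

  searches.count(&:converted).to_f / searches.length
end

# Invented sample data, for demonstration only.
SAMPLE_SESSIONS = [
  Session.new(searched: true,  clicked_result: true,  converted: true,  items_viewed: 4),
  Session.new(searched: true,  clicked_result: true,  converted: false, items_viewed: 2),
  Session.new(searched: true,  clicked_result: true,  converted: false, items_viewed: 1),
  Session.new(searched: true,  clicked_result: false, converted: false, items_viewed: 0),
  Session.new(searched: false, clicked_result: false, converted: false, items_viewed: 3)
].freeze

puts items_per_session(SAMPLE_SESSIONS)
puts search_success_rate(SAMPLE_SESSIONS)
puts search_conversion_rate(SAMPLE_SESSIONS)
```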

Search model learning
• Explain output: the ultimate training data, in a nasty, semi-structured mess.
• Built an AST parser for Lucene explain output so you can get clean rows of observations.
• Every query’s intimate scoring details are logged into a DB as lines of training data.
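The parser from the talk is not shown, so the following is a stand-in sketch rather than the author's implementation. It assumes the explanation arrives as the nested JSON that Elasticsearch returns when `explain` is enabled (each node has `value`, `description`, and `details`) and flattens it into one row per scoring node. The sample explanation is invented for illustration.

```ruby
require "json"

# Walk the explanation tree depth-first and emit one flat row per node.
def flatten_explain(node, depth = 0, rows = [])
  rows << {
    "depth" => depth,
    "value" => node["value"],
    "description" => node["description"]
  }
  (node["details"] || []).each { |child| flatten_explain(child, depth + 1, rows) }
  rows
end

# Invented sample in the value/description/details shape.
SAMPLE_EXPLANATION = JSON.parse(<<~JSON)
  {
    "value": 3.5,
    "description": "sum of:",
    "details": [
      { "value": 2.5, "description": "weight(title:camera in 0)", "details": [] },
      { "value": 1.0, "description": "weight(free_text:camera in 0)", "details": [] }
    ]
  }
JSON

flatten_explain(SAMPLE_EXPLANATION).each { |row| puts row.to_json }
```

Each row could then be written to a database table alongside the query and document identifiers, which gives the per-feature observations a regression needs.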

Search model monitoring
• You can calculate stability metrics for thousands of queries between two models and highlight the least stable queries.
• You can monitor prediction accuracy on clickstream data to detect performance degradation.
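The talk does not say which stability metric was used. As one possible choice, the sketch below scores each query by the Jaccard overlap between the top-k document ids returned by two models and lists the queries with the lowest overlap first. The function names, the choice of Jaccard, and the sample data are all assumptions.

```ruby
# Jaccard similarity between two lists of document ids.
def jaccard(a, b)
  union = a | b
  return 1.0 if union.empty?

  (a & b).length.to_f / union.length
end

# results_a / results_b: Hash of query string => ranked Array of document ids.
# Returns [query, overlap] pairs, least stable first.
def least_stable_queries(results_a, results_b, k: 10, limit: 5)
  shared = results_a.keys.select { |query| results_b.key?(query) }
  scored = shared.map do |query|
    [query, jaccard(results_a[query].first(k), results_b[query].first(k))]
  end
  scored.sort_by { |_, overlap| overlap }.first(limit)
end

# Invented sample rankings from two hypothetical models.
MODEL_A = { "camera" => [1, 2, 3], "film" => [4, 5, 6] }.freeze
MODEL_B = { "camera" => [3, 2, 1], "film" => [4, 7, 8] }.freeze

least_stable_queries(MODEL_A, MODEL_B).each do |query, overlap|
  puts "#{query}: #{overlap}"
end
```

Jaccard ignores ordering within the top k, so a rank-aware measure may be a better fit when reordering matters as much as membership.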
