Learn how simplicity leads to better results in prior-art patent search: using more query text, filtering results, and combining terms with phrases all help, while structured search does not. Key findings from Dublin City University's participation in CLEF-IP 2009.
Applying the KISS Principle with Prior-Art Patent Search
CLEF-IP, 22 Sep 2010
Walid Magdy, Gareth Jones
Dublin City University
DCU participation in CLEF-IP 2009
• The more text, the better the results
• Structured search does not help
• Filtering helps
• Combination of terms and phrases does better
• Word matching for search is not the best
• Blind relevance feedback is ineffective
• Part of the answer is within the question
KISS
• Keep It Simple and Straightforward
• Three submitted simple runs:
1. IR run (simple search)
2. Cit run (straightforward citation extraction)
3. IR+Cit run (combine IR and Cit runs)
• Evaluation results (25 submitted runs):
1. IR run (3rd in recall)
2. Cit run (1st in precision)
3. IR+Cit run (2nd in MAP, recall, and PRES)
IR run
• Different document versions of a patent are merged
• Only English parts are indexed (title, abstract, description, and claims)
• Query is constructed from the same fields as follows:
- unigrams with freq>2 from the “description” field
- bigrams with freq>3 from all fields
• French and German topics are translated using Google Translate
• The first three levels of classification are used to filter results
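The query construction and classification filter described on this slide could be sketched roughly as follows. This is only an illustration under stated assumptions: the tokenizer, the dict-of-fields patent representation, and the function names `build_query`, `ipc_prefix`, and `filter_by_classification` are hypothetical and not taken from the DCU system. It also assumes IPC-style codes, where the first three levels (section, class, subclass) correspond to the first four characters of the code.

```python
import re
from collections import Counter

FIELDS = ("title", "abstract", "description", "claims")

def tokenize(text):
    # Assumption: lowercase alphabetic tokens; the tokenizer actually used is not specified.
    return re.findall(r"[a-z]+", text.lower())

def build_query(patent, min_unigram_freq=2, min_bigram_freq=3):
    """Select query unigrams and bigrams using the thresholds from the slide."""
    # Unigrams come from the description field only (freq > 2).
    unigram_counts = Counter(tokenize(patent.get("description", "")))
    unigrams = sorted(t for t, c in unigram_counts.items() if c > min_unigram_freq)

    # Bigrams are counted over all indexed fields (freq > 3),
    # without crossing field boundaries.
    bigram_counts = Counter()
    for field in FIELDS:
        tokens = tokenize(patent.get(field, ""))
        bigram_counts.update(zip(tokens, tokens[1:]))
    bigrams = sorted(" ".join(b) for b, c in bigram_counts.items() if c > min_bigram_freq)
    return unigrams, bigrams

def ipc_prefix(code):
    # Assumption: IPC-style code such as "H04L 12/56";
    # section + class + subclass are its first four characters.
    return code.replace(" ", "")[:4].upper()

def filter_by_classification(results, topic_codes, doc_codes):
    """Keep ranked documents sharing at least one three-level prefix with the topic."""
    topic_prefixes = {ipc_prefix(c) for c in topic_codes}
    return [
        doc for doc in results
        if topic_prefixes & {ipc_prefix(c) for c in doc_codes.get(doc, [])}
    ]
```

The filter preserves the original ranking order and only drops documents with no matching classification prefix, which keeps it independent of whatever retrieval model produced the ranking.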
Cit and IR+Cit runs
• All patent IDs are extracted from the description section of patent topics
• IDs that do not exist in the collection are filtered out
• Remaining IDs are considered relevant documents
• Only 771 out of 2,005 topics had citations that could be extracted from their text (2,307 citations)
• IR run is appended to Cit run after removing duplicates to create IR+Cit run
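The Cit and IR+Cit steps above could be sketched as follows. This is a minimal sketch, not the authors' implementation: the regular expression `PATENT_ID` is an assumed pattern covering only a few citation spellings (for example "EP 1234567" or "US 5,123,456"), and the function names are hypothetical. The pattern actually used for extraction is not given in the slides.

```python
import re

# Assumption: a citation is an office code (EP, US, WO) followed by at least
# five digits, optionally separated by ",", "." or "/".
PATENT_ID = re.compile(r"\b(EP|US|WO)[\s-]?(\d(?:[,./]?\d){4,})")

def extract_citations(description):
    """Return normalized patent IDs in order of first appearance, without duplicates."""
    seen, ids = set(), []
    for office, number in PATENT_ID.findall(description):
        pid = office + re.sub(r"\D", "", number)  # strip separators from the number
        if pid not in seen:
            seen.add(pid)
            ids.append(pid)
    return ids

def cit_run(description, collection_ids):
    """Keep only extracted IDs that exist in the collection."""
    return [pid for pid in extract_citations(description) if pid in collection_ids]

def ir_plus_cit(cit, ir):
    """Append the IR ranking to the Cit results, skipping documents already present."""
    merged = list(cit)
    seen = set(cit)
    for doc in ir:
        if doc not in seen:
            seen.add(doc)
            merged.append(doc)
    return merged
```

Placing the extracted citations ahead of the IR results reflects the slide's description of appending the IR run to the Cit run.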
Conclusion & Future Work
• When simpler approaches achieve better results than sophisticated ones: much research is still needed in this area
• Extracted citations can be useful for relevance feedback
• Better translations can be used for FR/DE topics
• Faster translation techniques can be used to translate FR/DE documents
Simply, this was the KISS principle with patent search
Thank you