IST 497G Ron Grzywacz November 2002

Personalization of Information Retrieval

Overview • The Topic • Issues • Importance • What has been done and how • The next step

Personalization of IR • This refers to the automatic adjustment of information content, structure, and presentation tailored to an individual user. • Characteristics • Age • Gender • Special Interest Groups • Topic

Issues • Why Use? • One Size Doesn’t fit all • Limits Diversity • Limits Functionality • Limits Competition • Different Users have different needs

Issues • How to customize the process? • Which characteristics should be used? • Not all people in a given group are the same • It is virtually impossible to create a unique search engine for every individual • It wouldn’t make sense to build 100 million different versions of Google from the ground up, each based on one user’s needs

Importance • Why important? • Not every solution is ideal for each person • Certain people don’t understand how to use certain systems • Current systems aren’t tailored for a specific type of user

The Example • Suppose we query a system for “Michael Jordan” • Most popular engines would return information about the famous basketball player • Suppose that we were looking for computer science papers written by Michael Jordan • The query “Michael Jordan” by itself carries no information about the context of our desired search

Context Search • This presents a problem • We have no way to infer or assume the context of a user’s desired search • Currently, returning contextually relevant information is a hit-or-miss process • What if we tried to automatically infer contextual information?

Automatic Inferring of Contextual Information • By monitoring user patterns, we could gather information about the user and the possible context of a search • This raises issues regarding privacy • Do you want some company building information about your preferences? • What happens if this information is released or misused against you?

Personalized Search • By combining the previous items we can build a personalized search service. • The example “Michael Jordan” query could return data about the basketball player to sports enthusiasts and information about computer science papers to researchers.
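
The “Michael Jordan” example can be sketched as a re-ranking step. This is only an illustration of the idea, not how any particular engine works: the `Result` class, the topic labels, and the interest sets are all hypothetical, and real systems would have to infer both from data.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    title: str
    # Hypothetical topic labels; a real system would have to infer these.
    topics: set = field(default_factory=set)


def personalize(results, interests):
    """Order results by how many of the user's interests each one matches.

    sorted() is stable, so results with equal overlap keep the engine's
    original order.
    """
    return sorted(results, key=lambda r: len(r.topics & interests), reverse=True)


# The engine's default order puts the basketball player first.
results = [
    Result("Michael Jordan, basketball player", {"sports", "basketball"}),
    Result("Michael Jordan, computer science papers", {"research", "computer science"}),
]

fan = personalize(results, {"sports"})
researcher = personalize(results, {"research", "computer science"})

print(fan[0].title)
print(researcher[0].title)
```

A user with no matching interests simply sees the default order, so the personalization step never hides results; it only reorders them.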

What’s Been Done? • Inquirus/Inquirus 2 • This is a meta-search engine built by a team from NEC (which included Prof. Giles) • It attempts to add a category or contextual information to a keyword search • Examples would be “personal homepage”, “research paper” and “general introductory information” • It uses this information to choose relevant search engines to query, modify the queries and select an ordering policy
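
The idea of attaching a category to a keyword search can be sketched as a lookup that picks engines and appends terms. This is a minimal sketch of the general approach only: the engine names, the extra terms, and the `CATEGORY_RULES` table are invented for illustration and are not Inquirus 2’s actual rules.

```python
# Hypothetical category table: which engines to ask and which terms to add.
CATEGORY_RULES = {
    "research paper": {
        "engines": ["paper_index", "general_web"],
        "extra_terms": ["abstract", "references"],
    },
    "personal homepage": {
        "engines": ["general_web"],
        "extra_terms": ["homepage"],
    },
}

# Fallback when the category is unknown: plain keyword search on one engine.
DEFAULT_RULE = {"engines": ["general_web"], "extra_terms": []}


def plan_queries(keywords, category):
    """Return (engine, modified query) pairs for a keyword query and category."""
    rule = CATEGORY_RULES.get(category, DEFAULT_RULE)
    query = " ".join([keywords, *rule["extra_terms"]])
    return [(engine, query) for engine in rule["engines"]]


plan = plan_queries("Michael Jordan", "research paper")
for engine, query in plan:
    print(engine, "<-", query)
```

The ordering policy mentioned above is left out here; it would be a third entry in each rule that decides how the merged results are sorted.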

What’s Been Done? • My experiences with Inquirus • You have 3 search options within this service • Web Search • Google • Groups (Google Groups) • Returned valid results from both Google and Groups • Valid results were also returned from Web Search but rank order was not as good • It also kept words such as “of” in the search terms. This could be problematic, since such common words sometimes matched the most results
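
One common way to deal with words like “of” is to drop them before the query is sent. The sketch below assumes a small hand-written stop-word list; the list and the function name are illustrative and say nothing about how Inquirus handles its queries.

```python
# Illustrative stop-word list, far from exhaustive.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}


def strip_stop_words(query):
    """Lower-case the query and drop common words that carry little meaning."""
    terms = query.lower().split()
    kept = [t for t in terms if t not in STOP_WORDS]
    # If every term was a stop word, keep the original terms rather than
    # sending an empty query.
    return kept or terms


print(strip_stop_words("Personalization of Information Retrieval"))
```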

What’s Been Done? • The Watson Project – Northwestern Univ. • Suppose someone is searching for “information about cats” • It’s easy to imagine many different scenarios behind the same text • Veterinary student writing a term paper on animal cancer • Feline Cancer, Diagnosis, Treatment • Contractor working on a proposal for a new building • Caterpillar Corp. machinery • Grade-school student writing a paper about Egypt • Pictures and stories about cat mummies and gods

What’s Been Done? • The Problems with the query • Relevance of active goals • The active goals of the user contribute significantly to the interpretation of the query and to the criteria for judging a resource relevant to the query. • Word-sense ambiguity • The sense of “cat” differs from one scenario to the next. The context of the request provides a clear choice of word sense. • Audience appropriateness • The audiences in each of the scenarios also constrain the choice of results. Sources appropriate for a veterinarian probably will not be appropriate for a student in grade school.

What’s Been Done? • The Watson Project Solution • A system used to collect contextual information from everyday computer use • Watson is a client-side application that monitors your daily use of applications such as word processors, web browsers and email clients. • By knowing about you and your work, Watson can help you find information that is relevant to you.
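
The core idea of turning the document a user is working on into a search query can be sketched with simple term counting. This is an assumption-laden illustration, not Watson’s actual algorithm: the stop-word list, the minimum word length, and the `query_from_document` function are all hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list, far from exhaustive.
STOP_WORDS = {"and", "for", "the", "with", "that", "this", "from"}


def query_from_document(text, k=3):
    """Pick the k most frequent meaningful words of a document as a query."""
    words = re.findall(r"[a-z]+", text.lower())
    # Ignore stop words and very short words (2 letters or fewer).
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(k)]


# Hypothetical text from the veterinary student's term paper.
draft = (
    "Feline cancer: diagnosis and treatment. Feline cancer treatment varies. "
    "Early feline cancer signs. Feline health."
)

print(query_from_document(draft))
```

With this draft open, a search for “cats” could be expanded with the extracted terms, which steers results toward the veterinary sense of the query.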

What’s Been Done? • My experience with Watson • A small download and brief install loaded the Java-based application • I used it while creating this presentation • It managed to return results regarding search engines and their development, but it did not return anything relevant to my specific topic • It did manage to generate a search in CiteSeer for me

What’s Been Done • Context already assumed • Although we cannot yet automatically infer the context of a user’s search, people have built engines that use a given context • CiteSeer • This is a search engine for research papers in scientific literature

What’s Been Done • PubMed • Customized Science and Medicine database of journals and articles • Questia • Search engine for students who are doing research and writing papers

What’s Been Done • A different approach using personalized IR • In what other ways can we use this type of technology? • KnowledgeFlow Inc. – Web Angel • Browser Plug-in that stores your preferences and returns customized advertisements to you as you browse the web • A practical consumer/commercial application of personalized IR

What’s Been Done • There are other ways to personalize web content • Recommender Systems • A System that monitors your habits or receives input about your preferences and generates things you might be interested in • Amazon.com • Barnes and Noble

Other Recommender Systems • MovieLens • Movie selection service • Gives you a survey to gauge your preferences • Makes recommendations based on your likes • Book Forager • Novel selection service • Allows you to choose a variety of book characteristics • Makes a recommendation based on your current choices
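
A survey-based recommender of this kind can be sketched with user-to-user similarity. The code below is a generic collaborative-filtering illustration under stated assumptions, not the method used by any of the services named above; the users, titles, and ratings are made up.

```python
from math import sqrt

# Hypothetical survey ratings on a 1-5 scale.
RATINGS = {
    "ann": {"Movie A": 5, "Movie B": 4, "Movie C": 1},
    "bob": {"Movie A": 5, "Movie B": 5, "Movie D": 4},
    "cy": {"Movie C": 5, "Movie E": 4},
}


def similarity(a, b):
    """Cosine similarity between two users, treating unrated items as 0."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings):
    """Rank items the user has not rated by similarity-weighted ratings."""
    mine = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = similarity(mine, theirs)
        for item, rating in theirs.items():
            if item not in mine:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("ann", RATINGS))
```

Here “ann” agrees with “bob” far more than with “cy”, so the item only “bob” rated ranks above the item only “cy” rated.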

What’s Next? • How can we build upon current services? • Use AI to evaluate the query and determine contextual information • Use user-provided information to generate specific data • Build upon user-provided data by monitoring browsing preferences

Summary • The Topic • Issues • Importance • What has been done and how • The next step

Citations • Budzik, J., and Hammond, K. User Interactions with Everyday Applications as Context for Just-in-time Information Access. In Proceedings of Intelligent User Interfaces 2000. ACM Press, 2000. (Nominated for Best Paper Award) • Lawrence, S. Context in Web Search. IEEE Data Engineering Bulletin, Volume 23, Number 3, pp. 25–32, 2000. • Ramakrishnan, N., and Perugini, S. The Partial Evaluation Approach to Information Personalization. ACM Transactions on Information Systems, August 2001.
