Web Search and Advanced Internet Services 290N Class Introduction Tao Yang, 2014
Introduction
• Web Search/Traffic
• User interests
• Web content
• Importance of search engine traffic
• Online advertisement
• Class Topics
Sales of Mobile Devices/PCs
Source: http://www.businessinsider.com/the-future-of-mobile-deck-2012-3?op=1
Content trend and ownership
• Content consumption is fragmenting: nobody owns more than 10% of worldwide page views (WW PVs)
• No single place will own all the content [Ramakrishnan and Tomkins 2007]
Web Search Engine Market in USA (Jan 2012)
• Google: 66.2%
• Bing: 15.2%
• Yahoo: 14.1%
• Ask: 3%
• AOL: 1.6%
Search query Ad
Questions
• Do you think an “average” user knows the difference between sponsored search links and algorithmic search results?
Course Objectives
• Practice and experience for building search services and developing related mining applications
• Broad topics in web mining, search engines, and advertisement
• Algorithms & System support
• Workload:
  • 1 take-home exam
  • Group project (2 persons):
    • Paper reviewing and presentation
    • Implementation/evaluation. Report.
  • 2 group HW exercises (Lucene/Solr search, Hadoop log analysis)
Course Topics
• Web Search
  • Indexing, Compression, and Online Search
  • Ranking methods with text/link/click analysis. Machine learning.
• Text Mining
  • Duplicate analysis. Text Categorization and Clustering
  • Recommendation
• Advertisement
• Systems Support
  • Online servers and offline computation. MapReduce.
  • Caching. Crawling and document parsing.
  • Open source systems
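To make the "Indexing ... and Online Search" topic concrete, here is a minimal sketch of an in-memory inverted index with boolean AND retrieval. It is an illustration only, not the course's implementation: the function names `build_index` and `search`, the whitespace tokenizer, and the three sample documents are all assumptions, and real systems such as Lucene/Solr add compression, ranking, and on-disk storage.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the sorted list of doc ids containing it.

    `docs` is a hypothetical {doc_id: text} dictionary; tokenization is a
    plain whitespace split, which is an assumption made for brevity.
    """
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return sorted ids of docs containing every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        # Intersect with the posting list of each further term.
        result &= set(index.get(term, []))
    return sorted(result)


# Made-up sample documents, used only to exercise the sketch.
docs = {
    1: "web search engines rank pages",
    2: "text mining and clustering",
    3: "search advertisement and ranking",
}
index = build_index(docs)
print(search(index, "search"))
print(search(index, "search and"))
```

The index is built once offline and queried online, which mirrors the offline computation / online server split listed under Systems Support.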
Expected Work
• Tentative weighting: project 50%, take-home exam 40%, HW exercises 10%.
• Timeline
  • Jan 29: 1-page project proposal (plain email text).
  • Jan 30-Feb 6:
    • Meet with me and select paper(s) for reviewing.
    • Demo for HW 1
  • Feb 15 week:
    • Project progress & related papers presentation
  • Feb 27: HW2
    • Then schedule a second meeting with me on HW2 and the project
  • March 15 week or earlier:
    • Project demo/interview
    • Final project slides/report.
    • Take-home exam. Problems based on class presentations, references, and HW.
References
• Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze (MRS), Introduction to Information Retrieval, Cambridge University Press, 2008.
• Croft, Metzler, and Strohman (CMS), Search Engines: Information Retrieval in Practice, Addison-Wesley, 2010.
• Selected papers
• www.cs.ucsb.edu/~tyang/class/290N13
Class Computing Resource
• Triton supercomputer accounts:
  • Week 2 (Jan 13): Get a class account on Triton by emailing your name, UCSB email, and ssh public key with the subject "CS290N ssh key" to scc@oit.ucsb.edu. Instructions on generating ssh keys can be found at http://cs.ucsb.edu/~hnielsen/cs140/ssh-keypair.html
• CSIL sandbox disk space:
  • /cs/sandbox/class/cs290n
  • /cs/sandbox/student/<username>
• 290N class discussion group at Google.com (we will send an invitation based on the class list).