Explore the journey of failed predictions and valuable lessons learned by Chris Welty from IBM Research. Starting with personal anecdotes and failed prophecies, gain insights into the importance of adaptation, scalability, and acknowledging the evolving landscape of technology.
How I was right, even when I was wrong Chris Welty IBM Research
Outline • Opening Joke • Some personal history • My failed predictions • Lessons learned? • A glimpse into the future • Closing joke
The birth of a know-it-all • Born in NY early 60s • Early expert on everything • Disappointment with Buck Rogers • 2nd grade • Summed up numbers from 1-100 in five minutes • Got 5100 • The answer is 5050 • First prediction (1975) • I will marry Farrah Fawcett-Majors
Outline • Opening Joke • Some personal history • My failed predictions • Lessons learned? • A glimpse into the future • Closing joke
First exposure to email • 1981 uucp mail • allegro!batcave!cornell!rpics!weltyc • Prediction: • No one will ever use email • Why I was right: • Usenet paths were ridiculous • What I missed: • Paths and email were not tightly bound • People really wanted email WRONG!
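To see why these paths felt ridiculous, here is a minimal Python sketch that splits the bang path from the slide into its relay hosts and final user. The function name `parse_bang_path` is hypothetical and only illustrates the addressing scheme: the sender had to spell out every hop by hand.

```python
from __future__ import annotations


def parse_bang_path(path: str) -> tuple[list[str], str]:
    """Split a UUCP bang path into (relay hosts in order, final user)."""
    parts = path.split("!")
    # A usable path needs at least one host and a user, with no empty hops.
    if len(parts) < 2 or any(not part for part in parts):
        raise ValueError(f"not a bang path: {path!r}")
    return parts[:-1], parts[-1]


hops, user = parse_bang_path("allegro!batcave!cornell!rpics!weltyc")
print(hops)  # ['allegro', 'batcave', 'cornell', 'rpics']
print(user)  # weltyc
```

Every sender needed a different path to reach the same mailbox, which is the tight binding between route and address that later schemes dropped.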
Next generation email • 1983 CSNet • weltyc@rpics • Prediction: • As I said… • Why I was right: • Someone still has to maintain the list • Won’t scale • What I missed • People really needed email • It was better, not perfect WRONG!
Domain Naming Service • [Diagram: name tree with root "." and top-level domains .edu, .com, .org] • Proposed to IETF in 1985 • Distributed hierarchical database • Distributed not only the data, but the maintenance • might make email work • Prediction: • The .edu top-level will become overloaded • Why I was right: • The hierarchy was unbalanced • What I missed: • People were willing to invest in scale • Money to be made in supplying domain names! • It was better, not perfect WRONG!
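The "distributed hierarchical database" idea can be sketched as a toy tree in which each zone knows only its direct children, so both the data and its maintenance are delegated. This is an illustration, not how real resolvers work: the `Zone` class, `resolve` function, the `example.edu` domain, and the address are all assumptions made up for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Zone:
    """One node in the toy hierarchy; it knows only its direct children."""
    name: str
    address: str | None = None
    children: dict[str, Zone] = field(default_factory=dict)

    def delegate(self, label: str, address: str | None = None) -> Zone:
        """Create a child zone that is maintained independently of the parent."""
        child = Zone(label, address)
        self.children[label] = child
        return child


def resolve(root: Zone, hostname: str) -> str | None:
    """Walk the labels right to left, asking each zone about the next label."""
    zone = root
    for label in reversed(hostname.split(".")):
        zone = zone.children.get(label)
        if zone is None:
            return None
    return zone.address


root = Zone("")
edu = root.delegate("edu")
example = edu.delegate("example")      # hypothetical domain
example.delegate("cs", "192.0.2.10")   # address from the documentation range

print(resolve(root, "cs.example.edu"))
print(resolve(root, "cs.example.com"))
```

The root never stores the host entry itself; it only points at `edu`, which is what lets the maintenance scale along with the data.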
HTTP/HTML • Proposed to IETF in 1990 • Hypertext is decades old, this just adds tags • Prediction: • No big deal, unimportant • Porn will make the Internet unusable • Why I was right: • Porn really was king of the early web (~70%) • What I missed: • People were willing to invest in scale • It was more than just tags for HyperCard WRONG!
Web 2.0 (i.e. Social Web) • Started roughly 2002 • Web of people instead of machines • Wikis, social tagging, social networks • Prediction • TEENAGE NONSENSE • Will be poisoned by stupidity, negativity, misdirection, spam • Will not scale • Why I was right • Most of it is teenage nonsense • Most people really are idiots • What I missed • People want to share their knowledge • People scale on the web • Quality seems to be self-governing in certain areas WRONG!
Blind Men and Elephants Which one are you? I was right about the trunk.
Semantic Web • The idea has been around for about a decade • You may have heard of it • I got the pitch from TimBL… • Prediction: • KR is decades old, this just adds tags • Will not scale (KA, Reasoning) • Proliferation of bad ontologies will lead to bad systems • Why I was right: • Reasoning doesn’t really scale (EXPTIME is incomplete) • Bad ontologies do lead to bad systems • What I missed • It’s not just tags • KA does scale – people want to share their knowledge • A lot of people don’t care about reasoning • Better not perfect • KA not needed – the actual vision WRONG!
The Semantic Web Vision • ~80% of web pages are generated from back end databases • Publish the semantics (schema?) as well as the data • URIs provide a web-based form of identity • It’s the semantic WEB, not the SEMANTIC web • NOT: humans will markup their web pages • NOT: NLP will populate the SW from web pages
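A rough sketch of what "publish the semantics as well as the data" could look like for a page backed by a database: each row becomes a set of triples whose subject, predicate, and linked objects are URIs. Everything here is an assumption for illustration, including the `http://example.org/` namespace, the `row_to_triples` helper, and the convention that columns ending in `_id` are foreign keys.

```python
BASE = "http://example.org/"  # hypothetical namespace


def row_to_triples(table: str, row: dict) -> list:
    """Turn one database row into N-Triples-style lines keyed by URIs."""
    subject = f"<{BASE}{table}/{row['id']}>"
    triples = []
    for column, value in row.items():
        if column == "id":
            continue  # the primary key is already part of the subject URI
        predicate = f"<{BASE}schema/{table}#{column}>"
        if column.endswith("_id"):
            # Assumed convention: a foreign key becomes a link to another URI.
            obj = f"<{BASE}{column[:-3]}/{value}>"
        else:
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        triples.append(f"{subject} {predicate} {obj} .")
    return triples


row = {"id": 1, "name": "Alice", "employer_id": 7}  # made-up row
for line in row_to_triples("person", row):
    print(line)
```

No human markup and no NLP is involved: the identity comes from the URI pattern and the schema comes from the column names the database already has.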
Outline • Opening Joke • Some personal history • My failed predictions • Lessons learned? • A glimpse into the future • Closing joke
Lessons learned • People who make bad predictions still get to be invited speakers! • The unimpressed scientist syndrome • Applications that are needed will just happen • Better not perfect • People really want to share their knowledge • Scalability of people on the web • Scale happens
The Unimpressed Scientist • Be more open minded • Tend to “accept” rather than “reject” • Don’t confuse the trunk for the elephant • The evaluation criterion is not whether it will work, but whether it is needed
Better not Perfect • Improvements are important • So ask yourself, “Is this better” • Nit-picking usually is not important • The boundary conditions matter, but aren’t everything • Measurement, experimental conditions, become critical • What is “better”? • NLP perhaps takes this too far
Scalability • Faster, bigger computers • Better distribution • People on the web • The Captchas story • Heuristics, statistics
People want to share their knowledge • Shouldn’t be a surprise, this is what motivates us • Still, most people are idiots • …so… • Pure openness doesn’t work, but • Reviews, feedback, “how valuable”, etc. seem to work
Outline • Opening Joke • Some personal history • My failed predictions • Lessons learned? • A glimpse into the future • Closing joke
Promising trends • Almost back to the 80s • KA with semantic wikis • E.g. ontoworld.org, Halo • NLP and KR are coming back together • Powerset, etc. • Collaborative, large, KBs • DBpedia, Freebase • IMDb, WordNet • Cyc • Scalable reasoning • SHER • Rules • RIF BLD released (http://www.w3.org/TR/rif-bld) • RDF compatibility (http://www.w3.org/TR/rif-rdf-owl)
Important Problems • API incompatibility • Connotation vs. Denotation • URIs provide identity, but what do they mean • Coreference, disambiguation, word sense • Experimental methodology, measurement • E.g. precision & recall • Dependencies of results • The very long tail • Wherefore reasoning? • Ontology Quality, Evaluation
Grassroots to the Web • Early web dominated by “what it looks like” in Mosaic • Unimpressed UI and Hypertext researchers • Focus on spreading the word, not doing it right • Many early web pages didn’t have links in text at all • “Catalog” pages with lists of links • “Text” pages with few or no links • Embedded images more interesting than links • Just do it rather than do it right • But… • When the web became serious, the research started to matter
A little semantics… • The SW catchphrase • “A little semantics goes a long way” • Sometimes strengthened • A lot of semantics is too much • 80/20 rule • Double-edged sword • FOAF doesn’t look like even 1% • The simplicity of FOAF hides any serious value proposition for SW • SW not for people, for data • Important to get it right?
Some evidence • Does quality matter? • Good quality ontologies cost more • Required for some applications • Improvements in quality can improve performance [Welty et al., 2004] • 18% f-improvement in search • Cleanup cost ~1mw/3000 classes • BUT … low quality ontology still improved on the baseline
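Assuming "f-improvement" refers to the F-measure, the comparison can be sketched as follows. The document IDs and result sets are made up and do not reproduce the 18% figure; the sketch only shows how precision, recall, and F1 are computed for a baseline and an ontology-assisted run.

```python
from __future__ import annotations


def precision_recall_f1(retrieved: set, relevant: set) -> tuple[float, float, float]:
    """Compute precision, recall, and their harmonic mean (F1) for one query."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


relevant = {"d1", "d2", "d3", "d4"}       # made-up gold standard
baseline = {"d1", "d2", "d8", "d9"}       # made-up search results
with_ontology = {"d1", "d2", "d3", "d9"}  # made-up results after adding an ontology

print(precision_recall_f1(baseline, relevant))
print(precision_recall_f1(with_ontology, relevant))
```

Reporting a single F score hides which of the two components moved, so it helps to keep precision and recall alongside it.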
Wherefore Reasoning? • Very hard to “sell” OWL reasoning • Many users want very simple reasoning • Simple subclass • Simple range/domain constraints • Simple rules • Some users want more than OWL • But just to express their semantics • Improving precision? • Improving recall? Must be measured.
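The "very simple reasoning" many users ask for fits in a few lines. Below is a minimal sketch of transitive subclass inference plus domain/range typing, loosely in the spirit of RDFS; it is not an OWL reasoner, and the class names, property names, and helper functions are all hypothetical.

```python
def superclasses(cls, subclass_of):
    """Return every transitive superclass of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_types(facts, types, subclass_of, domain, range_):
    """Add types implied by property domains/ranges, then by subclass links."""
    inferred = {entity: set(classes) for entity, classes in types.items()}
    for subject, prop, obj in facts:
        if prop in domain:
            inferred.setdefault(subject, set()).add(domain[prop])
        if prop in range_:
            inferred.setdefault(obj, set()).add(range_[prop])
    for classes in inferred.values():
        for cls in list(classes):
            classes |= superclasses(cls, subclass_of)
    return inferred


# Made-up toy ontology and a single fact.
subclass_of = {"Professor": ["Researcher"], "Researcher": ["Person"]}
domain = {"teaches": "Professor"}
range_ = {"teaches": "Course"}
facts = [("alice", "teaches", "kr101")]

result = infer_types(facts, {}, subclass_of, domain, range_)
print(sorted(result["alice"]))  # ['Person', 'Professor', 'Researcher']
print(sorted(result["kr101"]))  # ['Course']
```

Whether inferences like these actually improve precision or recall in an application is the open question the slide raises, and it can only be answered by measuring.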
The very long tail • [Figure: long-tail curve, axis: frequency] • Ontologies, explicit semantics • Something else?
Question Answering • Q: What weapon was featured in the ballet “Fall River Legend?” • A: American Ballet Theatre • OK, add “weapon” to ontology…
Question Answering • Q: What gum’s motto was “Double your pleasure, double your fun”? • A: personal lubricant
Humans vs. Machines • Vision • Speech • Natural Language • Context awareness • Tacit knowledge • Learning • Socialization • Organization • Perfect memory • Calculation speed • Planning & scheduling • Games & simulation • Search • Networks
Outline • Opening Joke • Some personal history • My failed predictions • Lessons learned? • A glimpse into the future • Closing joke
Question Answering • Q: What president gave the longest inaugural speech? • A: Dieter Fensel • “Improvements” need to be measured • P ∝ 1/R