The Information Universe of the (Near) Future

Frank van Harmelen, Vrije Universiteit Amsterdam
Creative Commons License: allowed to share & remix, but must attribute & non-commercial

• What it will look like
• Why it needs infinite scalability
• How to achieve this with the Large Knowledge Collider

The Current Information Universe
• linked web-pages, written by people, written for people, used only by people...
• Example pages: "This page is about the Vrije Universiteit", "a web page in English about Frank", "and another web page about Frank", "And this page is about LarKC", "And this page is about Stefano"
• Many of these pages already come from data, usable by computers! But we can't link the data...

The Future Information Universe
• linked data, usable by computers! useful for people!

How far away is this? Not very far away!
The Linked Open Data cloud is growing rapidly and already holds many billions of facts & rules:
• every book sold by Amazon
• any CD ever recorded (almost)
• life-science databases
• basic facts on every country on the planet
• hierarchical dictionaries (UK, FR, NL)
• common sense rules & facts (100,000s)
• scientific bibliographies
• names of artists & art works (10,000s)
• geographic names (millions)
• encyclopedia
It gets bigger every month.

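Full Web-style decoupling: re-usability, independence
• All identifiers are URLs (= on the Web)
• Allows total decoupling of
  • data
  • vocabulary
  • meta-data
• Each part can have different owners & locations
• Example fact: [<x> IsOfType <T>], where x, T and the IsOfType vocabulary may each live somewhere else (e.g. T = <person>)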
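The decoupling of data, vocabulary and meta-data can be illustrated with a minimal sketch in Python. Everything named here is an assumption made for illustration: the `Triple` type, the `instances_of` helper, and the three `example.*` URL prefixes standing in for three different owners. It is not LarKC code and not a real RDF library.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One fact of the form [<subject> predicate <object>]; every part is a URL."""
    subject: str
    predicate: str
    obj: str


# Hypothetical owners and locations (reserved example domains, not real datasets).
DATA = "http://data.example.org/"          # owner 1: the data (x)
VOCAB = "http://vocab.example.com/"        # owner 2: the vocabulary (T, person)
META = "http://meta.example.net/"          # owner 3: the relation IsOfType

IS_OF_TYPE = META + "IsOfType"

triples = [
    Triple(DATA + "x", IS_OF_TYPE, VOCAB + "T"),
    Triple(DATA + "frank", IS_OF_TYPE, VOCAB + "person"),
]


def instances_of(facts: List[Triple], type_url: str) -> List[str]:
    """Return the subjects declared to be of the given type."""
    return [t.subject for t in facts
            if t.predicate == IS_OF_TYPE and t.obj == type_url]


if __name__ == "__main__":
    print(instances_of(triples, VOCAB + "person"))
```

Because every identifier is a full URL, a fact can combine a subject, a relation and a type published by three independent parties without any of them coordinating or copying each other's data.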

For the first time ever, it is now possible to re-use somebody else's knowledge base
• without having to talk to them first (syntax, semantics)
• without having to make copies

Rapid growth: the "billion triple challenge" (= machine-reason with a billion facts and rules)
• 2006: “where do we get a billion facts from?”
• 2008: “which billion shall we choose?”

What to do when success is becoming a problem?
The Large Knowledge Collider: a platform for infinitely scalable reasoning on the data-web

Infinite scalability?
• parallelisation
  • cluster computing
• distribution
  • “Thinking@home”, “self-computing semantic Web”
• approximation
  • “almost” is often good enough
  • gets better with more resources

First result: MaRVIN ("brain the size of a planet")
• MaRVIN scales by:
  • distribution (over many nodes)
  • approximation (sound but incomplete)
  • anytime convergence (more complete over time)
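The three properties listed above (distribution, sound-but-incomplete approximation, anytime convergence) can be illustrated with a toy sketch. This is not MaRVIN's actual algorithm; it is a simplified illustration under stated assumptions: the only inference rule is transitivity of a `subClassOf` relation, "nodes" are simulated as in-memory partitions, and the names `derive_local` and `anytime_closure` are hypothetical.

```python
import random


def derive_local(triples):
    """One node applies one sound rule (subClassOf transitivity) to its own triples."""
    derived = set()
    for (a, p, b) in triples:
        if p != "subClassOf":
            continue
        for (c, q, d) in triples:
            if q == "subClassOf" and b == c:
                derived.add((a, "subClassOf", d))
    return derived - triples


def anytime_closure(facts, n_nodes=3, rounds=10, seed=0):
    """Yield a growing set of known triples after each round.

    Each round shuffles the known triples over n_nodes partitions (distribution);
    each node only sees its own partition, so it may miss inferences (incomplete),
    but whatever it derives is correct (sound). Later rounds can catch up.
    """
    rng = random.Random(seed)
    known = set(facts)
    for _ in range(rounds):
        items = sorted(known)
        rng.shuffle(items)
        partitions = [set(items[i::n_nodes]) for i in range(n_nodes)]
        for part in partitions:
            known |= derive_local(part)
        yield set(known)


if __name__ == "__main__":
    chain = {("A", "subClassOf", "B"),
             ("B", "subClassOf", "C"),
             ("C", "subClassOf", "D")}
    for i, snapshot in enumerate(anytime_closure(chain, rounds=5), start=1):
        print(f"round {i}: {len(snapshot)} triples known")
```

The caller can stop consuming snapshots at any time: every intermediate answer contains only valid conclusions, and spending more rounds can only add conclusions, never retract them.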

Use case: Drug Discovery
• Problem: pharmaceutical R&D in early clinical development is stagnating
• FDA white paper Innovation or Stagnation (March 2004):
  • “developers have no choice but to use the tools of the last century to assess this century's candidate solutions.”
  • “industry scientists often lack cross-cutting information about an entire product area, or information about techniques that may be used in areas other than theirs”
• Current NCBI: linking but no inference (across genetics, chemistry and literature)
• Example queries:
  • “Show me all liver toxicity associated with compounds with similar structure”
  • “Show me all liver toxicity associated with the target or the pathway.”
  • “Show me all liver toxicity from the public literature and internal reports that are related to the drug class, disease and patient population”
  • Combined (Q1, Q2, Q3): “Show me any potential liver toxicity associated with the compound’s drug class, target, structure and disease.”

Use Case: City on-line
• Our cities face many challenges
• Urban Computing is the ICT way to address them and improve the quality of life
• Example questions:
  • Where is the traffic moving?
  • Where are people concentrating?
  • Which landmarks attract more people?
  • Which location attracts most people right now?
  • Is public transportation where people are?
  • Is public transportation where people will be?

Is anybody doing this for real?
• OpenCalais:
  • enrich text (news items) with semantic meta-data
  • recognise people, places, events, organisations,...
  • useful for searching, selecting, personalising, aggregating, summarising, etc.
• From early ’09:
  • identify “people, places, events, organisations,...” by linking to the Open Data cloud

Summarising
The Information Universe of the Future will be a Web of Data
• This Web of Data is rapidly taking shape
• There are compelling use-cases
• Industrial take-up is beginning to happen
• We are building new infrastructure to deal with the required scale

Contact Info
Frank.van.Harmelen@cs.vu.nl
http://www.larkc.eu
• Want to ask questions?
• Want to play with LarKC?
• Want to contribute plugins?
• Want to run a use-case?
