
The Information Universe of the (Near) Future

The Information Universe of the (Near) Future. Creative Commons License: allowed to share & remix, but must attribute & non-commercial. What it will look like, why it needs infinite scalability, and how to achieve this with the Large Knowledge Collider.


Presentation Transcript


  1. The Information Universe of the (Near) Future. Frank van Harmelen, Vrije Universiteit Amsterdam. Creative Commons License: allowed to share & remix, but must attribute & non-commercial

  2. The Information Universe of the (Near) Future • What it will look like • Why it needs infinite scalability • And how to achieve this with the Large Knowledge Collider

  3. The Current Information Universe: linked web pages, written by people, written for people, used only by people... • “a web page in English about Frank” • “and another web page about Frank” • “This page is about the Vrije Universiteit” • “And this page is about LarKC” • “And this page is about Stefano” • Many of these pages already come from data, but we can’t link the data... The Future Information Universe: linked data, usable by computers, useful for people!

  4. How far away is this? Not very far away! The rapidly growing Linked Open Data cloud already holds many billions of facts & rules: • every book sold by Amazon • (almost) any CD ever recorded • life-science databases • basic facts on every country on the planet • hierarchical dictionaries (UK, FR, NL) • common sense rules & facts (100,000s) • scientific bibliographies • names of artists & art works (10,000s) • geographic names (millions) • encyclopedia • It gets bigger every month

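  5. Full Web-style decoupling: re-usability, independence • All identifiers are URLs (= on the Web) • Allows total decoupling of data, vocabulary and meta-data, each with different owners & locations • Example: [<x> IsOfType <T>], where x is the data, T is the vocabulary term (e.g. <person>) and the statement itself is the meta-data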
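
The decoupling on this slide can be illustrated with a minimal Python sketch. It is only an illustration, not code from the presentation: the `Statement` type, the `owners` helper and all `example.*` URLs are hypothetical names chosen here. It shows one statement of the form [<x> IsOfType <T>] whose data item, vocabulary term and publishing location sit on three different hosts.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class Statement(NamedTuple):
    """One meta-data statement of the form [<x> IsOfType <T>]."""
    published_at: str  # where the statement itself lives (meta-data)
    subject: str       # the data item x
    predicate: str     # the relation, here "IsOfType"
    obj: str           # the vocabulary term T


# All URLs below are made up for illustration.
stmt = Statement(
    published_at="http://meta.example.com/statements/1",
    subject="http://data.example.org/people/frank",
    predicate="IsOfType",
    obj="http://vocab.example.net/terms/person",
)


def owners(s: Statement) -> dict:
    """Map each role (data, vocabulary, meta-data) to the host that serves it."""
    return {
        "data": urlparse(s.subject).netloc,
        "vocabulary": urlparse(s.obj).netloc,
        "meta-data": urlparse(s.published_at).netloc,
    }


hosts = owners(stmt)
for role, host in hosts.items():
    print(f"{role}: {host}")
```

Because every identifier is a URL, none of the three parties needs to coordinate with the others to publish its part.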

  6. For the first time ever, it is now possible to re-use somebody else's knowledge base • without having to talk to them first (syntax, semantics) • without having to make copies. Rapid growth: the "billion triple challenge" (= machine-reason with a billion facts and rules) • 2006: “where do we get a billion facts from?” • 2008: “which billion shall we choose!”

  7. What to do when success is becoming a problem? The Large Knowledge Collider: a platform for infinitely scalable reasoning on the data-web

  8. Infinite scalability? • parallelisation: cluster computing • distribution: “Thinking@home”, “self-computing semantic Web” • approximation: “almost” is often good enough, and it gets better with more resources

  9. First result: MaRVIN (“brain the size of a planet”) • MaRVIN scales by: • distribution (over many nodes) • approximation (sound but incomplete) • anytime convergence (more complete over time)
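
To make "sound but incomplete" and "anytime convergence" concrete, here is a minimal single-machine Python sketch. It is not MaRVIN's algorithm and does not show distribution over nodes; the function name `anytime_closure`, the `budget` parameter and the toy facts are assumptions made for illustration. It derives transitive consequences of a relation and stops after a fixed number of derivations, so every answer is correct and a larger budget yields a more complete result.

```python
def anytime_closure(facts, budget):
    """Derive transitive consequences of (a, b) pairs, stopping after
    `budget` derivations.

    Every returned pair is a genuine consequence (sound). With a small
    budget some consequences are still missing (incomplete).
    """
    known = set(facts)
    derived = 0
    changed = True
    while changed and derived < budget:
        changed = False
        # Iterate over sorted snapshots so the set can grow safely.
        for (a, b) in sorted(known):
            for (c, d) in sorted(known):
                if b == c and (a, d) not in known:
                    known.add((a, d))
                    derived += 1
                    changed = True
                    if derived >= budget:
                        return known
    return known


# Toy chain a -> b -> c -> d (hypothetical data).
facts = {("a", "b"), ("b", "c"), ("c", "d")}

partial = anytime_closure(facts, budget=1)   # stopped early
full = anytime_closure(facts, budget=100)    # enough budget to finish

print(len(partial), len(full))
```

The partial result is always a subset of the full one, which is the property that lets a reasoner hand out usable answers at any time and improve them as more resources become available.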

  10. Use case: Drug Discovery • Problem: pharmaceutical R&D in early clinical development is stagnating • FDA white paper “Innovation or Stagnation” (March 2004): “developers have no choice but to use the tools of the last century to assess this century's candidate solutions.” “industry scientists often lack cross-cutting information about an entire product area, or information about techniques that may be used in areas other than theirs” • Current NCBI (Genetics, Chemistry, Literature): linking but no inference • Queries: “Show me all liver toxicity associated with compounds with similar structure” • “Show me all liver toxicity associated with the target or the pathway.” • “Show me all liver toxicity from the public literature and internal reports that are related to the drug class, disease and patient population” • “Show me any potential liver toxicity associated with the compound’s drug class, target, structure and disease.” (Q1Q2Q3)

  11. Use Case: City on-line • Our cities face many challenges • Urban Computing is the ICT way to address them: improve the quality of life • Is public transportation where the people are? • Where is the traffic moving? • Which location attracts most people right now? • Is public transportation where people will be? • Which landmarks attract more people? • Where are people concentrating?

  12. Is anybody doing this for real? • OpenCalais: • enrich text (news items) with semantic meta-data • recognise people, places, events, organisations,... • useful for searching, selecting, personalising, aggregating, summarising, etc. • From early ’09: • identify “people, places, events, organisations,...” by linking to the Open Data cloud (page labels in the figure: “And this page is about Stefano”, “And this page is about LarKC”)

  13. Summarising: the Information Universe of the Future will be a Web of Data • This Web of Data is rapidly taking shape • There are compelling use-cases • Industrial take-up is beginning to happen • We are building new infrastructure to deal with the required scale

  14. Contact Info: Frank.van.Harmelen@cs.vu.nl, http://www.larkc.eu • Want to ask questions? • Want to play with LarKC? • Want to contribute plugins? • Want to run a use-case?
