
Free your Data: Instant Gratification with the Semantic Web

Free your Data: Instant Gratification with the Semantic Web. David Karger. Why everyone should be their own database administrator, UI designer, application developer, and web site builder, and how they can. David Karger. A Semantic Web Vision.


Presentation Transcript


  1. Free your Data: Instant Gratification with the Semantic Web David Karger

  2. Why everyone should be their own database administrator, UI designer, application developer, and web site builder, and how they can David Karger

  3. A Semantic Web Vision • Autonomous computational agents perform sophisticated information tasks on behalf of their human users • Use data that is annotated with rich semantics • Ontologies that explain precisely what the data means • Schema annotations that explain how to align multiple ontologies • Rules that explain how new data can be formally derived from existing • Inference systems that put it all together • Lots of logicians and AI researchers developing tools • This vision is frightening • Involves solving problems that have bedeviled AI for decades • Often used to attack the semantic web • Or to argue to slow down deployment • “we can’t put up that data until we have an ontology!”

  4. Aim Lower: the Semimantic Web • Not “make computers help” but “make them not hinder” • “First, do no harm” • Create a tiny bit of structure: • Name objects (with URLs) • Record named relations between them • No semantics on relations • No schemas • No inference • This is both • Technically simple • Immediately useful • You should do it • And you can right now
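A minimal sketch of the "tiny bit of structure" this slide asks for, assuming Python and made-up example.org URLs (neither appears in the talk). Objects are named by URLs and relations are just named links; there is no schema and no inference.

```python
# Each fact is a (subject, relation, object) triple.
# All URLs and values below are hypothetical, for illustration only.
facts = set()

def relate(subject, relation, obj):
    """Record one named relation between two named things."""
    facts.add((subject, relation, obj))

def describe(subject):
    """Return every (relation, object) pair recorded for a subject."""
    return sorted((r, o) for s, r, o in facts if s == subject)

SONG = "http://example.org/song/42"
relate(SONG, "http://example.org/rel/tempo", "120 bpm")
relate(SONG, "http://example.org/rel/difficulty", "beginner")

print(describe(SONG))
```

Adding a new kind of property is just another call to `relate`; nothing has to be declared first.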

  5. Why Applications? • Typical user tasks require interaction with multiple pieces of information • Display • Explore • Query • Manipulate • Applications bring together the data, specialized views, and operations necessary to perform tasks

  6. [Annotated screenshot; callouts:] Irrelevant info: distracting, covers up more important info • Artist: of dance, not music • ID3v2 added “Composer”: shown in wrong place • Menu of genre choices • My genre (of dance, not music) missing • ID3v2 lets user add • No “difficulty” field • Place in comment field • Uses field up • Where to put “tempo”?

  7. Summary of Problems • Application has fixed idea of “right” data • Both properties and values for them • And right way to display that data • User wants to “stretch” the app to their needs • Cannot hide irrelevant data • Cannot incorporate new kinds of data • Cannot change how data is presented • Perhaps just use generic comment field? • Add what you want • Format how you want

  8. Properties have structure • Used for layout • And for browsing

  9. Sometimes, one application isn’t enough • Applications inappropriately partition task • Because task wasn’t planned for in application design • No application has all the necessary data, operations • Need to launch several to do task • Each includes unneeded data, operations • Clutter distracts from what you need to see • Can’t work with data “across” application boundaries • Can’t record or view data connections • Have to find it again in second application • Or enter it manually a second time • Type budget numbers on Post-its to move to other application

  10. Why? • Building applications is hard • Done by expert few for the many • They determine which data, views, operations are useful • Applications are “mass produced” • Everybody gets the same one • And only build for large markets • Word processor, email, photo album, … • Problem: different people want different applications • Basket weaving, UFO sightings, junkyard management • Want to work with unusual information • Want to see, navigate, manipulate it “their way” • Developers can’t afford to build these boutique applications

  11. What about the Web? • Anything can get a URL • Anything can go in a page, linked to anything • Common to “schematize on the fly”, making lists of interesting properties/values • Support for orienteering • Scan list of choices • Pick the one that seems to lead in the right direction • Fact: people orienteer even when there’s an easy query that is faster • On web, never bounce off an application boundary

  12. Downside • Hard to author • Especially if I want to record lots of complex data • Hard to manipulate, do complex queries • HTML loses meaning of data • Can’t “switch to tabular view” • That’s why web sites are backed by databases • Data is kept structured to support complex queries • Templating engines convert to human readable presentation • End users aren’t going to manage this kind of web site • Gives powerful operations, but only “inside” web site • User may discover need to cross site boundaries • Like applications, web sites create (possibly wrong) data partitions • So all the problems with applications apply here too

  13. Not just music • Scientific research generates masses of data • E.g. Bioinformatics • Others want to access that data • Big standards bodies meet to decide on community standard formats and systems under which everyone will distribute data • When a scientist wants to try or report something new, or needs data from outside the community, they are stuck

  14. Information Wants to be Free • Applications and Web Sites make assumptions about how their data will be used • Those assumptions are hard-coded into the interaction with the data • But no developer can predict all uses of the data • Fixed interfaces prevent data repurposing • Solution: give direct access to the data • Just set up a SQL server? • (A long-running screed of the DB community)

  15. But it Can’t be Just about the Data • People need to look at the data • (unless we figure out those autonomous agents…) • And need to create it in the first place • Apps and template-driven web sites give us nice interfaces for interacting with the data they manage • But if we use them we can’t repurpose the data • And what interface can we use for the repurposed data? • Web needed a server (of data) and a client (to show it) • How do we make viewing, authoring and repurposing arbitrary data as easy as viewing and authoring web pages? • Without knowing precisely what data people will want to view or how they will want to view it?

  16. Example: Piggy Bank • I need data from more than one web site • And I need to look at it differently than any web site • What is minimum necessary support? • Piggy Bank: A Firefox plugin for navigating structured data

  17. Find some movies

  18. Free that data

  19. Show it a different way

  20. Combine it with other sources

  21. Mash Ups? • Developer decides to integrate data from multiple sites • Writes programmatic “scrapers” • Reverse the web site’s templating process to recover data • Combines resulting data structures • Presents using their own template-driven web site • Thus guilty of same sin as the one they are fighting • I only get the mash-ups a programmer decides to create • Piggy Bank lets end users do their own mashing
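The scrape-and-combine steps on this slide can be sketched roughly as follows. This is an illustrative Python sketch, not Piggy Bank's actual scraper mechanism: the two HTML snippets, the regular expressions, and the `rating` field are all hypothetical.

```python
import re

# Hypothetical output of two sites, each produced by its own template.
SITE_A = '<li class="movie"><b>Superman</b> at <i>Loews</i></li>'
SITE_B = '<tr><td>Superman</td><td>rating: 7</td></tr>'

def scrape_a(html):
    """Reverse site A's template to recover (title, venue) records."""
    pattern = r'<li class="movie"><b>(.*?)</b> at <i>(.*?)</i></li>'
    return [{"title": t, "venue": v} for t, v in re.findall(pattern, html)]

def scrape_b(html):
    """Reverse site B's template to recover (title, rating) records."""
    pattern = r'<tr><td>(.*?)</td><td>rating: (\d+)</td></tr>'
    return [{"title": t, "rating": int(r)} for t, r in re.findall(pattern, html)]

def mash(*record_lists):
    """Combine records from several sources, merging those that share a title."""
    merged = {}
    for records in record_lists:
        for rec in records:
            merged.setdefault(rec["title"], {}).update(rec)
    return merged

combined = mash(scrape_a(SITE_A), scrape_b(SITE_B))
print(combined)
```

Once the data is back in structured form, how it is presented is a separate decision, which is the point the slide makes against hard-wired mash-up sites.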

  22. Data Model

  23. RDF • W3C standard • Minimum data model • URL for arbitrary objects • Arbitrary named links between two objects • No schemas • Much like the web, except • URLs need not be web pages • Machine readable “anchor text” in links • Yet Powerful • Relations are natural/universal • Represent a semantic network [Diagram: nodes Superman, Loew’s, 8PM, Kendall Sq., Movie, Theater; link labels type, title, venue, time, location, type]
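A small sketch of the movie example as a graph of named links, in plain Python. The example.org URLs are made up, and which node each link connects is an assumption, since the transcript preserves only the diagram's labels.

```python
# Hypothetical URLs; the wiring of the links is a guess at the slide's diagram.
EX = "http://example.org/"
triples = [
    (EX + "showing1", "type", EX + "Movie"),
    (EX + "showing1", "title", "Superman"),
    (EX + "showing1", "time", "8PM"),
    (EX + "showing1", "venue", EX + "loews"),
    (EX + "loews", "type", EX + "Theater"),
    (EX + "loews", "title", "Loew's"),
    (EX + "loews", "location", "Kendall Sq."),
]

def objects(subject, relation):
    """All objects linked from `subject` by the named relation."""
    return [o for s, r, o in triples if s == subject and r == relation]

def follow(subject, *path):
    """Follow a chain of named links, e.g. venue and then location."""
    current = [subject]
    for relation in path:
        current = [o for node in current for o in objects(node, relation)]
    return current

print(follow(EX + "showing1", "venue", "location"))
```

Because link names are machine readable, a question like "where is this movie showing?" becomes a walk along two links rather than a parse of page text.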

  24. Are we done? • Is RDF the only answer? • SQL/Tuples, XML can represent same info • So any would do • And user shouldn’t have to know which we’ve chosen • But RDF is easiest to create sloppily, incrementally • So best suited to let enthusiasts create some • And imposes fewest requirements to be “compatible” • Is RDF the whole answer? • Still unclear how to interact with it

  25. Visualization

  26. Lenses • If data is amorphous, monolithic UI won’t do • Can’t know in advance what kind of data we’ll need to display • Or what user will want to do with that data • Let each type come with “view prescription” • “To display a document, show its title, author, and abstract • “To display a person, show his name and affiliation” • Specifies properties to show, and “decoration” (fonts, layouts) • After you get the data, assemble lenses to show it • (recursively) • Lenses are described in RDF • So they can be collected, repurposed like any other data
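A rough Python sketch of the "view prescription" idea, with lenses held as plain data and applied recursively. The object identifiers, property values, and dictionary layout are illustrative assumptions; real lenses are described in RDF.

```python
# Hypothetical data: each object has a type and some properties.
data = {
    "doc1": {"type": "Document", "title": "Free your Data",
             "author": "person1", "abstract": "Why and how."},
    "person1": {"type": "Person", "name": "David Karger", "affiliation": "MIT"},
}

# A lens is itself data: for each type, the properties to show, in order.
lenses = {
    "Document": ["title", "author", "abstract"],
    "Person": ["name", "affiliation"],
}

def render(obj_id, indent=0):
    """Render an object through the lens for its type, as a list of text lines."""
    obj = data[obj_id]
    pad = " " * indent
    lines = []
    for prop in lenses[obj["type"]]:
        value = obj[prop]
        if value in data:
            # The value is another object: show it through its own lens.
            lines.append(f"{pad}{prop}:")
            lines.extend(render(value, indent + 2))
        else:
            lines.append(f"{pad}{prop}: {value}")
    return lines

print("\n".join(render("doc1")))
```

Changing what a document looks like means editing the `lenses` entry, not the `render` code, which is the sense in which lenses are described rather than programmed.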

  27. Fresnel

      dsp:publicationLens rdf:type :Lens ;
          :classLensDomain ow:Publication ;
          :group gr:group ;
          :purpose :defaultLens ;
          :showProperties ( dc:description dc:identifier dc:creator
                            dc:contributor dc:date dc:subject dc:type
                            dc:publisher dc:rights ) .

      dsp:rightsFomat rdf:type :Format ;
          :group gr:group ;
          :propertyFormatDomain dc:rights ;
          :propertyStyle "dspace-rights" .

  28. Benefits • Data collected from anywhere can be viewed together • Each piece of data with its own lens • Lenses are described, not programmed • Enthusiasts can write their own • (especially if we give them WYSIWYG tools) • No need to build a template-driven web site • Just edit, publish some lenses

  29. Manipulation

  30. Application Development by End Users • People want applications to manipulate their data • But applications only manipulate developer’s data • So let end users build their own • Use lenses, but refract in both directions • Lenses describe how to map data to presentation • Invert, interpret manipulation of presentation as manipulation of data • (extend lenses to talk about click, drag, drop) • Operations represented as web services • Internal and remote operations • Receive RDF data and act on it
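"Refracting in both directions" can be sketched as a lens that is read forward to build the presentation and backward to interpret an edit. This is a minimal illustration in Python; the record, the labels, and the function names are hypothetical.

```python
# Hypothetical record and lens; the lens pairs display labels with data properties.
song = {"title": "Tango No. 1", "tempo": "120"}
lens = [("Title", "title"), ("Tempo (bpm)", "tempo")]

def present(obj, lens):
    """Forward direction: map data to rows of (label, shown value)."""
    return [(label, obj[prop]) for label, prop in lens]

def apply_edit(obj, lens, row, new_value):
    """Reverse direction: an edit to one presented row becomes an edit to the data."""
    _label, prop = lens[row]
    obj[prop] = new_value

apply_edit(song, lens, 1, "132")  # the user retypes the second row
print(present(song, lens))
```

The same lens description serves both directions, so no separate editing code is written per application.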

  31. The Big Picture

  32. Sufficient for Nice Applications? • Application design is impoverished • Divide up the screen • Put an object in each piece • Show properties of each object • With pretty formatting • Put operations in menus • And add some toolbars to save time • This application “vocabulary” is limited enough • to be described instead of programmed • so it can be edited by end users

  33. Workspace Designer • Editing mode for applications • Define regions of screen • By splitting existing regions • Resize regions • Specify content of each region • Object to be shown (drag and drop object) • Lens to use to show object (menu of relevant lenses) • Operations to make available on object (drag operations)
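One way to picture a workspace built by splitting regions is as a tree whose leaves each hold an object, a lens, and some operations. The sketch below is an assumed Python data structure, not the designer's actual implementation; all object, lens, and operation names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Region:
    obj: Optional[str] = None         # object shown in this region
    lens: Optional[str] = None        # lens used to show it
    operations: List[str] = field(default_factory=list)
    direction: Optional[str] = None   # "horizontal" or "vertical" once split
    children: List["Region"] = field(default_factory=list)

    def split(self, direction):
        """Split this region in two; its current content moves to the first half."""
        first = Region(self.obj, self.lens, self.operations)
        second = Region()
        self.obj, self.lens, self.operations = None, None, []
        self.direction, self.children = direction, [first, second]
        return first, second

    def leaves(self):
        """Regions that actually display something (the unsplit ones)."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

workspace = Region(obj="paper-draft", lens="document-lens")
_, todo = workspace.split("vertical")
todo.obj, todo.lens = "things-to-do", "list-lens"
todo.operations.append("mark-done")
print([(r.obj, r.lens) for r in workspace.leaves()])
```

Since the whole layout is a small data structure, it can be saved, shared, and edited like any other data.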

  34. Writing a Brain Research Paper

  35. Adding “Things to Do” Region

  36. Revised Application

  37. Lens Designer • Specify how a particular object can be shown • Similar to workspace designer • Lens is “workspace” for viewed object • Subdivide canvas • Specify property to show in each region • Specify lens for value of each property

  38. Drug Discovery Dashboard (http://www.w3.org/2005/04/swls/BioDash) [Diagram labels: Topic: GSK3beta • Topic • Disease: DiabetesT2 • Alt Dis: Alzheimers • Target: GSK3beta • Cmpd: SB44121 • CE: DBP • Team: GSK3 Team • Person: John • Related Set • Path: WNT]
