Microdata in HTML 5.0
Technologies for Web Application Development. Martin Nečaský, Department of Software Engineering, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic.
Machine-readability
• machine readability of an HTML page means the capability of machines to interpret the data on that page
• HTML 5.0 elements allow for machine readability only partly, e.g.
  • time element
  • address element
• we could keep standardizing more and more semantic elements …
  • … but it would be wrong
  • it is not possible to standardize everything
• we need a way to extend pages freely towards machine readability
Machine-readability
<div>
  Hi, I’m Martin Nečaský and I work at Charles University in Prague.
</div>
<h2>Interesting Events</h2>
<section>
  <div>
    <a href="http://www.cs.vsb.cz/dateso/2011/">DATESO 2011</a>
  </div>
  <div>After a year, we will again meet at DATESO 2011. We ...</div>
  <ul>
    <li><b>When:</b>
      <time datetime="2011-04-20">April 20th 2011</time> to
      <time datetime="2011-04-24">April 24th 2011</time>
    </li>
    <li><b>Where:</b>
      <address>Hotel OtavArena, Burketova 303, 397 01, Písek</address>
    </li>
  </ul>
</section>
Machine-readability
• it is useful to increase the machine-readability of your web pages by annotating content with machine-readable values
• How to achieve machine-readability?
  • microformats
  • microdata
  • RDFa
Microdata for machine-readability
• microdata allows nested groups of name-value pairs to be added to documents in addition to the “classical” content
• the basic microdata concept is the item
  • an item is a group of name-value pairs
  • item ~ object
  • name-value pair ~ attribute value
  • value: an atomic value (integer, string, date, …) or another item
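To make the item concept concrete, here is a minimal sketch of how an item could be held in memory. It assumes a purely illustrative representation: an item is a Python dict from property name to a list of values, and a nested item is simply another dict. The helper name `atomic_values` is hypothetical and not part of any microdata API.

```python
# Illustrative model only: an item is a dict {property name: [values]}.
# A value is either an atomic value (here a string) or another item (a dict).
event = {
    "summary": ["DATESO 2011"],
    "location": [
        {"street-address": ["Burketova 303"], "locality": ["Písek"]},
    ],
}


def atomic_values(item):
    """Yield (name, value) for every atomic value, descending into nested items."""
    for name, values in item.items():
        for value in values:
            if isinstance(value, dict):
                # The value is itself an item: recurse into its name-value pairs.
                yield from atomic_values(value)
            else:
                yield name, value


pairs = list(atomic_values(event))
print(pairs)
```

Each property maps to a list because the same property name may occur several times in one item.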
Microdata for machine-readability
• attribute itemscope
  • any HTML element having this attribute represents a single item
Microdata for machine-readability
<div itemscope>
  Hi, I’m Martin Nečaský and I work at Charles University in Prague.
</div>
<h2>Interesting Events</h2>
<section itemscope>
  <div>
    <a href="http://www.cs.vsb.cz/dateso/2011/">DATESO 2011</a>
  </div>
  <div>After a year, we will again meet at DATESO 2011. We ...</div>
  <ul>
    <li><b>When:</b>
      <time datetime="2011-04-20">April 20th 2011</time> to
      <time datetime="2011-04-24">April 24th 2011</time>
    </li>
    <li><b>Where:</b>
      <address itemscope>
        Hotel OtavArena, Burketova 303, 397 01, Písek
      </address>
    </li>
  </ul>
</section>
Microdata for machine-readability
• each item has a type
  • type ~ class, item ~ class instance
  • attribute itemtype
• the type should come from a standardized vocabulary
  • otherwise no one will be able to interpret our content
  • e.g. http://data-vocabulary.org
    • http://data-vocabulary.org/Person
    • http://data-vocabulary.org/Address
    • http://data-vocabulary.org/Event
    • http://data-vocabulary.org/Product
    • http://data-vocabulary.org/Review
Microdata for machine-readability
<div itemscope itemtype="http://data-vocabulary.org/Person">
  Hi, I’m Martin Nečaský and I work at Charles University in Prague.
</div>
<h2>Interesting Events</h2>
<section itemscope itemtype="http://data-vocabulary.org/Event">
  <div>
    <a href="http://www.cs.vsb.cz/dateso/2011/">DATESO 2011</a>
  </div>
  <div>After a year, we will again meet at DATESO 2011. We ...</div>
  <ul>
    <li><b>When:</b>
      <time datetime="2011-04-20">April 20th 2011</time> to
      <time datetime="2011-04-24">April 24th 2011</time>
    </li>
    <li><b>Where:</b>
      <address itemscope itemtype="http://data-vocabulary.org/Address">
        Hotel OtavArena, Burketova 303, 397 01, Písek
      </address>
    </li>
  </ul>
</section>
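As a rough illustration of what a consumer sees at this stage, the following sketch lists every element that carries itemscope together with its itemtype. It uses only Python's standard html.parser module; the class name `ItemFinder` and the shortened sample markup are assumptions made for this example.

```python
from html.parser import HTMLParser


class ItemFinder(HTMLParser):
    """Records (tag, itemtype) for every element that has an itemscope attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # a valueless attribute such as itemscope maps to None
        if "itemscope" in attrs:
            self.found.append((tag, attrs.get("itemtype")))


# Shortened version of the annotated page from the slide above.
SAMPLE = """
<div itemscope itemtype="http://data-vocabulary.org/Person">Hi, I'm Martin.</div>
<section itemscope itemtype="http://data-vocabulary.org/Event">
  <address itemscope itemtype="http://data-vocabulary.org/Address">Písek</address>
</section>
"""

finder = ItemFinder()
finder.feed(SAMPLE)
for tag, item_type in finder.found:
    print(tag, item_type)
```

At this point the items are typed but still empty: no name-value pairs have been marked up yet.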
Microdata for machine-readability
• each item has a set of properties
  • name/value pairs
  • attribute itemprop
• the property name should come from a standardized vocabulary
  • otherwise no one will be able to interpret our data
  • e.g. an item of type http://data-vocabulary.org/Person has the following properties:
    • name
    • photo
    • url
    • friend
    • address: street-address, locality, region, postal-code, country-name
    • …
Microdata for machine-readability
<div itemscope itemtype="http://data-vocabulary.org/Person">
  Welcome,<br/>
  my name is <span itemprop="name">Martin Nečaský</span>
  and I work as an <span itemprop="title">assistant professor</span>
  at <span itemprop="org">Charles University</span> in Prague.
</div>
Microdata for machine-readability
• and an item of type http://data-vocabulary.org/Event has the following properties:
  • summary
  • url
  • startDate
  • endDate
  • location
  • description
  • …
Microdata for machine-readability
<section itemscope itemtype="http://data-vocabulary.org/Event">
  <div>
    <a itemprop="url" href="http://www.cs.vsb.cz/dateso/2011/">
      <span itemprop="summary">DATESO 2011</span>
    </a>
  </div>
  <div itemprop="description">
    After a year, we will again meet at DATESO 2011. We ...
  </div>
  <ul>
    <li><b>When:</b>
      <time itemprop="startDate" datetime="2011-04-20">April 20th 2011</time> to
      <time itemprop="endDate" datetime="2011-04-24">April 24th 2011</time>
    </li>
    <li><b>Where:</b>
      <address itemprop="location" itemscope itemtype="http://data-vocabulary.org/Address">
        Hotel OtavArena,
        <span itemprop="street-address">Burketova 303</span>,
        <span itemprop="postal-code">397 01</span>,
        <span itemprop="locality">Písek</span>
      </address>
    </li>
  </ul>
</section>
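The annotated event can now be turned into nested name-value pairs by a program. The sketch below is one simplified way to do that with Python's standard html.parser; it is not a complete microdata implementation. The class name `MicrodataParser`, the dict layout, and the shortened sample are assumptions for illustration. Only three value rules are applied (href for a, datetime for time, text otherwise), whitespace in text values is collapsed for readability, and itemref is ignored.

```python
from html.parser import HTMLParser

VOID_TAGS = {"area", "br", "embed", "hr", "img", "input", "link", "meta",
             "source", "track", "wbr"}


class MicrodataParser(HTMLParser):
    """Collects top-level items as {"type": ..., "properties": {name: [values]}}."""

    def __init__(self):
        super().__init__()
        self.items = []  # finished top-level items
        self.stack = []  # elements that are currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        node = {"tag": tag, "attrs": attrs, "text": "", "item": None}
        if "itemscope" in attrs:
            node["item"] = {"type": attrs.get("itemtype"), "properties": {}}
        self.stack.append(node)
        if tag in VOID_TAGS:  # void elements never get a separate end tag
            self._close(self.stack.pop())

    def handle_data(self, data):
        for node in self.stack:  # text belongs to every open ancestor
            node["text"] += data

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not any(n["tag"] == tag for n in self.stack):
            return  # already closed, or a stray end tag
        while self.stack:
            node = self.stack.pop()
            self._close(node)
            if node["tag"] == tag:
                break

    def _enclosing_item(self):
        for node in reversed(self.stack):
            if node["item"] is not None:
                return node["item"]
        return None

    def _close(self, node):
        attrs = node["attrs"]
        names = (attrs.get("itemprop") or "").split()
        if node["item"] is not None:
            value = node["item"]  # the value is the nested item itself
        elif node["tag"] == "a":
            value = attrs.get("href", "")
        elif node["tag"] == "time" and "datetime" in attrs:
            value = attrs["datetime"]
        else:
            value = " ".join(node["text"].split())
        owner = self._enclosing_item()
        if names and owner is not None:
            for name in names:
                owner["properties"].setdefault(name, []).append(value)
        elif node["item"] is not None:
            self.items.append(node["item"])


SAMPLE = """
<section itemscope itemtype="http://data-vocabulary.org/Event">
  <a itemprop="url" href="http://www.cs.vsb.cz/dateso/2011/">
    <span itemprop="summary">DATESO 2011</span></a>
  <time itemprop="endDate" datetime="2011-04-24">April 24th 2011</time>
  <address itemprop="location" itemscope
           itemtype="http://data-vocabulary.org/Address">
    Hotel OtavArena, <span itemprop="locality">Písek</span>
  </address>
</section>
"""

parser = MicrodataParser()
parser.feed(SAMPLE)
event = parser.items[0]
print(event["properties"]["summary"], event["properties"]["endDate"])
```

The address is not a top-level item here: because its element also carries itemprop="location", it becomes the value of the event's location property.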
Microdata for machine-readability
• a deeper look: where does the value of a property come from?
• let E be an element with the itemprop attribute; then its value V is determined by the following table
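The table itself is not included here. As a hedged stand-in, the sketch below encodes the value rules as we understand them from the HTML microdata specification. The function name `property_value` and its parameters are hypothetical. Two cases are deliberately left out: an element E that also has itemscope (its value is then the nested item), and the resolution of relative URLs to absolute ones.

```python
# Elements whose property value is taken from a URL-carrying attribute.
URL_ATTRIBUTE = {
    "a": "href", "area": "href", "link": "href",
    "audio": "src", "embed": "src", "iframe": "src", "img": "src",
    "source": "src", "track": "src", "video": "src",
    "object": "data",
}


def property_value(tag, attrs, text_content):
    """Return the value V of element E (simplified; see the assumptions above).

    tag          -- lower-case element name of E
    attrs        -- dict of E's attributes
    text_content -- the text inside E
    """
    if tag == "meta":
        return attrs.get("content", "")
    if tag in URL_ATTRIBUTE:
        return attrs.get(URL_ATTRIBUTE[tag], "")
    if tag == "time" and "datetime" in attrs:
        return attrs["datetime"]
    return text_content  # every other element: its text content


print(property_value("time", {"datetime": "2011-04-24"}, "April 24th 2011"))
print(property_value("span", {}, "Písek"))
```

This is why the time elements in the event example contribute their datetime value rather than the human-readable date, and why the link contributes its URL rather than its label.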
Microdata for machine-readability
• a deeper look: what if the data is separated from my item (e.g. because of the page layout)?
• no problem, use the itemref attribute to refer to other elements that hold the item's properties
Microdata for machine-readability
<div itemscope itemtype="http://data-vocabulary.org/Person"
     itemref="myfriendsA myfriendsB">
  Welcome,<br/>
  my name is <span itemprop="name">Martin Nečaský</span>
  and I work as an <span itemprop="title">assistant professor</span>
  at <span itemprop="org">Charles University</span> in Prague.
</div>
<div id="myfriendsA">
  My friends:<br/>
  <a itemprop="friend" href="http://john.black.com">John Black</a><br/>
  <a itemprop="friend" href="http://bill.white.com">Bill White</a>
</div>
<div id="myfriendsB">
  My other friends:<br/>
  <a itemprop="friend" href="http://joe.pink.com">Joe Pink</a>
</div>
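The effect of itemref can be pictured as a merge: the item's own properties are combined with the properties found inside each referenced element. The sketch below assumes the properties have already been extracted into plain dicts by some earlier step; the function name `resolve_itemref` and the input layout are hypothetical.

```python
def resolve_itemref(own_properties, itemref, properties_by_id):
    """Merge an item's own properties with those found under the referenced IDs."""
    merged = {name: list(values) for name, values in own_properties.items()}
    for element_id in itemref.split():  # itemref is a space-separated list of IDs
        for name, values in properties_by_id.get(element_id, {}).items():
            merged.setdefault(name, []).extend(values)
    return merged


# Properties found inside the Person element itself.
own = {
    "name": ["Martin Nečaský"],
    "title": ["assistant professor"],
    "org": ["Charles University"],
}

# Properties found inside the elements with the given id attributes.
by_id = {
    "myfriendsA": {"friend": ["http://john.black.com", "http://bill.white.com"]},
    "myfriendsB": {"friend": ["http://joe.pink.com"]},
}

person = resolve_itemref(own, "myfriendsA myfriendsB", by_id)
print(person["friend"])
```

The IDs in itemref must be separated by spaces; a value such as "myfriendsAmyfriendsB" would be read as a single ID that matches nothing.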
Projects
• Who works with machine-readable web pages?
  • Google Rich Snippets
  • Yahoo SearchMonkey
  • Facebook Open Graph Protocol
  • DBPedia.org
  • …
• Are there any other standards?
  • http://www.foaf-project.org/
  • http://trac.usefulinc.com/doap
  • http://www.heppnetz.de/projects/goodrelations/
  • …
Projects - Google Rich Snippets
• consumes microformats, microdata and RDFa
• data from your web pages may be used by Google for searching and for displaying search results
Projects – Yahoo SearchMonkey
• consumes microformats
• the project was discontinued in 2010
  • but it has been integrated into Yahoo and Microsoft products, where development continues
  • http://www.ysearchblog.com/2010/08/17/news-about-our-searchmonkey-program
Projects – FB Open Graph Protocol
• enables you to integrate your web pages into the Facebook social network
• uses its own way to achieve machine-readability (via meta elements)
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:og="http://ogp.me/ns#"
      xmlns:fb="http://www.facebook.com/2008/fbml">
  <head>
    <title>The Rock (1996)</title>
    <meta property="og:title" content="The Rock"/>
    <meta property="og:type" content="movie"/>
    <meta property="og:url" content="http://www.imdb.com/..."/>
    <meta property="og:image" content="http://ia.media-imdb.com/..."/>
    <meta property="og:site_name" content="IMDb"/>
    <meta property="fb:admins" content="USER_ID"/>
    <meta property="og:description" content="A group of U.S. Marines, under ..."/>
    ...
  </head>
  ...
</html>
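Because the Open Graph data sits in meta elements, reading it is straightforward. A minimal sketch with Python's standard html.parser follows; the class name `OpenGraphParser` is an assumption, the sample is a shortened copy of the markup above, and keeping only the last value per property is a simplification.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects the content of <meta property="og:..."> and <meta property="fb:..."> elements."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("property")
        if name and name.startswith(("og:", "fb:")):
            # Simplification: a repeated property overwrites the earlier value.
            self.properties[name] = attrs.get("content", "")


SAMPLE = """
<html xmlns:og="http://ogp.me/ns#">
  <head>
    <title>The Rock (1996)</title>
    <meta property="og:title" content="The Rock"/>
    <meta property="og:type" content="movie"/>
    <meta property="og:site_name" content="IMDb"/>
  </head>
</html>
"""

og = OpenGraphParser()
og.feed(SAMPLE)
print(og.properties["og:title"], og.properties["og:type"])
```

Unlike microdata, these values are not tied to the visible content of the page: the title element reads "The Rock (1996)" while og:title says "The Rock".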