This article delves into the realm of linked data mashups, highlighting the journey from querying to data visualization using SPARQL. We explore how RDF triples are constructed with subjects, predicates, and objects using URIs—focusing on New York as a case study. The piece discusses current technologies supporting linked data, such as APIs and data catalogs, while addressing challenges like lack of standards and data silos. Additionally, it showcases the SPARQL query language’s capabilities in semantic web applications, emphasizing the potential for decentralized data linking and modularity.
Linked Data Mashups: From Query to Visualization Dominic DiFranzo
RDF Triple: Subject Predicate Object Use URI for universal naming New York has the postal abbreviation NY <urn:x-states:New%20York> <http://purl.org/dc/terms/alternative> "NY" .
Linking I found a new dataset and it has the following triple: <http://dbpedia.org/page/New_York> <http://dbpedia.org/ontology/Place/otherName> "The Empire State" .
owl:sameAs <urn:x-states:New%20York> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/page/New_York> .
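To make the effect of the owl:sameAs link concrete, here is a minimal JavaScript sketch, not part of the original slides. It stores the three triples above as plain arrays and gathers every fact known about New York under either URI. The helper names `sameAsClosure` and `describe` are hypothetical, and a real application would use an RDF store rather than an in-memory array.

```javascript
// The three triples from the slides, as [subject, predicate, object] arrays.
const stateTriples = [
  ["urn:x-states:New%20York", "http://purl.org/dc/terms/alternative", "NY"],
  ["http://dbpedia.org/page/New_York", "http://dbpedia.org/ontology/Place/otherName", "The Empire State"],
  ["urn:x-states:New%20York", "http://www.w3.org/2002/07/owl#sameAs", "http://dbpedia.org/page/New_York"],
];

const OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs";

// Hypothetical helper: every URI reachable from `uri` through owl:sameAs,
// following the link in either direction.
function sameAsClosure(uri, triples) {
  const seen = new Set([uri]);
  const queue = [uri];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const [s, p, o] of triples) {
      if (p !== OWL_SAME_AS) continue;
      const other = s === current ? o : o === current ? s : null;
      if (other !== null && !seen.has(other)) {
        seen.add(other);
        queue.push(other);
      }
    }
  }
  return seen;
}

// Hypothetical helper: all non-sameAs facts about a resource and its aliases.
function describe(uri, triples) {
  const aliases = sameAsClosure(uri, triples);
  return triples
    .filter(([s, p]) => aliases.has(s) && p !== OWL_SAME_AS)
    .map(([, p, o]) => [p, o]);
}

const facts = describe("urn:x-states:New%20York", stateTriples);
console.log(facts);
```

Asking about either URI returns both the postal abbreviation and the nickname, which is the point of linking at the data level.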
Current Technology • Sunlight Foundation’s National Data Catalog, Socrata, Open311 API, and Microsoft’s Open Government Data Initiative, etc • Store in some backend, release data through an API.
Challenges • Can only ask what it's built to answer • No standard - must relearn each time • Opaque - no way for consumers to see, reuse or improve the data model • Silos of data - no linking at the data level • Very top-down
Linked Data • decentralized - sources may be spread out and referenced across the Web • modular - linked without advance planning or coordination • scalable - once a store is in place, it's easy to extend • advantages hold even when the definitions and structure of the data change over time.
Sparql SPARQL is a query language for the Semantic Web.
Sparql SELECT ?node ?title WHERE{ ?node <http://purl.org/dc/elements/1.1/title> ?title . } LIMIT 1
Long! SELECT ?node ?name WHERE{ ?node <http://xmlns.com/foaf/0.1/givenname> ?name . ?node <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> . } LIMIT 10
Prefix PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> SELECT ?node ?name WHERE{ ?node foaf:givenname ?name . ?node rdf:type foaf:Person . } LIMIT 10
Shortcuts PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> SELECT ?node ?name WHERE{ ?node foaf:givenname ?name ; rdf:type foaf:Person . } LIMIT 10
Named Graph PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?graph ?node ?title WHERE{ GRAPH ?graph{ ?node dc:title ?title . } } LIMIT 3
Named Graph PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?node8 ?desc8 ?node401 ?desc401 WHERE{ GRAPH <http://data-gov.tw.rpi.edu/vocab/Dataset_401>{ ?node401 dc:description ?desc401 . } GRAPH <http://data-gov.tw.rpi.edu/vocab/Dataset_8>{ ?node8 dc:description ?desc8 . } } LIMIT 3
Union PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?node8 ?desc8 ?node401 ?desc401 WHERE{ { GRAPH <http://data-gov.tw.rpi.edu/vocab/Dataset_401>{ ?node401 dc:description ?desc401 . } }UNION{ GRAPH <http://data-gov.tw.rpi.edu/vocab/Dataset_8>{ ?node8 dc:description ?desc8 . } } } LIMIT 3
Optional PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?node ?name ?givenname WHERE{ ?node foaf:name ?name . OPTIONAL{ ?node foaf:givenname ?givenname . } }
Filter PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?node ?name ?givenname WHERE{ ?node foaf:name ?name . ?node foaf:givenname ?givenname . FILTER regex(?name, "Biden") . }
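The difference between a required pattern, OPTIONAL, and FILTER can be illustrated with a small in-memory JavaScript sketch. This is not how a SPARQL engine is implemented; the helper `selectNames` and the sample triples are made up for illustration, and the sketch keeps only the first foaf:givenname per node, whereas SPARQL would return one row per match.

```javascript
const FOAF = "http://xmlns.com/foaf/0.1/";

// Made-up sample data: the second person has no foaf:givenname.
const people = [
  ["ex:p1", FOAF + "name", "Joe Biden"],
  ["ex:p1", FOAF + "givenname", "Joe"],
  ["ex:p2", FOAF + "name", "Jane Doe"],
];

// Hypothetical helper imitating the three queries on the slides.
function selectNames(triples, { givennameRequired = false, namePattern = null } = {}) {
  const rows = [];
  for (const [node, predicate, name] of triples) {
    if (predicate !== FOAF + "name") continue;
    // FILTER regex(?name, ...): drop rows whose name does not match.
    if (namePattern !== null && !namePattern.test(name)) continue;
    const match = triples.find(([s, p]) => s === node && p === FOAF + "givenname");
    // A plain (required) pattern drops the row; OPTIONAL keeps it unbound.
    if (match === undefined && givennameRequired) continue;
    rows.push({ node, name, givenname: match ? match[2] : undefined });
  }
  return rows;
}

const optionalRows = selectNames(people); // like the OPTIONAL query
const filteredRows = selectNames(people, { givennameRequired: true, namePattern: /Biden/ }); // like the FILTER query
console.log(optionalRows, filteredRows);
```

With OPTIONAL, the person without a given name still appears, with `givenname` left unbound; with the required pattern plus FILTER, only the matching, fully described person remains.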
SPARQLProxy • This is a web service that allows you to query any SPARQL endpoint, and get back the results in any format you want. • A RESTful way to query any endpoint in any environment.
SPARQLProxy http://logd.tw.rpi.edu/sparql? Parameters: query: [required] encoded string of the SPARQL query query-uri: [required] URI of the SPARQL query (use as an alternative to the "query" parameter; these two parameters are mutually exclusive)
SPARQLProxy service-uri: [required] URI of SPARQL Endpoint – default is the LOGD endpoint output: output format. "xml" - SPARQL/XML (default) : "exhibit" - JSON for MIT Exhibit : "gvds" - JSON for Google Visualization : "csv" - CSV : "html" - HTML table : "sparql" - SPARQL JSON
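As a rough sketch of how these parameters combine into a request URL, the following JavaScript builds one with the standard `URLSearchParams` class. The helper name `buildProxyUrl` is hypothetical, and treating `service-uri` as optional is an assumption based on the note that it defaults to the LOGD endpoint.

```javascript
// Hypothetical helper: build a SPARQLProxy request URL from the listed parameters.
function buildProxyUrl(proxyUri, { query, queryUri, serviceUri, output = "xml" }) {
  // "query" and "query-uri" are mutually exclusive: exactly one must be given.
  if ((query === undefined) === (queryUri === undefined)) {
    throw new Error('Provide exactly one of "query" or "query-uri"');
  }
  const params = new URLSearchParams();
  if (query !== undefined) {
    params.set("query", query);
  } else {
    params.set("query-uri", queryUri);
  }
  // Assumption: omitting service-uri lets the proxy fall back to its default endpoint.
  if (serviceUri !== undefined) {
    params.set("service-uri", serviceUri);
  }
  params.set("output", output);
  return proxyUri + "?" + params.toString(); // URLSearchParams percent-encodes the values
}

const proxyUrl = buildProxyUrl("http://logd.tw.rpi.edu/sparql", {
  queryUri: "http://logd.tw.rpi.edu/demo/retrieving-sparql-results/datagov-list-loaded-dataset.sparql",
  serviceUri: "http://services.data.gov/sparql",
  output: "gvds",
});
console.log(proxyUrl);
```

Letting the URL API do the encoding avoids hand-escaping the query text, which is the error-prone part of calling the proxy.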
Example • http://logd.tw.rpi.edu/sparql.php?query-option=text&query=PREFIX+conversion%3A+%3Chttp%3A%2F%2Fpurl.org%2Ftwc%2Fvocab%2Fconversion%2F%3E%0D%0ASELECT+%3Fg+sum%28+%3Ftriples+%29+as+%3Festimated_triples%0D%0AWHERE+{%0D%0A++GRAPH+%3Fg++{%0D%0A+++%3Fg+void%3Asubset+%3Fsubdataset+.%0D%0A+++%3Fsubdataset+conversion%3Anum_triples+%3Ftriples+.%0D%0A++}%0D%0A}+%0D%0AGROUP+BY+%3Fg%0D%0A&service-uri=&output=html&callback=&tqx=&tp=
Example
// compose query
$sparqlproxy_uri = "http://logd.tw.rpi.edu/ws/sparqlproxy.php";
$params = array();
$params["query-uri"] = "http://logd.tw.rpi.edu/demo/retrieving-sparql-results/datagov-list-loaded-dataset.sparql";
$params["service-uri"] = "http://services.data.gov/sparql";
$params["output"] = "gvds";
$query = $sparqlproxy_uri . "?" . http_build_query($params, '', '&'); // specific for Drupal
// show query result
echo file_get_contents($query);
Visualizing The Data • Many JavaScript APIs and libraries help make visualizations • They trade off ease of use against control/customization. • We will focus on the Google Visualization API, which is very easy to use out of the box but almost impossible to customize beyond what it provides. http://code.google.com/apis/chart/interactive/docs/gallery.html
Visualization Example • Start with one or more datasets • We will look into State Library Agency Survey: Fiscal Year 2006 http://logd.tw.rpi.edu/source/data-gov/dataset/353/version/1st-anniversary and Tax Year 2007 County Income Data http://logd.tw.rpi.edu/source/data-gov/dataset/1356/version/2009-Dec-03
Example • Let's make a map of "Adjusted Gross Income (AGI) per Capita" • a US map where each state is colored according to the average AGI per person living in that state. • We obtain a state's AGI data from Dataset 1356 and a state's population data from Dataset 353.
Let's make a query • http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.sparql
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>AGI per Capita Map</title> </head> <body> <div>AGI per Capita Map</div> <div id='map_canvas'>Loading Map ...</div> </body> </html>
<!-- import Google visualization API --> <script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">
// load google visualization packages - STEP 1
google.load('visualization', '1', {'packages': ['geomap']});
// set callback function for drawing visualizations - STEP 2
google.setOnLoadCallback(drawMap);
function drawMap() {
  // Query data - STEP 3
  var sparqlproxy = "http://logd.tw.rpi.edu/sparql";
  var queryloc = "http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.sparql";
  var queryurl = sparqlproxy + "?" + "output=gvds" + "&query-option=uri" + "&query-uri=" + encodeURIComponent(queryloc);
  var query = new google.visualization.Query(queryurl);
  query.send(handleQueryResponse);
}
function handleQueryResponse(response) {
  // Check for query response errors. - STEP 4
  if (response.isError()) {
    alert('Error in query: ' + response.getMessage() + ' ' + response.getDetailedMessage());
    return;
  }
  // read data - STEP 5
  var data = response.getDataTable();
  // create new data - STEP 6
  var newdata = new google.visualization.DataTable();
  newdata.addColumn('string', 'State');
  newdata.addColumn('number', 'AGI per Capita');
  // populate each row - STEP 7
  var rows = data.getNumberOfRows();
  for (var i = 0; i < rows; i++) {
    var state = 'US-' + data.getValue(i, 0);
    // AGI figure uses thousand-dollar unit
    var value = Math.round(data.getValue(i, 1) * 1000 / data.getValue(i, 2));
    newdata.addRow([state, value]);
  }
  // configure map options - STEP 8
  var options = {};
  options['region'] = 'US'; // show US map
  options['dataMode'] = 'regions';
  options['width'] = 900;
  options['height'] = 550;
  // define geomap instance - STEP 9
  var viz = document.getElementById('map_canvas');
  new google.visualization.GeoMap(viz).draw(newdata, options);
} // end of handleQueryResponse function
</script> <!-- end of JavaScript tag -->
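The arithmetic in STEP 7 can be checked on its own, without the Google Visualization API. The sketch below assumes, as the code above implies, that each result row holds a state code, an AGI figure in thousands of dollars, and a population. The helper name `agiPerCapitaRows` is hypothetical, the sample numbers are made up rather than taken from Dataset 1356 or 353, and the zero-population guard is an addition the original code does not have.

```javascript
// Hypothetical helper mirroring STEP 7.
// Each input row is [stateCode, agiInThousandsOfDollars, population].
function agiPerCapitaRows(rows) {
  return rows
    // Added guard (not in the original): skip rows that would divide by zero.
    .filter(([, , population]) => population > 0)
    .map(([code, agiThousands, population]) => [
      "US-" + code, // region code format used in STEP 7
      Math.round((agiThousands * 1000) / population), // convert thousands to dollars
    ]);
}

// Made-up figures for illustration only.
const sampleRows = [
  ["NY", 500000, 20000],
  ["VT", 30000, 1000],
  ["ZZ", 100, 0],
];
const perCapita = agiPerCapitaRows(sampleRows);
console.log(perCapita);
```

Keeping the calculation in a pure function like this makes it easy to test separately from the map drawing code.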
See Live Version - http://logd.tw.rpi.edu/demo/building-logd-visualizations/agi-per-capita-v2.html