This lecture gives an overview of two query languages for RDF data: SPARQL, a W3C standard since 2008 that resembles SQL, and MQL, which is based on JSON and used by Freebase. We cover the basic features of both languages, walk through practical query examples, and explain how to access data from sources such as DBpedia and Freebase. The session is aimed at Master's students in Information System Management and builds an understanding of semantic web technologies and data manipulation.
Internet Technologies Making Queries on RDF Master of Information System Management
Two Approaches
• SPARQL
  - SPARQL Protocol and RDF Query Language
  - Looks like SQL
  - Used by DBpedia
• MQL
  - Metaweb Query Language
  - Based on JSON
  - Used by Freebase
Both Freebase and DBpedia make their statements available in RDF.
Today’s Lecture
• A brief look at SPARQL
• A brief look at MQL
SPARQL
• SPARQL Protocol and RDF Query Language (a recursive acronym)
• W3C Recommendation, January 2008
• Query patterns are written in a Turtle-like syntax (Turtle: Terse RDF Triple Language)
• Download Jena and the ARQ query engine
• For Ruby, see ActiveRDF
SPARQL
• Three specifications:
  (1) A query language
  (2) A query results XML format
  (3) A WSDL 2.0 data access protocol using HTTP and SOAP
• The 2008 SPARQL query language is read-only and cannot modify the RDF data (update operations were added later, in SPARQL 1.1 Update)
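The HTTP side of the data access protocol can be sketched in a few lines of Python. This is a minimal illustration, not a full client: it assumes the endpoint accepts a GET request with the query text in a `query` parameter, uses DBpedia's public endpoint address as an example, and only builds the URL without sending it. The query text and the helper name `build_request_url` are made up for this sketch.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: a SPARQL endpoint address; DBpedia's is used here as an example.
ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query, not taken from the lecture.
QUERY = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE { ?person foaf:name ?name } LIMIT 5
"""

def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query text in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})

url = build_request_url(ENDPOINT, QUERY)

# Decode the URL again to check that the query text survives the encoding.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
print(decoded == QUERY)
```

Sending the request (for example with `urllib.request.urlopen`) would return the results in the XML format listed as the second specification, provided the endpoint is reachable.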
Input
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:rss="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:html="http://www.w3.org/1999/xhtml">
  <foaf:Agent rdf:nodeID="id2246040">
    <foaf:name>John Barstow</foaf:name>
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
    <foaf:weblog>
      <foaf:Document rdf:about="http://www.nzlinux.org.nz/blogs/">
        <dc:title>Visions of Aestia by John Barstow</dc:title>
        <rdfs:seeAlso>
          <rss:channel rdf:about="http://www.nzlinux.org.nz/blogs/wp-rdf.php?cat=9">
            <foaf:maker rdf:nodeID="id2246040"/>
            <foaf:topic rdf:resource="http://www.w3.org/2001/sw/"/>
            <foaf:topic rdf:resource="http://www.w3.org/RDF/"/>
          </rss:channel>
        </rdfs:seeAlso>
      </foaf:Document>
    </foaf:weblog>
    <foaf:interest rdf:resource="http://www.w3.org/2001/sw/"/>
    <foaf:interest rdf:resource="http://www.w3.org/RDF/"/>
  </foaf:Agent>
</rdf:RDF>
This is shortblogger.xml. The file bloggers.xml has many bloggers.
Processing
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?url
FROM <shortblogger.xml>
WHERE {
  ?contributor foaf:name "John Barstow" .
  ?contributor foaf:weblog ?url .
}
Stored in a file called ex1.rq
Output
sparql --query ex1.rq
--------------------------------------
| url                                |
======================================
| <http://www.nzlinux.org.nz/blogs/> |
--------------------------------------
Processing
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?url
FROM <shortblogger.xml>
WHERE {
  ?contributor rdf:type foaf:Person .
  ?contributor foaf:weblog ?url .
}
Output
sparql --query ex2.rq
--------------------------------------
| url                                |
======================================
| <http://www.nzlinux.org.nz/blogs/> |
--------------------------------------
Processing
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?x ?n
FROM <bloggers.xml>
WHERE {
  ?contributor rdf:type foaf:Person .
  ?contributor foaf:weblog ?x .
  ?contributor foaf:name ?n
}
All three conditions must be satisfied to match the query.
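The "all three conditions" rule can be sketched as a tiny pattern matcher in Python. This is an illustration of the idea only, not how ARQ is implemented: the triples are a hand-made stand-in for bloggers.xml, and the names `match_pattern` and `solve` are hypothetical. Each pattern narrows the set of candidate bindings, so a contributor survives only if every pattern matches with the same value for `?contributor`.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical triples standing in for bloggers.xml.
TRIPLES: List[Triple] = [
    ("_:a", "rdf:type", "foaf:Person"),
    ("_:a", "foaf:weblog", "<http://danbri.org/words/>"),
    ("_:a", "foaf:name", '"Dan Brickley"'),
    ("_:b", "rdf:type", "foaf:Person"),
    ("_:b", "foaf:name", '"No Blog"'),  # no foaf:weblog, so _:b cannot match
]

def match_pattern(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend the binding so that the pattern equals the triple, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable is bound now, or must agree with its earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended

def solve(patterns: List[Triple], triples: List[Triple]) -> List[Binding]:
    """Keep only the bindings that satisfy every pattern in turn."""
    solutions: List[Binding] = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions

PATTERNS: List[Triple] = [
    ("?contributor", "rdf:type", "foaf:Person"),
    ("?contributor", "foaf:weblog", "?x"),
    ("?contributor", "foaf:name", "?n"),
]

rows = [(b["?x"], b["?n"]) for b in solve(PATTERNS, TRIPLES)]
print(rows)
```

The second person is dropped because no `foaf:weblog` triple exists for it, even though the other two patterns match.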
Output
sparql --query ex4.rq
-------------------------------------------------------------------------
| x                                              | n                    |
=========================================================================
| <http://www.picklematrix.net/semergence/>      | "Seth Ladd"          |
| <http://www.wasab.dk/morten/blog/>             | "Morten Frederiksen" |
| <http://www.lassila.org/blog/>                 | "Ora Lassila"        |
| <http://people.w3.org/~dom/>                   | "Hazaël-Massieux"    |
| <http://xmlarmyknife.org/blog/>                | "Leigh Dodds"        |
| <http://blogs.sun.com/bblfish/>                | "Henry Story"        |
| <http://jeenbroekstra.blogspot.com/>           | "Jeen Broekstra"     |
| <http://people.w3.org/~djweitzner/blog/?cat=8> | "Danny Weitzner"     |
| <http://danbri.org/words/>                     | "Dan Brickley"       |
Processing
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?n
FROM <bloggers.xml>
WHERE {
  ?contributor foaf:name ?n
}
Output
------------------------
| n                    |
========================
| "Mike McCarthy"      |
| "Pasquale Popolizio" |
| "Dean Allemang"      |
:
:
Processing
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?n
FROM <bloggers.xml>
WHERE {
  ?contributor foaf:name ?n
}
ORDER BY ?n
Output
-----------------------
| n                   |
=======================
| "Alexandre Passant" |
| "Alistair Miles"    |
| "Andrew Matthews"   |
| "Benjamin Nowack"   |
:
:
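As a rough analogy in Python, DISTINCT behaves like removing duplicates and ORDER BY like sorting the remaining values. The sketch below uses a made-up list of names with one repeat; Python's default string ordering is only an approximation of SPARQL's ordering rules for literals, and the helper name `distinct_ordered` is hypothetical.

```python
from typing import Iterable, List

# Illustrative input: the same name can appear more than once in the data.
names = ["Dean Allemang", "Alistair Miles", "Dean Allemang", "Alexandre Passant"]

def distinct_ordered(values: Iterable[str]) -> List[str]:
    """Drop duplicates (DISTINCT), then sort ascending (ORDER BY)."""
    return sorted(set(values))

ordered = distinct_ordered(names)
print(ordered)
```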
Semi-Structured Data
• Definition: if two nodes of the same type are allowed to hold different sets of properties, the data is called semi-structured.
• SPARQL uses the OPTIONAL keyword to process semi-structured data.
Processing
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?n ?interest
FROM <bloggers.xml>
WHERE {
  ?contributor foaf:name ?n .
  OPTIONAL { ?contributor foaf:interest ?interest }
}
ORDER BY ?n
Output
"Tetherless World Constellation group RPI"  <http://www.w3.org/2001/sw/>
"Tetherless World Constellation group RPI"  <http://www.w3.org/RDF/>
"Tim Berners-Lee"
"Uldis Bojars"  <http://www.w3.org/2001/sw/>
"Uldis Bojars"  <http://www.w3.org/RDF/>
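The effect of OPTIONAL can be sketched in Python as a left join: every name is kept, and the interest column is filled in only when a matching triple exists. The triples below are a hand-made stand-in for bloggers.xml, and the function names are hypothetical. The secondary sort on the interest is an addition to make the sketch's output deterministic; the query itself orders by `?n` only.

```python
from typing import List, Optional, Tuple

# Hypothetical triples: one blogger with two interests, one with none.
TRIPLES = [
    ("_:u", "foaf:name", "Uldis Bojars"),
    ("_:u", "foaf:interest", "http://www.w3.org/2001/sw/"),
    ("_:u", "foaf:interest", "http://www.w3.org/RDF/"),
    ("_:t", "foaf:name", "Tim Berners-Lee"),
]

def objects(subject: str, predicate: str) -> List[str]:
    """Return all objects of triples with the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]

def names_with_optional_interest() -> List[Tuple[str, Optional[str]]]:
    rows = []
    for subject, predicate, name in TRIPLES:
        if predicate != "foaf:name":
            continue
        interests = objects(subject, "foaf:interest")
        if interests:
            rows.extend((name, interest) for interest in interests)
        else:
            # OPTIONAL part did not match: keep the row, leave ?interest unbound.
            rows.append((name, None))
    # DISTINCT, then ORDER BY ?n (interest used only as a tie-breaker here).
    return sorted(set(rows), key=lambda row: (row[0], row[1] or ""))

result = names_with_optional_interest()
for name, interest in result:
    print(name, interest if interest is not None else "")
```

Without the `else` branch the function would behave like a plain conjunction, and bloggers with no stated interest would disappear from the result.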
Generating XML
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?n
FROM <shortblogger.xml>
WHERE {
  ?contributor foaf:name ?n .
}
From The Command Line
sparql --query ex8.rq --results rs/xml
<?xml version="1.0"?>
<sparql xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:xs="http://www.w3.org/2001/XMLSchema#"
        xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="n"/>
  </head>
  <results>
    <result>
      <binding name="n">
        <literal>John Barstow</literal>
      </binding>
    </result>
  </results>
</sparql>
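A program that consumes this results format might read it as in the following Python sketch, which uses only the standard library. The XML is a trimmed copy of the output above (unused namespace declarations dropped), and the helper name `literal_bindings` is hypothetical. It handles only `<literal>` values; URI and blank-node bindings would need similar paths.

```python
import xml.etree.ElementTree as ET
from typing import List

# Trimmed copy of the result document shown above.
RESULTS_XML = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="n"/>
  </head>
  <results>
    <result>
      <binding name="n">
        <literal>John Barstow</literal>
      </binding>
    </result>
  </results>
</sparql>"""

# All elements live in the SPARQL results namespace, so paths need a prefix.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

def literal_bindings(xml_text: str, variable: str) -> List[str]:
    """Return the literal values bound to one variable, in document order."""
    root = ET.fromstring(xml_text)
    path = f"sr:results/sr:result/sr:binding[@name='{variable}']/sr:literal"
    return [element.text for element in root.findall(path, NS)]

names = literal_bindings(RESULTS_XML, "n")
print(names)
```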
MQL
• Metaweb Query Language
• Used to make queries against the Freebase data store.
• The inputs and outputs are JSON strings.
Using MQL In A URL
Example from the MQL documentation. Enter the following in your browser:
https://api.freebase.com/api/service/mqlread?query={"query":{"type":"/music/artist","name":"The Police","album":[]}}
This is a query to Freebase represented in JSON.
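Typing JSON into an address bar is error-prone, so a program would normally build the URL. The Python sketch below wraps a query in the `{"query": ...}` envelope shown above, serializes it, and percent-encodes it for the `query` parameter. It only constructs the URL and does not contact the service; the helper name `mql_url` is hypothetical.

```python
import json
from urllib.parse import quote, unquote

# Service address as given on the slide.
MQLREAD = "https://api.freebase.com/api/service/mqlread"

def mql_url(query: dict) -> str:
    """Wrap the query in the envelope and encode it into the URL."""
    envelope = {"query": query}
    text = json.dumps(envelope, separators=(",", ":"))
    return MQLREAD + "?query=" + quote(text, safe="")

# An empty list asks the service to fill in all values for that property.
police = {"type": "/music/artist", "name": "The Police", "album": []}
url = mql_url(police)
print(url)

# Decoding the parameter recovers the original JSON envelope.
roundtrip = json.loads(unquote(url.split("?query=", 1)[1]))
print(roundtrip == {"query": police})
```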
Output
{
  "code": "/api/status/ok",
  "result": {
    "album": [
      "Outlandos d'Amour",
      "Reggatta de Blanc",
      "Zenyatt\u00e0 Mondatta",
      "The Police Live!"
    ],
    "name": "The Police",
    "type": "/music/artist"
  },
  "status": "200 OK",
  "transaction_id": "cache;cache03.p01.sjc1:8101;2011-10-13T15"
}
Many more albums in the real query result.
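Because the reply is plain JSON, reading it takes only a few lines. The Python sketch below parses an abridged copy of the response above (the transaction_id field is left out) and pulls out the album list; the helper name `albums` and the error handling are assumptions made for illustration.

```python
import json
from typing import List

# Abridged copy of the response shown above (raw string keeps the \u escape for JSON).
RESPONSE = r'''{
  "code": "/api/status/ok",
  "result": {
    "album": ["Outlandos d'Amour", "Reggatta de Blanc",
              "Zenyatt\u00e0 Mondatta", "The Police Live!"],
    "name": "The Police",
    "type": "/music/artist"
  },
  "status": "200 OK"
}'''

def albums(response_text: str) -> List[str]:
    """Return the album names, or raise if the service reported a problem."""
    reply = json.loads(response_text)
    if reply.get("code") != "/api/status/ok":
        raise ValueError("query failed: " + str(reply.get("code")))
    return reply["result"]["album"]

album_names = albums(RESPONSE)
for title in album_names:
    print(title)
```

Note that the shape of `result` mirrors the shape of the query: the empty `album` list in the request comes back filled in.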
Another look at the query.
{
  "query": {
    "type": "/music/artist",
    "name": "The Police",
    "album": []
  }
}
Use The Query Editor
http://www.freebase.com/queryeditor