
Realizing Interoperability of Heterogeneous Repositories

Daniel Olmedilla, L3S Research Center / Hannover University. Programa de Postgrado en Ingeniería Informática y de Telecomunicación (Máster y Doctorado), Universidad Autónoma de Madrid, 10th April 2008.


Presentation Transcript


  1. Realizing Interoperability of Heterogeneous Repositories • Daniel Olmedilla, L3S Research Center / Hannover University • Programa de Postgrado en Ingeniería Informática y de Telecomunicación (Máster y Doctorado), Universidad Autónoma de Madrid, 10th April 2008

  2. Outline • Introduction and Motivation • Interoperability: what is it and why is it needed? • Common Query Interface • Common Metadata Schema • Ranking • Successful Interoperability Demonstrations • Conclusions & Open Issues

  3. Outline • Introduction and Motivation • Interoperability: what is it and why is it needed? • Common Query Interface • Common Metadata Schema • Ranking • Successful Interoperability Demonstrations • Conclusions & Open Issues

  4. Introduction: Simple Motivation Scenario (I) • Simple scenario: Alice is interested in learning about Windows and would like to attend a lecture about it this year

  5. Introduction: Simple Motivation Scenario (II)

  6. Introduction: Search Engine Limitations • Unstructured information and lack of semantics • Size and coverage of the Web • Hidden Web (also called Deep Web) • Personalized ranking

  7. Introduction: Other Approaches: Coalitions • Repositories interconnected • Lack of standards, ad-hoc solutions • Individual agreement required to join • Approaches: replication (loss of control over data → sometimes undesirable); federated search (lack of standards → costly)

  8. Introduction: Other Approaches: P2P Networks • Advantages: scalability; no single point of failure; control remains with owners; dynamicity • Disadvantages: decreased performance; ad-hoc interfaces → lack of interoperability

  9. Introduction: A Bit More Complex Motivation Scenario • Alice is a consultant and she has been asked to lead a project starting in two months. Now she needs to retrieve courses in order to: refresh and improve her previous knowledge of project management; get some basic knowledge about accounting and auditing; practice her advanced level of English

  10. Introduction: Problem Statement • The lack of standards and of appropriate integration solutions prevents users from easily and effectively finding resources relevant to their needs

  11. Outline • Introduction and Motivation • Interoperability: what is it and why is it needed? (Definition; Why Interoperability?; Challenges to achieve it) • Common Query Interface • Common Metadata Schema • Ranking • Successful Interoperability Demonstrations • Conclusions & Open Issues

  12. Interoperability: What and Why? Exercise 1: Simple Questions • What is interoperability? • What does it mean for two systems to interoperate? • And at the information level?

  13. Interoperability: What and Why? What is it? • Summary of existing definitions: the ability to work together to accomplish a common task; to work in conjunction; to exchange information and USE it; provided at different levels; without increasing the effort of the user • [Concise Oxford Dictionary, NISO, IEEE: Standard Computer Dictionary, DMReview, Whatis.com]

  14. Interoperability: What and Why? Interoperability encompasses … • Technical Interoperability • Semantic Interoperability • Political Interoperability • Inter-community Interoperability • Legal Interoperability • International Interoperability

  15. Interoperability: What and Why? Investment in Technology • ICT globally: $1.45 trillion annually • Technology in Europe: €6.4 billion in 2004, and increasing (10% more than the previous year) • [Money for Growth, The European Technology Investment Report 2005. PricewaterhouseCoopers Report, Jun. 2005]

  16. Interoperability: What and Why? Key Technological Issues (I) • Survey of 38 industry associations in 27 different countries • The most significant technology issues … included: integration (21%); standards (20%) • [International Survey of E-Commerce. World Information Technology and Services Alliance (WITSA), 2000]

  17. Interoperability: What and Why? Key Technological Issues (II) • [International Survey of E-Commerce. World Information Technology and Services Alliance (WITSA), 2000]

  18. Interoperability: What and Why? Interoperability Inhibited by Cost • “Although interoperability is a significant strategic direction, it is often inhibited by cost” • [Survey: Integration costs still hamper agility. Computerworld Today, February 2006]

  19. Interoperability: What and Why? User Effectiveness: Some Facts • User effectiveness: knowledge workers spend from 15% to 35% of their time searching for information; searchers are successful in finding what they seek 50% of the time or less • Total loss: the cost of not finding the right information is estimated at $2.5 to $3.5 million per year for an enterprise with 1000 knowledge workers; the opportunity cost is a potential additional revenue of $15 million annually • [Feldman. The high cost of not finding information. IDC White Paper & KMWorld Magazine, 2004]

  20. Interoperability: What and Why? Challenges to achieve it

  21. Interoperability: What and Why? E-Learning Study Analysis: Technical Requirements • Training life cycle in companies across Europe • Retrieving learning services from a wide variety of providers • Search heuristics • Metadata queries • Matching skill gaps with learning service selections • Matching personal development gaps with learning services • [Gunnarsdottir. User Trials – Evaluation Report. EU IST ELENA Deliverable, May 2005]

  22. Outline • Introduction and Motivation • Interoperability: what is it and why is it needed? • Common Query Interface (Simple Query Interface; Opening P2P to the rest of the World) • Common Metadata Schema • Ranking • Successful Interoperability Demonstrations • Conclusions & Open Issues

  23. Common Communication Interface: Simple Query Interface (SQI) • Simple but highly flexible: targets different interoperability scenarios • Official CEN/ISSS Workshop Agreement since October 2006 • Listed by IMS on Query Services • Widely adopted in the E-Learning community

  24. Common Communication Interface: Simple Query Interface: Design Issues • Independent of query language, result format and vocabularies • Complex information sources may be queried (e.g., P2P networks) • Synchronous and asynchronous • Support for lightweight implementations • Stateful and stateless • Separation of access control and search • Easy extensibility

  25. Common Communication Interface: Simple Query Interface: Session Management • Authentication and authorization are requirements • They are independent of the search interface • The separation is managed via sessions: session createAnonymousSession(); session createSession(user, passwd); destroySession(sessionId) • Other methods are allowed (e.g., based on credentials or trust negotiations)
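
The separation between session management and search can be sketched as follows. This is a minimal in-memory illustration, not the SQI reference implementation: the three session methods mirror the names on the slide (in Python naming style), while the `SessionManager` class, the `is_valid` helper, the `search` function and the plain-text password table are assumptions made only for the example.

```python
import secrets


class SessionManager:
    """Toy in-memory session store (hypothetical, for illustration only)."""

    def __init__(self, accounts):
        self._accounts = dict(accounts)  # user -> password, plain text only in this sketch
        self._sessions = {}              # session id -> user name (None = anonymous)

    def create_anonymous_session(self):
        return self._new_session(None)

    def create_session(self, user, passwd):
        if self._accounts.get(user) != passwd:
            raise PermissionError("wrong credentials")
        return self._new_session(user)

    def destroy_session(self, session_id):
        if session_id not in self._sessions:
            raise KeyError("no such session")
        del self._sessions[session_id]

    def is_valid(self, session_id):
        return session_id in self._sessions

    def _new_session(self, user):
        session_id = secrets.token_hex(16)  # random, hard-to-guess identifier
        self._sessions[session_id] = user
        return session_id


def search(manager, session_id, query):
    """Hypothetical search endpoint: it only ever sees a session id."""
    if not manager.is_valid(session_id):
        raise PermissionError("invalid session")
    return f"results for {query!r}"
```

Because the search function only checks a session identifier, the way the session was obtained (password, anonymous access, or some other mechanism) can change without touching the search interface.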

  26. Common Communication Interface: Traditional Access Control in Decentralized Systems • Assumption: I already know you, because you have a local account! • Not a member?

  27. Common Communication Interface: Trust Negotiation: Features • Trust is based on the parties’ properties • Every party can define access control policies to control outsiders’ access to its sensitive resources • Trust is established iteratively and bilaterally by the disclosure of certificates and by requests for certificates

  28. Common Communication Interface: Trust Negotiation: Example (Alice and Bob) • Step 1: Alice requests a service from Bob • Step 2: Bob discloses his policy for the service • Step 3: Alice discloses her policy for VISA • Step 4: Bob discloses his BBB credential • Step 5: Alice discloses her VISA card credential • Step 6: Bob grants access to the service
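
The Alice and Bob exchange can be simulated with a small sketch. It assumes a deliberately simplified model that is not taken from the slides: each policy is just the set of credentials the other party must already have disclosed, and both parties follow an eager strategy (they release anything that has become unlocked). The function name `negotiate` and the dictionary layout are hypothetical.

```python
def negotiate(resource, owner, requester):
    """Eager trust negotiation between a resource owner and a requester.

    `owner` and `requester` map each item (credential or resource) to the set
    of credentials the other party must have disclosed before it is released.
    Returns (granted, transcript).
    """
    disclosed = {"owner": set(), "requester": set()}
    transcript = []
    progress = True
    while progress:
        # The resource is granted once its policy is satisfied.
        if owner[resource] <= disclosed["requester"]:
            transcript.append(f"owner grants {resource}")
            return True, transcript
        progress = False
        for side, other, policies in (("owner", "requester", owner),
                                      ("requester", "owner", requester)):
            for item in sorted(policies):
                if item == resource or item in disclosed[side]:
                    continue
                # Disclose a credential as soon as its own policy is met.
                if policies[item] <= disclosed[other]:
                    disclosed[side].add(item)
                    transcript.append(f"{side} discloses {item}")
                    progress = True
    return False, transcript  # no party can disclose anything new


# Bob (owner) asks for VISA before serving; his BBB credential is public.
bob = {"service": {"VISA"}, "BBB": set()}
# Alice (requester) shows her VISA card only to BBB members.
alice = {"VISA": {"BBB"}}
granted, steps = negotiate("service", bob, alice)
```

With these policies the transcript contains Bob's BBB disclosure, then Alice's VISA disclosure, then the grant, which matches steps 4 to 6 of the example. If Bob also protected his BBB credential behind VISA, neither side could move first and the negotiation would fail.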

  29. Common Communication Interface: Simple Query Interface: Query (I)

  30. Common Communication Interface: Simple Query Interface: Query (II)

  31. Common Communication Interface: P2P Proxying Architecture • [Brunkhorst, Olmedilla. Interoperability for peer-to-peer networks: Opening P2P to the rest of the World. EC-TEL, Oct 2006]

  32. Outline • Introduction and Motivation • Interoperability: what is it and why is it needed? • Common Query Interface • Common Metadata Schema (Learning Resource Schema; Competence Modeling) • Ranking • Successful Interoperability Demonstrations • Conclusions & Open Issues

  33. Common Metadata Schema: Data Integration • Global As View (GAV) • Local As View (LAV)

  34. Common Metadata Schema: Data Integration • Reformulating a given query in terms of the sources: easier in GAV (the query just needs unfolding); harder in LAV • Adding a new source: supposedly easier in LAV (the new source just needs to be expressed as a view of the global schema); harder in GAV (the global schema needs to be revised)
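
Query unfolding in GAV can be illustrated with a small sketch. It assumes a toy setting in which one global relation is defined as a union of renamed source tables; the mapping, the source names (`repo_a.lectures`, `repo_b.trainings`) and the `unfold` function are all hypothetical and only serve to show why reformulation is mechanical under GAV.

```python
# GAV: every global relation is defined as a view over the sources.
# Here the view is a union, with a column renaming per source.
GAV_MAPPING = {
    "course": [
        ("repo_a.lectures", {"title": "name", "language": "lang"}),
        ("repo_b.trainings", {"title": "label", "language": "language"}),
    ]
}


def unfold(global_relation, attributes, mapping=GAV_MAPPING):
    """Rewrite a projection over a global relation into one query per source."""
    rewritten = []
    for source, columns in mapping[global_relation]:
        # Replace each global attribute by the source column that defines it.
        select = ", ".join(f"{columns[a]} AS {a}" for a in attributes)
        rewritten.append(f"SELECT {select} FROM {source}")
    return " UNION ALL ".join(rewritten)


sql = unfold("course", ["title"])
```

Unfolding is a plain substitution of the view definition into the query. Under LAV the mapping points the other way (each source is described as a view over the global schema), so answering a query requires rewriting it using views, which is the harder problem mentioned above.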

  35. Common Metadata Schema: Simple Learning Resource Schema

  36. Common Metadata Schema: Complex Learning Resource Schema

  37. Common Metadata Schema: Competence Requirements • Excerpt from a newspaper job advertisement: complete Master’s degree (any faculty); expert knowledge in Java (J2EE, Servlets, JSP); very good IT English and/or Spanish • Drawbacks: it does not indicate what is mandatory or optional; it is not machine-understandable

  38. Common Metadata Schema: Competence Definition • “an effective performance within a domain / context at different levels of proficiency” • Example: Competency “English Language”, Level “Advanced”, Context “Computer Science”

  39. Common Metadata Schema: Competency • We use IEEE RCD to represent a competency • It uniquely identifies an isolated competency • It is enriched with human-readable titles and descriptions

  40. Common Metadata Schema: Proficiency Level • Reusable scales of totally ordered proficiency levels • Each level is identified by an ID, a human-readable label and an optional mapping to a numerical domain
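
A proficiency scale of this kind can be sketched as a small data structure. The class names, field names and the example scale are assumptions for illustration; the slides only state that a level has an ID, a label and an optional numerical mapping, and that levels are totally ordered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Level:
    level_id: str                  # identifier of the level
    label: str                     # human-readable label
    value: Optional[float] = None  # optional mapping to a numerical domain


class Scale:
    """Reusable scale; the order of `levels` defines the total order."""

    def __init__(self, name, levels):
        self.name = name
        self._levels = list(levels)
        self._rank = {lvl.level_id: i for i, lvl in enumerate(self._levels)}

    def at_least(self, have, need):
        """True if level `have` is equal to or higher than level `need`."""
        return self._rank[have] >= self._rank[need]


language_scale = Scale("language", [
    Level("basic", "Basic", 1.0),
    Level("intermediate", "Intermediate", 2.0),
    Level("advanced", "Advanced", 3.0),
])
```

Keeping the order in the scale rather than in the numerical value means that scales without a numerical mapping can still be compared.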

  41. Common Metadata Schema: Context • “... the interlaced conditions in which something exists or occurs” • Competences might be interpreted differently in a different context • Contexts are defined in tree-like hierarchies: easier to model and to handle; simpler algorithms, no cycle detection necessary • May optionally link to additional ontologies

  42. Common Metadata Schema: Competence • Links to the dimension objects: high degree of reusability; better support for gap analysis • Competences can be simple or composed of other (arbitrarily nested) competences: aggregation; set selection
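
Nested competences and gap analysis can be sketched as follows. The model is an assumption made for illustration: a simple competence is a (competency, level, context) triple with an integer level, an aggregation requires all of its parts, and a set selection is interpreted here as "at least `minimum` of the parts". The slides do not define these semantics precisely, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Simple:
    competency: str
    level: int      # position in a proficiency scale (higher is better)
    context: str


@dataclass(frozen=True)
class Aggregation:
    parts: Tuple["Requirement", ...]   # every part must be met


@dataclass(frozen=True)
class Selection:
    parts: Tuple["Requirement", ...]
    minimum: int                        # at least this many parts must be met


Requirement = Union[Simple, Aggregation, Selection]


def satisfied(req, profile):
    """`profile` maps (competency, context) to the level a person has reached."""
    if isinstance(req, Simple):
        return profile.get((req.competency, req.context), -1) >= req.level
    met = sum(satisfied(part, profile) for part in req.parts)
    if isinstance(req, Aggregation):
        return met == len(req.parts)
    return met >= req.minimum


def gaps(req, profile):
    """Simple competences that are still missing for an unmet requirement."""
    if satisfied(req, profile):
        return []
    if isinstance(req, Simple):
        return [req]
    return [gap for part in req.parts for gap in gaps(part, profile)]


# Java expert AND (English OR Spanish), loosely modelled on the job-ad excerpt.
job = Aggregation((
    Simple("Java", 3, "IT"),
    Selection((Simple("English", 2, "IT"), Simple("Spanish", 2, "IT")), 1),
))
```

The list returned by `gaps` is what a search for learning resources would then try to cover.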

  43. Common Metadata Schema: A Bit More Complex Motivation Scenario (Revisited) • Alice is a consultant and she has been asked to lead a project starting in two months. Now she needs to retrieve courses in order to: refresh and improve her previous knowledge of project management; get some basic knowledge about accounting and auditing; practice her advanced level of English

  44. Outline • Introduction and Motivation • Interoperability: what is it and why is it needed? • Common Query Interface • Common Metadata Schema • Ranking (Link-based Personalized Ranking Platform) • Successful Interoperability Demonstrations • Conclusions & Open Issues

  45. Ranking: PageRank • Page score based on the link structure of the Web • It measures page popularity • Page i pointing to page j means a vote from i to j • The more backlinks a page has, the more important it is • The score of a page is the sum of the ranks contributed by its backlinks
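
The voting idea can be made concrete with a short power-iteration sketch. It uses the commonly cited formulation with a damping factor of 0.85, which is an assumption not stated on the slide; the function name, the fixed iteration count and the three-page graph are likewise illustrative.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over `links`, a dict mapping each page to the pages it links to.

    Every link target must also appear as a key of `links`.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            if targets:
                # A page splits its vote evenly among the pages it points to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page without out-links spreads its rank over all pages.
                for t in pages:
                    new[t] += damping * rank[page] / n
        rank = new
    return rank


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = pagerank(graph)
```

In this graph page c has two backlinks and page b only one, so c ends up with the higher score; page a has a single backlink, but it comes from the highly ranked c, which is why a still outranks b.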

  46. Ranking: PageRank Example

  47. Ranking: PageRank Personalization • PageRank has a personalization vector • Computationally expensive: it is not feasible to run the whole computation for each user

  48. Ranking: Personalized PageRank • Hubs: pages pointing to many important pages • Compute one Personalized PageRank Vector (PPV) for each user • Challenges: reduce the storage required; reduce the computation time • Each PPV corresponding to a Preference Set P can be expressed as a linear combination of Basis Hub Vectors • Each Basis Hub Vector is decomposed into two parts: hub skeleton vector (common interrelationships, precomputed); partial vector (unique values, computed at construction time)
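
The linear-combination property can be checked numerically with a small sketch. It only illustrates linearity, not the hub skeleton and partial vector decomposition. The teleport probability of 0.15, the function names and the graph are assumptions, and the sketch requires every page to have at least one out-link.

```python
def personalized_pagerank(links, preference, teleport=0.15, iterations=100):
    """PageRank whose random jumps land only on pages in `preference`.

    `preference` maps pages to weights that sum to 1. Every page in `links`
    must have at least one out-link, and every target must be a key of `links`.
    """
    pages = sorted(links)
    rank = {p: preference.get(p, 0.0) for p in pages}
    for _ in range(iterations):
        new = {p: teleport * preference.get(p, 0.0) for p in pages}
        for page, targets in links.items():
            share = (1.0 - teleport) * rank[page] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank


def combine(weighted_vectors):
    """Linear combination of (weight, vector) pairs over the same pages."""
    pages = weighted_vectors[0][1].keys()
    return {p: sum(w * v[p] for w, v in weighted_vectors) for p in pages}


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
basis_a = personalized_pagerank(graph, {"a": 1.0})   # basis vector for hub a
basis_b = personalized_pagerank(graph, {"b": 1.0})   # basis vector for hub b
direct = personalized_pagerank(graph, {"a": 0.5, "b": 0.5})
mixed = combine([(0.5, basis_a), (0.5, basis_b)])
```

Here `direct` and `mixed` agree up to floating-point rounding, so a vector for any preference set over the hubs can be assembled from per-hub vectors computed once, instead of running the full computation for every user.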

  49. Ranking: Personalized PageRank Limitations • Personalization relies on the user’s ability to choose a good Preference Set, that is, high-quality hubs which match their preferences • This process can be automated: information collected from the user can be used to derive the Preference Set; the user does not even need to know what a hub is

  50. Ranking: A Personalized Ranking Platform (I) • Personalization relies on the user’s ability to choose a good Preference Set, that is, high-quality hubs which match their preferences • This process can be automated: information collected from the user can be used to derive the Preference Set; the user does not even need to know what a hub is
