
TANGO

TANGO: Table Analysis for Generating Ontologies. David W. Embley (BYU) & George Nagy (RPI), under NSF Awards 0414644 and 0414854. Information & Knowledge Management, Dr. Maria Zemankova. (a) Table Interpretation (b) Query by Table.


Presentation Transcript


  1. TANGO: Table Analysis for Generating Ontologies. David W. Embley (BYU) & George Nagy (RPI), under NSF Awards 0414644 and 0414854. INFORMATION & KNOWLEDGE MANAGEMENT, Dr. Maria Zemankova. (a) Table Interpretation (b) Query by Table. NSF TANGO BYU/RPI

  2. TANGO STEPS (diagram labels): TABLE; Wang Notation Tool; INTERPRETED TABLE; Wang Notation & XML; MINI ONTOLOGY; Ontology Editor; GROWING ONTOLOGY; Annotated Semantic Web Pages; Standard Ontology Language (OWL); Ontology Based Web Services; Form Based Specification; Extraction Ontologies; Relational Databases; Query By Table

  3. TANGO steps, with the portion covered by this presentation marked (diagram labels): TABLE; Wang Notation Tool; INTERPRETED TABLE; Wang Notation & XML; MINI ONTOLOGY; Ontology Editor; GROWING ONTOLOGY; Annotated Semantic Web Pages; Standard Ontology Language (OWL); Ontology Based Web Services; Form Based Specification; Extraction Ontologies; Relational Databases; Query By Table

  4. (a) Table Interpretation (diagram labels): Confirm or correct; HTML web pages; Extract table; Matlab table; XML table; Wang Notation; Construct Wang notation; Confirm or correct; Mini Ontology

  5. Median Income table: http://www40.statcan.ca/l01/cst01/famil108a.htm?sdi=median%20income

  6. Median Income table from Statistics Canada, displayed in the TANGO Wang Notation Tool

  7. Wang Notation • An abstract table is specified by an ordered pair (C, δ), i.e. (category, delta). • C is a finite set of labeled domains (headers, subheaders of tables, etc.). • δ maps each combination of labels drawn from the categories in C to the individual value in the corresponding table cell.

  8. Categories • Two categories in the previous table. • CATEGORY 1: (Region_Virtual, {(Canada,phi), (Newfoundland and Labrador,phi), (Prince Edward Island,phi), (Nova Scotia,phi), (New Brunswick,phi), (Quebec,phi), (Ontario,phi), (Manitoba,phi), (Saskatchewan,phi), (Alberta,phi), (British Columbia,phi), (Yukon Territory,phi), (Northwest Territories,phi), (Nunavut,phi)}) • CATEGORY 2: (Year_Virtual, {(2001,phi), (2002,phi), (2003,phi), (2004,phi), (2005,phi)})

  9. Content (leaf) cells • Delta notation for two (of 15) rows:
     delta({Year_Virtual.2001,Region_Virtual.Canada})=53,500
     delta({Year_Virtual.2002,Region_Virtual.Canada})=55,000
     delta({Year_Virtual.2003,Region_Virtual.Canada})=56,000
     delta({Year_Virtual.2004,Region_Virtual.Canada})=58,100
     delta({Year_Virtual.2005,Region_Virtual.Canada})=60,600
     delta({Year_Virtual.2001,Region_Virtual.Newfoundland and Labrador})=41,400
     delta({Year_Virtual.2002,Region_Virtual.Newfoundland and Labrador})=43,200
     delta({Year_Virtual.2003,Region_Virtual.Newfoundland and Labrador})=44,800
     delta({Year_Virtual.2004,Region_Virtual.Newfoundland and Labrador})=46,100
     delta({Year_Virtual.2005,Region_Virtual.Newfoundland and Labrador})=47,600
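The category and delta notation on slides 7 to 9 can be mirrored in a small data structure. The following is only a minimal sketch, not the TANGO tool's internal representation: the names `categories`, `delta`, `set_cell` and `lookup` are hypothetical, and modelling phi as an empty dict is an assumption. The values are the ten cells listed on slide 9.

```python
# Category labels; phi (no subcategories) is modelled as an empty dict (assumption).
# Only the labels needed for the cells on slide 9 are listed.
categories = {
    "Region_Virtual": {"Canada": {}, "Newfoundland and Labrador": {}},
    "Year_Virtual": {"2001": {}, "2002": {}, "2003": {}, "2004": {}, "2005": {}},
}

# delta maps a set of category paths (one per category) to a cell value.
# A frozenset key is order-independent, like the {..., ...} notation on the slide.
delta = {}


def set_cell(year, region, value):
    """Store one content cell under its unordered set of header paths."""
    key = frozenset({f"Year_Virtual.{year}", f"Region_Virtual.{region}"})
    delta[key] = value


canada = [53500, 55000, 56000, 58100, 60600]
newfoundland = [41400, 43200, 44800, 46100, 47600]
for year, c, n in zip(range(2001, 2006), canada, newfoundland):
    set_cell(year, "Canada", c)
    set_cell(year, "Newfoundland and Labrador", n)


def lookup(*paths):
    """Return the cell value addressed by the given header paths, in any order."""
    return delta[frozenset(paths)]


print(lookup("Region_Virtual.Canada", "Year_Virtual.2003"))  # 56000
```

Because the key is a set, the lookup does not depend on whether the year or the region is named first, which matches the layout-independent intent of the notation.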

  10. XML Representation: schema for (1) table (2) categories (3) data cells (4) augmentation
      <InterpretedTable xsi:noNamespaceSchemaLocation="G:\RPI\XML\02_TableInterface.XS.070803.xml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <Table TableOID="Table2" Number="2" DocumentCitation="Wang's Thesis" Title="Wang table" Caption="Grades in 1991 and 1992">
          <CategoryNodes>
            <CategoryNode CategoryNodeOID="C1" Label="Median Total Income"></CategoryNode>
            <CategoryNode CategoryNodeOID="C11" Label="Canada"></CategoryNode>
            <CategoryNode CategoryNodeOID="C12" Label="Newfoundland and Labrador"></CategoryNode>
            <CategoryNode CategoryNodeOID="C13" Label="Prince Edward Island"></CategoryNode>
            <CategoryNode CategoryNodeOID="C14" Label="Nova Scotia"></CategoryNode>
            <CategoryNode CategoryNodeOID="C15" Label="New Brunswick"></CategoryNode>
            <CategoryNode CategoryNodeOID="C16" Label="Quebec"></CategoryNode>
            <CategoryNode CategoryNodeOID="C17" Label="Ontario"></CategoryNode>
            <CategoryNode CategoryNodeOID="C18" Label="Manitoba"></CategoryNode>
            <CategoryNode CategoryNodeOID="C19" Label="Saskatchewan"></CategoryNode>
            <CategoryNode CategoryNodeOID="C110" Label="Alberta"></CategoryNode>
            <CategoryNode CategoryNodeOID="C111" Label="British Columbia"></CategoryNode>
            <CategoryNode CategoryNodeOID="C112" Label="Yukon Territory"></CategoryNode>
            <CategoryNode CategoryNodeOID="C113" Label="Northwest Territories"></CategoryNode>
            <CategoryNode CategoryNodeOID="C114" Label="Nunavut"></CategoryNode>
            <CategoryNode CategoryNodeOID="C2" Label="Year (Virtual)"></CategoryNode>
            <CategoryNode CategoryNodeOID="C21" Label="2001"></CategoryNode>
            <CategoryNode CategoryNodeOID="C22" Label="2002"></CategoryNode>
            <CategoryNode CategoryNodeOID="C23" Label="2003"></CategoryNode>
            <CategoryNode CategoryNodeOID="C24" Label="2004"></CategoryNode>
            <CategoryNode CategoryNodeOID="C25" Label="2005"></CategoryNode>
          </CategoryNodes>
        </Table>
        <CategoryParentNodes>
          <CategoryParentNode CategoryParentNodeOID="C1">
            <CategoryNodes> … …
      The XML file for this table has about 350 lines of object-identifier tags.
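To illustrate how the category nodes in such a file could be read back, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The fragment is a shortened copy of the slide's XML; its closing tags and the omission of the schema attributes are assumptions made so that it parses, and the `labels` dictionary is a hypothetical name. The parent/child links live in `CategoryParentNodes`, which is elided on the slide and therefore not reconstructed here.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed fragment of the slide's XML (assumed closing tags,
# schema attributes left out).
FRAGMENT = """
<InterpretedTable>
  <Table TableOID="Table2" Number="2">
    <CategoryNodes>
      <CategoryNode CategoryNodeOID="C1" Label="Median Total Income"></CategoryNode>
      <CategoryNode CategoryNodeOID="C11" Label="Canada"></CategoryNode>
      <CategoryNode CategoryNodeOID="C12" Label="Newfoundland and Labrador"></CategoryNode>
      <CategoryNode CategoryNodeOID="C2" Label="Year (Virtual)"></CategoryNode>
      <CategoryNode CategoryNodeOID="C21" Label="2001"></CategoryNode>
      <CategoryNode CategoryNodeOID="C22" Label="2002"></CategoryNode>
    </CategoryNodes>
  </Table>
</InterpretedTable>
"""

root = ET.fromstring(FRAGMENT.strip())

# Map each object identifier to its label, in document order.
labels = {
    node.get("CategoryNodeOID"): node.get("Label")
    for node in root.iter("CategoryNode")
}

for oid, label in labels.items():
    print(oid, label)
```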

  11. Verification tool: category headers for a selected content cell

  12. Verification tool: content cells for a selected header

  13. Verification tool: hierarchical category structure for a selected content cell
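The two lookups named on slides 11 and 12 follow directly from the delta mapping. The sketch below is an illustration under stated assumptions, not the verification tool's code: `headers_for_cell` and `cells_for_header` are hypothetical helper names, and the four cells are taken from slide 9.

```python
# A few content cells from slide 9, keyed by their unordered header paths.
delta = {
    frozenset({"Year_Virtual.2001", "Region_Virtual.Canada"}): 53500,
    frozenset({"Year_Virtual.2002", "Region_Virtual.Canada"}): 55000,
    frozenset({"Year_Virtual.2001", "Region_Virtual.Newfoundland and Labrador"}): 41400,
    frozenset({"Year_Virtual.2002", "Region_Virtual.Newfoundland and Labrador"}): 43200,
}


def headers_for_cell(value):
    """Return the sorted header paths of every cell holding `value`."""
    return [sorted(key) for key, v in delta.items() if v == value]


def cells_for_header(header):
    """Return the sorted values of all cells indexed by `header`."""
    return sorted(v for key, v in delta.items() if header in key)


print(headers_for_cell(43200))
# [['Region_Virtual.Newfoundland and Labrador', 'Year_Virtual.2002']]
print(cells_for_header("Year_Virtual.2001"))
# [41400, 53500]
```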

  14. (b) Query by Table (diagram labels): Filled table: Income; 2002 $4500; 2003 $3300; 2004 $1240; 2005 $3400. Query table: Income; 2002; 2003; 2004; 2005. QBT; Interpret Query Table; Database; Ontology from many tables

  15. Query table composed in MS Excel by a person seeking information from an ontology compiled from many web tables

  16. Display of automatically processed Query Table for human verification

  17. Wang notation for Query Table

  18. QBT identifies requested data

  19. URLs of tables in the Example Database
     • Median Total Income: http://www40.statcan.ca/l01/cst01/famil108a.htm?sdi=median%20income*
     • Number of Induced Abortions: http://www40.statcan.ca/l01/cst01/health40a.htm?sdi=abortions
     • Number of Divorces: http://www40.statcan.ca/l01/cst01/famil02.htm?sdi=number%20divorces
     • Infant Mortality Rate: http://www40.statcan.ca/l01/cst01/health21a.htm?sdi=infant%20mortality%20rate*
     • Trips By Canadians in Canada: http://www40.statcan.ca/l01/cst01/arts26a.htm
     • Number of Homicides: http://www40.statcan.ca/l01/cst01/legal12a.htm?sdi=homicide
     • Population: http://www40.statcan.ca/l01/cst01/demo02a.htm?sdi=population
     • Number of Persons with Diabetes: http://www40.statcan.ca/l01/cst01/health54a.htm?sdi=diabetes
     • Number of Persons with Asthma: http://www40.statcan.ca/l01/cst01/health50a.htm?sdi=asthma
     • University Degrees Awarded to Males: http://www40.statcan.ca/l01/cst01/educ51b.htm
     • University Degrees Awarded to Females: http://www40.statcan.ca/l01/cst01/educ51c.htm
     • Food services and drinking places (13 tables): http://www40.statcan.ca/l01/cst01/serv24j

  20. Fields in the Example Database • IDENTIFIER • REGION • YEAR • NUMBER_OF_ABORTIONS • ABORTION_RATE • NUMBER_OF_DIVORCES • INFANT_MORTALITY_RATE • NUMBER_OF_TRIPS • MEDIAN_TOTAL_INCOME • POPULATION • NUMBER_OF_HOMICIDES • GENDER • INCIDENCE_OF_DIABETES • UNIVERSITY_DEGREES_AWARDED • INCIDENCE_OF_ASTHMA • RESTAURANT_OPERATING_REVENUE • RESTAURANT_OPERATING_EXPENSES • RESTAURANT_OPERATING_PROFIT_MARGIN • RESTAURANT_OPERATING_WAGES

  21. QBT fills in requested data from Example Database
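The fill-in step can be pictured as one database lookup per empty cell of the query table. The sketch below is an assumption-laden illustration, not the QBT implementation: it uses an in-memory SQLite table named `example` (hypothetical) with three of the fields from slide 20, a hypothetical helper `fill_query_table`, and the median income values from slide 9.

```python
import sqlite3

# In-memory stand-in for the Example Database (table name is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example (REGION TEXT, YEAR INTEGER, MEDIAN_TOTAL_INCOME INTEGER)"
)
conn.executemany(
    "INSERT INTO example VALUES (?, ?, ?)",
    [
        ("Canada", 2001, 53500),
        ("Canada", 2002, 55000),
        ("Canada", 2003, 56000),
        ("Newfoundland and Labrador", 2002, 43200),
    ],
)

ALLOWED_FIELDS = {"MEDIAN_TOTAL_INCOME"}


def fill_query_table(field, regions, years):
    """Look up `field` for every (region, year) cell; None marks missing data."""
    # Column names cannot be bound as SQL parameters, so check a whitelist.
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field}")
    filled = {}
    for region in regions:
        for year in years:
            row = conn.execute(
                f"SELECT {field} FROM example WHERE REGION = ? AND YEAR = ?",
                (region, year),
            ).fetchone()
            filled[(region, year)] = row[0] if row else None
    return filled


result = fill_query_table("MEDIAN_TOTAL_INCOME", ["Canada"], [2002, 2003, 2006])
print(result)
# {('Canada', 2002): 55000, ('Canada', 2003): 56000, ('Canada', 2006): None}
```

Returning `None` for cells with no matching row keeps the shape of the query table intact, so a person verifying the result can see which requests could not be answered.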

  22. A current puzzle: How can QBT tell that these two query tables represent the same request? NB: Although plausible, both of these tables exemplify poor layout.
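The two query tables from the slide are not reproduced in this transcript, so the following is not the authors' answer to the puzzle. It is only a sketch of the simplest case, assuming the two layouts differ by transposition alone: reducing each grid to its set of requested header combinations makes the comparison independent of layout. The function `requested_cells` and both example grids are hypothetical.

```python
def requested_cells(grid):
    """Reduce a query grid to the set of header pairs whose cells are empty.

    Assumes the first row holds column headers and the first column holds
    row headers; an empty string marks a requested value.
    """
    col_headers = grid[0][1:]
    requests = set()
    for row in grid[1:]:
        row_header = row[0]
        for col_header, cell in zip(col_headers, row[1:]):
            if cell == "":
                requests.add(frozenset({row_header, col_header}))
    return requests


# Same request laid out with years as columns, then with years as rows.
by_year = [
    ["", "2002", "2003"],
    ["Canada", "", ""],
]
by_region = [
    ["", "Canada"],
    ["2002", ""],
    ["2003", ""],
]

print(requested_cells(by_year) == requested_cells(by_region))  # True
```

Layouts that differ in more than orientation, for example by nesting or repeating headers, would need more than this normalization.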

  23. Next steps • Complete the conversion of Wang/XML table descriptions to mini ontologies • Improve the interface for generating a cumulative ontology from mini ontologies • Implement database generation from the ontology • Embed logging routines for statistical evaluation of time/error trade-offs
