Semantic Web Technology: Leading the Migration Path from Static / Library To Dynamic / Network Architecture

mary
Presentation Transcript


  1. Semantic Web Technology: Leading the Migration Path from Static / Library To Dynamic / Network Architecture

  2. Information Architecture can be defined as:
     1. Document Based Static/Library: The current paradigm is Libraries; from stone tablets to digital databases.
     • Documents are manually collected, read, classified, and stored.
     • Users find documents by static classification system and keyword tools.
     • Users must retrieve entire documents to find knowledge inside the document.
     • Users must provide their own context and analysis of the document.
     But:
     • New documents do not automatically update the classification system or database.
     • The classification system can be changed only by manually re-classifying all old documents.
     • Users must take the initiative to query the Library, and manually process the results.
     2. Knowledge Based Dynamic/Network: The next generation is Synthetic Information Networks based on knowledge.
     • Documents are automatically captured based on end user collection priorities.
     • Documents are automatically pre-processed to extract all the concepts and context.
     • Documents are automatically classified and stored based on the latest concept schema.
     • Users automatically define / refine their topic priorities to “teach” the System their needs.
     • The system automatically provides knowledge: summaries, abstracts, analysis and translations.
     The next generation operates in parallel and above static libraries and keyword tools.

  3. Information Architecture Trends: Commercial and government user communities will soon be forced to migrate:
     1) Tidal Wave of Information Shifts Power: The tidal wave of raw data will drive the expansion of Semantic Web architecture and applications.
     2) Migration to XML and RDF Standards: Applications will follow Microsoft’s migration to XML/RDF standards for document authoring and exchange.
     3) Universal Internet Web Portals: Internet web portals based on Semantic Web applications will become the new central user application.
     4) Parallel Legacy Database Integration: Legacy databases will be extracted into parallel Semantic Web databases with integrated concept and context.
     5) Global and Language Expansion: Information sources and users will expand globally and drive no-loss translation between language domains.
     6) Network Access and Distribution: Semantic Web architecture will link data and users between servers, desktops, laptops, PDAs, and cell phones.
     7) Machine Transactions and Network Capacity: Machine Semantic transactions will grow exponentially, and will increase network investment, capacity and services.
     1981 déjà vu again: The Technology Outcome is known; the Leadership is not yet known.

  4. 1. Paper forms include both Headings (structure) and Descriptions (content).
     • Name: “Name” has legal, spelling and other variations.
     • Address: “Address” has time, accuracy and other variations.
     • Comments: “Comments” is completely unstructured data, and may refer to and/or conflict with other data in one document or documents, or the database. Example: “According to sources who have reported reliably in the past....”
     The Data look “hard” and “fixed” but are very “soft” and “fluid”.

  5. 2. Electronic systems duplicate the same hard (structure) and soft (content).
     [Figure labels: Name, Address, Comments, shown twice, once for the paper form and once for its electronic copy]
     The Data are copied from paper to bits; but no value is added.

  6. 3. Relational databases define categories and store data in a static library archive.
     [Figure: the Name, Address, and Comments fields with sample records (Name1 to Name5, Address1, Comment1) under the operations “List”, “Print”, “Compare”, and “Search”]
     Electronic search functions are the same as human functions. Only faster.

  7. Most systems today use simple keyword tools to search in static libraries. Google and other keyword tools are better than nothing, but they fail the efficiency and performance confidence tests:
     a) Scale factors: More data means more hits: “You have 100,000 hits…”
     b) False Positives: You will only read the first 3-10; most hits will not be relevant to your needs.
     c) False Negatives: You will miss valid sources if a word, term or user community is different.
     d) Raw Sources: A perfect hit is a raw document; not the summary, analysis, context or expertise.
     Single data domains with simple keyword tools are rapidly becoming obsolete.
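
The false-positive and false-negative failure modes listed on this slide can be illustrated with a minimal sketch. The three sample documents and the `keyword_search` helper are hypothetical and are not taken from the presentation; they only show how plain substring matching behaves.

```python
# Hypothetical mini-corpus; the documents are illustrative only.
DOCUMENTS = {
    "doc1": "Java is an island in Indonesia with many volcanoes.",
    "doc2": "The Java programming language runs on a virtual machine.",
    "doc3": "Developers compile source code to bytecode for the JVM.",
}


def keyword_search(term, documents):
    """Return ids of documents whose text contains the term (case-insensitive)."""
    needle = term.lower()
    return sorted(doc_id for doc_id, text in documents.items()
                  if needle in text.lower())


# A user interested in the programming language searches for "java".
hits = keyword_search("java", DOCUMENTS)

# doc1 is a false positive (it is about geography), and doc3 is a
# false negative (relevant, but it never uses the word "Java").
print(hits)
```

The keyword tool has no notion of which "Java" the user means, which is the gap the later slides propose to close with concept clusters.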

  8. Adding Multiple Databases and Tools quickly reduces quality and efficiency. Searching multiple sources with multiple tools is slower and less accurate.

  9. Adding new databases creates multiple conflicting terms and classifications. More new data now complicates rather than improves the system.

  10. Infinite interface patches cannot integrate all new data in all new databases. Note: This is the $500M Trilogy Program. Patching the new linkages in new databases is an exponential cost problem.

  11. Solution: Capture all data in multiple internal and inter-agency legacy databases, and automatically integrate new data, new classifications and new definitions.
      [Figure labels: Structured Relational Database, Unstructured Text Data, Taxonomies and Context Sources, Semantic Processor]
      The first step is to extract and process the data, and store it in one Semantic Web.
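
A minimal sketch of this first step, under stated assumptions: the legacy table `person` with `name`, `address`, and `comments` columns and the `http://example.org/` namespace are hypothetical stand-ins, and the presentation does not describe how its Semantic Processor actually performs the extraction. The idea shown is only that each relational row can be restated as subject-predicate-object triples that live beside the legacy database.

```python
import sqlite3

EX = "http://example.org/"  # hypothetical namespace, for illustration only


def rows_to_triples(connection):
    """Turn each row of the (assumed) legacy 'person' table into triples."""
    triples = []
    cursor = connection.execute(
        "SELECT id, name, address, comments FROM person ORDER BY id")
    for row_id, name, address, comments in cursor:
        subject = f"{EX}person/{row_id}"
        fields = (("name", name), ("address", address), ("comments", comments))
        for predicate, value in fields:
            if value is not None:  # skip empty fields instead of storing nulls
                triples.append((subject, EX + predicate, value))
    return triples


# Build a throwaway in-memory stand-in for a legacy database.
legacy = sqlite3.connect(":memory:")
legacy.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, address TEXT, comments TEXT)")
legacy.execute("INSERT INTO person VALUES (1, 'Name1', 'Address1', NULL)")
legacy.execute("INSERT INTO person VALUES (2, 'Name2', 'Address1', 'Comment1')")

triples = rows_to_triples(legacy)
for triple in triples:
    print(triple)
```

The legacy table is only read, never modified, which matches the deck's claim that the semantic layer runs in parallel to existing systems.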

  12. Semantic Web technology operates above and in parallel to legacy systems, preserving the legacy system and dramatically enhancing its performance.
      [Figure labels: Semantic Processor, Semantic Database, Semantic Context, Legacy Databases, Other Semantic Web Sources, Synthetic Information Network, Legacy Tools & Legacy Database, Semantic Web Portal & Local Database]
      The second step is a network with access to ALL sources in ONE Portal.

  13. The Solution is a Semantic Web Architecture and a Dynamic Process:

  14. Synthetic Information architecture supports all data sources and user applications:

  15. A Semantic Web Database: Integrating 3 Unique Sources for “Duke”

  16. A Semantic Web Database: Integrating 3 Unique Sources for “Duke”

  17. Semantic Web Example: “Java” Database built from unstructured text data:
      • Java occupies the central island in the Indonesia archipelago.
      • Java is a computer language controlled by Sun Microsystems.
      • Java is often used as the slang term for coffee beverages.
      The Semantic Web defines the relationships between the 3 “Java” concepts.

  18. Separating unstructured text into Knowledge Clusters or Ontologies:
      • Java = geography cluster
      • Java = computer language cluster
      • Java = beverage cluster
      This automatically creates Knowledge Clusters that define unique concepts and attributes.
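
One way to picture the separation into clusters is a simple word-overlap classifier. This is a sketch under assumptions, not the Semantic Processor's actual method: the cue-word sets in `CLUSTER_CUES` are hand-written and hypothetical, whereas the deck says clusters are created automatically.

```python
import re

# Hypothetical context cues for each cluster; a real system would derive these.
CLUSTER_CUES = {
    "geography": {"island", "indonesia", "archipelago"},
    "computer language": {"computer", "language", "sun", "microsystems"},
    "beverage": {"coffee", "beverages", "slang"},
}


def assign_cluster(sentence):
    """Pick the cluster whose cue words overlap most with the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {name: len(words & cues) for name, cues in CLUSTER_CUES.items()}
    best = max(scores, key=scores.get)
    # No overlap at all means the sentence fits none of the known clusters.
    return best if scores[best] > 0 else None


SENTENCES = [
    "Java occupies the central island in the Indonesia archipelago.",
    "Java is a computer language controlled by Sun Microsystems.",
    "Java is often used as the slang term for coffee beverages.",
]

clusters = {sentence: assign_cluster(sentence) for sentence in SENTENCES}
for sentence, cluster in clusters.items():
    print(f"{cluster}: {sentence}")
```

The same surface word lands in three different clusters because the decision uses the surrounding context rather than the keyword itself.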

  19. The Semantic Processor automatically defines the Specific Relationships, identifying the Temporal, Logical, and Topical Contents within and between Clusters.
      [Figure labels: Temporal Relationship, Logical Relationship, Topical Concepts]

  20. The Semantic Processor automatically builds a rich Ontology for each Concept: New data adds to the context of old data, and enhances the value of all data.
      [Figure labels: Java Ontology, Computer Ontology, Coffee Ontology]

  21. The Semantic Processor automatically expands these Ontologies with new data. A Semantic Processor has automatically:
      a. Defined the source language and loaded the correct language processor.
      b. Analyzed the document, and defined the logical relationships between concepts.
      c. Extracted the data in RDF format, and built a rich ontology for each concept.
      d. Stored the information in a Semantic Web Database to permit rapid modifications.
      e. Created the classification categories to store and retrieve the document.
      f. Made all the information available to any Semantic Web application program and user.
      g. Retained the original document and linked back to it as a reference.
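
The processing steps on this slide can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the `SemanticStore` class, its method names, and the naive "X is Y" extraction rule are hypothetical and far simpler than any real semantic analysis. The sketch only shows new documents extending what is already known about a concept while the originals stay retrievable.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticStore:
    """Toy in-memory stand-in for a Semantic Web database."""
    triples: set = field(default_factory=set)
    originals: dict = field(default_factory=dict)

    def add_document(self, doc_id, text, language="en"):
        # A real system would detect the language; here the caller supplies it.
        self.originals[doc_id] = text  # keep the original document
        self.triples.add((doc_id, "language", language))
        # Naive "X is Y" extraction stands in for real semantic analysis.
        for sentence in text.split("."):
            subject, sep, rest = sentence.strip().partition(" is ")
            if sep:
                self.triples.add((subject, "is", rest))
                # Link each extracted statement back to its source document.
                self.triples.add((subject, "source", doc_id))

    def about(self, subject):
        """Return every (predicate, object) pair known for a subject."""
        return sorted((p, o) for s, p, o in self.triples if s == subject)


store = SemanticStore()
store.add_document("doc1", "Java is an island. Jakarta is a city.")
store.add_document("doc2", "Java is a part of Indonesia.")
print(store.about("Java"))
```

After the second document is added, a query about "Java" returns statements from both sources, each traceable to the document it came from.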

  22. Information is extracted automatically in RDF format: The external header and internal content are integrated and stored in a common RDF format.
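
As an illustration of storing header and content in one common format, the sketch below writes both as N-Triples, one of the standard RDF serializations. The document URI, the field values, and the choice of Dublin Core terms as predicates are assumptions made for this example; the presentation does not say which vocabulary or serialization its system uses.

```python
DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace


def escape_literal(value):
    """Escape characters that N-Triples string literals must not contain raw."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def to_ntriples(doc_uri, header, content):
    """Serialize header fields and body text as N-Triples lines."""
    lines = []
    for key, value in header.items():
        lines.append(f'<{doc_uri}> <{DCTERMS}{key}> "{escape_literal(value)}" .')
    # The body text is stored alongside the header, under the same subject.
    lines.append(
        f'<{doc_uri}> <{DCTERMS}description> "{escape_literal(content)}" .')
    return "\n".join(lines)


# Hypothetical document; the URI and field values are illustrative only.
ntriples = to_ntriples(
    "http://example.org/doc/1",
    {"title": "Java overview", "creator": "Unknown", "date": "2004-01-01"},
    'Java is often used as the slang term for "coffee".',
)
print(ntriples)
```

Because header fields and body text share one subject URI, a consumer can query them together without knowing which part of the original file each value came from.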

  23. All information is available through a Single Semantic Web Portal Interface:
      • User Configuration
      • All Sources
      • Original Languages
      • All File Formats
      • Security Levels
      • Interactive Dialogue
        - Related Terms
        - Similar Terms
        - Expanded Phrases
      • Concept Groups
        - Dynamic categories from user queries
      • Dynamic Summary
        - Abstract from a document or a Concept Group
      • Document Classification
      • Author
      • Document Date
      • Relevance Ranking
      • Original Documents
      • Category & Title
      • Brief Summary
      • File Format
      • Data Source
      • Language
      • Related Documents
      The problem of searching multiple databases with multiple tools and conflicting results is solved.

  24. Synthetic Information Architecture supports powerful New Applications:
      • Search: Find any document in any on-line or intranet database.
      • Capture: Convert documents from any format (Word, PDF, HTML, etc.).
      • Synthesize: Build and maintain a real time semantic web database.
      • Summarize: Automatic summaries in any size, format or focus.
      • Analyze: Find and summarize sources and answers to questions.
      • Reference: List documents and sources that support user questions.
      • Report: Distribute sources, summaries, analysis, and experts.
      • Alert: Automatic customized search, analysis and reporting.
      • Experts: Identify and qualify all experts and organization sources.
      • Internet: Search the Internet or Intranet sources automatically.
      • Single Portal: Access all information sources from a single portal.
      • Automate: Search, extraction, summary, analysis and reporting.
      Advanced functions are not possible with static library and keyword tools.

  25. Conclusions:
      1. Synthetic Information architecture and applications will follow IT history, e.g. the Intel/Microsoft architecture with VisiCalc spreadsheet applications in 1981-83.
      2. Dynamic Networks will rapidly grow on top of legacy static library systems. There is no need to stop current systems; only interface through RDF/XML and enhance them.
      3. Scarce/expensive professional users will drive the architecture migration. The greatest pain/gain is with the advanced professional users; not management or IT staff.
      4. Architecture change will grow quickly from the Outside In / Bottom Up: The scale of cost, risk, schedule, and training is within small office skills/budgets.
      5. Established IT vendors will wait for large client RFI/RFPs, and miss the boat. Architecture / Applications dominance is driven by forward design wins; not purchasing.
      6. The Limiting Factor is Technical Leadership; not Procurement Funding. The installed costs are low, and the payback in professional time and quality is rapid.
