
Khaled Shaban, PhD Candidate. Supervisors: Dr. Otman Basir, Dr. Mohammad Kamel


Presentation Transcript


  1. Khaled Shaban, PhD Candidate. Supervisors: Dr. Otman Basir, Dr. Mohammad Kamel

  2. Previous work
     • MSc. Thesis, 2002, “Information Fusion in a Cooperative Multiagent System for Web Information Retrieval”
     • K. B. Shaban, O. A. Basir, K. Hassanein, and M. Kamel, “Intelligent Information Fusion Approach in Cooperative Multiagent Systems”, World Automation Congress, June 2002.
     • K. B. Shaban, O. A. Basir, K. Hassanein, and M. Kamel, “Information Fusion in a Cooperative Multi-agent System for Web Information Retrieval”, The Fifth International Conference on Information Fusion, July 2002.

  3. System vision: Envisioned View of the System
     [Diagram labels: User (×2), Personal Agent (×2), Intermediate “Fusion” Agent, Resource “Information Retrieval” Agent, Environment “The Web”]

  4. Decision Fusion
     [Diagram, three panels: (a) Markovian team, (b) Centralized team, (c) Consensus team. Labels: A1, A2, A3, … An; Z1, Z2, Z3, … Zn; R1, R2, … Rn; RG; Environment; Decision Maker]

  5. Implementation
     [Diagram labels: Personal Agent, Fusion Agent, Retrieval Agent (×3), AltaVista (×2), Excite]

  6. Current Project “Semantic-based Document Clustering”

  7. Project Goals
     • Clustering documents based on semantic similarities of their contents
     • Lend ideas to other mining projects
     • PhD thesis by 2005/2006!

  8. Document Clustering
     [Diagram: Documents → Clustering → Document Clusters, with high intra-cluster similarity and low inter-cluster similarity]

  9. Applications
     • Improve the performance of information retrieval systems
     • Improve the organization and viewing of documents
     • Accelerate nearest-neighbour search
     • Generate directories of hierarchical clusters
     • Improve automatic speech recognition systems

  10. Existing Schemes
     • Data representation models
        • Documents as bags-of-words (Vector Space Model (VSM))
        • N-grams
        • Latent Semantic Indexing (LSI)
        • Phrase-based
     • Similarity measures
        • Euclidean distances
        • Minkowski distances
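The two similarity measures named on slide 10 can be illustrated with a minimal sketch over bag-of-words term counts. The helper names `bag_of_words` and `minkowski` are hypothetical and the whitespace tokenizer is an assumption; the slides do not say how the vectors were built. The Euclidean distance is the Minkowski distance with p = 2.

```python
from collections import Counter


def bag_of_words(text):
    """Map a text to term counts (naive lowercase, whitespace tokenization)."""
    return Counter(text.lower().split())


def minkowski(a, b, p=2):
    """Minkowski distance of order p between two term-count vectors.

    p=1 gives the Manhattan distance, p=2 the Euclidean distance.
    Terms missing from a Counter count as 0.
    """
    vocabulary = set(a) | set(b)
    return sum(abs(a[t] - b[t]) ** p for t in vocabulary) ** (1.0 / p)


doc_a = bag_of_words("the apple tree")
doc_b = bag_of_words("the apple house")
print(minkowski(doc_a, doc_b, p=2))  # Euclidean distance
print(minkowski(doc_a, doc_b, p=1))  # Manhattan distance
```

Both measures only compare term counts, so two documents are close whenever they share vocabulary, regardless of how the words relate to each other.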

  11. Existing Schemes, Cont.
     • Clustering algorithms
        • Partitioning (k-means & Fuzzy C-means)
        • Geometric (Self-Organizing Maps (SOM), LSI)
        • Probabilistic (Expectation Maximization (EM), Probabilistic LSI)
     • Evaluation methods
        • Entropy
        • F-measure
        • Overall Similarity
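Two of the evaluation methods on slide 11, entropy and F-measure, can be sketched as below. This assumes the definitions commonly used in the document-clustering literature: entropy is the size-weighted average of the class-label entropy inside each cluster, and the overall F-measure takes, for each true class, the best F score over all clusters, weighted by class size. The slides give no formulas, so these definitions and the function names are assumptions.

```python
import math
from collections import Counter


def cluster_entropy(labels, clusters):
    """Size-weighted average entropy (base 2) of true labels within each cluster.

    Lower is better; 0 means every cluster contains a single class.
    """
    n = len(labels)
    total = 0.0
    for cluster_id in set(clusters):
        members = [l for l, c in zip(labels, clusters) if c == cluster_id]
        size = len(members)
        h = -sum((m / size) * math.log2(m / size)
                 for m in Counter(members).values())
        total += (size / n) * h
    return total


def overall_f_measure(labels, clusters):
    """Class-size-weighted average of the best per-cluster F score for each class.

    Higher is better; 1 means each class is recovered exactly by some cluster.
    """
    n = len(labels)
    class_sizes = Counter(labels)
    cluster_sizes = Counter(clusters)
    joint = Counter(zip(labels, clusters))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for cluster_id, n_j in cluster_sizes.items():
            n_ij = joint[(cls, cluster_id)]
            if n_ij:
                precision, recall = n_ij / n_j, n_ij / n_i
                best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best
    return total


true_labels = ["a", "a", "b", "b"]
print(cluster_entropy(true_labels, [0, 0, 1, 1]), overall_f_measure(true_labels, [0, 0, 1, 1]))
print(cluster_entropy(true_labels, [0, 1, 0, 1]), overall_f_measure(true_labels, [0, 1, 0, 1]))
```

Both measures need ground-truth class labels. Overall Similarity, the third method listed, is not covered by this sketch.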

  12. Shortcomings
     • Abandoning meaning produces wrong results!
     • Examples:
        • “John eats the apple standing beside the tree” vs. “The apple tree stands beside John’s house”
        • “John is an intelligent boy” vs. “John is a brilliant son”
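The shortcoming on slide 12 can be reproduced with a small bag-of-words experiment. This is a minimal sketch, assuming raw term counts, a letters-only tokenizer, and cosine similarity; none of these choices come from the slides, and `bag_of_words` and `cosine` are hypothetical helper names.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercase the sentence and count runs of letters as terms."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # missing terms count as 0
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Pair 1: shared vocabulary, different meaning.
different_meaning = cosine(
    bag_of_words("John eats the apple standing beside the tree"),
    bag_of_words("The apple tree stands beside John's house"),
)

# Pair 2: similar meaning, little shared vocabulary.
similar_meaning = cosine(
    bag_of_words("John is an intelligent boy"),
    bag_of_words("John is a brilliant son"),
)

print(different_meaning, similar_meaning)
```

Under these assumptions the first pair scores higher than the second, even though only the second pair expresses nearly the same meaning. That is the behaviour the slide objects to.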

  13. Proposed Approach
     [Diagram: Documents → Syntactic analysis → Parse Tree → Semantic analysis → Knowledge Representation scheme → Semantic-based document clustering → Document Clusters]

  14. Proposed Approach - Steps
     • Preprocess text
        • Remove tags, hyperlinks, etc.
     • Morphological analysis
        • Identifying words, punctuation, etc.
     • Syntactic analysis
        • Building the grammatical structures of sentences (Parse Tree)
     • Semantic analysis
        • Assigning meaning to words
     • Discourse integration
     • Pragmatic analysis
     • Knowledge representation structure
     • Clustering using the produced representations
        • New similarity measures
        • New clustering algorithm
     • Better document clustering results (hopefully!)
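The first two steps on slide 14, preprocessing and the identification of words and punctuation, can be sketched with regular expressions. This is only an illustration under simple assumptions (HTML-style tags, `http`/`https` links, one token per word or punctuation mark); the slides do not describe the actual tooling, and the later syntactic and semantic steps are not attempted here.

```python
import re

TAG = re.compile(r"<[^>]+>")            # HTML/XML-style tags
URL = re.compile(r"https?://\S+")       # bare hyperlinks left in the text
TOKEN = re.compile(r"\w+|[^\w\s]")      # a word, or a single punctuation mark


def preprocess(raw):
    """Strip tags first (so links inside attributes go with them), then bare URLs."""
    text = TAG.sub(" ", raw)
    return URL.sub(" ", text)


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN.findall(text)


raw = "<p>John eats the <b>apple</b>.</p> See http://example.com/x"
tokens = tokenize(preprocess(raw))
print(tokens)
```

The resulting token list is the kind of input a parser would need for the syntactic analysis step.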

  15. Illustration: Parse Trees
     • “John eats the apple standing beside the tree.” vs. “The apple tree stands beside John’s house.”
     [Diagram: one parse tree per sentence (sent 1, sent 2); node labels include clause, np, vg, adv, prep, n, v, det, apos]

  16. Illustration, Cont.: Knowledge Representations
     [Diagram: fragments of the two sentences (“John”, “eats”, “the apple”, “standing beside the tree”; “The apple tree”, “stands beside”, “John’s house”) with labels Act 1, Act 2, Obj 1 (×2), Obj 2, Obj 3, St 1]

  17. Relation to LORNET?
     • Findings can be applied to Learning Objects (LO) mining
        • Knowledge Representations
        • Clustering
        • Classification
        • Retrieval
        • Knowledge Sharing

  18. Milestones
     • Phase 1: Grad. courses, Lit. review, Proposal, Comp. Exam
     • Phase 2: Development, Experimentations, Evaluations
     • Phase 3: Reporting, Thesis writing, Defence
     [Timeline marks: Jan 03, Jan 04, Jan 05]

  19. Thank you! Questions?
