ITEC 4020 M Group 18 Amna Al-Omari Divya Love Norbert Megler Omer Saleem Sachin Uppal Shahla Defileh

WEB SEARCH SYSTEM Presentation Overview Brief Overview of Assignment Objective Structure and Functionality Search Demonstration Questions

WEB SEARCH SYSTEM Introduction
Our website can be found at http://unix.aml.yorku.ca:8080/w04_g18/search.jsp
Our Web Search system is based on inverted file indexing of the XML document created by the crawler that was supplied to us.
Our site contains 3 main web pages:
* The main Search page, built with JSP, which contains a text box and 2 buttons (reset and submit).
* The result page, built with JSP, which contains hyperlinks to all documents that hold the keyword.
* The display page, built with XML, which displays the “clicked on” document.

WEB SEARCH SYSTEM Logical structure
1- A Java class reads the given XML file and splits it into 1139 separate XML documents.
- We read the XML file using FileInputStream and BufferedReader.
- The file is read one line at a time, and each line is compared to the marker “<PubmedArticle>”, which signals the beginning of a new article.
- When the marker is detected, a new XML document file is created.
- The file number is tracked in a counter; once the whole article is written to its file, the counter is incremented by one.
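The splitting step above can be sketched roughly as follows. This is a minimal illustration, not the group's actual class: the name `ArticleSplitter`, the in-memory return value (instead of writing numbered files), and the use of a closing `</PubmedArticle>` line to end an article are all assumptions. It also assumes each tag sits on its own line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the splitter; names are illustrative only.
class ArticleSplitter {
    static final String START_TAG = "<PubmedArticle>";
    static final String END_TAG = "</PubmedArticle>"; // assumed end marker

    /** Reads line by line and collects the text of each article. */
    static List<String> split(Reader input) {
        List<String> articles = new ArrayList<>();
        StringBuilder current = null; // null while outside an article
        try (BufferedReader reader = new BufferedReader(input)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(START_TAG)) {
                    current = new StringBuilder(); // a new article begins
                }
                if (current != null) {
                    current.append(line).append('\n');
                    if (line.contains(END_TAG)) {
                        // Article complete; its list position plays the
                        // role of the file counter.
                        articles.add(current.toString());
                        current = null;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return articles;
    }

    public static void main(String[] args) {
        String xml = "<PubmedArticleSet>\n<PubmedArticle>\n<T>A</T>\n</PubmedArticle>\n"
                   + "<PubmedArticle>\n<T>B</T>\n</PubmedArticle>\n</PubmedArticleSet>\n";
        System.out.println(split(new StringReader(xml)).size() + " articles found");
    }
}
```

To write files as on the slide, each returned string could be saved under a name derived from its list index; a `FileInputStream` wrapped in an `InputStreamReader` can be passed in place of the `StringReader`.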

WEB SEARCH SYSTEM Continued
2- Create a temporary (merged) file by going through all 1100 documents and identifying:
- all terms
- their document numbers
- their frequencies
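One possible shape for this temporary file is a sorted list of "term documentNumber frequency" lines. The sketch below assumes that layout, along with a simple tokenizer (strip tags, lowercase, split on non-alphanumerics) that the slides do not describe; `TempFileBuilder` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the tokenization rules are assumptions.
class TempFileBuilder {
    /** Counts how often each term occurs in one document. */
    static Map<String, Integer> termFrequencies(String xml) {
        Map<String, Integer> counts = new TreeMap<>();
        String text = xml.replaceAll("<[^>]*>", " ").toLowerCase(Locale.ROOT);
        for (String token : text.split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Builds "term docNumber frequency" lines, sorted by term, then document. */
    static List<String> buildPostings(List<String> documents) {
        Map<String, Map<Integer, Integer>> merged = new TreeMap<>();
        for (int i = 0; i < documents.size(); i++) {
            int docNumber = i + 1; // documents are numbered from 1
            for (Map.Entry<String, Integer> e : termFrequencies(documents.get(i)).entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new TreeMap<>())
                      .put(docNumber, e.getValue());
            }
        }
        List<String> lines = new ArrayList<>();
        merged.forEach((term, docs) ->
            docs.forEach((doc, freq) -> lines.add(term + " " + doc + " " + freq)));
        return lines;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        docs.add("<T>gene therapy gene</T>");
        docs.add("<T>therapy</T>");
        buildPostings(docs).forEach(System.out::println);
    }
}
```

Keeping the lines sorted by term means all entries for one term are adjacent, which makes a later single-pass index build straightforward.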

WEB SEARCH SYSTEM Continued
3- Next we create the first-level index.
- It uses a simple Java class which reads from the temporary file.
- This index includes a counter (which represents the total number of terms), the terms (each appearing only once), the number of the document that includes that specific term, and the total frequency of each term; all of this is then written to a text file.

Continued
4- Next is the creation of the second-level index, built by a simple Java class; each entry includes the counter, term, document number, and frequency.
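The two index levels might be derived from the temporary file in a single pass, roughly as sketched here. The line layouts are one reading of the slides, not a confirmed format: first-level lines are assumed to be "counter term documentCount totalFrequency" and second-level lines "counter term documentNumber frequency", with the counter acting as a running term number. The class name `TwoLevelIndex` and the input format "term documentNumber frequency" (sorted by term) are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the line layouts are assumed, not taken from the project.
class TwoLevelIndex {
    final List<String> firstLevel = new ArrayList<>();
    final List<String> secondLevel = new ArrayList<>();

    /** Builds both levels from "term docNumber frequency" lines sorted by term. */
    static TwoLevelIndex build(List<String> postings) {
        TwoLevelIndex index = new TwoLevelIndex();
        int counter = 0;          // running term number
        String currentTerm = null;
        int documentCount = 0;    // documents containing the current term
        int totalFrequency = 0;   // occurrences summed over those documents
        for (String line : postings) {
            String[] parts = line.split(" ");
            String term = parts[0];
            int frequency = Integer.parseInt(parts[2]);
            if (!term.equals(currentTerm)) {
                if (currentTerm != null) {
                    index.firstLevel.add(counter + " " + currentTerm + " "
                            + documentCount + " " + totalFrequency);
                }
                counter++;
                currentTerm = term;
                documentCount = 0;
                totalFrequency = 0;
            }
            documentCount++;
            totalFrequency += frequency;
            index.secondLevel.add(counter + " " + term + " " + parts[1] + " " + frequency);
        }
        if (currentTerm != null) { // flush the last term
            index.firstLevel.add(counter + " " + currentTerm + " "
                    + documentCount + " " + totalFrequency);
        }
        return index;
    }

    public static void main(String[] args) {
        TwoLevelIndex index = build(Arrays.asList("gene 1 2", "therapy 1 1", "therapy 2 1"));
        index.firstLevel.forEach(System.out::println);
        index.secondLevel.forEach(System.out::println);
    }
}
```

The first level stays small (one line per distinct term), so it can be scanned quickly before touching the larger second level.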

Searching Functionality
We use an MVC (model-view-controller) architecture: a Servlet acts as the controller, JSP displays the results, and a Java Bean holds the main business logic.
Once the user submits a keyword, the search looks up the first-level index to find the counter number for that term, then matches that counter number against the second-level index to find all XML document numbers listed for it. It then goes through those XML documents and grabs each relevant document title for display.
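The lookup described above (keyword to counter, counter to document numbers) could look something like this inside the bean. This is a sketch under assumptions: `IndexSearcher` is a hypothetical name, the indexes are held as in-memory lines, and the layouts "counter term ..." (first level) and "counter term documentNumber frequency" (second level) are guesses at the file format. The servlet and JSP parts are omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the bean's lookup logic; index layouts are assumed.
class IndexSearcher {
    /** Returns the counter for the keyword from the first level, or -1 if absent. */
    static int findCounter(String keyword, List<String> firstLevel) {
        for (String line : firstLevel) {
            String[] parts = line.split(" ");
            if (parts[1].equalsIgnoreCase(keyword)) {
                return Integer.parseInt(parts[0]);
            }
        }
        return -1;
    }

    /** Returns the numbers of all documents that contain the keyword. */
    static List<Integer> search(String keyword, List<String> firstLevel,
                                List<String> secondLevel) {
        List<Integer> documents = new ArrayList<>();
        int counter = findCounter(keyword.trim(), firstLevel);
        if (counter < 0) {
            return documents; // keyword is not in the index
        }
        for (String line : secondLevel) {
            String[] parts = line.split(" ");
            if (Integer.parseInt(parts[0]) == counter) {
                documents.add(Integer.parseInt(parts[2]));
            }
        }
        return documents;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("1 gene 1 2", "2 therapy 2 2");
        List<String> second = Arrays.asList("1 gene 1 2", "2 therapy 1 1", "2 therapy 2 1");
        System.out.println(search("therapy", first, second));
    }
}
```

The returned document numbers would then be used to open the matching XML files and read their titles for the result page.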

Displaying of Results
XSL files take care of this functionality. Once the user clicks on a hyperlink, that document is displayed as XML styled by an XSL file.
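The slides do not say whether the stylesheet is applied in the browser or on the server. As an illustration only, the sketch below applies an XSL stylesheet on the server with the standard `javax.xml.transform` API. The class name `ArticleRenderer`, the sample stylesheet, and the choice of `ArticleTitle` as the element to show are assumptions, not the group's actual files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch; the stylesheet and element names are illustrative.
class ArticleRenderer {
    /** A tiny sample stylesheet that shows the article title as a heading. */
    static final String SAMPLE_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\"/>"
      + "<xsl:template match=\"/\">"
      + "<html><body><h1><xsl:value-of select=\"//ArticleTitle\"/></h1></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML document and returns the result. */
    static String render(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<PubmedArticle><ArticleTitle>Sample title</ArticleTitle></PubmedArticle>";
        System.out.println(render(xml, SAMPLE_XSL));
    }
}
```

If the styling is instead done in the browser, the same stylesheet would be referenced from each XML document with an `xml-stylesheet` processing instruction and no server-side code would be needed.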

The End Questions? Thank You
