
XGTagger, a generic interface dealing with XML contents.

Xavier Tannier, Jean-Jacques Girardot, Mihaela Mathieu. Ecole des Mines de Saint-Etienne. September 19th, 2005.


Presentation Transcript


  1. XGTagger, a generic interface dealing with XML contents. Xavier Tannier, Jean-Jacques Girardot, Mihaela Mathieu. Ecole des Mines de Saint-Etienne. September 19th, 2005.

  2. XGTagger workflow, steps 1 to 4. System S is a black box whose input and output are text only; here it is a POS tagger.
     1. Initial XML document: <book> <title>Gone with the wind</title> <author>Margaret Mitchell</author> </book>
     2. Input of System S (text only), produced by XGTagger: Gone with the wind . Margaret Mitchell
     3. System S (POS tagger) processes the text.
     4. Output of System S (text only): Gone VPP with IN the DT wind NN . Margaret NN Mitchell NN

  3. XGTagger workflow, step 5. XGTagger combines the initial XML document (1) with the text-only output of System S (4) to build the final XML document (5).
     1. Initial XML document: <book> <title>Gone with the wind</title> <author>Margaret Mitchell</author> </book>
     4. Output of System S (text only): Gone VPP with IN the DT wind NN . Margaret NN Mitchell NN
     5. Final XML document:
        <book>
          <title> <w pos="VPP">Gone</w> <w pos="IN">with</w> <w pos="DT">the</w> <w pos="NN">wind</w> </title>
          <author> <w pos="PN">Margaret</w> <w pos="PN">Mitchell</w> </author>
        </book>
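The five steps on slides 2 and 3 can be sketched in Python as follows. This is a minimal illustration and not XGTagger's actual implementation: the function names `tag_document` and `fake_pos_tagger`, the small lexicon, and the assumption that leaf elements are separated by " . " in the text sent to System S are all hypothetical. The stub tags the proper nouns as PN, as in the final document of slide 3.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for System S (a POS tagger used as a black box).
LEXICON = {"Gone": "VPP", "with": "IN", "the": "DT", "wind": "NN",
           "Margaret": "PN", "Mitchell": "PN"}

def fake_pos_tagger(text):
    """Return 'word TAG word TAG ...', leaving the '.' separator untagged."""
    out = []
    for tok in text.split():
        out.append(tok if tok == "." else f"{tok} {LEXICON.get(tok, 'NN')}")
    return " ".join(out)

def tag_document(xml_string, system_s):
    root = ET.fromstring(xml_string)
    # Step 1 -> 2: collect the text of leaf elements, one segment per element.
    leaves = [el for el in root.iter()
              if len(el) == 0 and el.text and el.text.strip()]
    input_text = " . ".join(" ".join(el.text.split()) for el in leaves)
    # Step 3 -> 4: call the black box on plain text.
    output_text = system_s(input_text)
    # Step 5: put each (word, tag) pair back into the element it came from.
    # Assumes System S preserves the " . " separators and the segment order.
    for el, segment in zip(leaves, output_text.split(" . ")):
        tokens = segment.split()
        el.text = None
        for word, tag in zip(tokens[0::2], tokens[1::2]):
            w = ET.SubElement(el, "w", pos=tag)
            w.text = word
    return ET.tostring(root, encoding="unicode")

xml_in = ("<book><title>Gone with the wind</title>"
          "<author>Margaret Mitchell</author></book>")
print(tag_document(xml_in, fake_pos_tagger))
```

Because `system_s` is passed in as a plain text-to-text function, any other tool with the same input and output convention could be substituted without touching the XML handling.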

  4. Tag classification [Colazzo et al., 2001]
     • "Hard" tags break the linearity of the text. Examples: titles, chapters, paragraphs.
       <tag>text A</tag><tag>text B</tag>
     • "Soft" tags identify significant parts of the text, but remain "transparent" when reading it. Examples: bold, italics, underlined.
       text A <bold>text B</bold> text C
     • "Jump" tags mark particular elements such as margin notes, citations, glosses.
       text A<note>text B</note>text C

  5. Soft tags, reading contexts and XGTagger
     XML: <par>United States <bold>elections</bold> are administered at the state and local level</par>
     Reading context: United States elections are administered at the state and local level

  6. Jump tags, reading contexts and XGTagger
     XML: <paragraph>The 2004 United States<footnote>See an article p.163 about the United States of America.</footnote> elections caused less controversy than in 2000.</paragraph>
     Reading context 1: The 2004 United States elections caused less controversy than in 2000.
     Reading context 2: See an article p.163 about the United States of America.
     Schematic: <paragraph> … <footnote> … </footnote> … </paragraph>
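The way the three tag classes shape reading contexts can be sketched as below. This is an illustrative reconstruction from slides 4 to 6, not XGTagger's code: the function `reading_contexts` and its `soft` and `jump` parameters are hypothetical names, and every tag not listed is treated as hard.

```python
import xml.etree.ElementTree as ET

def reading_contexts(root, soft=(), jump=()):
    """Return the list of reading contexts (plain strings) found under root."""
    contexts = []

    def flush(buf):
        text = " ".join("".join(buf).split())  # normalize whitespace
        if text:
            contexts.append(text)

    def walk(node):
        buf, deferred = [], []

        def collect(n):
            buf.append(n.text or "")
            for child in n:
                if child.tag in soft:
                    collect(child)          # transparent: same context
                elif child.tag in jump:
                    deferred.append(child)  # read apart, after the current context
                else:
                    flush(buf)              # hard tag: the text flow is broken
                    buf.clear()
                    walk(child)
                buf.append(child.tail or "")

        collect(node)
        flush(buf)
        for j in deferred:
            walk(j)

    walk(root)
    return contexts

par = ET.fromstring(
    "<paragraph>The 2004 United States<footnote>See an article p.163 about "
    "the United States of America.</footnote> elections caused less "
    "controversy than in 2000.</paragraph>")
for context in reading_contexts(par, jump={"footnote"}):
    print(context)
```

With `footnote` declared as a jump tag, the sentence around the footnote is read as one context and the footnote text as a second one, as on slide 6.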

  7. Example: Phrases
     1. Initial XML document: <book> <title>Advances in Information Retrieval</title> </book>
     2. Input of System S (text only): Advances in Information Retrieval
     3. System S (parser) processes the text.
     4. Output of System S (text only): Advances NNS in IN Information///Retrieval NP
     5. Final XML document:
        <book>
          <title> <w id="1" pos="NNS">Advances</w> <w id="2" pos="IN">in</w> <w id="3" pos="NP">Information</w> <w id="3" pos="NP">Retrieval</w> </title>
        </book>
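A small sketch of how the parser output in step 4 could be mapped to the `<w>` elements of step 5, where the words of one phrase share an `id`. The helper names `phrase_output_to_words` and `words_to_title_xml` are hypothetical, and the sketch assumes the output strictly alternates units and tags, with `///` joining the words of a phrase.

```python
import xml.etree.ElementTree as ET

def phrase_output_to_words(output, sep="///"):
    """Turn 'unit TAG unit TAG ...' into one record per word.

    Words joined by `sep` belong to the same phrase and share one id.
    """
    tokens = output.split()
    words = []
    units_and_tags = zip(tokens[0::2], tokens[1::2])
    for ident, (unit, tag) in enumerate(units_and_tags, start=1):
        for word in unit.split(sep):
            words.append({"id": str(ident), "pos": tag, "text": word})
    return words

def words_to_title_xml(words):
    """Serialize the records as <w id=... pos=...> children of <title>."""
    title = ET.Element("title")
    for rec in words:
        w = ET.SubElement(title, "w", id=rec["id"], pos=rec["pos"])
        w.text = rec["text"]
    return ET.tostring(title, encoding="unicode")

words = phrase_output_to_words("Advances NNS in IN Information///Retrieval NP")
print(words_to_title_xml(words))
```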

  8. Example: Translation
     1. Initial XML document: <element>I had a conversation with my brother</element>
     2. Input of System S (text only): I had a conversation with my brother
     3. System S (translator) processes the text.
     4. Output of System S (text only): I had a conversation/entretien/Gespräch with my brother/frère/Bruder
     5. Final XML document:
        <element> <w>I</w> <w>had</w> <w>a</w> <w french="entretien" german="Gespräch">conversation</w> <w>with</w> <w>my</w> <w french="frère" german="Bruder">brother</w> </element>
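The translation example can be sketched the same way: each output token carries the word followed by optional slash-separated fields, which become attributes of the `<w>` element. The function name `translation_output_to_xml` and the `attr_names` parameter are hypothetical, and the sketch assumes the extra fields always appear in the order French, then German.

```python
import xml.etree.ElementTree as ET

def translation_output_to_xml(output, attr_names=("french", "german"), sep="/"):
    """Wrap each word in <w>, mapping extra fields to named attributes."""
    element = ET.Element("element")
    for token in output.split():
        word, *extras = token.split(sep)
        # Words without translations get no attributes at all.
        w = ET.SubElement(element, "w", dict(zip(attr_names, extras)))
        w.text = word
    return ET.tostring(element, encoding="unicode")

output = "I had a conversation/entretien/Gespräch with my brother/frère/Bruder"
print(translation_output_to_xml(output))
```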
