This demonstration shows how to search for country names with n-gram patterns, how to add part-of-speech (POS) constraints to refine the results, and how to view matched sentences as token-only or multi-level output. It then walks through a semi-supervised learning procedure for the ORG-headquarters relation: collect seed pairs from Wikipedia infoboxes, search for patterns that connect them, use the high-precision patterns to find new seeds, and repeat the search with the new seeds to find further patterns.
Example 1. Search country names by ngram patterns • http://linserv1.cims.nyu.edu:23232/ngram/ • Query: countries such as * and *
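The query above can be read as a token-level wildcard match. The following is a minimal sketch of that idea, not the tool's actual implementation: it assumes each `*` matches exactly one whitespace-separated token, and the function names `match_pattern` and `wildcard_fillers` and the sample sentence are hypothetical.

```python
def match_pattern(pattern, tokens):
    """Return every token window that matches the pattern.

    Assumption: '*' matches exactly one token; any other pattern
    element must equal the token literally.
    """
    pat = pattern.split()
    n = len(pat)
    hits = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(p == "*" or p == t for p, t in zip(pat, window)):
            hits.append(window)
    return hits


def wildcard_fillers(pattern, window):
    """Return only the tokens that filled the '*' slots of a matched window."""
    return [t for p, t in zip(pattern.split(), window) if p == "*"]


# Hand-made sample sentence, split on whitespace (illustrative only).
query = "countries such as * and *"
tokens = "He visited countries such as France and Spain last year".split()

hits = match_pattern(query, tokens)
fillers = [wildcard_fillers(query, w) for w in hits]
print(fillers)
```

The fillers of the two `*` slots are the candidate country names; the real tool searches a large indexed n-gram collection rather than scanning a single sentence.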
Example 2. Search with POS • Provide more POS constraints if you are not happy with the output • Query: *NNP* was established in *CD*
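A POS-constrained query such as the one above can be sketched as matching over (word, tag) pairs. This is an illustration under stated assumptions, not the tool's code: `*TAG*` is taken to match any single token with that Penn Treebank tag, the sample sentence is tagged by hand, and the names `element_matches` and `match_pos_pattern` are hypothetical.

```python
import re

# '*NNP*' style slot: a POS tag wrapped in asterisks (assumed syntax).
POS_SLOT = re.compile(r"^\*([A-Z$]+)\*$")


def element_matches(element, word, tag):
    """Check one pattern element against one (word, tag) pair."""
    if element == "*":          # plain wildcard: any token
        return True
    m = POS_SLOT.match(element)
    if m:                       # POS slot: the tag must agree
        return tag == m.group(1)
    return element == word      # literal: the word must agree


def match_pos_pattern(pattern, tagged):
    """Return the words of every window of tagged tokens matching the pattern."""
    pat = pattern.split()
    n = len(pat)
    hits = []
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        if all(element_matches(e, w, t) for e, (w, t) in zip(pat, window)):
            hits.append([w for w, _ in window])
    return hits


# Hand-tagged sample sentences (illustrative only).
tagged_ok = [("Yale", "NNP"), ("was", "VBD"), ("established", "VBN"),
             ("in", "IN"), ("1701", "CD"), (".", ".")]
tagged_no = [("It", "PRP"), ("was", "VBD"), ("established", "VBN"),
             ("in", "IN"), ("1701", "CD"), (".", ".")]

query = "*NNP* was established in *CD*"
print(match_pos_pattern(query, tagged_ok))
print(match_pos_pattern(query, tagged_no))
```

The second sentence is rejected because its first token is a pronoun (PRP) rather than a proper noun (NNP), which is the kind of noise an extra POS constraint removes.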
Example 3. Search sentences with multi-level output • 1. Token only • 2. Multi-level output
Examples related to semi-supervised learning • Relation: ORG-headquarters • General Procedure • 1. Collect seeds from Wikipedia infobox • 2. Search for patterns • 3. Use high-precision patterns to find more seeds • 4. Use new seeds to search for more patterns • 5. Repeat steps 3 and 4.
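The general procedure can be sketched as a small bootstrapping loop. This is a toy illustration under several assumptions, not the method used in the demo: the corpus is four hand-made tokenized sentences, multi-word names are pre-joined into one token (`Santa_Clara`), a pattern is simply the token sequence between the organization and the location, and the "high-precision" filter is replaced by a simple support threshold (`min_support`). All function names are hypothetical.

```python
def find_patterns(corpus, seeds):
    """Step 2/4: collect the tokens between org and location for known pairs.

    Returns {pattern: set of seed pairs that support it}.
    """
    support = {}
    for sent in corpus:
        for org, loc in seeds:
            if org in sent and loc in sent:
                i, j = sent.index(org), sent.index(loc)
                middle = tuple(sent[i + 1:j])
                if i < j and middle:  # skip adjacent or reversed pairs
                    support.setdefault(middle, set()).add((org, loc))
    return support


def apply_pattern(corpus, middle):
    """Step 3: return (left token, right token) pairs around each pattern match."""
    found = set()
    k = len(middle)
    for sent in corpus:
        for i in range(1, len(sent) - k):
            if tuple(sent[i:i + k]) == middle:
                found.add((sent[i - 1], sent[i + k]))
    return found


def bootstrap(corpus, seeds, rounds=2, min_support=1):
    """Step 5: alternate pattern search and seed expansion for a few rounds."""
    seeds = set(seeds)
    patterns = set()
    for _ in range(rounds):
        support = find_patterns(corpus, seeds)
        # Stand-in for a precision estimate: keep patterns backed by
        # at least `min_support` distinct seed pairs.
        good = {p for p, pairs in support.items() if len(pairs) >= min_support}
        patterns |= good
        for p in good:
            seeds |= apply_pattern(corpus, p)
    return seeds, patterns


# Toy corpus (illustrative only).
corpus = [
    "IBM is headquartered in Armonk".split(),
    "Microsoft is headquartered in Redmond".split(),
    "Microsoft , based in Redmond , hired staff".split(),
    "Intel , based in Santa_Clara , makes chips".split(),
]

seeds, patterns = bootstrap(corpus, {("IBM", "Armonk")})
print(sorted(seeds))
print(sorted(patterns))
```

Starting from the single seed <IBM, Armonk>, the first round learns one pattern and finds <Microsoft, Redmond>; the second round learns a further pattern from the new seed and reaches <Intel, Santa_Clara>. With `min_support=1` nothing is filtered, so a real system would need a stricter precision estimate to avoid drifting to wrong pairs.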
Examples related to semi-supervised learning • 1. Collect seeds from Wikipedia infobox • Example seed: <IBM, Armonk>
Examples related to semi-supervised learning • 2. Search for patterns • Query: IBM * * * Armonk • Output:
Examples related to semi-supervised learning • 3. Use high-precision patterns to find more seeds
Examples related to semi-supervised learning • 3. Use high-precision patterns to find more seeds • <Microsoft, Redmond> • <Intel, Santa Clara> • … • 4. Use new seeds to find more patterns • Microsoft * * * * Redmond • Intel * * * Santa Clara • …
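The seed queries above differ only in how many wildcards sit between the two names, so they can be generated mechanically. The helper below is a hypothetical sketch; the name `build_queries` and the gap range are assumptions, and it only builds query strings without contacting the search tool.

```python
def build_queries(org, loc, min_gap=1, max_gap=4):
    """Build one wildcard query per gap length between org and location."""
    return [
        " ".join([org] + ["*"] * k + [loc])
        for k in range(min_gap, max_gap + 1)
    ]


# Seed pairs taken from the slide.
new_seeds = [("Microsoft", "Redmond"), ("Intel", "Santa Clara")]

queries = [q for org, loc in new_seeds for q in build_queries(org, loc, 3, 4)]
for q in queries:
    print(q)
```

Trying several gap lengths per seed pair helps because the tool's wildcard appears to stand for a single token, so a fixed number of `*` would miss contexts of other lengths.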
References • 1. Ngram tool: http://nlp.cs.nyu.edu/sekine/papers/coling08.pdf • 2. Semi-supervised relation extraction: http://cs.nyu.edu/courses/spring09/G22.2591-001/lecture9.html