
LING 581: Advanced Computational Linguistics



  1. LING 581: Advanced Computational Linguistics Lecture Notes January 19th

  2. Administrivia • New room • Shantz 338 • (I have asked Jennifer Columbus to investigate a refund; however, I’m told it may not happen) • Marshall 480 • Shantz 338

  3. Penn Treebank • Availability • Source: • Linguistic Data Consortium (LDC) • U. of Arizona is a (fee-paying) member of this consortium • Resources are made available to the community through the main library • URL • http://sabio.library.arizona.edu/search/X

  4. Penn Treebank (V3) • Call Record

  5. Penn Treebank • Tagging Guide • Arpa94 paper • Parse Guide

  6. Penn Treebank

  7. Penn Treebank sections 00-24

  8. Penn Treebank

  9. tregex • Tregex is a Tgrep2-style utility for matching patterns in trees • Written in Java • run-tregex-gui.command shell script • -mx flag: the default memory size of 300m will need to be increased, depending on the platform

  10. tregex • Select the PTB directory • TREEBANK_3/parsed/mrg/wsj/ • Browse • Deselect any unwanted files
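The directory selection above can also be sketched outside the GUI. The following is a minimal illustration in Python, not part of the lecture: it assumes the directory TREEBANK_3/parsed/mrg/wsj/ contains one two-digit subdirectory per section (00-24), each holding .mrg files. The function names `section_names` and `wsj_files` are hypothetical.

```python
from pathlib import Path

def section_names(first=0, last=24):
    """Two-digit section names, e.g. '00' ... '24' (assumed directory naming)."""
    return [f"{n:02d}" for n in range(first, last + 1)]

def wsj_files(root, sections=None):
    """Collect the .mrg files found under root for the chosen sections.

    Sections whose directory is missing are skipped silently, which mirrors
    deselecting unwanted files in the GUI.
    """
    files = []
    for name in (sections if sections is not None else section_names()):
        sec_dir = Path(root) / name
        if sec_dir.is_dir():
            files.extend(sorted(sec_dir.glob("*.mrg")))
    return files

if __name__ == "__main__":
    # The path is the one shown on the slide; adjust it to the local install.
    for path in wsj_files("TREEBANK_3/parsed/mrg/wsj", sections=["00", "01"]):
        print(path)
```

Passing an explicit `sections` list restricts the search to part of the corpus, for example a single section while testing a pattern.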

  11. tregex • Search

  12. tregex • Help

  13. tregex • Help

  14. tregex • Help

  15. tregex • Help

  16. tregex • Help

  17. tregex • Pattern: • (@NP <, (@NP $+ (/,/ $+ (@NP $+ /,/=comma))) <- =comma)
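To make the pattern on this slide concrete: it asks for an NP whose first child is an NP, immediately followed by a comma, then an NP, then a second comma that is also the last child. The Python sketch below hand-codes that one check over a bracketed tree. It is not how Tregex works internally. The helper names and the example sentence are invented for illustration, and `basic` only approximates what `@` does by cutting the label at the first `-` or `=`.

```python
import re

def parse_tree(text):
    """Parse one bracketed tree into (label, [children]); leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                       # consume "("
            label = tokens[pos]
            pos += 1
            kids = []
            while tokens[pos] != ")":
                kids.append(read())
            pos += 1                       # consume ")"
            return (label, kids)
        word = tokens[pos]
        pos += 1
        return word

    return read()

def label(node):
    return node if isinstance(node, str) else node[0]

def children(node):
    return [] if isinstance(node, str) else node[1]

def basic(lab):
    """Approximate basic category: NP-SBJ -> NP; labels like -NONE- are kept."""
    return lab if lab.startswith("-") else re.split(r"[-=]", lab)[0]

def subtrees(node):
    yield node
    for kid in children(node):
        yield from subtrees(kid)

def matches(node):
    """Hand-coded reading of (@NP <, (@NP $+ (/,/ $+ (@NP $+ /,/=comma))) <- =comma)."""
    kids = children(node)
    return (basic(label(node)) == "NP"
            and len(kids) == 4             # the second comma must be the last child
            and basic(label(kids[0])) == "NP"
            and "," in label(kids[1])      # /,/ is an unanchored regex
            and basic(label(kids[2])) == "NP"
            and "," in label(kids[3]))

# Invented example sentence, not taken from the treebank.
tree = parse_tree("(S (NP-SBJ (NP (NNP Kim)) (, ,) (NP (DT the) (NN editor)) (, ,))"
                  " (VP (VBD left)))")
hits = [n for n in subtrees(tree) if matches(n)]
print([label(n) for n in hits])
```

Because `$+` means immediate sisterhood and `<- =comma` ties the named comma to the last child, the matched NP has exactly four children in this reading, which is why the sketch checks `len(kids) == 4`.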

  18. tregex • Help

  19. tregex

  20. tregex • Different results from: • @SBAR < /^WH.*-([0-9]+)$/#1%index << (@NP < (/^-NONE-/ < /^\*T\*-([0-9]+)$/#1%index))
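The `#1%index` notation in this pattern binds the number captured on the WH label and requires the trace to carry the same number. The following Python sketch hand-codes that check for illustration only: the helper names and the two example trees are invented, `basic` approximates `@` by cutting the label at the first `-` or `=`, and Tregex itself evaluates patterns differently.

```python
import re

WH = re.compile(r"^WH.*-([0-9]+)$")        # e.g. WHNP-1
TRACE = re.compile(r"^\*T\*-([0-9]+)$")    # e.g. *T*-1

def parse_tree(text):
    """Parse one bracketed tree into (label, [children]); leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            lab = tokens[pos]
            pos += 1
            kids = []
            while tokens[pos] != ")":
                kids.append(read())
            pos += 1
            return (lab, kids)
        word = tokens[pos]
        pos += 1
        return word

    return read()

def label(node):
    return node if isinstance(node, str) else node[0]

def children(node):
    return [] if isinstance(node, str) else node[1]

def basic(lab):
    """Approximate basic category: NP-SBJ -> NP; labels like -NONE- are kept."""
    return lab if lab.startswith("-") else re.split(r"[-=]", lab)[0]

def descendants(node):
    for kid in children(node):
        yield kid
        yield from descendants(kid)

def matches(node):
    """SBAR with a WH child indexed n that dominates an NP over (-NONE- *T*-n)."""
    if basic(label(node)) != "SBAR":
        return False
    wh_indices = set()
    for kid in children(node):             # "<" : immediate children only
        m = WH.match(label(kid))
        if m:
            wh_indices.add(m.group(1))
    for d in descendants(node):            # "<<" : any depth
        if basic(label(d)) != "NP":
            continue
        for none in children(d):
            if not label(none).startswith("-NONE-"):
                continue
            for leaf in children(none):
                m = TRACE.match(label(leaf))
                if m and m.group(1) in wh_indices:
                    return True
    return False

# Invented examples: an NP trace bound by WHNP-1, and an adverbial trace bound by WHADVP-2.
np_rel = parse_tree("(SBAR (WHNP-1 (WP who)) (S (NP-SBJ (-NONE- *T*-1)) (VP (VBD left))))")
adv_rel = parse_tree("(SBAR (WHADVP-2 (WRB when)) (S (NP-SBJ (PRP she))"
                     " (VP (VBD left) (ADVP-TMP (-NONE- *T*-2)))))")
print(matches(np_rel), matches(adv_rel))
```

In the second tree the trace sits under an ADVP rather than an NP, so the `@NP` part of the pattern rules it out even though the indices agree.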

  21. tregex • Example: WHADVP also possible (not just WHNP)

  22. Ungraded Homework Exercise • Search for NP trace relative clauses as defined below • Be ready to compare your search pattern and the number of matches found next time in class
