
LING 388: Language and Computers

LING 388: Language and Computers. Sandiway Fong. Lecture 1.


Presentation Transcript


  1. LING 388: Language and Computers Sandiway Fong Lecture 1

  2. Administrivia • Where and when • ECE 229 (Lecture) on Tuesdays 3:30-4:45PM • Shantz 338 (Computer Lab) on Thursdays 3:30-4:45PM • No Class • Thursday November 28th (Thanksgiving) • Office Hours • catch me after class, or • drop by my office (or make appt.) • Location: Douglass 311

  3. Administrivia • Email • sandiway@email.arizona.edu • Homepage • http://dingo.sbs.arizona.edu/~sandiway • Google me by my first name “sandiway” • Lecture slides: • available on homepage during and after each class • in both PowerPoint (.pptx) and Adobe PDF formats • .pptx slides may contain animation

  4. Administrivia

  5. Administrivia

  6. Administrivia

  7. Administrivia • Tips on how to take this class • No required textbook • Lecture slides contain everything you need to know in order to do the homeworks • To understand the slides, • you need to attend classes to “grok” the concepts • Unclear on something? • You are encouraged to ask questions in or after class • Ask while the question is still fresh in your mind • Review lecture video (NEW this year) • Practice • You can’t get good at computers just by reading a text • This is a hands-on class, try the exercises

  8. Administrivia • Course Objectives • Theoretical • Introduction to natural language processing techniques • Practical • Be able to write a natural language grammar that runs on a computer • Get an idea of what’s hard and what’s easy to do on a computer • Outcome: by the end of the course, you will have built a small machine translation engine

  9. Administrivia • This semester, we will explore two parallel tracks • one: roll our own… learn how to write grammar rules • two: learn to use software available for language analysis
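
As a rough illustration of what "writing grammar rules" that run on a computer can look like, here is a minimal sketch in Python. The rules, the word lists, and the function names `parse` and `accepts` are all hypothetical choices made for this example; the slides do not say which formalism or language the course itself uses.

```python
# Toy grammar: each nonterminal maps to a list of possible right-hand sides.
# These rules and words are illustrative assumptions, not course material.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Name"]],
    "VP": [["V", "NP"], ["V"]],
}

# Toy lexicon: each word category maps to the words that belong to it.
LEXICON = {
    "Det": {"the", "a"},
    "N": {"teacher", "bucket"},
    "Name": {"john", "mary"},
    "V": {"kicked", "flattered", "slept"},
}


def parse(symbol, words, start):
    """Yield every position where `symbol` can end if it begins at `start`."""
    if symbol in LEXICON:
        # A word category matches exactly one word from its word list.
        if start < len(words) and words[start] in LEXICON[symbol]:
            yield start + 1
        return
    for rhs in GRAMMAR[symbol]:
        # Thread the set of reachable positions through each child in turn.
        positions = [start]
        for child in rhs:
            positions = [end for p in positions for end in parse(child, words, p)]
        yield from positions


def accepts(sentence):
    """Return True if the grammar derives the whole sentence from S."""
    words = sentence.lower().split()
    return len(words) in parse("S", words, 0)


print(accepts("John kicked the bucket"))
print(accepts("kicked John the"))
```

The grammar and the lexicon are kept as plain data, so adding a rule or a word means editing a table rather than the parsing code.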

  10. Grammar-based Translator • Example… English grammar Japanese grammar “glue”

  11. Grammar-based Translator

  12. Grammar-based Translator • Idioms (language-specific): • John kicked the bucket • gomasuri (ごますり)

  13. Grammar-based Translator • gomasuri (ごますり) • taroo-ga sensei-ni goma-o sutta • taroo-nom teacher-dat sesame-acc grinded • “John flattered the teacher” • taroo-ga Hanako-ni goma-o sutta • taroo-nom Hanako-dat sesame-acc grinded • “John flattered Mary”
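
The gomasuri example shows why word-by-word lookup is not enough for translation. The sketch below is a minimal illustration in Python, assuming a hypothetical toy lexicon and idiom table (`LEXICON`, `IDIOMS`) and a fixed subject / dative / object / verb word order. It is not the translator built in the course. It renders Taroo as "John" and Hanako as "Mary" only because the slide's translations do.

```python
# Hypothetical word-level lexicon for the two example sentences.
LEXICON = {
    "taroo-ga": "John",
    "sensei-ni": "the teacher",
    "hanako-ni": "Mary",
    "goma-o": "sesame",
    "sutta": "grinded",
}

# Hypothetical idiom table: an object + verb sequence mapped to one English verb.
IDIOMS = {("goma-o", "sutta"): "flattered"}


def gloss_word_by_word(sentence):
    """Look up each word separately, ignoring idioms."""
    return [LEXICON.get(word.lower(), word) for word in sentence.split()]


def translate(sentence):
    """Translate a subject / dative / object / verb sentence, idioms first."""
    words = [word.lower() for word in sentence.split()]
    subject, dative, rest = words[0], words[1], tuple(words[2:])
    if rest in IDIOMS:
        # Idiom found: the dative argument becomes the English direct object.
        return f"{LEXICON[subject]} {IDIOMS[rest]} {LEXICON[dative]}"
    # No idiom: fall back to the literal word-by-word gloss.
    return " ".join(gloss_word_by_word(sentence))


print(gloss_word_by_word("taroo-ga sensei-ni goma-o sutta"))
print(translate("taroo-ga sensei-ni goma-o sutta"))
print(translate("taroo-ga Hanako-ni goma-o sutta"))
```

The literal gloss keeps "sesame" and "grinded" as separate words, while the idiom lookup treats the pair as a single unit meaning "flattered".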

  14. Grammar-based Translator • What about state of the art systems?

  15. Administrivia • Laboratory Exercises • Some lectures will be laboratory sessions • We will do exercises on the computer in class • use your own laptop (better) or the iMac workstation in Shantz 338 • Homework questions will be handed out in these sessions • Homework questions are designed to extend the exercises done in the lab • You may do the homework exercises on your own computer or at the computer laboratory

  16. Administrivia • Grading • 6~7 homeworks • Mandatory and Extra Credit Questions: • extra credit questions may be applied to the current homework • they may also bump you up a grade if you are borderline at the end of the semester • Homeworks are typically due one week after they are handed out • Homeworks must be submitted by email to me (by midnight) • Example: a homework given out on Tuesday will be due next Tuesday at midnight • Ethics • You may discuss the homeworks with your classmates • However, you must do the work and write them up independently • Sources must be acknowledged (students, webpage) • UA Code of Academic Integrity • http://deanofstudents.arizona.edu/codeofacademicintegrity • Cheaters will be sanctioned

  17. Administrivia • Late Policy • All homeworks are mandatory • deduction if handed in late • If you know you’re going to be late or have an upcoming emergency, let me know ahead of time • Homework tips • Homeworks are based on lab exercises • I don’t take attendance but practice is essential to understanding • Nightmare strategy: wait until the evening homework is due, scratch your head over the lecture notes, have tons of questions and start panicking • your computer crashes, the net goes down …

  18. Natural Language Processing (NLP) = Human Language Technology (HLT) = Computational Linguistics • Research Question: • What methods can we use to process natural languages on a computer? • Intersects with: • Computer science (CS) • Mathematics/Statistics • Artificial intelligence (AI) • Linguistic Theory • Psychology: Psycholinguistics • e.g. the human sentence processor

  19. Applications • Information retrieval • information is stored and accessed using language (keywords etc.) • document classification (email, news) • Machine translation • babelfish (now Bing) • http://babelfish.altavista.com/ • Google • http://translate.google.com • Language Comprehension • document summarization • Jeopardy (Quiz show) • Speech • automated 800 toll-free directory (800 555 1212) • cellphones (handsfree dialing) • car navigation (voice-synthesized directions)

  20. Applications • technology is still under development, in its infancy … • computers can’t really understand language (yet) • see google webpage translation • at least it’s free! • even if we are willing to pay... • machine translation has been worked on since after World War II (1950s), still not perfected today • why? • what are the properties of human languages that make it hard?

  21. Natural Language Properties • which ones are going to be difficult for computers to deal with? • Grammar (Rules for putting words together into sentences) • How many rules are there? • 100, 1000, 10000, more … • Portions learnt or innate • Do we have all the rules written down somewhere? • Lexicon (Dictionary) • How many words do we need to know? • 1000, 10000, 100000 …

  22. Computers vs. Humans • Knowledge of language • Computers are way faster than humans • They kill us at arithmetic and chess • But human beings are so good at language, we often take our ability for granted • Processed without conscious thought • Do pretty complex things

  23. Examples • Ambiguity • Where can I see the bus stop? • stop: verb or part of the noun-noun compound bus stop • Context (Discourse or situation) • Let’s explore what computer software can do
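
The ambiguity of "stop" can be made concrete with a small sketch. It assumes a hypothetical toy lexicon in which each word lists its possible parts of speech. The tag names and the function `tag_sequences` are illustrative choices, not taken from any particular tagger or from the course software.

```python
from itertools import product

# Hypothetical toy lexicon: each word maps to its possible parts of speech.
LEXICON = {
    "where": ["WH"],
    "can": ["MODAL"],
    "i": ["PRON"],
    "see": ["VERB"],
    "the": ["DET"],
    "bus": ["NOUN"],
    "stop": ["NOUN", "VERB"],  # the ambiguous word from the slide
}


def tag_sequences(sentence):
    """Return every way of assigning one part of speech to each word."""
    words = sentence.lower().rstrip("?").split()
    choices = [LEXICON[word] for word in words]
    return [list(zip(words, tags)) for tags in product(*choices)]


for reading in tag_sequences("Where can I see the bus stop?"):
    print(reading)
```

With this lexicon the sentence gets two candidate taggings, one per reading of "stop". Choosing between them takes grammar rules or context, which a word list alone does not supply.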

  24. Examples • Available online: http://nlp.stanford.edu:8080/parser/

  25. Examples

  26. Examples • Changes in interpretation • John is too stubborn to talk to • John is too stubborn to talk to Bill

  27. Examples

  28. Examples

  29. Examples

  30. Examples
