Uploaded by sara-landry

COMP313A Programming Languages

DESCRIPTION

This lecture focuses on lexical analysis, the phase of language processing that recognises tokens in the input. We explore lookahead, which lets a lexer inspect upcoming characters before deciding where a token ends; the classic example is tokenizing FORTRAN, where blanks are not significant. We cover finite automata and how they decide whether an input string belongs to a language described by a regular expression. Significant emphasis is placed on the transition from regular expressions to finite automata, specifically through Thompson’s construction and the subset construction. We also introduce LEX (FLEX), a tool for generating lexical analysers.






Presentation Transcript


  1. COMP313A Programming Languages: Lexical Analysis (2)

  2. Lookahead • Examples: <=, <>, < • When we read a delimiter character to establish the end of a token, we need to make sure that character is still available afterwards • It is the start of the next token! • This is lookahead • Decide what to do based on a character we ‘haven’t read’ yet • Sometimes implemented by reading from a buffer and then pushing the character back into the buffer • And then starting to recognise the next token
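
A minimal sketch of the lookahead idea for the `<`, `<=`, `<>` example, in Python. The function name `scan_relop` and the token names `LT`, `LE`, `NE` are illustrative assumptions, not from the lecture. Instead of reading a character and pushing it back, this version peeks at the next character and only advances past it when it extends the operator, which has the same effect.

```python
def scan_relop(text, pos):
    """Scan a relational operator that starts with '<' at text[pos].

    Returns (token_name, next_pos). The character after '<' is only
    peeked at; it is consumed only if it is part of the operator.
    """
    if pos >= len(text) or text[pos] != "<":
        raise ValueError("expected '<' at position %d" % pos)

    # One character of lookahead; "" stands for end of input.
    lookahead = text[pos + 1] if pos + 1 < len(text) else ""

    if lookahead == "=":
        return ("LE", pos + 2)   # '<=' consumes both characters
    if lookahead == ">":
        return ("NE", pos + 2)   # '<>' consumes both characters
    # The lookahead character starts the next token, so leave it unread.
    return ("LT", pos + 1)
```

For input `<x`, the scanner returns `LT` with the position still pointing at `x`, so the next call can start recognising the following token from there.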

  3. Classic Fortran example • Blanks are not significant, so DO 99 I=1,10 becomes DO99I=1,10 (a DO loop) versus DO99I=1.10 (an assignment to the variable DO99I) • When can the lexical analyzer assign a token? • Push back into input buffer • or ‘backtracking’
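
The Fortran example can be sketched as follows. This is a deliberately simplified illustration, not a real Fortran lexer: the function `classify_do` and its result tuples are hypothetical, it looks at the whole statement rather than using a push-back buffer, and it ignores cases such as commas nested inside parentheses. It shows that `DO` cannot be classified as a keyword until the scanner has seen whether a comma follows the `=`.

```python
import re


def classify_do(statement):
    """Classify a fixed-form Fortran statement beginning with DO.

    Simplified: only distinguishes 'DO <label> <var> = e1, e2' from a
    plain assignment, and assumes upper-case input.
    """
    text = statement.replace(" ", "")  # blanks are not significant

    # A DO loop needs a comma after the '=' part: DO 99 I = 1 , 10
    loop = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)", text)
    if loop:
        return ("DO_LOOP", loop.group(1), loop.group(2))

    # Otherwise everything before '=' is one identifier: DO99I = 1.10
    assign = re.fullmatch(r"([A-Z][A-Z0-9]*)=(.+)", text)
    if assign:
        return ("ASSIGN", assign.group(1), assign.group(2))

    return ("UNKNOWN",)
```

With this sketch, `classify_do("DO 99 I=1,10")` is classified as a loop over `I` with label `99`, while `classify_do("DO 99 I=1.10")` is classified as an assignment to `DO99I`.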

  4. Finite Automata • A recogniser determines if an input string is a sentence in a language • The language is described by a regular expression • Turn the regular expression into a finite automaton • The automaton can be deterministic or non-deterministic

  5. Transition diagram for identifiers • RE: identifier -> letter (letter | digit)* • [Transition diagram: start state 0, state 1, accepting state 2; edge labels: letter, digit, other]
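
A hedged Python sketch of the identifier recogniser. The diagram itself did not survive in the transcript, so the transitions here are an assumption based on the regular expression and the surviving edge labels: state 0 moves to state 1 on a letter, state 1 loops on a letter or digit, and any other character leads to the accepting state 2 without being consumed. The function name `scan_identifier` is illustrative.

```python
def scan_identifier(text, pos=0):
    """Recognise letter (letter | digit)* starting at text[pos].

    Returns (lexeme, next_pos) on success, or None if no identifier
    starts at pos. Note str.isalpha/isalnum also accept non-ASCII
    letters and digits, which a real lexer might not want.
    """
    state = 0
    i = pos
    while True:
        ch = text[i] if i < len(text) else ""  # "" stands for end of input
        if state == 0:
            if ch.isalpha():
                state, i = 1, i + 1            # first letter
            else:
                return None                    # not an identifier
        else:  # state == 1
            if ch.isalnum():
                i += 1                         # stay in state 1
            else:
                # 'other': reach accepting state 2; ch is lookahead
                # and is left unread for the next token.
                return (text[pos:i], i)
```

For example, scanning `"x1 = 2"` from position 0 yields the lexeme `x1` and leaves the position at the blank that ended it.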

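  6. Non-deterministic finite state automata • [Diagram: states 0 to 3, start state 0, accepting state 3, edges labelled a and b] • Equivalent deterministic finite state automata • [Diagram: states 0 to 3, start state 0, accepting state 3, edges labelled a and b]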

  7. Transition Table (NFA) Input Symbol

  8. Transition Table (DFA) Input Symbol
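
The two tables themselves are not preserved in the transcript. As an illustration of how they differ, the sketch below encodes a four-state NFA and an equivalent four-state DFA for (a | b)* abb as Python dictionaries. It is an assumption that these are the automata from the earlier slide; the tables are the usual textbook ones for this expression. Each NFA entry is a set of states, while each DFA entry is a single state.

```python
# NFA transition table: state -> input symbol -> SET of next states.
NFA = {
    0: {"a": {0, 1}, "b": {0}},
    1: {"b": {2}},
    2: {"b": {3}},
    3: {},
}

# DFA transition table: state -> input symbol -> exactly ONE next state.
DFA = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 3},
    3: {"a": 1, "b": 0},
}

START, ACCEPT = 0, 3


def dfa_accepts(text):
    """Follow one transition per input symbol."""
    state = START
    for ch in text:
        if ch not in DFA[state]:
            return False
        state = DFA[state][ch]
    return state == ACCEPT


def nfa_accepts(text):
    """Track every state the NFA could be in after each symbol."""
    current = {START}
    for ch in text:
        current = {nxt for s in current for nxt in NFA[s].get(ch, ())}
    return ACCEPT in current
```

Both recognisers accept `abb` and `babb` and reject `ab`; the NFA version has to carry a set of possible states, which is the idea the subset construction builds on.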

  9. From a Regular Expression to an NFA: Thompson’s Construction • Example: (a | b)* abb • [Diagram: NFA with states 0 to 10, start state 0, accepting state 10, edges labelled a, b and ε]
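
A minimal sketch of Thompson’s construction in Python, assuming the regular expression has already been parsed, so the NFA is assembled by calling one method per operator. The class `Thompson`, the helper names, and the state numbering are illustrative; the numbering will not match the slide’s figure, although the number of states does for this example.

```python
import itertools

EPS = None  # label used for an ε-transition


class Thompson:
    """Builds NFA fragments; each fragment is a (start, accept) pair."""

    def __init__(self):
        self.edges = []                 # list of (source, label, target)
        self._ids = itertools.count()

    def symbol(self, ch):
        s, f = next(self._ids), next(self._ids)
        self.edges.append((s, ch, f))
        return (s, f)

    def union(self, a, b):              # a | b
        s, f = next(self._ids), next(self._ids)
        self.edges += [(s, EPS, a[0]), (s, EPS, b[0]),
                       (a[1], EPS, f), (b[1], EPS, f)]
        return (s, f)

    def star(self, a):                  # a*
        s, f = next(self._ids), next(self._ids)
        self.edges += [(s, EPS, a[0]), (a[1], EPS, f),
                       (a[1], EPS, a[0]), (s, EPS, f)]
        return (s, f)

    def concat(self, a, b):             # ab: merge b's start into a's accept
        def rename(q):
            return a[1] if q == b[0] else q
        self.edges = [(rename(s), lab, rename(d)) for s, lab, d in self.edges]
        return (a[0], b[1])

    def states(self):
        return {q for s, _, d in self.edges for q in (s, d)}


def closure(edges, states):
    """All states reachable from `states` using only ε-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for s, lab, d in edges:
            if s == q and lab is EPS and d not in seen:
                seen.add(d)
                stack.append(d)
    return seen


def accepts(edges, start, accept, text):
    current = closure(edges, {start})
    for ch in text:
        moved = {d for s, lab, d in edges if s in current and lab == ch}
        current = closure(edges, moved)
    return accept in current


# (a | b)* abb
t = Thompson()
star = t.star(t.union(t.symbol("a"), t.symbol("b")))
nfa = t.concat(t.concat(t.concat(star, t.symbol("a")), t.symbol("b")),
               t.symbol("b"))
```

Each operator adds at most two new states and a handful of ε-transitions, so the NFA grows linearly with the size of the expression. A fragment tuple should not be reused after it has been passed to `concat`, because its start state may have been merged away.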

  10. Converting an NFA to a DFA • Subset Construction • NFA – each entry in the transition table is a set of states • In the resulting DFA each state will correspond to a set of NFA states • A DFA state keeps track of all the states the NFA can be in after reading an input symbol

  11. Subset Construction • Work out all the states reachable directly from the start state on epsilon transitions (e-closure). Combine these into the start state for the DFA…. • We’ll do the rest on the board in the lecture
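
The remaining steps were done on the board, so the following Python sketch is only one common way to write the subset construction, not necessarily the version shown in the lecture. The NFA is the usual textbook drawing for (a | b)* abb with states 0 to 10; it is an assumption that its numbering matches the slide’s figure. The names `eps_closure`, `move`, and `subset_construction` are illustrative.

```python
EPS = ""  # label used for ε-transitions in this table

# NFA for (a | b)* abb: state -> label -> set of next states.
NFA = {
    0: {EPS: {1, 7}},
    1: {EPS: {2, 4}},
    2: {"a": {3}},
    3: {EPS: {6}},
    4: {"b": {5}},
    5: {EPS: {6}},
    6: {EPS: {1, 7}},
    7: {"a": {8}},
    8: {"b": {9}},
    9: {"b": {10}},
    10: {},
}
NFA_START, NFA_ACCEPT = 0, 10


def eps_closure(states):
    """States reachable from `states` on ε-transitions alone."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in NFA[q].get(EPS, ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def move(states, symbol):
    """States reachable from `states` on one `symbol` transition."""
    return {r for q in states for r in NFA[q].get(symbol, ())}


def subset_construction(alphabet=("a", "b")):
    start = eps_closure({NFA_START})       # DFA start state
    table, worklist = {}, [start]
    while worklist:
        current = worklist.pop()
        if current in table:               # already processed
            continue
        table[current] = {}
        for symbol in alphabet:
            target = eps_closure(move(current, symbol))
            if target:                     # skip the empty (dead) set
                table[current][symbol] = target
                worklist.append(target)
    accepting = {s for s in table if NFA_ACCEPT in s}
    return start, table, accepting


def dfa_accepts(text):
    start, table, accepting = subset_construction()
    state = start
    for ch in text:
        if ch not in table[state]:
            return False
        state = table[state][ch]
    return state in accepting
```

Each DFA state is a frozen set of NFA states, so it can be used directly as a dictionary key. For this NFA the DFA start state is the set {0, 1, 2, 4, 7}, and the construction produces five DFA states, one of which is accepting.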

  12. LEX (FLEX) • Tool for generating programs which recognise lexical patterns in text • Takes regular expressions and turns them into a program • You will learn the basics in a lab on Thursday
