1 / 21

Modules in a Compiler

Modules in a Compiler. Read Pages 39-52. Module - a group of code that performs a specific task Compiler can be thought of in two distinct parts Front end – the modules associated with a specific source language Back end – the modules associated with an instruction set.

lonato
Télécharger la présentation

Modules in a Compiler

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Modules in a Compiler Read Pages 39-52

  2. Module - a group of code that performs a specific task • Compiler can be thought of in two distinct parts • Front end – the modules associated with a specific source language • Back end – the modules associated with an instruction set

  3. Modules in a compiler Source Scanner tokens Parser Program (lexical analysis) (syntax analysis) syntax tree Parse Table Code Target (symbol table)Generator Program

  4. Modules in a compiler • Scanner – considers source code as a series of ascii characters • Lexeme – a string of characters that mean something when put together • Token – a symbol that represents a lexeme • Built on the concepts of a RL • Symbol Table – an add on to get outside of CFL

  5. Modules in a compiler • Parser – determines if a group of tokens forms a semantically correct “sentence” / “program” • Built on the concepts of CFL • Code Generator – considers the semantics from the parser and generates semantically equivalent target code • What is a CFL?

  6. What it looks like. • Consider the simple program int main() { int x, j; float a ; x= 7; a=5.2; j= 3; if( j < x) a= 4.6; } http://jhelum.cs.iastate.edu/cgi-bin/comp.cgi Homework: Lab 1

  7. Context Free Grammars • CFG • a set of terminals (tokens) corresponding to characters in Σ • a set of non-terminals • a special non-terminal called the start state • a list of productions Э the LHS of each production is a single non-terminal, then an arrow and a sequence of tokens and/or terminals (RHS)

  8. Derivations from CFGs • A grammar derives strings by beginning with the start symbol and repeatedly replacing non-terminals by the body of the production for the given non-terminal • The strings that can be derived from the start symbol form the language defined by the grammar, the CFL

  9. CFG Examples • Convention is to have Capitals for non-terminals, lowercase terminals S Aa|Bb A Aa|Λ B bB|b • Which of the following are accepted by the grammar • aaa? • bbb? • abba? • CFG example • Find CFG for anbn

  10. Homework- Create a high-level programming language • Read pages 25 -36 and address each aspect in creating your language. • Your language will have the following • basic constructs – case sensitive? Blocking?... • Operators – add, subtract, multiply, divide, modulus, unary, comparison, logical, assignment… • Control structures – if-else, case, for loop, while loop, repeat until loop, • Functions – declaration, call, return values • Variables – integer, real, character, string, pointer, array, void, Boolean • Input / Output • Comments

  11. Converting between RE and CFG • Any RE can be represented by a CFG R.E. CFG • Λ S  Λ • a S  a • (r1 ) S  s1 (s1 is a start state for the grammar r1 ) • r1r2 S  s1 s2 • r1+r2 S  s1 | s2 • r1* S  S s1 | Λ

  12. Example • (aa+b)*a convert to CFG • Give a CFG for a palindrome • Homework page 51 #2.2.1, 2.2.2 a) b) c)

  13. Backus-Naur Form • BNF – a notation for a context free grammar • terminal are tokens • nonterminals are surrounded by <> • productions are ::= - “can be” • pipe(|) used for OR • ex. • <program> ::= <declaration section><body>; • <real> ::= <integer>.<fractional><integer> ::= <digit>|<integer><digit>|ε<fraction> ::= <digit>|<fractional><digit><digit> ::= 0|1|2|3|4|5|6|7|8|b • <expr> ::= <expr><op><expr>|id<op> ::= +|-|*|/

  14. Backus-Naur Form examples • Convert the REs to CFL • b(a+b)*a • (aa+bb+ba+ab)* • Λ+a+(b+aa)(a+b)*

  15. Parse Trees • productions are rules for building strings of tokens • parse tree shows how the strings can be built • a parse tree with respect to a grammar is a tree satisfying the following • the root is the starting node • each leaf is either a terminal or ε • each non-leaf node is labeled with a non-terminal • label of a non-leaf node is the left side of a production and the labels of the children of the node represent the right hand side of the production • parse tree plus the symbol table capture the semantics of the language

  16. Parse Tree Example: x+y*z(using previous production example) <expr> <expr> <op> <expr> <expr> <op> <expr> * id id + id

  17. Example Continued: x+y*z <expr> <expr> <op> <expr> id + <expr> <op> <expr> id * id Homework: Lab 2

  18. Ambiguity • Syntactically Ambiguous – if the same string has more than one parse tree • To fix ambiguity create productions to rule out all but one of the trees

  19. Ambiguity Removed <expr> ::= <term><op1><expr>|<term><term> ::= <factor><op2><term>|<factor><factor> ::= id<op1> ::= +|-<op2> ::= *|/ • Create the tree for x+y*z • Lexical analysis (scanner) – will take x+y*z & give id+id*id then syntax analysis (parser) uses the above to determine if it is a “legal” structure of our language

  20. scanner – built on regular language concepts • takes in lexemes • return tokens • Example list of tokens { - beginToken } - endToken ; - semiToken forToken, idToken, lessthanToken

  21. Homework • create transition graph for each of the literals • stringLiteral, intLiteral, realLiteral, charLiteral, hexLiteral – 0x….. (0-9,A-F), boolLiteral • create a complete set of tokens for the language • create a complete set of reserved words for the language

More Related