CPS 506 Comparative Programming Languages

abedi

Presentation Transcript


  1. CPS 506 Comparative Programming Languages: Syntax Specification

  2. Compiling Process Steps
     • Program → Lexical Analysis
       • Convert characters into a stream of tokens
     • Lexical Analysis → Syntactic Analysis
       • Send tokens to develop an abstract representation or parse tree

  3. Compiling Process Steps (cont’d)
     • Syntactic Analysis → Semantic Analysis
       • Analyze the parse tree for semantic consistency and transform it to run efficiently on the target architecture (optimization)
     • Semantic Analysis → Machine Code
       • Convert the abstract representation to executable machine code using code generation

  4. Formal Methods and Language Processing
     • Meta-language: a language used to define other languages
     • BNF (Backus-Naur Form)
       • A set of rewriting rules ρ
       • A set of terminal symbols Σ
       • A set of non-terminal symbols N
       • A start symbol S ∈ N
     • Each rule in ρ has the form A → ω, where A ∈ N and ω ∈ (N ∪ Σ)*
       • Left-hand side: a non-terminal symbol
       • Right-hand side: a sequence of terminal and non-terminal symbols

  5. BNF (cont’d)
     • The words in N: grammatical categories
       • Identifier, Expression, Loop, Program, …
     • S: the principal grammatical category
     • Symbols in Σ: the basic alphabet
     • Example 1:
         binaryDigit → 0
         binaryDigit → 1
       or, equivalently,
         binaryDigit → 0 | 1
     • Example 2:
         Integer → Digit | Integer Digit
         Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
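
The Integer rule from Example 2 can be read directly as a recursive membership test. The following is a minimal sketch, not part of the original slides; the class name `IntegerGrammar` and its method names are hypothetical. Each method mirrors one rule, and the left-recursive alternative `Integer Digit` is checked by peeling off the last character.

```java
final class IntegerGrammar {
    // Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    static boolean isDigit(String s) {
        return s.length() == 1 && s.charAt(0) >= '0' && s.charAt(0) <= '9';
    }

    // Integer -> Digit | Integer Digit
    static boolean isInteger(String s) {
        if (s.isEmpty()) {
            return false;                         // no rule derives the empty string
        }
        if (isDigit(s)) {
            return true;                          // Integer -> Digit
        }
        int last = s.length() - 1;
        return isInteger(s.substring(0, last))    // Integer -> Integer Digit
            && isDigit(s.substring(last));
    }

    public static void main(String[] args) {
        System.out.println(isInteger("281"));
        System.out.println(isInteger("28a"));
    }
}
```

The string 281 is accepted because 28 is an Integer and 1 is a Digit; 28a is rejected because a is not a Digit.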

  6. BNF (cont’d)
     • Parse tree for 281:
         Integer
           Integer
             Integer
               Digit
                 2
             Digit
               8
           Digit
             1
     • Derivation:
         Integer ⇒ Integer Digit ⇒ Integer Digit Digit ⇒ Digit Digit Digit ⇒ 2 Digit Digit ⇒ 28 Digit ⇒ 281

  7. BNF (cont’d)
     • Lexeme: the lowest-level syntactic unit
     • Tokens: the set of grammatical categories that define strings of non-blank characters (lexical syntax)
       • Identifier (variable names, function names, …)
       • Literal (integer and decimal numbers, …)
       • Operator (+, -, *, /, …)
       • Separator (;, ., (, ), {, }, …)
       • Keyword (int, if, for, where, …)

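  8. BNF (cont’d)
     • Example program:
         // comments …
         void main ( ) {
           float p;
           p = 3.14 ;
         }
     • Token categories in the example:
       • Comment: // comments …
       • Keyword: void, float
       • Identifier: main, p
       • Separator: ( ) { } ;
       • Operator: =
       • Literal: 3.14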

  9. BNF (cont’d)

  10. Regular Expressions
     • An alternative to BNF for defining the lexical rules of a language
       • x : a character
       • "abc" : a literal string
       • A | B : A or B
       • A B : concatenation of A and B
       • A* : zero or more occurrences of A
       • A+ : one or more occurrences of A
       • A? : zero or one occurrence of A
       • [a-zA-Z] : any alphabetic character
       • [0-9] : any digit
       • . : any single character
     • Example
         Integer : [0-9]+
         Identifier : [a-zA-Z][a-zA-Z0-9]*
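
The Integer and Identifier patterns above can be tried out with Java's `java.util.regex` package. The sketch below is an illustration rather than anything taken from the slides: the class `RegexLexer`, the keyword list, and the exact patterns for comments, decimal literals, operators, and separators are all assumptions chosen to match the token categories listed earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexLexer {
    private static final String[] CATEGORIES =
        {"comment", "keyword", "identifier", "literal", "operator", "separator"};

    // Alternatives are tried left to right, so keywords come before
    // identifiers and a comment's "//" comes before the "/" operator.
    private static final Pattern TOKEN = Pattern.compile(
          "(?<comment>//[^\\n]*)"
        + "|(?<keyword>\\b(?:void|int|float|if|for)\\b)"   // assumed keyword list
        + "|(?<identifier>[a-zA-Z][a-zA-Z0-9]*)"
        + "|(?<literal>[0-9]+(?:\\.[0-9]+)?)"
        + "|(?<operator>[-+*/=])"
        + "|(?<separator>[;.(){}])");

    // Returns each lexeme tagged with its token category, e.g. "literal:3.14".
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {   // find() skips characters that match no category, such as blanks
            for (String category : CATEGORIES) {
                if (m.group(category) != null) {
                    tokens.add(category + ":" + m.group(category));
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("void main ( ) { float p; p = 3.14 ; }")) {
            System.out.println(token);
        }
    }
}
```

A production lexer would report unknown characters instead of silently skipping them; that check is left out to keep the sketch short.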

  11. Syntactic Analysis
     • Primary tool: BNF
     • Input: tokens from lexical analysis
     • Output: parse tree
     • Syntactic categories
       • Program
       • Declaration
       • Assignment
       • Expression
       • Loop
       • Function definition

  12. Syntactic Analysis (cont’d)
     • Example
         ArithmeticExpression → Term | ArithmeticExpression + Term | ArithmeticExpression - Term
         Term → Factor | Term * Factor | Term / Factor
         Factor → Identifier | Literal | ( ArithmeticExpression )

  13. Syntactic Analysis (cont’d)
     • Example: parse tree for 2 * a - 3
         ArithmeticExpression
           ArithmeticExpression
             Term
               Term
                 Factor
                   Literal
                     Integer
                       2
               *
               Factor
                 Identifier
                   Letter
                     a
           -
           Term
             Factor
               Literal
                 Integer
                   3

  14. Syntactic Analysis (cont’d)
     • BNF limitations: BNF alone cannot answer
       • Has an identifier been declared?
       • Has an identifier been given an initial value?
     • In statically typed languages
       • The type system handles the first problem
       • Violations are detected at compile time or at run time

  15. Ambiguous Grammar
     • A grammar is ambiguous if some string can be parsed into two or more distinct parse trees
     • Example
         Exp → Identifier | Literal | Exp - Exp
         Input: A - B - C
         Output: 1. A - (B - C)
                 2. (A - B) - C
     • Another example is the “dangling else”
     • Ambiguity can be resolved
       • By rewriting the BNF rules
       • By extra-grammatical rules
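
To see the ambiguity concretely, the rule Exp → Exp - Exp can be applied at every possible split point of the input. The sketch below (the class `AmbiguityDemo` and method `parses` are hypothetical names) enumerates every grouping the grammar allows for a list of identifiers; more than one result means the input has more than one parse tree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AmbiguityDemo {
    // Enumerates every parse of "id - id - ... - id" under
    //   Exp -> Identifier | Exp - Exp
    // and writes each parse as a fully parenthesized string.
    static List<String> parses(List<String> ids) {
        List<String> result = new ArrayList<>();
        if (ids.size() == 1) {
            result.add(ids.get(0));               // Exp -> Identifier
            return result;
        }
        // Exp -> Exp - Exp: any '-' may be the top-level operator.
        for (int split = 1; split < ids.size(); split++) {
            for (String left : parses(ids.subList(0, split))) {
                for (String right : parses(ids.subList(split, ids.size()))) {
                    result.add("(" + left + " - " + right + ")");
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String parse : parses(Arrays.asList("A", "B", "C"))) {
            System.out.println(parse);
        }
    }
}
```

For A - B - C the method returns two groupings, matching the two outputs listed on the slide. With A = 10, B = 4, C = 3 they evaluate to 9 and 3 respectively, which is why the choice of parse matters.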

  16. Operator Precedence
     • A grammar that ignores precedence:
         <expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
         A = B + C * A  →  A = B + (C * A)
         A = B * C + A  →  A = B * (C + A)
     • Solution:
         <expr> → <expr> + <term> | <term>
         <term> → <term> * <factor> | <factor>
         <factor> → ( <expr> ) | <id>
         A = B + C * A  →  A = B + (C * A)
         A = B * C + A  →  A = (B * C) + A
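
The effect of the two grammars can be checked mechanically. The following sketch is an assumption-laden illustration: the class `PrecedenceDemo` and its method names are hypothetical, identifiers are single letters, only the right-hand side of the assignment is parsed, and the left-recursive rules of the corrected grammar are implemented with loops. Each parser returns the input with the grouping it implies made explicit.

```java
final class PrecedenceDemo {
    private final String src;   // input with blanks removed
    private int pos = 0;

    private PrecedenceDemo(String input) {
        this.src = input.replace(" ", "");
    }

    // Grouping implied by: <expr> -> <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
    static String groupFlat(String input) {
        PrecedenceDemo p = new PrecedenceDemo(input);
        String grouped = p.flatExpr();
        p.expectEnd();
        return grouped;
    }

    // Grouping implied by the corrected <expr> / <term> / <factor> grammar.
    static String groupLayered(String input) {
        PrecedenceDemo p = new PrecedenceDemo(input);
        String grouped = p.expr();
        p.expectEnd();
        return grouped;
    }

    private String flatExpr() {
        if (peek('(')) {
            pos++;
            String inner = flatExpr();
            expect(')');
            return inner;
        }
        String id = id();
        if (peek('+') || peek('*')) {
            char op = src.charAt(pos++);
            return "(" + id + " " + op + " " + flatExpr() + ")";
        }
        return id;
    }

    // <expr> -> <expr> + <term> | <term>
    private String expr() {
        String left = term();
        while (peek('+')) {
            pos++;
            left = "(" + left + " + " + term() + ")";
        }
        return left;
    }

    // <term> -> <term> * <factor> | <factor>
    private String term() {
        String left = factor();
        while (peek('*')) {
            pos++;
            left = "(" + left + " * " + factor() + ")";
        }
        return left;
    }

    // <factor> -> ( <expr> ) | <id>
    private String factor() {
        if (peek('(')) {
            pos++;
            String inner = expr();
            expect(')');
            return inner;
        }
        return id();
    }

    private String id() {
        if (pos >= src.length() || !Character.isLetter(src.charAt(pos))) {
            throw new IllegalArgumentException("identifier expected at position " + pos);
        }
        return String.valueOf(src.charAt(pos++));
    }

    private boolean peek(char c) {
        return pos < src.length() && src.charAt(pos) == c;
    }

    private void expect(char c) {
        if (!peek(c)) {
            throw new IllegalArgumentException("'" + c + "' expected at position " + pos);
        }
        pos++;
    }

    private void expectEnd() {
        if (pos != src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + pos);
        }
    }

    public static void main(String[] args) {
        System.out.println(groupFlat("B * C + A"));
        System.out.println(groupLayered("B * C + A"));
    }
}
```

For B * C + A the first grammar groups the addition first, while the layered grammar groups the multiplication first, because `*` is introduced lower in the rule hierarchy than `+`.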

  17. Associativity of Operators
     • How are A + B + C, A * B * C, A / B / C, … grouped?
     • Left associativity
       • Left recursive: in a grammar rule, the LHS also appears at the beginning of its RHS
           <expr> → <expr> + <term> | <term>
           A + B + C  →  (A + B) + C
     • Right associativity
       • Right recursive: in a grammar rule, the LHS also appears at the end of its RHS
           <factor> → <exp> ** <factor> | <exp>
           <exp> → ( <expr> ) | <id>
           A + B ** C  →  A + (B ** C)
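
The difference between the two groupings shows up as soon as the operator is not associative in the mathematical sense. The sketch below uses hypothetical names (`AssociativityDemo`, `divideLeft`, `powerRight`) and integer arithmetic: a loop accumulates from the left, as a left-recursive rule would, and a recursive call evaluates the rest of the chain first, as a right-recursive rule would.

```java
final class AssociativityDemo {
    // Left associative: a / b / c is evaluated as (a / b) / c.
    static long divideLeft(long... operands) {
        long result = operands[0];
        for (int i = 1; i < operands.length; i++) {
            result = result / operands[i];
        }
        return result;
    }

    // Right associative: a ** b ** c is evaluated as a ** (b ** c).
    static long powerRight(long... operands) {
        return powerFrom(operands, 0);
    }

    private static long powerFrom(long[] operands, int index) {
        if (index == operands.length - 1) {
            return operands[index];
        }
        // The rest of the chain is evaluated first, then used as the exponent.
        return power(operands[index], powerFrom(operands, index + 1));
    }

    // Integer exponentiation by repeated multiplication (non-negative exponents only).
    private static long power(long base, long exponent) {
        long result = 1;
        for (long k = 0; k < exponent; k++) {
            result *= base;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(divideLeft(8, 4, 2));
        System.out.println(powerRight(2, 3, 2));
    }
}
```

Here 8 / 4 / 2 gives 1 under left grouping (right grouping would give 4), and 2 ** 3 ** 2 gives 512 under right grouping (left grouping would give 64).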

  18. Extended BNF (EBNF)
     • Optional part of an RHS
         <if_stmt> → if ( <expr> ) <statement> [ else <statement> ]
     • Repetition, or recursion, part of an RHS
         <id_list> → <id> { , <id_list> }
     • Multiple-choice option of an RHS
         <term> → <term> ( * | / | % ) <factor>
     • Optional use of * and +
         <id_list> → <id> { , <id_list> }*
         <integer> → {0 | … | 9}+
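
EBNF constructs map naturally onto control flow: a repetition in braces becomes a loop and an optional part in brackets becomes an `if`. The sketch below illustrates the repetition case under an assumption: it uses the common form `<id_list> → <id> { , <id> }`, which generates the same language as the rule on the slide. The class `IdListParser` and its methods are hypothetical names, and blanks are simply dropped before parsing.

```java
import java.util.ArrayList;
import java.util.List;

final class IdListParser {
    private final String src;   // input with blanks removed
    private int pos = 0;

    private IdListParser(String input) {
        this.src = input.replace(" ", "");
    }

    // <id_list> -> <id> { , <id> }
    static List<String> parseIdList(String input) {
        IdListParser parser = new IdListParser(input);
        List<String> ids = new ArrayList<>();
        ids.add(parser.id());
        // The braces in the rule become a loop: zero or more ", <id>".
        while (parser.pos < parser.src.length() && parser.src.charAt(parser.pos) == ',') {
            parser.pos++;
            ids.add(parser.id());
        }
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + parser.pos);
        }
        return ids;
    }

    // <id>: a letter followed by letters or digits.
    private String id() {
        int start = pos;
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) {
            pos++;
        }
        if (pos == start || !Character.isLetter(src.charAt(start))) {
            throw new IllegalArgumentException("identifier expected at position " + start);
        }
        return src.substring(start, pos);
    }

    public static void main(String[] args) {
        System.out.println(parseIdList("a, b, c"));
    }
}
```

The loop replaces the recursion that plain BNF would need for the same list, which is the main convenience EBNF adds.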

  19. Extended BNF (EBNF) (cont’d)
     • opt subscript
         ConditionalStatement → if ( Expr ) Statement { else Statement }opt
     • Syntax diagram
         [Figure: syntax diagram for Term, built from Factor and the operators * and /]

  20. Case Study
     • A BNF or EBNF for one grammar, such as Expression, different Literals, or the if statement, in Java, C, C++, or Pascal
     • BNF or EBNF for floating-point numbers in Java, C, C++
     • BNF or EBNF for loop statements in one language

  21. Abstract Syntax
     • Consider the following code fragments:
       • Pascal
           while i < 10 do begin
             i := i + 1;
           end;
       • C or Java
           while (i < 10) {
             i = i + 1;
           }
     • Although the syntax differs, the two loops are essentially equivalent
     • Abstract syntax is a way to show only the essential elements of a language construct

  22. Abstract Syntax (cont’d)
     • General form
         Abstract Syntax Class = list of essential components
         [Figure labels: Member, Element]
     • Example
         Loop = Expression test; Statement body
     • A Java class for the abstract syntax of Loop
         class Loop extends Statement {
           Expression test;
           Statement body;
         }

  23. Abstract Syntax (cont’d)
     • More examples
         Assignment = Variable target; Expression source
     • A Java class for the abstract syntax of Assignment
         class Assignment extends Statement {
           Variable target;
           Expression source;
         }

  24. Abstract Syntax Tree
     • A tree that shows the abstract syntax of a construct
     • Example: x = 2; and x := 2; share the same tree
         Assignment = Variable target; Expression source
         Statement
           Assignment
             Variable
               x
             Expression
               Value
                 2
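
The tree for x = 2 can be built from classes shaped like the Assignment class shown earlier. The sketch below fills in the pieces the slides leave out, and those pieces are assumptions: the wrapper class `AstDemo`, the `Value` class for the literal, the constructors, the `toString` methods, and the choice to make `Variable` a kind of `Expression`.

```java
final class AstDemo {
    static abstract class Statement { }

    static abstract class Expression { }

    // Assumption: a variable can also appear wherever an expression is expected.
    static final class Variable extends Expression {
        final String name;
        Variable(String name) { this.name = name; }
        @Override public String toString() { return "Variable(" + name + ")"; }
    }

    // Assumption: integer literals are represented by a Value node.
    static final class Value extends Expression {
        final int value;
        Value(int value) { this.value = value; }
        @Override public String toString() { return "Value(" + value + ")"; }
    }

    // Assignment = Variable target; Expression source
    static final class Assignment extends Statement {
        final Variable target;
        final Expression source;
        Assignment(Variable target, Expression source) {
            this.target = target;
            this.source = source;
        }
        @Override public String toString() {
            return "Assignment(target=" + target + ", source=" + source + ")";
        }
    }

    // The same tree represents both "x = 2;" (C, Java) and "x := 2;" (Pascal).
    static Assignment example() {
        return new Assignment(new Variable("x"), new Value(2));
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

Nothing in the tree records whether the source text used `=` or `:=`; that concrete detail is exactly what abstract syntax discards.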

  25. Recursive Descent Parser
     • A top-down parser that verifies the syntax of a stream of text from left to right
     • It contains several recursive methods, each of which implements a rule of the grammar
     • More details and parsing algorithms are covered in the Compiler course
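
As an illustration of the idea, here is a minimal recursive descent recognizer for the arithmetic expression grammar from the Syntactic Analysis slides. It is a sketch under stated assumptions, not the course's implementation: the class `ExpressionRecognizer` and its method names are hypothetical, it works directly on characters instead of a token stream, literals are unsigned integers, and the left-recursive rules are rewritten as loops because a method that called itself first would never terminate.

```java
final class ExpressionRecognizer {
    private final String src;
    private int pos = 0;

    private ExpressionRecognizer(String input) {
        this.src = input;
    }

    // Returns true when the whole input is a syntactically valid expression.
    static boolean accepts(String input) {
        ExpressionRecognizer r = new ExpressionRecognizer(input);
        try {
            r.arithmeticExpression();
            r.skipBlanks();
            return r.pos == r.src.length();
        } catch (IllegalStateException syntaxError) {
            return false;
        }
    }

    // ArithmeticExpression -> Term { ( + | - ) Term }
    private void arithmeticExpression() {
        term();
        while (peekIs('+') || peekIs('-')) {
            pos++;
            term();
        }
    }

    // Term -> Factor { ( * | / ) Factor }
    private void term() {
        factor();
        while (peekIs('*') || peekIs('/')) {
            pos++;
            factor();
        }
    }

    // Factor -> Identifier | Literal | ( ArithmeticExpression )
    private void factor() {
        if (peekIs('(')) {
            pos++;
            arithmeticExpression();
            if (!peekIs(')')) {
                throw new IllegalStateException("')' expected at position " + pos);
            }
            pos++;
        } else {
            operand();
        }
    }

    // Identifier: letter followed by letters or digits. Literal: one or more digits.
    private void operand() {
        skipBlanks();
        if (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        } else if (pos < src.length() && Character.isLetter(src.charAt(pos))) {
            while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
        } else {
            throw new IllegalStateException("identifier or literal expected at position " + pos);
        }
    }

    private boolean peekIs(char c) {
        skipBlanks();
        return pos < src.length() && src.charAt(pos) == c;
    }

    private void skipBlanks() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
    }

    public static void main(String[] args) {
        System.out.println(accepts("2 * a - 3"));
        System.out.println(accepts("2 * (a - 3"));
    }
}
```

Each grammar rule corresponds to one method, and the nesting of method calls during a run traces out the parse tree from the top down.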

  26. Exercises
     • Modify the following grammar to add a unary minus operator that has higher precedence than either + or *.
         <assign> → <id> = <expr>
         <id> → A | B | C
         <expr> → <expr> + <term> | <term>
         <term> → <term> * <factor> | <factor>
         <factor> → ( <expr> ) | <id>

  27. Exercises
     • Consider the following grammar:
         <S> → <A> a <B> b
         <A> → <A> b | b
         <B> → a <B> | a
       Which of the following sentences are in the language generated by this grammar?
       • baab
       • bbbab
       • bbaaaaa
       • bbaab

  28. Exercises
     • Convert the following EBNF to BNF:
         S → A { bA }
         A → a [b]A
     • Using the grammar in question 1, add the ++ and -- unary operators of Java.
     • Using the grammar in question 1, show a parse tree and a leftmost derivation for each of the following statements:
       • A = (A + B) * C
       • A = B * (C * (A + B))

  29. Exercises
     • Rewrite the BNF in question 1 to give + precedence over *, and force + to be right associative.
     • Using BNF, write a grammar for the language consisting of strings a^n b^n, where n > 0, such as ab, aabb, … . Can you write this using regular expressions?
