
CPSC4600 Implementation of HLPL Spring 2003



  1. CPSC4600 Implementation of HLPL, Spring 2003. Instructor: Dr. Shahadat Hossain

  2. Today’s Agenda • Administrative Matters • Course Information • Introduction to translation

  3. Course Assessment • Lectures: MWF 10:00 a.m. - 10:50 a.m. • Project: Write a compiler for a small procedural language • Two quiz tests (in January and March) • One midterm on February 15 • Final exam

  4. Course Material • Course web page: http://classes.uleth.ca/200301/cpsc4600a/ (course-related material will be made available here) • Homework assignments will be given occasionally; they will not be graded • Texts: Compilers by Aho, Sethi, and Ullman; Brinch Hansen on Pascal Compilers by P. B. Hansen

  5. Grading Method • Project: 25% of course grade • Quiz tests: 5% each • Midterm: 25% • Final: 40%

  6. Out of Class Help • Office Hours: MWF 4:00 p.m. - 5:00 p.m., or by appointment • Office Location: C540 University Hall

  7. Tentative Project Schedule
     • Scanner (weight 20%): January 31 (midnight)
     • Parser (weight 25%): February 21 (midnight)
     • Scope and Type Check (weight 20%): March 08 (midnight)
     • Complete Compiler with code generation (weight 35%): April 04 (midnight)
     NB: The project can be done in groups of at most 3 people. All phases of the project must be completed.

  8. Course Outline
     Brinch Hansen on Pascal Compilers: whole book
     The dragon book (Aho, Sethi, Ullman):
     - Chapter 1: all
     - Chapter 2: all
     - Chapter 3: 3.1-3.7 (selected topics)
     - Chapter 4: 4.1-4.5 (selected topics)
     - Chapter 5: 5.1-5.2 (selected topics)
     - Chapter 6: 6.1-6.2 (selected topics)
     - Chapter 7: 7.1-7.6 (selected topics)
     - Chapter 8: 8.1-8.7 (selected topics)

  9. What is cs4600? • Introduction to theoretical and practical aspects of program translators • Learn theory by doing! • Handling a nontrivial software project

  10. Bits of advice to succeed in cs4600 • Start early • Document your code • Design before implementing • Test each function/method separately for correctness before you plug it into the main code • Discuss with your group members

  11. Questions?

  12. Pascal-: A subset of Pascal
      Pascal- has only
      • two simple types: integer and Boolean
      • two structured types: array and record
      Type definition: A type definition always creates a new type; it can never rename an existing type.
        type table = array [1..100] of integer;
             stack = record contents: table; size: integer end;

  13. Pascal-: A subset of Pascal
      Variable definition: A type name must be used in a variable definition.
        var A: table; x, y: integer;
      All constants have simple types.
      Predefined constants: true, false
      Constant definition:
        const max = 100; on = true;

  14. Pascal-: A subset of Pascal
      Statements:
      - assignment: x := y;
      - if-statement: if x = y then x := 1;
      - while-statement: while I < 10 do I := I+1;
      - compound statement: begin x := y; y := z end
      - procedure call
      - recursion

  15. A Complete Pascal- Program
        program ProgramExample;
        const n = 100;
        type table = array[1..n] of integer;
        var A: table; i, x: integer; yes: Boolean;
        procedure search(value: integer; var found: Boolean; var index: integer);
        var limit: integer;
        begin
          index := 1; limit := n;
          while index < limit do
            if A[index] = value then limit := index
            else index := index + 1;
          found := A[index] = value
        end;

  16. A Complete Pascal- Program
        begin {input table}
          i := 1;
          while i <= n do
            begin read(A[i]); i := i + 1 end;
          {test search}
          read(x);
          while x <> 0 do
            begin
              search(x, yes, i);
              write(x);
              if yes then write(i);
              read(x);
            end
        end. {program}

  17. Pascal- Vocabulary
      The vocabulary of a programming language is made up of basic symbols and comments.
      Basic symbols:
      a) Identifiers: In Pascal-, an identifier is called a Name and consists of a letter that may be followed by any sequence of letters and digits (identifiers are case insensitive).
      b) Denotations: Denotations represent specific values, according to conventions laid down by the language designer. In Pascal-, a Numeral is the only denotation in the vocabulary.

  18. Pascal- Vocabulary
      c) Delimiters: There are two kinds of delimiters in Pascal-, word symbols and special symbols.
         Word symbols: and array begin const div do else end if mod not of or procedure program record then type var while
         Special symbols: + - * < = > <= <> >= := ( ) [ ] , . : ; ..
      Comments: A comment in Pascal- is an arbitrary sequence of characters enclosed in braces { }. Comments may extend over several lines and may be nested to arbitrary depth.
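Because Pascal- comments may nest, a scanner cannot simply search for the first closing brace. The following is a minimal sketch of one way to skip a nested comment by counting open braces. The function name `skip_comment` and the choice to scan a NUL-terminated string (rather than a character stream) are assumptions made for illustration, not part of the course material.

```c
#include <assert.h>
#include <stddef.h>

/* Skip one Pascal- comment that starts at s[0] == '{'.
   Returns a pointer to the character after the matching '}',
   or NULL if the input ends before the comment is closed.
   (Hypothetical helper; the name and interface are illustrative.) */
const char *skip_comment(const char *s)
{
    int depth = 0;                 /* number of currently open braces */

    assert(s != NULL && *s == '{');
    for (; *s != '\0'; s++) {
        if (*s == '{') {
            depth++;               /* a nested comment opens */
        } else if (*s == '}') {
            depth--;
            if (depth == 0)
                return s + 1;      /* outermost comment is closed */
        }
    }
    return NULL;                   /* unterminated comment */
}
```

A real scanner would likely report the unterminated case as a lexical error together with the line number where the comment began.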

  19. Pascal- Vocabulary
      White space (spaces, tabs, and new lines) and comments are called separators. Any basic symbol may be preceded by one or more separators, and the program may be followed by zero or more separators.
      Example:
        {Incorrect} ifx>0thenx:=10divx-1;
        {Correct}   if x>0 then{Can divide}x:=10 div x-1;

  20. Pascal- Grammar
        Program --> 'program' ProgramName ';' BlockBody '.'
        BlockBody --> [ConstantDefinitionPart] [TypeDefinitionPart] [VariableDefinitionPart] {ProcedureDefinition} CompoundStatement
      Constant, Type, and Variable definition grammar:

  21. Pascal- Grammar: Constant, Type, and Variable definition grammar
        ConstantDefinitionPart --> 'const' ConstantDefinition {ConstantDefinition}
        ConstantDefinition --> ConstantNameDef '=' Constant ';'
        Constant --> Numeral | ConstantNameUse
        TypeDefinitionPart --> 'type' TypeDefinition {TypeDefinition}
        TypeDefinition --> TypeNameDef '=' NewType ';'
        NewType --> NewArrayType | NewRecordType
        NewArrayType --> 'array' '[' IndexRange ']' 'of' TypeNameUse
        IndexRange --> Constant '..' Constant

  22. Pascal- Grammar: Constant, Type, and Variable definition grammar
        NewRecordType --> 'record' FieldList 'end'
        FieldList --> RecordSection {';' RecordSection}
        RecordSection --> FieldNameDefList ':' TypeNameUse
        FieldNameDefList --> FieldNameDef {',' FieldNameDef}
        VariableDefinitionPart --> 'var' VariableDefinition {VariableDefinition}
        VariableDefinition --> VariableNameDefList ':' TypeNameUse ';'
        VariableNameDefList --> VariableNameDef {',' VariableNameDef}

  23. Pascal- Grammar: Expression grammar
        Expression --> SimpleExpression [RelationalOperator SimpleExpression]
        RelationalOperator --> '<' | '=' | '>' | '<=' | '<>' | '>='
        SimpleExpression --> [SignOperator] Term | SimpleExpression AddingOperator Term
        SignOperator --> '+' | '-'
        AddingOperator --> '+' | '-' | 'or'
        Term --> Factor | Term MultiplyingOperator Factor
        MultiplyingOperator --> '*' | 'div' | 'mod' | 'and'

  24. Pascal- Grammar: Expression grammar
        Factor --> Numeral | VariableAccess | '(' Expression ')' | NotOperator Factor
        NotOperator --> 'not'
        VariableAccess --> VariableNameUse | VariableAccess '[' Expression ']' | VariableAccess '.' FieldNameUse

  25. Pascal- Grammar: Statement grammar
        Statement --> AssignmentStatement | ProcedureStatement | IfStatement | WhileStatement | CompoundStatement | Empty
        AssignmentStatement --> VariableAccess ':=' Expression
        ProcedureStatement --> ProcedureNameUse ActualParameterList

  26. Pascal- Grammar: Statement grammar
        ActualParameterList --> '(' ActualParameters ')'
        ActualParameters --> ActualParameter {',' ActualParameter}
        ActualParameter --> Expression
        IfStatement --> 'if' Expression 'then' Statement | 'if' Expression 'then' Statement 'else' Statement
        WhileStatement --> 'while' Expression 'do' Statement
        CompoundStatement --> 'begin' Statement {';' Statement} 'end'

  27. Pascal- Grammar: Procedure grammar
        ProcedureDefinition --> 'procedure' ProcedureNameDef ProcedureBlock ';'
        ProcedureBlock --> FormalParameterList ';' BlockBody
        FormalParameterList --> Empty | '(' ParameterDefinitions ')'
        ParameterDefinitions --> ParameterDefinition {';' ParameterDefinition}
        ParameterDefinition --> 'var' ParameterNameDefList ':' TypeNameUse | ParameterNameDefList ':' TypeNameUse
        ParameterNameDefList --> ParameterNameDef | ParameterNameDefList ',' ParameterNameDef

  28. The Project Language PL
      A PL program consists of a Block followed by a period:
        Program -> Block .
      The Block describes a set of named objects (constants, variables, and procedures) and a sequence of statements that use these objects.
      Variables can be simple variables of integer or Boolean type, or one-dimensional arrays of integer or Boolean elements indexed from 1 to some constant n.

  29. The Project Language PL
        Block -> begin DefinitionPart StatementPart end
        DefinitionPart -> { Definition ; }
        Definition -> ConstantDefinition | VariableDefinition | ProcedureDefinition
        ConstantDefinition -> const ConstantName = Constant
        ConstantName -> Identifier
        VariableDefinition -> TypeSymbol VariableList | TypeSymbol array VariableList '[' Constant ']'
        TypeSymbol -> integer | Boolean
        VariableList -> VariableName { , VariableName }
        VariableName -> Identifier

  30. The Project Language PL
      Procedure definition: Procedures can be recursive, but have no parameters in PL.
        ProcedureDefinition -> proc ProcedureName Block
        ProcedureName -> Identifier

  31. The Project Language PL
      PL statements:
        StatementPart -> { Statement ; }
        Statement -> EmptyStatement | ReadStatement | WriteStatement | AssignmentStatement | ProcedureStatement | IfStatement | DoStatement

  32. The Project Language PL
      PL statements can be:
      • the empty statement, which does nothing
          EmptyStatement -> skip
      • the read statement, which reads one or more integers into variables
          ReadStatement -> read VariableAccessList
      • the write statement, which writes out the values of a sequence of integer expressions
          WriteStatement -> write ExpressionList
      • the assignment statement, which assigns the values of a sequence of expressions to a sequence of variable accesses to distinct variables, all of the same simple type
          AssignmentStatement -> VariableAccessList := ExpressionList

  33. The Project Language PL
      • the procedure statement, which activates the code of a possibly recursive parameterless procedure; all local variables are allocated anew whenever the procedure is activated
          ProcedureStatement -> call ProcedureName
          ProcedureName -> Identifier
      • the if statement, which selects a guarded command whose guard is true from a sequence of such commands and executes the corresponding sequence of statements; at least one of the guards must evaluate to true, otherwise program execution is aborted with an error message; if more than one guard is true, one is selected arbitrarily
          IfStatement -> if GuardedCommandList fi
          DoStatement -> do GuardedCommandList od

  34. The Project Language PL
      • the do statement, which executes a sequence of guarded commands repeatedly until all of the guards evaluate to false; at each iteration, if at least one guard is true, a guarded command with a true guard is selected and the corresponding statements are executed
          DoStatement -> do GuardedCommandList od
          GuardedCommandList -> GuardedCommand { [] GuardedCommand }
          GuardedCommand -> Expression -> StatementPart
          VariableAccessList -> VariableAccess { , VariableAccess }
          ExpressionList -> Expression { , Expression }

  35. The Project Language PL
      Expressions: In PL, expressions may contain the arithmetic operators + - * / and \ or the relational operators <, = and >
        Expression -> PrimaryExpression { PrimaryOperator PrimaryExpression }
        PrimaryOperator -> '&' | '|'
        PrimaryExpression -> SimpleExpression [ RelationalOperator SimpleExpression ]
        RelationalOperator -> < | = | >
        SimpleExpression -> [ - ] Term { AddingOperator Term }
        AddingOperator -> + | -
        Term -> Factor { MultiplyingOperator Factor }
        MultiplyingOperator -> * | / | \
        Factor -> Constant | VariableAccess | ( Expression ) | ~ Factor

  36. The Project Language PL
      Expressions: In PL, expressions may contain the arithmetic operators + - * / and \ or the relational operators <, = and >
        Factor -> Constant | VariableAccess | ( Expression ) | ~ Factor
        Constant -> Numeral | BooleanSymbol | ConstantName
        VariableAccess -> VariableName [ IndexedSelector ]
        IndexedSelector -> '[' Expression ']'
        BooleanSymbol -> false | true
        Numeral -> Digit { Digit }
        Identifier -> Letter { Letter | Digit | _ }
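The two lexical rules at the end of the grammar, `Identifier -> Letter { Letter | Digit | _ }` and `Numeral -> Digit { Digit }`, translate almost directly into character tests. The sketch below shows one possible encoding. The function names are hypothetical, and the checks work on a whole NUL-terminated string; they deliberately do not reject word symbols such as `begin`, which a real PL scanner would have to tell apart from identifiers (for example with a symbol table).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if s matches Identifier -> Letter { Letter | Digit | _ }.
   (Hypothetical helper; word symbols are not filtered out here.) */
int is_pl_identifier(const char *s)
{
    assert(s != NULL);
    if (!isalpha((unsigned char)s[0]))
        return 0;                          /* must start with a letter */
    for (s++; *s != '\0'; s++)
        if (!isalnum((unsigned char)*s) && *s != '_')
            return 0;                      /* illegal character */
    return 1;
}

/* Returns 1 if s matches Numeral -> Digit { Digit }. */
int is_pl_numeral(const char *s)
{
    assert(s != NULL);
    if (!isdigit((unsigned char)s[0]))
        return 0;                          /* at least one digit required */
    for (s++; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}
```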

  37. Example Program in PL
        $ A PL Program: Linear Search
        begin
          const n = 10;
          integer array A[n];
          integer x, i;
          Boolean found;
          proc Search
          begin
            integer m;
            i, m := 1, n;
            do i < m ->
              if A[i] = x -> m := i;
              [] ~(A[i] = x) -> i := i + 1;
              fi;

  38. Example Program in PL
            od;
            found := A[i] = x;
          end;
          $ input the table :
          i := 1;
          do ~(i > n) -> read A[i]; i := i + 1; od;
          $ Test Search :
          read x;
          do ~(x = 0) ->
            call Search;
            if found -> write x, i;
            [] ~found -> write x;
            fi;
            read x;
          od;
        end.
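To make the guarded-command semantics of the example concrete, here is a rough C rendering of proc Search. It is only a sketch: the C function takes the table and the key as parameters (PL procedures have none and use the enclosing block's variables), and element 0 of the C array is left unused so that indices match PL's 1-based array. Because the two guards of the inner if statement are complementary, an ordinary if/else is a faithful translation in this particular case.

```c
#include <assert.h>
#include <stddef.h>

#define N 10   /* corresponds to "const n = 10" in the PL program */

/* Mirrors proc Search. a[1..N] holds the table; a[0] is ignored.
   Returns the final value of i and stores the PL variable "found"
   through the pointer. (Parameters are an assumption of this sketch.) */
int search(const int a[N + 1], int x, int *found)
{
    int i = 1, m = N;              /* i, m := 1, n */

    assert(found != NULL);
    while (i < m) {                /* do i < m -> ... od */
        if (a[i] == x)             /* if A[i] = x -> m := i */
            m = i;
        else                       /* [] ~(A[i] = x) -> i := i + 1 fi */
            i = i + 1;
    }
    *found = (a[i] == x);          /* found := A[i] = x */
    return i;
}
```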

  39. A simple one-pass compiler (Chapter 2 ASU)
      Build a translator for converting simple infix expressions to their postfix form
      - Discusses all phases of the translation process
      - Shows how the grammar rules are implemented in a programming language
      - Shows how the components of the translator are "glued" together
      For example, A+B+C*D
      - Postfix form: AB+CD*+
      - Postfix notation can be converted directly into code for a stack machine, for example: push A, push B, +, push C, push D, *, +, store

  40. Infix and Postfix Expressions
      Examples:
        Infix: 3 * 4              Postfix: 3 4 *
        Infix: 3 * 4 + 5 * 2      Postfix: 3 4 * 5 2 * +
        Infix: A+B+C*D            Postfix: AB+CD*+
      Postfix expressions can be converted directly into code for a stack machine: push A, push B, ADD, push C, push D, MULT, ADD, store
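The stack-machine reading of postfix can be tried out with a few lines of C. The sketch below evaluates a postfix string directly, pushing each operand and applying each operator to the top two stack entries. It assumes single-digit operands, the four operators + - * /, and a fixed stack size; the name `eval_postfix` and these restrictions are illustrative choices, and division by zero is not guarded against.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define STACK_MAX 64

/* Evaluate a postfix expression made of single-digit operands and
   the operators + - * /. Any other character (e.g. a space) is skipped.
   (Hypothetical helper for illustration.) */
int eval_postfix(const char *p)
{
    int stack[STACK_MAX];
    int top = 0;                           /* number of values on the stack */

    assert(p != NULL);
    for (; *p != '\0'; p++) {
        if (isdigit((unsigned char)*p)) {
            assert(top < STACK_MAX);
            stack[top++] = *p - '0';       /* push operand */
        } else if (*p == '+' || *p == '-' || *p == '*' || *p == '/') {
            int left, right;
            assert(top >= 2);              /* operator needs two operands */
            right = stack[--top];          /* pop right operand first */
            left = stack[--top];
            switch (*p) {
            case '+': stack[top++] = left + right; break;
            case '-': stack[top++] = left - right; break;
            case '*': stack[top++] = left * right; break;
            default:  stack[top++] = left / right; break;
            }
        }
    }
    assert(top == 1);                      /* well-formed input leaves one value */
    return stack[0];
}
```

Note that the right operand is popped first; this matters for the non-commutative operators - and /.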

  41. Context-free Grammar
        Stmt --> list eof
        list --> expr ; list | empty
        expr --> expr + term | expr - term | term
        term --> term * factor | term / factor | term div factor | term mod factor | factor
        factor --> ( expr ) | id | num
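A grammar like this one can be turned into a small infix-to-postfix translator by writing one C function per nonterminal. The sketch below is a simplified illustration, not the textbook's code: the left-recursive rules for expr and term are rewritten as loops, only the operators + - * / are handled (div and mod are left out), an id is a single letter, a num is a single digit, and the input is a string rather than a token stream. All names are assumptions of the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *in;    /* next unread input character */
static char *out;         /* next free position in the output buffer */
static int ok;            /* cleared on the first syntax error */

static void expr(void);

static void skip_blanks(void) { while (*in == ' ') in++; }

/* factor -> ( expr ) | id | num   (one letter or one digit) */
static void factor(void)
{
    skip_blanks();
    if (*in == '(') {
        in++;
        expr();
        skip_blanks();
        if (*in == ')') in++; else ok = 0;
    } else if (isalnum((unsigned char)*in)) {
        *out++ = *in++;                /* operands are emitted immediately */
    } else {
        ok = 0;
    }
}

/* term -> factor { ('*' | '/') factor } */
static void term(void)
{
    factor();
    for (skip_blanks(); ok && (*in == '*' || *in == '/'); skip_blanks()) {
        char op = *in++;
        factor();
        *out++ = op;                   /* operator follows both operands */
    }
}

/* expr -> term { ('+' | '-') term } */
static void expr(void)
{
    term();
    for (skip_blanks(); ok && (*in == '+' || *in == '-'); skip_blanks()) {
        char op = *in++;
        term();
        *out++ = op;
    }
}

/* Translate src to postfix in dst (dst needs strlen(src) + 1 bytes).
   Returns 1 on success, 0 on a syntax error. */
int to_postfix(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    in = src; out = dst; ok = 1;
    expr();
    skip_blanks();
    if (*in != '\0') ok = 0;           /* trailing garbage */
    *out = '\0';
    return ok;
}
```

Replacing left recursion with iteration keeps the operators left-associative while avoiding the infinite recursion that a direct transcription of `expr --> expr + term` would cause.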

  42. Lexical Analysis
      Remove white space (and comments):
        while (1) {
          t = getchar();
          if (t == ' ' || t == '\t' || t == '\n')
            continue;   /* strip off blanks, tabs, new lines */
          else
            break;      /* t holds the first significant character */
        }
      Recognize numbers (token and its attribute value):
        value = 0;
        while (isdigit(t)) {
          value = value * 10 + t - '0';
          t = getchar();
        }

  43. Lexical Analysis
      Recognize identifiers, keywords, and reserved words:
        if (isalpha(t)) {
          int b = 0;
          while (isalnum(t)) {
            lexbuff[b++] = t;
            t = getchar();
          }
          lexbuff[b] = '\0';   /* terminate the lexeme */
          /* other code */
        }

  44. Lexical Analysis
      A symbol table is needed to distinguish identifiers.
      -- Keywords are fixed character strings that identify certain constructs, e.g. begin
      -- Reserved words are keywords that may not be used as identifiers
      Interface to the lexical analyzer:
      [Figure: Input -> Scanner -> Parser. The scanner reads characters from the input and can put a character back; the parser calls nextToken(), and the scanner passes back a token and its attribute value. The scanner also keeps track of the line number.]

  45. Lexical Analysis
      How does the scanner distinguish the token "<" from the token "<=" when it has read the character "<"?
      • The scanner must read ahead one or more characters
      • The scanner is often implemented as a procedure called by the parser, returning one token at a time

  46. Lexical Analysis
      Input buffer
      - A block of characters is read into the buffer at a time for I/O efficiency. A pointer keeps track of how many characters have been analyzed.
      Symbol table
      - The symbol table is a database that contains information about identifiers (procedure names, variable names, labels, etc.). It can be used to communicate among multiple compiler phases.
      - Symbol table interface:
          insert(s, t): returns the index of a new entry for string s, token t
          lookup(s): returns the index of the entry for string s, or 0 if not found
      - Handling reserved words: we may initialize the symbol table by inserting all reserved words

  47. Lexical Analysis
      Symbol table implementation
      The symbol table is probably the most important data structure in compiler implementation. A good design will support the following:
      - Fast access
      - Easy to maintain
      - Flexible
      - Supporting nested scope

  48. Lexical Analysis
      A sample implementation of the symbol table
      [Figure: lexemes stored one after another in a single character array, each terminated by \0: div\0 mod\0 val\0 rate\0 height\0 ...]
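One way to realize the insert/lookup interface on top of such a lexeme array is sketched below. It is an illustration under several assumptions: the array sizes, the token codes, and the helper `init_keywords` are invented for the example, entry 0 is left unused so that lookup can return 0 for "not found", lookup is a linear search, and comparison is case-sensitive (a scanner for a case-insensitive language would normalize the lexeme first).

```c
#include <assert.h>
#include <string.h>

#define STRMAX 999   /* size of the lexeme array (assumed) */
#define SYMMAX 100   /* size of the symbol table (assumed) */

enum { TOK_ID = 256, TOK_DIV, TOK_MOD };   /* illustrative token codes */

struct entry {
    const char *lexptr;   /* points into lexemes[] */
    int token;
};

static char lexemes[STRMAX];
static int lastchar = -1;              /* last used position in lexemes[] */
static struct entry symtable[SYMMAX];
static int lastentry = 0;              /* last used index; entry 0 is unused */

/* Return the index of the entry for s, or 0 if s is not in the table. */
int lookup(const char *s)
{
    int p;
    for (p = lastentry; p > 0; p--)
        if (strcmp(symtable[p].lexptr, s) == 0)
            return p;
    return 0;
}

/* Append s to the lexeme array and return the index of its new entry. */
int insert(const char *s, int tok)
{
    int len = (int)strlen(s);

    assert(lastentry + 1 < SYMMAX);        /* room for one more entry */
    assert(lastchar + len + 2 <= STRMAX);  /* room for s and its '\0' */
    lastentry++;
    symtable[lastentry].token = tok;
    symtable[lastentry].lexptr = &lexemes[lastchar + 1];
    strcpy(&lexemes[lastchar + 1], s);
    lastchar += len + 1;
    return lastentry;
}

/* Pre-load reserved words so the scanner can tell them from identifiers. */
void init_keywords(void)
{
    insert("div", TOK_DIV);
    insert("mod", TOK_MOD);
}
```

A typical scanner would call `lookup` on each lexeme it collects and fall back to `insert(lexeme, TOK_ID)` when the result is 0. For larger programs a hash table would give faster access than this linear search.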

  49. Lexical Analysis
      Functions of the lexical analysis phase:
      - Grouping input characters into tokens
      - Stripping out comments and white space
      - Correlating error messages with the source program
      Issues (why separate lexical analysis from parsing):
      - Simpler design
      - Compiler efficiency
      - Compiler portability

  50. Lexical Analysis Tokens, Patterns and Lexemes – Pattern: A rule that describes a set of strings – Token: A set of strings with the same pattern – Lexeme: The sequence of characters of a token
