
Principles of Programming Languages

P. S. Suryateja, Asst. Professor, CSE Dept, Vishnu Institute of Technology. UNIT – 1 SYNTAX & SEMANTICS.



Presentation Transcript


  1. Principles of Programming Languages P. S. Suryateja Asst. Professor, CSE Dept Vishnu Institute of Technology

  2. UNIT – 1 SYNTAX & SEMANTICS

  3. General Problem of Describing Syntax • A language is a set of strings of characters from some alphabet. • The strings of a language are called sentences or statements. • The syntax rules specify which strings belong to the language. • The lowest-level syntactic units are known as lexemes. • Lexemes of a programming language include numeric literals, operators, special words, etc.

  4. General Problem of Describing Syntax (cont...) • Lexemes are partitioned into groups like identifiers, keywords, literals etc. • A token of a language is a category of its lexemes.

  5. General Problem of Describing Syntax (cont...) • Consider the following statement: index = 2 * count + 17;

     Lexemes   Tokens
     index     identifier
     =         equal_sign
     2         int_literal
     *         mult_op
     count     identifier
     +         plus_op
     17        int_literal
     ;         semicolon
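The lexeme/token pairing above can be sketched as a small scanner. This is a minimal illustration, not a production lexer: the token names come from the slide, while the regular expressions and the function name `tokenize` are assumptions made for this example.

```python
import re

# Token categories from the slide's table. The patterns are illustrative
# assumptions; the slides do not define them.
TOKEN_SPEC = [
    ("identifier", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("int_literal", r"\d+"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (lexeme, token) pairs; raise ValueError on unknown input."""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():          # whitespace separates lexemes
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        # lastgroup is the name of the alternative that matched, i.e. the token
        pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<8}{token}")
```

Each lexeme is the matched text and each token is the category it falls into, so `index` and `count` are different lexemes that share the token `identifier`.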

  6. Language Recognizers • A language can be defined in two ways: by recognition and by generation. • For a language L that uses an alphabet Σ of characters, we need to construct a mechanism R, called a recognition device. • The recognition device indicates whether a given string of characters from Σ is in the language L or not. • The syntax analysis part of a compiler is a recognizer for the language the compiler translates.

  7. Language Generators • A generator is a device used to generate the sentences of a language. • On its own, a generator is of limited usefulness as a language descriptor, because the particular sentence it produces at any given time is unpredictable. • An example of a language recognizer is a finite state automaton (FSA), and an example of a language generator is a context-free grammar (CFG).
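A recognizer of the FSA kind can be sketched in a few lines. The language chosen here (a letter followed by letters or digits) and the function name `is_identifier` are assumptions for illustration; the slides do not fix a particular automaton.

```python
def is_identifier(s):
    """Two-state FSA that accepts strings of the form letter (letter | digit)*."""
    state = "start"
    for ch in s:
        if state == "start" and ch.isalpha():
            state = "in_id"               # first character must be a letter
        elif state == "in_id" and (ch.isalpha() or ch.isdigit()):
            state = "in_id"               # later characters: letter or digit
        else:
            return False                  # no transition defined: reject
    return state == "in_id"               # accept only in the final state


print(is_identifier("count"), is_identifier("s1"), is_identifier("1s"))
```

The function only answers yes or no for a given string, which is exactly the role of the recognition device R described earlier.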

  8. Formal Methods of Describing Syntax – Context-Free Grammars • Two of Chomsky's four classes of grammars, namely regular grammars and context-free grammars, are used to describe the syntax of programming languages. • Regular grammars are used to describe tokens. • Context-free grammars are used to describe the syntax of whole programming languages.

  9. Formal Methods of Describing Syntax – Backus-Naur Form (BNF) • John Backus presented a paper describing ALGOL 58, which introduced a new formal notation for specifying programming language syntax. • Later, Peter Naur slightly modified the notation proposed by Backus for the description of ALGOL 60. This revised notation is called Backus-Naur Form (BNF).

  10. BNF - Fundamentals • A meta-language is a language that is used to describe another language. BNF is a meta-language for programming languages. • BNF uses abstractions for syntactic structures. Abstraction names are enclosed in angle brackets (< >). For example, the abstraction for an assignment statement can be <assign>, and its definition is as follows: <assign> -> <var> = <expression> The text on the left side of the arrow, called the left-hand side (LHS), is the abstraction being defined. The text to the right of the arrow, called the right-hand side (RHS), is the definition of the LHS and can contain a mixture of tokens, lexemes and other abstractions.

  11. BNF – Fundamentals (cont...) • The LHS and RHS together are called a rule or production. • An example sentence described by the <assign> rule: total = s1 + s2 • The abstractions in a BNF description, or grammar, are often called non-terminals, and the lexemes and tokens of the rules are called terminals. • A BNF description, or grammar, is a collection of rules.

  12. BNF – Fundamentals (cont...) • A Java if statement can be described with the following rules: <if_stmt> -> if (<logic_expr>) <stmt> <if_stmt> -> if (<logic_expr>) <stmt> else <stmt> The two rules above can be combined as follows: <if_stmt> -> if (<logic_expr>) <stmt> | if (<logic_expr>) <stmt> else <stmt>

  13. BNF – Fundamentals (cont...) • BNF does not contain ellipsis (...) to represent variable-length lists. Instead it uses recursion in the rules. • A rule is said to be recursive if the LHS appears in its RHS as shown below: <iden_list> -> identifier | identifier, <iden_list>
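The recursive <iden_list> rule maps directly onto a recursive function. The sketch below assumes the input has already been split into a list of token strings and uses Python's `str.isidentifier` as a stand-in for the identifier token; both choices are assumptions for illustration.

```python
def is_iden_list(tokens):
    """Recognize <iden_list> -> identifier | identifier , <iden_list>."""
    if not tokens or not tokens[0].isidentifier():
        return False                      # every alternative starts with an identifier
    if len(tokens) == 1:
        return True                       # first alternative: a single identifier
    # second alternative: identifier , <iden_list> (the recursive part)
    return tokens[1] == "," and is_iden_list(tokens[2:])


print(is_iden_list(["a", ",", "b", ",", "c"]))
print(is_iden_list(["a", ","]))
```

The recursive call plays the role of the <iden_list> on the RHS, which is how a list of any length is described without an ellipsis.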

  14. Grammars and Derivations • A grammar is a generative device for defining languages. • Sentences are generated through a sequence of rule applications, beginning with a special non-terminal symbol known as the start symbol. • The sequence of rule applications is called a derivation. • For a programming language, the start symbol often represents the entire program and is denoted as <program>.

  15. Grammars and Derivations (cont...) [Figure adapted from Concepts of Programming Languages, Sebesta]

  16. Grammars and Derivations (cont...) • A derivation of a program is as follows: [Figure adapted from Concepts of Programming Languages, Sebesta]

  17. Grammars and Derivations (cont...) • The symbol => is read as "derives". • Each of the strings in the derivation, including <program>, is called a sentential form. • A derivation in which the leftmost non-terminal of each sentential form is the one replaced is known as a leftmost derivation. • The sentential form consisting of only terminals, or lexemes, is the generated sentence.
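A leftmost derivation can be traced mechanically. The sketch below uses a small assignment grammar assumed for illustration (the grammar shown on the original slides is not reproduced in this transcript), and the names `GRAMMAR` and `leftmost_derivation` are hypothetical.

```python
# Assumed illustrative grammar; each non-terminal maps to its list of alternatives.
GRAMMAR = {
    "<assign>": [["<id>", "=", "<expr>"]],
    "<id>": [["A"], ["B"], ["C"]],
    "<expr>": [
        ["<id>", "+", "<expr>"],
        ["<id>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<id>"],
    ],
}


def leftmost_derivation(start, choices):
    """Apply one rule per choice, always to the leftmost non-terminal.

    choices[i] is the index of the alternative to use at step i.
    Returns every sentential form, starting with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # position of the leftmost non-terminal in the current sentential form
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(" ".join(form))
    return steps


for step in leftmost_derivation("<assign>", [0, 0, 1, 1, 2, 0, 0, 3, 2]):
    print("=>", step)
```

Every printed line is a sentential form, and the last one, which contains only terminals, is the generated sentence.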

  18. Parse Trees • Grammars naturally describe the hierarchical structure of sentences. These hierarchical structures are known as parse trees. • Every internal node in a parse tree is a non-terminal symbol. • Every leaf node is a terminal symbol. • Every sub-tree describes one instance of an abstraction in the sentence.

  19. Parse Trees (cont...) [Figure adapted from Concepts of Programming Languages, Sebesta]

  20. Parse Trees (cont...) [Figure adapted from Concepts of Programming Languages, Sebesta]

  21. Ambiguity • A grammar is said to be ambiguous if some sentence it generates has two or more distinct parse trees. [Figure adapted from Concepts of Programming Languages, Sebesta]

  22. Ambiguity (cont...) Parse trees for the string A = B + C * A [Figure adapted from Concepts of Programming Languages, Sebesta]
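Ambiguity can be checked by counting parse trees. The sketch below assumes the expression part of the grammar is <expr> -> <expr> + <expr> | <expr> * <expr> | <id>, a common textbook form of an ambiguous grammar; the function name `count_parse_trees` is hypothetical.

```python
from functools import lru_cache

OPERATORS = ("+", "*")


def count_parse_trees(tokens):
    """Count parse trees for tokens under the assumed ambiguous grammar
    <expr> -> <expr> + <expr> | <expr> * <expr> | <id>."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # number of ways <expr> can derive tokens[lo:hi]
        if hi - lo == 1:
            return 0 if tokens[lo] in OPERATORS else 1
        total = 0
        for k in range(lo + 1, hi - 1):   # try each operator as the root
            if tokens[k] in OPERATORS:
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["B", "+", "C", "*", "A"]))
```

A count greater than one for any sentence is enough to show that the grammar is ambiguous; for B + C * A the two trees differ in whether + or * sits at the root.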

  23. Operator Precedence • The mechanism that allows the implementation to choose which of several operators in an expression is evaluated first is known as operator precedence. • An ambiguous grammar makes it difficult to choose one operator over another. • The general rule is that an operator lower in the parse tree is evaluated first, and so has the higher precedence.

  24. Operator Precedence (cont...) Parse trees for the string A = B + C * A [Figure adapted from Concepts of Programming Languages, Sebesta] In one parse tree * is lower and in the other + is lower. Which one should be chosen?

  25. Operator Precedence (cont...) • The correct ordering is specified by using separate non-terminals to represent the operands of operators that require different precedence. • The previous grammar can be rewritten as an unambiguous grammar as follows: [Figure adapted from Concepts of Programming Languages, Sebesta]

  26. Operator Precedence (cont...) [Figure adapted from Concepts of Programming Languages, Sebesta]
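The effect of separate non-terminals per precedence level can be sketched with a small recursive-descent parser. The grammar in the comments is an assumed layered grammar of the usual <expr>/<term>/<factor> shape, and the left-recursive rules are implemented with loops because recursive descent cannot call itself on a left-recursive rule directly. All names are hypothetical.

```python
def parse_expr(tokens):
    """Parse a list of token strings and return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():                         # <factor> -> ( <expr> ) | <id>
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if not tok.isidentifier():
            raise SyntaxError(f"expected identifier, got {tok!r}")
        return tok

    def term():                           # <term> -> <term> * <factor> | <factor>
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                           # <expr> -> <expr> + <term> | <term>
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse_expr(["B", "+", "C", "*", "A"]))
```

Because * is only introduced by <term>, which sits below <expr>, the multiplication always ends up lower in the tree than the addition, so only one tree is possible.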

  27. Associativity • The semantic rule that specifies the order of evaluation of adjacent operators with the same precedence is known as associativity. • If the LHS of a rule also appears at the beginning of its RHS, the rule is said to be left recursive. [Figure adapted from Concepts of Programming Languages, Sebesta]

  28. Associativity (cont...) • If the LHS of a rule also appears at the very end of its RHS, the rule is said to be right recursive. • Left recursion specifies left associativity and right recursion specifies right associativity.
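The grouping implied by each kind of recursion can be shown with two small helpers. The rule shapes in the docstrings and the function names `group_left` and `group_right` are assumptions for illustration.

```python
def group_left(operands, op):
    """Grouping implied by a left-recursive rule such as <e> -> <e> op <t> | <t>."""
    tree = operands[0]
    for operand in operands[1:]:
        tree = (op, tree, operand)        # earlier operators end up lower in the tree
    return tree


def group_right(operands, op):
    """Grouping implied by a right-recursive rule such as <f> -> <t> op <f> | <t>."""
    if len(operands) == 1:
        return operands[0]
    return (op, operands[0], group_right(operands[1:], op))


print(group_left(["A", "B", "C"], "-"))    # (A - B) - C
print(group_right(["A", "B", "C"], "**"))  # A ** (B ** C)
```

The difference matters whenever the operator is not associative in the mathematical sense: (10 - 4) - 3 is 3, while 10 - (4 - 3) is 9.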

  29. Extended BNF (EBNF) • BNF was extended to improve its readability and writability. The extended version is known as Extended BNF, or simply EBNF. • Three extensions are commonly included in the various versions of EBNF. • The first extension denotes an optional part of the RHS using square brackets. Ex: <if_stmt> -> if (<expr>) <stmt> [ else <stmt> ]

  30. Extended BNF (EBNF) (cont...) • Second extension is the use of braces in the RHS to indicate that the enclosed part can be repeated indefinitely. Ex: <iden_list> -> <identifier> {, <identifier> }
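The braces in the EBNF rule correspond to a loop rather than to recursion. The sketch below assumes a list of token strings as input and uses `str.isidentifier` as a stand-in for <identifier>; the function name is hypothetical.

```python
def is_iden_list_ebnf(tokens):
    """Recognize <iden_list> -> <identifier> {, <identifier>} with a loop."""
    if not tokens or not tokens[0].isidentifier():
        return False
    pos = 1
    while pos < len(tokens):              # each pass consumes one ", <identifier>"
        has_comma = tokens[pos] == ","
        has_ident = pos + 1 < len(tokens) and tokens[pos + 1].isidentifier()
        if not (has_comma and has_ident):
            return False
        pos += 2
    return True


print(is_iden_list_ebnf(["a", ",", "b", ",", "c"]))
```

It accepts the same strings as the recursive BNF rule for <iden_list>, which is the sense in which EBNF adds convenience rather than descriptive power.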

  31. Extended BNF (EBNF) (cont...) • Third extension deals with multiple-choice options using the parentheses and OR operator, |. Ex: <term> -> <term> (* | / | % ) <factor> • The brackets, braces and parentheses are known as metasymbols.

  32. Extended BNF (EBNF) (cont...) [Figure adapted from Concepts of Programming Languages, Sebesta]

  33. Attribute Grammars • An attribute grammar is used to describe more of the structure of a programming language than can be described with a CFG alone. • An attribute grammar is an extension of a CFG. • An attribute grammar allows certain language rules, such as type compatibility, to be conveniently described.

  34. Attribute Grammars – Static Semantics • Some characteristics of programming languages, such as type compatibility, are difficult to specify using BNF. • A rule that cannot be specified using BNF at all is that all variables must be declared before they are used. • These are examples of static semantic rules. Static semantics can be checked at compile time. • Attribute grammars are one of the alternatives for describing static semantics. They were designed by Knuth.

  35. Attribute Grammars – Basic Concepts • Attribute grammars are CFGs along with attributes, attribute computation functions and predicate functions. • Attributes are associated with grammar symbols (terminals and non-terminals) and are similar to variables. • Attribute computation functions are associated with grammar rules. They are used to specify how attribute values are computed. • Predicate functions, which state the static semantic rules, are associated with grammar rules.

  36. Attribute Grammars – Definition • Associated with each grammar symbol X is a set of attributes A(X). • The set A(X) consists of two disjoint sets S(X) and I(X), called synthesized attributes and inherited attributes. • Synthesized attributes are used to pass semantic information up the parse tree. • Inherited attributes pass semantic information down and across the tree.

  37. Attribute Grammars – Definition (cont...) • Associated with each grammar rule is a set of semantic functions. • For a rule X0 -> X1 ... Xn, the synthesized attributes of X0 are computed with semantic functions of the form S(X0) = f(A(X1), ..., A(Xn)). So the value of a synthesized attribute on a node depends only on the values of the attributes of that node's child nodes. • Inherited attributes of symbols Xj, 1 <= j <= n, are computed with semantic functions of the form I(Xj) = f(A(X0), ..., A(Xn)). So the value of an inherited attribute on a node depends on the attribute values of that node's parent node and those of its sibling nodes.

  38. Attribute Grammars – Definition (cont...) • A predicate function has the form of a Boolean expression on the union of the attribute set {A(X0), ..., A(Xn)} and a set of literal attribute values. • The only derivations allowed with an attribute grammar are those in which every predicate associated with every non-terminal is true.

  39. Intrinsic Attributes • Intrinsic attributes are synthesized attributes of leaf nodes whose values are determined outside the parse tree (e.g., the type of a variable, taken from the symbol table). • Given the intrinsic attribute values on a parse tree, the semantic functions can be used to compute the remaining attribute values.

  40. Attribute Grammar – Example 1 An attribute grammar that describes the rule that the name at the end of an Ada procedure must match the procedure's name. (This rule cannot be stated using BNF.) Note: Numbers written as subscripts denote the instances of an abstraction. [Figure adapted from Concepts of Programming Languages, Sebesta]

  41. Attribute Grammar – Example 2 actual_type: synthesized attribute. expected_type: inherited attribute. [Figure adapted from Concepts of Programming Languages, Sebesta]

  42. Attribute Grammar – Example 2 (cont...) [Figure adapted from Concepts of Programming Languages, Sebesta]

  43. Attribute Grammar – Example 2 (cont...) [Figure adapted from Concepts of Programming Languages, Sebesta]

  44. Attribute Grammar – Example 2 (cont...) [Figure adapted from Concepts of Programming Languages, Sebesta]
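The roles of actual_type and expected_type can be sketched in code. The grammar assumed here is <assign> -> <var> = <expr> with <expr> -> <var> + <var> | <var>; the symbol table contents, the int/real typing rule and all function names are assumptions for illustration, not a reproduction of the slides' figures.

```python
# Hypothetical symbol table: the source of the intrinsic attribute values.
SYMBOL_TABLE = {"A": "real", "B": "int", "C": "int"}


def var_actual_type(name):
    """Intrinsic attribute of a <var> leaf: looked up outside the parse tree."""
    return SYMBOL_TABLE[name]


def expr_actual_type(operands):
    """Synthesized attribute of <expr>, computed from its <var> children."""
    types = [var_actual_type(v) for v in operands]
    if len(types) == 1:                   # <expr> -> <var>
        return types[0]
    # <expr> -> <var> + <var>: int only if both operands are int (assumed rule)
    return "int" if types == ["int", "int"] else "real"


def check_assign(target, operands):
    """<assign> -> <var> = <expr>: evaluate the predicate for this rule."""
    expected_type = var_actual_type(target)   # inherited by <expr> from the target <var>
    actual_type = expr_actual_type(operands)
    return actual_type == expected_type       # predicate function


print(check_assign("A", ["A", "B"]))   # real = real + int
print(check_assign("B", ["A", "B"]))   # int = real + int
```

Information flows up the tree through actual_type and across it through expected_type, and a derivation is allowed only when the predicate holds.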

  45. Dynamic Semantics • Dynamic semantics deals with the meaning of expressions, statements and program units. • No universally accepted notation or approach has been devised for dynamic semantics.

  46. Operational Semantics • Operational semantics specifies the meaning of a program in terms of its implementation on a real or virtual machine. • The change in the state of the machine caused by executing a statement defines the meaning of that statement. • To use operational semantics for a high-level language, a virtual machine is needed. • At the highest level, operational semantics is known as natural operational semantics, and at the lowest level it is known as structural operational semantics.

  47. Operational Semantics - Ex
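As a hedged illustration of the idea that a statement's meaning is the state change it causes, the sketch below defines a tiny virtual machine and expresses a counting loop in its instructions. The instruction set, the program `SUM_LOOP` and the function `execute` are all invented for this example.

```python
def execute(program, state):
    """Run low-level instructions on a virtual machine whose state is a dict."""
    state = dict(state)                   # work on a copy of the initial state
    pc = 0                                # program counter
    while pc < len(program):
        op, *args = program[pc]
        if op == "set":                   # set var, constant
            state[args[0]] = args[1]
        elif op == "add":                 # add dest, src  (dest = dest + src)
            state[args[0]] += state[args[1]]
        elif op == "jump":                # jump target
            pc = args[0]
            continue
        elif op == "jump_unless_less":    # jump to target unless a < b
            if not state[args[0]] < state[args[1]]:
                pc = args[2]
                continue
        else:
            raise ValueError(f"unknown instruction {op!r}")
        pc += 1
    return state


# Low-level form of: sum = 0; i = 0; while (i < n) { sum = sum + i; i = i + 1; }
SUM_LOOP = [
    ("set", "sum", 0),
    ("set", "i", 0),
    ("set", "one", 1),
    ("jump_unless_less", "i", "n", 7),
    ("add", "sum", "i"),
    ("add", "i", "one"),
    ("jump", 3),
]

print(execute(SUM_LOOP, {"n": 4}))
```

The meaning of the loop is read off from how the state (`sum`, `i`) differs before and after execution, which is the operational view.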

  48. Operational Semantics - Evaluation • Advantages: may be simple for small examples; good if used informally; useful for implementation. • Disadvantages: very complex for large programs; lacks mathematical rigor. • Uses: the Vienna Definition Language (VDL), used to define PL/I; compiler work.

  49. Denotational Semantics • Denotational semantics is a formal method for specifying the meaning of programs. It is based on recursive function theory. • The key idea is to define a function that maps a program (a syntactic object) to its meaning (a semantic object). • The domain of the mapping function is called the syntactic domain and the range is called the semantic domain. • The method is named denotational because the mathematical objects denote the meaning of their corresponding syntactic entities.
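A mapping function of this kind can be sketched for binary numerals, a standard small example. The syntactic domain is assumed to be strings over '0' and '1' described by <bin_num> -> '0' | '1' | <bin_num> '0' | <bin_num> '1', the semantic domain is the non-negative integers, and the function name `m_bin` is hypothetical.

```python
def m_bin(numeral):
    """Map a binary numeral (syntactic object) to the integer it denotes."""
    if numeral == "0":
        return 0                          # M('0') = 0
    if numeral == "1":
        return 1                          # M('1') = 1
    if not numeral or numeral[-1] not in "01":
        raise ValueError(f"not a binary numeral: {numeral!r}")
    # M(<bin_num> d) = 2 * M(<bin_num>) + M(d), one case per grammar rule
    return 2 * m_bin(numeral[:-1]) + m_bin(numeral[-1])


print(m_bin("110"))
```

There is no machine and no state here: the meaning of each numeral is given directly by a function defined over the structure of the grammar.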

  50. Denotational vs. Operational • Denotational semantics is similar to operational semantics, except that there is no virtual machine and the language used is mathematics (lambda calculus). • Difference between denotational and operational semantics: in operational semantics, the state changes are defined by coded algorithms for a virtual machine; in denotational semantics, they are defined by rigorous mathematical functions.
