Semantic Analysis

• Legality checks
  • Check that the program obeys all rules of the language that are not described by a context-free grammar
• Disambiguation
  • Name resolution, type resolution, overload resolution
• Expanded intermediate representation
  • Annotate the tree to guide subsequent phases

A formal model: attributes
• Semantic information can be represented by computed values attached to an AST node
  • For an identifier: the corresponding entity
  • For a static expression: its computed value
  • For a function: the return type
  • For a record: the component names and types
  • For a derived type: its parent type
• All of it is implicit in the original tree. Attributes provide compact, efficient representations

Attribute Computation
• The value of an attribute at a node can be computed from the values of attributes at immediate neighbor nodes
• The computation is keyed to the production in which the node appears
• Production: Assignment => Var := Lit ;
• Equation: Type (Lit) = Type (Var)
• The type of the literal is inherited from the variable that is the left-hand side of the assignment
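The equation for the assignment production can be sketched in a few lines of Python. This is only an illustration: the class names `Var`, `Lit`, `Assignment` and the function `attribute_assignment` are hypothetical and do not come from any real compiler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Var:
    name: str
    typ: str                      # declared type, known from the declaration


@dataclass
class Lit:
    text: str
    typ: Optional[str] = None     # inherited attribute, unknown until context supplies it


@dataclass
class Assignment:
    var: Var
    lit: Lit


def attribute_assignment(node: Assignment) -> None:
    # Equation attached to the production  Assignment => Var := Lit ;
    # Type (Lit) = Type (Var): the literal takes its type from its sibling.
    node.lit.typ = node.var.typ


stmt = Assignment(Var("X", "Integer"), Lit("15"))
attribute_assignment(stmt)
print(stmt.lit.typ)
```

The function is selected by the kind of node (the production), not by the attribute, which is what "keyed to the production" means here.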

Inherited and synthesized attributes
• Production: N => A B C
• An attribute of non-terminal N that is computed from the attributes of A, B, C is synthesized
• An attribute of A that is computed from an attribute of N is inherited
• An attribute of A that is computed from attributes of B, C is also inherited (“has to go through N to reach A”)

Attribute grammars
• General formalism: define all context-dependent aspects as attributes. Provide equations for each attribute defined for each non-terminal.
• There are no restrictions on dependencies: an attribute can depend on any attribute of other symbols appearing in the production
• Semantic analysis is the computation of all attributes at each node of the AST
• Attribute grammars are universal (Turing-equivalent)

The dream of full automation
• Can define all aspects of the language with attribute grammars
• Given a language for attributes, we can build an attribute evaluator, like a parser generator.
• Attribute grammar + attribute evaluator = automatically generated compiler
• However: equations may be circular
  • Detecting circularity takes exponential time in the worst case
• In practice, the resulting compiler is too large / slow
• Attributes are a powerful concept, not a universal tool.

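Synthesized attributes
Most useful attributes are synthesized, i.e. computed bottom-up.
Example: numeric value of a base-2 representation:
    Bit        => '0'               ValBit = 0
    Bit        => '1'               ValBit = 1
    Bit_String => Bit               ValStr = ValBit
    Bit_String => Bit_String Bit    ValStr1 = 2 * ValStr2 + ValBit
In the last production, ValStr1 is the attribute of the Bit_String on the left-hand side and ValStr2 that of the Bit_String on the right-hand side.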
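The four equations of the bit-string example can be evaluated with a direct recursive walk. The sketch below is a minimal illustration, assuming a hand-built tree; the names `Bit`, `BitString`, `val_bit`, `val_str` and `build` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bit:
    char: str                          # '0' or '1'


@dataclass
class BitString:
    prefix: Optional["BitString"]      # None for the production Bit_String => Bit
    bit: Bit


def val_bit(b: Bit) -> int:
    # Bit => '0' gives 0, Bit => '1' gives 1
    return 0 if b.char == "0" else 1


def val_str(s: BitString) -> int:
    if s.prefix is None:
        # Bit_String => Bit:  ValStr = ValBit
        return val_bit(s.bit)
    # Bit_String => Bit_String Bit:  ValStr1 = 2 * ValStr2 + ValBit
    return 2 * val_str(s.prefix) + val_bit(s.bit)


def build(text: str) -> BitString:
    # Stand-in for the parser: builds the left-recursive tree for a bit string.
    if not text or any(ch not in "01" for ch in text):
        raise ValueError("expected a non-empty string of 0s and 1s")
    node = None
    for ch in text:
        node = BitString(node, Bit(ch))
    return node


print(val_str(build("1011")))   # 11
```

Each value depends only on the values computed for the children, so a single bottom-up traversal is enough.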

Inherited attributes
• Inherited attributes describe context-dependent properties: visibility, typing.
• Inherited attributes are computed top-down. Usually done as a separate pass over the AST
• The most important inherited attribute is the visibility environment, aka symbol table.
  • Typically represented as a global data structure, not as an attribute that is propagated from node to node.

Definitions and uses
• A declaration introduces an entity:
    X : Integer;
• The node for X is its defining occurrence
• A subsequent occurrence of X in the current scope is a use of X:
    X := 15;
• The use-occurrence must indicate that this is the X defined above
• The set of defining occurrences constitutes the symbol table.

Attributes of entities
• The defining occurrence is a symbol table entry.
• Holds all useful information about an entity
  • Type (another entity)
  • Size (numeric value: may be known statically)
  • Scope (another entity)
  • Name (pointer into names table)
  • Homonym (previous entity with same name)
  • Etc. (in GNAT, > 20 assorted fields, described in Einfo)
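A symbol table entry with the fields listed above might look like the following sketch. It is not GNAT's representation: the `Entity` class, the `names` dictionary standing in for the names table, the `declare` helper and the 32-bit size are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entity:
    name: str
    typ: Optional["Entity"] = None       # Type: another entity
    size: Optional[int] = None           # Size in bits, when known statically
    scope: Optional["Entity"] = None     # Scope: another entity
    homonym: Optional["Entity"] = None   # previous entity with the same name


# Hypothetical names table: each name maps to its most recent entity.
names: Dict[str, Entity] = {}


def declare(name, typ=None, size=None, scope=None) -> Entity:
    # The new entity goes to the front of the homonym chain for its name.
    ent = Entity(name, typ, size, scope, homonym=names.get(name))
    names[name] = ent
    return ent


integer = declare("Integer", size=32)          # size is an assumed value
outer = declare("Outer")
inner = declare("Inner", scope=outer)
x_outer = declare("X", typ=integer, size=32, scope=outer)
x_inner = declare("X", typ=integer, size=32, scope=inner)

print(names["X"].scope.name)           # Inner
print(names["X"].homonym.scope.name)   # Outer
```

Following `homonym` links from the names table visits every entity that shares a name, most recent first.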

Type entities and their attributes
• Numeric types: low_bound, high_bound
  • Static expressions of related numeric type
• Array types: list of index types, component type
  • Previously declared entities
  • Index bounds are expressions of the index type
• Record types: list of components, variants
  • Entities appearing in component declarations
  • Variants indexed by values of discriminants
• Flags: type is limited, type has tasks, type is packed, etc. (in GNAT, > 160 misc. predicates)

Attributes of program unit entities
• All entities that contain local declarations have an attached list of local entities:
  • In GNAT, First_Entity, Last_Entity
• Procedures: names and types of formals
• Functions: names/types of formals, return type
• Packages: separate lists of visible entities and private entities
• Tasks: visible entries (operations), private data

Attributes of identifiers
• For a variable: Entity denoted by identifier
  • Value, if entity is static constant
• For a function:
  • set of possible interpretations (if overloaded)
  • single final interpretation (resolution)
• For all: Type (redundant but convenient)

Attributes of Expressions
• Possible types (if constituents are overloaded)
• Type (after resolution)
• Is_Static_Expression
• Expr_Value (if static)
• Raises_Constraint_Error (may be known)
• In GNAT, described in Sinfo.

Bottom-up/Top-down processing
• With recursion, very similar:

    procedure Analyze (N : Node_Id) is
    begin  -- bottom-up
       analyze each child of N
       Compute local attributes
    end;

    procedure Resolve (N : Node_Id; Typ : Entity_Id) is
    begin  -- top-down
       Compute local attributes
       Resolve each child of N with information from N
    end;

Name Resolution
• Compute the entity denoted by each identifier.
• Apply visibility rules of language:
  • For a block-structured language, examine local scope first.
  • If not found, look at enclosing scopes
  • If not found, look at scopes in context (with_clauses, use_clauses)
  • If not found, look at implicit rules for operators
• Entities with same name linked in homonym chain
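The lookup order above can be sketched as a chain of searches. This is a simplified illustration, not the full Ada visibility rules: the `Scope` class, the `resolve_name` function and the string descriptions used as entities are hypothetical.

```python
from typing import Dict, List, Optional


class Scope:
    def __init__(self, name: str, parent: Optional["Scope"] = None):
        self.name = name
        self.parent = parent                  # enclosing scope, if any
        self.entities: Dict[str, str] = {}    # identifier -> entity description

    def declare(self, ident: str) -> None:
        self.entities[ident] = f"{self.name}.{ident}"


def resolve_name(ident: str, scope: Scope, context: List[Scope],
                 predefined: Dict[str, str]) -> Optional[str]:
    # 1 and 2: local scope first, then each enclosing scope in turn.
    s = scope
    while s is not None:
        if ident in s.entities:
            return s.entities[ident]
        s = s.parent
    # 3: scopes made visible by context clauses.
    for ctx in context:
        if ident in ctx.entities:
            return ctx.entities[ident]
    # 4: implicit rules for operators; None means the name is undefined.
    return predefined.get(ident)


text_io = Scope("Text_IO")
text_io.declare("Put_Line")
main = Scope("Main")
main.declare("X")
main.declare("Y")
block = Scope("Block", parent=main)
block.declare("X")                      # hides Main.X inside the block
operators = {"+": "predefined +"}

for ident in ["X", "Y", "Put_Line", "+", "Z"]:
    print(ident, "->", resolve_name(ident, block, [text_io], operators))
```

The inner declaration of X wins because the local scope is searched before any enclosing one.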

Type resolution
• Bottom-up pass: compute possible interpretations of each constituent, and their types
• X + Y
  • if X and Y have the same numeric type, node has type of X
• A (J)
  • if A is of an array type and J has the proper type for an index, node has the component type of the type of A
• F (X, Y, Z)
  • if F is a function and X, Y, Z have proper types for its formals, node has return type of F.
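The three rules can be sketched as a function that computes the type of a node from the types of its constituents. The sketch ignores overloading and subtypes; the node classes, the `type_of` function and the sample environment are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ArrayType:
    index: str
    component: str


@dataclass
class FuncType:
    formals: Tuple[str, ...]
    result: str


@dataclass
class Name:
    ident: str


@dataclass
class Add:
    left: object
    right: object


@dataclass
class Apply:              # A (J) and F (X, Y, Z) share the same syntax
    prefix: str
    args: list


NUMERIC = {"Integer", "Float"}


def type_of(node, env):
    if isinstance(node, Name):
        return env[node.ident]
    if isinstance(node, Add):
        lt, rt = type_of(node.left, env), type_of(node.right, env)
        if lt == rt and lt in NUMERIC:
            return lt                              # node has the type of X
        raise TypeError(f"cannot add {lt} and {rt}")
    if isinstance(node, Apply):
        prefix = env[node.prefix]
        arg_types = [type_of(a, env) for a in node.args]
        if isinstance(prefix, ArrayType) and arg_types == [prefix.index]:
            return prefix.component                # component type of A
        if isinstance(prefix, FuncType) and tuple(arg_types) == prefix.formals:
            return prefix.result                   # return type of F
        raise TypeError(f"bad arguments for {node.prefix}")
    raise TypeError("unknown node kind")


env = {
    "X": "Integer", "Y": "Integer", "Z": "Float", "J": "Integer",
    "A": ArrayType(index="Integer", component="Float"),
    "F": FuncType(formals=("Integer", "Integer", "Float"), result="Boolean"),
}
print(type_of(Add(Name("X"), Name("Y")), env))                      # Integer
print(type_of(Apply("A", [Name("J")]), env))                        # Float
print(type_of(Apply("F", [Name("X"), Name("Y"), Name("Z")]), env))  # Boolean
```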

Overload Resolution
• If a constituent is overloaded, context must impose a single type for resolution.

    function Convert (x : integer) return integer;
    function Convert (x : integer) return complex;
    function Convert (x : integer) return float;
    …
    Var := Convert (5);

• Compute possible interpretations of Convert, resolve with type of Var.
• Need to manipulate sets of names for types.
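The Convert example can be sketched in two steps: collect the set of interpretations, then filter it with the type imposed by the context. The sketch assumes Var is declared as float and treats the literal 5 as plain integer; `Signature`, `interpretations` and `resolve` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Signature:
    name: str
    formals: Tuple[str, ...]
    result: str


DECLS = [
    Signature("Convert", ("integer",), "integer"),
    Signature("Convert", ("integer",), "complex"),
    Signature("Convert", ("integer",), "float"),
]


def interpretations(name: str, arg_types: List[str]) -> List[Signature]:
    # Bottom-up step: every visible declaration whose formals fit the actuals.
    return [d for d in DECLS if d.name == name and d.formals == tuple(arg_types)]


def resolve(candidates: List[Signature], expected: str) -> Signature:
    # Top-down step: the context imposes a single result type.
    matches = [c for c in candidates if c.result == expected]
    if not matches:
        raise TypeError(f"no interpretation returns {expected}")
    if len(matches) > 1:
        raise TypeError("ambiguous call")
    return matches[0]


candidates = interpretations("Convert", ["integer"])
chosen = resolve(candidates, "float")    # assumed: Var is declared as float
print(len(candidates), chosen.result)    # 3 float
```

The candidate list is the "set of names for types" attached to the call node until the context picks one of them.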

Finding a single interpretation
• For a procedure call
    proc (f (x), g (y), h (z));
• proc, f, g, and h may be overloaded.
• There must be a single interpretation of proc whose formal parameters are compatible with one of the possible interpretations of f, g, h.
• Once proc is identified, resolve f with the type of its first formal, g with the type of the second, etc.
• If more than one interpretation: ambiguous call
• If none: illegal call
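The search for a single interpretation of proc can be sketched as follows. The profiles in `PROFILES` are invented for the example, x, y and z are assumed to be of type Integer, and the sketch does not check that each actual has only one interpretation with the chosen result type.

```python
from typing import List, Set, Tuple

# name -> list of (formal types, result type); result is None for procedures.
# All profiles below are assumptions made for this example.
PROFILES = {
    "proc": [(("Integer", "Float", "Boolean"), None),
             (("Float", "Float", "Float"), None)],
    "f": [(("Integer",), "Integer"), (("Integer",), "Float")],
    "g": [(("Integer",), "Float")],
    "h": [(("Integer",), "Boolean"), (("Integer",), "Integer")],
}


def possible_types(name: str, arg_types: List[str]) -> Set[str]:
    # Result types of every interpretation of name that accepts these arguments.
    return {res for formals, res in PROFILES[name] if formals == tuple(arg_types)}


def resolve_call(proc_name: str, actuals: List[Tuple[str, List[str]]]) -> Tuple[str, ...]:
    actual_types = [possible_types(name, args) for name, args in actuals]
    matches = [formals for formals, _ in PROFILES[proc_name]
               if len(formals) == len(actuals)
               and all(t in types for t, types in zip(formals, actual_types))]
    if not matches:
        raise TypeError("illegal call")
    if len(matches) > 1:
        raise TypeError("ambiguous call")
    # The formal types are what each actual is then resolved with.
    return matches[0]


call = [("f", ["Integer"]), ("g", ["Integer"]), ("h", ["Integer"])]
print(resolve_call("proc", call))   # ('Integer', 'Float', 'Boolean')
```

The second profile of proc is rejected because no interpretation of h returns Float, which leaves exactly one candidate.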

Analysis and expansion
• Expansion replaces portions of AST with semantically equivalent portions for which it is easier to generate code
• New tree fragments must be decorated with semantic information:
  • expansion and analysis are mutually recursive

Expansion: aggregates

    Length : integer := 5;
    type Arr is array (1..Length) of integer;
    X : integer := 22;
    Thing : Arr := (X, others => 42);   -- complex construct

Aggregate expands into:

    Thing : Arr;
    Thing (1) := X;
    Temp := 2;                          -- Created by compiler; starts after the positional component
    while Temp <= Length loop
       Thing (Temp) := 42;
       Temp := Integer'Succ (Temp);
    end loop;
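As a rough sketch of what such an expansion routine does, the function below turns an aggregate with positional components and an others choice into a list of simpler statements, here plain strings rather than tree nodes. `expand_aggregate` and its parameters are hypothetical; a real expander builds AST fragments and then analyzes them.

```python
from typing import List


def expand_aggregate(target: str, positional: List[str], others: str,
                     low: int, high: str) -> List[str]:
    stmts = []
    # One assignment per positional component, starting at the lower bound.
    for offset, expr in enumerate(positional):
        stmts.append(f"{target} ({low + offset}) := {expr};")
    # The loop starts just past the positional components so that
    # they are not overwritten by the others value.
    first_other = low + len(positional)
    stmts += [
        f"Temp := {first_other};",
        f"while Temp <= {high} loop",
        f"   {target} (Temp) := {others};",
        "   Temp := Integer'Succ (Temp);",
        "end loop;",
    ]
    return stmts


for line in expand_aggregate("Thing", ["X"], "42", 1, "Length"):
    print(line)
```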
