This presentation delves into the fundamental concepts of language processing technology as it pertains to programming languages. It covers the different levels of programming languages like high-level (C, Java) and low-level (Assembly), the role of language processors including editors, compilers, and interpreters, and the specification of programming languages encompassing syntax and semantics. Detailed examples are provided to illustrate language characteristics, such as expressions, data types, and control structures, alongside discussions on grammar and constraints that define a programming language.
Introduction to Language Processing Technology Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University
Outline • Level of Programming Languages. • Language Processors. • Specification of Programming Languages.
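Level of Programming Languages • High level: C / Java / Pascal • Low level: Assembly / Bytecode • Machine Language
Example: C source is translated by the C compiler to assembly, which the assembler translates to machine language.
C source:
  void swap(int v[], int k) { int temp; temp = v[k]; v[k] = v[k+1]; v[k+1] = temp; }
Assembly (output of the C compiler):
  swap: muli $2, $5, 4
        add $2, $4, $2
        lw $15, 0($2)
        ...
Machine language (output of the assembler):
  000010001101101100110000
  000010001101101100110000
  000010001101101100110000
  000010001101101100110000
  ...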
High-Level Language Characteristics • Expressions: a = b + (c - d)/2; • Data types: • Integer, character, boolean. • Record, array. • Control structures: • Selective. • Iterative.
High-Level Language Characteristics • Declarations: • An identifier can name a constant, variable, procedure, function, or type. • Abstraction: • Object-oriented concept. • Concerned only with what, not how. • Encapsulation: • Object-oriented concept. • Information hiding.
Language Processors • Programs that manipulate programs expressed in some programming language. • Examples: • Editor. • Translator / Compiler. • Interpreter.
Translator • Translates a “source” program into an “equivalent” “object” program. [Diagram: source program → Translator → object program, with error messages as a second output.] • Source languages: C, C++, FORTRAN, Java, VB. • Target languages: Assembly, C, Bytecode, p-code.
Tombstone Diagrams • Ordinary program: program P written in language L. [Diagram examples: Sort written in Java; Sort written in x86; Web Browser written in x86.]
Tombstone Diagrams • Machine: machine M. [Diagram examples: an x86 machine; a SPARC machine; Web Browser written in x86 running on an x86 machine.]
Tombstone Diagrams • Translator: S → T (S is translated to T), written in language L. [Diagram examples: Java → x86 written in C; Java → x86 written in x86; Java → C written in C++.]
Tombstone Diagrams • Compilation [Diagram: Sort written in C is translated by a C → x86 compiler running on an x86 machine into Sort in x86, which then runs on an x86 machine.]
Tombstone Diagrams • Cross Compilation [Diagram: Sort written in C is translated by a C → SPARC compiler running on an x86 machine into Sort in SPARC, which then runs on a SPARC machine.]
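Tombstone Diagrams • Two-stage compilation [Diagram: Sort written in Java is translated by a Java → C compiler into Sort in C, which a C → x86 compiler translates into Sort in x86; both compilers run on an x86 machine.]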
Tombstone Diagrams • Compiling a compiler [Diagram: a Pascal → x86 compiler written in C is translated by a C → x86 compiler running on an x86 machine into a Pascal → x86 compiler in x86.]
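The matching rules behind the compilation, cross compilation, two-stage compilation, and compiling-a-compiler diagrams can be sketched in code. This is only an illustrative model, not part of the original slides: the names `Program`, `Translator`, and `translate` are hypothetical, and machines are represented simply by the name of their machine language.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Program:
    name: str
    lang: str  # language the program is expressed in


@dataclass(frozen=True)
class Translator:
    source: str
    target: str
    lang: str  # language the translator itself is expressed in


def translate(subject, translator, machine):
    """Run `translator` on `machine` to translate `subject`.

    Two tombstones fit together only if the languages match:
    the translator must be expressed in the machine's language, and the
    subject must be expressed in the translator's source language.
    """
    if translator.lang != machine:
        raise ValueError(f"a {translator.lang} translator cannot run on {machine}")
    if subject.lang != translator.source:
        raise ValueError(f"cannot translate {subject.lang} with a {translator.source} translator")
    # The result is the same program (or translator), now expressed in the target language.
    return replace(subject, lang=translator.target)


sort_c = Program("Sort", "C")

# Compilation: C -> x86 compiler, itself in x86, running on x86.
sort_x86 = translate(sort_c, Translator("C", "x86", "x86"), "x86")

# Cross compilation: the compiler runs on x86 but generates SPARC code.
sort_sparc = translate(sort_c, Translator("C", "SPARC", "x86"), "x86")

# Two-stage compilation: Java -> C, then C -> x86.
sort_java = Program("Sort", "Java")
stage1 = translate(sort_java, Translator("Java", "C", "x86"), "x86")
stage2 = translate(stage1, Translator("C", "x86", "x86"), "x86")

# Compiling a compiler: a translator is itself a program that can be translated.
pascal_in_c = Translator("Pascal", "x86", "C")
pascal_in_x86 = translate(pascal_in_c, Translator("C", "x86", "x86"), "x86")
```

Treating a translator as just another program with a `lang` field is what lets the same `translate` function cover the compiling-a-compiler case.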
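Tombstone Diagrams • Interpreter: interprets source language S, written in language L. [Diagram examples: a Basic interpreter written in x86; an SQL interpreter written in SPARC; Sort written in Basic running on a Basic interpreter in x86 on an x86 machine.]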
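Tombstone Diagrams • Abstract machine = hardware emulator • Interpreter for a low-level language. [Diagram: a 370 interpreter written in C is compiled by a C → x86 compiler into a 370 interpreter in x86; program HW1, written in 370 code, then runs on that interpreter on an x86 machine.]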
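As a rough illustration of an interpreter for a low-level language, the sketch below emulates a tiny stack machine. The instruction set (`PUSH`, `ADD`, `MUL`, `PRINT`, `HALT`) and the function name `run` are invented for this example and do not correspond to any machine named in the slides.

```python
def run(code):
    """Interpret a list of (opcode, operand) pairs and return the printed values."""
    stack = []
    output = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(code):
        opcode, operand = code[pc]  # fetch
        pc += 1
        if opcode == "PUSH":        # decode and execute
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == "PRINT":
            output.append(stack.pop())
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return output


# Computes and prints 2 * 7 * 7.
program = [
    ("PUSH", 2),
    ("PUSH", 7),
    ("MUL", None),
    ("PUSH", 7),
    ("MUL", None),
    ("PRINT", None),
    ("HALT", None),
]
result = run(program)
```

Unlike a translator, the interpreter never produces an object program: each instruction is fetched, decoded, and executed immediately.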
Tombstone Diagrams • Java • Portable environment: write-once-run-anywhere. • Interpretive compiler. [Diagram: a Java → JVM compiler written in M and a JVM interpreter written in M; JVM code = bytecode.]
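Tombstone Diagrams [Diagram: Sort written in Java is translated by a Java → JVM compiler running on an x86 machine into Sort in JVM code; the same JVM code then runs on a JVM interpreter on an x86 machine or on a SPARC machine.]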
Tombstone Diagrams • Bootstrapping • A compiler for language L that is written in L itself. • Full bootstrap • Start from nothing. • Half bootstrap • Start from another machine. [Diagram: a C → NNP compiler for the NNP machine.]
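Tombstone Diagrams • Full Bootstrap [Diagram: a Csub → NNP compiler is first written in NNP; the Csub → NNP compiler written in Csub is compiled with it on the NNP machine; the resulting Csub → NNP compiler in NNP then compiles the C → NNP compiler written in Csub into a C → NNP compiler in NNP.]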
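Tombstone Diagrams • Full Bootstrap (continued) [Diagram: the C → NNP compiler written in C is compiled by the C → NNP compiler in NNP, running on the NNP machine, into a C → NNP compiler in NNP.]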
Tombstone Diagrams • Full Bootstrap (continued) [Diagram: the remaining full-bootstrap steps, combining the Csub → NNP and C → NNP compilers written in NNP, Csub, and C on the NNP machine.]
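Tombstone Diagrams • Half Bootstrap [Diagram: starting from an existing C → x86 compiler on an x86 machine, the C → NNP compiler written in C is compiled into a C → NNP compiler in x86; that compiler, running on the x86 machine, then compiles the same C source into a C → NNP compiler in NNP.]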
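The two steps of a half bootstrap can be traced with a small model. This is a sketch under stated assumptions: the `Translator` tuple and `compile_translator` function are hypothetical names introduced here, and "NNP" is simply the target machine named on the slides.

```python
from typing import NamedTuple


class Translator(NamedTuple):
    source: str
    target: str
    lang: str  # language the translator is written in


def compile_translator(subject, compiler, machine):
    """Compile the translator `subject` with `compiler`, which runs on `machine`."""
    if compiler.lang != machine:
        raise ValueError(f"a {compiler.lang} compiler cannot run on {machine}")
    if subject.lang != compiler.source:
        raise ValueError(f"{subject.lang} source cannot be compiled by a {compiler.source} compiler")
    # Same translator, now expressed in the compiler's target language.
    return subject._replace(lang=compiler.target)


# What already exists: a C compiler on the x86 machine.
c_to_x86 = Translator("C", "x86", "x86")

# What we write by hand: a C -> NNP compiler, in C.
c_to_nnp_in_c = Translator("C", "NNP", "C")

# Step 1: compile it on x86. The result is a cross compiler (runs on x86, emits NNP).
cross = compile_translator(c_to_nnp_in_c, c_to_x86, "x86")

# Step 2: compile the same source with the cross compiler, still on x86.
# The result is a native NNP compiler that can be moved to the NNP machine.
native = compile_translator(c_to_nnp_in_c, cross, "x86")
```

Both steps run on the x86 machine; the NNP machine is needed only once `native` exists.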
Specification of Programming Language • Specification • Syntax • Defines the symbols and structure of the language. • Grammar. • Contextual constraints • Constraints beyond grammar. • Rules of the language: scope rules, type rules, etc. • Semantics • Meaning of a program: its behavior when run. • How to translate a sentence S of the language L into machine code on M.
Syntax • Context-free grammar • Terminals. • Non-terminals / Variables. • Start symbol. • Production rules. • Usually expressed in BNF notation.
BNF Notation • Backus-Naur Form. • Given production rules: N → a and N → b • Can be written as: N ::= a | b
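The grouping step, from separate production rules to one BNF rule per non-terminal, can be sketched as follows. The function name `to_bnf` and the representation of a production as a `(left-hand side, right-hand side symbols)` pair are assumptions made for this example.

```python
def to_bnf(productions):
    """Group productions by left-hand side and render one BNF rule per non-terminal."""
    grouped = {}  # dicts keep insertion order, so rules appear in first-seen order
    for lhs, rhs in productions:
        grouped.setdefault(lhs, []).append(" ".join(rhs))
    return [f"{lhs} ::= {' | '.join(alternatives)}" for lhs, alternatives in grouped.items()]


# The two rules from the slide: N -> a and N -> b.
rules = to_bnf([("N", ["a"]), ("N", ["b"])])

# A slightly larger case taken from the Mini-Triangle declaration grammar.
decl_rules = to_bnf([
    ("single-Declaration", ["const", "Identifier", "~", "Expression"]),
    ("single-Declaration", ["var", "Identifier", ":", "Type-denoter"]),
])
```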
Example: Mini-Triangle Program
! This is a comment. It continues to the end-of-line.
let
  const m ~ 7;
  var n: Integer
in
  begin
    n := 2 * m * m;
    putint(n)
  end
Terminals: begin const do else end if in let then var while ; : := ~ ( ) + - * / < > = \
Mini-Triangle Syntax
Program        ::= Command
Command        ::= single-Command
                 | Command ; single-Command
single-Command ::= V-name := Expression
                 | Identifier ( Expression )
                 | if Expression then single-Command else single-Command
                 | while Expression do single-Command
                 | let Declaration in single-Command
                 | begin Command end
Mini-Triangle Syntax
Expression         ::= primary-Expression
                     | Expression Operator primary-Expression
primary-Expression ::= Integer-Literal
                     | V-name
                     | Operator primary-Expression
                     | ( Expression )
V-name             ::= Identifier
Declaration        ::= single-Declaration
                     | Declaration ; single-Declaration
single-Declaration ::= const Identifier ~ Expression
                     | var Identifier : Type-denoter
Mini-Triangle Syntax
Type-denoter    ::= Identifier
Operator        ::= + | - | * | / | < | > | = | \
Identifier      ::= Letter | Identifier Letter | Identifier Digit
Integer-Literal ::= Digit | Integer-Literal Digit
Comment         ::= ! Graphic* eol
Letter          ::= a | b | … | z
Digit           ::= 0 | 1 | 2 | … | 9
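The lexical rules above (Identifier, Integer-Literal, Operator, Comment, plus the reserved words listed as terminals) are enough to sketch a scanner. The version below is a minimal illustration, not the course's implementation: the token kind names and the `tokenize` function are invented here, and it assumes upper-case letters are also allowed in identifiers, since the example program uses `Integer` even though the Letter rule lists only a to z.

```python
import re

KEYWORDS = {"begin", "const", "do", "else", "end", "if", "in", "let", "then", "var", "while"}

TOKEN_RE = re.compile(r"""
    (?P<comment>![^\n]*)             # comment: from ! to end of line
  | (?P<space>\s+)                   # blanks and line breaks
  | (?P<int>[0-9]+)                  # Integer-Literal
  | (?P<word>[a-zA-Z][a-zA-Z0-9]*)   # Identifier or reserved word
  | (?P<becomes>:=)                  # must be tried before the single colon
  | (?P<punct>[;:~()])
  | (?P<op>[+\-*/<>=\\])             # Operator
""", re.VERBOSE)


def tokenize(text):
    """Split Mini-Triangle source text into (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        kind, lexeme = match.lastgroup, match.group()
        if kind in ("comment", "space"):
            continue  # comments and white space separate tokens but are not tokens
        if kind == "word":
            kind = "keyword" if lexeme in KEYWORDS else "identifier"
        tokens.append((kind, lexeme))
    return tokens


tokens = tokenize("n := 2 * m ! square later\n")
```

Reserved words are recognized after matching a whole word, so `letter` is scanned as one identifier rather than as `let` followed by `ter`.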
Syntax Tree • Ordered tree with • Internal nodes: non-terminals. • Leaf nodes: terminals. • N-tree of G is a syntax tree with N as the root.
Mini-Triangle Syntax Tree
Expression         ::= primary-Expression
                     | Expression Operator primary-Expression
primary-Expression ::= Integer-Literal
                     | V-name
                     | Operator primary-Expression
                     | ( Expression )
V-name             ::= Identifier
…
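To show how these Expression rules give rise to a tree, here is a small recursive-descent sketch that builds a simplified syntax tree as nested tuples. It is an illustration under assumptions: the function name `parse_expression` and the tuple layout are invented here, the left-recursive rule is handled with a loop (which yields the same left-leaning trees), parentheses are not given a node of their own, and characters outside the token set are silently skipped.

```python
import re

OPERATORS = set("+-*/<>=\\")


def parse_expression(text):
    """Parse a Mini-Triangle Expression and return its tree as nested tuples."""
    tokens = re.findall(r"[0-9]+|[a-zA-Z][a-zA-Z0-9]*|[-+*/<>=\\()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return token

    def primary():
        token = take()
        if token.isdigit():
            return ("Integer-Literal", token)
        if token[0].isalpha():
            return ("V-name", ("Identifier", token))
        if token in OPERATORS:  # Operator primary-Expression
            return ("primary-Expression", ("Operator", token), primary())
        if token == "(":        # ( Expression )
            inner = expression()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return inner
        raise SyntaxError(f"unexpected token {token!r}")

    def expression():
        # Expression ::= primary-Expression | Expression Operator primary-Expression
        tree = primary()
        while peek() in OPERATORS:
            operator = take()
            tree = ("Expression", tree, ("Operator", operator), primary())
        return tree

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_expression("d + 10 * n")
```

Because the grammar has a single Expression level, all operators share one precedence and group to the left: `d + 10 * n` is parsed as `(d + 10) * n`, and parentheses are needed to obtain any other grouping.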