What:
In a traditional two-stage compiler (frontend and backend), the frontend is responsible for the following:
- Checking that the program is legal
- Reporting errors usefully
- Producing the IR
High Level:
Components:
Lexer:
- Converts the source code into a stream of tokens. Whitespace and comments are also discarded at this stage.
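As a rough illustration of the lexer step, here is a minimal sketch in Python. The token kinds (`NUMBER`, `IDENT`, `OP`) and the `//` comment syntax are assumptions made up for a toy language, not part of any particular compiler.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token set for a toy language. Order matters: the first
# alternative that matches wins, so COMMENT must come before OP ('/').
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),
    ("WS", r"\s+"),
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=();]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from source, dropping whitespace and comments."""
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = m.end()
        if m.lastgroup not in ("WS", "COMMENT"):
            yield Token(m.lastgroup, m.group())
```

For example, `list(tokenize("x = 42; // answer"))` yields four tokens (`x`, `=`, `42`, `;`); the spaces and the trailing comment never reach the parser.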
Parser:
- Constructs an Abstract Syntax Tree (AST) from the token stream and reports syntax errors (e.g. a missing semicolon).
- Top Down Parser: Builds the AST by working from the start symbol towards the input sentence.
- Bottom Up Parser: Builds the AST by working from the input sentence back to the start symbol.
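To make the top-down approach concrete, here is a small recursive descent parser, sketched under the assumption of a toy arithmetic grammar (shown in the docstring). The tuple-based AST shape `(op, left, right)` is an illustrative choice, not a standard format.

```python
import re


def parse(source: str):
    """Top-down (recursive descent) parse of an assumed toy grammar:

        expr   -> term (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NUMBER | '(' expr ')'

    Returns an AST made of ints and (op, left, right) tuples.
    """
    # Simplified tokenisation: characters outside the grammar are ignored.
    tokens = re.findall(r"\d+|[-+*/()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok.isdigit():
            return int(tok)
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()  # start symbol
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

Parsing starts at the start symbol `expr` and descends towards the tokens, so `parse("1 + 2 * 3")` returns `('+', 1, ('*', 2, 3))`, while `parse("(1 + 2")` raises a `SyntaxError` for the missing parenthesis.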
Semantic Analyser:
- Checks that the AST is semantically correct.
- Type Checking: Types are used consistently (e.g. rejecting the addition of an integer to a string)
- Scope Validation: Variables are used only within their proper scope
- Function Validation: Functions are called with the correct number and type of arguments
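The three checks above can be sketched as a single recursive walk over the AST. The node shapes (`"int"`, `"str"`, `"var"`, `"add"`, `"call"`) and the `scope` / `functions` tables are hypothetical, chosen only to keep the example short.

```python
def check(node, scope, functions):
    """Return the type of node, or raise on a semantic error.

    Assumed (hypothetical) inputs:
      node      - tuple AST, e.g. ("add", ("var", "x"), ("int", 1))
      scope     - dict mapping visible variable names to their types
      functions - dict mapping function names to (param_types, return_type)
    """
    kind = node[0]
    if kind in ("int", "str"):  # literals carry their own type
        return kind
    if kind == "var":  # scope validation
        name = node[1]
        if name not in scope:
            raise NameError(f"variable {name!r} is not in scope")
        return scope[name]
    if kind == "add":  # type checking
        left = check(node[1], scope, functions)
        right = check(node[2], scope, functions)
        if left != right:
            raise TypeError(f"cannot add {left} and {right}")
        return left
    if kind == "call":  # function validation
        name, args = node[1], node[2]
        if name not in functions:
            raise NameError(f"unknown function {name!r}")
        param_types, return_type = functions[name]
        arg_types = [check(arg, scope, functions) for arg in args]
        if arg_types != list(param_types):
            raise TypeError(f"{name} expects {list(param_types)}, got {arg_types}")
        return return_type
    raise ValueError(f"unknown node kind {kind!r}")
```

With `scope = {"x": "int", "s": "str"}`, checking `("add", ("var", "x"), ("int", 1))` returns `"int"`, whereas `("add", ("var", "x"), ("var", "s"))` raises a `TypeError`.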
IR Generator:
- Converts the AST into an Intermediate Representation (IR) graph, which is fed into the compiler backend.
- The AST has a fundamental limitation: it does not explicitly encode the order in which the code executes (control flow) or how values move between operations (data flow).
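To show what making the order explicit can look like, the sketch below lowers a tuple AST into a flat list of three-address instructions. Three-address code is used here as a simple stand-in for an IR graph; the `(op, left, right)` AST shape and the `t0`, `t1` temporary names are assumptions for illustration.

```python
from itertools import count


def lower(ast):
    """Lower a tuple AST such as ('+', 1, ('*', 2, 3)) into three-address code.

    Each instruction appears after the instructions that compute its
    operands, so evaluation order and data dependencies become explicit.
    """
    instructions = []
    temps = count()  # generates t0, t1, ...

    def visit(node):
        if not isinstance(node, tuple):  # leaf: constant or variable name
            return str(node)
        op, left, right = node
        lhs = visit(left)  # operands are emitted first
        rhs = visit(right)
        dest = f"t{next(temps)}"
        instructions.append(f"{dest} = {lhs} {op} {rhs}")
        return dest

    visit(ast)
    return instructions
```

For the AST `('+', 1, ('*', 2, 3))` this produces `['t0 = 2 * 3', 't1 = 1 + t0']`: the multiplication is now visibly scheduled before the addition that consumes its result, which the tree alone only implied.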