What?
It’s a type of Language with specific types of rules.
Contains:
A grammar where:
- Start symbol : ( Exp in this case - ) and
- Non-terminals : ( Exp, Var, Num - ) ,
- Terminals : (+, -, 0…9, (, ), x, etc - ),
- Rules / Productions : ( of the form , where , ).
- Rules example:
Exp → Exp + Exp
Exp → Exp * Exp
Exp → Var | Num | ( Exp )
Var → x | y | z
Num → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
- Example legal sentence tree (aka syntax tree, Parse Tree) of above:
- An Abstract Syntax Tree would be a more concise, high entropy version of that tree.
How do you build a tree from a sentence?
I’m glad you asked. Cue the CYK Algorithm!
Writing Languages Concisely (corresponds with Regex):
- x*: zero or more occurrences of x
- x+: one or more occurrences of x
- [x]: zero or one occurrence of x
We can rewrite RE as CFG with:
- Rewrite any occurrences of x*, x+ or [x] with , and add the following rules.
- x* →
- x+ →
- [x] →
Left Parsable Grammars:
Take a language with terms like:
term_1 | ... | term_n
fact_1, | ...| fact_n
[exp], exp*
Left Parsable Grammars are ones where the terms do not share initial symbols. This makes life tremendously easier later one, cos we only need to (and now can) lookahead a single character in the parsing.