What?

It’s a type of Language with specific types of rules.

Contains:

A grammar where:

  • Start symbol : ( Exp in this case - ) and
  • Non-terminals : ( Exp, Var, Num - ) ,
  • Terminals : (+, -, 0…9, (, ), x, etc - ),
  • Rules / Productions : ( of the form , where , ).
  • Rules example:
   Exp → Exp + Exp
   Exp → Exp * Exp
   Exp → Var | Num | ( Exp )
   Var → x | y | z
   Num → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  • Example legal sentence tree (aka syntax tree, Parse Tree) of above:
  • An Abstract Syntax Tree would be a more concise, high entropy version of that tree.

How do you build a tree from a sentence?

I’m glad you asked. Cue the CYK Algorithm!

Writing Languages Concisely (corresponds with Regex):

  • x*: zero or more occurrences of x
  • x+: one or more occurrences of x
  • [x]: zero or one occurrence of x

We can rewrite RE as CFG with:

  • Rewrite any occurrences of x*, x+ or [x] with , and add the following rules.
  • x*
  • x+
  • [x]

Left Parsable Grammars:

Take a language with terms like:

  • term_1 | ... | term_n
  • fact_1, | ...| fact_n
  • [exp], exp*

Left Parsable Grammars are ones where the terms do not share initial symbols. This makes life tremendously easier later one, cos we only need to (and now can) lookahead a single character in the parsing.