COMS W4115
Programming Languages and Translators
Lecture 10: Constructing Predictive Parsers
February 25, 2008
Lecture Outline
- Review
- FIRST
- FOLLOW
- Constructing a predictive parsing table
- LL(1) grammars
- Reading
1. Review
- Context-free grammars
- Top-down parsing
- Nonrecursive predictive parser
- Transformations on grammars
2. FIRST
- FIRST(α) is the set of terminals that begin the strings
derivable from α.
If α can derive ε, then ε is also in FIRST(α).
- Algorithm to compute FIRST(X):
- If X is a terminal, then FIRST(X) = { X }.
- If X → ε is a production, then add ε to FIRST(X).
- If X
→ Y1 Y2 ... Yk
is a production for k ≥ 1, and
for some i ≤ k,
Y1Y2 ... Yi-1 derives the empty string,
and a is in FIRST(Yi),
then add a to FIRST(X).
If Y1Y2 ... Yk
derives the empty string,
then add ε to FIRST(X).
- Example. Consider the grammar G:
- For G, FIRST(
S) = {(, ε}.
3. FOLLOW
- FOLLOW(A) is the set of terminals that can appear immediately
to the right of A in some sentential form.
If A can be the rightmost symbol in some sentential form,
then add $ to FOLLOW(A).
- Algorithm to compute FOLLOW(A) for all nonterminals A of a grammar:
- Place $ in FOLLOW(S) where S is the start symbol of the grammar.
- If A → αBβ is a production,
then add every terminal in FIRST(β) (except for ε if it is there) to FOLLOW(B).
- If there is a production A → αB,
or a production A → αBβ,
where FIRST(β) contains ε,
then put everything in FOLLOW(A) into FOLLOW(B)
- Example. For G above, FOLLOW(
S) = {), $}.
4. Constructing a Predictive Parsing Table
- Input: Grammar G.
- Output: Predictive parsing table M.
- Method:
for (each production A → α in G) {
for (each terminal a in FIRST(α))
add A → α to M[A, a];
if (ε is in FIRST(α))
for (each symbol b in FOLLOW(A))
add A → α to M[A, b];
}
make each undefined entry of M be error;
Example 1. Predictive parsing table for the grammar:
Input Symbol
Nonterminal ( ) $
S S → (S)S S → ε S → ε
Example 2. Predictive parsing table for the grammar:
FIRST(S) = {(, ε}
FOLLOW(S) = {(, ), $}
Input Symbol
Nonterminal ( ) $
S S → (S)S S → ε S → ε
S → ε
5. LL(1) Grammars
- A grammar is LL(1) iff whenever
A → α | β
are two distinct productions, the following three conditions hold:
- For no terminal
a do both α and β derive
strings beginning with a.
- At most one of α and β can derive the empty string.
- If β derives the empty string, then α does not derive any string
beginning with a terminal in FOLLOW(
A).
Likewise, if α derives the empty string, then β does not derive
any string beginning with a terminal in FOLLOW(A).
- We can use the algorithm above to construct a predictive parsing
table with uniquely defined entries for any LL(1) grammar.
- The first "L" in LL(1) means scanning the input from left to right,
the second "L" for producing a leftmost derivation, and the "1" for
using one symbol of lookahead to make each parsing action decision.
6. Reading
aho@cs.columbia.edu