In a recursive-descent parser, the production information is embedded in the individual parse functions for each nonterminal, and the runtime execution stack keeps track of our progress through the parse. There is another method for implementing a predictive parser: it uses a table to store the production information and an explicit stack to keep track of where we are in the parse.

How a table-driven predictive parser works

We push the start symbol on the stack (above the endmarker $) and read the first input token. As the parser works through the input, there are the following possibilities for the top stack symbol X and the input token a:
1. If X = a and a = end of input ($): the parser halts and the parse completed successfully.
2. If X = a and a != $: successful match; pop X and advance to the next input token. This is called a match action.
3. If X != a and X is a nonterminal: pop X and consult the table at M[X,a] to see which production applies, then push the right side of that production on the stack, rightmost symbol first so that its leftmost symbol ends up on top. This is called a predict action.
4. If none of the preceding cases applies, or the table entry from step 3 is blank, there has been a parse error.

The first set of a sequence of symbols u, written as First(u), is the set of terminals that begin the strings derivable from u. A bit more formally, consider all strings derivable from u by a leftmost derivation. If u =>* v, where v begins with some terminal, that terminal is in First(u). If u =>* ε, then ε is in First(u).

The follow set of a nonterminal A is the set of terminal symbols that can appear immediately to the right of A in a valid sentential form. A bit more formally, for every valid sentential form S =>* uAv, where v begins with some terminal, that terminal is in Follow(A).

Calculating first sets

To calculate First(u) where u has the form X1X2...Xn, do the following:
1. If X1 is a terminal, add X1 to First(u); otherwise add First(X1) - ε to First(u).
2. If X1 is a nullable nonterminal, i.e., X1 =>* ε, add First(X2) - ε to First(u). Furthermore, if X2 can also go to ε, add First(X3) - ε, and so on, through all Xn until the first non-nullable one.
3. If X1X2...Xn =>* ε, add ε to the first set.

Calculating follow sets

For each nonterminal in the grammar, do the following:
1. Place $ in Follow(S), where S is the start symbol and $ is the input's right endmarker. The endmarker might be end of file, it might be newline, it might be a special symbol, whatever is the expected end-of-input indication for this grammar. We will typically use $ as the endmarker.
2. For every production A -> uBv, where u and v are any strings of grammar symbols and B is a nonterminal, everything in First(v) except ε is placed in Follow(B).
3. For every production A -> uB, or a production A -> uBv where First(v) contains ε (i.e., v is nullable), everything in Follow(A) is added to Follow(B).

Constructing the parse table

1. For each production A -> u of the grammar, do steps 2 and 3.
2. For each terminal a in First(u), add A -> u to M[A,a].
3. If ε is in First(u) (i.e., A is nullable), add A -> u to M[A,b] for each terminal b in Follow(A). If ε is in First(u) and $ is in Follow(A), add A -> u to M[A,$].
4. All undefined entries are errors.

LL(1) grammar

A grammar G is LL(1) iff whenever A -> u | v are two distinct productions of G, the following conditions hold:
1. For no terminal a do both u and v derive strings beginning with a (i.e., their first sets are disjoint).
2. At most one of u and v can derive the empty string.
3. If v =>* ε, then u does not derive any string beginning with a terminal in Follow(A).

Error reporting and recovery

An error is detected in predictive parsing when the terminal on top of the stack does not match the next input symbol, or when nonterminal A is on top of the stack, a is the next input symbol, and the parsing table entry M[A,a] is empty.
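
The First, Follow, and parse-table rules given earlier can be sketched in a few dozen lines of Python. This is only an illustration under stated assumptions: the expression grammar, the dictionary representation, and the names GRAMMAR, first_of, compute_first, compute_follow, and build_table are all hypothetical and do not come from the text. The sets are computed by repeating the rules until nothing changes.

```python
EPS, END = "ε", "$"

# Hypothetical example grammar (not from the text). Each nonterminal maps to
# its productions; an empty tuple stands for the production A -> ε.
GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}
START = "E"


def first_of(seq, first):
    """First(X1...Xn), using the First sets known so far for nonterminals."""
    out = set()
    for sym in seq:
        if sym not in GRAMMAR:            # terminal: it begins the string
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:         # first non-nullable symbol: stop
            return out
    out.add(EPS)                          # every symbol was nullable (or seq is empty)
    return out


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                        # repeat until a fixed point is reached
        changed = False
        for nt, prods in GRAMMAR.items():
            for rhs in prods:
                new = first_of(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)                # rule 1: $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for a, prods in GRAMMAR.items():
            for rhs in prods:
                for i, b in enumerate(rhs):
                    if b not in GRAMMAR:  # only nonterminals have Follow sets
                        continue
                    rest = first_of(rhs[i + 1:], first)
                    new = rest - {EPS}    # rule 2: First(v) except ε
                    if EPS in rest:       # rule 3: v is empty or nullable
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


def build_table(first, follow):
    table = {}
    for a, prods in GRAMMAR.items():
        for rhs in prods:
            f = first_of(rhs, first)
            lookahead = f - {EPS}
            if EPS in f:                  # nullable right side: use Follow(A), including $
                lookahead |= follow[a]
            for t in lookahead:
                if (a, t) in table:       # two productions in one cell
                    raise ValueError(f"not LL(1): conflict at M[{a},{t}]")
                table[(a, t)] = rhs
    return table


first = compute_first()
follow = compute_follow(first)
table = build_table(first, follow)
```

For this example grammar the sketch gives First(E) = {(, id} and Follow(E') = {), $}, so the ε-production for E' lands in M[E',)] and M[E',$]. Raising an error when a cell would receive a second production is one simple way to notice that a grammar is not LL(1).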
Panic-mode error recovery

Panic-mode error recovery is a simple technique that just bails out of the current construct, looking for a safe symbol at which to restart parsing. The parser discards input tokens until it finds what is called a synchronizing token. The synchronizing tokens are those that we believe confirm the end of the invalid statement and allow us to pick up at the next piece of code.
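
As a rough illustration, the table-driven parse loop with a simple form of panic-mode recovery might look like the sketch below. Everything specific here is an assumption made for the example: the small statement grammar, its hand-written table, the choice of ";" as the only synchronizing token, and the names TABLE, SYNC, and parse. On an error the sketch discards input through the next ";" and pops the stack back to the point where a new statement can begin.

```python
END = "$"
NONTERMINALS = {"Prog", "Stmt", "Expr", "Rest"}

# Hand-written LL(1) table for a hypothetical statement grammar:
#   Prog -> Stmt ; Prog | ε       Stmt -> id = Expr
#   Expr -> id Rest | num Rest    Rest -> + Expr | ε
TABLE = {
    ("Prog", "id"): ("Stmt", ";", "Prog"),
    ("Prog", END): (),
    ("Stmt", "id"): ("id", "=", "Expr"),
    ("Expr", "id"): ("id", "Rest"),
    ("Expr", "num"): ("num", "Rest"),
    ("Rest", "+"): ("+", "Expr"),
    ("Rest", ";"): (),
}
SYNC = ";"   # assumed synchronizing token: the statement terminator


def parse(tokens):
    """Return a list of error messages; an empty list means the parse succeeded."""
    tokens = list(tokens) + [END]
    stack = [END, "Prog"]                 # endmarker at the bottom, start symbol on top
    pos, errors = 0, []
    while True:
        top, tok = stack[-1], tokens[pos]
        if top == tok == END:             # case 1: halt
            return errors
        if top == tok:                    # case 2: match action
            stack.pop()
            pos += 1
        elif top in NONTERMINALS and (top, tok) in TABLE:
            stack.pop()                   # case 3: predict action
            stack.extend(reversed(TABLE[(top, tok)]))
        else:                             # case 4: parse error, enter panic mode
            errors.append(f"token {pos}: unexpected {tok!r} while expecting {top!r}")
            while tokens[pos] not in (SYNC, END):
                pos += 1                  # discard input up to the synchronizing token
            if tokens[pos] == SYNC:
                pos += 1                  # consume the ';' that ends the bad statement
            while stack[-1] not in ("Prog", END):
                stack.pop()               # abandon the current statement on the stack


ok = parse(["id", "=", "num", ";"])
one_bad = parse(["id", "=", "=", ";", "id", "=", "id", "+", "num", ";"])
```

Here ok is an empty list, while one_bad holds a single message for the first statement; the second statement is still parsed because the parser resynchronized at the ";". Real parsers usually choose synchronizing sets per nonterminal (often based on Follow sets) rather than a single token.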