【文章內(nèi)容簡(jiǎn)介】
and regular expressions. ? ( regular expression FA) Example ? identifier = letter(letter|digit)* ? Circles represent states. ? The arrowed lines represent transitions upon a match of the labeled characters. ? Start state is indicated by unlabeled arrow ? Accepting states are indicated by doubleline border. 1 2 letter letter digit Deterministic Final Automata ? A DFA M consists of – An alphabet Σ – A set of states S – A transition function T: (S x Σ) S – A start state s0 S – A set of accepting states A S ? The language accepted by M, written L(M), is the set of strings of characters c1c2…c n with each ci Σ such that there exist states s1=T(s0,c1), s2=T(s1,c2), …. sn=T(sn1,) with sn an element of A. ? ? ?Error Transitions 1 2 letter digit Error other other letter By convention, the error transitions are not drawn in the diagram, but assumed to always exist. other = ~letter other = ~(letter|digit) Example ? The set of strings that contain exactly one b is accepted by the following DFA: ? ? (~a)*b(~a)* (Regular expression ) ? We will omit labels when it is not necessary to refer to the states by name. b notb notb Example ? The set of strings that contain at most one b is accepted by the following DFA: ? (it is simpler than its regular expression ) ? We will omit labels when it is not necessary to refer to the states by name. b notb notb Example ? digit = [09] ? nat = digit+ ? signedNat = (+|)? nat digit digit digit digit digit + digit Example (cont) ? number = signedNat(“.” nat)?(E signedNat)? digit digit + digit digit digit . digit digit + digit E E Example ? {Pascal Comments} ? /* C Comments */ other { } other * / * / other * Actions ? When a transitions is made, the character from the input string is moved to a string that accumulates the characters belonging to a single token (lexeme of the token). ? When an accepting state is reached, the recognized token is returned. ? When an error state is reached, an error token is generated or backtracking is done. Actions Example 1 2 letter letter digit start in_id letter letter digit [other] finish return ID [ ] indicate that the delimiting character should be returned to the input string and not consumed. Uniting DFA’s ? In a typical programming language there are many tokens, and each token is recognized by its own DFA. ? If each token begins with a different character, then it is easy to unite their start states into one start state. – Example: :=, =, = = = return ASSIGN : = return LE return EQ Uniting DFA’s (cont) ? If several tokens begin with the same character, such as , =, and , we must arrange the diagram so that there is a unique transition to be made to each state. – The plexity of such task bees enormous, especially I fit is done in an unsystematic way. = return LE return NE return LT [other] ε transition ? An εtransition is a transition that may occur without consulting the input string (and without consuming any character). ? εtransitions are counterintuitive, they may occur “spontaneously”. ? They can express the choice of alternatives and allow to bine DFA?s. ε Expanding DFA ? Need to include ε in the alphabet: Σ U {ε} ? The value of transition function T is a set of states rather then a single state. – T allows each character lead to more than one state. ? The range of T is the power set of the set of states S (the set of all subsets of S). ? We denote the power set by P (S) Nondeterministic Finite Automation (NFA) ? An NFA M consist of – An alphabet Σ – A set of states S – A transition function T: S x (Σ U {ε}) P (S) – A start state s0 from S – Th