...
      \n      printf("new line\n");

Lex Yacc tutorial
Kun-Yuan Hsieh
Programming Language Lab., NTHU
PLLab, NTHU, CS2403 Programming Languages

Overview
• Take a glance at Lex!

Compilation Sequence

What is Lex?
• The main job of a lexical analyzer (scanner) is to break up an input stream into more usable elements (tokens):
      a = b + c * d;
      ID ASSIGN ID PLUS ID MULT ID SEMI
• Lex is a utility that helps you rapidly generate your scanners.

Lex - Lexical Analyzer
• Lexical analyzers tokenize input streams.
• Tokens are the terminals of a language:
  - English: words, punctuation marks, ...
  - Programming language: identifiers, operators, keywords, ...
• Regular expressions define terminals/tokens.

Lex Source Program
• Lex source is a table of
  - regular expressions and
  - corresponding program fragments

      digit   [0-9]
      letter  [a-zA-Z]
      %%
      {letter}({letter}|{digit})*   printf("id: %s\n", yytext);
      %%
      main() { yylex(); }

Repeated Expressions
• a? = zero or one instance of a
• a* = zero or more instances of a
• a+ = one or more instances of a
• E.g.
      ab?c = ac or abc
      [a-z]+ = all non-empty strings of lower case letters
      [a-zA-Z][a-zA-Z0-9]* = all alphanumeric strings with a leading alphabetic character

Precedence of Operators
• Level of precedence, highest first:
  - Kleene closure (*), ?, +
  - concatenation
  - alternation (|)
• All operators are left associative.
• Ex: a*b|cd* = ((a*)b)|(c(d*))

Pattern Matching Primitives

      Metacharacter   Matches
      .               any character except newline
      \n              newline
      *               zero or more copies of the preceding expression
      +               one or more copies of the preceding expression
      ?               zero or one copy of the preceding expression
      ^               beginning of line / complement
      $               end of line
      a|b             a or b
      (ab)+           one or more copies of ab (grouping)
      [ab]            a or b
      a{3}            3 instances of a
      "a+b"           literal "a+b" (C escapes still work)

Recall: Lex Source
• Lex source is a table of
  - regular expressions and
  - corresponding program fragments (actions)

      ...
      %%
      regexp   action
      regexp   action
      ...
      %%

      %%
      "="      printf("operator: ASSIGNMENT");

      Input:   a = b + c;
      Output:  a operator: ASSIGNMENT b + c;

Transition Rules
• regexp <one or more blanks> { actions (C code) }
• A null statement ; ignores the matched input, as in [ \t\n] ;
  - Causes the three spacing characters (blank, tab, newline) to be ignored

      a = b + c;
          ↓
      a=b+c;

Transition Rules (cont'd)
• Four special options for actions: |, ECHO;, BEGIN, and REJECT;
• | indicates that the action for this rule is the action of the next rule
  - [ \t\n] ;
• Unmatched input is handled by the default action, which ECHOes it from the input to the output.

Transition Rules (cont'd)
• REJECT
  - Go do the next alternative

      ...
      %%
      pink     {npink++; REJECT;}
      ink      {nink++; REJECT;}
      pin      {npin++; REJECT;}
      .        |
      \n       ;

      ...
      [a-z]+   ECHO;
      ...
      chars += yyleng;
      ...
      %}
      letter   [a-zA-Z]
      %%
      {letter}+   foo();
      ...
      %}
      letter   [a-zA-Z]
      %%
      {letter}+   {printf("a word\n");}
      %%
      main() { yylex(); }

Usage
• To run Lex on a source file, type lex followed by the name of the source file.
• It produces a file named lex.yy.c, which is a C program for the lexical analyzer.
• To compile lex.yy.c, type
      cc lex.yy.c -ll
• To run the lexical analyzer program, type
      ./a.out < inputfile

Versions of Lex
• AT&T ...

      ...
      | expression '-' NUMBER   { $$ = $1 - $3; }
      ;
      ...
      return 0;