PLLab, NTHU, CS2403 Programming Languages

    int yywrap(void)    wrapup, return 1 if done, 0 if not done
    ECHO                write matched string
    REJECT              go to the next alternative rule
    INITIAL             initial start condition
    BEGIN condition     switch start condition

User Subroutines Section
• You can use your Lex routines in the same ways you use routines in other programming languages.

    %{
    void foo();
    …
    [a-zA-Z]+    { words++; …
    %%
    …

Lex Predefined Variables
• yytext: a string containing the lexeme
• yyleng: the length of the lexeme
• yyin: the input stream pointer
  – the default input of default main() is stdin
• yyout: the output stream pointer
  – the default output of default main() is stdout
• cs20: % ./… inputfile outfile
• E.g.

    [a-z]+    printf("%s", yytext);

Transition Rules
• regexp <one or more blanks> action (C code);
• Actions may use predefined macros such as ECHO, BEGIN, and REJECT.
• An action consisting only of ; will ignore the input (no actions)

    [ \t\n]    ;

  – the same rule written as alternatives:

    " "     |
    "\t"    |
    "\n"    ;

• With this rule the blanks are dropped, so the input "a = b + c; d = b * c;" has "d = b * c;" come out as "d=b*c;".

Lex Source to C Program
• The table is translated to a C program which
  – reads an input stream,
  – partitions the input into strings which match the given expressions, and
  – copies them to an output stream if necessary

An Overview of Lex
[Figure: overview diagram with the labels Lex source program, Lex, C compiler, input, tokens]

Lex Source
• Lex source is separated into three sections by %% delimiters
• The general format of Lex source is

    {definitions}
    %%                    (required)
    {transition rules}
    %%                    (optional)
    {user subroutines}

• The absolute minimum Lex program is thus

    %%

Lex vs. Yacc
• Lex
  – Lex generates C code for a lexical analyzer, or scanner
  – Lex uses patterns that match strings in the input and converts the strings to tokens
• Yacc
  – Yacc generates C code for a syntax analyzer, or parser
  – Yacc uses grammar rules that allow it to analyze tokens from Lex and create a syntax tree

Lex with Yacc
[Figure: Lex source (lexical rules) is the input to Lex, Yacc source (grammar rules) is the input to Yacc; yyparse() calls yylex(), which returns a token; Input becomes Parsed Input]

Regular Expressions

Lex Regular Expressions (Extended Regular Expressions)
• A regular expression matches a set of strings
• Regular expression
  – Operators
  – Character classes
  – Arbitrary character
  – Optional expressions
  – Alternation and grouping
  – Context sensitivity
  – Repetitions and definitions

Operators

    " \ [ ] ^ - ? . * + | ( ) $ / { } % < >

• If they are to be used as text characters, an escape should be used

    \$ = "$"
    \\ = "\"

• Every character but blank, tab (\t), newline (\n) and the list above is always a text character

Character Classes []
• [abc] matches a single character, which may be a, b, or c
• Every operator meaning is ignored except \, - and ^
• E.g.

    [ab]         a or b
    [a-z]        a or b or c or … or z
    [-+0-9]      all the digits and the two signs
    [^a-zA-Z]    any character which is not a letter

Arbitrary Character .
• To match almost any character, the operator character . is the class of all characters except newline
• [\40-\176] matches all printable characters in the ASCII character set, from octal 40 (blank) to octal 176 (tilde ~)