tors. For example, gray|grey and gr(a|e)y are equivalent.

quantification: a quantifier after a token (such as a character) or group specifies how often that preceding element is allowed to occur.

Syntax of regular expression [1]

metasequence   description
.              matches any single character except newline
[ ]            matches a single character that is contained within the brackets
               [abc] = { a, b, c }
               [0-9] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
[^ ]           matches a single character that is not contained within the brackets
               [^abc] = { x is a character : x is not a or b or c }
^              matches the starting position within the string
$              matches the ending position of the string or the position just before a string-ending newline
{m,n}          matches the preceding element at least m and not more than n times
               a{3,5} matches only “aaa”, “aaaa” and “aaaaa”, NOT “aa”
< >            a name in angle brackets at the beginning of a pattern means the pattern is used only in that start state

Syntax of regular expression [2]

metasequence   description
*              matches the preceding element zero or more times
               ab*c matches “ac”, “abc”, “abbc”
+              matches the preceding element one or more times
               [0-9]+ matches “1”, “14”, “983”
?              matches the preceding element zero or one time
               [0-9]? matches “” (the empty string), “9”
|              the choice (aka alternation or set union) operator matches either the expression before or the expression after the operator
               abc|def matches “abc” or “def”
( )            groups a sub-expression into a new expression
               (01) denotes the string “01”
\              escape character
               * is a metacharacter, \* matches the literal character *
"…"            matches all the characters inside the quotes literally; every metacharacter inside the quotes loses its special meaning, except \
               "/*" denotes the two characters / and *

Example: base-10 integer
- one digit is described by the regular expression [0-9]
- an unsigned integer is composed of one or more digits: [0-9]+
  [0-9]* is not adequate, since [0-9]* also accepts the empty string
- we need an optional sign to represent all integers: -?[0-9]+

Accepted strings: “5”, “1234”, “0000”, “000”, “9276000”

Question: How can a base-16 integer be represented with a regular expression?

Outline
- What is lex
- Regular expression
- Finite state machine
- Content of flex
- Application

Finite state machine (FSM) for -?[0-9]+

Current state   Input token (transition function)   Next state   description
S0              -                                   minus        S0 is the initial state
S0              [0-9]                               digit
minus           [0-9]                               digit        minus state recognizes the string “-”
digit           [0-9]                               digit        digit state recognizes the string “[0-9]+” or “-[0-9]+”
trap                                                             terminate

[Figure: state transition diagram. S0 goes to minus on “-” and to digit on [0-9]; minus goes to digit on [0-9]; digit loops on [0-9] and yields an integer; arcs labelled ^[0-9] lead to trap.]

State sequence for the input “-1234”: S0, minus, digit, digit, digit, digit (one transition per input character).

Transform FSM to C-code
[Figure: the same state transition diagram with callouts 1 to 7 pairing parts of the diagram with the C code.]

Driver to yylex_integer

Exercise: extract real number

-?[0-9]*\.[0-9]+(([Ee][-+]?[0-9]+)?)   real number

- Why do we need an escape character for the dot, “\.”?
- Can this regular expression identify all real numbers?
- Depict the state transition diagram of the finite state machine for this regular expression.
- Implement this state transition diagram and write a driver to test it.
- Use flex t
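
As an illustration of the “Transform FSM to C-code” step, the transition table for -?[0-9]+ can be sketched in C roughly as follows. This is a minimal sketch, not the code from the slides: the state names follow the table (S0, minus, digit, trap), but the signature of yylex_integer and the helper next_state are assumptions made here for illustration. The function is assumed to return the length of the longest prefix that the machine accepts.

```c
#include <ctype.h>
#include <stddef.h>

/* States of the FSM for -?[0-9]+ (names taken from the transition table). */
enum state { S0, MINUS, DIGIT, TRAP };

/* Hypothetical helper: the transition function (current state, input) -> next state. */
static enum state next_state(enum state s, char c)
{
    int is_digit = isdigit((unsigned char)c);

    switch (s) {
    case S0:    return c == '-' ? MINUS : (is_digit ? DIGIT : TRAP);
    case MINUS: return is_digit ? DIGIT : TRAP;
    case DIGIT: return is_digit ? DIGIT : TRAP;
    default:    return TRAP;
    }
}

/*
 * Assumed signature: returns the length of the longest prefix of `text`
 * that matches -?[0-9]+, or 0 if no prefix matches.
 * Only the digit state is accepting.
 */
size_t yylex_integer(const char *text)
{
    enum state s = S0;
    size_t i = 0;        /* number of characters consumed so far */
    size_t matched = 0;  /* length of the last accepted prefix */

    while (text[i] != '\0') {
        s = next_state(s, text[i]);
        if (s == TRAP)
            break;       /* trap state: terminate */
        i++;
        if (s == DIGIT)
            matched = i; /* remember the longest accepted prefix */
    }
    return matched;
}
```

With this sketch, yylex_integer("-1234") consumes all five characters, while yylex_integer("-") returns 0 because the machine stops in the minus state, which is not accepting. Keeping the transition function separate from the loop mirrors the table above, so each row maps to one branch of the switch.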