【正文】
Univ. 可計(jì)算理論 2020/11/17 1/77 Lecture Notes for Computation Theory Book :《 計(jì)算理論導(dǎo)引 》 Introduction to the Theory of Computation by Michael Sipser Section 信息的定義 (可考慮部分略講 ) Instructor : 唐常杰 Students : . 20202020, SCU Style : Lecture / Seminar Univ. 可計(jì)算理論 2020/11/17 2/77 Univ. 可計(jì)算理論 2020/11/17 3/77 Univ. 可計(jì)算理論 2020/11/17 4/77 Univ. 可計(jì)算理論 2020/11/17 5/77 Outline Chapter : Definition of Information by Descriptive / Kolmogorov Complexity 若干補(bǔ)充 Inpressible Strings Chapter 6 Arguments by Inpressibility Learning theory Univ. 可計(jì)算理論 2020/11/17 6/77 Measuring Information ep213 cp 146 Standard information theory considers each n bit string in {0,1}n equal (all have probability 2–n). However, we feel a difference between the string A=“0101010101010101010101010101010101‖ 有規(guī)律地 重復(fù) 01串 17次,可壓縮為 0117 and the string (of equal length) B=“0101110101001010001110001100011011‖. 比較復(fù)雜,幾句化說不清的,信息量較大 直觀感覺: 長話短說 的 最短尺寸,可用來度量其信息量 Univ. 可計(jì)算理論 2020/11/17 7/77 Measuring Information ep213 cp 140 Standard information theory considers each n bit string in {0,1}n equal (all have probability 2–n). However, we feel a difference between the string A=“0101010101010101010101010101010101‖ 有規(guī)律地 重復(fù) 01串 17次,可壓縮為 0117 and the string (of equal length) B=“0101110101001010001110001100011011‖. 比較復(fù)雜,幾句化說不清的,信息量較大 直觀感覺: 長話短說 的 最短尺寸,可用來度量其信息量 Univ. 可計(jì)算理論 2020/11/17 8/77 Measuring Information ep213 cp 140 Standard information theory considers each n bit string in {0,1}n equal (all have probability 2–n). However, we feel a difference between the string A=“0101010101010101010101010101010101‖ 有規(guī)律地 重復(fù) 01串 17次,可壓縮為 0117 and the string (of equal length) B=“0101110101001010001110001100011011‖. 比較復(fù)雜的,短話說不清的,信息量較大 直觀感覺 : 表達(dá)語義的 最短尺寸,可用來度量其信息量 Univ. 可計(jì)算理論 2020/11/17 9/77 Regularity We can give a short description of ―0101010101 010101010101010101010101‖ by ―17 times 01‖. For the other ―01011101010010100011100011 00011011‖ this seems more problematic. Suggests: 規(guī)律性使得描述較短(信息量較小) A regular string is a string that has a short description。 an irregular one has no such summary. Univ. 可計(jì)算理論 2020/11/17 10/77 Regularity We can give a short description of ―0101010101 010101010101010101010101‖ by ―17 times 01‖. For the other ―01011101010010100011100011 00011011‖ this seems more problematic. Suggests: 規(guī)律性 使得描述較短(有規(guī)律的,信息量較?。? A regular string is a string that has a short description。 an irregular one has no such summary. Univ. 可計(jì)算理論 2020/11/17 11/77 Allowed Descriptions 用 Pi表示一長串 01 Problem: What do we consider a description? What may be an obvious description for one, may be illegible for somebody else: …….. 011001001000011111101101010100010001 000010110100011000010001101001100010 011000110011000101000101110000000110 111000001110011010001001010010… ? We need a proper definition of ?a description‘. Univ. 可計(jì)算理論 2020/11/17 12/77 Allowed Descriptions 用 Pi表示一長串 01 Problem: What do we consider a description? What may be an obvious description for one, may be illegible for somebody else: …….. 011001001000011111101101010100010001 000010110100011000010001101001100010 011000110011000101000101110000000110 111000001110011010001001010010… ? We need a proper definition of ?a description‘. Univ. 可計(jì)算理論 2020/11/17 13/77 ‘ Turing Describable’ ep215 cp141 重復(fù) 17次 01 TM M w 說明可用 Mw的長度來描述信息量 X 向復(fù)雜度(信息量)為 Min ({ |M|+|w| :: x=M(w), M is TM } ) 例 用 Winzip 壓縮 成為 , 121K | Winzip|+|| 能描述 Some details need to be sorted out first though… 規(guī)律的描述和輸入 Univ. 可計(jì)算理論 2020/11/17 14/77 ‘ Turing Describable’ ep215 cp141 重復(fù) 17次 01 TM M w 說明可用 Mw的長度來描述信息量 X 的 復(fù)雜度(信息量)為 Min ({ |M|+|w| :: x=M(w), M is TM } ) 例 用 Winzip 壓縮 成為 , 121K | Winzip|+|| 能描述 Some details need to be sorted out first though… 又例:某個(gè)問題用億次機(jī)算,最少要用 1個(gè)月 TM M time 規(guī)律的描述和輸入 Univ. 可計(jì)算理論 2020/11/17 15/77 ‘ Turing Describable’ ep215 cp141 重復(fù) 17次 01 TM M w 說明可用 Mw的長度來描述信息量 X 的 復(fù)雜度(信息量)為 Min ({ |M|+|w| :: x=M(w), M is TM } ) 例 用 Winzip 壓縮 成為 , 121K | Winzip|+|| 能描述 Some details need to be sorted out first though… 又例:某個(gè)問題用億次機(jī)算,最少要用 1個(gè)月 TM M time 規(guī)律的描述和輸入 壓縮程序復(fù)雜一些, ?壓縮出來的文件可能小一些 Univ. 可計(jì)算理論 2020/11/17 16/77 Measuring the size of (M,y) We want to express the size of the TM M and y in a number of bits. But how many bits is a specific (M,y) pair? Other key idea: We fix a universal Turing machine U that,on input M,y, simulates the TM M on y (yielding output x). 為了一致地比較,用通用圖靈機(jī) U,模擬 M on y int U (M,y) { return ( M(y) )。 } How does the M,y encoding work? Univ. 可計(jì)算理論 2020/11/17 17/77 Measuring the size of (M,y) We want to express the size of the TM M and y in a number of bits. But how many bits is a specific (M,y) pair? Other key idea: We fix a universal Turing machine U that,on input M,y, simulates the TM M on y (yielding output x). 為了 一致、公平 地比較,用通用圖靈機(jī) U,模擬 M on y int U (M,y) { return ( M(y) )。 } How does the M,y encoding work? Univ. 可計(jì)算理論 2020/11/17 18/77 An Encoding for M,y ep214, cp147 思想 : 0,1和分隔符號(hào),需三符號(hào),可用三進(jìn)位, 但這里巧用 二進(jìn)位編碼 The description of the TM M