【正文】
for arbitrary big N. Univ. 可計(jì)算理論 2020/11/17 56/77 (Un)putability of K 自學(xué)或略講 如何估計(jì) X的復(fù)雜度 K(x)? ? K(x) ? n 只對(duì)少數(shù)特殊的串成立,有規(guī)律或結(jié)構(gòu) 如 全 0串, 01交替串,等等, ? 對(duì)于描述 K(x) ? n 的串應(yīng)該證明 較小的圖靈機(jī)的確不能產(chǎn)生它。 例: ASCII用碼代替 字符位圖 設(shè) N=2m,N個(gè)符號(hào)只要編碼成長(zhǎng)度為 log(N) =m的串 1024個(gè)串用 10位編碼就可區(qū)別了, Hence K(x) ? cS + log(N) = cS + log(|S|), with constant cS depending on the description of S. Univ. 可計(jì)算理論 2020/11/17 45/77 Frequency Compression 頻率壓縮 Let x ?{0,1}n with 75% zeros and 25% ones. 高頻率者用短碼,低頻率用長(zhǎng)碼,使得總體碼長(zhǎng)較小 Consider the set S?{0,1}n of all such strings. The description of S requires ? c + 2log(n) bits. The size of S is |S| ? . The description of x requires no more than c + 2log(n) + log(|S|) = c + 2log(n) + bits. For large enough n: K(x) ? (approximately). Univ. 可計(jì)算理論 2020/11/17 46/77 Shannon Entropy of Strings 桑龍 串熵 ep217 We can press a bitstring ?{0,1}n, depending on the 0/1 distribution. The Shannon entropy of this distribution (p,1–p) gives an upper bound on the Kplexity. Univ. 可計(jì)算理論 2020/11/17 47/77 Best Splits (Cont.) 補(bǔ)充 熵概念 ? 用 8=23個(gè)字符( a1,..a8)寫密碼信 , 需要壓縮編碼,節(jié)約資源 ? 用 huffman編碼,編碼長(zhǎng)度與使用頻度反比,用得頻繁的碼長(zhǎng)短 , ? 設(shè) 8 各字符出現(xiàn)的為頻率分布如表。符合這一深層次的信息壓縮定理。| Winzip|+|| 能描述 Some details need to be sorted out first though… 又例:某個(gè)問題用億次機(jī)算,最少要用 1個(gè)月 TM M time 規(guī)律的描述和輸入 Univ. 可計(jì)算理論 2020/11/17 15/77 ‘ Turing Describable’ ep215 cp141 重復(fù) 17次 01 TM M w 說明可用 Mw的長(zhǎng)度來描述信息量 X 的 復(fù)雜度(信息量)為 Min ({ |M|+|w| :: x=M(w), M is TM } ) 例 用 Winzip 壓縮 成為 , 121K 一個(gè)圖靈機(jī) M ,M 長(zhǎng)度 2020 , 一個(gè)串 w 復(fù)雜度2020,且不可壓縮, M 一定不能描述 W. 適當(dāng)規(guī)定 大小 概念。 ? 自然現(xiàn)象中 耗散型,增加熵的多,作清潔和生命現(xiàn)象使無序到有序,減少熵(不考慮能量的耗散) Univ. 可計(jì)算理論 2020/11/17 51/77 An Application of KComplexity K復(fù)雜度的應(yīng)用 ep217 Kolmogorov plexity gives a rigorous definition for the notion of order or regularity. 嚴(yán)格描述了階或正則概念 The TM model gives us the most general way of describing mathematical objects like primes, puter programs, or theories. 圖靈機(jī)給出描述數(shù)學(xué)對(duì)象如素?cái)?shù)、程序。del‘s inpleteness thm. 給定 Th(N,+,?),設(shè) A是 找出其中完全公理和推導(dǎo)到規(guī)則的 程序( TM), AAttempt。 由前面結(jié)論,知道小馬不能拉大車。理論的一一般模型 Together with the inpressibility theorem, this allows us to make general statements about these ,可得出一些一般的性質(zhì) Univ. 可計(jì)算理論 2020/11/17 53/77 Counting Primes less than N 估計(jì) 比 N 小的素?cái)?shù)的個(gè)數(shù) Q: How many primes are there less than N? Let p1,…,p m be the m primes ? N. We know that we can describe N by: 因子分解 m21 eme2e1 pppN ???Hence e1,…,e m gives a description of N. Furthermore, for each j we have: ej ? log(N). Thus e1,…,e m requires ? m增加能力后, EXE的字節(jié)數(shù)增加了。 } How does the M,y encoding work? Univ. 可計(jì)算理論 2020/11/17 17/77 Measuring the size of (M,y) We want to express the size of the TM M and y in a number of bits. But how many bits is a specific (M,y) pair? Other key idea: We fix a universal Turing machine U that,on input M,y, simulates the TM M on y (yielding output x). 為了 一致、公平 地比較,用通用圖靈機(jī) U,模擬 M on y int U (M,y) { return ( M(y) )。 一個(gè)圖靈機(jī) M ,M 長(zhǎng)度 2020 , 一個(gè)串 w 復(fù)雜度2020,且不可壓縮, M 一定不能描述 W. 我們稱為 小馬 拉 大車 這一直觀感覺 在后面有用 許多軟件。 1k長(zhǎng)的程序不能描述 windows Univ. 可計(jì)算理論 2020/11/17 41/77 Inpressibility 不可壓縮的串 ep217,cp149 Theorem : For every n there exists at least one inpressible string x?{0,1}n with K(x)?n. 存在任意長(zhǎng)的不可壓縮串 Proof (by pigeonhole argument): 用 鴿巢原理 – There are 2n different strings x in {0,1}n. – There is one description of length 0, two descriptions of length 1,…, and 2 n–1 descriptions of length n–1. In total: 2n–1 descriptions of length smaller than , there has to be an x?{0,1}n that has a minimal description of at least n bits. 長(zhǎng)度從 1(n1)的串比長(zhǎng)度為 n的串的個(gè)數(shù)少 Univ. 可計(jì)算理論 2020/11/17 42/77 Inpressibility 不可壓縮的串 ep217,cp149 Theorem : For every n there exists at least one inpressible string x?{0,1}n with K(x)?n. 存在任意長(zhǎng)的不可壓縮串 Proof (by pigeonhole argument): 用 鴿巢原理 – There are 2n different strings x in {0,1}n. 1+2+4+…,+ 2 n–1= 2n–1 2n 長(zhǎng)度從 1(n1)的串的總數(shù) , 長(zhǎng)度為 n的串的總數(shù) 目標(biāo) 待壓 Univ. 可計(jì)算理論 2020/11/17 43/77 Indexing Trick Compression idea: If S is a small set that is easy to describe, then we can describe an x?S by: description of enumeration of S index of x in S Let S = {s1,…,s N}. We can indicate every x?S by the 1?j?N such that sj=x. 用編號(hào) j代替字符串 sj , 就短多了。log(log(N)). The total description MyN requires no more than c + mdel (3)自學(xué) 1) Enumerate a statements that follows from A 2) If this statement is of the form ―K(x) n‖, then print(x) and stop 3) Otherwise: generate next statement and go to 2) The above program can be expressed with less than 2|A| + 2log(n) + c bits. (The constant c does not depend on A or n.) Also, the program outputs a string x with K(x)n. Contradiction if n 2|A| + 2log(n) + c. 下頁 Univ. 可計(jì)算理論 2020/11/17 66/77 Compression and G246。 與最小描述的定義矛盾。 情況越混亂, 熵 H越大,要表達(dá)它所需要的平均碼長(zhǎng)越大, ? 春秋戰(zhàn)國(guó),等兵力分布時(shí),戰(zhàn)爭(zhēng)多,混亂,熵大,后來逐漸統(tǒng)一,熵變小,信息變小 (戰(zhàn)爭(zhēng)故事也不多,不精彩了)秦統(tǒng)一中國(guó)后,國(guó)家對(duì)立信息的 編碼只要 0位即可 ? 直觀感覺:要說清 混亂的事情,比較費(fèi)口舌(費(fèi)信息量) ? 男生寢室的熵比較大,例如一雙鞋子分居兩地,敘述起來要多說些話,作清潔后 熵變小。符合這一深層次的信息壓縮定理。 an irregular one has no such summary. Univ. 可計(jì)算理論 2020/11/17 10/77 Regularity We can give a short description of ―0101010101 010101010101010101010101‖ by ―17 times 01‖. For the other ―01011101010010100011100011 00011011‖ this seems more problematic. Suggests: 規(guī)律性 使得描述較短(有規(guī)律的,信息量較?。? A regular string is a string that has a sho