【文章內(nèi)容簡介】
yk+ry, then ? expected cost ≤ (x+rx/k) * k + (y+ry/k) * k = dx+dy = ||ab||1 ? Total expected cost ≤ EEMD?(A,B) dx k k k Embedding into ?1 using the Deposition Lemma ? For randomlyshifted cutgrid G of side length k, we have: ? EEMD?(A,B) ≤ ∑ i EEMDk(Ai, Bi) + k*EEMD?/k(AG, BG) ? 3*EEMD?(A,B) ? ?[ ∑ i EEMDk(Ai, Bi) ] ? EEMD?(A,B) ? ?[ k*EEMD?/k(AG, BG) ] ? To embed into ?1, we applying it recursively for k=3 ? Choose randomlyshifted cutgrid G1 on [?]2 ? Obtain many grids [3]2, and a big grid [?/3]2 ? Then choose randomlyshifted cutgrid G2 on [?/3]2 ? Obtain more grids [3]2, and another big grid [?/32]2 ? Then choose randomlyshifted cutgrid G3 on [?/9]2 ? ? ? Then, embed each of the small grids [3]2 into ?1, using O(1) distortion embedding, and concatenate the embeddings Proving recursion works ? Embedding does not contract distances: ? EEMD?(A,B) ≤ ? ∑ i EEMDk(Ai, Bi) + k*EEMD?/k(AG1, BG1) ≤ ? ∑ i EEMDk(Ai, Bi) + k∑ i EEMDk(AG1,i, BG1,i)+k*EEMD?/k(AG2, BG2) ≤ ? ? Embedding distorts distances by O(log ?), in expectation: ? (3logk?) * EEMD?(A,B) ? ? 3* EEMD?(A, B) + (3logk?/k)*EEMD?(A, B) ? ? ?[ ∑ i EEMDk(Ai, Bi) + (3logk?/k)*k*EEMD?/k(AG1, BG1) ] ? ? ? ? By Markov’s, it’s O(log ?) distortion with 90% probability Final theorem ? Theorem: can embed EMD over [?]2 into ?1 with O(log ?) distortion. ? Dimension required: O(?2), but a set A of size s maps to a vector that has only O(s*log ?) nonzero coordinates. ? Time: can pute in O(s*log ?) ? Randomized: does not contract, but large distortortion happens with 10% ? Applications: ? Can pute EMD(A,B) in time O(s*log ?) ? NNS: O(c*log ?) approximation, with O(n1+1/c*s) space, O(n1/c *s*log ?) query time. Embeddings of various metrics ? Embeddings into ?1 Metric Upper bound Earthmover distance (ssized sets in 2D plane) O(log s) [Cha02, IT03] Earthmover distance (ssized sets in {0,1}d) O(log s*log d) [AIK08] Edit distance over {0,1}d (= indels to tranform xy) 2?? ( log??) [OR05] Ulam (edit distance between nonrepetitive strings) O(log d) [CK06] Block edit distance O?(log d) [MS00, CM07] Lower bound Ω( log??) [NS07] Ω(log s) [KN05] ?(log d) [KN05,KR06] ??(log d) [AK07] 4/3 [Cor03] Curse of nonembeddability into ?1 ? ? ?1 natural target for many metrics, and have algorithms ? Will see two example of “going beyond ?1” ? Sketching for EMD ? Embedding of Ulam metric into product spaces