Sample MT output (Chinese source):

  "The office all receives one calls self the sand Arab rich business [?] and so on electronic mail, which sends out. The threat will be able after public place and so on the airport to start the biochemistry attack, [?] highly alerts after the maintenance."

Multiple Reference Translations

  [Figure: several human reference translations of the same sentence, with their overlapping word choices highlighted; the overlaid text is not recoverable.]

BLEU Tends to Predict Human Judgments

  [Figure: scatter plot of NIST score (a variant of BLEU) against human judgments of adequacy and fluency, each with a linear fit; R^2 = 88.0% and R^2 = 90.2%. Slide from G. Doddington (NIST).]

Word-Based Statistical MT

Statistical MT Systems

  Spanish/English bilingual text --(statistical analysis)--> translation model: Spanish <-> broken English
  English text --(statistical analysis)--> language model: broken English <-> English

  Que hambre tengo yo --> { What hunger have I / Hungry I am so / I am so hungry / Have I that hunger / ... } --> I am so hungry

  The same pipeline, labeled with models:
  - Translation Model: P(s | e)
  - Language Model: P(e)
  - Decoding algorithm: argmax_e P(e) * P(s | e)

Three Problems for Statistical MT

- Language model
  - Given an English string e, assigns P(e) by formula
  - good English string -> high P(e)
  - random word sequence -> low P(e)
- Translation model
  - Given a pair of strings f, e, assigns P(f | e) by formula
  - f, e look like translations -> high P(f | e)
  - f, e don't look like translations -> low P(f | e)
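The decoding objective above, argmax_e P(e) * P(s | e), can be sketched over a fixed candidate list. Everything here (the candidates and all probabilities) is a made-up toy illustration, not the output of a real system:

```python
# Toy language model P(e): hand-picked probabilities for illustration only.
lm = {
    "I am so hungry": 0.02,
    "what hunger have I": 0.0001,
    "hungry I am so": 0.0005,
}

# Toy translation model P(s | e) for s = "Que hambre tengo yo".
# The literal gloss gets the highest channel probability.
tm = {
    "I am so hungry": 0.01,
    "what hunger have I": 0.05,
    "hungry I am so": 0.02,
}

def decode(candidates):
    """Pick the e maximizing P(e) * P(s | e) over a candidate list."""
    return max(candidates, key=lambda e: lm[e] * tm[e])

print(decode(list(lm)))  # -> I am so hungry
```

The fluent candidate wins even though the literal gloss has a higher channel score, which is exactly the division of labor between the two models: the translation model proposes, the language model enforces fluent English.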
- Decoding algorithm
  - Given a language model, a translation model, and a new sentence f, find the translation e maximizing P(e) * P(f | e)

The Classic Language Model: Word N-Grams

  Goal of the language model -- choose among:
  - He is on the soccer field / He is in the soccer field
  - Is table the on cup the / The cup is on the table
  - Rice shrine / American shrine
  - Rice company / American company

The Classic Language Model: Word N-Grams

  Generative approach:
    w1 = START
    repeat until END is generated:
      produce word w2 according to a big table P(w2 | w1)
      w1 := w2

  P(I saw water on the table) =
    P(I | START) * P(saw | I) * P(water | saw) * P(on | water) * P(the | on) * P(table | the) * P(END | table)

  Probabilities can be learned from online English text.

Translation Model?

  Mary did not slap the green witch
  Maria no dió una bofetada a la bruja verde

  Generative approach: source-language morphological analysis -> source parse tree -> semantic representation -> generate target structure

Translation Model?

  (same sentence pair)

  Generative story: what are all the possible moves and their associated probability tables?

The Classic Translation Model: Word Substitution/Permutation [IBM Model 3, Brown et al., 1993]

  Generative approach:
    Mary did not slap the green witch
    Mary not slap slap slap the green witch        (fertility: n(3 | slap))
    Mary not slap slap slap NULL the green witch   (null insertion: P-Null)
    Maria no dió una bofetada a la verde bruja     (word translation: t(la | the))
    Maria no dió una bofetada a la bruja verde     (distortion: d(j | i))

  Probabilities can be learned from raw bilingual text.
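The n-gram generative story above turns directly into a sentence-scoring function. The bigram table below is a tiny hand-built stand-in for probabilities that a real model would estimate from counts over a large English corpus:

```python
# Tiny hand-built bigram table P(w2 | w1); values are illustrative only.
bigram = {
    ("START", "I"): 0.1,
    ("I", "saw"): 0.2,
    ("saw", "water"): 0.05,
    ("water", "on"): 0.1,
    ("on", "the"): 0.4,
    ("the", "table"): 0.05,
    ("table", "END"): 0.3,
}

def sentence_prob(words):
    """P(sentence) as a product of bigram probabilities, following the
    generative story: START -> w1 -> ... -> wn -> END."""
    p = 1.0
    prev = "START"
    for w in words + ["END"]:
        p *= bigram.get((prev, w), 0.0)  # unseen bigram -> probability 0
        prev = w
    return p

print(sentence_prob("I saw water on the table".split()))
```

A scrambled word order like "table the on" hits an unseen bigram immediately and scores zero, which is how the language model separates good English strings from random word sequences (a real model would smooth rather than assign exact zeros).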
Statistical Machine Translation: EM training on raw bilingual text

  ... la maison ...   ... la maison bleue ...   ... la fleur ...
  ... the house ...   ... the blue house ...    ... the flower ...

  - Initially, all word alignments are equally likely; all P(french-word | english-word) are equally likely.
  - "la" and "the" are observed to co-occur frequently, so P(la | the) is increased.
  - "house" co-occurs with both "la" and "maison", but P(maison | house) can be raised without limit, to 1.0, while P(la | house) is limited because of "the" (pigeonhole principle).
  - The model settles down after another iteration.
  - Inherent hidden structure revealed by EM training!

  For details, see:
  - "A Statistical MT Tutorial Workbook" (Knight, 1999)
  - "The Mathematics of Statistical Machine Translation" (Brown et al., 1993)
  - Software: GIZA++

  The trained table, e.g. P(juste | fair) = ..., P(juste | correct) = ..., P(juste | right) = ..., maps a new French sentence to possible English translations, to be rescored by the language model.

Decoding for "Classic" Models

- Of all conceivable English word strings, find the one maximizing P(e) * P(f | e)
- Decoding is an NP-complete challenge (Knight, 1999)
- Several search strategies are available
- Each potential English output is called a hypothesis.

The Classic Results

- la politique de la haine . (Foreign Original)
  politics of hate . (Reference Translation)
  the policy of the hatred . (IBM4 + N-grams + Stack)

- nous avons signé le protocole . (Foreign Original)
  we did sign the memorandum of agreement .
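The intuition above (co-occurrence pulls P(la | the) up while the pigeonhole effect caps its competitors) is exactly what EM training does. Here is a minimal sketch on the three sentence pairs from the slides, using IBM Model 1 (translation probabilities only, no fertility, NULL word, or distortion as in Model 3) with uniform initialization:

```python
from collections import defaultdict

# The three sentence pairs from the slides.
pairs = [
    ("la maison".split(), "the house".split()),
    ("la maison bleue".split(), "the blue house".split()),
    ("la fleur".split(), "the flower".split()),
]

f_vocab = {f for fs, _ in pairs for f in fs}
e_vocab = {e for _, es in pairs for e in es}

# Uniform initialization: all P(f | e) equally likely.
t = {(f, e): 1.0 / len(f_vocab) for f in f_vocab for e in e_vocab}

for _ in range(50):  # EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    # E-step: collect expected alignment counts under the current t.
    for fs, es in pairs:
        for f in fs:
            z = sum(t[(f, e)] for e in es)  # normalize within the sentence
            for e in es:
                c = t[(f, e)] / z
                count[(f, e)] += c
                total[e] += c
    # M-step: re-estimate t(f | e) from the expected counts.
    for (f, e), c in count.items():
        t[(f, e)] = c / total[e]

print(round(t[("la", "the")], 3))       # climbs toward 1
print(round(t[("maison", "house")], 3)) # climbs toward 1
```

No alignment was ever annotated by hand; the hidden word correspondences emerge purely from co-occurrence statistics, which is the "inherent hidden structure revealed by EM training" on the slide. GIZA++ implements this same scheme (and the higher IBM models) at scale.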
  (Reference Translation)
  we have signed the protocol . (IBM4 + N-grams + Stack)