Main Text
can be reconstructed from a digital sampling rate that is twice the original analog frequency. For example, a 20 kHz audio signal can be accurately represented by digital samples taken at 40 kHz.

These models have advanced to the point where accuracy can exceed 90% in a quiet environment. Several approaches are in use:

- One class of model applies a language database built into the program. To do this, the program uses a dynamic-programming algorithm.
- Knowledge-based speech recognition analyzes spectrograms of speech to gather data and formulate rules, which return information equivalent to the operator's commands and statements.
- Stochastic speech recognition is the most common today. The most popular stochastic model is the HMM (hidden Markov model). HMMs have proven successful at analyzing speech input because the algorithm takes into account the language model, an acoustic model of human speech, and the entire known vocabulary.

As noted above, programs that analyze speech with stochastic models are the most popular today and have proven the most successful, and this has strengthened the capabilities of speech software. Constants derived in this way become one link in the speech-recognition algorithm, so that it can deliver better recognition later on.

Dictation

The second aspect of command recognition is dictation. In addition, many companies value dictation in the translation process, in which a user's speech is rendered into written letters, so that the user can, in effect, address people whose native language differs from their own.

Errors in Interpreting Statements

When speech-recognition systems process your statements, their accuracy depends on their ability to reduce errors. When one word in a sentence is misrecognized, that is called a single-word error. The command success rate is determined by accurate interpretation of commands.

Business

Major speech technology companies

As the speech-technology industry has developed, more companies have entered the field with new products and ideas. Their revenue in 2021 ran to hundreds of millions of dollars. Nuance (NASDAQ: NUAN), headquartered in Burlington, develops speech and imaging technologies for business and customer service. Vlingo recently joined forces with Yahoo, providing the speech-recognition capability behind the push-to-talk feature of Yahoo's mobile search service.

Patent infringement lawsuits

Given how competitive both the business and the technology are, it is no surprise that there have been numerous patent-infringement lawsuits among these companies. If you use technology that another company or individual has already patented, even technology you developed independently, you may be required to pay damages and may, perhaps unjustly, be barred from using that technology in the future. Several of these lawsuits are described below.

The Future of Speech Recognition

Future trends and applications

The medical industry

For years, the medical industry has been promoting electronic medical records (EMR). Because there is not enough staff to enter the mass of patient information into electronic form, paper records still prevail.

The military

The defense industry has researched speech-recognition software in an effort to make its complex applications more efficient and approachable, rather than merely more complicated.
Military command centers are likewise trying to use speech recognition as a quick, simple way to reach the vast databases at their disposal in critical moments. The military has also announced that it is working to use speech-recognition software to convert data into patient records.

…data entry, automated processing of telephone calls), a main element of so-called natural-language processing through computer speech technology. Speech derives from sounds created by the human articulatory system, including the lungs, vocal cords, and tongue. Through exposure to variations in speech patterns during infancy, a child learns to recognize the same words or phrases despite different modes of pronunciation by different people, e.g., pronunciation differing in pitch, tone, emphasis, and intonation pattern. The cognitive ability of the brain enables humans to achieve that remarkable capability. As of this writing (2021), we can reproduce that capability in computers only to a limited degree, but in many ways it is still useful.

The Challenge of Speech Recognition

Writing systems are ancient, going back as far as the Sumerians of 6,000 years ago. The phonograph, which allowed the analog recording and playback of speech, dates to 1877. Speech recognition had to await the development of the computer, however, owing to multifarious problems with the recognition of speech.

First, speech is not simply spoken text, in the same way that Miles Davis playing "So What" can hardly be captured by a note-for-note rendition as sheet music. What humans understand as discrete words, phrases, or sentences with clear boundaries are actually delivered as a continuous stream of sounds: Iwenttothestoreyesterday, rather than I went to the store yesterday. Words can also blend together, with Whaddayawa? representing What do you want?

Second, there is no one-to-one correlation between sounds and letters. In English, there are slightly more than five vowel letters: a, e, i, o, u, and sometimes y and w. There are more than twenty different vowel sounds, though, and the exact count can vary depending on the accent of the speaker.
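The continuous-stream problem illustrated by Iwenttothestoreyesterday can be made concrete with a toy dynamic-programming search that recovers word boundaries from an unbroken letter stream, given a known vocabulary. This is only a sketch under stated assumptions: the function name, the tiny vocabulary, and the first-match tie-breaking are illustrative choices, and real recognizers segment streams of acoustic frames, not letters.

```python
def segment(stream, vocab):
    """Recover word boundaries from an unbroken letter stream.

    best[i] holds one list of vocabulary words that exactly covers
    stream[:i], or None if no such segmentation exists.
    """
    best = [None] * (len(stream) + 1)
    best[0] = []  # the empty prefix is trivially segmented
    for i in range(1, len(stream) + 1):
        for j in range(i):
            # Extend a segmentable prefix stream[:j] by one word stream[j:i].
            if best[j] is not None and stream[j:i] in vocab:
                best[i] = best[j] + [stream[j:i]]
                break  # keep the first (earliest-break) segmentation found
    return best[len(stream)]

vocab = {"i", "went", "to", "the", "store", "yesterday"}
print(segment("iwenttothestoreyesterday", vocab))
# -> ['i', 'went', 'to', 'the', 'store', 'yesterday']
```

The same dynamic-programming idea, reusing the best result for each prefix instead of re-deriving it, is what makes the template-matching approach mentioned earlier tractable.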
The reverse problem also occurs, where more than one letter can represent a given sound. The letter c can have the same sound as the letter k, as in cake, or as the letter s, as in citrus.

In addition, people who speak the same language do not use the same sounds, i.e., languages vary in their phonology, or patterns of sound organization. There are different accents: the word "water" could be pronounced watter, wadder, woader, wattah, and so on. Each person has a distinctive pitch when they speak: men typically have the lowest pitch, while women and children have a higher pitch (though there is wide variation and overlap within each group). Pronunciation is also colored by adjacent sounds, the speed at which the user is talking, and even by the user's feelings or intentions: Oh, like, you know, well. There are also sounds that are a part of speech but are not considered words: er, um, uh. Coughing, sneezing, laughing, sobbing, and even hiccupping can be a part of what is spoken. And the environment adds its own noises.

One of the earliest recognizers, Bell Labs' Audrey (1952), could identify spoken digits, but only when adjusted to an individual speaker's speech profile. Results dipped as low as 60 percent if the recognizer was not adjusted. Audrey worked by recognizing phonemes, or individual sounds that were considered distinct from each other. The phonemes were correlated to reference models of phonemes that were generated by training the recognizer. Over the next two decades, researchers spent large amounts of time and money trying to improve upon this concept, with little success. Computer hardware improved by leaps and bounds, speech synthesis improved steadily, and Noam Chomsky's generative work in phonology also led mainstream linguistics to abandon the concept of the phoneme altogether, in favour of breaking down the sound patterns of language into smaller, more discrete features.

In 1969, John R. Pierce wrote a forthright letter to the Journal of the Acoustical Society of America, where much of the research o