[Main text]
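is the periodogram of the clean speech signal and |D(λ, k)|².
From choosing the thesis topic through the research, writing, and revision to its final completion, Teacher Jia guided me patiently throughout, so that I kept learning and improving along the way.
The algorithm can be applied widely in speech enhancement systems; it effectively raises the signal-to-noise ratio and suppresses musical noise.
δ is set to 5, and I(λ, k) is the indicator function in the equation above.
Figure 44 plots the true noise power spectrum and the noise power spectrum estimated by our algorithm. The noise is a combination of white noise and F16 fighter-jet noise, with Fs = 8 kHz and SNR = 5 dB.
Adaptive algorithm for non-stationary noise
Let the observed noisy speech be: y(t) = s(t) + n(t) (417), where s(t) is the clean speech and n(t) is the additive noise.
In addition, the noise estimate is obtained by searching for the minimum and then correcting it, so the algorithm is comparatively simple.
To transform the signal to the frequency domain, it is divided into frames of L samples, with an overlap of R samples between adjacent frames.
However, if the average energy or amplitude of the following n frames drops below EL before it has exceeded EU, the frame cannot serve as the initial starting point S1, and the search continues for the next frame whose average energy or amplitude exceeds EU. If the average energy or amplitude of the following n frames exceeds EU, the frame is recorded as S1 and taken as the starting point of speech found from the energy signal.
Computed directly from the definition above, the zero-crossing rate is easily affected by low-frequency interference, so the definition needs a small modification: a threshold T is set, and a zero crossing is redefined as a crossing of the positive and negative thresholds.
Moreover, at low SNR the false-detection rate of VAD rises, and when speech and silence segments cannot be distinguished correctly, the accuracy of the estimated noise is hard to guarantee.
This method searches for the minimum within the noise-estimation window and uses it as the noise estimate, and it is sensitive to the choice of window length: with a long window, non-stationary noise is tracked slowly and the noise tends to be underestimated; with a short window, low-energy speech components are easily mistaken for noise.
Using the noise estimation algorithm based on voice activity detection, speech endpoints were detected from the energy and the minimum zero-crossing rate. The simulation results show that a more robust algorithm is needed, one that can estimate and update the noise continuously even while speech is present.
In 2004 Rangachari and Loizou proposed a fast estimation method that not only computes the speech-presence probability in the subbands of noisy speech more accurately but also updates the noise spectrum continuously without depending on a fixed window length. However, the noise estimate slows down when the speech or noise energy is too high, and when …, some speech energy is attenuated.
Therefore, accurate noise estimation requires estimating the noise spectrum in real time.
Speech enhancement, as a preprocessing technique, is nevertheless one of the most important means of removing such noise interference: it processes the noisy speech to improve its quality, making it more acceptable to listeners or improving the performance of speech processing systems.
Single-channel speech systems are common in practice, for example telephones and mobile phones.
The basic idea of this algorithm is first to filter the power spectrum of the noisy speech with an optimal smoothing filter, giving a rough estimate of the noise.
A study of the minimum-statistics noise estimation method and the improved minima-controlled recursive averaging noise estimation algorithm shows that these methods can estimate noise during speech-present segments and track non-stationary noise effectively.
Future work on noise estimation should further refine the noise power spectrum estimation algorithm and combine noise estimation with other methods, aiming at still more accurate noise estimates.
The probability that speech is present in a given subband of a given frame can be determined from the ratio of the local energy of the noisy speech to its minimum within a specified time window. The ratio is compared with a threshold: a small ratio means that speech is absent from the subband, and a large ratio means that speech is present.
Noise estimation algorithm based on voice activity detection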
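Short-time energy
Speech and noise differ in their energy: for a speech signal with superimposed noise, the energy of a speech segment is the sum of the noise energy and the energy of the speech waveform.
Hence the classic endpoint detection method proposed by Lawrence Rabiner [24], which uses the zero-crossing rate Z and the energy E as features.
The speech end point S2 is detected in the same way as the start point, but searching backward from the end: find the first frame whose average energy or amplitude is above EL and whose preceding frames' average energy or amplitude does not fall below EL before exceeding EU, and record its index as N2. Then search, on the basis of the zero-crossing rate, toward frame N2+25; if 3 or more frames have Z exceeding ZT, the end point is moved to the last frame that satisfies the condition, Ne; otherwise N2 is taken as the end point.
The original noise estimator therefore also needs a time-varying smoothing coefficient α, a bias compensation factor, and a method for speeding up tracking.
The algorithm may also occasionally attenuate low-energy phonemes, and its tracking time is too long. If the window length is reduced, however, the tracked spectral minimum is not accurate enough, which distorts the speech signal, especially when the speech lasts longer than the window.
Figure (43) shows the power spectrum of the noisy speech and its local minimum.
The noise power spectrum is obtained by first-order recursive smoothing of the noisy speech power spectrum [8]: (436), where α(k, l) is an adaptive smoothing factor controlled by the speech presence probability p(k, l).
This thesis starts from voice activity detection and from continuous adaptive noise estimation algorithms that need no speech detection. Although VAD methods are easy to implement, they track non-stationary noise poorly, so the enhancement algorithm cannot update the noise characteristics in time. Building on the classic algorithms, a fast and effective noise estimation method is also studied.
speech coding,
We can obtain an estimate of the noise power spectrum by tracking the minimum of P(λ, k) over a finite window. The smoothing constant α is chosen experimentally, neither too low nor too high. There are two main issues with the spectral minimum-tracking approach: the existence of a bias in the noise estimate, and the possible overestimation of the noise level caused by an inappropriate choice of the smoothing constant. A more accurate noise estimation algorithm can be developed by deriving a bias factor to compensate for the lower noise values and by incorporating a smoothing constant that is not fixed but varies with time and frequency. The noise estimation algorithm using minimum statistics (MS) is summarized below [12]. For each frame λ, do the following steps: 1. Compute the short-term periodogram |Y(λ, k)|².
speech coders and many other speech processing systems. In most speech enhancement algorithms it is assumed that an estimate of the noise spectrum is available.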
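As a minimal sketch of the minimum-tracking idea described above, the following Python smooths each periodogram bin recursively and takes the minimum of P(λ, k) over a finite window. It is not the full MS algorithm of [12], which adds bias compensation and a smoothing constant that varies with time and frequency. The function name `track_noise_minimum`, the first-order recursion with a fixed `alpha`, and the `window` length are illustrative assumptions, not values from the text.

```python
from collections import deque


def track_noise_minimum(periodograms, alpha=0.85, window=8):
    """Estimate the noise power per frequency bin by minimum tracking.

    periodograms: list of frames, each a list of |Y(lambda, k)|^2 values.
    alpha: fixed smoothing constant (assumed value, to be tuned experimentally).
    window: number of recent frames searched for the minimum (assumed value).
    Returns one noise estimate (a list over k) per frame.
    """
    if not periodograms:
        return []
    smoothed = list(periodograms[0])   # P starts at the first periodogram
    history = deque(maxlen=window)     # the last `window` smoothed spectra
    estimates = []
    for frame in periodograms:
        # First-order recursive smoothing:
        # P(lambda, k) = alpha * P(lambda - 1, k) + (1 - alpha) * |Y(lambda, k)|^2
        smoothed = [alpha * p + (1.0 - alpha) * y
                    for p, y in zip(smoothed, frame)]
        history.append(smoothed)
        # Noise estimate: bin-by-bin minimum of P over the finite window
        estimates.append([min(column) for column in zip(*history)])
    return estimates


# Toy input: a noise floor of [1.0, 2.0] with a two-frame burst in bin 0.
frames = [[1.0, 2.0]] * 3 + [[9.0, 2.0]] * 2 + [[1.0, 2.0]] * 3
noise = track_noise_minimum(frames, alpha=0.5, window=4)
print(noise[4])  # → [1.0, 2.0]
```

With these toy values the estimate stays at the floor during the burst but rises briefly afterwards (`noise[6][0]` is 2.5), because the smoothed spectrum decays more slowly than the short window slides. This is one way the overestimation mentioned above can appear when the smoothing constant and window length are poorly matched.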
Noise estimation is a critical part of speech enhancement algorithms, whose performance depends on correct estimation of the noise. A simple approach is to estimate the noise spectrum of the signal using a Voice Activity Detector (VAD); another approach is to use noise estimation algorithms that continuously track the noise spectrum. Estimating the noise spectrum even during speech activity is a challenging task, so researchers have developed many noise estimation algorithms, which are explained in the next section.
2. Voice Activity Detection
A simple approach is to estimate and update the noise spectrum during the silent segments of the signal using a Voice Activity Detector (VAD). The process of discriminating between voice activity (speech presence) and silence (speech absence) is called voice activity detection. VAD algorithms typically extract some type of feature (e.g. short-time energy, zero crossings, etc.) from the input signal and compare it against a threshold value, usually determined during a speech-absent period. Generally the output of a VAD algorithm is a binary decision on a frame-by-frame basis, with a frame duration of 20-30 msec. A segment of speech is declared to contain voice activity (VAD = '1') if the measured value exceeds a predetermined threshold; otherwise it is declared noise (VAD = '0'). Figure 1 shows VAD decisions. Several VAD algorithms have been proposed based on various types of features extracted from the signal. Noise estimation can have a major impact on the quality and intelligibility of the speech signal.
Figure 1. VAD decisions [3]
The early VAD algorithms were based on energy levels and zero crossings [4], cepstral features [4], the Itakura LPC spectral distance measure, and periodicity measures [2]. Some VAD algorithms are used in the GSM system [3], cellular networks [3], and digital cordless telephone systems [3].
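The frame-by-frame decision described above can be sketched as follows. This is a simplified illustration under stated assumptions, not any of the cited VAD algorithms: it uses short-time energy as the only feature, tracks a scalar noise energy rather than a full noise spectrum, and the names `frame_energy` and `energy_vad` as well as the parameters `noise_frames`, `factor`, and `beta` are hypothetical choices made for the example.

```python
def frame_energy(frame):
    """Short-time energy: mean of the squared samples in one frame."""
    return sum(x * x for x in frame) / len(frame)


def energy_vad(frames, noise_frames=3, factor=3.0, beta=0.9):
    """Binary VAD per frame, with a noise update during silent frames.

    Assumptions (illustrative only): the first `noise_frames` frames are
    speech-absent, the threshold is `factor` times the current noise
    energy, and `beta` smooths the noise update.
    Returns (decisions, final_noise_energy).
    """
    if not frames:
        return [], 0.0
    head = frames[:noise_frames]
    noise = sum(frame_energy(f) for f in head) / len(head)
    decisions = []
    for f in frames:
        energy = frame_energy(f)
        if energy > factor * noise:
            decisions.append(1)   # VAD = 1: speech present, noise estimate frozen
        else:
            decisions.append(0)   # VAD = 0: speech absent, noise estimate updated
            noise = beta * noise + (1.0 - beta) * energy
    return decisions, noise


# Toy input: 160-sample frames (20 ms at an 8 kHz sampling rate).
quiet = [0.01, -0.01] * 80
loud = [0.5, -0.5] * 80
decisions, noise = energy_vad([quiet] * 4 + [loud] * 3 + [quiet] * 2)
print(decisions)  # → [0, 0, 0, 0, 1, 1, 1, 0, 0]
```

Because the noise estimate is frozen whenever the decision is 1, a detector of this kind cannot follow noise that changes while speech is active.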
VAD algorithms are suitable for discontinuous transmission in voice communication systems, as they can be used to save the battery life of cellular phones. The majority of VAD algorithms encounter problems in low-SNR conditions, particularly when the noise is nonstationary [1, 2]. Having an accurate VAD algorithm in a nonstationary environment might not be sufficient in speech enhancement applications, as an accurate noise estimate is required at all times, even during speech activity. Noise estimation algorithms continuously track the noise spectrum and are therefore better suited to speech enhancement applications in nonstationary scenarios.
3. Classes of Noise Estimation Algorithms
There are three classes of noise estimation algorithms: minimal-tracking algorithms, time-recursive algorithms, and histogram-based algorithms. All algorithms operate in the following fashion. First the signal is analyzed using short-time spectra computed from short over