[Figure panels: "parameter & SNR for 0 dB", "parameter & SNR for 5 dB"]

A comparison with the parameter values for the wave atom transform shows similar results. Therefore, the same threshold parameters, 3 and [...], are chosen for hard and soft thresholding, respectively. The wave atom speech enhancement algorithm is shown in the block diagram below and is summarized as follows.

1. Determine the threshold using the wave atom coefficients at the beginning part, i.e., the silence region, of the noisy speech signal.
2. Compute the discrete wave atom transform of the noisy speech.
3. Apply the thresholding method.
4. Compute the inverse wave atom transform to obtain the enhanced speech.

[Block diagram blocks: Noisy speech, Wave atom transform, Threshold calculation, Thresholding, Inverse wave atom transform, Enhanced speech]
Fig. Block diagram of the speech enhancement algorithm

4. EXPERIMENTS AND DISCUSSIONS

In this section, we evaluate the performance of the wave atom algorithm using simulated signals.

Experimental Conditions

Six English sentences spoken by three male and three female speakers are used for the experiments. The speech signals are sampled at 8000 Hz with 16 bits/sample. To generate noisy speech, additive Gaussian noise was added to the clean speech signals at global SNRs from -5 dB to 15 dB. For the quantitative evaluation, we measured the average SNR and PESQ of the noisy and enhanced speech. PESQ (Perceptual Evaluation of Speech Quality) is a family of standards comprising a test methodology for automated assessment of speech quality. To compare results with the wave atom transform, the wavelet transform with a Daubechies (D8) filter was also used to obtain enhanced speech. The conditions of the experiment are summarized in the table below.
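The thresholding step and the noisy-speech generation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the wave atom transform itself is omitted, the functions operate on a plain list of coefficients or samples, the threshold value is supplied by the caller, and all function names (`hard_threshold`, `soft_threshold`, `add_noise_at_snr`, `snr_db`) are hypothetical.

```python
import math
import random


def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; set the rest to zero."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero small coefficients and shrink the others toward zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)


def add_noise_at_snr(clean, target_db, rng):
    """Add white Gaussian noise scaled so the global SNR equals target_db."""
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    # Scale the noise so that power(clean) / power(scaled noise) = 10^(target_db / 10).
    scale = math.sqrt(power(clean) / (power(noise) * 10 ** (target_db / 10)))
    return [s + scale * n for s, n in zip(clean, noise)]


def snr_db(clean, estimate):
    """Global SNR in dB of an estimate relative to the clean reference."""
    err = [e - s for s, e in zip(clean, estimate)]
    return 10 * math.log10(power(clean) / power(err))


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumption: a 1-second 440 Hz tone at 8000 Hz stands in for a speech signal.
    clean = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
    for target in (-5, 0, 5, 10, 15):
        noisy = add_noise_at_snr(clean, target, rng)
        print(target, round(snr_db(clean, noisy), 3))
```

In a full pipeline, `hard_threshold` or `soft_threshold` would be applied to the transform coefficients of the noisy signal, and `snr_db` would be evaluated on the signal reconstructed by the inverse transform.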
Table Conditions of experiment

Source 1          Clean speech signal ("man1", "man2", "man3", "woman1", "woman2", "woman3")
Source 2          Noise ("white Gaussian")
Sampling rate     8 kHz
Threshold         ?????? = ?? ???
Types of result   SNR and PESQ
SNR value         -5 dB, 0 dB, 5 dB, 10 dB, 15 dB
Parameter ??      Hard thresholding: 3, Soft thresholding:

Results and Discussions

We enhanced the noisy signals using both the wave atom transform and the wavelet transform, and computed the average output SNR of the results for male and female speakers. The waveforms of the six clean sources are shown first. As an example, the waveforms of the noisy signals for woman3 are then shown. Input SNRs of -5 dB, 0 dB, 5 dB, 10 dB and 15 dB were used in this experiment.

[Figure panels: clean signal man1, man2, man3, woman1, woman2, woman3; x-axis: sample, y-axis: amplitude]
Fig. Waveform of clean speech (a) man1 (b) man2 (c) man3 (d) woman1 (e) woman2 (f) woman3

[Figure panels: noisy signal for woman3 at -5 dB, 0 dB, 5 dB, 10 dB, 15 dB; x-axis: sample, y-axis: amplitude]
Fig. Waveform of noisy speech for woman3 (a) -5 dB (b) 0 dB (c) 5 dB (d) 10 dB (e) 15 dB

First, the SNRs of the enhanced speech with hard thresholding are given in the following two tables, where m and f denote male and female speakers, respectively. The corresponding figure shows the comparison of SNR and PESQ for hard thresholding; the wavelet transform performs worse than the wave atom transform.

Table Comparison of SNR with hard thresholding for wave atom transform

SNRin (dB) | SNRout (dB): m1  m2  m3  AVEmale  f1  f2  f3  AVEfemale  AVEall
-5
0
5
10
15

Table Comparison of SNR with hard thresholding for wavelet transform

SNRin (dB) | SNRout (dB): m1  m2  m3  AVEmale  f1  f2  f3  AVEfemale  AVEall
-5