Appendix I  Related Programs

close all
clear
clc
% The file name is missing in the source and is left empty here.
% wavread is the call used in the source; newer MATLAB releases use audioread.
[x, fs] = wavread('');
% 24 mel filters, FFT length 256, sampling rate 8000 Hz.
% The fh argument is missing in the source; 0.5 (half the sample rate) is assumed here.
bank = mel(24, 256, fs, 0, 0.5, 'm');
% Normalize the mel filterbank coefficients
bank = full(bank);
bank = bank / max(bank(:));
% DCT coefficients, 12 x 24
for k = 1:12
    n = 0:23;
    dctcoef(k,:) = cos((2*n+1)*k*pi/(2*24));
end
% Normalized cepstral liftering window
w = 1 + 6 * sin(pi * [1:12] ./ 12);
w = w / max(w);
% Pre-emphasis filter
% The coefficient is missing in the source; -0.9375, a common choice, is assumed here.
xx = double(x);
xx = filter([1 -0.9375], 1, xx);
% Split the speech signal into frames
xx = enframe(xx, 256, 80);   % 256 samples per frame, frame shift 80
% Compute the MFCC parameters of each frame
for i = 1:size(xx,1)
    y = xx(i,:);
    s = y' .* hamming(256);
    t = abs(fft(s));         % fast Fourier transform
    t = t.^2;
    c1 = dctcoef * log(bank * t(1:129));
    c2 = c1 .* w';
    m(i,:) = c2';
end
% Compute the difference (delta) coefficients
dtm = zeros(size(m));
for i = 3:size(m,1)-2
    dtm(i,:) = -2*m(i-2,:) - m(i-1,:) + m(i+1,:) + 2*m(i+2,:);
end
dtm = dtm / 3;
% Concatenate the MFCC and first-order delta MFCC parameters
ccc = [m dtm];
% Drop the first two and last two frames, whose first-order delta parameters are 0
ccc = ccc(3:size(m,1)-2,:);
subplot(211)
ccc_1 = ccc(:,1);
plot(1:length(ccc_1), ccc_1);
title('MFCC');
axis([1 length(ccc_1) min(ccc_1) max(ccc_1)]);
xlabel('Sample point');
ylabel('Amplitude');
title('MFCC feature parameters')
[h, w] = size(ccc);
A = size(ccc);
subplot(2,1,2)
plot([1,w], A);
xlabel('Dimension');
ylabel('Amplitude difference');
title('Dimension vs. amplitude difference');

function f=enframe(x,win,inc)
%ENFRAME split signal up into (overlapping) frames: one per row. F=ENFRAME(X,WIN,INC)
%
%   F = ENFRAME(X,LEN) splits the vector X(:) up into
%   frames. Each frame is of length LEN and occupies
%   one row of the output matrix.
%   The last few frames of X
%   will be ignored if its length is not divisible by LEN.
%   It is an error if X is shorter than LEN.
%
%   F = ENFRAME(X,LEN,INC) has frames beginning at increments of INC
%   The centre of frame I is X((I-1)*INC+(LEN+1)/2) for I=1,2,...
%   The number of frames is fix((length(X)-LEN+INC)/INC)
%
%   F = ENFRAME(X,WINDOW) or ENFRAME(X,WINDOW,INC) multiplies
%   each frame by WINDOW(:)

%   Copyright (C) Mike Brookes 1997
%   Version: $Id: ,v 2006/06/22 19:07:50 dmb Exp $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page:
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
%   See the GNU General Public License for more details.
%
%   You can obtain a copy of the GNU General Public License from
%   ftp://
%   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

nx=length(x(:));
nwin=length(win);
if (nwin == 1)
    len = win;
else
    len = nwin;
end
if (nargin < 3)
    inc = len;
end
nf = fix((nx-len+inc)/inc);
f=zeros(nf,len);
indf= inc*(0:(nf-1)).';
inds = (1:len);
f(:) = x(indf(:,ones(1,len))+inds(ones(nf,1),:));
if (nwin > 1)
    w = win(:)';
    f = f .* w(ones(nf,1),:);
end

function [x,mn,mx]=mel(p,n,fs,fl,fh,w)
%MELBANKM determine matrix for a mel-spaced filterbank [X,MN,MX]=MELBANKM(P,N,FS,FL,FH,W)
%
% Inputs:   p   number of filters in filterbank
%           n   length of fft
%           fs  sample rate in Hz
%           fl  low end of the lowest filter as a fraction of fs (default = 0)
%           fh  high end of highest filter as a fraction of fs (default = )
%           w   any sensible combination of the following:
%                 't'  triangular shaped filters in mel domain (default)
%                 'n'  hanning shaped filters in mel domain
%                 'm'  hamming shaped filters in mel domain
%
%                 'z'  highest and lowest filters taper down to zero (default)
%                 'y'  lowest filter remains at 1 down to 0 frequency and
%                      highest filter remains at 1 up to nyquist frequency
%
%               If 'ty' or 'ny' is specified, the total power in the fft is preserved.
%
% Outputs:  x   a sparse matrix containing the filterbank amplitudes
%               If x is the only output argument then size(x)=[p,1+floor(n/2)]
%               otherwise size(x)=[p,mx-mn+1]
%           mn  the lowest fft bin with a non-zero coefficient
%           mx  the highest fft bin with a non-zero coefficient
%
% Usage:    f=fft(s);                   f=fft(s);
%           x=melbankm(p,n,fs);         [x,na,nb]=melbankm(p,n,fs);
%           n2=1+floor(n/2);            z=log(x*(f(na:nb)).*conj(f(na:nb)));
%           z=log(x*abs(f(1:n2)).^2);
%           c=dct(z); c(1)=[];
%
% To plot filterbanks:
%           plot(melbankm(20,256,8000)')
%
%   Copyright (C) Mike Brookes 1997
%   Version: $Id: ,v 2005/02/21 15:22:13 dmb Exp $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page:
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%   GNU General Public L