[Main Text]
Speech recognition can be viewed as a pattern recognition task that includes training and recognition. Generally, the speech signal can be treated as a time sequence and characterized by the powerful hidden Markov model (HMM). Through feature extraction, the speech signal is transformed into feature vectors that act as observations. In the training procedure, these observations are used to estimate the model parameters of the HMM. These parameters include the probability density functions of the observations in their corresponding states, the transition probabilities between states, and so on. After parameter estimation, the trained models can be used for the recognition task: the input observations are decoded into the recognized words, and the accuracy can be evaluated.

3. Theory and method
Extraction of speaker-independent features from the speech signal is the fundamental problem of a speaker recognition system. The standard methodology for solving this problem uses Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC). Both methods are linear procedures based on the assumption that speaker features have properties caused by the vocal tract resonances. These features form the basic spectral structure of the speech signal. However, the nonlinear information in speech signals is not easily extracted by these feature extraction methodologies, so we use the fractal dimension to measure nonlinear speech turbulence. This paper investigates and implements a speaker identification system using both the traditional LPCC and a nonlinear multiscale fractal dimension feature extraction.

3.3 Improved feature extraction method
Considering the respective advantages of LPCC and the fractal dimension in representing the speech signal, we mix both into one feature vector: the fractal dimension captures the self-similarity, periodicity and randomness of the speech waveform, while the LPCC feature preserves speech quality and yields a high identification rate. Owing to the obvious advantages of the ANN, such as nonlinearity, self-adaptability, robustness and self-learning, its good classification and input-output mapping abilities are well suited to the speech recognition problem. Because the number of ANN input nodes is fixed, time regularization is applied to the feature parameters before they are fed to the neural network [9]. In our experiments, the LPCC and fractal dimension of each sample are passed through the time-regularization network separately: the LPCC is regularized to 4 frames of data (LPCC1, LPCC2, LPCC3, LPCC4, each frame being a 14-dimensional parameter), and the fractal dimension is regularized to 12 frames of data (FD1, FD2, ..., FD12, each frame being a 1-dimensional parameter), so that the feature vector of each sample has 4*14 + 1*12 = 68 dimensions; the first 56 dimensions are LPCC and the remaining 12 dimensions are fractal dimensions (a short sketch of this assembly follows Section 4 below). Such a mixed feature parameter therefore expresses both the linear and the nonlinear characteristics of speech.

4. Architectures and Features of ASR
ASR is a cutting-edge technology that allows a computer or even a handheld PDA (Myers, 2000) to identify words that are read aloud or spoken into any sound-recording device. The ultimate purpose of ASR technology is to allow 100% accuracy with all words that are intelligibly spoken by any person regardless of vocabulary size, background noise, or speaker variables (CSLU, 2002). However, most ASR engineers admit that the current accuracy level for a large vocabulary unit of speech (e.g., the sentence) remains less than 90%.
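As the sketch promised in Section 3.3, the fragment below illustrates the feature assembly: a per-frame fractal dimension estimate and the concatenation of 4 x 14 LPCC values with 12 fractal dimensions into the 68-dimensional vector fed to the neural network. Katz's waveform estimator is used here only as one common fractal dimension estimator; the text does not state which estimator the authors actually use, and all function and variable names are illustrative.

#include <math.h>
#include <stddef.h>

#define LPCC_FRAMES  4      /* regularized LPCC frames                */
#define LPCC_ORDER   14     /* coefficients per LPCC frame            */
#define FD_FRAMES    12     /* regularized fractal-dimension frames   */
#define FEAT_DIM     (LPCC_FRAMES * LPCC_ORDER + FD_FRAMES)   /* = 68 */

/* Fractal dimension of one speech frame using Katz's waveform method
 * (one common estimator; not necessarily the one used in the paper). */
static double katz_fd(const short *frame, size_t n)
{
    double length = 0.0, dmax = 0.0;
    if (n < 2)
        return 1.0;
    for (size_t i = 1; i < n; ++i) {
        double step = fabs((double)frame[i] - (double)frame[i - 1]);
        double dist = fabs((double)frame[i] - (double)frame[0]);
        length += step;
        if (dist > dmax)
            dmax = dist;
    }
    if (length <= 0.0 || dmax <= 0.0)
        return 1.0;                       /* flat frame: dimension 1  */
    double logn  = log((double)(n - 1));
    double denom = logn + log(dmax / length);
    return (denom != 0.0) ? logn / denom : 1.0;
}

/* Concatenate 4 x 14 LPCC values and 12 fractal dimensions into the
 * 68-dimensional vector: LPCC first (56 dims), fractal dims last (12). */
void build_feature_vector(const double lpcc[LPCC_FRAMES][LPCC_ORDER],
                          const double fd[FD_FRAMES],
                          double out[FEAT_DIM])
{
    size_t k = 0;
    for (size_t f = 0; f < LPCC_FRAMES; ++f)
        for (size_t c = 0; c < LPCC_ORDER; ++c)
            out[k++] = lpcc[f][c];
    for (size_t f = 0; f < FD_FRAMES; ++f)
        out[k++] = fd[f];                 /* k ends at 68             */
}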
Speech recognition is an important part of human-machine interface design and a very important application of speech signal processing; it is gradually becoming a key technology for human-computer interaction in information technology. The purpose of speech recognition is to let the robot understand, through speech, what a person means, so that it executes the corresponding command and completes the corresponding action. With the rapid development of science and technology and people's rising standard of living, robots have gradually entered every area of people's lives, replacing or assisting people in dangerous or tedious work and helping the disabled. China is progressing rapidly in the field of voice robots; one example is the Sunplus intelligent speech-recognition robot, which uses the Sunplus single-chip SPCE061A as its core to retrofit a toy robot from the market so that the modified robot gains speech recognition capability and can be controlled by voice commands, which is exactly what this project studies. In addition, a speaker-dependent speech recognition function is added so that the robot is controlled by spoken commands, making it intelligent; this realizes the robot's intelligence and advancement and raises the speed of speech information processing, so the robot can respond to speech quickly.

Chapter 2 Hardware System Design
Introduction to the SPCE061A. The SPCE061A is a 16-bit microcontroller developed by Sunplus Technology of Taiwan. Besides its high integration, reliable performance and low price, it provides functions aimed at speech signal processing in both its A/D and D/A converters; these conversion interfaces work together with its u'nSP core. The SPCE061A minimal system includes the basic peripheral modules around the chip: the crystal oscillator input module (OSC), the phase-locked loop peripheral circuit (PLL), the reset circuit (RESET), the indicator LEDs, and so on.
Connection of the robot's motor and power wiring to the driver board. The row of solder holes on the driver board is the interface between the motors and the driver board; the five motors have ten wires in total, and the function of each wire can be identified by its color and position, as described below. The launch motor runs at a relatively high speed; its rotation drives the turntable, which launches the flying disc by friction. The rotation directions of the launch motor and the propulsion motor are fixed.

Chapter 3 System Software Design
Program analysis. The main function calls the relevant functions to train the speaker-dependent speech; after training succeeds, speech recognition is performed and the corresponding operation is executed according to the recognized command. Feature extraction extracts the feature parameters that reflect the essence of the speech and forms a feature vector sequence. Step 3: switch on the robot's power and carry out voice training, training the following fifteen commands in order: "name", "start", "ready", "dance", "one more song", "start", "walk forward", "back up", "turn right", "turn left", "ready", "aim left", "aim right", "fire", "rapid fire".
The BSR_DeleteSDGroup function deletes all the feature models in RAM and frees the required space. [Return value] None. [Parameter] This parameter defines the voice input source: voice input through the MIC, or an analog voltage input through LINE_IN. When this function is enabled, IOA0 and IOA1 output a square wave whose level toggles every 16 ms. Two methods are used to operate the I/O ports throughout the function; one uses pointers to manipulate the I/O ports directly, which is cumbersome when individual port bits are modified frequently but convenient for ordinary operations.
In the software design of this system, speech playback is written as a speech playback module that can be called conveniently. The playback program consists of two parts: playback flow control and the interrupt playback service routine.
[Figure: SACM_S480 automatic playback flowchart - start, call the playback initialization function, call the playback preparation function, clear the watchdog, call the playback system service function, check whether playback has finished (Y/N), stop playback, return.]
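To tie the program analysis above together, the following sketch shows one way the main program could be organized on the SPCE061A: delete any old feature models, train the fifteen speaker-dependent commands in order, then poll the recognizer and dispatch the recognized command to the motor-control code. Only BSR_DeleteSDGroup and the MIC/LINE_IN input-source parameter are named in the text; BSR_Train, BSR_InitRecognizer, BSR_GetResult, the header name and the command IDs are typical names from the Sunplus speaker-dependent BSR library and should be read as assumptions, not as this thesis's actual code.

#include "bsrsd.h"                    /* assumed Sunplus BSR library header */

#define CMD_NAME       1              /* "name" (1st trained command, illustrative ID) */
#define CMD_FORWARD    7              /* "walk forward" (7th command)                  */
#define CMD_BACKWARD   8              /* "back up" (8th command)                       */
/* ... IDs for the remaining commands up to 15 are omitted ...                         */

/* Train one speaker-dependent command, repeating until training succeeds. */
static void train_command(int cmd_id)
{
    while (BSR_Train(cmd_id, BSR_TRAIN_TWICE) != 0)   /* assumed signature and return code */
        ;                                             /* the real firmware plays a prompt and retries */
}

int main(void)
{
    int result;

    BSR_DeleteSDGroup(0);             /* delete all feature models in RAM (argument value assumed) */

    train_command(CMD_NAME);          /* step 3: train the 15 commands in the listed order */
    train_command(CMD_FORWARD);
    train_command(CMD_BACKWARD);
    /* ... train the remaining commands ... */

    BSR_InitRecognizer(BSR_MIC);      /* recognize from the MIC input (constant name assumed) */

    while (1) {
        /* the real main loop also clears the watchdog here */
        result = BSR_GetResult();     /* poll for a recognized command */
        if (result == CMD_FORWARD) {
            /* drive the leg motors forward via the driver board */
        } else if (result == CMD_BACKWARD) {
            /* drive the leg motors backward */
        }
        /* ... dispatch the other recognized commands to the motor-control code ... */
    }
    return 0;
}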