Speech Signal Processing Technology and Its Applications (Communication Engineering Sample Thesis)

Published: 2015-01-24 Source: 人大经济论坛 (Renda Economics Forum)
Communication Engineering Thesis

Contents

1 Introduction
  1.1 Overview of speech signal processing technology
    1.1.1 Speech coding
    1.1.2 Speech synthesis
    1.1.3 Speech recognition
    1.1.4 Speech enhancement
  1.2 Research background
  1.3 Main content of this thesis
2 Speech signal analysis
  2.1 Digital model of the speech signal
    2.1.1 Excitation model
    2.1.2 Vocal tract model
    2.1.3 Radiation model
  2.2 Time-domain analysis of the speech signal
  2.3 Frequency-domain analysis of the speech signal
    2.3.1 Short-time Fourier transform
    2.3.2 Inverse short-time Fourier transform
3 Speech recognition theory
  3.1 History of speech recognition
  3.2 Definition of speech recognition
  3.3 Classification of speech recognition
    3.3.1 By recognizer type
    3.3.2 By the recognizer's adaptation to the user
    3.3.3 By vocabulary size
  3.4 Basic principles of speech recognition
  3.5 Characteristics of speech recognition
  3.6 Difficulties of speech recognition technology
4 Implementation of a speech recognition system
  4.1 A typical speech recognition system
    4.1.1 Preprocessing
    4.1.2 Feature extraction
    4.1.3 Training
    4.1.4 Recognition
    4.1.5 Post-processing
  4.2 Speech recognition techniques
    4.2.1 Dynamic time warping (DTW)
    4.2.2 Hidden Markov models (HMM)
    4.2.3 Vector quantization (VQ)
    4.2.4 Artificial neural networks (ANN)
    4.2.5 Hybrid pattern recognition techniques
  4.3 Performance evaluation criteria for speech recognition systems
  4.4 Implementation of an isolated-word recognition system
    4.4.1 Overall system software flowchart
    4.4.2 Speech signal acquisition
    4.4.3 Preprocessing
    4.4.4 Extraction of feature parameters
    4.4.5 Template training
    4.4.6 Pattern matching and recognition
5 System simulation
  5.1 Speech acquisition
  5.2 Endpoint detection
  5.3 Recognition tests
    5.3.1 DTW algorithm test
    5.3.2 HMM-based algorithm test
  5.4 Analysis and summary
6 Applications
7 Conclusion and outlook
References
Acknowledgements

Abstract

Speech signal processing is an emerging discipline that studies how digital signal processing techniques can be applied to speech signals. Its applications are very broad, and its main technologies include speech coding, speech synthesis, speech recognition, and speech enhancement. This thesis focuses on one of them, speech recognition.

Speech recognition means enabling a computer to understand human speech and respond correctly. The current mainstream speech recognition technology is based on the theory of statistical pattern recognition.

The thesis first gives an overview of speech signal processing, covering the various processing technologies, their development, and their applications. It then introduces speech recognition. Following the basic structural model of a speech recognition system, it describes the digital speech signal processing principles and methods involved in each stage, from preprocessing and endpoint detection through template matching. The emphasis is on the principles, structure, and implementation algorithms of an isolated-word recognition system. Finally, the system is simulated on the MATLAB platform.

Keywords: endpoint detection, feature extraction, dynamic time warping, hidden Markov model
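The abstract names dynamic time warping (DTW) as one of the template-matching techniques used for isolated-word recognition. The thesis code itself is not reproduced on this page, and its simulation was done in MATLAB, so the following is only a minimal Python sketch of the general idea under stated assumptions: each utterance is already a sequence of feature vectors, frames are compared with Euclidean distance, and the step pattern is the basic symmetric one. The names `dtw_distance`, `recognize`, `templates`, and `utterance` are hypothetical and chosen for illustration.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: frames are compared with Euclidean distance and the
    allowed steps are (i-1, j), (i, j-1) and (i-1, j-1).
    """
    n, m = len(a), len(b)
    # cost[i][j] is the minimum accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Toy one-dimensional "features" (hypothetical data, not from the thesis).
templates = {
    "yes": [(0.0,), (1.0,), (2.0,), (1.0,)],
    "no": [(2.0,), (2.0,), (0.0,), (0.0,)],
}
# A slower rendition of "yes": the same frame values, some held for longer.
utterance = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,), (1.0,)]

print(recognize(utterance, templates))  # prints "yes"
```

Because DTW may repeat frames on either side, the time-stretched utterance aligns with the "yes" template at zero cost even though the two sequences differ in length. That tolerance to speaking rate is why the technique suits small-vocabulary, isolated-word systems.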