CN105719650A - Speech recognition method and system - Google Patents
- Publication number
- CN105719650A (application CN201610065010.2A)
- Authority
- CN
- China
- Prior art keywords
- recognition module
- speech data
- module
- command word
- speech
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
- G10L15/26—Speech to text systems
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a speech recognition method and system that aim to overcome the shortcoming that existing speech recognition systems cannot be applied to intelligent hardware on a large scale. The method comprises the following steps: speech data is obtained; a command word recognition module attempts to recognize the speech data; if the command word recognition module recognizes the speech data, its recognition result is output, and if not, the speech data is input to a dictation recognition module; the dictation recognition module then recognizes the input speech data and obtains the final speech data result. Because command word recognition is attempted first and dictation recognition is carried out only when command word recognition fails, the scale at which a speech recognition system can be applied to intelligent hardware is expanded to a certain extent.
Description
Technical field
The present invention relates to the field of speech recognition, and in particular to a speech recognition method and system.
Background art
Speech recognition technology enables a machine to convert a speech signal into the corresponding text or command through a process of recognition and understanding. Existing speech recognition systems use either dictation recognition or command word recognition, and both techniques have shortcomings. Dictation recognition places relatively high demands on computer hardware and on the communication network, and its response time is long. Command word recognition does not require a network, but the content it can recognize is limited, so it cannot meet needs that involve recognizing a large amount of content. As a result, speech recognition cannot yet be applied on a large scale on current intelligent hardware.
Summary of the invention
In order to overcome the shortcoming that prior-art speech recognition systems cannot be applied to intelligent hardware on a large scale, the object of the invention is to provide a speech recognition method and system that facilitate the large-scale application of speech recognition systems.
To solve the above problem, the technical solution adopted by the invention is as follows. A speech recognition method is provided, comprising the following steps:
S101: obtain speech data;
S102: recognize the speech data with a command word recognition module; if the command word recognition module recognizes the speech data, output the speech data result recognized by the command word recognition module; if not, input the speech data to a dictation recognition module;
S103: recognize, with the dictation recognition module, the speech data input to the dictation recognition module, and obtain the final speech data result.
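The control flow of steps S101 to S103 can be sketched as follows. This is only an illustrative sketch, not the implementation disclosed here: the function name `recognize`, the type aliases, and the convention that a recognizer returns `None` when it fails to recognize the input are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence

# Hypothetical types: speech data as a sequence of samples, and a recognizer
# that returns the recognized text, or None when it cannot recognize the input.
Samples = Sequence[float]
Recognizer = Callable[[Samples], Optional[str]]


def recognize(
    speech: Samples,
    command_recognizer: Recognizer,
    dictation_recognizer: Recognizer,
) -> Optional[str]:
    """Try command word recognition first (S102); fall back to dictation (S103)."""
    result = command_recognizer(speech)
    if result is not None:
        # The command word recognizer produced a result, so it is output directly.
        return result
    # Otherwise the same speech data is handed to the dictation recognizer.
    return dictation_recognizer(speech)
```

In this arrangement the dictation recognizer is invoked only when the command word recognizer returns no result, which mirrors the "if not, input to the dictation recognition module" branch of step S102.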
Preferably, step S102 comprises the following steps:
building a waveform library from the command words;
comparing the waveform of the obtained speech data with the waveforms in the waveform library; if a matching waveform exists, outputting the speech data result recognized by the command word recognition module; if not, inputting the speech data to the dictation recognition module.
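The text does not state how the waveform of the speech data is compared with the stored reference waveforms of the command words. As one possible reading, the sketch below uses normalized correlation against each stored template and accepts the best match above a threshold. The similarity measure, the threshold value of 0.9, and all function names are assumptions for illustration; a practical system would also need to handle differences in timing and loudness, for example through time alignment.

```python
import math
from typing import Dict, List, Optional, Sequence


def normalized_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two waveforms, truncated to the shorter length."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    a, b = a[:n], b[:n]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def build_waveform_library(recordings: Dict[str, Sequence[float]]) -> Dict[str, List[float]]:
    """Store one reference waveform per command word (hypothetical layout)."""
    return {word: list(wave) for word, wave in recordings.items()}


def match_command(
    wave: Sequence[float],
    library: Dict[str, List[float]],
    threshold: float = 0.9,  # assumed acceptance threshold
) -> Optional[str]:
    """Return the best-matching command word, or None if nothing is close enough."""
    best_word, best_score = None, threshold
    for word, template in library.items():
        score = normalized_correlation(wave, template)
        if score >= best_score:
            best_word, best_score = word, score
    return best_word
```

A return value of `None` corresponds to the "if not" branch, in which the speech data is passed on to the dictation recognition module.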
Preferably, step S103 comprises the following steps:
extracting feature information from the speech data input to the dictation recognition module;
processing the feature information with a hidden Markov model to obtain the final speech data result.
Preferably, the feature information is MFCC or PLP.
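The text names MFCC as one option for the feature information but does not describe its computation. As background only, MFCC extraction places triangular filters at frequencies that are equally spaced on the mel scale. The sketch below shows one widely used mel-scale convention, mel = 2595 · log10(1 + f/700), and derives filter center frequencies from it. The function names and the choice of this particular convention are assumptions for the example, and the remaining MFCC stages (framing, spectrum, log filter-bank energies, cosine transform) are not shown.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (one common convention)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(num_filters: int, low_hz: float, high_hz: float) -> List[float]:
    """Center frequencies (Hz) of filters spaced evenly on the mel scale."""
    low, high = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high - low) / (num_filters + 1)
    return [mel_to_hz(low + step * (i + 1)) for i in range(num_filters)]
```

Because the spacing is uniform in mels, the resulting centers are packed more densely at low frequencies and spread out at high frequencies.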
A speech recognition system is also provided, characterized in that it includes an acquisition module, a command word recognition module and a dictation recognition module, the command word recognition module being connected to the acquisition module and the dictation recognition module being connected to the command word recognition module; wherein,
the acquisition module is used to obtain speech data;
the command word recognition module is used to recognize the speech data; if the command word recognition module recognizes the speech data, the speech data result recognized by the command word recognition module is output; if not, the speech data is input to the dictation recognition module;
the dictation recognition module is used to recognize the speech data input by the command word recognition module and to obtain the final speech data result.
Preferably, the command word recognition module includes a building module and a comparison module. The building module is used to build a waveform library from the command words. The comparison module is used to compare the waveform of the obtained speech data with the waveforms in the waveform library; if a matching waveform exists, the speech data result recognized by the command word recognition module is output; if not, the speech data is input to the dictation recognition module.
Preferably, the dictation recognition module includes an extraction module and a model module. The extraction module is used to extract feature information from the speech data input to the dictation recognition module, and the model module is used to process the feature information with a hidden Markov model to obtain the final speech data result.
Preferably, the dictation recognition module is an HTK speech recognition module.
Compared with the prior art, the beneficial effects of the invention are as follows:
With this speech recognition method and system, command word recognition is performed first after speech is input. If command word recognition produces a result, that result is output; if not, dictation recognition is performed and its result is output. Speech recognition can therefore still achieve relatively high recognition accuracy without requiring an overly high hardware configuration, without depending on the network, and without being restricted in the content it can recognize. At the same time, the scale at which speech recognition systems are applied on intelligent hardware is expanded to a certain extent.
Brief description of the drawings
Fig. 1 is a flow chart of a speech recognition method according to an embodiment of the invention;
Fig. 2 is a module structure diagram of a speech recognition system according to an embodiment of the invention.
Reference numerals in the figures:
1001, acquisition module; 1002, command word recognition module; 1003, dictation recognition module.
Detailed description of the invention
The invention is described in further detail below with reference to the drawings and specific embodiments.
Referring to Fig. 1, which shows a flow chart of a speech recognition method according to an embodiment of the invention, the speech recognition method comprises the following steps:
S101: obtain speech data;
S102: recognize the speech data with a command word recognition module; if the command word recognition module recognizes the speech data, output the speech data result recognized by the command word recognition module; if not, input the speech data to a dictation recognition module;
Specifically, step S102 comprises the following steps:
building a waveform library from the command words;
comparing the waveform of the obtained speech data with the waveforms in the waveform library; if a matching waveform exists, outputting the speech data result recognized by the command word recognition module; if not, inputting the speech data to the dictation recognition module.
S103: recognize, with the dictation recognition module, the speech data input to the dictation recognition module, and obtain the final speech data result.
Specifically, step S103 comprises the following steps:
extracting feature information from the speech data input to the dictation recognition module;
processing the feature information with a hidden Markov model to obtain the final speech data result.
Optionally, the feature information may be MFCC (Mel-Frequency Cepstral Coefficients) or PLP (Perceptual Linear Prediction coefficients).
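How the hidden Markov model turns the feature information into a result is not detailed. One standard way to use hidden Markov models for recognition is to keep one model per candidate word, score the feature sequence against each model with the forward algorithm, and choose the highest-scoring word. The sketch below illustrates that idea under simplifying assumptions: the features have already been quantized to discrete symbol indices, whereas practical recognizers typically model continuous features, and the class and function names are hypothetical.

```python
from typing import Dict, List, NamedTuple, Sequence


class DiscreteHMM(NamedTuple):
    """A hidden Markov model over discrete observation symbols."""
    initial: List[float]           # initial[s]: probability of starting in state s
    transition: List[List[float]]  # transition[p][s]: probability of moving p -> s
    emission: List[List[float]]    # emission[s][o]: probability of symbol o in state s


def forward_likelihood(hmm: DiscreteHMM, observations: Sequence[int]) -> float:
    """Probability of the observation sequence under the model (forward algorithm)."""
    if not observations:
        return 1.0
    n = len(hmm.initial)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [hmm.initial[s] * hmm.emission[s][observations[0]] for s in range(n)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * hmm.transition[p][s] for p in range(n)) * hmm.emission[s][obs]
            for s in range(n)
        ]
    return sum(alpha)


def most_likely_word(models: Dict[str, DiscreteHMM], observations: Sequence[int]) -> str:
    """Pick the word whose model assigns the observations the highest likelihood."""
    return max(models, key=lambda word: forward_likelihood(models[word], observations))
```

For long sequences the products of probabilities underflow, so an actual implementation would normally work with log probabilities or scaled forward variables.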
An embodiment of the invention also provides a speech recognition system, which includes an acquisition module 1001, a command word recognition module 1002 and a dictation recognition module 1003. The command word recognition module 1002 is connected to the acquisition module 1001, and the dictation recognition module 1003 is connected to the command word recognition module 1002; wherein,
the acquisition module 1001 is used to obtain speech data;
the command word recognition module 1002 is used to recognize the speech data; if the command word recognition module 1002 recognizes the speech data, the speech data result recognized by the command word recognition module 1002 is output; if not, the speech data is input to the dictation recognition module 1003;
the dictation recognition module 1003 is used to recognize the speech data input by the command word recognition module 1002 and to obtain the final speech data result.
The command word recognition module 1002 includes a building module and a comparison module. The building module is used to build a waveform library from the command words. The comparison module is used to compare the waveform of the obtained speech data with the waveforms in the waveform library; if a matching waveform exists, the speech data result recognized by the command word recognition module 1002 is output; if not, the speech data is input to the dictation recognition module 1003.
The dictation recognition module 1003 includes an extraction module and a model module. The extraction module is used to extract feature information from the speech data input to the dictation recognition module 1003; the model module is used to process the feature information with a hidden Markov model to obtain the final speech data result.
Preferably, the dictation recognition module 1003 is an HTK speech recognition module.
Compared with the prior art, the invention has the following advantages:
With this speech recognition method and system, command word recognition is performed first after speech is input. If command word recognition produces a result, that result is output; if not, dictation recognition is performed and its result is output. Speech recognition can therefore still achieve relatively high recognition accuracy without requiring an overly high hardware configuration, without depending on the network, and without being restricted in the content it can recognize. At the same time, the scale at which speech recognition systems are applied on intelligent hardware is expanded to a certain extent.
The above embodiment is only a preferred embodiment of the invention and does not limit the scope of protection of the invention. Any insubstantial change or substitution made by those skilled in the art on the basis of the invention falls within the scope of protection claimed by the invention.
Claims (8)
1. A speech recognition method, characterized in that it comprises the following steps:
S101: obtain speech data;
S102: recognize the speech data with a command word recognition module; if the command word recognition module recognizes the speech data, output the speech data result recognized by the command word recognition module; if not, input the speech data to a dictation recognition module;
S103: recognize, with the dictation recognition module, the speech data input to the dictation recognition module, and obtain the final speech data result.
2. The speech recognition method as claimed in claim 1, characterized in that step S102 comprises the following steps:
building a waveform library from the command words;
comparing the waveform of the obtained speech data with the waveforms in the waveform library; if a matching waveform exists, outputting the speech data result recognized by the command word recognition module; if not, inputting the speech data to the dictation recognition module.
3. The speech recognition method as claimed in claim 1, characterized in that step S103 comprises the following steps:
extracting feature information from the speech data input to the dictation recognition module;
processing the feature information with a hidden Markov model to obtain the final speech data result.
4. The speech recognition method as claimed in claim 3, characterized in that the feature information is MFCC or PLP.
5. A speech recognition system, characterized in that it includes an acquisition module, a command word recognition module and a dictation recognition module, the command word recognition module being connected to the acquisition module and the dictation recognition module being connected to the command word recognition module; wherein,
the acquisition module is used to obtain speech data;
the command word recognition module is used to recognize the speech data; if the command word recognition module recognizes the speech data, the speech data result recognized by the command word recognition module is output; if not, the speech data is input to the dictation recognition module;
the dictation recognition module is used to recognize the speech data input by the command word recognition module and to obtain the final speech data result.
6. The speech recognition system as claimed in claim 5, characterized in that the command word recognition module includes a building module and a comparison module, the building module being used to build a waveform library from the command words, and the comparison module being used to compare the waveform of the obtained speech data with the waveforms in the waveform library; if a matching waveform exists, the speech data result recognized by the command word recognition module is output; if not, the speech data is input to the dictation recognition module.
7. The speech recognition system as claimed in claim 5, characterized in that the dictation recognition module includes an extraction module and a model module, the extraction module being used to extract feature information from the speech data input to the dictation recognition module, and the model module being used to process the feature information with a hidden Markov model to obtain the final speech data result.
8. The speech recognition system as claimed in claim 5, characterized in that the dictation recognition module is an HTK speech recognition module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610065010.2A CN105719650A (en) | 2016-01-30 | 2016-01-30 | Speech recognition method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610065010.2A CN105719650A (en) | 2016-01-30 | 2016-01-30 | Speech recognition method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105719650A true CN105719650A (en) | 2016-06-29 |
Family
ID=56154485
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610065010.2A Pending CN105719650A (en) | 2016-01-30 | 2016-01-30 | Speech recognition method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105719650A (en) |
Patent Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1692406A (en) * | 2003-02-03 | 2005-11-02 | 三菱电机株式会社 | Vehicle mounted controller |
CN2634587Y (en) * | 2003-03-06 | 2004-08-18 | 深圳市和而泰电子科技有限公司 | Sound controlled washing machine controller |
CN1537663A (en) * | 2003-10-23 | 2004-10-20 | 天威科技股份有限公司 | Speech identification interdynamic type doll |
CN101192925A (en) * | 2006-11-20 | 2008-06-04 | 华为技术有限公司 | Speaker validation method and system and media resource control entity and processing entity |
CN104160372A (en) * | 2012-02-24 | 2014-11-19 | 三星电子株式会社 | Method and apparatus for controlling lock/unlock state of terminal through voice recognition |
CN102723081A (en) * | 2012-05-30 | 2012-10-10 | 林其灿 | Voice signal processing method, voice and voiceprint recognition method and device |
CN102841772A (en) * | 2012-08-06 | 2012-12-26 | 四川长虹电器股份有限公司 | Method of displaying files through voice control intelligent terminal |
CN202838947U (en) * | 2012-08-20 | 2013-03-27 | 上海闻通信息科技有限公司 | Voice remote controller |
CN103714816A (en) * | 2012-09-28 | 2014-04-09 | 三星电子株式会社 | Electronic appratus, server and control method thereof |
CN102968992A (en) * | 2012-11-26 | 2013-03-13 | 北京奇虎科技有限公司 | Voice identification processing method for internet explorer and internet explorer |
CN103475551A (en) * | 2013-09-11 | 2013-12-25 | 厦门狄耐克电子科技有限公司 | Intelligent home system based on voice recognition |
CN104269016A (en) * | 2014-09-22 | 2015-01-07 | 北京奇艺世纪科技有限公司 | Alarm method and device |
CN104575504A (en) * | 2014-12-24 | 2015-04-29 | 上海师范大学 | Method for personalized television voice wake-up by voiceprint and voice identification |
CN104732590A (en) * | 2015-03-09 | 2015-06-24 | 北京工业大学 | Sign language animation synthesis method |
CN105120048A (en) * | 2015-07-21 | 2015-12-02 | 广东欧珀移动通信有限公司 | Method and system for recording call voice |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106371801A (en) * | 2016-09-23 | 2017-02-01 | 安徽声讯信息技术有限公司 | Voice mouse system based on voice recognition technology |
CN106653013A (en) * | 2016-09-30 | 2017-05-10 | 北京奇虎科技有限公司 | Speech recognition method and device |
CN106653013B (en) * | 2016-09-30 | 2019-12-20 | 北京奇虎科技有限公司 | Voice recognition method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103700370B (en) | A kind of radio and television speech recognition system method and system | |
CN106448663A (en) | Voice wakeup method and voice interaction device | |
CN108074576A (en) | Inquest the speaker role's separation method and system under scene | |
CN110097870B (en) | Voice processing method, device, equipment and storage medium | |
CN101923857A (en) | Extensible audio recognition method based on man-machine interaction | |
CN102915731A (en) | Method and device for recognizing personalized speeches | |
CN104538034A (en) | Voice recognition method and system | |
CN105931637A (en) | User-defined instruction recognition speech photographing system | |
CN105225665A (en) | A kind of audio recognition method and speech recognition equipment | |
CN110706707B (en) | Method, apparatus, device and computer-readable storage medium for voice interaction | |
CN113674746B (en) | Man-machine interaction method, device, equipment and storage medium | |
CN109215634A (en) | A kind of method and its system of more word voice control on-off systems | |
CN110246496A (en) | Audio recognition method, system, computer equipment and storage medium | |
EP4226363A1 (en) | Adapting hotword recognition based on personalized negatives | |
CN111862943B (en) | Speech recognition method and device, electronic equipment and storage medium | |
CN105719650A (en) | Speech recognition method and system | |
US20040193416A1 (en) | System and method for speech recognition utilizing a merged dictionary | |
CN114267342A (en) | Recognition model training method, recognition method, electronic device and storage medium | |
CN113611316A (en) | Man-machine interaction method, device, equipment and storage medium | |
CN111477226A (en) | Control method, intelligent device and storage medium | |
CN114399992B (en) | Voice instruction response method, device and storage medium | |
CN102592592A (en) | Voice data extraction method and device | |
CN114155845A (en) | Service determination method and device, electronic equipment and storage medium | |
CN114121022A (en) | Voice wake-up method and device, electronic equipment and storage medium | |
Sawakare et al. | Speech recognition techniques: a review |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20160629 |