CN103811000A - Voice recognition system and voice recognition method - Google Patents

Voice recognition system and voice recognition method

Info

Publication number
CN103811000A
Authority
CN
China
Prior art keywords
audio
grammar
database
information
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410062780.2A
Other languages
Chinese (zh)
Inventor
蔡中军
贾春晖
周京蕙
王翀
郑潜
余代员
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Shenzhen Co Ltd
Original Assignee
China Mobile Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Shenzhen Co Ltd filed Critical China Mobile Shenzhen Co Ltd
Priority to CN201410062780.2A priority Critical patent/CN103811000A/en
Publication of CN103811000A publication Critical patent/CN103811000A/en
Pending legal-status Critical Current

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention relates to a voice recognition system and a voice recognition method. The voice recognition method includes the following steps: S1, acquiring audio information to be recognized and region information corresponding to the audio information; S2, calling a speech database and a grammar database corresponding to the region information, and calling the corresponding grammar file in the grammar database; and S3, recognizing the audio information according to the grammar file and the speech database. The voice recognition system and the voice recognition method improve the recognition rate for multiple timbres (regional accents) and large vocabularies.

Description

Speech recognition system and method
Technical field
The present invention relates to the field of speech recognition, and in particular to a speech recognition system and method.
Background technology
Existing speech recognition methods mainly include dynamic time warping (DTW), vector quantization (VQ), hidden Markov models (HMM) and artificial neural networks (ANN).
Dynamic time warping (DTW) is an early pattern matching and model training technique. By applying dynamic programming, it successfully solves the difficult problem of comparing speech feature parameter sequences of unequal duration, and it achieves good performance in isolated-word speech recognition.
Vector quantization (VQ) extracts feature vectors from the training speech to obtain a feature vector set and generates a codebook with the LBG algorithm. During recognition, a feature vector sequence is extracted from the test speech and matched against each codebook; the average quantization error is computed for each codebook, and the codebook with the minimum average quantization error is selected as the recognized speech.
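The following minimal sketch illustrates the codebook-matching step just described; it is added here only for illustration, is not part of the claimed invention, and assumes array shapes and function names of its own.

```python
import numpy as np

def average_quantization_error(features: np.ndarray, codebook: np.ndarray) -> float:
    """Mean distance from each feature vector (features: (T, D)) to its nearest
    codeword in the codebook (codebook: (K, D))."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=-1)
    return float(dists.min(axis=1).mean())

def classify_by_codebook(features: np.ndarray, codebooks: dict) -> str:
    """Select the label of the codebook with the minimum average quantization error."""
    return min(codebooks, key=lambda label: average_quantization_error(features, codebooks[label]))
```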
A hidden Markov model (HMM) is a parametric representation of the time-varying characteristics of a speech signal. It describes the statistical properties of the signal through two interrelated stochastic processes: one is a hidden (unobservable) Markov chain with a finite number of states, and the other is the observable stochastic process of the observation vectors associated with each state of the Markov chain. The characteristics of the hidden Markov chain are revealed through the observable signal characteristics. In this way, the features of each segment of the time-varying speech signal are described by the stochastic process of the observation symbols of the corresponding state, while the variation of the signal over time is described by the transition probabilities of the hidden Markov chain. The model parameters comprise the HMM topology, the state transition probabilities and a set of random functions describing the statistical properties of the observation symbols. According to the characteristics of the random functions, HMMs can be divided into discrete, continuous and semi-continuous hidden Markov models.
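For concreteness, the likelihood computation of a discrete HMM with the parameters named above (transition probabilities, observation probabilities, initial distribution) can be sketched with the standard forward algorithm; this is a textbook routine given only as background, not the patent's implementation.

```python
import numpy as np

def forward_likelihood(A: np.ndarray, B: np.ndarray, pi: np.ndarray, obs: list) -> float:
    """Forward-algorithm likelihood P(obs | model) for a discrete HMM.
    A: (N, N) state transition probabilities; B: (N, M) observation symbol
    probabilities per state; pi: (N,) initial state distribution;
    obs: sequence of observation symbol indices."""
    alpha = pi * B[:, obs[0]]            # initialize with the first observation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # propagate through the hidden Markov chain
    return float(alpha.sum())
```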
The application of artificial neural networks to speech recognition is another current research focus. An ANN is essentially an adaptive nonlinear dynamical system that simulates the principles of human neural activity and has capabilities of self-learning, association, comparison, reasoning and abstraction.
The mainstream speech recognition methods above all have shortcomings. The main one is that when the same vocabulary is spoken with the accents of different regions, the timbre changes to some extent, which can greatly reduce the speech recognition rate.
Summary of the invention
To overcome the defect in existing speech recognition technology that the recognition rate decreases when the timbre changes, a speech recognition system and method are provided.
The technical solution adopted by the present invention is to provide a speech recognition method comprising the following steps (an illustrative sketch of the flow follows the list):
S1: collecting audio information to be recognized and region information corresponding to the audio information;
S2: calling, according to the region information, the speech database and the grammar database corresponding to the region information, and calling the corresponding grammar file in the grammar database;
S3: recognizing the audio information according to the grammar file and the speech database.
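The sketch below shows one hypothetical end-to-end arrangement of steps S1-S3; the callables are placeholders supplied by the caller and are not an API defined by the invention.

```python
def recognize_audio(server_number: str, collect_audio, lookup_region,
                    load_speech_database, load_grammar_file, recognize):
    """Hypothetical S1-S3 flow; every helper is a caller-supplied placeholder."""
    audio = collect_audio(server_number)          # S1: audio information to be recognized
    region = lookup_region(server_number)         # S1: region information for that audio
    speech_db = load_speech_database(region)      # S2: speech database for the region
    grammar = load_grammar_file(region)           # S2: grammar file from the region's grammar database
    return recognize(audio, speech_db, grammar)   # S3: recognition according to grammar file and speech database
```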
In the speech recognition method provided by the invention, step S1 further comprises: collecting business category information corresponding to the audio information;
and calling the corresponding grammar file in the grammar database in step S2 further comprises: calling, in the grammar database and according to the business category information, the grammar file corresponding to the business category information.
In the speech recognition method provided by the invention, a step S0 is further included before step S1: establishing multiple speech databases and multiple grammar databases according to different regions, and generating and saving, in each grammar database, a grammar file for each corresponding business category.
In the speech recognition method provided by the invention, collecting the region information corresponding to the audio information in step S1 comprises: querying the number of the server that sends the audio information, and querying and extracting, according to the number, the region information corresponding to the audio information.
In the speech recognition method provided by the invention, step S3 further comprises: setting a keyword or word; starting timing when recognition of the audio information begins; stopping timing when the keyword or word is recognized; and outputting the recognition time.
The present invention also provides a speech recognition system, comprising:
an acquisition module, comprising a first collecting unit for collecting audio information to be recognized and a second collecting unit for collecting region information corresponding to the audio information;
a scheduler module, for selecting and calling, according to the region information, the speech database and the grammar database corresponding to the region, and calling the corresponding grammar file in the grammar database;
an identification module, for recognizing the audio information according to the grammar file and the speech database.
In the speech recognition system provided by the invention, the acquisition module further comprises a third collecting unit for collecting business category information corresponding to the audio information, and the scheduler module is also used for calling, in the grammar database and according to the business category information, the grammar file corresponding to the business category information.
In the speech recognition system provided by the invention, the speech recognition system further comprises a memory module for storing the multiple speech databases and the multiple grammar databases established according to different regions, a grammar file being generated in each grammar database for each corresponding business category.
In the speech recognition system provided by the invention, the second collecting unit comprises a first inquiry subunit for querying the number of the server that sends the audio information and a first extraction subunit for querying and extracting, according to the number, the region information corresponding to the audio information.
In the speech recognition system provided by the invention, a timing module and a setting module are further included. The setting module is used for setting a keyword or word; the timing module starts timing when the identification module begins to recognize the audio information, and stops timing and outputs the recognition time when the identification module recognizes the keyword or word.
Compared with the prior art, the speech recognition system and method provided by the invention have the following beneficial effects. When recognizing the audio information to be recognized, the speech database and the grammar database corresponding to the region information of the audio information are called, which prevents the recognition rate for the same vocabulary from dropping because of different regional accents. In addition, because the business categories of different regions differ, calling the grammar database corresponding to the business situation of each region improves the efficiency of speech recognition. The invention therefore has the beneficial effect of improving the recognition rate across timbres.
Accompanying drawing explanation
The invention is further described below with reference to the accompanying drawings and embodiments, in which:
Fig. 1 is a schematic block diagram of the speech recognition system in the first embodiment of the invention;
Fig. 2 is a schematic block diagram of the second collecting unit in the embodiment shown in Fig. 1;
Fig. 3 is a schematic block diagram of the speech recognition system in the second embodiment of the invention;
Fig. 4 is a schematic block diagram of the speech recognition system in the third embodiment of the invention;
Fig. 5 is a schematic block diagram of the speech recognition system in the fifth embodiment of the invention;
Fig. 6 is a flow chart of the speech recognition method in the first embodiment of the invention;
Fig. 7 is a flow chart of the speech recognition method in the second embodiment of the invention;
Fig. 8 is a flow chart of the speech recognition method in the third embodiment of the invention.
Embodiment
To overcome the defect in the prior art that a change of timbre greatly reduces the speech recognition rate, the innovation of the present invention is to provide a dedicated speech database and grammar database for the different timbres of different regions, so as to improve the speech recognition rate.
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is further elaborated below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the invention, not to limit it.
Fig. 1 shows the speech recognition system in the first embodiment of the invention. It is mainly used for service monitoring: a monitoring server automatically imitates a real user sending instructions to an operator's server and recognizes the audio information with which the operator's server replies, in order to judge whether the service level meets expectations. The speech recognition system comprises an acquisition module 1, a scheduler module 2 and an identification module 3. The acquisition module 1 comprises a first collecting unit 11 for collecting the audio information to be recognized and a second collecting unit 12 for collecting the region information corresponding to this audio information. The scheduler module 2 selects and calls, according to the region information, the speech database and the grammar database corresponding to the region, and calls the corresponding grammar file in the grammar database. The identification module 3 recognizes the audio information according to the called grammar file and speech database and outputs the recognition result. The acquisition module 1, the scheduler module 2 and the identification module 3 are connected in sequence.
The speech recognition system is realized by installing a program on the monitoring server. The CPU of the monitoring server executes the software program implementing these functions and thereby realizes the functions of the acquisition module 1, the scheduler module 2 and the identification module 3.
The monitoring server simulates a real user sending an instruction to the operator's server, and the operator's server automatically replies with audio information. The first collecting unit 11 of the acquisition module 1 installed on the monitoring server collects this audio information to be recognized over the communication link. The second collecting unit 12 of the acquisition module 1 obtains the region information corresponding to this audio information and sends the collected region information to the scheduler module 2. The scheduler module 2 calls, according to this region information, the speech database and the grammar database corresponding to the region information, and calls the corresponding grammar file in the grammar database. The identification module 3 recognizes the audio information to be recognized according to the called speech database and grammar file, and outputs the recognition result.
As shown in Fig. 2, the second collecting unit 12 may comprise a first inquiry subunit for querying the number of the operator's server and a first extraction subunit for querying and extracting, according to this number, the region information of the operator's server. The scheduler module 2 then schedules the speech database and the grammar database corresponding to this region information.
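One hypothetical form of such a number-to-region lookup is sketched below; the prefix table is invented purely for illustration and is not taken from the patent.

```python
# Hypothetical lookup: map the prefix of the operator server's number to a region.
REGION_BY_PREFIX = {"0755": "Guangdong", "010": "Beijing", "021": "Shanghai"}

def lookup_region(server_number: str, default: str = "unknown") -> str:
    """Return the region whose prefix matches the queried server number."""
    for prefix, region in REGION_BY_PREFIX.items():
        if server_number.startswith(prefix):
            return region
    return default
```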
Understandably, the second collecting unit 12 may also obtain the region information corresponding to the audio information to be recognized by other means.
In this embodiment, the identification module 3 uses a template-matching HMM method for speech recognition. The identification module 3 comprises a feature extraction unit, an acoustic recognition unit and a speech recognition unit. The feature extraction unit performs feature extraction on the speech waveform corresponding to the collected audio information to obtain acoustic speech features. Traditional speech feature extraction algorithms may be used, for example extraction of MFCCs (Mel-frequency cepstral coefficients), LPC (linear predictive coding coefficients), speech energy, and so on. The acoustic recognition unit compares, one by one, the feature quantities of the acoustic models in the speech database called by the scheduler module 2 with the feature quantities extracted by the feature extraction unit, to obtain the phone string corresponding to this audio information.
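As an illustration of the feature-extraction step, a minimal sketch using the librosa toolkit is given below; the patent does not prescribe a specific toolkit, so the choice of library and parameters here is an assumption.

```python
import librosa
import numpy as np

def extract_features(wav_path: str, n_mfcc: int = 13) -> np.ndarray:
    """Extract an MFCC feature vector sequence from a speech waveform file."""
    y, sr = librosa.load(wav_path, sr=None)                   # speech waveform and its sampling rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)    # Mel-frequency cepstral coefficients
    return mfcc.T                                             # shape (frames, n_mfcc)
```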
An acoustic model is data obtained by modeling what kind of feature quantities speech can produce. Because each region pronounces the same word differently, the acoustic models also differ. Multiple speech databases are therefore established according to the differences between regions, and each speech database contains acoustic models established according to the pronunciation habits of its region, which improves speech recognition accuracy and recognition speed. The speech recognition unit recognizes the phone string according to the grammar file called by the scheduler module 2 and outputs the resulting word string. Because the business situation of each region differs, establishing the corresponding grammar database according to the business situation of each region can improve the speech recognition rate.
For the same business in the same region, the speech vocabulary is essentially fixed. Therefore, a small-scope grammar file can be written for each business. A grammar file is a data model that combines certain words and phrases according to specific rules and constrains the output of the speech recognition unit. Therefore, as shown in Fig. 3, in the second embodiment, on the basis of the first embodiment, the acquisition module 1 may further comprise a third collecting unit 13 for collecting the business category information corresponding to the audio information. The scheduler module 2 first calls, according to the region information, the grammar database corresponding to the region information, and then calls, in this grammar database and according to the business category information, the grammar file corresponding to this business category. The grammar file generated for a business category mainly contains the key vocabulary and phrases related to this business, which improves recognition efficiency.
For example, region B has four kinds of business: b1, b2, b3 and b4. A grammar database is established for region B according to these four kinds of business and contains four corresponding grammar files. Each grammar file contains only the common grammar vocabulary related to its kind of business and the phone strings corresponding to this common grammar vocabulary. When the recognition unit recognizes such a phone string, it outputs the word string corresponding to this phone string as the recognition result.
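The sketch below shows one hypothetical in-memory layout of such a grammar database for region B; the business labels, vocabulary entries and phone strings are invented placeholders, not data from the patent.

```python
# Hypothetical grammar database for region B: one small grammar file per business
# type (b1..b4), each mapping common vocabulary to an illustrative phone string.
GRAMMAR_DB_REGION_B = {
    "b1": {"check balance": "ch eh k . b ae l ax n s"},
    "b2": {"data plan": "d ey t ax . p l ae n"},
    "b3": {"broadband repair": "b r ao d b ae n d . r ix p eh r"},
    "b4": {"roaming service": "r ow m ih ng . s er v ih s"},
}

def load_grammar_file(grammar_db: dict, business_type: str) -> dict:
    """Return the grammar file (vocabulary -> phone string) for one business type."""
    return grammar_db[business_type]

def match_phone_string(grammar_file: dict, phone_string: str):
    """Output the word string whose phone string matches the recognized phone string."""
    for word_string, phones in grammar_file.items():
        if phones == phone_string:
            return word_string
    return None
```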
In this embodiment, the detailed process by which the scheduler module 2 schedules the speech database and the grammar database is as follows. Obtain the province information corresponding to the audio information. Obtain the current speech database information: the region information corresponding to the current speech database, the name of the current speech database and the path of the current speech database. Obtain the information of the speech database to be used for recognition: its province information, its name and its path. Compare the current speech database information with the information of the speech database to be used. If they are inconsistent, switch the backup speech database of the corresponding province with the current speech database and return the grammar file of the province corresponding to the current speech database. If they are consistent, directly return the grammar file of the province corresponding to the current speech database.
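A minimal sketch of this comparison-and-switch logic follows; the dictionary keys, the rename-based switching and the returned path are assumptions used only to illustrate the described flow, not the patented scheduler itself.

```python
import os

def schedule_database(current_db: dict, target_db: dict, current_path: str) -> str:
    """Compare the province of the currently loaded speech database with the
    province required by the audio to be recognized; switch databases by renaming
    when they differ, then return the grammar database path for that province.
    current_db/target_db carry 'province', 'backup_path' and 'grammar_db_path'."""
    if current_db["province"] != target_db["province"]:
        # rename the current database to the backup database of its own province
        os.rename(current_path, current_db["backup_path"])
        # rename the target province's backup database to become the current database
        os.rename(target_db["backup_path"], current_path)
    # return the grammar database of the province now corresponding to the current database
    return target_db["grammar_db_path"]
```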
As shown in Fig. 4, in the third embodiment, on the basis of the second embodiment, the speech recognition system provided by the invention further comprises a memory module 4. The memory module 4 stores the multiple speech databases and the multiple grammar databases generated according to different regions. The scheduler module 2 calls the speech database and the grammar database from this memory module 4.
In the fourth embodiment, on the basis of the third embodiment, the speech recognition system provided by the invention further comprises a repair module. When the scheduler module 2 fails to schedule a speech database, the speech database is repaired automatically. In practice, there are many reasons why the scheduler module 2 may fail to schedule a speech database. Speech database switching can be carried out by renaming. If a rename is wrong, the automatic repair corrects the wrongly renamed speech database; if the speech database has been deleted manually, the automatic repair procedure finds the backup speech database under the backup path and copies it to the existing speech database path. The speech database may also be in an occupied state, in which case the scheduler module 2 waits until the occupied speech database is released before calling it again. It is also possible that the memory module 4 contains no speech database corresponding to the region, in which case a notification that this speech database is missing is sent. The repair module thus has the function of automatically repairing speech databases. Each speech database has an identical backup speech database; when calling the speech database of a region fails, the system switches to calling the corresponding backup speech database.
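The sketch below illustrates, under assumed paths and an assumed caller-supplied occupancy check, the automatic-repair behaviour described above (restore from the backup speech database, wait while the database is occupied, report a missing database); it is not the patented repair module itself.

```python
import os
import shutil
import time

def repair_speech_database(db_path: str, backup_path: str, is_occupied, retries: int = 3) -> bool:
    """Hypothetical automatic repair: restore a missing (or wrongly renamed)
    speech database from its backup copy, or wait while it is still occupied.
    `is_occupied` is a caller-supplied check; all names are illustrative."""
    if not os.path.exists(db_path):
        if os.path.exists(backup_path):
            shutil.copytree(backup_path, db_path)   # copy the backup speech database into place
            return True
        print("no speech database or backup available for:", db_path)  # report the missing database
        return False
    for _ in range(retries):                        # database exists but may be occupied
        if not is_occupied(db_path):
            return True
        time.sleep(1.0)                             # wait for the occupied database to be released
    return False
```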
Understandably, on the basis of the above embodiments, in the fifth embodiment, as shown in Fig. 5, the speech recognition system provided by the invention further comprises a timing module 5 and a setting module 6. The setting module 6 can set the recognition sensitivity, the language to be recognized, and so on. The setting module 6 can also set a keyword or word. The timing module 5 starts timing when the identification module 3 begins recognition, stops timing when the identification module 3 recognizes the keyword or word preset by the setting module 6, and outputs the time the identification module 3 used to recognize the keyword or word. This time can be used to judge whether the recognition efficiency of the identification module 3 meets expectations.
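A minimal sketch of this keyword-timing behaviour is given below; it assumes, purely for illustration, a recognizer that yields recognized words one at a time.

```python
import time

def timed_recognition(recognized_words, keywords) -> float | None:
    """Start the clock when recognition begins and stop it when a preset keyword
    or word is recognized, returning the elapsed recognition time.
    `recognized_words` is assumed to be an iterable of recognized words."""
    start = time.monotonic()
    for word in recognized_words:
        if word in keywords:
            return time.monotonic() - start   # recognition time used to judge efficiency
    return None                               # keyword never recognized
```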
As shown in Fig. 6, the present invention also provides a speech recognition method. In the first embodiment, this method comprises the following steps:
S1: collect the audio information to be recognized and the region information corresponding to the audio information. In this step, the monitoring server simulates a real person entering an instruction and sends it to the operator's server, and the operator's server automatically replies with audio information according to this instruction. The first collecting unit 11 of the acquisition module 1 installed in the monitoring server collects this audio information over the communication link. The second collecting unit 12 of the acquisition module 1 can query the number of the operator's server that sends the audio information, and then query and extract, according to this number, the region information corresponding to the operator's server, that is, the region information corresponding to the audio information.
S2: select, according to the region information, the speech database and the grammar database corresponding to the region, and call the grammar file in the grammar database. In this step, the scheduler module 2 calls, according to the region information collected by the second collecting unit 12, the speech database and the grammar database corresponding to the region information in the memory module 4, and further schedules the grammar file in this grammar database.
S3: the identification module 3 recognizes the audio information according to the scheduled grammar file and speech database, and outputs the recognition result.
On the basis of the first embodiment of this speech recognition method, as shown in Fig. 7, in the second embodiment step S1 may further include the following step: the third collecting unit 13 of the acquisition module 1 collects the business category information corresponding to the audio information. Correspondingly, in step S2, calling the grammar file in the grammar database specifically means that the scheduler module 2 calls, in the grammar database and according to the business category information, the grammar file corresponding to this business category information. In this step, the recognition sensitivity of the identification module may be set in advance according to the business category to be recognized. A keyword or word may also be set; the timing module 5 starts timing when the identification module 3 begins recognition, stops timing when the identification module 3 recognizes the keyword or word preset by the setting module 6, and outputs the time the identification module 3 used to recognize the keyword or word. This time can be used to judge whether the recognition efficiency of the identification module 3 meets expectations.
As shown in Fig. 8, on the basis of the second embodiment, in the third embodiment this speech recognition method may further comprise step S0: establish multiple speech databases and multiple grammar databases according to different regions and save them in the memory module 4, multiple grammar files being generated in each grammar database according to the different business categories.
In the fourth embodiment, the region information is province information, and the speech recognition method comprises the following steps:
S1: the acquisition module 1 collects the path and the title of the audio information to be recognized, the province information corresponding to this audio information, the business information corresponding to this audio information and the keywords related to this business, and transmits this collected information to the scheduler module 2.
S2: the scheduler module 2 reads the province, the title and the path of the current speech database, and compares the region information of the current speech database with the province information corresponding to the audio information to be recognized.
If the scheduler module 2 judges that the province information corresponding to the current speech database is the same as the province information corresponding to the audio information to be recognized, no speech database switching is needed, and the path and the title of the grammar database corresponding to the province information of the audio information to be recognized are returned directly.
If the scheduler module 2 judges that the province information of the current speech database differs from the incoming province, the scheduler module 2 performs a speech database switch: it renames the current speech database to the backup speech database of its corresponding province, renames the speech database corresponding to the audio information to be recognized to the current speech database, and finally returns the grammar database of the province corresponding to the current speech database. The grammar file in this grammar database is then called according to the business category information.
S3: if the scheduler module 2 fails to schedule the speech database and the grammar database, the recognition process exits directly, and automatic retrieval and automatic repair of the speech database are carried out. If the scheduling succeeds, the identification module 3 is configured with parameters such as the required recognition sensitivity, accuracy and recognition language.
A recognition context is created for the recognition process of the identification module 3; it records the information produced by the recognition process, including success information, error information, exception information and so on.
A recognition audio stream is created in the identification module 3, the format parameters of the audio stream, such as the sampling frequency in hertz, are set, and the audio information to be recognized is bound to the recognition audio stream. The identification module 3 is activated, loads the speech database and the grammar file, imports the recognition audio stream and starts the recognition process. After the identification module 3 is activated, it waits for the dedicated Windows user-defined recognition message to be triggered. When this message is triggered, if its parameter indicates the end of recognition, the identification module exits the recognition process; if the parameter carries recognized content, the recognized content is extracted and saved in the recognition result; if the parameter indicates a recognition error, it is handled according to the concrete error type.
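The branching on the parameter of the user-defined recognition message can be sketched as follows; this plain Python model only illustrates the dispatch logic and does not use the actual Windows message API.

```python
# Illustrative message kinds; the constants and payload shapes are assumptions.
END_OF_RECOGNITION, RECOGNIZED_CONTENT, RECOGNITION_ERROR = range(3)

def handle_recognition_message(kind: int, payload, results: list) -> bool:
    """Return False when the recognition process should exit."""
    if kind == END_OF_RECOGNITION:
        return False                             # end of recognition: exit the recognition process
    if kind == RECOGNIZED_CONTENT:
        results.append(payload)                  # extract and save the recognized content
    elif kind == RECOGNITION_ERROR:
        print("recognition error:", payload)     # handle according to the concrete error type
    return True
```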
In this embodiment, other user-defined prompt messages can also be created, ensuring that they do not conflict with existing prompt messages, and the message-handling function for specific recognition messages and the recognition message type of the recognizer can be set.
In summary, because the speech recognition system and method provided by the invention provide a speech database and a grammar database for each region, calling the corresponding speech database and grammar database during recognition improves the speech recognition rate and recognition speed. In addition, a dedicated grammar file is provided in the grammar database for each business; calling the corresponding grammar file according to the business category being recognized can further improve the speech recognition rate and recognition speed.
It should be understood that embodiments of the invention have been described above with reference to the accompanying drawings, but the invention is not limited to the above embodiments. The above embodiments are only illustrative rather than restrictive. Under the inspiration of the present invention, and without departing from the spirit of the invention and the scope protected by the claims, those of ordinary skill in the art can also make many other forms, all of which fall within the protection of the invention.

Claims (10)

1. A speech recognition method, characterized by comprising the following steps:
S1: collecting audio information to be recognized and region information corresponding to the audio information;
S2: calling, according to the region information, the speech database and the grammar database corresponding to the region information, and calling the corresponding grammar file in the grammar database;
S3: recognizing the audio information according to the grammar file and the speech database.
2. The speech recognition method according to claim 1, characterized in that step S1 further comprises: collecting business category information corresponding to the audio information;
and calling the corresponding grammar file in the grammar database in step S2 further comprises: calling, in the grammar database and according to the business category information, the grammar file corresponding to the business category information.
3. The speech recognition method according to claim 2, characterized in that a step S0 is further included before step S1: establishing multiple speech databases and multiple grammar databases according to different regions, and generating and saving, in each grammar database, a grammar file for each corresponding business category.
4. The speech recognition method according to any one of claims 1 to 3, characterized in that collecting the region information corresponding to the audio information in step S1 comprises: querying the number of the server that sends the audio information, and querying and extracting, according to the number, the region information corresponding to the audio information.
5. The speech recognition method according to any one of claims 1 to 3, characterized in that step S3 further comprises: setting a keyword or word; starting timing when recognition of the audio information begins; stopping timing when the keyword or word is recognized; and outputting the recognition time.
6. A speech recognition system, characterized by comprising:
an acquisition module (1), comprising a first collecting unit (11) for collecting audio information to be recognized and a second collecting unit (12) for collecting region information corresponding to the audio information;
a scheduler module (2), for selecting and calling, according to the region information, the speech database and the grammar database corresponding to the region, and calling the corresponding grammar file in the grammar database;
an identification module (3), for recognizing the audio information according to the grammar file and the speech database.
7. The speech recognition system according to claim 6, characterized in that the acquisition module (1) further comprises a third collecting unit (13) for collecting business category information corresponding to the audio information, and the scheduler module (2) is also used for calling, in the grammar database and according to the business category information, the grammar file corresponding to the business category information.
8. The speech recognition system according to claim 7, characterized by further comprising a memory module (4) for storing the multiple speech databases and the multiple grammar databases established according to different regions, a grammar file being generated in each grammar database for each corresponding business category.
9. The speech recognition system according to any one of claims 6 to 8, characterized in that the second collecting unit (12) comprises a first inquiry subunit for querying the number of the server that sends the audio information and a first extraction subunit for querying and extracting, according to the number, the region information corresponding to the audio information.
10. The speech recognition system according to any one of claims 6 to 8, characterized by further comprising a timing module (5) and a setting module (6), the setting module (6) being used for setting a keyword or word, and the timing module (5) being used for starting timing when the identification module (3) begins to recognize the audio information, stopping timing when the keyword or word is recognized, and outputting the recognition time.
CN201410062780.2A 2014-02-24 2014-02-24 Voice recognition system and voice recognition method Pending CN103811000A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410062780.2A CN103811000A (en) 2014-02-24 2014-02-24 Voice recognition system and voice recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410062780.2A CN103811000A (en) 2014-02-24 2014-02-24 Voice recognition system and voice recognition method

Publications (1)

Publication Number Publication Date
CN103811000A true CN103811000A (en) 2014-05-21

Family

ID=50707679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410062780.2A Pending CN103811000A (en) 2014-02-24 2014-02-24 Voice recognition system and voice recognition method

Country Status (1)

Country Link
CN (1) CN103811000A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1412741A (en) * 2002-12-13 2003-04-23 郑方 Chinese speech identification method with dialect background
CN101026644A (en) * 2006-02-22 2007-08-29 华为技术有限公司 Communication terminal and method for displaying mobile phone calling initiation place information
US8504370B2 (en) * 2006-10-23 2013-08-06 Sungkyunkwan University Foundation For Corporate Collaboration User-initiative voice service system and method
CN101329868A (en) * 2008-07-31 2008-12-24 林超 Speech recognition optimizing system aiming at locale language use preference and method thereof
CN102968987A (en) * 2012-11-19 2013-03-13 百度在线网络技术(北京)有限公司 Speech recognition method and system
CN103578467A (en) * 2013-10-18 2014-02-12 威盛电子股份有限公司 Acoustic model building method, voice recognition method and electronic device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105206263A (en) * 2015-08-11 2015-12-30 东莞市凡豆信息科技有限公司 Speech and meaning recognition method based on dynamic dictionary
CN105225665A (en) * 2015-10-15 2016-01-06 桂林电子科技大学 A kind of audio recognition method and speech recognition equipment
CN106683662A (en) * 2015-11-10 2017-05-17 中国电信股份有限公司 Speech recognition method and device
CN106128462A (en) * 2016-06-21 2016-11-16 东莞酷派软件技术有限公司 Audio recognition method and system
CN106328146A (en) * 2016-08-22 2017-01-11 广东小天才科技有限公司 Video subtitle generating method and device
CN107958666A (en) * 2017-05-11 2018-04-24 小蚁科技(香港)有限公司 Method for the constant speech recognition of accent
CN108648749A (en) * 2018-05-08 2018-10-12 上海嘉奥信息科技发展有限公司 Medical speech recognition construction method and system based on voice activated control and VR
CN112530440A (en) * 2021-02-08 2021-03-19 浙江浙达能源科技有限公司 Intelligent voice recognition system for power distribution network scheduling tasks based on end-to-end model
CN112530440B (en) * 2021-02-08 2021-05-07 浙江浙达能源科技有限公司 Intelligent voice recognition system for power distribution network scheduling tasks based on end-to-end model

Similar Documents

Publication Publication Date Title
CN103811000A (en) Voice recognition system and voice recognition method
US8914294B2 (en) System and method of providing an automated data-collection in spoken dialog systems
CN104903954B (en) The speaker verification distinguished using the sub- phonetic unit based on artificial neural network and identification
CN102270450B (en) System and method of multi model adaptation and voice recognition
CN106504768B (en) Phone testing audio frequency classification method and device based on artificial intelligence
CN108364662B (en) Voice emotion recognition method and system based on paired identification tasks
CN107808659A (en) Intelligent sound signal type recognition system device
CN108447471A (en) Audio recognition method and speech recognition equipment
CN106683677A (en) Method and device for recognizing voice
CN107731233A (en) A kind of method for recognizing sound-groove based on RNN
CN109036412A (en) voice awakening method and system
CN102982811A (en) Voice endpoint detection method based on real-time decoding
CN112581963B (en) Voice intention recognition method and system
CN110600014B (en) Model training method and device, storage medium and electronic equipment
CN108877769B (en) Method and device for identifying dialect type
CN111933108A (en) Automatic testing method for intelligent voice interaction system of intelligent network terminal
CN110853616A (en) Speech synthesis method, system and storage medium based on neural network
CN106710591A (en) Voice customer service system for power terminal
CN110992959A (en) Voice recognition method and system
CN114783424A (en) Text corpus screening method, device, equipment and storage medium
CN113744727A (en) Model training method, system, terminal device and storage medium
CN114420169B (en) Emotion recognition method and device and robot
CN116189657A (en) Multi-mode voice recognition error correction method and system
CN113990288B (en) Method for automatically generating and deploying voice synthesis model by voice customer service
JP6594273B2 (en) Questioning utterance determination device, method and program thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140521

RJ01 Rejection of invention patent application after publication