CN102074231A - Voice recognition method and system - Google Patents

Voice recognition method and system Download PDF

Info

Publication number
CN102074231A
CN102074231A (application CN2010106143655A / CN201010614365A)
Authority
CN
China
Prior art keywords
information
user
voice
model
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010106143655A
Other languages
Chinese (zh)
Inventor
冯雁
杨永胜
黄石磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
VANYINDA CO Ltd
Original Assignee
VANYINDA CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by VANYINDA CO Ltd filed Critical VANYINDA CO Ltd
Priority to CN2010106143655A priority Critical patent/CN102074231A/en
Publication of CN102074231A publication Critical patent/CN102074231A/en
Pending legal-status Critical Current

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention relates to a voice recognition method comprising the following steps: acquiring voice; performing feature extraction on the acquired voice; acquiring scene information of a user and matching a grammar model or a language model according to the scene information; and performing a pattern matching algorithm according to the matched grammar model or language model to obtain a voice recognition result. The voice recognition method can improve the accuracy of voice recognition. In addition, the invention also provides a voice recognition system.

Description

Speech recognition method and speech recognition system
[Technical Field]
The present invention relates to speech recognition technology, and in particular to a speech recognition method and a speech recognition system.
[Background Art]
Speech recognition converts the vocabulary content of human speech into computer-readable input such as key presses, binary codes or character strings. A traditional speech recognition method collects speech, performs feature extraction on the collected speech (feature extraction means applying linear or nonlinear operations to the speech waveform to obtain a group of vectors), and then applies a pattern matching algorithm that converts the vectors into the pronunciation unit sequence closest to the model, which is in turn converted into a speech recognition result. However, this traditional method performs pattern matching only on the collected speech against fixed acoustic models and language models (or grammar models), so its recognition accuracy is not high.
[Summary of the Invention]
In view of this, it is necessary to provide a speech recognition method capable of improving speech recognition accuracy.
A speech recognition method comprises the following steps:
collecting speech;
performing feature extraction on the collected speech;
obtaining scene information of the user, and matching a grammar model or a language model according to the scene information;
performing a pattern matching algorithm according to the matched grammar model or language model to obtain a speech recognition result.
Preferably, the method further comprises the step of obtaining position information of the user and matching a grammar model or a language model according to the position information.
Preferably, the method further comprises the step of matching a pronunciation dictionary according to the position information and the scene information;
the step of performing a pattern matching algorithm according to the matched grammar model or language model to obtain a speech recognition result is then:
performing a pattern matching algorithm according to the matched grammar model, language model and pronunciation dictionary to obtain a speech recognition result.
Preferably, the position information is a geographic position or GPS positioning information automatically detected and provided by the user's terminal device, and the scene information is scene change data generated during the user's interaction.
Preferably, the position information is a geographic position or GPS positioning information actively provided or modified by the user, and the scene information is scene change data actively set or changed by the user.
In addition, it is also necessary to provide a speech recognition system capable of improving speech recognition accuracy.
A speech recognition system comprises a client and a server interacting with the client, wherein the client comprises:
a voice acquisition module, configured to collect speech;
a first communication module, configured to send the collected speech to the server;
and the server comprises:
a second communication module, configured to receive the speech sent by the first communication module;
a feature extraction module, configured to perform feature extraction on the speech;
a speech recognition module, configured to obtain scene information of the user, match a grammar model or a language model according to the scene information, and perform a pattern matching algorithm according to the matched grammar model or language model to obtain a speech recognition result.
Preferably, the client further comprises:
an information acquisition module, configured to obtain the user's scene information and position information;
and the first communication module is further configured to send the scene information and the position information to the server.
Preferably, the speech recognition module is further configured to obtain position information of the user and match a grammar model or a language model according to the position information; the server further comprises a database configured to store the user's position information and scene information.
Preferably, the speech recognition module is further configured to match a pronunciation dictionary according to the position information and the scene information, and to perform a pattern matching algorithm according to the matched grammar model, language model and pronunciation dictionary to obtain a speech recognition result.
Preferably, the position information is position information or GPS positioning information automatically detected and provided by the user's terminal device, and the scene information is scene change data generated during the user's interaction.
Preferably, the position information is a geographic position or GPS positioning information actively provided or modified by the user, and the scene information is scene change data actively set or changed by the user.
With the above speech recognition method and speech recognition system, a grammar model or language model is matched according to the user's scene information, so the parameters of the grammar model or language model can be changed according to the user's scene information when the pattern matching algorithm is performed. The grammar model or language model used by the pattern matching algorithm thereby adapts to the user's interaction scene, which improves the accuracy of speech recognition.
[Brief Description of the Drawings]
Fig. 1 is a flowchart of a speech recognition method in one embodiment;
Fig. 2 is a schematic structural diagram of a speech recognition system in one embodiment;
Fig. 3 is a schematic structural diagram of a speech recognition system in another embodiment.
[Detailed Description]
Fig. 1 shows the flow of a speech recognition method in one embodiment, which comprises the following steps:
Step S102: collect speech. In one embodiment, speech is input through client software installed on a mobile terminal. For example, the user clicks a key to enter the voice collection mode and then speaks, and clicks the key again to end the voice input; the client software collects the speech and can send the collected speech to a background server for processing.
Step S104: perform feature extraction on the collected speech. The collected data is a speech waveform; after feature extraction is performed on the waveform, the acoustic features of the speech are obtained. Traditional speech feature extraction algorithms can be used, for example extracting MFCC (Mel-frequency cepstral coefficients), LPC (linear predictive coding coefficients), speech energy and so on.
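For illustration only, the following minimal sketch shows what this feature extraction step could look like, assuming the open-source librosa library and a single-channel 16 kHz recording; the function name, file name and parameter values are examples and not part of the patent.

```python
# Illustrative feature extraction (MFCC + frame energy), assuming librosa;
# the file name and parameter values are examples only.
import librosa
import numpy as np

def extract_features(wav_path: str, sr: int = 16000) -> np.ndarray:
    """Return per-frame acoustic features: 13 MFCCs plus log frame energy."""
    waveform, sr = librosa.load(wav_path, sr=sr)                # speech waveform
    mfcc = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=13)   # MFCC features
    energy = librosa.feature.rms(y=waveform)                    # frame energy (RMS)
    return np.vstack([mfcc, np.log(energy + 1e-10)])            # shape (14, n_frames)
```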
Step S106: obtain the user's scene information and match a grammar model or a language model according to the scene information. The user's scene information is the scene change data generated during the user's interaction. The user runs various applications through the client software installed on the mobile terminal, and scene change data is produced while interacting with the application system, for example the query context and query results produced when querying shopping information or flight information.
An appropriate grammar model or language model is matched according to the user's scene information. For example, when the user queries company names, a grammar model in which company names have a high occurrence probability is used; when the user queries clothing-shop information, a grammar model or language model in which clothing-shop names have a high occurrence probability is used.
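As an illustration of this scene-based matching, the sketch below maps scene labels to model identifiers with a simple lookup; the scene labels, model names and fallback model are hypothetical and only show the idea of selecting a model whose vocabulary fits the current scene.

```python
# Illustrative scene-to-model lookup; scene labels and model names are hypothetical.
SCENE_TO_MODEL = {
    "company_query":  "lm_company_names",    # company names weighted up
    "clothing_query": "lm_clothing_shops",   # clothing-shop names weighted up
    "flight_query":   "lm_flight_info",
}

def match_language_model(scene_info: str, default: str = "lm_general") -> str:
    """Pick the grammar/language model whose vocabulary fits the current scene."""
    return SCENE_TO_MODEL.get(scene_info, default)
```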
Step S108: perform a pattern matching algorithm according to the matched grammar model or language model to obtain a speech recognition result. The resources needed for speech recognition include speech models, grammar models, pronunciation dictionaries and so on. According to the acoustic features obtained above, the best-matching result is found in the speech recognition resources; a traditional Viterbi algorithm can be used to perform the recognition and obtain the speech recognition result.
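The patent does not specify an implementation, but the standard Viterbi decoding it refers to can be sketched as follows for a toy discrete HMM; in a real recognizer the states, transition probabilities and emission scores would come from the matched acoustic, grammar and language models.

```python
# Toy Viterbi decoding sketch; states/probabilities are illustrative placeholders
# for what a real recognizer would derive from the matched models.
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observation sequence."""
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]  # initialization
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            # pick the best predecessor state for s at time t
            prob, prev = max((V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                             for p in states)
            V[t][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])  # best final state
    return path[best]
```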
By changing the parameters of the grammar model or language model according to the user's scene information, the grammar model or language model used by the pattern matching algorithm adapts to the user's interaction scene, which improves the accuracy of speech recognition.
In one embodiment, the method further comprises: obtaining position information of the user and matching a grammar model or a language model according to the position information. The user's position information is a geographic position or GPS positioning information automatically detected and provided by the user's terminal device. Alternatively, the position information can be a geographic position or GPS positioning information actively provided or modified by the user, and the scene information can be scene change data actively set or changed by the user. For example, the user fills in a geographic position through the client software; this position is stored on the server as part of the user's personal information and is updated when the user modifies it. GPS positioning information can be obtained in real time: when the user's location changes, the user's GPS positioning information is obtained, giving the user's current position. A geographic position set by the user can also be obtained, and the grammar model or language model is matched according to the position set by the user. For example, if the user's terminal device detects that the user is currently in Beijing but the user has set Shanghai as his geographic position, the grammar model or language model is matched according to Shanghai.
Relation data between position information and grammar models or language models can be maintained on the server. After the user's position information is obtained, an appropriate grammar model or language model can be matched according to the position information. For example, if the user's position information indicates the Beijing area, the grammar model and language model dominated by Beijing-area names are matched; when the user moves from Beijing to Shanghai, the user's current position information is obtained and the grammar model and language model dominated by Shanghai-area names are matched.
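The relation data between positions and models could, for example, be as simple as the mapping sketched below; the region names, model identifiers and override rule (a user-set position taking precedence over the detected one, as in the Beijing/Shanghai example above) are illustrative assumptions.

```python
# Illustrative relation between position information and models;
# region names and model identifiers are hypothetical.
from typing import Optional

REGION_TO_MODEL = {
    "Beijing":  "lm_beijing_places",   # Beijing-area names weighted up
    "Shanghai": "lm_shanghai_places",  # Shanghai-area names weighted up
}

def match_model_by_position(detected_region: str,
                            user_set_region: Optional[str] = None) -> str:
    """A region the user set manually overrides the automatically detected one."""
    region = user_set_region or detected_region
    return REGION_TO_MODEL.get(region, "lm_general")

# Example: detected Beijing but user set Shanghai -> the Shanghai model is matched.
# match_model_by_position("Beijing", user_set_region="Shanghai") -> "lm_shanghai_places"
```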
In another embodiment, the method further comprises: matching a pronunciation dictionary according to the user's position information and scene information. In this embodiment, an appropriate grammar model, language model and pronunciation dictionary are matched according to the user's position information and scene information, and the pattern matching algorithm is then performed according to the matched grammar model, language model and pronunciation dictionary to obtain the speech recognition result.
After the pattern matching algorithm is performed, a word sequence of one or more words is obtained; among the candidate words, the words with the highest occurrence probability are chosen to form the word sequence, which is the speech recognition result. The result can be symbols, numbers or text. For example, if the collected speech is "今天" (today), the recognition result can be "今天", "jintian", "today" and so on, and the result can be further processed by the application.
Fig. 2 shows a speech recognition system in one embodiment. The system comprises a client 100 and a server 200 interacting with the client 100, wherein:
The client 100 comprises a voice acquisition module 102 and a first communication module 104. The voice acquisition module 102 is configured to collect speech; the first communication module 104 is configured to send the collected speech to the server 200. In one embodiment, the user inputs speech through application software installed on the mobile terminal: input begins after a key is clicked and stops when the key is clicked again; the voice acquisition module 102 collects the speech and sends it to the server 200 for processing through the first communication module 104.
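A minimal sketch of the client side (voice acquisition module plus first communication module) might look as follows, assuming an HTTP upload; the endpoint URL, payload fields and response format are hypothetical, since the patent does not define the transport protocol.

```python
# Hypothetical client-side upload: the URL, fields and response shape are
# assumptions; the patent only says the collected speech is sent to the server.
import requests

SERVER_URL = "http://example.com/asr"  # placeholder endpoint

def send_speech(wav_bytes: bytes, scene_info: str, position: str) -> str:
    """Send collected speech plus scene/position context; return recognized text."""
    resp = requests.post(
        SERVER_URL,
        files={"audio": ("speech.wav", wav_bytes, "audio/wav")},
        data={"scene": scene_info, "position": position},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["result"]
```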
The server 200 comprises a second communication module 202, a feature extraction module 204 and a speech recognition module 206. The second communication module 202 is configured to receive the speech sent by the first communication module 104; the feature extraction module 204 is configured to perform feature extraction on the speech; the speech recognition module 206 is configured to obtain the user's scene information, match a grammar model or a language model according to the scene information, perform a pattern matching algorithm according to the matched grammar model or language model, and obtain a speech recognition result. The user's scene information is the scene change data generated during the user's interaction.
In this embodiment, the data received by the second communication module 202 is a speech waveform, and the feature extraction module 204 performs feature extraction on the waveform to obtain the acoustic features of the speech. Traditional feature extraction algorithms can be used to extract the MFCC (Mel-frequency cepstral coefficients), LPC (linear predictive coding coefficients), speech energy and so on.
The user produces scene change data through the various application software installed on the mobile terminal, for example the query context and query results produced when querying shopping information or flight information. The speech recognition module 206 matches an appropriate grammar model or language model according to the user's scene information; for example, when the user queries company names, a grammar model or language model in which company names have a high occurrence probability is used, and when the user queries clothing-shop information, a grammar model or language model in which clothing-shop names have a high occurrence probability is used. The resources needed for speech recognition include speech models, grammar models and pronunciation dictionaries. According to the obtained acoustic features, the speech recognition module 206 finds the best-matching result in the speech recognition resources; a traditional Viterbi algorithm can be used to perform the recognition and obtain the speech recognition result.
By changing the parameters of the grammar model or language model according to the user's scene information, the grammar model or language model used by the pattern matching algorithm adapts to the user's interaction scene, which improves the accuracy of speech recognition.
Fig. 3 shows a speech recognition system in another embodiment. On the basis of the embodiment shown in Fig. 2, the client 100 further comprises an information acquisition module 106, and the server 200 further comprises a database 208, wherein:
The information acquisition module 106 is configured to obtain the user's scene information and position information. In this embodiment, the first communication module 104 sends the user's scene information and position information to the server 200. The user's scene information is the scene change data generated during the user's interaction, and the user's position information can be a geographic position or GPS positioning information automatically detected and provided by the user's terminal device. It can also be a geographic position actively provided or modified by the user; for example, the user fills in a geographic position through the client software, this position is stored in the database 208 of the server 200 as part of the user's personal information, and the database 208 is updated when the user modifies it. GPS positioning information can be obtained in real time: when the user's location changes, the user's GPS positioning information is obtained, giving the user's current position. A geographic position set by the user can also be obtained, and the grammar model and language model are matched according to the position set by the user. For example, if the user's terminal device detects that the user is currently in Beijing but the user has set Shanghai as his geographic position, the grammar model and language model are matched according to Shanghai.
The database 208 is configured to store the user's position information and scene information. In addition, the database 208 can also store the speech recognition resources, i.e. the speech models, grammar models and pronunciation dictionaries used for speech recognition.
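For illustration, the database 208 could be organized roughly as sketched below; the table and column names are assumptions intended only to show the two kinds of data it stores (per-user position/scene context and the speech recognition resources).

```python
# Illustrative sketch of the database 208 described above; table and column
# names are assumptions, not defined by the patent.
import sqlite3

conn = sqlite3.connect("asr_server.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS user_context (
    user_id    TEXT PRIMARY KEY,
    position   TEXT,          -- detected or user-set geographic position
    scene_info TEXT           -- latest scene change data from the client
);
CREATE TABLE IF NOT EXISTS recognition_resources (
    resource_id TEXT PRIMARY KEY,
    kind        TEXT,         -- 'speech', 'grammar', 'language', 'pron_dict'
    region      TEXT,         -- e.g. Beijing, Shanghai
    scene       TEXT,         -- e.g. company_query, clothing_query
    path        TEXT          -- location of the model file
);
""")
conn.commit()
```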
The speech recognition module 206 is also configured to obtain the user's position information and match a grammar model or language model according to the position information. Relation data between position information and grammar models or language models can be maintained in the database 208; after the speech recognition module 206 obtains the user's position information, it can match an appropriate grammar model or language model according to the position information. For example, if the user's position information indicates the Beijing area, the grammar model or language model dominated by Beijing-area names is matched; when the user moves from Beijing to Shanghai, the user's current position information is obtained and the grammar model or language model dominated by Shanghai-area names is matched.
In another embodiment, the speech recognition module 206 is also configured to obtain position information and scene information, match a pronunciation dictionary according to the position information and scene information, perform a pattern matching algorithm according to the matched grammar model, language model and pronunciation dictionary, and obtain a speech recognition result.
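A minimal sketch of this pronunciation-dictionary matching is given below; the (position, scene) keys and dictionary identifiers are hypothetical and simply illustrate selecting a dictionary conditioned on both kinds of context.

```python
# Illustrative only: matching a pronunciation dictionary by (position, scene);
# the keys and dictionary identifiers are hypothetical.
PRON_DICTS = {
    ("Beijing", "company_query"):   "pron_dict_beijing_companies",
    ("Shanghai", "clothing_query"): "pron_dict_shanghai_shops",
}

def match_pron_dict(position: str, scene_info: str,
                    default: str = "pron_dict_general") -> str:
    """Return the pronunciation dictionary best suited to the user's context."""
    return PRON_DICTS.get((position, scene_info), default)
```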
After the speech recognition module 206 performs the pattern matching algorithm, a word sequence of one or more words is obtained; among the candidate words, the words with the highest occurrence probability are chosen to form the word sequence, which is the speech recognition result. The result can be symbols, numbers or text; for example, if the collected speech is "今天" (today), the recognition result can be "今天", "jintian", "today" and so on, and the result can be further processed by the application.
The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the claims of the present invention. It should be pointed out that a person of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.

Claims (11)

1. A speech recognition method, comprising the following steps:
collecting speech;
performing feature extraction on the collected speech;
obtaining scene information of a user, and matching a grammar model or a language model according to the scene information;
performing a pattern matching algorithm according to the matched grammar model or language model to obtain a speech recognition result.
2. The speech recognition method according to claim 1, characterized in that the method further comprises the step of obtaining position information of the user and matching a grammar model or a language model according to the position information.
3. The speech recognition method according to claim 2, characterized in that the method further comprises the step of matching a pronunciation dictionary according to the position information and the scene information; and
the step of performing a pattern matching algorithm according to the matched grammar model or language model to obtain a speech recognition result is:
performing a pattern matching algorithm according to the matched grammar model, language model and pronunciation dictionary to obtain a speech recognition result.
4. The speech recognition method according to claim 2 or 3, characterized in that the position information is a geographic position or GPS positioning information automatically detected and provided by the user's terminal device, and the scene information is scene change data generated during the user's interaction.
5. The speech recognition method according to claim 2 or 3, characterized in that the position information is a geographic position or GPS positioning information actively provided or modified by the user, and the scene information is scene change data actively set or changed by the user.
6. A speech recognition system, comprising a client and a server interacting with the client, characterized in that the client comprises:
a voice acquisition module, configured to collect speech;
a first communication module, configured to send the collected speech to the server;
and the server comprises:
a second communication module, configured to receive the speech sent by the first communication module;
a feature extraction module, configured to perform feature extraction on the speech;
a speech recognition module, configured to obtain scene information of a user, match a grammar model or a language model according to the scene information, and perform a pattern matching algorithm according to the matched grammar model or language model to obtain a speech recognition result.
7. The speech recognition system according to claim 6, characterized in that the client further comprises:
an information acquisition module, configured to obtain the user's scene information and position information;
and the first communication module is further configured to send the scene information and the position information to the server.
8. The speech recognition system according to claim 6, characterized in that the speech recognition module is further configured to obtain position information of the user and match a grammar model or a language model according to the position information; and the server further comprises a database configured to store the user's position information and scene information.
9. The speech recognition system according to claim 6, characterized in that the speech recognition module is further configured to match a pronunciation dictionary according to the position information and the scene information, and to perform a pattern matching algorithm according to the matched grammar model, language model and pronunciation dictionary to obtain a speech recognition result.
10. The speech recognition system according to any one of claims 6 to 9, characterized in that the position information is position information or GPS positioning information automatically detected and provided by the user's terminal device, and the scene information is scene change data generated during the user's interaction.
11. The speech recognition system according to any one of claims 6 to 9, characterized in that the position information is a geographic position or GPS positioning information actively provided or modified by the user, and the scene information is scene change data actively set or changed by the user.
CN2010106143655A 2010-12-30 2010-12-30 Voice recognition method and system Pending CN102074231A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010106143655A CN102074231A (en) 2010-12-30 2010-12-30 Voice recognition method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010106143655A CN102074231A (en) 2010-12-30 2010-12-30 Voice recognition method and system

Publications (1)

Publication Number Publication Date
CN102074231A true CN102074231A (en) 2011-05-25

Family

ID=44032749

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010106143655A Pending CN102074231A (en) 2010-12-30 2010-12-30 Voice recognition method and system

Country Status (1)

Country Link
CN (1) CN102074231A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1293428A (en) * 2000-11-10 2001-05-02 清华大学 Information check method based on speed recognition
CN1674091A (en) * 2005-04-18 2005-09-28 南京师范大学 Sound identifying method for geographic information and its application in navigation system
US20090228281A1 (en) * 2008-03-07 2009-09-10 Google Inc. Voice Recognition Grammar Selection Based on Context
CN101593518A (en) * 2008-05-28 2009-12-02 中国科学院自动化研究所 The balance method of actual scene language material and finite state network language material
CN101345051A (en) * 2008-08-19 2009-01-14 南京师范大学 Speech control method of geographic information system with quantitative parameter

Cited By (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105336326A (en) * 2011-09-28 2016-02-17 苹果公司 Speech recognition repair using contextual information
CN107845382A (en) * 2012-06-21 2018-03-27 谷歌有限责任公司 Dynamic language model
CN103514875A (en) * 2012-06-29 2014-01-15 联想(北京)有限公司 Voice data matching method and electronic equipment
CN103594085A (en) * 2012-08-16 2014-02-19 百度在线网络技术(北京)有限公司 Method and system providing speech recognition result
CN103594085B (en) * 2012-08-16 2019-04-26 百度在线网络技术(北京)有限公司 It is a kind of that the method and system of speech recognition result are provided
CN103903611A (en) * 2012-12-24 2014-07-02 联想(北京)有限公司 Speech information identifying method and equipment
CN103903611B (en) * 2012-12-24 2018-07-03 联想(北京)有限公司 A kind of recognition methods of voice messaging and equipment
CN103474063B (en) * 2013-08-06 2015-12-23 福建华映显示科技有限公司 Voice identification system and method
CN103474063A (en) * 2013-08-06 2013-12-25 福建华映显示科技有限公司 Voice recognition system and method
CN103956169B (en) * 2014-04-17 2017-07-21 北京搜狗科技发展有限公司 A kind of pronunciation inputting method, device and system
CN105448292B (en) * 2014-08-19 2019-03-12 北京羽扇智信息科技有限公司 A kind of time Speech Recognition System and method based on scene
CN105448292A (en) * 2014-08-19 2016-03-30 北京羽扇智信息科技有限公司 Scene-based real-time voice recognition system and method
CN105488044A (en) * 2014-09-16 2016-04-13 华为技术有限公司 Data processing method and device
CN105788598A (en) * 2014-12-19 2016-07-20 联想(北京)有限公司 Speech processing method and electronic device
WO2016101577A1 (en) * 2014-12-24 2016-06-30 中兴通讯股份有限公司 Voice recognition method, client and terminal device
CN105989836A (en) * 2015-03-06 2016-10-05 腾讯科技(深圳)有限公司 Voice acquisition method, device and terminal equipment
CN104954532A (en) * 2015-06-19 2015-09-30 深圳天珑无线科技有限公司 Voice recognition method, voice recognition device and mobile terminal
CN105161110A (en) * 2015-08-19 2015-12-16 百度在线网络技术(北京)有限公司 Bluetooth connection-based speech recognition method, device and system
CN105161110B (en) * 2015-08-19 2017-11-17 百度在线网络技术(北京)有限公司 Audio recognition method, device and system based on bluetooth connection
CN106558306A (en) * 2015-09-28 2017-04-05 广东新信通信息系统服务有限公司 Method for voice recognition, device and equipment
CN105225665A (en) * 2015-10-15 2016-01-06 桂林电子科技大学 A kind of audio recognition method and speech recognition equipment
CN106683662A (en) * 2015-11-10 2017-05-17 中国电信股份有限公司 Speech recognition method and device
CN105516289A (en) * 2015-12-02 2016-04-20 广东小天才科技有限公司 Method and system for assisting voice interaction based on position and action
CN105869635B (en) * 2016-03-14 2020-01-24 江苏时间环三维科技有限公司 Voice recognition method and system
CN105869635A (en) * 2016-03-14 2016-08-17 江苏时间环三维科技有限公司 Speech recognition method and system
WO2017166631A1 (en) * 2016-03-30 2017-10-05 乐视控股(北京)有限公司 Voice signal processing method, apparatus and electronic device
CN105845133A (en) * 2016-03-30 2016-08-10 乐视控股(北京)有限公司 Voice signal processing method and apparatus
CN106128462A (en) * 2016-06-21 2016-11-16 东莞酷派软件技术有限公司 Audio recognition method and system
CN106205606A (en) * 2016-08-15 2016-12-07 南京邮电大学 A kind of dynamic positioning and monitoring method based on speech recognition and system
CN106228983A (en) * 2016-08-23 2016-12-14 北京谛听机器人科技有限公司 Scene process method and system during a kind of man-machine natural language is mutual
CN106228983B (en) * 2016-08-23 2018-08-24 北京谛听机器人科技有限公司 A kind of scene process method and system in man-machine natural language interaction
CN108242237A (en) * 2016-12-26 2018-07-03 现代自动车株式会社 Speech processing device, the vehicle and method of speech processing with the equipment
CN107316635A (en) * 2017-05-19 2017-11-03 科大讯飞股份有限公司 Audio recognition method and device, storage medium, electronic equipment
CN107483714A (en) * 2017-06-28 2017-12-15 努比亚技术有限公司 A kind of audio communication method, mobile terminal and computer-readable recording medium
CN107785014A (en) * 2017-10-23 2018-03-09 上海百芝龙网络科技有限公司 A kind of home scenarios semantic understanding method
CN107945792A (en) * 2017-11-06 2018-04-20 百度在线网络技术(北京)有限公司 Method of speech processing and device
CN107945792B (en) * 2017-11-06 2021-05-28 百度在线网络技术(北京)有限公司 Voice processing method and device
CN109920429A (en) * 2017-12-13 2019-06-21 上海擎感智能科技有限公司 It is a kind of for vehicle-mounted voice recognition data processing method and system
CN110299136A (en) * 2018-03-22 2019-10-01 上海擎感智能科技有限公司 A kind of processing method and its system for speech recognition
CN110459203A (en) * 2018-05-03 2019-11-15 百度在线网络技术(北京)有限公司 A kind of intelligent sound guidance method, device, equipment and storage medium
CN108831505A (en) * 2018-05-30 2018-11-16 百度在线网络技术(北京)有限公司 The method and apparatus for the usage scenario applied for identification
CN108924370B (en) * 2018-07-23 2020-12-15 携程旅游信息技术(上海)有限公司 Call center outbound voice waveform analysis method, system, equipment and storage medium
CN108924370A (en) * 2018-07-23 2018-11-30 携程旅游信息技术(上海)有限公司 Call center's outgoing call speech waveform analysis method, system, equipment and storage medium
CN110827824A (en) * 2018-08-08 2020-02-21 Oppo广东移动通信有限公司 Voice processing method, device, storage medium and electronic equipment
CN109065045A (en) * 2018-08-30 2018-12-21 出门问问信息科技有限公司 Audio recognition method, device, electronic equipment and computer readable storage medium
CN109509466A (en) * 2018-10-29 2019-03-22 Oppo广东移动通信有限公司 Data processing method, terminal and computer storage medium
CN111312233A (en) * 2018-12-11 2020-06-19 阿里巴巴集团控股有限公司 Voice data identification method, device and system
WO2020119541A1 (en) * 2018-12-11 2020-06-18 阿里巴巴集团控股有限公司 Voice data identification method, apparatus and system
CN109509473B (en) * 2019-01-28 2022-10-04 维沃移动通信有限公司 Voice control method and terminal equipment
CN109509473A (en) * 2019-01-28 2019-03-22 维沃移动通信有限公司 Sound control method and terminal device
CN109801619A (en) * 2019-02-13 2019-05-24 安徽大尺度网络传媒有限公司 A kind of across language voice identification method for transformation of intelligence
CN110085228A (en) * 2019-04-28 2019-08-02 广西盖德科技有限公司 Phonetic code application method, applications client and system
CN110364165A (en) * 2019-07-18 2019-10-22 青岛民航凯亚系统集成有限公司 Flight dynamic information voice inquiry method
CN111048091A (en) * 2019-12-30 2020-04-21 苏州思必驰信息科技有限公司 Voice recognition method, voice recognition equipment and computer readable storage medium
CN111986651A (en) * 2020-09-02 2020-11-24 上海优扬新媒信息技术有限公司 Man-machine interaction method and device and intelligent interaction terminal
CN111986651B (en) * 2020-09-02 2023-09-29 度小满科技(北京)有限公司 Man-machine interaction method and device and intelligent interaction terminal
CN112102833A (en) * 2020-09-22 2020-12-18 北京百度网讯科技有限公司 Voice recognition method, device, equipment and storage medium
CN112102833B (en) * 2020-09-22 2023-12-12 阿波罗智联(北京)科技有限公司 Speech recognition method, device, equipment and storage medium
CN112102815A (en) * 2020-11-13 2020-12-18 深圳追一科技有限公司 Speech recognition method, speech recognition device, computer equipment and storage medium
CN112102815B (en) * 2020-11-13 2021-07-13 深圳追一科技有限公司 Speech recognition method, speech recognition device, computer equipment and storage medium
WO2023029442A1 (en) * 2021-08-30 2023-03-09 佛山市顺德区美的电子科技有限公司 Smart device control method and apparatus, smart device, and readable storage medium
CN114120979A (en) * 2022-01-25 2022-03-01 荣耀终端有限公司 Optimization method, training method, device and medium of voice recognition model

Similar Documents

Publication Publication Date Title
CN102074231A (en) Voice recognition method and system
CN103187053B (en) Input method and electronic equipment
CN104715752B (en) Audio recognition method, apparatus and system
CN103164403B (en) The generation method and system of video index data
CN102592591B (en) Dual-band speech encoding
CN103594085B (en) It is a kind of that the method and system of speech recognition result are provided
CN105448292A (en) Scene-based real-time voice recognition system and method
WO2014101717A1 (en) Voice recognizing method and system for personalized user information
CN102510426A (en) Personal assistant application access method and system
CN106328124A (en) Voice recognition method based on user behavior characteristics
CN204134197U (en) Intelligent toy system
KR20150134993A (en) Method and Apparatus of Speech Recognition Using Device Information
CN103685520A (en) Method and device for pushing songs on basis of voice recognition
CN110019741B (en) Question-answering system answer matching method, device, equipment and readable storage medium
CN103794211A (en) Voice recognition method and system
CN104123930A (en) Guttural identification method and device
CN108536680B (en) Method and device for acquiring house property information
WO2019075829A1 (en) Voice translation method and apparatus, and translation device
CN104240698A (en) Voice recognition method
JP2010032865A (en) Speech recognizer, speech recognition system, and program
CN113409774A (en) Voice recognition method and device and electronic equipment
CN101222703A (en) Identity verification method for mobile terminal based on voice identification
CN112735394B (en) Semantic parsing method and device for voice
CN103811008A (en) Audio frequency content identification method and device
CN104484426A (en) Multi-mode music searching method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20110525