CN103440867B - Audio recognition method and system - Google Patents
- Publication number: CN103440867B (CN201310335050.0A)
- Authority: CN (China)
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention discloses a speech recognition method and system. The method includes: obtaining voice information sent by a user; sending the voice information to both a cloud recognition engine and a local recognition engine, so that the cloud recognition engine and the local recognition engine each recognize the voice information; if the cloud recognition result returned by the cloud recognition engine is received first, outputting the cloud recognition result; and if the local recognition result of the local recognition engine is received first and the confidence level corresponding to the local recognition result is greater than the set upper limit of the confidence interval, outputting the local recognition result. With the invention, a reliable speech recognition result can be provided to the user even when the network is poor or unavailable.
Description
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a speech recognition method and system.
Background
With the continuing development of computer science and technology, speech recognition technology has matured and is widely used in mobile phones, televisions, in-vehicle systems and other fields. Taking in-vehicle systems as an example, because people cannot conveniently operate an interface by hand while driving, speech recognition is the most convenient mode of interaction and allows the in-vehicle system to offer more functions. In the prior art, the usual speech recognition mode is: receive the user's voice information, establish a connection with a cloud speech recognition server, and send the voice information to the server; the server recognizes the information and returns the recognition result to the client. However, a mobile device does not necessarily have a stable network connection. In that case the result from the cloud may be considerably delayed, which degrades the user experience, and when there is no network at all, cloud recognition cannot be used.
Summary of the invention
The present invention provides a speech recognition method and system that can provide the user with a reliable speech recognition result even when the network is poor or unavailable.
To this end, the present invention provides the following technical solutions:
A speech recognition method, including:
obtaining voice information sent by a user;
sending the voice information to both a cloud recognition engine and a local recognition engine, so that the cloud recognition engine and the local recognition engine each recognize the voice information;
if the cloud recognition result returned by the cloud recognition engine is received first, outputting the cloud recognition result;
if the local recognition result of the local recognition engine is received first and the confidence level corresponding to the local recognition result is greater than the set upper limit of the confidence interval, outputting the local recognition result.
Preferably, the method further includes:
if the confidence level is within the confidence interval, successively lowering the upper limit of the confidence interval, each time within a set waiting duration;
if the cloud recognition result returned by the cloud recognition engine is received within the waiting duration, outputting the cloud recognition result;
if the cloud recognition result returned by the cloud recognition engine is not received within the waiting duration, and the confidence level corresponding to the local recognition result is greater than the lowered upper limit of the confidence interval, outputting the local recognition result.
Preferably, the waiting durations are the same as or different from one another.
Preferably, the method further includes:
if, after the number of times the upper limit of the confidence interval has been lowered exceeds a set count threshold, the confidence level corresponding to the local recognition result is still less than the confidence interval lower limit after the lowering, and the cloud recognition result has still not been received, returning recognition failure information to the user.
Preferably, the method further includes:
if the local recognition result is received first and the confidence level corresponding to the local recognition result is less than the set lower limit of the confidence interval, discarding the local recognition result and continuing to wait for the cloud recognition engine to return the cloud recognition result;
if the waiting time exceeds a set blocking duration, returning recognition failure information to the user.
Preferably, the method further includes:
after receiving a speech recognition request sent by the user, starting the cloud recognition engine and the local recognition engine.
A speech recognition system, including:
a voice information acquisition unit, configured to obtain voice information sent by a user;
a sending unit, configured to send the voice information to both a cloud recognition engine and a local recognition engine, so that the cloud recognition engine and the local recognition engine each recognize the voice information;
a receiving unit, configured to receive the cloud recognition result returned by the cloud recognition engine and the local recognition result of the local recognition engine;
an output unit, configured to output the cloud recognition result when the receiving unit first receives the cloud recognition result returned by the cloud recognition engine, and to output the local recognition result when the receiving unit first receives the local recognition result of the local recognition engine and the confidence level corresponding to the local recognition result is greater than the set upper limit of the confidence interval.
Preferably, the system further includes:
a confidence adjustment unit, configured to successively lower the upper limit of the confidence interval, each time within a set waiting duration, when the confidence level is within the confidence interval;
the output unit is further configured to output the cloud recognition result when the receiving unit receives the cloud recognition result returned by the cloud recognition engine within the waiting duration, and to output the local recognition result when, within the waiting duration, the receiving unit does not receive the cloud recognition result returned by the cloud recognition engine and the confidence level corresponding to the local recognition result is greater than the lowered upper limit of the confidence interval.
Preferably, the system further includes:
a counting unit, configured to count the number of times the confidence adjustment unit lowers the upper limit of the confidence interval;
the output unit is further configured to return recognition failure information to the user if, after the count exceeds a set count threshold, the confidence level corresponding to the local recognition result is still less than the confidence interval lower limit after the lowering and the cloud recognition result has still not been received.
Preferably, the receiving unit is further configured to, when the local recognition result is received first and the confidence level corresponding to the local recognition result is less than the set lower limit of the confidence interval, discard the local recognition result and continue to wait for the cloud recognition engine to return the cloud recognition result; and to return recognition failure information to the user after the waiting time exceeds a set blocking duration.
Preferably, the system further includes:
a trigger unit, configured to start the cloud recognition engine and the local recognition engine after receiving a speech recognition request sent by the user.
The speech recognition method and system provided by the embodiments of the present invention combine local recognition with cloud recognition. After the voice information sent by the user is received, it is sent to both the cloud recognition engine and the local recognition engine for recognition. When the cloud recognition result returned by the cloud recognition engine is received first, the cloud recognition result is output directly. If the local recognition result of the local recognition engine is received first and the confidence level corresponding to the local recognition result is greater than the set upper limit of the confidence interval, the local recognition result is output. The cloud recognition result is always treated as better than the local recognition result: if cloud recognition can return a result before local recognition provides a relatively accurate recognition result, the cloud recognition result is used. Thus, when there is no network access, the local recognition engine can still be used to complete local functions that need no network, such as making phone calls, sending text messages, and listening to music.
Further, if the confidence level of the local recognition result received first is relatively low but within the set confidence interval, the confidence threshold of local recognition is lowered continuously until there is a qualifying output or recognition fails.
Because the solution provided by the embodiments of the present invention combines local recognition with cloud recognition, reliable speech recognition results can be provided as far as possible when the network is poor or unavailable.
Brief description of the drawings
In order to explain the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some of the embodiments described in the present invention; those of ordinary skill in the art can obtain other drawings from them.
Fig. 1 is a flowchart of the speech recognition method according to an embodiment of the present invention;
Fig. 2 is another flowchart of the speech recognition method according to an embodiment of the present invention;
Fig. 3 is a structural schematic diagram of the speech recognition system according to an embodiment of the present invention;
Fig. 4 is another structural schematic diagram of the speech recognition system according to an embodiment of the present invention.
Detailed description of the invention
To help those skilled in the art better understand the solutions of the embodiments of the present invention, the embodiments are described in further detail below with reference to the drawings and implementations.
The embodiments of the present invention provide a speech recognition method and system that combine cloud recognition and local recognition, so that when there is no network access the local recognition engine can be used to complete local functions that need no network, such as making phone calls, sending text messages, and listening to music. The requirement on local engine results can also be lowered dynamically according to the delay of the network connection.
Fig. 1 is a flowchart of the speech recognition method according to an embodiment of the present invention, which includes the following steps:
Step 101: obtain the voice information sent by the user.
Step 102: send the voice information to both the cloud recognition engine and the local recognition engine, so that the cloud recognition engine and the local recognition engine each recognize the voice information.
Specifically, a recording module may record the voice information sent by the user. The recorded voice information may be sent directly to the cloud recognition engine and the local recognition engine; alternatively, a voice detection module may first pick out the start and end points of the effective information, and the voice information is then sent to the cloud recognition engine and the local recognition engine.
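The endpoint filtering mentioned for step 102 could be sketched as follows. The source does not say how the voice detection module works, so the short-time energy rule, the function name `find_endpoints`, the frame length, and the threshold are all assumptions made purely for illustration.

```python
from typing import Optional, Sequence, Tuple


def find_endpoints(
    samples: Sequence[float],
    frame_len: int = 160,     # assumed: 10 ms frames at a 16 kHz sample rate
    threshold: float = 0.01,  # assumed mean-energy threshold for "speech"
) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the speech region, or None."""
    active_starts = []
    # Walk over whole, non-overlapping frames and keep those above the threshold.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold:
            active_starts.append(start)
    if not active_starts:
        return None  # nothing that looks like speech
    # The region runs from the first active frame to the end of the last one.
    return active_starts[0], active_starts[-1] + frame_len
```

Only `samples[start:end]` would then need to be sent to the two engines; a production detector would likely add smoothing and hangover frames, which this sketch omits.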
Step 103: if the cloud recognition result returned by the cloud recognition engine is received first, output the cloud recognition result.
Because the server-side recognition engine in the cloud is powerful and its recognition results have a higher confidence level, the cloud recognition result can be output directly when it is received first.
Step 104: if the local recognition result of the local recognition engine is received first and the confidence level corresponding to the local recognition result is greater than the set upper limit of the confidence interval, output the local recognition result.
When the network environment is poor, the cloud recognition result may be delayed considerably. In this case, the local recognition result corresponding to the voice information and the confidence value of that result are obtained. If the confidence value is greater than the confidence threshold set by the system, the recognition result is fully usable, so the local recognition result is output without waiting any longer for the cloud recognition result.
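The race between the two engines in steps 102 to 104 can be sketched with Python's `asyncio`. This is a minimal illustration under stated assumptions, not the patented implementation: `Result`, `fake_engine`, and `recognize_once` are hypothetical names, the engines are simulated with delays, and falling back to the cloud result when the local confidence level is too low is a simplification (Fig. 1 does not cover that case).

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    source: str        # "cloud" or "local"
    text: str
    confidence: float


async def fake_engine(source: str, text: str, confidence: float, delay: float) -> Result:
    """Hypothetical stand-in for a recognition engine that answers after `delay` seconds."""
    await asyncio.sleep(delay)
    return Result(source, text, confidence)


async def recognize_once(cloud, local, upper_limit: float) -> Result:
    # Step 102: hand the same request to both engines at the same time.
    cloud_task = asyncio.ensure_future(cloud)
    local_task = asyncio.ensure_future(local)
    done, _ = await asyncio.wait(
        {cloud_task, local_task}, return_when=asyncio.FIRST_COMPLETED
    )
    if cloud_task in done:
        # Step 103: the cloud result arrived first (or at the same time); output it.
        local_task.cancel()
        return cloud_task.result()
    local_result = local_task.result()
    if local_result.confidence > upper_limit:
        # Step 104: the local result arrived first and is confident enough.
        cloud_task.cancel()
        return local_result
    # Assumption: otherwise keep waiting for the cloud result.
    return await cloud_task
```

For example, `asyncio.run(recognize_once(fake_engine("cloud", "call mom", 0.95, 0.2), fake_engine("local", "call tom", 0.9, 0.01), upper_limit=0.8))` returns the local result, because it arrives first and its confidence level exceeds the upper limit. Network errors from the cloud engine are not handled in this sketch.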
It can be seen that the speech recognition method provided by the embodiment of the present invention combines local recognition with cloud recognition, and determines which recognition result to select according to the order in which the cloud recognition result and the local recognition result are returned and the confidence level of a local recognition result that is returned first. The cloud result is always treated as better than the local one: if cloud recognition can return a result before local recognition provides a relatively accurate recognition, the cloud result is used.
To further ensure that a speech recognition result with a certain accuracy can be obtained when the network is delayed or unavailable, another embodiment of the speech recognition method of the present invention can also dynamically adjust the confidence threshold of local recognition according to current network conditions, so as to output the best result with the shortest delay.
Fig. 2 is another flowchart of the speech recognition method according to an embodiment of the present invention, which includes the following steps:
Steps 201 to 203 in Fig. 2 are the same as steps 101 to 103 in Fig. 1 and are not repeated here.
Step 204: if the local recognition result of the local recognition engine is received first, obtain the confidence level corresponding to the local recognition result.
In addition, in step 204 the subsequent processing is determined according to the confidence level of the local recognition result, so that the best result is output with the shortest delay. Specifically, if the confidence level is less than the set lower limit of the confidence interval, step 205 is performed; if the confidence level is within the set confidence interval, step 208 is performed; if the confidence level is greater than the set upper limit of the confidence interval, step 213 is performed.
Step 205: discard the local recognition result and continue to wait for the cloud recognition engine to return the cloud recognition result.
Step 206: determine whether the waiting time exceeds the set blocking duration. If so, perform step 207; otherwise, continue waiting.
Step 207: return recognition failure information to the user.
Step 208: successively lower the upper limit of the confidence interval, each time within a set waiting duration.
Step 209: determine whether the cloud recognition result is received within the waiting duration. If so, perform step 210; otherwise, perform step 211.
Step 210: output the cloud recognition result.
Step 211: determine whether the confidence level corresponding to the local recognition result is greater than the currently lowered upper limit of the confidence interval. If so, perform step 213; otherwise, perform step 212.
Step 212: determine whether the number of times the upper limit of the confidence interval has been lowered exceeds the set count threshold (for example, the count threshold may be 1 to 3). If so, perform step 207; otherwise, return to step 208.
Step 213: output the local recognition result.
It should be noted that the waiting duration mentioned in step 208 is the time interval between successive reductions of the upper limit of the confidence interval, for example 2 to 5 seconds, and the intervals may be the same or different. The waiting time mentioned in step 206 is a different concept from this waiting duration: it is the time spent waiting to receive the cloud recognition result. Its timing may start when the voice information is sent to the cloud recognition engine and the local recognition engine, or when the local recognition result is discarded; the embodiment of the present invention does not limit this.
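The decision flow of steps 204 to 213 can be condensed into a small simulation. This is a sketch under stated assumptions rather than the authors' implementation: the function name `arbitrate`, all default values, and the fixed reduction `step` are hypothetical; the lower limit is kept fixed; and both timers are taken to start when the local recognition result arrives, which is only one of the options the text allows.

```python
from typing import Optional, Tuple


def arbitrate(
    local_conf: float,
    cloud_delay: Optional[float],  # seconds after the local result; None = never arrives
    lower: float = 0.4,            # assumed lower limit of the confidence interval
    upper: float = 0.8,            # assumed upper limit of the confidence interval
    step: float = 0.1,             # assumed reduction of the upper limit per round
    wait: float = 2.0,             # waiting duration of step 208
    max_reductions: int = 3,       # count threshold of step 212
    blocking: float = 10.0,        # blocking duration of step 206
) -> Tuple[str, float]:
    """Return (outcome, seconds elapsed since the local result arrived)."""
    if local_conf > upper:                       # step 204 -> step 213
        return "local", 0.0
    if local_conf < lower:                       # steps 205 to 207
        if cloud_delay is not None and cloud_delay <= blocking:
            return "cloud", cloud_delay
        return "failure", blocking
    elapsed, reductions = 0.0, 0
    while True:
        upper -= step                            # step 208: lower the upper limit
        reductions += 1
        elapsed += wait                          # one waiting duration passes
        if cloud_delay is not None and cloud_delay <= elapsed:
            return "cloud", cloud_delay          # steps 209 and 210
        if local_conf > upper:                   # step 211 -> step 213
            return "local", elapsed
        if reductions > max_reductions:          # step 212 -> step 207
            return "failure", elapsed
```

With these defaults, a local confidence level of 0.65 and no cloud reply is accepted after the second reduction, whereas the same local result is superseded if the cloud replies 3 seconds after it.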
In addition, in practical applications, when the cloud recognition result is not received within a certain time after each reduction of the upper limit of the confidence interval and the confidence level corresponding to the local recognition result still does not meet the requirement, it is also possible not to check whether the number of reductions exceeds the set count threshold, but instead to check whether the time spent waiting exceeds a limited waiting time. If it does, recognition failure information is returned to the user, so that an overly long wait does not affect the user experience.
The cloud has powerful server processing capability and massive speech data for comparison, so its recognition results have a high confidence level, whereas local recognition needs no network support, has a high recognition speed and a broad scope of application, and is especially suitable for mobile devices without a stable network connection. Therefore, the speech recognition method of the embodiment of the present invention combines local recognition with cloud recognition and takes the advantages of both into account: after the voice information sent by the user is obtained, it is sent simultaneously to the cloud recognition engine and the local recognition engine for recognition. If cloud recognition can return a result before local recognition provides a relatively accurate recognition, the cloud recognition result is used. Otherwise, the confidence threshold of local recognition is lowered continuously until there is a qualifying output or recognition fails. This ensures that reliable speech recognition results are provided as far as possible when the network is poor or unavailable.
The speech recognition method of the embodiment of the present invention uses a simple and efficient local recognition engine to recognize local commands when the network is unavailable. In addition, because the strategy for choosing between cloud and local recognition results can reduce recognition delay, the confidence threshold of local recognition can be adjusted dynamically according to current network conditions, ensuring that the best result is output with the shortest delay.
In addition, it should be noted that, in practical applications, the cloud recognition engine and the local recognition engine may be started after a speech recognition request sent by the user is received. For example, the speech recognition request may be sent when the user presses a speech recognition key; alternatively, a voice wake-up function may be provided to the user, with recording always on in the background, and the request is sent when a specific keyword is recognized.
The local recognition engine may use conventional recognition methods to recognize the specific keyword. For example, the local recognition engine reads a predefined grammar file that defines the set of command words supported by speech recognition, and the sets of command words for the same action all exist in a dictionary that the local recognition engine can access efficiently. The local recognition engine generates a recognition network from the grammar file, extracts feature information from the input speech, and performs recognition path matching on the network. Ultimately, any sentence the user says that falls within the scope defined by the grammar file can be recognized by the system, so that the specific keyword is recognized.
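The idea of expanding a grammar into a recognition network and matching against it can be illustrated at the text level. This sketch is a deliberate simplification: a real engine matches acoustic features along paths of the network, not strings, and the grammar contents, the `<...>` rule syntax, and the names `expand`, `build_network`, and `match_command` are all hypothetical.

```python
from itertools import product

# Hypothetical grammar: each rule maps to a list of alternatives,
# and each alternative is a sequence of words or other rule names.
GRAMMAR = {
    "<command>": [
        ["call", "<contact>"],
        ["play", "music"],
        ["send", "message", "to", "<contact>"],
    ],
    "<contact>": [["mom"], ["dad"]],
}


def expand(symbol, grammar):
    """Yield every word sequence that `symbol` can produce."""
    if symbol not in grammar:
        yield (symbol,)  # a plain word expands to itself
        return
    for alternative in grammar[symbol]:
        expansions = [list(expand(part, grammar)) for part in alternative]
        for combo in product(*expansions):
            yield tuple(word for part in combo for word in part)


def build_network(grammar, start="<command>"):
    """Flatten the grammar into the set of sentences it accepts."""
    return {" ".join(words) for words in expand(start, grammar)}


def match_command(utterance, network):
    """Return the normalized command if the grammar covers it, else None."""
    text = " ".join(utterance.lower().split())
    return text if text in network else None
```

Here `build_network(GRAMMAR)` yields five sentences, so `match_command("Call  Mom", network)` is accepted while an out-of-grammar phrase such as "open window" is rejected. Enumerating every sentence only works for small, non-recursive grammars.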
Of course, the embodiment of the present invention does not limit which speech recognition technology the cloud recognition engine and the local recognition engine use; the local recognition engine in particular may be chosen according to the specific application scenario, and none of these choices affects the above effects that the present invention can achieve.
Correspondingly, an embodiment of the present invention also provides a speech recognition system. Fig. 3 is a structural schematic diagram of this system.
In this embodiment, the system includes:
a voice information acquisition unit 301, configured to obtain the voice information sent by the user;
a sending unit 302, configured to send the voice information to both the cloud recognition engine and the local recognition engine, so that the cloud recognition engine and the local recognition engine each recognize the voice information;
a receiving unit 303, configured to receive the cloud recognition result returned by the cloud recognition engine and the local recognition result of the local recognition engine;
an output unit 304, configured to output the cloud recognition result when the receiving unit 303 first receives the cloud recognition result returned by the cloud recognition engine, and to output the local recognition result when the receiving unit 303 first receives the local recognition result of the local recognition engine and the confidence level corresponding to the local recognition result is greater than the set upper limit of the confidence interval.
The speech recognition system provided by the embodiment of the present invention combines local recognition with cloud recognition, and determines which recognition result to select according to the order in which the cloud recognition result and the local recognition result are returned and the confidence level of a local recognition result that is returned first. The cloud result is always treated as better than the local one: if cloud recognition can return a result before local recognition provides a relatively accurate recognition, the cloud result is used.
To further ensure that a speech recognition result with a certain accuracy can be obtained when the network is delayed or unavailable, another embodiment of the speech recognition system of the present invention can also dynamically adjust the confidence threshold of local recognition according to current network conditions, so as to output the best result with the shortest delay.
Fig. 4 is a structural schematic diagram of another embodiment of the speech recognition system of the present invention.
Unlike the embodiment shown in Fig. 3, in this embodiment the system further includes:
a confidence adjustment unit 401, configured to successively lower the upper limit of the confidence interval, each time within a set waiting duration, when the confidence level is within the confidence interval.
Correspondingly, in this embodiment the output unit 304 is further configured to output the cloud recognition result when the receiving unit 303 receives the cloud recognition result returned by the cloud recognition engine within the waiting duration, and to output the local recognition result when, within the waiting duration, the receiving unit 303 does not receive the cloud recognition result returned by the cloud recognition engine and the confidence level corresponding to the local recognition result is greater than the lowered upper limit of the confidence interval.
In addition, to prevent an overly long wait for the recognition result from affecting the user experience, as shown in Fig. 4, the system may further include a counting unit 402, configured to count the number of times the confidence adjustment unit 401 lowers the upper limit of the confidence interval.
Correspondingly, the output unit 304 may be further configured to return recognition failure information to the user if, after the count recorded by the counting unit 402 exceeds the set count threshold, the confidence level corresponding to the local recognition result is still less than the confidence interval lower limit after the lowering and the cloud recognition result has still not been received.
To ensure the accuracy of the local recognition result that is output, in the embodiments shown in Fig. 3 and Fig. 4 the receiving unit 303 may be further configured to, when the local recognition result is received first and the confidence level corresponding to the local recognition result is less than the set lower limit of the confidence interval, discard the local recognition result and continue to wait for the cloud recognition engine to return the cloud recognition result; and to return recognition failure information to the user after the waiting time exceeds the set blocking duration. Of course, in practical applications the receiving unit 303 may instead notify the output unit 304 of this situation, and the output unit 304 returns the recognition failure information to the user.
In addition, the cloud recognition engine and the local recognition engine may be started in different ways. For example, in each of the above embodiments the system may further include a trigger unit (not shown), configured to start the cloud recognition engine and the local recognition engine after receiving a speech recognition request sent by the user. The speech recognition request may be sent when the user presses a speech recognition key; alternatively, a voice wake-up function may be provided to the user, with recording always on in the background, and the request is sent when a specific keyword is recognized.
The local recognition engine may use conventional recognition methods to recognize the specific keyword. For example, the local recognition engine reads a predefined grammar file that defines the set of command words supported by speech recognition, and the sets of command words for the same action all exist in a dictionary that the local recognition engine can access efficiently. The local recognition engine generates a recognition network from the grammar file, extracts feature information from the input speech, and performs recognition path matching on the network. Ultimately, any sentence the user says that falls within the scope defined by the grammar file can be recognized by the system, so that the specific keyword is recognized.
Of course, the embodiment of the present invention does not limit which speech recognition technology the cloud recognition engine and the local recognition engine use; the local recognition engine in particular may be chosen according to the specific application scenario, and none of these choices affects the above effects that the present invention can achieve.
As the foregoing description shows, the speech recognition system of the embodiment of the present invention uses a simple and efficient local recognition engine to recognize local commands when the network is unavailable. In addition, because the strategy for choosing between cloud and local recognition results can reduce recognition delay, the confidence threshold of local recognition can be adjusted dynamically according to current network conditions, ensuring that the best result is output with the shortest delay.
Each embodiment in this specification all uses the mode gone forward one by one to describe, identical similar portion between each embodiment
Dividing and see mutually, what each embodiment stressed is the difference with other embodiments.Real especially for system
For executing example, owing to it is substantially similar to embodiment of the method, so describing fairly simple, relevant part sees embodiment of the method
Part illustrate.
It should be noted that the system embodiment described above is merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
The embodiments of the present invention have been described in detail above, and specific implementations are used herein to illustrate the present invention. The description of the above embodiments is intended only to help in understanding the method and apparatus of the present invention. At the same time, those of ordinary skill in the art may make changes in accordance with the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (9)
1. A speech recognition method, characterized by comprising:
obtaining voice information sent by a user;
sending the voice information to a cloud recognition engine and to a local recognition engine, so that the cloud recognition engine and the local recognition engine each recognize the voice information;
if the cloud recognition result returned by the cloud recognition engine is received first, outputting the cloud recognition result;
if the local recognition result of the local recognition engine is received first, and the confidence corresponding to the local recognition result is greater than the set upper limit of a confidence interval, outputting the local recognition result;
if the confidence lies within the confidence interval, successively lowering the upper limit of the confidence interval within set waiting periods;
if the cloud recognition result returned by the cloud recognition engine is received within the waiting period, outputting the cloud recognition result;
if the cloud recognition result returned by the cloud recognition engine is not received within the waiting period, and the confidence corresponding to the local recognition result is greater than the lowered upper limit of the confidence interval, outputting the local recognition result.
2. The method according to claim 1, characterized in that the waiting periods are the same or different.
3. The method according to claim 1, characterized in that the method further comprises:
if, after the number of times the upper limit of the confidence interval has been lowered exceeds a set count threshold, the confidence corresponding to the local recognition result is still less than the lower limit of the confidence interval after lowering, and the cloud recognition result has still not been received, returning recognition failure information to the user.
4. The method according to claim 1, characterized in that the method further comprises:
if the local recognition result is received first, and the confidence corresponding to the local recognition result is less than the set lower limit of the confidence interval, discarding the local recognition result and continuing to wait for the cloud recognition engine to return the cloud recognition result;
if the waiting time exceeds a set blocking duration, returning recognition failure information to the user.
5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
after receiving a speech recognition request sent by the user, starting the cloud recognition engine and the local recognition engine.
6. A speech recognition system, characterized by comprising:
a voice information acquiring unit, configured to obtain voice information sent by a user;
a sending unit, configured to send the voice information to a cloud recognition engine and to a local recognition engine, so that the cloud recognition engine and the local recognition engine each recognize the voice information;
a receiving unit, configured to receive the cloud recognition result returned by the cloud recognition engine and the local recognition result of the local recognition engine;
an output unit, configured to output the cloud recognition result when the receiving unit first receives the cloud recognition result returned by the cloud recognition engine, and to output the local recognition result when the receiving unit first receives the local recognition result of the local recognition engine and the confidence corresponding to the local recognition result is greater than the set upper limit of a confidence interval;
a confidence adjustment unit, configured to successively lower the upper limit of the confidence interval within set waiting periods when the confidence lies within the confidence interval;
the output unit being further configured to output the cloud recognition result when the receiving unit receives, within the waiting period, the cloud recognition result returned by the cloud recognition engine, and to output the local recognition result when, within the waiting period, the receiving unit does not receive the cloud recognition result returned by the cloud recognition engine and the confidence corresponding to the local recognition result is greater than the lowered upper limit of the confidence interval.
7. The system according to claim 6, characterized in that the system further comprises:
a counting unit, configured to count the number of times the confidence adjustment unit lowers the upper limit of the confidence interval;
the output unit being further configured to return recognition failure information to the user if, after the number of times exceeds a set count threshold, the confidence corresponding to the local recognition result is still less than the lower limit of the confidence interval after lowering, and the cloud recognition result has still not been received.
8. The system according to claim 6, characterized in that
the receiving unit is further configured to, when the local recognition result is received first and the confidence corresponding to the local recognition result is less than the set lower limit of the confidence interval, discard the local recognition result and continue waiting for the cloud recognition engine to return the cloud recognition result; and, after the waiting time exceeds a set blocking duration, return recognition failure information to the user.
9. The system according to any one of claims 6 to 8, characterized in that the system further comprises:
a trigger unit, configured to start the cloud recognition engine and the local recognition engine after receiving a speech recognition request sent by the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310335050.0A CN103440867B (en) | 2013-08-02 | 2013-08-02 | Audio recognition method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103440867A (en) | 2013-12-11 |
CN103440867B (en) | 2016-08-10 |
Family
ID=49694558
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310335050.0A (Active, CN103440867B) | Audio recognition method and system | 2013-08-02 | 2013-08-02 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103440867B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1181684B1 (en) * | 1999-03-26 | 2004-11-03 | Scansoft, Inc. | Client-server speech recognition |
CN102496364A (en) * | 2011-11-30 | 2012-06-13 | 苏州奇可思信息科技有限公司 | Interactive speech recognition method based on cloud network |
CN102708865A (en) * | 2012-04-25 | 2012-10-03 | 北京车音网科技有限公司 | Method, device and system for voice recognition |
CN103137129A (en) * | 2011-12-02 | 2013-06-05 | 联发科技股份有限公司 | Voice recognition method and electronic device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: No. 666 Wangjiang Road, High-tech Development Zone, Hefei, Anhui Province 230088. Applicant after: Iflytek Co., Ltd. Address before: No. 666 Wangjiang Road, High-tech Development Zone, Hefei, Anhui Province 230088. Applicant before: Anhui USTC iFLYTEK Co., Ltd. |
COR | Change of bibliographic data | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |