CN103811003A - Voice recognition method and electronic equipment - Google Patents


Info

Publication number
CN103811003A
Authority
CN
China
Prior art keywords
voice messaging
recognition result
conditioned
cognition
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210454965.9A
Other languages
Chinese (zh)
Other versions
CN103811003B (en)
Inventor
戴海生
王茜莺
汪浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd
Priority to CN201210454965.9A (patent CN103811003B)
Priority to US14/079,219 (patent US9959865B2)
Publication of CN103811003A
Application granted
Publication of CN103811003B
Legal status: Active
Anticipated expiration

Landscapes

  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention provides a voice recognition method and electronic equipment. The method is applied to the electronic equipment which has the voice recognition service. The method comprises obtaining first voice information; recognizing the first voice information through a first recognition model to obtain a first recognition result; determining whether the first recognition result meets a first preset condition; recognizing the first voice information through a second recognition model which is different from the first recognition model to obtain a second recognition result if the first recognition result meets the first preset condition; controlling the electronic equipment to execute corresponding instructions based on the second recognition result.

Description

Voice recognition method and electronic equipment
Technical field
The present invention relates to the field of electronic technology, and in particular to a voice recognition method and an electronic device.
Background art
With the development of electronic technology, more and more electronic devices integrate a voice recognition service to make human-machine interaction more convenient, so that a user can control the device by voice without relying on physical input devices such as a mouse or keyboard.
In the prior art, a voice recognition service generally works as follows. A sound input device, for example a microphone, records sound information in real time and simultaneously transmits it to a voice recognition module. The module then processes the sound information in a series of steps. It first performs preprocessing, which includes filtering, sampling and quantization, windowing and the like. It then extracts characteristic parameters from the preprocessed voice signal to obtain a feature vector, compares the feature vector for similarity against each template in a template library, and outputs the template with the highest similarity as the recognition result. The templates in the template library are trained in advance: each word in the vocabulary is spoken once, and its feature vector is stored in the template library as a template. Finally, the operation instruction corresponding to the recognition result is obtained from the correspondence between recognition results and operation instructions, and the corresponding operation is performed.
However, in the course of realizing the present invention, the inventors found that the prior-art scheme runs the full recognition flow described above on whatever sound information is recorded, until a result is recognized and found either to correspond to an operation instruction or not. In practice, the sound information recorded by the microphone is sometimes not the user's voice, or not a human voice at all. If such input is also processed through the full recognition flow, the proportion of genuinely valid voice commands in the total recognition volume is low, that is, the voice recognition rate is low, and recognition efficiency is reduced as well.
Summary of the invention
The present invention provides a voice recognition method and an electronic device to solve the technical problem in the prior art that running the complete recognition flow on all sound information results in a low voice recognition rate and low recognition efficiency.
One aspect of the present invention provides a voice recognition method applied to an electronic device having a voice recognition service. The method comprises: obtaining first voice information; recognizing the first voice information through a first recognition model to obtain a first recognition result; determining whether the first recognition result meets a first preset condition; when the first recognition result meets the first preset condition, recognizing the first voice information through a second recognition model different from the first recognition model to obtain a second recognition result; and controlling the electronic device to execute a corresponding control instruction based on the second recognition result.
Optionally, when the first recognition result does not meet the first preset condition, the method further comprises: discarding the first voice information.
Optionally, before the first voice information is recognized through the first recognition model, the method further comprises: determining whether the first voice information meets a second preset condition; when the first voice information does not meet the second preset condition, discarding the first voice information; and when the first voice information meets the second preset condition, performing the step of recognizing the first voice information through the first recognition model.
Optionally, recognizing the first voice information through the first recognition model to obtain the first recognition result is specifically: recognizing whether the user corresponding to the first voice information is a predetermined user, to obtain the first recognition result; wherein, when the user corresponding to the first voice information is not the predetermined user, the first voice information does not meet the first preset condition, and when the user corresponding to the first voice information is the predetermined user, the first voice information meets the first preset condition.
Optionally, obtaining the first voice information specifically comprises: performing endpoint detection on the first voice information to obtain the first voice information after detection.
Optionally, when the first recognition result meets the first preset condition, recognizing the first voice information through the second recognition model different from the first recognition model to obtain the second recognition result is specifically: recognizing the first voice information through the second recognition model to obtain a third recognition result; and obtaining the second recognition result based on the first recognition result and the third recognition result.
Optionally, the voice recognition service is in a closed state, and when the control instruction corresponding to the second recognition result is a wake-up instruction, controlling the electronic device to execute the corresponding control instruction based on the second recognition result is specifically: executing the wake-up instruction to wake up the voice recognition service.
Another aspect of the present invention provides an electronic device having a voice recognition service. The electronic device comprises: a circuit board; a sound acquisition unit, connected to the circuit board, for obtaining first voice information; a processing chip, arranged on the circuit board, for recognizing the first voice information through a first recognition model to obtain a first recognition result, determining whether the first recognition result meets a first preset condition, and, when the first recognition result meets the first preset condition, recognizing the first voice information through a second recognition model different from the first recognition model to obtain a second recognition result; and a control chip, arranged on the circuit board, for controlling the electronic device to execute a corresponding control instruction based on the second recognition result.
Optionally, the processing chip is further configured to discard the first voice information when the first recognition result does not meet the first preset condition.
Optionally, the processing chip comprises a first sub-processing chip and a second sub-processing chip. The first sub-processing chip is specifically configured to determine whether the first voice information meets a second preset condition and, when the first voice information does not meet the second preset condition, to discard the first voice information. When the first voice information meets the second preset condition, the second sub-processing chip is specifically configured to recognize the first voice information through the first recognition model.
Optionally, the processing chip further comprises a third sub-processing chip, specifically configured to recognize whether the user corresponding to the first voice information is a predetermined user, to obtain the first recognition result; wherein, when the user corresponding to the first voice information is not the predetermined user, the first voice information does not meet the first preset condition, and when the user corresponding to the first voice information is the predetermined user, the first voice information meets the first preset condition.
Optionally, the sound acquisition unit further comprises a detection chip for performing endpoint detection on the first voice information to obtain the first voice information after detection.
Optionally, the processing chip further comprises a fourth sub-processing chip, configured to, when the first recognition result meets the first preset condition, recognize the first voice information through the second recognition model to obtain a third recognition result, and to obtain the second recognition result based on the first recognition result and the third recognition result.
Optionally, the voice recognition service is in a closed state, and when the control instruction corresponding to the second recognition result is a wake-up instruction, the control chip executes the wake-up instruction to wake up the voice recognition service.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
In an embodiment of the present invention, the voice information first undergoes a first recognition step through the first recognition model. Whether the result of that step meets the first preset condition then decides whether recognition continues. Only when the condition is met is the next recognition step carried out through the second recognition model to obtain a recognition result, according to which the corresponding control instruction is executed. Because the first step screens the input and only qualifying voice information proceeds to further recognition, a higher proportion of the final recognition results are valid, which improves the recognition rate; and because voice information intercepted by the first step requires no further recognition work, recognition efficiency is improved.
Further, in one embodiment of the present invention, voice information that does not meet the preset condition is discarded directly without any subsequent processing, which greatly reduces unnecessary computation; and because the second recognition model does not need to run, power is saved as well.
Further still, in one embodiment of the present invention, an additional check is placed before recognition by the first recognition model: whether the voice information itself meets a second preset condition is determined directly. When it does not, the first voice information is discarded without passing through the first recognition model, which saves further power and computation.
Further, in one embodiment of the present invention, the second recognition result finally obtained through the first and second recognition models is used only to determine whether the corresponding control instruction is a wake-up instruction. If it is, the voice recognition service is woken up to recognize subsequent voice commands; if it is not, monitoring continues until a wake-up instruction is heard. The actual voice recognition service therefore remains inactive in the meantime, which greatly saves power and computation.
Brief description of the drawings
Fig. 1 is a flowchart of the voice recognition method in one embodiment of the present invention;
Fig. 2 is an architecture diagram of the electronic device in one embodiment of the present invention.
Detailed description of the embodiments
The embodiments of the present invention provide a voice recognition method and an electronic device that solve the technical problem in the prior art that running the complete recognition flow on all sound information results in a low voice recognition rate and low recognition efficiency.
To solve the above technical problem, the general idea of the technical solutions in the embodiments of the present invention is as follows:
The voice information first undergoes a first recognition step through the first recognition model. Whether the result of that step meets the first preset condition then decides whether recognition continues. Only when the condition is met is the next recognition step carried out through the second recognition model to obtain a recognition result, according to which the corresponding control instruction is executed. Because the first step screens the input and only qualifying voice information proceeds to further recognition, a higher proportion of the final recognition results are valid, which improves the recognition rate; and because voice information intercepted by the first step requires no further recognition work, recognition efficiency is improved.
For a better understanding of the above technical solution, it is described in detail below with reference to the accompanying drawings and specific embodiments.
One embodiment of the present invention provides a voice recognition method applied to an electronic device, for example a mobile phone, a PDA (personal digital assistant), a tablet computer or a notebook computer. The electronic device has a voice recognition service.
Next, please refer to Fig. 1, which is a flowchart of the voice recognition method in the present embodiment. The method comprises:
Step 101: obtain first voice information;
Step 102: recognize the first voice information through a first recognition model to obtain a first recognition result;
Step 103: determine whether the first recognition result meets a first preset condition;
Step 104: when the first recognition result meets the first preset condition, recognize the first voice information through a second recognition model different from the first recognition model to obtain a second recognition result;
Step 105: based on the second recognition result, control the electronic device to execute a corresponding control instruction.
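Steps 102 to 105 can be sketched as a small gated pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `TwoStageRecognizer` and its four callables are hypothetical placeholders, and representing a recognition result as a float score or a command string is a choice made only for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TwoStageRecognizer:
    # All four callables are hypothetical stand-ins for real models.
    first_model: Callable[[bytes], float]     # cheap screening model, returns a score
    first_condition: Callable[[float], bool]  # the first preset condition
    second_model: Callable[[bytes], str]      # fuller model, returns a command name
    execute: Callable[[str], None]            # runs the control instruction

    def handle(self, voice: bytes) -> Optional[str]:
        """Process first voice information already obtained in step 101."""
        first_result = self.first_model(voice)       # step 102
        if not self.first_condition(first_result):   # step 103
            return None                              # screened out, second model never runs
        second_result = self.second_model(voice)     # step 104
        self.execute(second_result)                  # step 105
        return second_result
```

The point of the structure is that `second_model` is only invoked for input that passed the first preset condition, which is where the claimed savings in computation come from.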
The implementation of the voice recognition method in the present embodiment is described in detail below for different application scenarios.
In the first embodiment, suppose that the voice recognition service is turned on. In step 101, the first voice information may be obtained, for example, by recording voice information in real time through a microphone. In a specific implementation, endpoint detection may also be performed on the first voice information, for example based on short-time energy and the short-time average zero-crossing rate, to accurately determine the start point and end point of speech in the acquired signal and to distinguish speech from non-speech. This reduces the amount of first voice information collected, saves work in the subsequent steps, excludes interference from silent or noisy segments, and improves the performance of the voice recognition service. In the following embodiments, the first voice information may be either voice information that has undergone endpoint detection or voice information that has not; the subsequent steps are implemented similarly in both cases.
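Endpoint detection based on short-time energy and zero-crossing rate can be sketched as follows. The patent does not give a concrete rule or thresholds, so the decision rule here (a frame counts as speech if its energy is high, or if it has moderate energy together with a high zero-crossing rate) and all threshold values and function names are assumptions chosen for illustration.

```python
from typing import Optional, Sequence, Tuple


def short_time_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(
    samples: Sequence[float],
    frame_len: int = 4,          # assumed frame size; must be at least 2
    high_energy: float = 0.01,   # assumed threshold for voiced frames
    low_energy: float = 0.001,   # assumed floor for unvoiced frames
    zcr_thresh: float = 0.3,     # assumed zero-crossing threshold
) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the speech span, or None."""
    active = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        energy = short_time_energy(frame)
        is_speech = energy >= high_energy or (
            energy >= low_energy and zero_crossing_rate(frame) >= zcr_thresh
        )
        if is_speech:
            active.append(i)
    if not active:
        return None  # nothing but silence or very weak noise
    return active[0], active[-1] + frame_len
```

Only the samples between the returned indices would then be passed on as the first voice information, which is how the silent and noisy segments are excluded.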
Step 102 is then performed: the first voice information obtained is recognized through the first recognition model to obtain the first recognition result. In a specific implementation, the first recognition model may take many forms, several of which are described below as examples.
In the first form, the first recognition model is, for example, a speaker-specific voice recognition model. When the first voice information is obtained in step 101, the first recognition model recognizes whether the user corresponding to the first voice information is a predetermined user, that is, whether the first voice information was uttered by the predetermined user. This is done, for example, by voiceprint comparison, checking whether the voiceprint similarity exceeds a preset condition. In the present embodiment, the first preset condition is, for example, that the similarity value is greater than or equal to 98%. Suppose recognition of the first voice information yields a similarity value of 99%. Comparing 99% against the first preset condition of 98% shows that it is greater, which indicates that the first voice information was uttered by the predetermined user. Suppose instead that the similarity value is 97%. Comparing 97% against 98% shows that it is smaller, which indicates that the first voice information was not uttered by the predetermined user.
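The threshold check on voiceprint similarity can be sketched as below. The text does not say how the similarity is computed, so representing a voiceprint as a numeric vector and using cosine similarity are assumptions made only for this sketch; the function names are hypothetical, and only the 98% threshold comes from the example above.

```python
import math
from typing import Sequence

SIMILARITY_THRESHOLD = 0.98  # first preset condition from the example: at least 98%


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_predetermined_user(
    voiceprint: Sequence[float],
    enrolled: Sequence[float],
    threshold: float = SIMILARITY_THRESHOLD,
) -> bool:
    """True when the similarity meets the first preset condition."""
    return cosine_similarity(voiceprint, enrolled) >= threshold
```

A caller would pass the voiceprint extracted from the first voice information together with the voiceprint enrolled for the predetermined user, and continue to step 104 only when the function returns `True`.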
In the second form, the first recognition model is a simple recognition model that recognizes only one or two features of the first voice information and obtains a recognition result for those one or two features. In the present embodiment, the first preset condition is, for example, that the matching score for the one or two features reaches a certain threshold; when the matching score in the first recognition result is greater than or equal to the threshold, the first recognition result is determined to meet the first preset condition. Because only one or two features are recognized, the amount of computation is small.
In the third form, the first recognition model is again a simple recognition model. Unlike the second form, the simple recognition model here recognizes all of the sound features but uses a fuzzy algorithm, that is, a relatively simple algorithm that performs fuzzy matching, so the amount of computation is much smaller than for exact computation and exact matching. In the present embodiment, the first recognition result is obtained through this simple recognition model, and it is then determined whether the likelihood that the first voice information is a voice command exceeds a threshold, which is the first preset condition. If the likelihood is greater than or equal to the threshold, the first recognition result meets the first preset condition.
Three forms of the first recognition model have been illustrated above. In practice, the first recognition model may also be another model, as long as its amount of computation is less than that of the single full recognition pass used in the prior art; the present application imposes no restriction.
When recognition through the first recognition model described above determines that the first recognition result meets the first preset condition, step 104 is performed and the first voice information is further recognized through the second recognition model. The second recognition model is illustrated below for each of the three forms of the first recognition model described above.
In the first form, once the first voice information is determined to have been uttered by the predetermined user, it is known to come from an authorized user and can be recognized further. The second recognition model is then enabled to recognize the first voice information. The flow is, for example, to first extract characteristic parameters to obtain a feature vector, then compare the feature vector for similarity against each template in the template library, and output the template with the highest similarity as the recognition result, the same as the prior-art recognition flow. The second recognition result is obtained in this way.
In the second form, the second recognition model is a complex recognition model that recognizes the features not recognized by the first recognition model, for example three, five or even more features; it may also recognize all of the features once more. The recognition finally yields a recognition result, namely the second recognition result. Specifically, if only the remaining features are analyzed, the first recognition result and the recognition result obtained by the second recognition model may be considered together, for example by taking into account the score and weight of each feature, to finally obtain the second recognition result.
In the third form, the second recognition model is a complex recognition model which, unlike the complex recognition model of the second form, uses an exact algorithm to perform exact matching, and can therefore obtain a more accurate recognition result, namely the second recognition result. The first recognition result may of course also be taken into account, for example by giving the two recognition results different weights, to finally determine the second recognition result corresponding to the first voice information.
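Combining the first recognition result with the result of the second recognition model (called the third recognition result in the summary) by weights can be sketched as follows. The text only says that the two results may be given different weights, so the per-command score dictionaries, the linear weighting, the default weight values and the function name `fuse_results` are all assumptions for illustration.

```python
from typing import Dict


def fuse_results(
    first: Dict[str, float],   # candidate command -> score from the first model
    third: Dict[str, float],   # candidate command -> score from the second model
    w_first: float = 0.3,      # assumed weight of the coarse first pass
    w_third: float = 0.7,      # assumed weight of the more exact second pass
) -> str:
    """Return the command with the highest weighted score (the second recognition result)."""
    candidates = sorted(set(first) | set(third))  # sorted for deterministic tie-breaking
    combined = {
        c: w_first * first.get(c, 0.0) + w_third * third.get(c, 0.0)
        for c in candidates
    }
    return max(candidates, key=lambda c: combined[c])
```

Weighting the second pass more heavily reflects that it is the more exact of the two; a design that trusts the first pass more would simply swap the weights.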
Likewise, the three forms of the second recognition model above are only examples and do not limit the present invention; any model whose recognition result allows the voice command to be determined may serve as the second recognition model.
After the second recognition result is obtained by the above methods or by other methods, step 105 is performed: based on the second recognition result, the electronic device is controlled to execute the corresponding control instruction. In a specific implementation, the corresponding voice command is, for example, first determined from the second recognition result, and the corresponding control instruction is then executed according to the voice command. The voice command corresponding to the second recognition result is, for example, a command to make a phone call or to edit a text message. In practice it may also be another command; the present application imposes no restriction.
As can be seen from the above description, because the first step screens the input and only qualifying voice information proceeds to further recognition, a higher proportion of the final recognition results are valid, which improves the recognition rate; and because voice information intercepted by the first step requires no further recognition work, recognition efficiency is improved.
In a further embodiment, when the determination in step 103 is that the first recognition result does not meet the first preset condition, the first voice information is discarded directly and no subsequent recognition is performed. This greatly reduces unnecessary computation, and because the second recognition model does not need to run, power is saved as well.
To further save power and reduce computation, in the present embodiment it is also determined directly, before step 102 is performed, whether the first voice information meets a second preset condition. When the first voice information does not meet the second preset condition, it is discarded; when it meets the second preset condition, step 102 is performed.
Specifically, it may be determined whether the first voice information is a human voice rather than noise, such as wind or the metallic sounds of a construction site, or an animal sound, such as a dog barking or a cat meowing. If the first voice information is a human voice, step 102 is performed; if not, the first voice information can be discarded directly. This saves the computation of the first and second recognition models, and because neither model needs to run, power consumption is reduced.
In another embodiment, the second preset condition may also be that the user corresponding to the first voice information is the predetermined user described above. If the determination shows that the user corresponding to the first voice information is not the predetermined user, that user has no control authority over the electronic device, so step 102 and the subsequent steps need not be performed and the first voice information is discarded directly.
In the second embodiment, suppose that the voice recognition service is not currently turned on. If the voice recognition service were always running, the voice recognition flow would be executed continuously, causing high power consumption and computation. In the present embodiment, a wake-up applet therefore resides in the background of the operating system of the electronic device. The wake-up applet recognizes whether the user's instruction is a wake-up instruction and, if so, starts the voice recognition service. The implementation of the voice recognition method in the present embodiment is described below by specific examples.
The wake-up applet continuously monitors the sound recorded by the sound input device, which is step 101, obtaining the first voice information. Step 102 is then performed. In the present embodiment, the first recognition model may be any of the three models described in the first embodiment, or may simply determine whether the first voice information is a human voice, in which case step 104 is performed if it is. When the determination in step 103 is that the first preset condition is met, the second recognition model is used for recognition to obtain the second recognition result. The second recognition result is then compared to determine whether it is a wake-up instruction. In the present embodiment, the wake-up applet may be set to contain only two voice commands, one to turn on the voice recognition service and one to turn it off. The second recognition result therefore needs to be compared only twice to determine whether it corresponds to a wake-up instruction, so the comparison is fast, the computation is small, and power is saved.
When the second recognition result corresponds to a wake-up instruction, step 105 is specifically to execute the wake-up instruction and wake up the voice recognition service. Once the voice recognition service is started, the user can interact with the electronic device by voice. The voice recognition service can likewise be turned off in the same way to save power, after which the wake-up applet continues monitoring until it hears a wake-up instruction and wakes up the voice recognition service again.
For example, suppose the voice recognition service is currently turned off and the user says "Mytip" to the electronic device. The wake-up applet hears this and may first perform the second preset condition check described above, which finds that the sound is a human voice. Step 102 is then performed and the first recognition model produces a recognition result, for example a single fuzzy recognition pass that finds the input may be a wake-up instruction. The second recognition model is then used for exact recognition to obtain the second recognition result, which confirms that the input is indeed a wake-up instruction. Step 105 is then performed: the wake-up instruction is executed and the electronic device is controlled to turn on the voice recognition service.
As another example, the user has not spoken, but a kitten in the room has made a sound. When the wake-up applet hears it, the check finds that it is not a human voice, so the voice information is discarded directly and monitoring continues.
As a further example, the preliminary check has passed and the sound is a human voice, so the determination of steps 102 and 103 is carried out. If it finds, for example, that the voice information was not uttered by the predetermined user, the voice information is still discarded and monitoring continues.
As a further example, after step 104 has been performed, comparison shows that the second recognition result is not a wake-up instruction. The wake-up applet then continues to monitor the sound information recorded by the sound input device until it hears "Mytip", at which point it wakes up the voice recognition service.
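The four examples above can be sketched as one listening routine in the wake-up applet. This is a hedged illustration, not the applet's actual design: the class `WakeListener` and its three callables are hypothetical, audio is treated as an opaque object, and the only detail taken from the text is the wake phrase "Mytip" and the order of the checks.

```python
from typing import Any, Callable

WAKE_PHRASE = "mytip"  # wake phrase from the example in the text


class WakeListener:
    """Hypothetical wake-up applet; the three callables stand in for real models."""

    def __init__(
        self,
        is_human_voice: Callable[[Any], bool],    # second preset condition
        fuzzy_match: Callable[[Any], bool],       # first model plus first preset condition
        exact_transcribe: Callable[[Any], str],   # second recognition model
    ) -> None:
        self.is_human_voice = is_human_voice
        self.fuzzy_match = fuzzy_match
        self.exact_transcribe = exact_transcribe
        self.service_on = False  # the voice recognition service starts closed

    def on_audio(self, audio: Any) -> bool:
        """Return True if this audio woke the voice recognition service."""
        if not self.is_human_voice(audio):
            return False  # e.g. a cat's meow: discard and keep listening
        if not self.fuzzy_match(audio):
            return False  # the cheap first pass says this is not a candidate
        if self.exact_transcribe(audio).lower() != WAKE_PHRASE:
            return False  # recognized, but not the wake-up instruction
        self.service_on = True  # step 105: execute the wake-up instruction
        return True
```

Each early return corresponds to one of the discard cases in the examples, so the expensive `exact_transcribe` call runs only for input that has passed both cheaper checks.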
The above embodiments may be implemented separately or in combination, and those skilled in the art can choose according to actual conditions.
In the third embodiment, the second recognition model of the first embodiment is the voice recognition service of the second embodiment, and the first recognition model of the first embodiment is the wake-up applet of the second embodiment. When the wake-up applet determines that the first recognition result meets the first preset condition, for example that the user corresponding to the first voice information is the predetermined user, that is, the voice command was indeed issued by the predetermined user, it wakes up the second recognition model so that the second recognition model enters the working state and further recognizes which voice command the first voice information corresponds to, for example a command to make a phone call. If the user is not the predetermined user, the second recognition model is not woken up. In the present embodiment, the method therefore further comprises, after step 103 and before step 104, the step of waking up the second recognition model when the first recognition result meets the first preset condition.
Based on the same inventive concept, the specific architecture of an electronic device that implements the above speech recognition method in an embodiment of the present invention is described below. Referring to Fig. 2, the electronic device comprises: a circuit board 201; a sound acquisition device 202, connected to the circuit board 201 and configured to obtain the first voice information; a processing chip 203, arranged on the circuit board 201 and configured to recognize the first voice information by the first recognition model to obtain the first recognition result, to determine whether the first recognition result meets the first preset condition, and, when the first recognition result meets the first preset condition, to recognize the first voice information by the second recognition model different from the first recognition model to obtain the second recognition result; and a control chip 204, arranged on the circuit board 201 and configured to control the electronic device, based on the second recognition result, to execute the corresponding control instruction.
Further, the processing chip 203 is also specifically configured to discard the first voice information when the first recognition result does not meet the first preset condition.
In one embodiment, the processing chip 203 comprises a first sub-processing chip and a second sub-processing chip. The first sub-processing chip is specifically configured to determine whether the first voice information meets a second preset condition and to discard the first voice information when it does not meet the second preset condition. When the first voice information meets the second preset condition, the second sub-processing chip is specifically configured to recognize the first voice information by the first recognition model.
Further, the processing chip 203 also comprises a third sub-processing chip, specifically configured to identify whether the user corresponding to the first voice information is the predetermined user and thereby obtain the first recognition result. When the user corresponding to the first voice information is not the predetermined user, the first voice information does not meet the first preset condition; when the user corresponding to the first voice information is the predetermined user, the first voice information meets the first preset condition.
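This passage does not state how the predetermined user is identified. As one possible illustration only, the check could compare a voiceprint vector extracted from the first voice information with an enrolled vector for the predetermined user. In the Python sketch below, the vector representation, the cosine-similarity measure, the function names, and the threshold of 0.8 are all assumptions rather than details of the invention.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def meets_first_condition(
    voiceprint: Sequence[float],  # assumed: features extracted from the first voice information
    enrolled: Sequence[float],    # assumed: stored features of the predetermined user
    threshold: float = 0.8,       # assumed value, to be tuned on real data
) -> bool:
    """Treat the speaker as the predetermined user when the similarity reaches the threshold."""
    return cosine_similarity(voiceprint, enrolled) >= threshold
```

A higher threshold rejects more impostors at the price of rejecting the predetermined user more often, so in practice the value would be chosen from measured error rates.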
Further, the processing chip 203 also comprises a fourth sub-processing chip, configured to recognize the first voice information by the second recognition model when the first recognition result meets the first preset condition, thereby obtaining a third recognition result, and to obtain the second recognition result based on the first recognition result and the third recognition result.
In another embodiment, the sound acquisition device 202 further comprises a detection chip, configured to perform endpoint detection on the first voice information and obtain the detected first voice information. The detection chip may also be arranged on the circuit board 201.
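The endpoint detection algorithm is not specified here. A common simple approach is to threshold short-time energy, which the following Python sketch illustrates; the function name, frame length, and threshold are assumptions chosen for the toy signal, not values from the invention.

```python
from typing import List, Optional, Sequence, Tuple


def detect_endpoints(
    samples: Sequence[float],
    frame_len: int = 4,        # assumed frame size, far smaller than in real audio
    threshold: float = 0.01,   # assumed energy threshold
) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the active region, end exclusive, or None."""
    active: List[int] = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)  # mean squared amplitude
        if energy > threshold:
            active.append(start)
    if not active:
        return None  # no frame exceeded the threshold: nothing to recognize
    return active[0], min(active[-1] + frame_len, len(samples))


# Toy signal: silence, a burst of sound, then silence again.
signal = [0.0] * 8 + [0.5] * 8 + [0.0] * 8
endpoints = detect_endpoints(signal)
```

Only the samples between the detected endpoints would then be passed on for recognition, so leading and trailing silence is not processed.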
In another embodiment, the speech recognition service is in a closed state, and when the control instruction corresponding to the second recognition result is a wake-up instruction, the control chip 204 is specifically configured to execute the wake-up instruction and wake up the speech recognition service.
The sound acquisition device is, for example, a microphone; it may be a single microphone or a microphone array.
In addition, the processing chip 203 and the control chip 204 may be two separate chips or may be integrated on the same chip.
Likewise, the first, second, third, and fourth sub-processing chips of the processing chip 203 may be four separate chips or may be integrated on the same chip.
The variations and specific examples of the speech recognition method in the foregoing embodiments apply equally to the electronic device of this embodiment. From the foregoing detailed description of the speech recognition method, those skilled in the art can clearly understand how the electronic device of this embodiment is implemented, so for brevity of the specification it is not described in detail here.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
In an embodiment of the present invention, voice information first undergoes a first recognition step by the first recognition model. Whether the result of this step meets the first preset condition then determines whether recognition continues. Only when the preset condition is met is the next recognition step performed by the second recognition model, yielding a recognition result according to which the corresponding control instruction is executed. Because of this first-step screening, only qualifying voice information continues to be recognized, so a higher proportion of the final recognition results are valid, which improves the recognition rate; and voice information intercepted at the first step needs no further recognition work, which improves recognition efficiency.
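The efficiency argument can be made concrete with a small calculation. The Python sketch below assumes each input costs a fixed amount at each stage and that a known fraction of inputs passes the first stage; the function name and all numbers are hypothetical and serve only to illustrate the reasoning.

```python
def expected_cost(first_cost: float, second_cost: float, pass_rate: float) -> float:
    """Average cost per input when the second stage runs only on inputs that pass the first."""
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be between 0 and 1")
    return first_cost + pass_rate * second_cost


# Hypothetical figures: the first model costs 1 unit, the second 10 units,
# and 20% of inputs meet the first preset condition.
two_stage = expected_cost(first_cost=1.0, second_cost=10.0, pass_rate=0.2)
single_stage = 10.0  # running the second model on every input
```

With these figures the two-stage arrangement averages 3 units per input instead of 10. More generally, the screening pays off whenever the first stage costs less than the second-stage work it avoids, that is, when `first_cost < (1 - pass_rate) * second_cost`.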
Further, in one embodiment of the present invention, voice information that does not meet the preset condition is discarded directly without any subsequent processing, which greatly reduces unnecessary computation; and because the second recognition model does not need to run, power is also saved.
Further, in one embodiment of the present invention, an additional judgment is made before recognition by the first recognition model: whether the voice information itself meets a second preset condition. When it does not, the first voice information is discarded directly without passing through the first recognition model, which further saves power and reduces computation.
Further, in one embodiment of the present invention, the second recognition result finally obtained through the first and second recognition models is used only to determine whether the corresponding control instruction is a wake-up instruction. If it is, the speech recognition service is woken up to recognize subsequent voice commands; if it is not, listening continues until a wake-up instruction is heard. The actual speech recognition service therefore remains idle throughout this period, which greatly saves power and computation.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (14)

1. A speech recognition method, applied to an electronic device having a speech recognition service, characterized in that the method comprises:
Obtaining first voice information;
Recognizing the first voice information by a first recognition model to obtain a first recognition result;
Determining whether the first recognition result meets a first preset condition;
When the first recognition result meets the first preset condition, recognizing the first voice information by a second recognition model different from the first recognition model to obtain a second recognition result;
Based on the second recognition result, controlling the electronic device to execute a corresponding control instruction.
2. The method of claim 1, characterized in that, when the first recognition result does not meet the first preset condition, the method further comprises:
Discarding the first voice information.
3. The method of claim 1, characterized in that, before the recognizing of the first voice information by the first recognition model, the method further comprises:
Determining whether the first voice information meets a second preset condition;
When the first voice information does not meet the second preset condition, discarding the first voice information;
When the first voice information meets the second preset condition, performing the step of recognizing the first voice information by the first recognition model.
4. The method of claim 1, characterized in that the recognizing of the first voice information by the first recognition model to obtain the first recognition result is specifically:
Identifying whether the user corresponding to the first voice information is a predetermined user to obtain the first recognition result; wherein, when the user corresponding to the first voice information is not the predetermined user, the first voice information does not meet the first preset condition, and when the user corresponding to the first voice information is the predetermined user, the first voice information meets the first preset condition.
5. The method of claim 1, characterized in that the obtaining of the first voice information specifically comprises:
Performing endpoint detection on the first voice information to obtain the detected first voice information.
6. The method of claim 1, characterized in that, when the first recognition result meets the first preset condition, the recognizing of the first voice information by the second recognition model different from the first recognition model to obtain the second recognition result is specifically:
Recognizing the first voice information by the second recognition model to obtain a third recognition result;
Obtaining the second recognition result based on the first recognition result and the third recognition result.
7. The method of claim 1, characterized in that the speech recognition service is in a closed state, and when the control instruction corresponding to the second recognition result is a wake-up instruction, the controlling of the electronic device to execute the corresponding control instruction based on the second recognition result is specifically:
Executing the wake-up instruction to wake up the speech recognition service.
8. An electronic device having a speech recognition service, characterized in that the electronic device comprises:
A circuit board;
A sound acquisition device, connected to the circuit board and configured to obtain first voice information;
A processing chip, arranged on the circuit board and configured to recognize the first voice information by a first recognition model to obtain a first recognition result; to determine whether the first recognition result meets a first preset condition; and, when the first recognition result meets the first preset condition, to recognize the first voice information by a second recognition model different from the first recognition model to obtain a second recognition result;
A control chip, arranged on the circuit board and configured to control the electronic device, based on the second recognition result, to execute a corresponding control instruction.
9. The electronic device of claim 8, characterized in that the processing chip is also specifically configured to discard the first voice information when the first recognition result does not meet the first preset condition.
10. The electronic device of claim 8, characterized in that the processing chip comprises a first sub-processing chip and a second sub-processing chip; the first sub-processing chip is specifically configured to determine whether the first voice information meets a second preset condition and, when the first voice information does not meet the second preset condition, to discard the first voice information; when the first voice information meets the second preset condition, the second sub-processing chip is specifically configured to recognize the first voice information by the first recognition model.
11. The electronic device of claim 10, characterized in that the processing chip further comprises a third sub-processing chip, specifically configured to identify whether the user corresponding to the first voice information is a predetermined user to obtain the first recognition result; wherein, when the user corresponding to the first voice information is not the predetermined user, the first voice information does not meet the first preset condition, and when the user corresponding to the first voice information is the predetermined user, the first voice information meets the first preset condition.
12. The electronic device of claim 8, characterized in that the sound acquisition device further comprises a detection chip, configured to perform endpoint detection on the first voice information to obtain the detected first voice information.
13. The electronic device of claim 8, characterized in that the processing chip further comprises a fourth sub-processing chip, configured to recognize the first voice information by the second recognition model when the first recognition result meets the first preset condition to obtain a third recognition result, and to obtain the second recognition result based on the first recognition result and the third recognition result.
14. The electronic device of claim 8, characterized in that the speech recognition service is in a closed state, and when the control instruction corresponding to the second recognition result is a wake-up instruction, the control chip executes the wake-up instruction to wake up the speech recognition service.
CN201210454965.9A 2012-11-13 2012-11-13 A kind of audio recognition method and electronic equipment Active CN103811003B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201210454965.9A CN103811003B (en) 2012-11-13 2012-11-13 A kind of audio recognition method and electronic equipment
US14/079,219 US9959865B2 (en) 2012-11-13 2013-11-13 Information processing method with voice recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210454965.9A CN103811003B (en) 2012-11-13 2012-11-13 A kind of audio recognition method and electronic equipment

Publications (2)

Publication Number Publication Date
CN103811003A true CN103811003A (en) 2014-05-21
CN103811003B CN103811003B (en) 2019-09-24

Family

ID=50707680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210454965.9A Active CN103811003B (en) 2012-11-13 2012-11-13 A kind of audio recognition method and electronic equipment

Country Status (1)

Country Link
CN (1) CN103811003B (en)

Also Published As

Publication number Publication date
CN103811003B (en) 2019-09-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant