CN104464735A

CN104464735A - Voice information recognition method and device, and terminal

Info

Publication number: CN104464735A
Application number: CN201410768511.8A
Authority: CN
Inventors: 韩庆普
Original assignee: Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Current assignee: Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Priority date: 2014-12-12
Filing date: 2014-12-12
Publication date: 2015-03-25

Abstract

The invention provides a voice information recognition method. The voice information recognition method is characterized by comprising the steps of receiving input voice information, calling matched information matched with the voice information from a preset voice base corresponding to user identification information according to the user identification information corresponding to the voice information, and recognizing the voice information according to the matched information to carry out operation according to the recognized target voice information. Accordingly, the invention further provides a voice information recognition device and a terminal. By means of the technical scheme, a multi-system multi-user voice recognition method can be supported, voice recognition accuracy can be effectively improved, and the application occasions of voice recognition and the experience effect can be improved.

Description

Voice information recognition method, voice information recognition device, and terminal

Technical field

The present invention relates to the field of terminal technology, and in particular to a voice information recognition method, a voice information recognition device, and a terminal.

Background technology

Smartphones offer increasingly diverse means of user operation, and speech recognition is used more and more widely. Existing speech recognition technology mainly relies on a fixed, predefined recognition data model, or on comparing a remote library (a speech recognition database hosted in the cloud) against a local user library. A fixed speech recognition data model is a public database that developers build from a large volume of collected speech data and typically deploy in the cloud. The comparison of the remote library with the local user library mainly consists of merging remote cloud data with local data so that more complete information can be fed back to the user. However, existing speech recognition technology supports only a single user by default and cannot meet the diversified demands of multiple systems and multiple users. In addition, the prior art merely merges remote cloud data with the local user's word library; it does not consider identifying and binding other personal information associated with the user. As a result, it cannot provide personalized, identity-aware speech recognition services and cannot satisfy increasingly demanding user-experience requirements.

Therefore, a new technical solution is needed that supports multi-system, multi-user speech recognition, effectively improves speech recognition accuracy, and enhances the application scenarios and user experience of speech recognition.

Summary of the invention

In view of the above technical problem, the present invention proposes a new technical solution that supports multi-system, multi-user speech recognition, effectively improves speech recognition accuracy, and enhances the application scenarios and user experience of speech recognition.

In view of this, one aspect of the present invention proposes a voice information recognition method, characterized by comprising: receiving input voice information; according to user identification information corresponding to the voice information, retrieving matching information that matches the voice information from a preset voice library corresponding to the user identification information; and recognizing the voice information according to the matching information, so as to perform an operation according to the recognized target voice information.

In this technical solution, after the input voice information is received, matching information that matches the voice information is retrieved from the preset voice library corresponding to the user identification information that corresponds to the voice information. The terminal can then recognize the voice information accurately according to this matching information, which effectively improves speech recognition accuracy, enhances the application scenarios of speech recognition, and allows the correct operation to be performed according to the accurate target voice information.
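The three steps of the method can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation described by the invention: the names `PRESET_LIBRARIES` and `recognize` are hypothetical, the matching information is reduced to a word-to-word dictionary, and the default transcript is assumed to come from some generic recognizer that is outside the sketch.

```python
from typing import Optional

# Hypothetical preset voice libraries, keyed by user identification information.
# Each library maps a word the default recognizer tends to output to the word
# that this particular user actually means.
PRESET_LIBRARIES = {
    "user_a": {"set": "check"},
    "user_b": {"set": "geometry"},
}


def recognize(default_transcript: str, user_id: Optional[str]) -> str:
    """Refine a default transcript with the matching information of one user."""
    # Step 1: the input voice information has been received and transcribed
    # by a generic recognizer (assumed to happen elsewhere).
    words = default_transcript.split()

    # Step 2: retrieve matching information from the library bound to the user.
    # An unknown or missing user simply gets no matching information.
    matching_info = PRESET_LIBRARIES.get(user_id, {})

    # Step 3: recognize the voice information according to the matching information.
    return " ".join(matching_info.get(word, word) for word in words)
```

With these assumed libraries, the same default transcript yields different target text for different users, and is left unchanged when no user identification information is available.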

In the above technical solution, preferably, according to a received storage command, the user identification information and the matching information are stored in correspondence in the preset voice library.

In this technical solution, because the user identification information and the matching information of each user are stored in correspondence in a preset voice library, different users can retrieve different matching information from their own personal preset voice libraries. This meets users' individual needs and allows the terminal to accurately recognize the voice information of different users. The terminal may also be one on which multiple systems are installed, in which case the preset voice libraries of different users may be stored in different systems.

In the above technical solution, preferably, the matching information comprises at least one of: replacement information for designated information in the voice information, alternative form-of-address information for a contact in the voice information, the intensity of the voice signal corresponding to the voice information, the frequency of the voice signal corresponding to the voice information, and the waveform of the voice signal corresponding to the voice information. When the matching information comprises the replacement information and the alternative form-of-address information, recognizing the voice information according to the matching information specifically comprises: replacing the designated information in the voice information according to the replacement information, and determining the contact details of the contact in the voice information according to the alternative form-of-address information, so as to recognize the voice information.

In this technical solution, when the matching information comprises the replacement information and the alternative form-of-address information, replacing error-prone designated information in the voice information with the correct replacement information raises the rate at which the terminal correctly recognizes the user's voice information. Moreover, a user may address any given contact in several ways, and these alternative forms of address may differ from the name stored in the contact list. Using the alternative form-of-address information of the contact in the voice information therefore allows the contact details of that contact to be determined accurately, which further improves the accuracy of recognizing the voice information.

In addition, the intensity, frequency, and waveform of the voice signal differ from user to user, and for each user they can also vary with environment and mood. By using the intensity, frequency, and waveform of the voice signal in the voice information as matching information, the terminal can accurately recognize the voice information currently input by the user according to the intensity, frequency, and waveform of that user's past voice signals stored in the preset voice library, which further improves the recognition accuracy.

In the above technical solution, preferably, according to a received acquisition command, the replacement information for the designated information is obtained from the voice call content and/or short messages of the terminal; and according to a received change command, the replacement information for the designated information is changed.

In this technical solution, users often address the contacts in their contact list by other names, and these forms of address frequently appear in the terminal's short messages and voice call content. The voice call content and/or short messages of the terminal therefore make it possible to obtain as much replacement information for the designated information as possible, which improves the terminal's recognition rate when it recognizes the user's voice information. To improve the recognition rate further, the user can also keep entering change commands to update the replacement information for error-prone designated information, so that the replacement information always best matches the user's usage habits.

In the above technical solution, preferably, the user identification information comprises at least one of: fingerprint information of the user, the intensity of the user's voice signal, the frequency of the user's voice signal, the waveform of the user's voice signal, and facial feature information of the user.

In this technical solution, the user identification information includes, but is not limited to, the user's fingerprint information, the intensity, frequency, and waveform of the user's voice signal, and the user's facial feature information. This variety allows the user to bind different identification information to the personal preset voice library, accommodating different usage habits.

According to another aspect of the present invention, a voice information recognition device is also proposed, comprising: a receiving unit that receives input voice information; a calling unit that, according to user identification information corresponding to the voice information, retrieves matching information that matches the voice information from a preset voice library corresponding to the user identification information; and a recognition unit that recognizes the voice information according to the matching information, so as to perform an operation according to the recognized target voice information.

In this technical solution, after the input voice information is received, matching information that matches the voice information is retrieved from the preset voice library corresponding to the user identification information that corresponds to the voice information. The terminal can then recognize the voice information accurately according to this matching information, which effectively improves speech recognition accuracy, enhances the application scenarios of speech recognition, and allows the correct operation to be performed according to the accurate target voice information.

In the above technical solution, preferably, the device further comprises a storage unit that, according to a received storage command, stores the user identification information and the matching information in correspondence in the preset voice library.

In the above technical solution, preferably, the matching information comprises at least one of: replacement information for designated information in the voice information, alternative form-of-address information for a contact in the voice information, the intensity of the voice signal corresponding to the voice information, the frequency of the voice signal corresponding to the voice information, and the waveform of the voice signal corresponding to the voice information. When the matching information comprises the replacement information and the alternative form-of-address information, the recognition unit is specifically configured to: replace the designated information in the voice information according to the replacement information, and determine the contact details of the contact in the voice information according to the alternative form-of-address information, so as to recognize the voice information.

In the above technical solution, preferably, the device further comprises: an acquiring unit that, according to a received acquisition command, obtains the replacement information for the designated information from the voice call content and/or short messages of the terminal; and a changing unit that, according to a received change command, changes the replacement information for the designated information.

In this technical solution, users often address the contacts in their contact list by other names, and these forms of address frequently appear in the terminal's short messages and voice call content. The voice call content and/or short messages of the terminal therefore make it possible to obtain as much replacement information for the designated information as possible, which improves the terminal's recognition rate when it recognizes the user's voice information. To improve the recognition rate further, the user can also keep entering change commands to update the replacement information for error-prone designated information, so that the replacement information always best matches the user's usage habits.

According to a further aspect of the present invention, a terminal is also proposed, comprising the voice information recognition device according to any one of the above technical solutions.

In this technical solution, providing the voice information recognition device in the terminal enables the terminal to support multi-system, multi-user speech recognition, effectively improves speech recognition accuracy, and enhances the application scenarios and user experience of speech recognition.

With the above technical solutions, multi-system, multi-user speech recognition can be supported, speech recognition accuracy can be effectively improved, and the application scenarios and user experience of speech recognition can be enhanced.

Brief description of the drawings

Fig. 1 shows a schematic flowchart of a voice information recognition method according to an embodiment of the present invention;

Fig. 2 shows a block diagram of a voice information recognition device according to an embodiment of the present invention;

Fig. 3 shows a block diagram of a terminal according to an embodiment of the present invention;

Fig. 4 shows a schematic flowchart of a method for binding user identification information to a personal preset voice library according to an embodiment of the present invention;

Fig. 5 shows a schematic flowchart of a method for retrieving matching information according to an embodiment of the present invention;

Fig. 6 shows a schematic flowchart of a voice information recognition method according to another embodiment of the present invention;

Fig. 7 shows a schematic flowchart of a method for determining alternative form-of-address information for a contact in voice information according to an embodiment of the present invention.

Detailed description of the embodiments

So that the above objects, features, and advantages of the present invention can be understood more clearly, the invention is described in further detail below with reference to the drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features within them may be combined with one another.

Many specific details are set forth in the following description to provide a thorough understanding of the present invention. However, the invention can also be implemented in ways other than those described here, so the scope of protection of the invention is not limited by the specific embodiments disclosed below.

Fig. 1 shows a schematic flowchart of a voice information recognition method according to an embodiment of the present invention.

As shown in Fig. 1, the voice information recognition method of the embodiment comprises: step 102, receiving input voice information; step 104, according to user identification information corresponding to the voice information, retrieving matching information that matches the voice information from a preset voice library corresponding to the user identification information; and step 106, recognizing the voice information according to the matching information, so as to perform an operation according to the recognized target voice information.

Fig. 2 shows a block diagram of a voice information recognition device according to an embodiment of the present invention.

As shown in Fig. 2, the voice information recognition device 200 of the embodiment comprises: a receiving unit 202 that receives input voice information; a calling unit 204 that, according to user identification information corresponding to the voice information, retrieves matching information that matches the voice information from a preset voice library corresponding to the user identification information; and a recognition unit 206 that recognizes the voice information according to the matching information, so as to perform an operation according to the recognized target voice information.

In the above technical solution, preferably, the device further comprises a storage unit 208 that, according to a received storage command, stores the user identification information and the matching information in correspondence in the preset voice library.

In the above technical solution, preferably, the matching information comprises at least one of: replacement information for designated information in the voice information, alternative form-of-address information for a contact in the voice information, the intensity of the voice signal corresponding to the voice information, the frequency of the voice signal corresponding to the voice information, and the waveform of the voice signal corresponding to the voice information. When the matching information comprises the replacement information and the alternative form-of-address information, the recognition unit 206 is specifically configured to: replace the designated information in the voice information according to the replacement information, and determine the contact details of the contact in the voice information according to the alternative form-of-address information, so as to recognize the voice information.

In the above technical solution, preferably, the device further comprises: an acquiring unit 210 that, according to a received acquisition command, obtains the replacement information for the designated information from the voice call content and/or short messages of the terminal; and a changing unit that, according to a received change command, changes the replacement information for the designated information.

Fig. 3 shows a block diagram of a terminal according to an embodiment of the present invention.

As shown in Fig. 3, the terminal of the embodiment comprises the voice information recognition device 200 according to any one of the above technical solutions.

In this technical solution, providing the voice information recognition device 200 in the terminal enables the terminal to support multi-system, multi-user speech recognition, effectively improves speech recognition accuracy, and enhances the application scenarios and user experience of speech recognition.

Fig. 4 shows a schematic flowchart of a method for binding user identification information to a personal preset voice library according to an embodiment of the present invention.

As shown in Fig. 4, the method for binding user identification information to personal identity information according to an embodiment of the present invention comprises:

Step 402: the user inputs a personal identifying feature (that is, user identification information) to be bound to personal identity information. Specifically, this process is initiated by the user, who can use biometric features such as fingerprint information, voiceprint information (the intensity and/or frequency and/or waveform of the voice signal), or facial feature information, or use a password, to begin establishing in the terminal the user identification information associated with the personal identity information.

Step 404: determine whether the user has already saved this personal identity information. If so, proceed to step 408: bind the input user identification information to the personal identity information of the existing user, and store the two in correspondence in the personal preset voice library corresponding to this user identification information (or to the personal identity information of the existing user). If not, proceed to step 406: create new personal identity information, bind the user identification information to it, and store the two in correspondence in the personal preset voice library corresponding to this user identification information (or to the newly created personal identity information).
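The branch between steps 406 and 408 can be sketched as follows. This is a minimal illustration only: the `PersonalProfile` record, the `bind_identifier` function, and the string form of the identifiers are all hypothetical, and how a biometric feature is actually captured or compared is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalProfile:
    """Hypothetical record tying personal identity information to its identifiers."""
    personal_id: str
    identifiers: set = field(default_factory=set)       # e.g. fingerprint or voiceprint IDs
    voice_library: dict = field(default_factory=dict)   # personal preset voice library


def bind_identifier(profiles: dict, personal_id: str, identifier: str) -> bool:
    """Bind an identifier to personal identity information.

    Returns True if new personal identity information had to be created.
    """
    created = personal_id not in profiles                 # step 404: already saved?
    if created:
        profiles[personal_id] = PersonalProfile(personal_id)   # step 406: create it
    profiles[personal_id].identifiers.add(identifier)    # steps 406/408: bind
    return created
```

In this sketch a user who later adds a second identifier, for example a voiceprint after a fingerprint, ends up with both bound to the same personal library.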

Fig. 5 shows a schematic flowchart of a method for retrieving matching information according to an embodiment of the present invention.

As shown in Fig. 5, the method for retrieving matching information according to an embodiment of the present invention comprises:

Step 502: start the speech recognition mode of the terminal.

Step 504: determine whether user identification information (that is, the user's personal identifying feature) has been received from the user; if so, proceed to step 506.

Step 506: search the personal preset voice library corresponding to the user identification information for matching information that matches the user's identification information.

Step 508: determine whether matching information that matches the received user identification information has been retrieved; if so, proceed to step 510.

Step 510: if matching information that matches the user's identification information has been retrieved, activate the personalized speech recognition framework, that is, allow matching information to be retrieved from the personal preset voice library corresponding to the user identification information in order to recognize the voice information input by the user.
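The two checks in steps 504 and 508 amount to a gate in front of the personalized framework, which might be sketched like this. The function name `activate_personal_framework` and the dictionary layout are hypothetical. The flowchart text does not say what happens on the "no" branches, so returning `None` to mean "keep using the system default recognition" is an assumption of this sketch.

```python
from typing import Optional


def activate_personal_framework(identifier: Optional[str],
                                libraries: dict) -> Optional[dict]:
    """Return the personal library to recognize with, or None for default mode."""
    if identifier is None:                  # step 504: no identification received
        return None
    library = libraries.get(identifier)     # step 506: search the personal library
    if not library:                         # step 508: nothing matched
        return None
    return library                          # step 510: personalized framework active
```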

Fig. 6 shows a schematic flowchart of a voice information recognition method according to another embodiment of the present invention.

As shown in Fig. 6, the voice information recognition method of another embodiment of the present invention comprises:

Step 602: activate the personalized speech recognition framework, that is, allow matching information to be retrieved from the personal preset voice library corresponding to the input user identification information in order to recognize the voice information input by the user.

Step 604: determine whether the recognition of the received voice information is in error; if so, proceed to steps 606 and 608.

Step 606: continually correct the designated information in the voice information, and take the corrected designated information as the user's error-prone sensitive words (that is, the replacement information).

Step 608: obtain the frequency and/or intensity and/or waveform of the voice signal in the voice information, so as to improve the recognition rate of the voice information.

Step 610: bind the replacement information and the frequency and/or intensity and/or waveform of the user's voice signal to the user identification information again.

Step 612: store the bound replacement information and the frequency and/or intensity and/or waveform of the user's voice signal in the personal preset voice library corresponding to the user identification information.

Specifically, when the speech recognition device recognizes voice input from the user, the user will often correct the word recognition result (that is, the designated information) fed back by the system in order to obtain the expected result. In that case, the corrected designated information is saved, as the user's error-prone sensitive words (that is, the replacement information), among the frequently used words in the personal preset voice library corresponding to the user identification information. At the same time, the terminal collects the user's voice frequency data and/or the intensity and/or waveform of the voice signal and saves them in the personal preset voice library corresponding to the user identification information, so as to improve the recognition rate of the voice information.
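The correction flow of Fig. 6 can be sketched as a single update to the user's personal library. The function `record_correction`, the `"replacements"` and `"voice_samples"` keys, and the shape of the `voice_features` dictionary are hypothetical choices for this example; the source does not specify how the signal features are represented or compared.

```python
def record_correction(library: dict, recognized: str, corrected: str,
                      voice_features: dict) -> bool:
    """Store a user's correction and voice features in that user's personal library.

    Returns True if the recognition result was wrong and the library was updated.
    """
    if recognized == corrected:             # step 604: result was right, nothing to do
        return False
    # Step 606: keep the corrected word as the replacement for the misrecognized one.
    library.setdefault("replacements", {})[recognized] = corrected
    # Step 608: keep the sampled frequency / intensity / waveform data.
    # Steps 610-612: both are stored in the library already bound to this user.
    library.setdefault("voice_samples", []).append(voice_features)
    return True
```

A later correction of the same word overwrites the earlier entry, which mirrors the idea that the replacement information should track the user's most recent habits.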

As shown in Fig. 7, the method for determining alternative form-of-address information for a contact in voice information according to an embodiment of the present invention comprises:

Step 702: monitor the user's voice call content and short messages.

Step 704: determine whether a form-of-address keyword for a contact appears in the voice call content and short messages. Specifically, determine whether a form-of-address keyword exists; if so, proceed to step 706.

Step 706: determine whether the number of times this form-of-address keyword (that is, the alternative form-of-address information) has been used exceeds a set threshold. If so, proceed to step 708: set this alternative form-of-address information as a label pending binding for the contact (that is, determine the alternative form-of-address information of the contact). If not, return to step 704.

Step 710: determine whether the user confirms this label (that is, confirms this alternative form of address). If so, proceed to step 712: bind the alternative form of address (that is, the label) to the user's contact.

Specifically, the relevant contact information mainly concerns contacts, and it is identified and integrated as follows. Users often do not refer to their relatives and friends by name when contacting them, but by a nickname or form of address. In this case the phone automatically detects and identifies the form-of-address nouns that occur frequently in the user's calls and short messages, such as "father", "mother", and "wife", and associates these words with the actual contacts, that is, adds contact labels. The first time the user uses a contact label in a speech recognition operation in personal identification mode, the system displays a confirmation prompt; once the user confirms, the label is bound to the corresponding contact.
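The threshold test and the confirmation step of Fig. 7 can be sketched as two small functions. Everything named here is hypothetical: `propose_contact_labels`, `confirm_label`, the whitespace tokenization, and the caller-supplied set of candidate address words. How a keyword is linked to a particular contact is not specified in the text, so the sketch takes the contact as an input.

```python
from collections import Counter


def propose_contact_labels(messages: list, address_words: set, threshold: int) -> list:
    """Return address words used more than `threshold` times (steps 702 to 708)."""
    counts = Counter()
    for text in messages:                          # step 702: scan calls and messages
        for word in text.lower().split():
            token = word.strip(".,!?")
            if token in address_words:             # step 704: address keyword present?
                counts[token] += 1
    # Step 706: keep only keywords whose use count exceeds the threshold.
    return sorted(word for word, n in counts.items() if n > threshold)


def confirm_label(contact_labels: dict, contact: str, label: str,
                  user_confirms: bool) -> bool:
    """Bind the label to the contact only after the user confirms (steps 710 to 712)."""
    if user_confirms:
        contact_labels.setdefault(contact, set()).add(label)
    return user_confirms
```

Keeping the proposal and the confirmation separate reflects the flow in the text, where a frequently used word only becomes a pending label until the user approves it.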

The technical solution of the present invention is illustrated below with an example:

1) User A turns on the phone and uses biometric user identification information (voiceprint, fingerprint, facial features, and so on) or a password to create user A's personal preset voice library. User B can likewise create user B's personal preset voice library.

2) The phone holds the contact information "Qin Lan 18566668888, Liu Siqi 17099996666". The former, "Qin Lan", is user A's wife and user B's mother; the latter, "Liu Siqi", is user A's daughter and user B's elder sister. During user A's speech recognition, the system detects form-of-address keywords in calls and short messages, and adds the label "wife" to "Qin Lan" and the label "daughter" to "Liu Siqi". Likewise, for user B, the label "mother" is added to "Qin Lan" and the label "elder sister" is added to "Liu Siqi".

3) While using speech recognition, user A corrects the system's default recognition result for the following word to the expected result: set → check. User B's correction is: set → geometry. In this case, "set → check" is added to user A's personal preset voice library as a sensitive word (that is, the replacement information for the designated information), and "set → geometry" is added to user B's personal preset voice library as a sensitive word (that is, the replacement information for the designated information).

4) The phone enters the speech recognition state and recognizes according to the system default state. If user A has activated personal identification mode, subsequent speech recognition draws on the data in user A's personal preset voice library. For example, user A issues the voice command "Send a message to my wife saying that I have a check task to do tonight". The system then directly returns the recognition result "Send message to Qin Lan: I have a check task to do tonight", instead of first asking user A to confirm "What is your wife's name?", and the message content is not misrecognized as "I have a set task to do tonight". This clearly simplifies the speech recognition workflow and improves the user experience.

At this point, the phone has completed for user A the entire flow from establishing user identification information to retrieving and applying the user's personal preset voice library. The flow for user B, from establishing user identification information to retrieving and applying the personal preset voice library, is the same as for user A. It should be noted that, although this embodiment is described in terms of a single user, the personal identification function of the invention allows multi-user, multi-mode operation to be realized as a set of multiple single users.
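The example above can be replayed in a short Python sketch. The contact names, numbers, labels, and corrections are taken from the example; the `PersonalLibrary` class, the `send_message_command` function, and the assumption that the command has already been split into an addressee and a message body are hypothetical simplifications.

```python
from dataclasses import dataclass, field

# Contact list shared by both users, as given in the example.
CONTACTS = {"Qin Lan": "18566668888", "Liu Siqi": "17099996666"}


@dataclass
class PersonalLibrary:
    """Hypothetical personal preset voice library for one user."""
    labels: dict = field(default_factory=dict)        # form of address -> contact name
    replacements: dict = field(default_factory=dict)  # default word -> user's word


LIBRARIES = {
    "A": PersonalLibrary({"wife": "Qin Lan", "daughter": "Liu Siqi"}, {"set": "check"}),
    "B": PersonalLibrary({"mother": "Qin Lan", "elder sister": "Liu Siqi"},
                         {"set": "geometry"}),
}


def send_message_command(user: str, addressee: str, body: str):
    """Resolve 'send a message to <addressee> saying <body>' for one user."""
    library = LIBRARIES[user]
    # Resolve the form of address to a stored contact; unknown labels pass through.
    contact = library.labels.get(addressee, addressee)
    number = CONTACTS.get(contact)   # None would mean the user has to be asked
    # Replace error-prone words with this user's corrections.
    text = " ".join(library.replacements.get(w, w) for w in body.split())
    return contact, number, text
```

Under these assumptions the same default transcript resolves to different contacts and message text depending on which user's library is active.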

The technical solution of the present invention has been described above with reference to the drawings. With this technical solution, multi-system, multi-user speech recognition can be supported, speech recognition accuracy can be effectively improved, and the application scenarios and user experience of speech recognition can be enhanced.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A voice information recognition method, characterized by comprising:

receiving input voice information;

according to user identification information corresponding to the voice information, retrieving matching information that matches the voice information from a preset voice library corresponding to the user identification information; and

recognizing the voice information according to the matching information, so as to perform an operation according to the recognized target voice information.

2. The voice information recognition method according to claim 1, characterized in that,

according to a received storage command, the user identification information and the matching information are stored in correspondence in the preset voice library.

3. The voice information recognition method according to claim 1, characterized in that,

the matching information comprises at least one of: replacement information for designated information in the voice information, alternative form-of-address information for a contact in the voice information, the intensity of the voice signal corresponding to the voice information, the frequency of the voice signal corresponding to the voice information, and the waveform of the voice signal corresponding to the voice information; and

when the matching information comprises the replacement information and the alternative form-of-address information, recognizing the voice information according to the matching information specifically comprises:

replacing the designated information in the voice information according to the replacement information, and determining the contact details of the contact in the voice information according to the alternative form-of-address information, so as to recognize the voice information.

4. The voice information recognition method according to claim 3, characterized in that,

according to a received acquisition command, the replacement information for the designated information is obtained from the voice call content and/or short messages of the terminal; and

according to a received change command, the replacement information for the designated information is changed.

5. The voice information recognition method according to any one of claims 1 to 4, characterized in that,

the user identification information comprises at least one of: fingerprint information of the user, the intensity of the voice signal of the user, the frequency of the voice signal of the user, the waveform of the voice signal of the user, and facial feature information of the user.

6. A voice information recognition device, characterized by comprising:

a receiving unit that receives input voice information;

a calling unit that, according to user identification information corresponding to the voice information, retrieves matching information that matches the voice information from a preset voice library corresponding to the user identification information; and

a recognition unit that recognizes the voice information according to the matching information, so as to perform an operation according to the recognized target voice information.

7. The voice information recognition device according to claim 6, characterized by further comprising:

a storage unit that, according to a received storage command, stores the user identification information and the matching information in correspondence in the preset voice library.

8. The voice information recognition device according to claim 6, characterized in that,

when the matching information comprises the replacement information and the alternative form-of-address information, the recognition unit is specifically configured to:

9. The voice information recognition device according to claim 8, characterized by further comprising:

an acquiring unit that, according to a received acquisition command, obtains the replacement information for the designated information from the voice call content and/or short messages of the terminal; and

a changing unit that, according to a received change command, changes the replacement information for the designated information.

10. The voice information recognition device according to any one of claims 6 to 9, characterized in that,

11. A terminal, characterized by comprising the voice information recognition device according to any one of claims 6 to 10.