CN103794214A - Information processing method, device and electronic equipment - Google Patents

Information processing method, device and electronic equipment

Info

Publication number
CN103794214A
Authority
CN
China
Prior art date
Legal status
Pending
Application number
CN201410083622.5A
Other languages
Chinese (zh)
Inventor
戴中原
戴海生
Current Assignee
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Application filed by Lenovo Beijing Ltd
Priority to CN201410083622.5A
Publication of CN103794214A

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention provides an information processing method applied to an electronic device. When a user's voice input is recognized, the voice input is matched by a speech recognition engine to obtain multiple matching results. When a selection fed back by the user is received, the user's selection is determined to be a first matching result. The voice input, the multiple matching results, and the first matching result selected by the user are recorded, and adaptation training is performed on the model of the speech recognition engine according to the recorded content, so that the first matching result is produced the next time the voice input is matched based on the speech recognition engine. Because the method takes the user's selection among the matching results into account, it improves the degree of match between the voice input and the recognition result the user selects, raises the speech recognition accuracy of the engine, optimizes recognition performance, and improves the user experience.

Description

Information processing method, apparatus, and electronic device
Technical field
The invention belongs to the field of speech recognition, and in particular relates to an information processing method, an apparatus, and an electronic device.
Background art
With the development of electronic technology, performing speech recognition on electronic devices has become a common technique.
A user inputs a segment of voice information, and a recognition engine in the electronic device automatically recognizes the voice information and displays the recognition result.
In the prior art, however, recognition of the user's voice input yields multiple results, from which the user selects the target result as needed. In this recognition process, the result displayed first may not be the user's target result but some other confusable result; the user must then select the target result manually, and interaction efficiency is low.
Summary of the invention
In view of this, the object of the present invention is to provide a speech recognition method in which the recognition engine adjusts its recognition results in accordance with the user's selection, improving the speech recognition accuracy of the recognition engine.
An information processing method applied to an electronic device provided with a voice acquisition unit, the method comprising:
receiving a voice input acquired by the voice acquisition unit;
matching the voice input based on a speech recognition engine to obtain a matching result group composed of at least two matching results related to the voice input;
outputting the matching result group;
receiving an input operation fed back by a user;
determining a first matching result from the matching result group according to the input operation;
recording the voice input, the matching result group, and the first matching result;
performing model adaptation training of the speech recognition engine on the recorded voice input, matching result group, and first matching result, so that the first matching result is produced the next time the voice input is matched based on the speech recognition engine.
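The steps above can be sketched end to end with a toy engine. This is an illustrative sketch only, not the patent's implementation: the class `SpeechEngine`, its score table, and the additive adaptation rule are hypothetical stand-ins for the engine's model.

```python
# Toy sketch of the claimed flow (steps S101-S107). All names and scores
# are hypothetical; a real engine would score acoustic input, not strings.

class SpeechEngine:
    def __init__(self):
        self.bias = {}   # (voice input, result) -> learned adjustment

    def match(self, voice_input, base_scores):
        """Step S102: return the matching result group, best match first."""
        return sorted(base_scores,
                      key=lambda r: base_scores[r]
                      + self.bias.get((voice_input, r), 0.0),
                      reverse=True)

    def adapt(self, voice_input, group, first_result,
              first_value=0.5, second_value=0.2):
        """Step S107: strengthen the user-selected first matching result
        and weaken the other results (first_value > second_value)."""
        for r in group:
            delta = first_value if r == first_result else -second_value
            self.bias[(voice_input, r)] = \
                self.bias.get((voice_input, r), 0.0) + delta

engine = SpeechEngine()
scores = {"Beijing leather": 0.9, "Beijing weather": 0.8, "Nanjing weather": 0.6}

group = engine.match("beijing weather", scores)   # S102/S103: matching result group
first = "Beijing weather"                         # S104/S105: the user's selection
engine.adapt("beijing weather", group, first)     # S106/S107: record and adapt

# The next time the same voice input is matched, the user's choice ranks first.
assert engine.match("beijing weather", scores)[0] == "Beijing weather"
```

After adaptation, the engine no longer presents the confusable top result first for this input, which is the interaction-efficiency gain the summary describes.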
In the above method, preferably, before receiving the voice input acquired by the voice acquisition unit, the method further comprises:
obtaining identity information of the user who performs the voice input.
In the above method, preferably, the method further comprises:
while recording the voice input, the matching result group, and the first matching result, recording the identity information of the user of the voice input, so that the recorded voice input, matching result group, first matching result, and user identity information are used to perform model adaptation training of the speech recognition engine targeted at that user's articulation, and so that the first matching result is produced the next time the voice input from that user is matched based on the speech recognition engine.
In the above method, preferably, performing model adaptation training of the speech recognition engine on the recorded voice input, matching result group, and first matching result comprises:
determining, based on the matching result group and the first matching result, the second matching results other than the first matching result;
raising the matching rate of the first matching result with respect to the voice input from its current value by a first value;
lowering the matching rate of each second matching result with respect to the voice input from its current value by a second value;
wherein the first value is greater than the second value.
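This adjustment can be written as a small update rule. The concrete values `first_value=0.5` and `second_value=0.2` are hypothetical; the claim only requires that the first value exceed the second.

```python
def adjust_matching_rates(rates, first_result, first_value=0.5, second_value=0.2):
    """Raise the selected first matching result's matching rate by
    first_value and lower every other (second) matching result's rate by
    second_value; the claim requires first_value > second_value."""
    assert first_value > second_value
    return {result: rate + (first_value if result == first_result else -second_value)
            for result, rate in rates.items()}

rates = {"Zhang Shan": 0.9, "Zhang San": 0.8, "bolt": 0.7}
adjusted = adjust_matching_rates(rates, "Zhang San")
# The selected result now outscores the previous top result.
assert adjusted["Zhang San"] > adjusted["Zhang Shan"]
```

Because the first value is greater than the second, the selected result gains rank faster than the rejected results lose it, so a single confirmation can already reorder the group.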
In the above method, preferably, after determining the first matching result from the matching result group according to the input operation and before recording the voice input, the matching result group, and the first matching result, the method further comprises:
judging, according to a preset condition, whether the response operation of the first matching result to the voice input has been completed;
and, when it has been completed, performing the step of recording the voice input, the matching result group, and the first matching result.
An information processing apparatus applied to an electronic device provided with a voice acquisition unit, the apparatus comprising:
a first receiving module, for receiving a voice input acquired by the voice acquisition unit;
a matching module, for matching the voice input based on a speech recognition engine to obtain a matching result group composed of at least two matching results related to the voice input;
an output module, for outputting the matching result group;
a second receiving module, for receiving an input operation fed back by a user;
a selection module, for determining a first matching result from the matching result group according to the input operation;
a recording module, for recording the voice input, the matching result group, and the first matching result;
a training module, for performing model adaptation training of the speech recognition engine on the recorded voice input, matching result group, and first matching result, so that the first matching result is produced the next time the voice input is matched based on the speech recognition engine.
The above apparatus preferably further comprises:
an acquisition module, for obtaining identity information of the user who performs the voice input.
In the above apparatus, preferably, the recording module also records the identity information of the user of the voice input while recording the voice input, the matching result group, and the first matching result, so that the recorded voice input, matching result group, first matching result, and user identity information are used to perform model adaptation training of the speech recognition engine targeted at that user's articulation, and so that the first matching result is produced the next time the voice input from that user is matched based on the speech recognition engine.
In the above apparatus, preferably, the training module comprises:
a classification unit, for determining, based on the matching result group and the first matching result, the second matching results other than the first matching result;
a first revising unit, for raising the matching rate of the first matching result with respect to the voice input from its current value by a first value;
a second revising unit, for lowering the matching rate of each second matching result with respect to the voice input from its current value by a second value;
wherein the first value is greater than the second value.
The above apparatus preferably further comprises:
a judging module, for judging, according to a preset condition, whether the response operation of the first matching result to the voice input has been completed, and for triggering the recording module when it has been completed.
An electronic device, comprising the information processing apparatus described in any of the above and a voice acquisition unit that acquires voice input to the electronic device.
As can be seen from the above technical solutions, the present application provides an information processing method applied to an electronic device. In the method, when a user's voice input is recognized, the voice input is matched by a speech recognition engine to obtain multiple matching results. When the selection fed back by the user is received, the user's selection is determined to be the first matching result. The voice input, the multiple matching results, and the first matching result selected by the user are recorded, and adaptation training is performed on the model of the speech recognition engine according to the recorded content, so that the first matching result is produced the next time the voice input is matched based on the speech recognition engine. Because the method takes the user's selection among the matching results into account, it improves the degree of match between the voice input and the recognition result the user selects, raises the speech recognition accuracy of the engine, optimizes recognition performance, and improves the user experience.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below illustrate only some embodiments of the present invention; a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of embodiment 1 of an information processing method provided by the present application;
Fig. 2 is a flowchart of embodiment 2 of an information processing method provided by the present application;
Fig. 3 is a flowchart of embodiment 3 of an information processing method provided by the present application;
Fig. 4 is a flowchart of embodiment 4 of an information processing method provided by the present application;
Fig. 5 is a flowchart of embodiment 5 of an information processing method provided by the present application;
Fig. 6 is a schematic structural diagram of embodiment 1 of an information processing apparatus provided by the present application;
Fig. 7 is a schematic structural diagram of embodiment 2 of an information processing apparatus provided by the present application;
Fig. 8 is a schematic structural diagram of embodiment 3 of an information processing apparatus provided by the present application;
Fig. 9 is a schematic structural diagram of embodiment 4 of an information processing apparatus provided by the present application;
Fig. 10 is a schematic structural diagram of embodiment 5 of an information processing apparatus provided by the present application.
Detailed description
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To emphasize more specifically the independence of implementation, this specification refers to a number of modules or units. For example, a module or unit may be implemented as a hardware circuit comprising custom VLSI circuits or gate arrays, such as logic chips, transistors, or other components. A module or unit may also be implemented in programmable hardware such as field-programmable gate arrays, programmable logic arrays, or programmable logic devices.
A module or unit may also be implemented in software executed by various forms of processors. For example, an executable code module may comprise one or more physical or logical blocks of computer instructions, which may be organized as, for instance, an object, a procedure, or a function. Nevertheless, the executable parts of an identified module or unit need not be physically located together; they may comprise disparate instructions stored in different locations which, when joined logically together, form the module or unit and achieve its stated purpose.
Indeed, an executable code module or unit may be a single instruction or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Similarly, operational data may be identified and illustrated within modules or units, may be embodied in any suitable form, and may be organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations, including over different storage devices, and may exist, at least partially, merely as electronic signals on a system or network.
Reference throughout this specification to "one embodiment" or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment", "in an embodiment", and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
Furthermore, the described features, structures, or characteristics of the invention may be combined in any suitable manner in one or more embodiments. The following description provides many specific details, such as examples of programming, software modules, user selections, network transactions, database queries, database structures, hardware modules, hardware circuits, and hardware chips, to provide an understanding of embodiments of the invention. One of ordinary skill in the relevant art will recognize, however, that the invention may be practiced without one or more of these specific details, or with other methods, components, materials, and so forth. In other instances, well-known structures, materials, or operations are not described in detail to avoid obscuring the invention.
Fig. 1 shows a flowchart of embodiment 1 of an information processing method provided by the present application. The method may be applied to an electronic device such as a desktop computer, a notebook computer, a tablet computer, a mobile phone, a smart television, a smart watch, or a wearable device. The electronic device is provided with a voice acquisition unit for acquiring voice from the external environment; in the present application, the voice in the external environment especially refers to the voice uttered by the user of the electronic device.
Step S101: receive the voice input acquired by the voice acquisition unit.
The voice input here is voice uttered by the user for searching. It may comprise digital content such as a telephone number or a combination of digits, text content such as a name, or even a combination of text and digits; this embodiment imposes no limit.
The voice acquisition unit may acquire voice input in real time, or may acquire it only after the user turns acquisition on.
When the voice acquisition unit acquires voice input in real time, the voice input may include a start instruction that starts the speech recognition search.
Specifically, the start instruction may be one preset trigger voice signal or a group of them. When the voice input is consistent with the trigger voice signal, the speech recognition engine is started and step S102 is executed to match the voice input.
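A minimal sketch of this trigger check, with hypothetical preset phrases standing in for the trigger voice signals (a real system would compare acoustic signals or recognized keywords, not raw text):

```python
# Hypothetical preset trigger voice signals, represented as text.
TRIGGER_PHRASES = {"start search", "voice search"}

def maybe_start_recognition(voice_input_text):
    """Start the speech recognition search (step S102) only when the
    acquired input is consistent with a preset trigger signal."""
    return voice_input_text.strip().lower() in TRIGGER_PHRASES

assert maybe_start_recognition("Voice Search")
assert not maybe_start_recognition("hello")
```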
Step S102: match the voice input based on the speech recognition engine to obtain a matching result group composed of at least two matching results related to the voice input.
The speech recognition engine can match a voice input in two ways: matching within limited candidate content, or recognizing the voice input directly to generate multiple recognition results related to it.
In this embodiment, the limited candidate content may be content stored in the electronic device, specifically including contact names in the address book, file names, and so on.
When the user searches for a contact name on the electronic device, the speech recognition engine may match the voice input within the limited candidate content.
When the user uses voice input to search the web for some content on the electronic device, because network resources are numerous, the engine may instead directly generate multiple recognition results related to the voice input, from which the user selects the desired recognition result, and the web search is then performed according to that result.
Specifically, when the voice input is received, the preset speech recognition engine performs recognition matching on it to obtain at least two matching results, each of which is content related to the voice input, and the obtained matching results form the matching result group.
It should be noted that the speech recognition may be performed locally or on a cloud server associated with the electronic device.
When the speech recognition is performed on the cloud server associated with the electronic device, the electronic device is provided with a network element. When the voice input is received, it is uploaded to the cloud server through the network element for recognition matching, and when the cloud server obtains the matching result group and feeds it back, the electronic device receives the matching result group through the network element.
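The first matching mode above (matching within limited candidate content such as address-book names) can be sketched with stdlib fuzzy matching. `difflib`'s similarity ratio stands in here for the engine's real matching rate, and all names are illustrative.

```python
import difflib

def match_in_candidates(recognized_text, candidates, n=3):
    """Score each stored candidate (e.g. address-book names) against the
    recognized text and return the matching result group: the top-n
    candidates by matching rate, best first."""
    scored = sorted(
        ((difflib.SequenceMatcher(None, recognized_text, c).ratio(), c)
         for c in candidates),
        reverse=True)
    return [c for score, c in scored[:n]]

contacts = ["zhang san", "zhang shan", "li si", "wang wu"]
group = match_in_candidates("zhangshan", contacts)
assert group[0] == "zhang shan" and "zhang san" in group
```

Restricting matching to the stored candidate set is what makes this mode cheap enough to run locally, while the open-vocabulary mode is the one that benefits from the cloud server.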
Step S103: output the matching result group.
The matching result group contains multiple matching results, from which the user needs to select the target result corresponding to the user's voice input.
Specifically, the content of the output matching result group may be displayed on the display unit of the electronic device.
Within the matching result group, the matching results may be sorted by their matching rate with the voice input, with higher-rated results placed first, so that the user sees the higher-ranked matching results first.
Step S104: receive the input operation fed back by the user.
The input operation fed back by the user indicates the matching result the user selects among the multiple matching results.
The user's feedback input operation can be implemented in several ways.
For example, when the electronic device is provided with a touch screen, the multiple matching results obtained by the matching are displayed on the touch screen; the user selects the target result on the touch screen, this selection serves as the user's feedback input operation, and the electronic device receives it through the touch screen.
For example, when a keyboard is used with the electronic device, the multiple matching results are displayed on the display screen; the user selects the target result by operating the keyboard, and this keyboard selection likewise serves as the user's feedback input operation, which the electronic device receives through the keyboard.
Or, when a mouse is used with the electronic device, the user selects a target result from among the multiple matching results displayed on the display screen, and this mouse selection likewise serves as the user's feedback input operation.
Or, the user selects by voice control. For example, the user inputs the voice signal "select the second one"; the electronic device obtains this voice signal, recognizes it, and concludes that the second result is the target result. This voice selection likewise serves as the user's feedback input operation.
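The voice-control case above can be sketched as a small parser over the recognized control utterance. The ordinal vocabulary is hypothetical, and a real implementation would run speech recognition on the control utterance first.

```python
import re

# Hypothetical spoken-ordinal vocabulary mapping to result indices.
ORDINALS = {"first": 0, "second": 1, "third": 2, "one": 0, "two": 1, "three": 2}

def parse_voice_selection(utterance, result_group):
    """Interpret voice-control feedback such as 'select the second one'
    and return the chosen matching result (the first matching result)."""
    for word in re.findall(r"[a-z]+", utterance.lower()):
        if word in ORDINALS and ORDINALS[word] < len(result_group):
            return result_group[ORDINALS[word]]
    return None   # no recognizable selection in the utterance

group = ["Zhang Shan", "Zhang San", "bolt"]
assert parse_voice_selection("select the second one", group) == "Zhang San"
```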
Step S105: determine the first matching result from the matching result group according to the input operation.
According to the user's feedback input operation, the matching result corresponding to the input operation is determined from the matching result group as the first matching result.
The first matching result is the target result the user selects in this speech recognition process.
In actual implementation, after the first matching result is determined, the electronic device responds to it and starts to perform the operation corresponding to the first matching result.
For example, when the voice input is used to query an address-book contact, after the first matching result is determined to be the target contact, the operation of dialing that contact's telephone number can be performed.
For example, when the voice input is used for a network content query, after the first matching result is determined to be the search keyword, a web search can be performed according to that keyword.
Step S106: record the voice input, the matching result group, and the first matching result.
The result of recognition matching on the input voice is recorded, including the voice input, the matching result group, and the first matching result.
It should be noted that, when the speech recognition is performed on the cloud server, the voice input, the matching result group, and the first matching result can be uploaded to the cloud server for storage through the network element of the electronic device.
Step S107: perform model adaptation training of the speech recognition engine on the recorded voice input, matching result group, and first matching result, so that the first matching result is produced the next time the voice input is matched based on the speech recognition engine.
Because recognizing the voice input with the speech recognition engine produced multiple matching results related to it, the engine's recognition of the voice input is not accurate enough. Therefore, the voice input, the matching result group, and the first matching result are used as input for adaptation training of the engine's model, to increase the degree of association between the first matching result and the voice input and improve the accuracy of the engine's recognition matching for that voice input.
In this way, thanks to the model adaptation training of the speech recognition engine, the next time the voice input is matched based on the engine, it can be recognized and matched accurately to obtain the first matching result; the user no longer needs to select from multiple matching results, which simplifies the user's operation and improves the user experience.
It should be noted that this adaptation training of the speech recognition engine model may be performed locally or on the cloud server associated with the electronic device.
When the adaptation training of the speech recognition engine model is performed on the cloud server associated with the electronic device, the electronic device is provided with a network element; the recorded voice input, matching result group, and first matching result are uploaded to the cloud server through the network element, and the speech recognition engine model located in the cloud server undergoes the adaptation training.
It should also be noted that the user's operation on the electronic device may be inferred from the posture of the electronic device. When the electronic device is a handheld terminal and the posture of the handheld terminal shows that the user's operation is making a phone call, the speech recognition engine can directly match the voice input against the contacts in the address book, narrowing the scope of recognition matching for the voice input.
Specifically, when the user makes a phone call with the handheld terminal, the posture of the handheld terminal may satisfy conditions such as: the angle between the handheld terminal and the vertical direction falls within a preset angle range; the surface temperature of the handheld terminal falls within a preset temperature range; or the distance detected by the handheld terminal falls within a preset distance. Any two of these conditions may be combined, or all of them may be required.
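The posture test can be sketched as a simple condition combiner. The threshold values below are hypothetical; the patent only requires preset ranges, combinable two at a time or all together.

```python
# Hypothetical preset ranges for the phone-call posture check.
ANGLE_RANGE = (0.0, 40.0)    # degrees from the vertical direction
TEMP_RANGE = (30.0, 40.0)    # surface temperature, degrees Celsius
MAX_DISTANCE = 5.0           # proximity-sensor distance, centimetres

def looks_like_phone_call(angle_deg, surface_temp, distance_cm,
                          require_all=False):
    """Infer from the handheld terminal's posture whether the user is
    making a phone call; if so, the engine may match the voice input
    only against address-book contacts."""
    checks = [
        ANGLE_RANGE[0] <= angle_deg <= ANGLE_RANGE[1],
        TEMP_RANGE[0] <= surface_temp <= TEMP_RANGE[1],
        distance_cm <= MAX_DISTANCE,
    ]
    # "two combinations wherein, or meets full terms": any two, or all.
    return all(checks) if require_all else sum(checks) >= 2

assert looks_like_phone_call(20.0, 33.5, 2.0)
assert not looks_like_phone_call(80.0, 20.0, 30.0)
```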
Because everyone's pronunciation habits differ, speech recognition matching for a user also needs to account for that user's pronunciation habits.
Referring to Fig. 2, a flowchart of embodiment 2 of an information processing method provided by the present invention is shown.
Step S201: obtain the identity information of the user who performs the voice input.
Each user's identity information is unique. Obtaining the identity information of the user of the voice input allows recognition matching targeted at that user, so as to obtain matching results for that user.
The user's identity information can be obtained in several ways, including face recognition, voiceprint recognition, fingerprint recognition, and data entry.
For example, when the identity information is obtained by face recognition, the user places the electronic device within a preset distance range of the user's face, and the electronic device obtains the user's facial features to determine the user's identity information.
As another example, when the identity information is obtained by data entry, a login dialog box is provided in a preset area of the electronic device; the user enters information representing his or her identity into the dialog box, thereby determining the user's identity information.
Or, when the identity information is obtained by fingerprint recognition, the user places the corresponding finger on the fingerprint acquisition area of the electronic device, and the electronic device obtains the user's fingerprint features to determine the user's identity information.
Or, when the identity information is obtained by voiceprint recognition, the user utters a test sound to the electronic device, so that the electronic device obtains the user's voice and recognizes it to determine the user's identity information.
It should be noted that, when voiceprint recognition is used to obtain the user's identity information, the voiceprint recognition may be performed on the voice input itself when it is received after being acquired by the voice acquisition unit, so that the model of the speech recognition engine corresponding to the user's identity information is used to perform recognition matching on the voice input.
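The idea of keeping a per-identity engine model can be sketched as a small registry. The class and its interface are hypothetical; the "model" here is just a bias table, standing in for whatever per-user state the engine adapts.

```python
# Hypothetical per-user model registry: identification (voiceprint, face,
# fingerprint, or data entry) yields an identity, which selects that
# user's adapted engine model.
class EngineModelRegistry:
    def __init__(self, make_default_model):
        self._make = make_default_model
        self._models = {}

    def model_for(self, user_id):
        """Return the adapted model for this identity, creating a fresh
        default model for a first-time user."""
        if user_id not in self._models:
            self._models[user_id] = self._make()
        return self._models[user_id]

registry = EngineModelRegistry(dict)   # a "model" is just a bias table here
li_si = registry.model_for("li_si")
li_si["zhangshan->Zhang San"] = 0.5    # adaptation recorded for Li Si only
assert "zhangshan->Zhang San" not in registry.model_for("wang_wu")
```

Keeping the adaptation keyed by identity is what prevents one user's non-standard pronunciation from degrading recognition for everyone else on a shared device.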
Step S202: receive the voice input acquired by the voice acquisition unit.
Step S203: match the voice input based on the speech recognition engine to obtain a matching result group composed of at least two matching results related to the voice input.
Step S204: output the matching result group.
Step S205: receive the input operation fed back by the user.
Step S206: determine the first matching result from the matching result group according to the input operation.
Steps S202-S206 are consistent with steps S101-S105 in embodiment 1 and are not repeated in this embodiment.
Step S207: record the voice input, the matching result group, the first matching result, and the identity information of the user of the voice input.
The voice input, the matching result group, the first matching result, and the identity information of the user of the voice input are recorded together as the result of this recognition.
It should be noted that, when the speech recognition is performed on the cloud server, the voice input, the matching result group, the first matching result, and the identity information of the user of the voice input can be uploaded to the cloud server for storage through the network element of the electronic device.
Step S208: perform adaptation training of the speech recognition engine model, targeted at the user's pronunciation style, using the recorded voice input, matching result group, first matching result, and user identity information, so that the next time this user's voice input is matched based on the speech recognition engine, the first matching result is produced.
Here, the recognition matching performed on this user's voice input produced multiple matching results, which shows that the engine's recognition of this user's voice input is not accurate enough. Taking the voice input, the matching result group, the first matching result, and the user's identity information as input, the speech recognition engine model is adaptively trained for this user's pronunciation style, which strengthens the association between the first matching result and this user's voice input and improves the accuracy with which the engine recognizes and matches this user's voice input.
For example, suppose the voice input of Li Si, a registered user of the electronic device, is "zhangshan", and the speech recognition engine produces the three matching results "Zhang Shan", "Zhang San", and "bolt" in that order. Because the user's pronunciation is non-standard, the target result is actually "Zhang San", and the first matching result "Zhang San" is determined from the user's selection. The electronic device then records the voice input "zhangshan", the three matching results "Zhang Shan", "Zhang San", and "bolt", the first matching result "Zhang San", and Li Si's identity information as the result of this recognition matching, and uses this result to adaptively train the speech recognition engine for Li Si's pronunciation style. The next time Li Si's voice input is "zhangshan", the engine matches it directly to "Zhang San", and the user no longer has to choose among multiple matching results, which simplifies the user's operation and improves the user experience.
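The per-user correction flow in this example can be sketched as follows. This `RecognitionLog` class (name and dictionary-based layout are assumptions for illustration) merely stores a user's confirmed choices; the patent's actual method retrains the engine model itself rather than keeping a lookup table.

```python
# Illustrative sketch only: a minimal per-user correction store standing in
# for the record-and-adapt flow of steps S207-S208. The real engine performs
# model adaptation training; this table simply replays a confirmed choice.

class RecognitionLog:
    def __init__(self):
        # (user_id, voice_input) -> (matching result group, first matching result)
        self._records = {}

    def record(self, user_id, voice_input, candidates, first_result):
        # Record the voice input, the matching result group, the user's
        # chosen first matching result, and the user's identity.
        self._records[(user_id, voice_input)] = (list(candidates), first_result)

    def match(self, user_id, voice_input, engine_candidates):
        # If this user previously confirmed a result for this input, return
        # the first matching result directly instead of the full group.
        hit = self._records.get((user_id, voice_input))
        if hit is not None:
            return [hit[1]]
        return list(engine_candidates)

log = RecognitionLog()
candidates = ["Zhang Shan", "Zhang San", "bolt"]
log.record("Li Si", "zhangshan", candidates, "Zhang San")
print(log.match("Li Si", "zhangshan", candidates))    # Li Si matches directly
print(log.match("Wang Wu", "zhangshan", candidates))  # other users unaffected
```

As in the text, only Li Si's later inputs benefit from the training; another user saying "zhangshan" still receives the full matching result group.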
It should be noted that, when a user performs voice input for recognition matching and it is not the first time, the acquired user identity information is compared with the stored historical information to determine that this is a recorded user. When this user's voice input is then recognized by the speech recognition engine, the engine model corresponding to this user is used; that model has been trained on the user's previous voice inputs, can perform targeted recognition for this user's pronunciation style, and therefore has higher recognition accuracy.
It should be noted that, when a non-logged-in user uses the electronic device, recognition matching can still be performed on that user's voice input, but the recognition result is not recorded.
Further, the owner of the electronic device can also set usage permissions; when a user without usage permission uses the electronic device and fails the permission check, no recognition matching is performed on that user's voice input.
Referring to Fig. 3, which is a flowchart of embodiment 3 of an information processing method provided by the invention.
Step S301: receive the voice input collected by the voice collection unit;
Step S302: match the voice input based on a speech recognition engine to obtain a matching result group composed of at least two matching results related to the voice input;
Step S303: output the matching result group;
Step S304: receive an input operation fed back by the user;
Step S305: determine a first matching result from the matching result group according to the input operation;
Step S306: record the voice input, the matching result group, and the first matching result;
Here, steps S301-S306 are consistent with steps S101-S106 in embodiment 1 above and are not repeated in this embodiment.
Step S307: determine, based on the matching result group and the first matching result, the second matching results other than the first matching result;
Here, the matching result group contains at least two matching results; one of them is the first matching result, and the remainder are second matching results.
For instance, in the example above, the matching result group is "Zhang Shan", "Zhang San", and "bolt"; "Zhang San" is the first matching result, and the remaining "Zhang Shan" and "bolt" are second matching results.
Step S308: raise the matching rate of the first matching result with the voice input from its current value by a first value;
Here, when the speech recognition engine recognizes the voice input, each matching result is marked with its matching rate with the voice input; the closer the pronunciation of a matching result is to the voice input, the higher its matching rate.
When the first matching result has been determined from the user's selection, it indicates that, for the user, the pronunciation of the voice input is closest to the selected first matching result; therefore, the matching rate of the first matching result with the voice input is raised from its current value by a first value.
Step S309: lower the matching rate of each second matching result with the voice input from its current value by a second value.
When the first matching result has been determined from the user's selection, it indicates that, for the user, the pronunciation of the voice input is closest to the selected first matching result, while the other matching results do not meet the user's needs; therefore, the matching rates of the second matching results with the voice input are also lowered from their current values by a second value.
Here, the first value is greater than the second value.
Consequently, after the matching rates are adjusted, the matching rate of the first matching result with the voice input is the highest and the matching rates of the second matching results are lower, which widens the gap between the matching rates of the first and second matching results and makes it easier to find this unique first matching result the next time the user's voice input is recognized.
In the example above, when the voice input "zhangshan" is recognized, the matching rates of the matching results "Zhang Shan", "Zhang San", and "bolt" are 90%, 75%, and 40% respectively. When the first matching result selected by the user is "Zhang San", training the speech recognition engine model raises the matching rate of "Zhang San", for example to 95%, and lowers the matching rates of the other two matching results, for example to 40%. The next time the voice input is "zhangshan", the trained speech recognition engine model performs recognition matching on it and obtains the matching result "Zhang San".
Of course, the matching rates of the other matching results can be lowered in other ways; for example, each can be reduced by a certain value, so that the final matching rate of the result that initially had the highest matching rate is lower than the final matching rate of the first matching result.
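The rate adjustment of steps S308-S309 can be sketched as below. The concrete deltas (a first value of 20 percentage points, a second value of 15, satisfying first value > second value) and the dictionary layout are illustrative assumptions, not values fixed by the method.

```python
# Sketch of steps S308-S309: raise the chosen first matching result's rate by
# a first value and lower every other (second) matching result's rate by a
# smaller second value. Deltas are assumed, rates are integer percentages.

def adjust_matching_rates(rates, chosen, first_value=20, second_value=15):
    """rates: dict of matching result -> matching rate (0-100)."""
    adjusted = {}
    for result, rate in rates.items():
        if result == chosen:
            adjusted[result] = min(100, rate + first_value)   # first matching result
        else:
            adjusted[result] = max(0, rate - second_value)    # second matching results
    return adjusted

# Candidate rates from the worked example for the input "zhangshan".
rates = {"Zhang Shan": 90, "Zhang San": 75, "bolt": 40}
adjusted = adjust_matching_rates(rates, chosen="Zhang San")
print(adjusted)  # {'Zhang Shan': 75, 'Zhang San': 95, 'bolt': 25}
```

After the adjustment, "Zhang San" holds the highest matching rate, so a later recognition of the same input can return it as the unique result.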
Referring to Fig. 4, which is a flowchart of embodiment 4 of an information processing method provided by the invention.
Step S401: obtain the identity information of the user performing the voice input;
Step S402: receive the voice input collected by the voice collection unit;
Step S403: match the voice input based on a speech recognition engine to obtain a matching result group composed of at least two matching results related to the voice input;
Step S404: output the matching result group;
Step S405: receive an input operation fed back by the user;
Step S406: determine a first matching result from the matching result group according to the input operation;
Step S407: record the voice input, the matching result group, the first matching result, and the identity information of the user who made the voice input;
Here, steps S401-S407 are consistent with steps S201-S207 in embodiment 2 above and are not repeated in this embodiment.
Step S408: determine, based on the matching result group and the first matching result, the second matching results other than the first matching result;
Here, a user whose identity information has been obtained can be regarded as a logged-in user, so in this embodiment the speech recognition engine model is trained for this logged-in user's pronunciation.
The matching result group for this voice input contains at least two matching results; one of them is the first matching result, and the remainder are second matching results.
For instance, in the example above, the matching result group is "Zhang Shan", "Zhang San", and "bolt"; owing to the logged-in user's pronunciation, "Zhang San" is confirmed as the first matching result, and the remaining "Zhang Shan" and "bolt" are second matching results.
Step S409: raise the matching rate of the first matching result with the voice input from its current value by a first value;
Here, when the speech recognition engine recognizes the voice input, each matching result is marked with its matching rate with the voice input; the closer the pronunciation of a matching result is to the voice input, the higher its matching rate.
When the first matching result has been determined from the user's selection, it indicates that, for the user, the pronunciation of the voice input is closest to the selected first matching result; therefore, the matching rate of the first matching result with the voice input is raised from its current value by a first value.
Step S410: lower the matching rate of each second matching result with the voice input from its current value by a second value.
Here, the first value is greater than the second value.
When the first matching result has been determined from the logged-in user's selection, it indicates that, for this logged-in user, the pronunciation of the voice input is closest to the selected first matching result, while the other matching results do not meet the user's needs; therefore, the matching rates of the second matching results with the voice input are also lowered from their current values by a second value.
Consequently, after the matching rates are adjusted, the matching rate of the first matching result with the voice input is the highest and the matching rates of the second matching results are lower, so that this unique first matching result can be found the next time this logged-in user's voice input is recognized.
In this embodiment, the speech recognition engine model is adaptively trained for this logged-in user, so when other users use the electronic device for voice input, the model trained for this logged-in user is not used for recognition matching. When such a user has never logged in, a model that has not undergone adaptive training is used for recognition; when the user has logged in previously, that user's identity information is used to find the corresponding speech recognition engine model, which has likewise been trained for that user's pronunciation style.
In actual implementation, however, a user may make an erroneous selection.
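The per-user model selection described above can be sketched as a simple lookup. The registry dictionary and the string placeholders for models are assumptions made for illustration; in practice each entry would be a trained engine model.

```python
# Sketch of per-user model selection: a logged-in user with a previously
# adapted model gets that model; a non-login or unknown user falls back to
# the base model that has not undergone adaptive training.

def select_engine_model(user_id, trained_models, base_model):
    """Return the model adapted for this user's pronunciation, else the base model."""
    if user_id is not None and user_id in trained_models:
        return trained_models[user_id]   # trained for this user's pronunciation style
    return base_model                    # non-login user or no training history

trained = {"Li Si": "model_adapted_for_li_si"}
print(select_engine_model("Li Si", trained, "base_model"))   # adapted model
print(select_engine_model(None, trained, "base_model"))      # non-login user
print(select_engine_model("Wang Wu", trained, "base_model")) # unknown user
```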
Referring to Fig. 5, which is a flowchart of embodiment 5 of an information processing method provided by the invention.
Step S501: receive the voice input collected by the voice collection unit;
Step S502: match the voice input based on a speech recognition engine to obtain a matching result group composed of at least two matching results related to the voice input;
Step S503: output the matching result group;
Step S504: receive an input operation fed back by the user;
Step S505: determine a first matching result from the matching result group according to the input operation;
Here, steps S501-S505 are consistent with steps S101-S105 in embodiment 1 and are not repeated in this embodiment.
Step S506: judge, according to a preset condition, whether the operation of responding to the voice input with the first matching result has been completed;
Here, the operation corresponding to the voice input is the user's target operation, and responding to the voice input with the first matching result means responding to the user's target operation according to the first matching result.
For example, when the user uses voice input to search the address book, the operation corresponding to the voice input is establishing a call with the contact corresponding to the first matching result, or that call ending normally;
As another example, when the user uses voice input for a web search, the operation corresponding to the voice input is the user performing a network search with the content corresponding to the first matching result, or that search completing;
Or, when the user uses voice input to compose content such as a short message, the operation corresponding to the voice input is the user generating and sending a short message with the first matching result.
Specifically, when the response is placing a call to the contact of the first matching result, the operation is judged complete when the other party answers the call or the connection exceeds a preset time threshold; otherwise it is judged incomplete.
When the operation is complete, step S507 is executed; otherwise the flow ends and the recognition result is not recorded.
Specifically, after the user selects a matching result and the electronic device determines it as the first matching result according to this selection, if the user cancels the corresponding response operation within a preset time, it can be judged that the first matching result determined from the user's feedback is not the target result; that is, the user's selection was a misoperation. The recognition result is then not recorded, which prevents a user's misoperation from corrupting the adaptive training of the speech recognition engine model.
The response judgments for the other operations are similar.
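For the phone-call case, the completion test of step S506 can be sketched as below. The 10-second threshold is an assumed value standing in for the preset time threshold; the function names are likewise illustrative.

```python
# Sketch of the step S506 completion check for a phone call: the response is
# judged complete when the callee answered or the connection lasted longer
# than a preset time threshold; only then is the result recorded (step S507).

PRESET_TIME_THRESHOLD = 10.0  # seconds; illustrative value

def response_completed(callee_answered, connect_seconds,
                       threshold=PRESET_TIME_THRESHOLD):
    """Decide whether this recognition result should be recorded."""
    return callee_answered or connect_seconds > threshold

print(response_completed(True, 0.0))    # answered -> record
print(response_completed(False, 12.5))  # long connection -> record
print(response_completed(False, 3.0))   # cancelled quickly -> do not record
```

A quick cancellation thus looks like a misoperation, and the recognition result is discarded rather than fed into the adaptive training.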
Step S507: when the operation is complete, record the voice input, the matching result group, and the first matching result;
Step S508: perform adaptation training of the speech recognition engine model using the recorded voice input, matching result group, and first matching result, so that the next time the voice input is matched based on the speech recognition engine, the first matching result is produced.
Here, steps S507-S508 are consistent with steps S106-S107 in embodiment 1 and are not repeated in this embodiment.
Corresponding to the information processing method embodiments provided by the application, the application also provides information processing apparatus embodiments.
Referring to Fig. 6, which shows the structural diagram of embodiment 1 of an information processing apparatus provided by the application. The apparatus can be applied to an electronic device, and the electronic device can be a desktop computer, notebook, tablet computer, mobile phone, smart TV, smart watch, wearable device, or the like. The electronic device is provided with a voice collection unit for collecting voice from the external environment; in this application, that voice especially refers to the voice uttered by the user of the electronic device.
The apparatus comprises: a first receiving module 601, a matching module 602, an output module 603, a second receiving module 604, a selection module 605, a recording module 606, and a training module 607;
The first receiving module 601 is configured to receive the voice input collected by the voice collection unit;
Here, the voice input is the voice uttered by the user for searching; it can comprise digital content such as telephone numbers or digit combinations, textual content such as names, or even a combination of text and digits, which is not limited in this embodiment.
The voice collection unit can collect voice input in real time, or wait for the user to enable collection.
When the voice collection unit collects voice input in real time, the voice input can contain an enable instruction for starting the speech recognition search. The electronic device can also comprise a start module, which compares the voice input with a preset instruction voice and, when the two are consistent, performs the response action according to the preset instruction.
Specifically, the enable instruction can be one or a group of preset trigger voice signals. After the first receiving module 601 receives the voice input collected by the voice collection unit, when the start module judges that the voice input is consistent with a trigger voice signal, it starts the speech recognition engine and triggers the matching module 602 to match the voice input.
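The start module's trigger check can be sketched as follows. The trigger phrases and the plain string comparison are illustrative stand-ins; a real start module would compare acoustic signals against preset trigger voice signals rather than text.

```python
# Sketch of the start module: compare the incoming voice input against a
# preset group of trigger signals and start the recognition engine only on a
# match. Text comparison stands in for acoustic matching.

TRIGGER_PHRASES = {"start search", "voice search"}  # assumed preset triggers

def should_start_engine(voice_input, triggers=TRIGGER_PHRASES):
    """Return True when the input is consistent with an enable instruction."""
    return voice_input.strip().lower() in triggers

print(should_start_engine("Voice search"))  # consistent -> start the engine
print(should_start_engine("zhangshan"))     # ordinary input -> no trigger
```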
The matching module 602 is configured to match the voice input based on the speech recognition engine to obtain a matching result group composed of at least two matching results related to the voice input;
Here, the speech recognition engine can match the voice input in two ways: one is matching within limited candidate content; the other is recognizing the voice input directly to generate multiple recognition results related to it.
In this embodiment, the limited candidate content can be content stored in the electronic device, specifically including contact names in the address book, file names, and so on.
When the user searches for a contact name on the electronic device, the speech recognition engine of the matching module 602 can match the voice input within the limited candidate content.
When the user uses the electronic device to search the web for some content via voice input, network resources are too numerous for this, so the engine can instead directly generate multiple recognition results related to the voice input, from which the user selects the needed recognition result; the web search is then performed according to that result.
Specifically, upon receiving the voice input, the matching module 602 performs recognition matching on it based on the preset speech recognition engine to obtain at least two matching results, each of which is content related to the voice input, and takes the obtained matching results as the matching result group.
It should be noted that the speech recognition can be performed locally or on a cloud server associated with the electronic device, and the matching module 602 can likewise be located in the electronic device or on that cloud server.
When the speech recognition is performed on the cloud server associated with the electronic device, the electronic device is provided with a network unit; upon receiving the voice input, the device uploads it to the cloud server for recognition matching through this network unit, and when the cloud server's recognition matching produces the matching result group and feeds it back, the electronic device receives the matching result group through the network unit.
The output module 603 is configured to output the matching result group;
Here, the matching result group contains multiple matching results, and the user needs to select from them the target result corresponding to the user's voice input.
Specifically, the output module 603 outputs the matching result group, and the content of the output matching result group can be shown on the display unit of the electronic device.
The matching results in the group can be sorted by their matching rate with the voice input, with higher-rated results ranked first, so that the user sees the top-ranked matching results first.
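The ordering rule for the output can be sketched in a few lines. The list-of-pairs layout is an assumption made for illustration.

```python
# Sketch of the matching result group ordering: sort candidates by matching
# rate, highest first, so the user sees the most likely result at the top.

def order_matching_results(results):
    """results: list of (matching result, matching rate) pairs."""
    return [r for r, _ in sorted(results, key=lambda p: p[1], reverse=True)]

group = [("bolt", 40), ("Zhang San", 75), ("Zhang Shan", 90)]
print(order_matching_results(group))  # ['Zhang Shan', 'Zhang San', 'bolt']
```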
The second receiving module 604 is configured to receive the input operation fed back by the user;
Here, the input operation fed back by the user indicates the matching result the user selects among the multiple matching results.
The input operation of the user's feedback can be accomplished in several ways.
For example, when the electronic device is provided with a touch screen, the multiple matching results obtained by the matching are displayed on the touch screen; the user selects the target result on the touch screen, this selection serves as the input operation of the user's feedback, and the electronic device receives it through the touch screen;
As another example, when the electronic device uses a keyboard, multiple matching results are shown on the display screen and the user selects the target result by operating the keyboard; this keyboard selection can also serve as the input operation of the user's feedback, which the electronic device receives through the keyboard;
Or, when a mouse is used on the electronic device, the user selects the target result among the multiple matching results shown on the display screen with the mouse, and this mouse selection can also serve as the input operation of the user's feedback;
Or, the user selects by voice control; for instance, the user inputs the voice signal "select the second one", the electronic device acquires and recognizes this voice signal to conclude that the second result is selected as the target result, and this voice selection can also serve as the input operation of the user's feedback.
The selection module 605 is configured to determine a first matching result from the matching result group according to the input operation;
Here, according to the input operation fed back by the user, the selection module 605 determines the matching result corresponding to the input operation from the matching result group as the first matching result.
The first matching result is the target result selected by the user in this speech recognition process.
In actual implementation, after the first matching result is determined, the electronic device responds to it and begins to perform the operation corresponding to the first matching result.
For example, when the voice input is used to query the address book, after the first matching result is determined as the target contact, the operation of dialing that contact's phone can be performed.
For example, when the voice input is used for a network query, after the first matching result is determined as the search keyword, a web search can be performed according to that keyword.
The recording module 606 is configured to record the voice input, the matching result group, and the first matching result;
Here, the recording module 606 records the result of the recognition matching performed on the input voice, which includes recording the voice input, the matching result group, and the first matching result.
It should be noted that, when the speech recognition is performed on the cloud server, the voice input, the matching result group, and the first matching result can be uploaded to the cloud server for storage through the network unit of the electronic device.
The training module 607 is configured to perform adaptation training of the speech recognition engine model using the recorded voice input, matching result group, and first matching result, so that the next time the voice input is matched based on the speech recognition engine, the first matching result is produced.
Here, because recognizing the voice input with the speech recognition engine produced multiple matching results related to it, the engine's recognition accuracy for this voice input is not high enough. Therefore, the training module 607 takes the voice input, the matching result group, and the first matching result as input and adaptively trains the speech recognition engine model, which strengthens the association between the first matching result and the voice input and improves the accuracy of the engine's recognition matching for this voice input.
Thus, based on the model adaptation training of the speech recognition engine, the next time this voice input is matched based on the engine, it can be accurately recognized and matched to obtain the first matching result; the user no longer needs to select from multiple matching results, which simplifies the user's operation and improves the user experience.
It should be noted that this adaptive training of the speech recognition engine model can be performed locally or on the cloud server associated with the electronic device.
When the adaptive training of the speech recognition engine model is performed on the cloud server associated with the electronic device, the electronic device is provided with a network unit through which the recorded voice input, matching result group, and first matching result are uploaded to the cloud server, and the speech recognition engine model located in the cloud server undergoes the adaptive training.
It should be noted that the user's operation of the electronic device can also be judged from the device's posture. For example, when the electronic device is a handheld terminal and its posture shows that the user's operation is making a phone call, the speech recognition engine directly matches the voice input against the contacts in the address book, which narrows the scope of recognition matching for this voice input.
Specifically, when the user uses the handheld terminal to make a phone call, the terminal's posture can satisfy conditions such as: the angle between the terminal and the vertical direction lies within a preset angle range; the terminal's surface temperature lies within a preset temperature range; or the distance detected by the terminal lies within a preset distance. Any two of these conditions can be combined, or all of them can be required.
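The posture test can be sketched as below. All concrete ranges (angle, temperature, distance) are assumed values chosen for illustration; the patent only states that each quantity is checked against a preset range and that the conditions can be combined or all required.

```python
# Sketch of the handheld-terminal posture test: angle to the vertical,
# surface temperature, and detected distance are each checked against preset
# ranges; the combination mode (any-of vs all-of) is configurable.

def looks_like_phone_call(angle_deg, temp_c, distance_cm, require_all=False):
    conditions = [
        0 <= angle_deg <= 40,   # roughly upright, as when held to the ear
        28 <= temp_c <= 40,     # surface warmed by contact with the face
        distance_cm <= 5,       # proximity sensor reports something close
    ]
    return all(conditions) if require_all else any(conditions)

print(looks_like_phone_call(20, 33, 2))                     # likely a call
print(looks_like_phone_call(80, 22, 50, require_all=True))  # not a call
```

When the test passes, matching can be restricted to the address book contacts, as described above.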
Because each person's pronunciation habits differ, when performing speech recognition matching for a user, the user's pronunciation habits also need to be taken into account.
Referring to Fig. 7, which is the structural diagram of embodiment 2 of an information processing apparatus provided by the invention, comprising: an acquisition module 701, a first receiving module 702, a matching module 703, an output module 704, a second receiving module 705, a selection module 706, a recording module 707, and a training module 708;
Here, the first receiving module 702, matching module 703, output module 704, second receiving module 705, and selection module 706 are identical in function to the corresponding structures in embodiment 1 and are not repeated in this embodiment.
The acquisition module 701 is configured to obtain the identity information of the user performing the voice input;
Here, each user's identity information is unique. The acquisition module 701 obtains the identity information of the user of the voice input so that recognition matching can be performed specifically for this user, yielding matching results targeted to the user.
The user's identity information can be obtained in several ways, including face recognition, voiceprint recognition, fingerprint recognition, and manual entry.
For example, when the identity information is obtained by face recognition, the user holds the electronic device within a preset distance of the user's face, and the device's camera, serving as the acquisition module 701, captures the user's facial features to determine the user's identity information.
As another example, when the identity information is obtained by manual entry, a login dialog box is set in a preset area of the electronic device, and the user enters identifying information in the dialog box to determine the user's identity information.
Or, when the identity information is obtained by fingerprint recognition, the user places the corresponding finger on the fingerprint collection area of the electronic device, and the device captures the user's fingerprint features to determine the user's identity information.
Or, when the identity information is obtained by voiceprint recognition, the user utters a test sound to the electronic device so that the device can capture the user's voice and recognize it to determine the user's identity information.
It should be noted that, when voiceprint recognition is used to obtain the user's identity information, the voiceprint recognition can be performed on the voice input at the time the voice collection unit collects and receives it, so that the voice input can then be recognized and matched with the speech recognition engine model corresponding to the user's identity information.
The recording module 707 is configured to record the voice input, the matching result group, the first matching result, and the identity information of the user who provided the voice input.
Here, the recording module 707 records the voice input, the matching result group, the first matching result, and the user's identity information together as the result of this recognition pass.
It should be noted that when the speech recognition is performed in a cloud server, the voice input, the matching result group, the first matching result, and the user's identity information may be uploaded to the cloud server for storage via a network element of the electronic device.
The training module 708 is configured to use the recorded voice input, matching result group, first matching result, and user identity information to perform model adaptation training of the speech recognition engine specifically for the user's articulation, so that the first matching result is produced when the user's voice input is matched next time based on the speech recognition engine.
Here, the recognition matching performed on this user's voice input produces multiple matching results, which indicates that the speech recognition engine is not yet sufficiently accurate for this user's voice input. The training module 708 therefore takes the voice input, the matching result group, the first matching result, and the user's identity information as input, and performs model adaptation training of the speech recognition engine specifically for this user's articulation, increasing the degree of association between the first matching result and this user's voice input and improving the accuracy with which the speech recognition engine recognizes and matches this user's voice input.
For example, suppose the voice input of Li Si, a registered user of the electronic device, is "zhangshan", and the speech recognition engine produces three matching results in order: "Zhang Shan", "Zhang San", and "Shuan" ("bolt"). Li Si's pronunciation is non-standard and his intended result is "Zhang San", so the first matching result is determined to be "Zhang San" according to the user's selection. The electronic device then records the voice input "zhangshan", the three matching results "Zhang Shan", "Zhang San", and "Shuan", the first matching result "Zhang San", and Li Si's identity information as the result of this recognition matching, and performs model adaptation training of the speech recognition engine specifically for Li Si's articulation according to this result. The next time Li Si's voice input is "zhangshan", the speech recognition engine produces the matching result "Zhang San" directly, without requiring the user to choose again from multiple matching results, which simplifies the user's operation and improves the user experience.
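The tuple recorded for one recognition pass in this example can be sketched as a simple data record; the field names are illustrative assumptions, not terms from this disclosure:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RecognitionRecord:
    """One recorded recognition pass (field names are illustrative)."""
    voice_input: str              # the collected utterance, e.g. "zhangshan"
    matching_results: List[str]   # the full matching result group
    first_matching_result: str    # the candidate the user selected
    user_id: str                  # identity information of the speaker

record = RecognitionRecord(
    voice_input="zhangshan",
    matching_results=["Zhang Shan", "Zhang San", "Shuan"],
    first_matching_result="Zhang San",
    user_id="li_si",
)
```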
It should be noted that when a user performs voice input and it is not the first recognition matching for that user, the obtained user identity information is compared with stored historical information to determine that the user is an already-recorded user. When this user's voice input is recognized by the speech recognition engine, the speech recognition engine model corresponding to this user is used. Because this model has been trained on the user's previous voice inputs, it can perform recognition targeted at this user's articulation, and its recognition accuracy is higher.
It should be noted that when a non-registered user uses the electronic device, recognition matching may still be performed on that user's voice input, but the recognition matching result is not recorded.
Further, the owner of the electronic device may also set usage permissions. When a user without usage permission uses the electronic device, that user fails the permission check and recognition matching is not performed on that user's voice input.
Referring to Fig. 8, which is a structural diagram of embodiment 3 of an information processing apparatus provided by the invention, the apparatus comprises: a first receiving module 801, a matching module 802, an output module 803, a second receiving module 804, a selection module 805, a recording module 806, and a training module 807. The training module 807 comprises: a classification unit 808, a first modification unit 809, and a second modification unit 810.
The first receiving module 801, matching module 802, output module 803, second receiving module 804, selection module 805, and recording module 806 are functionally identical to the corresponding structures in embodiment 1 and are not described again in this embodiment.
The classification unit 808 is configured to determine, based on the matching result group and the first matching result, the second matching results other than the first matching result.
Here, the matching result group contains at least two matching results, one of which is the first matching result; the classification unit 808 determines that the remaining ones are the second matching results.
For instance, in the example above, the matching result group is "Zhang Shan", "Zhang San", and "Shuan"; "Zhang San" is the first matching result, and the remaining "Zhang Shan" and "Shuan" are the second matching results.
The first modification unit 809 is configured to raise the matching rate of the first matching result with the voice input from its current value to a first value.
Here, when the speech recognition engine recognizes the voice input, each matching result is marked with its matching rate with the voice input; the closer the pronunciation of a matching result is to the voice input, the higher its matching rate.
When the first matching result has been determined according to the user's selection, this indicates that, for the user, the pronunciation of the voice input corresponds most closely to the selected first matching result. The first modification unit 809 therefore raises the matching rate of the first matching result with the voice input from its current value to the first value.
The second modification unit 810 is configured to lower the matching rate of each second matching result with the voice input from its current value to a second value.
Here, the first value is greater than the second value.
When the first matching result has been determined according to the user's selection, this indicates that, for the user, the voice input corresponds most closely to the selected first matching result, while the other matching results do not meet the user's needs. The second modification unit 810 therefore lowers the matching rate of each second matching result with the voice input from its current value to the second value.
After this adjustment, the matching rate of the first matching result with the voice input is the highest and the matching rates of the second matching results are lower. This widens the gap between the matching rate of the first matching result and those of the second matching results, making it easier to find this unique first matching result the next time the user's voice input is recognized.
In the example above, when the voice input "zhangshan" is recognized, the matching rates of the matching results "Zhang Shan", "Zhang San", and "Shuan" are 90%, 75%, and 40% respectively. When the first matching result selected by the user is "Zhang San", the matching rate of "Zhang San" is raised when training the model of the speech recognition engine, for example to 95%, and the matching rates of the other two matching results are lowered, for example to 40%. The next time the voice input is "zhangshan", the trained model of the speech recognition engine recognizes and matches the voice input and produces the matching result "Zhang San".
Of course, the matching rates of the other matching results may be lowered in other ways, for example by reducing each by a certain amount, so that the final matching rate of the matching result that initially had the highest matching rate is lower than the final matching rate of the first matching result.
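The adjustment performed by the first and second modification units can be sketched as follows. The concrete first value (95%) and second value (40%) are taken from the example above; the function itself is an assumption about how the described adjustment could be realized, not the patent's exact algorithm:

```python
# Raise the user-selected result's matching rate to the first value and
# lower every other result's rate to the second value, so the selected
# result strictly dominates on the next recognition pass.

def adjust_rates(rates, first_result, raise_to=0.95, lower_to=0.40):
    adjusted = {}
    for result, rate in rates.items():
        if result == first_result:
            adjusted[result] = max(rate, raise_to)   # first value
        else:
            adjusted[result] = min(rate, lower_to)   # second value
    return adjusted

rates = {"Zhang Shan": 0.90, "Zhang San": 0.75, "Shuan": 0.40}
new_rates = adjust_rates(rates, "Zhang San")
# "Zhang San" now carries the highest matching rate for "zhangshan".
```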
Referring to Fig. 9, which is a structural diagram of embodiment 4 of an information processing apparatus provided by the invention, the apparatus comprises: an acquisition module 901, a first receiving module 902, a matching module 903, an output module 904, a second receiving module 905, a selection module 906, a recording module 907, and a training module 908. The training module 908 comprises: a classification unit 909, a first modification unit 910, and a second modification unit 911.
The acquisition module 901, first receiving module 902, matching module 903, output module 904, second receiving module 905, selection module 906, and recording module 907 are functionally identical to the corresponding structures in embodiment 2 and are not described again in this embodiment.
The classification unit 909 is configured to determine, based on the matching result group and the first matching result, the second matching results other than the first matching result.
Here, a user whose identity information has been obtained may be regarded as a logged-in user. In this embodiment, therefore, the model of the speech recognition engine is trained for the pronunciation of this logged-in user.
The matching result group for this voice input contains at least two matching results, one of which is the first matching result; the classification unit 909 determines that the remaining ones are the second matching results.
For instance, in the example above, the matching result group is "Zhang Shan", "Zhang San", and "Shuan". Owing to the logged-in user's pronunciation, "Zhang San" is confirmed as the first matching result, and the remaining "Zhang Shan" and "Shuan" are the second matching results.
The first modification unit 910 is configured to raise the matching rate of the first matching result with the voice input from its current value to a first value.
Here, when the speech recognition engine recognizes the voice input, each matching result is marked with its matching rate with the voice input; the closer the pronunciation of a matching result is to the voice input, the higher its matching rate.
When the first matching result has been determined according to the user's selection, this indicates that, for the user, the pronunciation of the voice input corresponds most closely to the selected first matching result. The first modification unit 910 therefore raises the matching rate of the first matching result with the voice input from its current value to the first value.
The second modification unit 911 is configured to lower the matching rate of each second matching result with the voice input from its current value to a second value.
Here, the first value is greater than the second value.
When the first matching result has been determined according to the logged-in user's selection, this indicates that, for this user, the voice input corresponds most closely to the selected first matching result, while the other matching results do not meet the user's needs. The second modification unit 911 therefore lowers the matching rate of each second matching result with the voice input from its current value to the second value.
After this adjustment, the matching rate of the first matching result with the voice input is the highest and the matching rates of the second matching results are lower, so that this unique first matching result can be found the next time this logged-in user's voice input is recognized.
In this embodiment, the model of the speech recognition engine undergoes adaptive training targeted at this logged-in user. Accordingly, when other users perform voice input on this electronic device, the model trained for the logged-in user is not used for recognition matching. If the other user is not logged in, a model that has not undergone adaptive training is used for recognition; if the user has logged in before, the identity information of that previously logged-in user is used to find the corresponding model of the speech recognition engine, which has likewise been trained for that user's articulation.
In actual implementation, however, the user may make a wrong selection.
Referring to Figure 10, which is a structural diagram of embodiment 5 of an information processing apparatus provided by the invention, the apparatus comprises: an acquisition module 1001, a first receiving module 1002, a matching module 1003, an output module 1004, a second receiving module 1005, a selection module 1006, a judgment module 1007, a recording module 1008, and a training module 1009.
The acquisition module 1001, first receiving module 1002, matching module 1003, output module 1004, second receiving module 1005, selection module 1006, recording module 1008, and training module 1009 are functionally identical to the corresponding structures in embodiment 1 and are not described again in this embodiment.
The judgment module 1007 is configured to judge, according to a preset condition, whether the operation in which the first matching result responds to the voice input has been completed.
Here, the operation corresponding to the voice input refers to the user's target operation, and the first matching result responding to the operation corresponding to the voice input includes: responding to the user's target operation according to the first matching result.
For example, when the user uses voice input to search the address book, the operation corresponding to the voice input refers to establishing a call with the contact corresponding to the first matching result obtained by the search, or the call ending normally.
As another example, when the user uses voice input for a web search, the operation corresponding to the voice input refers to the user performing a network search with the content corresponding to the first matching result, or the search completing.
Alternatively, when the user uses voice input to compose content such as a short message, the operation corresponding to the voice input refers to the user generating and sending the short message with the first matching result.
Specifically, when the response is placing a phone call to the contact of the first matching result, the judgment module 1007 judges the operation to be completed when the other party has answered the call or the connection has lasted longer than a preset time threshold; otherwise, it is not completed.
When the operation is completed, the recording module is triggered; otherwise, the process ends and the recognition matching result is not recorded.
Specifically, after the user selects a matching result, the electronic device determines that this matching result is the first matching result according to the selection. If, however, the user cancels the response operation corresponding to this matching result within a preset time, the judgment module 1007 may judge that the first matching result determined from the user's feedback input operation is not the target result, i.e. that the user's selection was a misoperation. The recognition matching result is then not recorded, to prevent the user's misoperation from affecting the adaptation training of the model of the speech recognition engine.
The response modes corresponding to the other operations are similar.
The application also provides an electronic device comprising a voice collection unit and the information processing apparatus of the above embodiments, the information processing apparatus comprising: a first receiving module, a matching module, an output module, a second receiving module, a selection module, a recording module, and a training module.
The functions of the modules of this information processing apparatus are identical to those of the corresponding structures in the information processing apparatus embodiments above and are not described again in this embodiment.
Preferably, the information processing apparatus of the electronic device further comprises: an acquisition module for obtaining the identity information of the user who performs the voice input.
At the same time, when recording the voice input, the matching result group, and the first matching result, the recording module also records the identity information of the user of the voice input, so that the recorded voice input, matching result group, first matching result, and user identity information are used to perform model adaptation training of the speech recognition engine specifically for the user's articulation, and so that the first matching result is produced when the user's voice input is matched next time based on the speech recognition engine.
The functions of the modules of this information processing apparatus are identical to those of the corresponding structures in the information processing apparatus embodiments above and are not described again in this embodiment.
Preferably, in the information processing apparatus of the electronic device, the training module comprises: a classification unit, a first modification unit, and a second modification unit.
The functions of the module units of this information processing apparatus are identical to those of the corresponding structures in the information processing apparatus embodiments above and are not described again in this embodiment.
Preferably, the information processing apparatus of the electronic device further comprises: a judgment module for judging, according to a preset condition, whether the operation in which the first matching result responds to the voice input has been completed, and for triggering the recording module when it has been completed.
The functions of the modules of this information processing apparatus are identical to those of the corresponding structures in the information processing apparatus embodiments above and are not described again in this embodiment.
The above is only a preferred embodiment of the present invention. It should be pointed out that those skilled in the art can make improvements and modifications without departing from the principles of the invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (11)

1. An information processing method, characterized in that the method is applied to an electronic device provided with a voice collection unit, the method comprising:
receiving a voice input collected by the voice collection unit;
matching the voice input based on a speech recognition engine to obtain a matching result group consisting of at least two matching results relevant to the voice input;
outputting the matching result group;
receiving an input operation fed back by a user;
determining a first matching result from the matching result group according to the input operation;
recording the voice input, the matching result group, and the first matching result; and
performing model adaptation training of the speech recognition engine with the recorded voice input, matching result group, and first matching result, so that the first matching result is produced when the voice input is matched next time based on the speech recognition engine.
2. The method according to claim 1, characterized in that, before receiving the voice input collected by the voice collection unit, the method further comprises:
obtaining identity information of the user who performs the voice input.
3. The method according to claim 2, characterized in that it further comprises:
while recording the voice input, the matching result group, and the first matching result, recording the identity information of the user of the voice input, so that the recorded voice input, matching result group, first matching result, and user identity information are used to perform model adaptation training of the speech recognition engine specifically for the user's articulation, and so that the first matching result is produced when the user's voice input is matched next time based on the speech recognition engine.
4. The method according to claim 1 or 3, characterized in that performing model adaptation training of the speech recognition engine with the recorded voice input, matching result group, and first matching result comprises:
determining, based on the matching result group and the first matching result, the second matching results other than the first matching result;
raising the matching rate of the first matching result with the voice input from its current value to a first value; and
lowering the matching rate of each second matching result with the voice input from its current value to a second value,
wherein the first value is greater than the second value.
5. The method according to claim 1, characterized in that, after determining the first matching result from the matching result group according to the input operation and before recording the voice input, the matching result group, and the first matching result, the method further comprises:
judging, according to a preset condition, whether the operation in which the first matching result responds to the voice input has been completed; and
when it has been completed, performing the step of recording the voice input, the matching result group, and the first matching result.
6. An information processing apparatus, characterized in that it is applied to an electronic device provided with a voice collection unit, the apparatus comprising:
a first receiving module for receiving a voice input collected by the voice collection unit;
a matching module for matching the voice input based on a speech recognition engine to obtain a matching result group consisting of at least two matching results relevant to the voice input;
an output module for outputting the matching result group;
a second receiving module for receiving an input operation fed back by a user;
a selection module for determining a first matching result from the matching result group according to the input operation;
a recording module for recording the voice input, the matching result group, and the first matching result; and
a training module for performing model adaptation training of the speech recognition engine with the recorded voice input, matching result group, and first matching result, so that the first matching result is produced when the voice input is matched next time based on the speech recognition engine.
7. The apparatus according to claim 6, characterized in that it further comprises:
an acquisition module for obtaining identity information of the user who performs the voice input.
8. The apparatus according to claim 7, characterized in that
the recording module, while recording the voice input, the matching result group, and the first matching result, also records the identity information of the user of the voice input, so that the recorded voice input, matching result group, first matching result, and user identity information are used to perform model adaptation training of the speech recognition engine specifically for the user's articulation, and so that the first matching result is produced when the user's voice input is matched next time based on the speech recognition engine.
9. The apparatus according to claim 6 or 8, characterized in that the training module comprises:
a classification unit for determining, based on the matching result group and the first matching result, the second matching results other than the first matching result;
a first modification unit for raising the matching rate of the first matching result with the voice input from its current value to a first value; and
a second modification unit for lowering the matching rate of each second matching result with the voice input from its current value to a second value,
wherein the first value is greater than the second value.
10. The apparatus according to claim 6, characterized in that it further comprises:
a judgment module for judging, according to a preset condition, whether the operation in which the first matching result responds to the voice input has been completed,
and for triggering the recording module when it has been completed.
11. An electronic device, characterized in that it comprises: the information processing apparatus according to any one of claims 6 to 10 and a voice collection unit for collecting a voice input to the electronic device.
CN201410083622.5A 2014-03-07 2014-03-07 Information processing method, device and electronic equipment Pending CN103794214A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410083622.5A CN103794214A (en) 2014-03-07 2014-03-07 Information processing method, device and electronic equipment


Publications (1)

Publication Number Publication Date
CN103794214A true CN103794214A (en) 2014-05-14

Family ID: 50669801


Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104021786A (en) * 2014-05-15 2014-09-03 北京中科汇联信息技术有限公司 Speech recognition method and speech recognition device
CN104112092A (en) * 2014-07-07 2014-10-22 联想(北京)有限公司 Information processing method and electronic equipment
WO2017020794A1 (en) * 2015-07-31 2017-02-09 北京奇虎科技有限公司 Voice recognition method applicable to interactive system and device utilizing same
CN106463114A (en) * 2015-03-31 2017-02-22 索尼公司 Information processing device, control method, and program
CN106683677A (en) * 2015-11-06 2017-05-17 阿里巴巴集团控股有限公司 Method and device for recognizing voice
CN107195300A (en) * 2017-05-15 2017-09-22 珠海格力电器股份有限公司 Sound control method and system
CN107408385A (en) * 2015-04-22 2017-11-28 谷歌公司 Developer's speech action system
WO2018121192A1 (en) * 2016-12-30 2018-07-05 深圳市国华识别科技开发有限公司 Method and system for call alert
CN108536682A (en) * 2018-04-17 2018-09-14 南京创客汇网络信息技术有限公司 A kind of identification categorizing system applied to service trade trade matching
CN109243449A (en) * 2018-10-18 2019-01-18 深圳供电局有限公司 A kind of audio recognition method and system
CN110246486A (en) * 2019-06-03 2019-09-17 北京百度网讯科技有限公司 Training method, device and the equipment of speech recognition modeling
CN110322876A (en) * 2018-03-30 2019-10-11 中华映管股份有限公司 Voice application system and its method
CN110545396A (en) * 2019-08-30 2019-12-06 上海依图信息技术有限公司 Voice recognition method and device based on positioning and denoising
WO2020048296A1 (en) * 2018-09-05 2020-03-12 深圳追一科技有限公司 Machine learning method and device, and storage medium
CN112053693A (en) * 2020-03-11 2020-12-08 河南紫联物联网技术有限公司 Intelligent voice temperature measuring method, device and system based on new crown epidemic situation
CN112508093A (en) * 2020-12-03 2021-03-16 北京百度网讯科技有限公司 Self-training method and device, electronic equipment and readable storage medium
CN112562674A (en) * 2021-02-19 2021-03-26 智道网联科技(北京)有限公司 Internet of vehicles intelligent voice processing method and related device
WO2021134546A1 (en) * 2019-12-31 2021-07-08 李庆远 Input method for increasing speech recognition rate
CN114254076A (en) * 2021-12-16 2022-03-29 天翼爱音乐文化科技有限公司 Audio processing method, system and storage medium for multimedia teaching

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1448915A (en) * 2002-04-01 2003-10-15 欧姆龙株式会社 Sound recognition system, device, sound recognition method and sound recognition program
CN1453767A (en) * 2002-04-26 2003-11-05 日本先锋公司 Speech recognition apparatus and speech recognition method
US20030216918A1 (en) * 2002-05-15 2003-11-20 Pioneer Corporation Voice recognition apparatus and voice recognition program
CN101609673A (en) * 2009-07-09 2009-12-23 交通银行股份有限公司 A kind of user voice processing method and server based on telephone bank
CN102262524A (en) * 2010-05-27 2011-11-30 鼎亿数码科技(上海)有限公司 Method for recognizing and inputting sound based on wireless input equipment and device for implementing method
CN103077713A (en) * 2012-12-25 2013-05-01 青岛海信电器股份有限公司 Speech processing method and device
CN103578469A (en) * 2012-08-08 2014-02-12 百度在线网络技术(北京)有限公司 Method and device for showing voice recognition result


Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104021786B (en) * 2014-05-15 2017-05-24 Beijing Zhongke Huilian Information Technology Co., Ltd. Speech recognition method and speech recognition device
CN104021786A (en) * 2014-05-15 2014-09-03 Beijing Zhongke Huilian Information Technology Co., Ltd. Speech recognition method and speech recognition device
CN104112092A (en) * 2014-07-07 2014-10-22 Lenovo (Beijing) Co., Ltd. Information processing method and electronic equipment
CN106463114A (en) * 2015-03-31 2017-02-22 Sony Corporation Information processing device, control method, and program
CN106463114B (en) * 2015-03-31 2020-10-27 Sony Corporation Information processing apparatus, control method, and program storage unit
CN107408385B (en) * 2015-04-22 2021-09-21 Google LLC Developer voice actions system
US10839799B2 (en) 2015-04-22 2020-11-17 Google Llc Developer voice actions system
CN107408385A (en) * 2015-04-22 2017-11-28 Google LLC Developer voice actions system
US11657816B2 (en) 2015-04-22 2023-05-23 Google Llc Developer voice actions system
WO2017020794A1 (en) * 2015-07-31 2017-02-09 Beijing Qihoo Technology Co., Ltd. Voice recognition method applicable to interactive system and device utilizing same
CN106683677A (en) * 2015-11-06 2017-05-17 Alibaba Group Holding Limited Method and device for recognizing voice
US11664020B2 (en) 2015-11-06 2023-05-30 Alibaba Group Holding Limited Speech recognition method and apparatus
WO2018121192A1 (en) * 2016-12-30 2018-07-05 Shenzhen Guohua Identification Technology Development Co., Ltd. Method and system for call alert
CN107195300B (en) * 2017-05-15 2019-03-19 Gree Electric Appliances, Inc. of Zhuhai Voice control method and system
CN107195300A (en) * 2017-05-15 2017-09-22 Gree Electric Appliances, Inc. of Zhuhai Voice control method and system
CN110322876A (en) * 2018-03-30 2019-10-11 Chunghwa Picture Tubes, Ltd. Voice application system and method thereof
CN108536682B (en) * 2018-04-17 2021-09-17 Nanjing Chuangkehui Network Information Technology Co., Ltd. Recognition and classification system applied to matching transactions in the service industry
CN108536682A (en) * 2018-04-17 2018-09-14 Nanjing Chuangkehui Network Information Technology Co., Ltd. Recognition and classification system applied to matching transactions in the service industry
WO2020048296A1 (en) * 2018-09-05 2020-03-12 Shenzhen Zhuiyi Technology Co., Ltd. Machine learning method and device, and storage medium
CN109243449A (en) * 2018-10-18 2019-01-18 Shenzhen Power Supply Bureau Co., Ltd. Speech recognition method and system
CN110246486B (en) * 2019-06-03 2021-07-13 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method, device and equipment of voice recognition model
CN110246486A (en) * 2019-06-03 2019-09-17 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method, device and equipment of speech recognition model
CN110545396A (en) * 2019-08-30 2019-12-06 Shanghai Yitu Information Technology Co., Ltd. Voice recognition method and device based on positioning and denoising
WO2021134546A1 (en) * 2019-12-31 2021-07-08 Li Qingyuan Input method for increasing speech recognition rate
CN112053693A (en) * 2020-03-11 2020-12-08 Henan Zilian Internet of Things Technology Co., Ltd. Intelligent voice temperature measurement method, device and system based on the COVID-19 epidemic
CN112508093A (en) * 2020-12-03 2021-03-16 Beijing Baidu Netcom Science and Technology Co., Ltd. Self-training method and device, electronic equipment and readable storage medium
CN112562674A (en) * 2021-02-19 2021-03-26 Zhidao Network Technology (Beijing) Co., Ltd. Internet of vehicles intelligent voice processing method and related device
CN114254076A (en) * 2021-12-16 2022-03-29 Tianyi iMusic Culture Technology Co., Ltd. Audio processing method, system and storage medium for multimedia teaching
CN114254076B (en) * 2021-12-16 2023-03-07 Tianyi iMusic Culture Technology Co., Ltd. Audio processing method, system and storage medium for multimedia teaching

Similar Documents

Publication Publication Date Title
CN103794214A (en) Information processing method, device and electronic equipment
CN102842306B (en) Sound control method and device, voice response method and device
JP6415554B2 (en) Nuisance telephone number determination method, apparatus and system
CN104838339B (en) Mobile communication terminal and information processing system
CN104049721A (en) Information processing method and electronic equipment
CN106204186B (en) Order information determination method and device
CN104123937A (en) Method, device and system for reminder setting
CN104852966A (en) Numerical value transfer method, terminal and cloud server
CN103699530A (en) Method and equipment for inputting texts in target application according to voice input information
CN103703481A (en) Calendar event creation, reminder and navigation method and system thereof
CN102339193A (en) Voice control conference speech method and system
CN105304082A (en) Voice output method and voice output device
CN104808794A (en) Method and system for inputting lip language
CN111182390B (en) Volume data processing method and device, computer equipment and storage medium
CN104361311B (en) Multi-modal online incremental visitor identification system and recognition method thereof
CN104883299A (en) Router configuration method, system and router
CN105702255A (en) Agricultural data acquisition method, agricultural data acquisition device and mobile terminal
CN111081257A (en) Voice acquisition method, device, equipment and storage medium
CN104866308A (en) Scenario image generation method and apparatus
CN116070114A (en) Data set construction method and device, electronic equipment and storage medium
CN105354459B (en) Information processing method and device and electronic equipment
CN108764283A (en) Loss value acquisition method and device for a classification model
CN107910006A (en) Speech recognition method and device, and multi-source speech differentiation recognition system
CN108320740A (en) Speech recognition method and device, electronic equipment and storage medium
CN114203176A (en) Control method and device of intelligent equipment, storage medium and electronic device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140514
