CN103280217B

CN103280217B - Speech recognition method for a mobile terminal and device thereof

Info

Publication number: CN103280217B
Application number: CN201310157943.0A
Authority: CN
Inventors: 罗永浩
Original assignee: Hammer Technology (Beijing) Co Ltd
Current assignee: Beijing ByteDance Network Technology Co Ltd
Priority date: 2013-05-02
Filing date: 2013-05-02
Publication date: 2016-05-04
Anticipated expiration: 2033-05-02
Also published as: WO2014177015A1; US9502035B2; US20160098991A1; CN103280217A

Abstract

An embodiment of the present application discloses a speech recognition method for a mobile terminal. The method comprises: receiving a trigger message for an operation category to be operated on the mobile terminal, the operation categories being divided according to the business functions of the mobile terminal; receiving voice keyword information and determining a voice keyword from it; and retrieving, according to the voice keyword, the keyword library under the operation category to be operated, and returning the retrieval result. A speech recognition device for a mobile terminal is also disclosed. The embodiments of the present application can improve the efficiency and accuracy of speech recognition.

Description

Speech recognition method for a mobile terminal and device thereof

Technical field

The present application relates to the technical field of information processing, and in particular to a speech recognition method based on a mobile terminal and a corresponding device.

Background technology

Using a mobile terminal inevitably involves human-machine interaction. The most common interaction mode on smart mobile terminals is touching the screen with a finger: a sensor built into the terminal senses the finger's press information, and interaction is realized on that basis. After Apple added the Siri voice assistant to its iPhone products, human-machine interaction began to shift from traditional physical touch to voice control, in which the user's speech instructs the mobile terminal to complete the task the user needs. This speech recognition process allows the user to give instructions freely, in natural language, to voice assistant software. After the relevant component of the mobile terminal receives the instruction, the voice assistant software performs speech recognition and semantic analysis locally and/or on a cloud server, and gives feedback according to the results of the recognition and analysis.

However, because existing speech recognition technology is immature, particularly in semantic analysis, recognition accuracy is low. The recognition and analysis error rate is especially high for multi-word input, long sentences and multiple sentences, and the results of recognition and analysis often differ greatly from what the user actually needs. The user has to repeat the input and keep correcting the results, which seriously impairs the accuracy and speed of speech recognition on mobile terminals.

Summary of the invention

To solve the above technical problem, embodiments of the present application provide a speech recognition method for a mobile terminal and a corresponding device, so as to improve the accuracy and speed of speech recognition on mobile terminals.

The speech recognition method for a mobile terminal provided by the present application comprises:

Receiving a trigger message for an operation category to be operated on the mobile terminal, the operation categories being divided according to the business functions of the mobile terminal and the scope of use of the mobile terminal user; the operation categories comprise: a contact category, an application category, a music category and a web search category;

Receiving voice keyword information, determining a voice keyword from the voice keyword information, retrieving the keyword library under the operation category to be operated according to the voice keyword, and returning the retrieval result;

Receiving the trigger message for the operation category to be operated on the mobile terminal specifically comprises:

Judging whether the gravitational acceleration component on the Z axis detected by a first monitor is within the range of 0 to 4 gravitational-acceleration units, whether the gravitational acceleration component on the X and Y axes is within the range of 4 to 10 gravitational-acceleration units, and whether the distance detected by a second monitor is zero, where the X and Y axes define the plane of the mobile terminal panel, the Z axis is perpendicular to the plane formed by the X and Y axes, the first monitor is a monitor for the gravity sensor registered after the sensor service is obtained, and the second monitor is a monitor for the distance sensor registered after the sensor service is obtained; if all of these hold, determining that a trigger message for the operation category to be operated on the mobile terminal has been received, the operation category being contact; in this case, receiving voice keyword information, determining a voice keyword from the voice keyword information, retrieving the keyword library under the operation category to be operated according to the voice keyword, and returning the retrieval result comprises:

Receiving voice keyword information that contains a contact, determining a contact keyword from the voice keyword information, retrieving the contact library according to the contact keyword, returning the retrieved contact and calling that contact.

Preferably, receiving the trigger message for the operation category to be operated on the mobile terminal specifically comprises:

Presenting an operation category window on the screen of the mobile terminal; when the label corresponding to an operation category in the operation category window is clicked or is determined to have the focus, determining that the trigger message for the operation category to be operated on the mobile terminal has been received.

Further preferably, the labels corresponding to operation categories in the operation category window comprise a contact label for the communication business function, an application label for the application business function, a music label for the music business function and/or a web search label for the online search business function.

Further preferably, when multiple contacts are retrieved according to the contact keyword, each contact is numbered, numbering voice information is received, and the contact corresponding to the numbering voice information is called.

Preferably, after the mobile terminal has been operated, the frequency of the keyword corresponding to that operation in the keyword library under its operation category is increased; when the keyword library under the operation category to be operated is retrieved according to the voice keyword, the keyword library is searched in descending order of keyword frequency.

Preferably, after the mobile terminal has been operated, the voice keyword library under the operation category is updated according to the result of the operation when a preset condition is met.

The speech recognition device for a mobile terminal provided by the present application comprises a trigger message receiving unit, a voice keyword information receiving unit, a voice keyword recognition unit and a keyword library retrieval unit, wherein:

The trigger message receiving unit is configured to receive a trigger message for the operation category to be operated on the mobile terminal, the operation categories being divided according to the business functions of the mobile terminal and the scope of use of the mobile terminal user, and comprising: a contact category, an application category, a music category and a web search category;

The voice keyword information receiving unit is configured to receive voice keyword information;

The voice keyword recognition unit is configured to determine a voice keyword from the voice keyword information;

The keyword library retrieval unit is configured to retrieve the keyword library under the operation category to be operated according to the voice keyword and to return the retrieval result;

The trigger message receiving unit specifically comprises a monitoring result judging subunit and a trigger message receiving subunit, wherein:

The monitoring result judging subunit is configured to judge whether the gravitational acceleration component on the Z axis detected by a first monitor is within the range of 0 to 4 gravitational-acceleration units, whether the gravitational acceleration component on the X and Y axes is within the range of 4 to 10 gravitational-acceleration units, and whether the distance detected by a second monitor is zero, where the X and Y axes define the plane of the mobile terminal panel, the Z axis is perpendicular to the plane formed by the X and Y axes, the first monitor is a monitor for the gravity sensor registered after the sensor service is obtained, and the second monitor is a monitor for the distance sensor registered after the sensor service is obtained;

The trigger message receiving subunit is configured to determine, when the judgment result is yes, that a trigger message for the operation category to be operated on the mobile terminal has been received, the operation category being contact;

The voice keyword information receiving unit is specifically configured to receive voice keyword information that contains a contact; the voice keyword recognition unit is specifically configured to determine a contact keyword from the voice keyword information; and the keyword library retrieval unit is specifically configured to retrieve the contact library according to the contact keyword and return the retrieved contact;

The device further comprises a calling unit configured to call the retrieved contact.

Preferably, the trigger message receiving unit specifically comprises an operation category window presenting subunit and a trigger message receiving subunit, wherein:

The operation category window presenting subunit is configured to present an operation category window on the screen of the mobile terminal;

The trigger message receiving subunit is configured to receive the trigger message for the operation category to be operated on the mobile terminal when the label corresponding to an operation category in the operation category window is clicked or is determined to have the focus.

Further preferably, the device also comprises a contact numbering unit and a numbering voice information receiving unit, wherein: the contact numbering unit is configured to number each contact when multiple contacts are retrieved according to the contact keyword; the numbering voice information receiving unit is configured to receive numbering voice information; and the calling unit is specifically configured to call the contact corresponding to the numbering voice information.

Preferably, the device also comprises a keyword frequency increasing unit configured to increase, after the mobile terminal has been operated, the frequency of the keyword corresponding to that operation in the keyword library under its operation category; the keyword library retrieval unit is specifically configured to search the keyword library in descending order of keyword frequency when retrieving the keyword library under the operation category to be operated according to the voice keyword.

Preferably, the device also comprises a keyword updating unit configured to update, after the mobile terminal has been operated, the keyword library under the operation category according to the result of the operation when a preset condition is met.

In the embodiments of the present application, after a trigger message is received for an operation category divided according to the business functions of the mobile terminal, voice keyword information is received, a voice keyword is determined from the voice keyword information, the corresponding keyword library is retrieved according to the voice keyword, and the retrieval result is returned. Compared with existing speech recognition technology, because the operation categories are divided by business function, each keyword library corresponds to a single operation category. First, retrieval according to the voice keyword is confined to the keyword library corresponding to the operation to be performed on the mobile terminal, which reduces the number of objects to process and suits the limited processing capability of mobile terminals. Second, the smaller number of objects shortens the retrieval time and so improves the efficiency of speech recognition. Third, the smaller number of objects lowers the probability of duplicate and ambiguous keywords and so improves the accuracy of speech recognition. In addition, the embodiments receive speech in the form of voice keyword information rather than ordinary natural language, which avoids multi-word input, long sentences and multiple sentences. On the one hand, this makes it easier to extract the keyword from the speech, further improving efficiency; on the other hand, the result is obtained by matching the keyword extracted from the voice keyword information against the keyword library, which helps improve accuracy.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of an embodiment of the speech recognition method for a mobile terminal of the present application;

Fig. 2 is a structural block diagram of an embodiment of the speech recognition device for a mobile terminal of the present application.

Detailed description of the invention

To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of the present application.

Referring to Fig. 1, the figure shows the flow of an embodiment of the speech recognition method for a mobile terminal of the present application. The flow comprises:

Step S101: receive a trigger message for an operation category to be operated on the mobile terminal, the operation categories being divided according to the business functions of the mobile terminal;

With the development of information technology, mobile terminals no longer offer only the traditional communication function but also many new business functions, such as network retrieval, audio and video playback, and games. These business functions differ in nature, and the operation modes and operation instructions with which a user carries out each of them differ as well. Nevertheless, the operations that serve the same business function usually have something in common, so this embodiment divides the possible operations on the mobile terminal into categories in advance according to business function. This division gives the subsequent speech recognition process a clear target. This embodiment does not limit the number or types of categories, as long as they meet the needs of the actual application. For example, the following categories may be divided according to the business functions of the mobile terminal itself and the scope of use of the mobile terminal user: a contact category, which stores information such as contacts' names, telephone numbers and personal characteristics, so that when speech recognition identifies a contact, that contact's information can be viewed, the contact can be called, a text message can be sent to the contact, and so on; an application category, which records application-related information such as program name, icon and storage location, so that when speech recognition identifies an application, its basic attribute information can be viewed and operations such as starting, uninstalling, deleting and updating can be performed on it; a music category, which records information such as song title, singer name and album name, so that when speech recognition identifies a piece of music, its basic attribute information can be viewed and operations such as playing, moving and deleting can be performed on it; and a web search category, which is used to perform web searches.

Step S102: receive voice keyword information and determine a voice keyword from the voice keyword information;

If the mobile terminal user needs to control or operate the mobile terminal by voice, a speech recognition engine can be started and kept in a working state; when speech recognition is needed, the engine receives the voice keyword information. The voice information received in this embodiment is voice content centred on a keyword and need not be general natural language conveying a complete sentence. For example, to telephone Zhang XX, the prior art requires the utterance "make a phone call to Zhang XX", whereas in this embodiment, once the operation category has been determined to be "contact", the user can simply say "Zhang XX". Providing only the keyword of the operation is enough to make the mobile terminal carry out the corresponding operation.

After the voice keyword information is received, the voice keyword must be determined from it. A user's speech is usually not exactly the voice keyword and nothing else: it may contain transitional sounds, modal particles and the like. For speech recognition these sounds are noise and must be removed from the voice keyword information so that the voice keyword can be extracted. The voice keyword corresponds directly to a keyword in the keyword library and, through it, to an operation instruction.
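The noise-removal idea can be illustrated on already-transcribed text. This is a simplifying assumption: the application works on speech, and does not say how filler sounds are detected. The filler list `FILLERS` and the function `extract_keyword` below are hypothetical names introduced only for this sketch.

```python
from typing import Iterable, Optional

# Hypothetical filler words standing in for "transitional sounds and modal particles".
FILLERS = {"um", "uh", "er", "well", "please"}


def extract_keyword(utterance: str, library: Iterable[str]) -> Optional[str]:
    """Drop filler tokens, then return the library keyword equal to what remains."""
    cleaned = [token.strip(",.!?") for token in utterance.lower().split()]
    candidate = " ".join(t for t in cleaned if t and t not in FILLERS)
    for keyword in library:
        if keyword.lower() == candidate:
            return keyword  # maps directly onto an entry in the keyword library
    return None
```

For instance, `extract_keyword("Um, Zhang XX", ["Li XX", "Zhang XX"])` yields the library entry `"Zhang XX"`, while an utterance that matches no entry yields `None`.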

Step S103: retrieve the keyword library under the operation category to be operated according to the voice keyword, and return the retrieval result;

After the voice keyword has been determined in the preceding step, it is used to search the keyword library corresponding to the operation category to be operated, and the retrieval result is returned. Once the retrieval result is obtained, it can trigger the corresponding operation on the mobile terminal.

It should be noted that, in actual operation, steps S101 and S102 of this embodiment may run in parallel, or S102 may precede S101. The user may first trigger the operation category to be operated and then input the voice keyword, as described above; the voice keyword may be received first and the trigger for the operation category afterwards; or the voice keyword information may be received at the same time as the trigger. The order of the two does not affect achievement of the object of the present application, and a suitable order can be chosen according to the needs of the application.
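One way to make the flow independent of the order of S101 and S102 is to buffer whichever input arrives first and run the retrieval of S103 only when both are present. The sketch below is an assumption about how this could be arranged, not the application's implementation; `RecognitionSession` and its method names are hypothetical, and the retrieval function is passed in by the caller.

```python
from typing import Callable, List, Optional


class RecognitionSession:
    """Collects the category trigger (S101) and the keyword (S102) in either order."""

    def __init__(self, retrieve: Callable[[str, str], List[str]]) -> None:
        self._retrieve = retrieve
        self._category: Optional[str] = None
        self._keyword: Optional[str] = None

    def on_category_trigger(self, category: str) -> Optional[List[str]]:
        self._category = category
        return self._try_retrieve()

    def on_voice_keyword(self, keyword: str) -> Optional[List[str]]:
        self._keyword = keyword
        return self._try_retrieve()

    def _try_retrieve(self) -> Optional[List[str]]:
        # Retrieval (S103) runs only once both inputs are available.
        if self._category is None or self._keyword is None:
            return None
        return self._retrieve(self._category, self._keyword)
```

Whichever handler is called second returns the retrieval result; the first call returns `None` because one input is still missing.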

In this embodiment, after a trigger message is received for an operation category divided according to the business functions of the mobile terminal, voice keyword information is received, a voice keyword is determined from the voice keyword information, the corresponding keyword library is retrieved according to the voice keyword, and the retrieval result is returned. Compared with existing speech recognition technology, the embodiments of the present application can achieve the following technical effects:

(1) Because the operation categories are divided by business function, each keyword library corresponds to a single operation category. This differs from existing speech recognition, which uses one all-in-one recognition library containing operations of every kind. Retrieval according to the voice keyword is therefore confined to the keyword library corresponding to the operation to be performed on the mobile terminal, which reduces the number of objects to process and suits the limited processing capability of mobile terminals. For example, suppose an existing speech recognition library contains 100 voice operation instructions. This embodiment divides those 100 instructions into categories, and the instructions that implement the "contact" function form one category containing 10 voice operation instructions. When the user only needs the contact function, the user triggers that category, and the voice retrieval then only has to search those 10 instructions, so the amount of processing is greatly reduced.

(2) Because retrieval involves fewer objects, the retrieval time is greatly shortened for the same processing capability, and the retrieval result for the user's voice keyword can be provided sooner, which improves the efficiency of speech recognition. Continuing the previous example, suppose that checking each voice operation instruction takes 0.01 s and that the word the user speaks is at position 80. With the existing approach, the instruction is found in the 100-instruction library only after 80 matching attempts, taking 0.8 s. If matching is confined to the 10 voice operation instructions that implement the contact function, it takes at most 0.1 s. Retrieval time is thus greatly shortened and the efficiency of speech recognition improved.
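The arithmetic in this example assumes a sequential scan in which every comparison costs the same. A minimal sketch under that assumption follows; the function name is hypothetical, and times are kept in whole milliseconds (0.01 s = 10 ms) to avoid floating-point rounding.

```python
def sequential_search_time_ms(position: int, per_item_ms: int = 10) -> int:
    """Time to reach the item at 1-based `position` when scanning in order."""
    return position * per_item_ms


# Keyword at position 80 in the single 100-instruction library.
full_library_ms = sequential_search_time_ms(80)

# Worst case inside the 10-instruction "contact" library: the last entry.
contact_worst_ms = sequential_search_time_ms(10)
```

Under these assumptions `full_library_ms` is 800 ms and `contact_worst_ms` is 100 ms, matching the 0.8 s and 0.1 s figures in the text.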

(3) quantity of the handling object relating to due to retrieval reduces the repetition and the ambiguity that make to occur keywordProbability reduce, thereby improved the accuracy of speech recognition. Such as, user has said " Zhang "This word, in above-mentioned 100 voice operating instructions, may find two " Zhangs ", one "So-and-so " be a contact person's storing on mobile terminal of user name, one " Zhang " usesA singer's who stores in the music libraries of family name, that is to say, this voice word exists and repeats and ambiguity,At this moment system by the user who does not know mobile terminal on earth to making a phone call to " Zhang " in telephone directory,Still need the song of " Zhang " in audition music storehouse, if acquiescence is selected the former, user is real soIdea may be to realize the latter; If acquiescence is selected the latter, the real idea of user may be to realize soThe former. But in the present embodiment, because user has specified class of operation in advance, if the classification of specifying is" contact person ", user says " Zhang ", is to want to take on the telephone with Zhang; If the classification of specifyingFor " music ", user says " Zhang ", is the song of wanting to listen Zhang, thereby can enters exactlyRow speech recognition operation.

(4) This embodiment receives speech in the form of voice keyword information rather than ordinary natural language, which avoids multi-word input, long sentences and multiple sentences. On the one hand, this makes it easier to extract the keyword from the speech, which improves the efficiency of speech recognition; on the other hand, the result is obtained by matching the keyword extracted from the voice keyword information against the keyword library, which helps improve the accuracy of speech recognition.

The foregoing embodiment requires receiving a trigger message for the operation category to be operated on the mobile terminal. In practice, a trigger message can be received in many ways. For example, when the user needs to control the mobile terminal through the speech recognition engine, an operation category window is presented on the screen of the mobile terminal. The window displays the operation category labels, which may include a contact label for the communication business function, an application label for the application business function, a music label for the music business function, a web search label for the online search business function, and so on. When the user clicks one of these labels, or the focus moves to one of them, a trigger event (trigger message) is generated in the system; detecting this trigger event counts as receiving the trigger message for the operation category. As another example, if the user has enabled automatic application updates, the mobile terminal receives an update notification when a new version of an application is found on the network. Receiving the update notification can be treated as a trigger message for the "application" operation category, and the user's voice instruction can then be received to update or not update the application. Besides treating a touch event or a network event as a trigger message in these ways, whether a trigger message has been received can also be determined from the user's habitual actions with the mobile terminal. A common action is raising the phone to the ear, which indicates that the user wants to call a contact; in this case the "contact" category can be considered triggered. The detailed process of this triggering mode is as follows:

When the speech recognition engine initializes, it obtains the system's sensor service and registers a monitor for the gravity sensor and a monitor for the distance sensor. The gravity sensor provides the components of gravitational acceleration in three dimensions (x, y, z). When the phone lies horizontally, the gravitational acceleration along the z axis tends to 9.8 and the x and y components tend to 0. The voice assistant application monitors the values returned by the gravity sensor in real time. When the phone is horizontal or slightly tilted (that is, when the user holds the phone flat in the usual way), the z component tends to 7; the application also checks that the value returned by the distance sensor is non-zero (that is, nothing is blocking the phone's distance sensor). When both conditions are met, the whole flow is initialized and the initialization time is recorded. While the user is raising the phone to the ear, the distance sensor keeps returning a non-zero value (nothing blocks it), and the state is "working". When the user holds the phone to the ear, the z component tends to 2 (any value from 0 to 4 gravitational-acceleration units serves the purpose of the present application) and the sum of the absolute values of the x and y components tends to 7 (this value may lie anywhere from 4 to 10). Because the x axis is inclined at an angle when the user holds the phone to the ear, the absolute value of the x component should be greater than 2. If these conditions are met while the system is in the working state, the system state is set to WAIT_PROXI, in which the system waits for the distance sensor to return 0 (the face blocks the distance sensor). As soon as 0 is returned, the program for calling a contact is started. If the whole process from initialization through WAIT_PROXI exceeds 2 seconds before the distance sensor returns 0, the action recognition is judged to have failed. After the contact-dialing function starts, the user can simply say the contact's name, and the system reads the matching contacts from the phone's contact list according to the recognition result. If several contacts match, the system prompts the user by voice, for example "1. Chen XX, 2. Liu XX"; the user only needs to say "1" or "2" to choose to dial Chen XX or Liu XX. After the user chooses, the system announces that it is dialing and dials the selected contact directly. If only one contact matches, the system announces that it is dialing and places the call directly.
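The raise-to-ear detection described above behaves like a small state machine. The following Python sketch models it on plain numeric readings rather than on a real sensor API. It rests on several assumptions: the class and state names other than WAIT_PROXI are hypothetical, the threshold `FLAT_Z_MIN` for "z tends to 7" is a guess because the text gives no exact bound, and accelerometer and distance readings are assumed to arrive together in one sample.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    WORKING = auto()
    WAIT_PROXI = auto()
    TRIGGERED = auto()


FLAT_Z_MIN = 6.0  # assumed lower bound for "z tends to 7" (not given in the text)
TIMEOUT_S = 2.0   # from the description: the gesture must finish within 2 seconds


class EarGestureDetector:
    def __init__(self) -> None:
        self.state = State.IDLE
        self.started_at = 0.0

    def on_sample(self, t: float, x: float, y: float, z: float, proximity: float) -> State:
        """Feed one combined sensor reading taken at time t (seconds)."""
        in_progress = self.state in (State.WORKING, State.WAIT_PROXI)
        if in_progress and t - self.started_at > TIMEOUT_S:
            self.state = State.IDLE  # gesture took too long: recognition failed
        if self.state is State.IDLE:
            # Phone held roughly flat and nothing covering the distance sensor.
            if z >= FLAT_Z_MIN and proximity != 0:
                self.state, self.started_at = State.WORKING, t
        elif self.state is State.WORKING:
            xy = abs(x) + abs(y)
            if 0 <= z <= 4 and 4 <= xy <= 10 and abs(x) > 2:
                self.state = State.WAIT_PROXI
        elif self.state is State.WAIT_PROXI:
            if proximity == 0:
                self.state = State.TRIGGERED  # treat as the "contact" trigger message
        return self.state
```

A caller would feed readings as they arrive and start contact dialing when `on_sample` returns `State.TRIGGERED`; a sequence that takes longer than two seconds falls back to `State.IDLE`.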

The above embodiments do not limit how the keyword library under the operation category is searched once the voice keyword has been obtained, and this does not affect achievement of the object of the present application. However, a user who uses speech recognition over a long period inevitably develops regular habits, and these habits can be exploited in searching the keyword library. For example, if a certain operation is performed on the mobile terminal often, the user's demand for that operation is evidently frequent. A counter can then be set up to record the total number of times (the frequency) each operation has been performed, and this total is stored as an attribute of the keyword corresponding to the operation in the keyword library. When searching according to the voice keyword, the keyword library is searched in descending order of keyword frequency. Because an operation the user performs often has a high frequency, its keyword ranks near the front of the keyword library, and searching in descending order reaches the result sooner. In addition, after the mobile terminal has been operated, the voice keyword library under the operation category can be updated according to the result of the operation when a preset condition is met. For example, if a person has been added to the contact list, the voice keyword library needs to be updated by inserting the new contact as a keyword. The update can take place each time a contact is added, or each time the phone restarts; this can be configured according to actual conditions, and the update operation is triggered when the preset condition is met.
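The frequency counter and the library update can be sketched together as follows. The class `KeywordLibrary` and its methods are hypothetical names; the sketch re-sorts on every search for brevity, whereas a real implementation would more likely keep the library ordered as counts change.

```python
from typing import Dict, List, Optional


class KeywordLibrary:
    """Keyword library for one operation category, with per-keyword usage counts."""

    def __init__(self, keywords: List[str]) -> None:
        self._frequency: Dict[str, int] = {k: 0 for k in keywords}

    def search_order(self) -> List[str]:
        # Most frequently used keywords are examined first (ties keep insertion order).
        return sorted(self._frequency, key=self._frequency.get, reverse=True)

    def find(self, keyword: str) -> Optional[int]:
        """Return how many comparisons were needed, or None if the keyword is absent."""
        for comparisons, candidate in enumerate(self.search_order(), start=1):
            if candidate == keyword:
                return comparisons
        return None

    def record_operation(self, keyword: str) -> None:
        # Called after the corresponding operation has been performed.
        self._frequency[keyword] += 1

    def add_keyword(self, keyword: str) -> None:
        # Update step, e.g. after a contact has been added to the contact list.
        self._frequency.setdefault(keyword, 0)
```

In a library created as `["Li XX", "Wang XX", "Zhang XX"]`, finding "Zhang XX" initially takes three comparisons; after `record_operation("Zhang XX")` has been called, it takes one.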

The foregoing describes in detail the method embodiments of the mobile terminal speech recognition of the present application. Correspondingly, the application also provides a device embodiment for mobile terminal speech recognition. Referring to Fig. 2, the figure shows a structural block diagram of the device for mobile terminal speech recognition of the present application. The device comprises: a trigger message receiving unit 201, a voice keyword information receiving unit 202, a voice keyword recognition unit 203 and a keyword library retrieval unit 204, wherein:

The trigger message receiving unit 201 is configured to receive a trigger message for the operation category to be operated on the mobile terminal, the operation category being a category divided according to the business functions of the mobile terminal;

The voice keyword information receiving unit 202 is configured to receive voice keyword information;

The voice keyword recognition unit 203 is configured to determine a voice keyword from the voice keyword information;

The keyword library retrieval unit 204 is configured to retrieve the keyword library under the operation category to be operated according to the voice keyword, and to return a retrieval result.

The device embodiment works as follows: the trigger message receiving unit 201 receives the trigger message for the operation category to be operated on the mobile terminal; the voice keyword information receiving unit 202 receives voice keyword information, and the voice keyword recognition unit 203 determines a voice keyword from that information; then the keyword library retrieval unit 204 retrieves the keyword library under the operation category to be operated according to the voice keyword and returns a retrieval result.
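The cooperation of the four units can be sketched in a few lines. This is an illustration only: the class `SpeechRecognitionDevice` and its method names are hypothetical, the voice keyword information is assumed to arrive already transcribed as text, and matching is assumed to be case-insensitive equality.

```python
from typing import Dict, List, Optional


class SpeechRecognitionDevice:
    def __init__(self, libraries: Dict[str, List[str]]):
        # one keyword library per operation category (contacts, music, ...)
        self.libraries = libraries
        self.category: Optional[str] = None

    def receive_trigger(self, category: str) -> None:
        """Role of unit 201: remember the operation category to be operated."""
        if category not in self.libraries:
            raise KeyError(f"unknown operation category: {category}")
        self.category = category

    @staticmethod
    def receive_voice_info(utterance: str) -> str:
        """Role of unit 202: accept the voice keyword information (text here)."""
        return utterance

    @staticmethod
    def recognize_keyword(info: str) -> str:
        """Role of unit 203: determine the keyword from the information."""
        return info.strip().lower()

    def retrieve(self, keyword: str) -> List[str]:
        """Role of unit 204: search only the triggered category's library."""
        if self.category is None:
            raise RuntimeError("no operation category has been triggered")
        return [k for k in self.libraries[self.category] if k.lower() == keyword]

    def handle(self, utterance: str) -> List[str]:
        info = self.receive_voice_info(utterance)
        return self.retrieve(self.recognize_keyword(info))
```

Because retrieval only touches the library of the triggered category, the same spoken word can resolve to a song under the music category and to a person under the contacts category without ambiguity.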

In this device embodiment, after the trigger message for an operation category divided according to the business functions of the mobile terminal is received, voice keyword information is received, a voice keyword is determined from the voice keyword information, the corresponding keyword library is retrieved according to the voice keyword, and a retrieval result is returned. Compared with existing speech recognition technology, this device embodiment divides operation categories according to business function, so that each keyword library corresponds to only one operation category. First, when retrieving by voice keyword, the objects processed are limited to the keyword library corresponding to the operation on the mobile terminal, which reduces the number of objects to process and suits the weak processing capability of a mobile terminal. Second, the reduced number of objects involved in retrieval shortens the retrieval time, which improves the efficiency of speech recognition. Third, the reduced number of objects involved in retrieval lowers the probability of duplicate and ambiguous keywords, which improves the accuracy of speech recognition. Moreover, this device embodiment receives speech in the form of voice keyword information rather than ordinary natural language, avoiding wordy input, long sentences and multiple sentences. On the one hand, this makes it easier to extract the keyword from the speech, which further improves the efficiency of speech recognition; on the other hand, the result is obtained by matching the keyword extracted from the voice keyword information against the keyword library, which helps improve the accuracy of speech recognition.

In practical applications there are multiple ways to trigger an operation category, and the specific structure of the trigger message receiving unit may differ accordingly. Two ways are given below; those skilled in the art can derive other implementations from them:

Way one: a window is popped up, and receipt of the operation category trigger message is determined from the user's click or focus movement. In this way, the trigger message receiving unit 201 may comprise an operation category window presenting subunit 2011 and a trigger message receiving subunit 2012, wherein:

The operation category window presenting subunit 2011 is configured to present an operation category window on the screen of the mobile terminal;

The trigger message receiving subunit 2012 is configured to receive the trigger message for the operation category to be operated on the mobile terminal when the label corresponding to an operation category in the operation category window is clicked or becomes the focus.

Way two: receipt of the operation category trigger message is confirmed by using sensors to identify the manner of the user's operation. In this way, the trigger message receiving unit specifically comprises a monitoring result judgment subunit and a trigger message receiving subunit, wherein:

The monitoring result judgment subunit is configured to judge whether the gravitational acceleration component on the Z axis monitored by a first listener is 2, whether the gravitational acceleration component on the X and Y axes is 7, and whether the distance monitored by a second listener is zero. The X and Y axes lie in the plane of the mobile terminal panel, and the Z axis is perpendicular to the plane formed by the X and Y axes. The first listener is a listener registered for the gravity sensor after the sensor service is obtained, and the second listener is a listener registered for the distance sensor after the sensor service is obtained;

The trigger message receiving subunit is configured to determine, when all judgment results are yes, that the trigger message for the operation category to be operated on the mobile terminal has been received, the operation category being contacts.

In the second way, the other functional units change correspondingly: the voice keyword information receiving unit is specifically configured to receive voice keyword information containing a contact, the voice keyword recognition unit is specifically configured to determine a contact keyword from the voice keyword information, and the keyword library retrieval unit is specifically configured to retrieve the contact library according to the contact keyword and return the retrieved contact. The device embodiment further comprises a calling unit configured to call the retrieved contact. Further, the device embodiment also comprises a contact numbering unit and a numbered voice information receiving unit, wherein: the contact numbering unit is configured to number each contact when multiple contacts are retrieved according to the contact keyword; the numbered voice information receiving unit is configured to receive numbered voice information; and the calling unit is specifically configured to call the contact corresponding to the numbered voice information.
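The numbering of multiple matching contacts and the selection by spoken number might look like the following sketch. The function names are hypothetical, the spoken number is assumed to have been recognized as a digit string, and the placeholder contact names are invented for the example.

```python
from typing import Dict, List, Optional


def number_contacts(matches: List[str]) -> Dict[str, str]:
    """Role of the contact numbering unit: assign 1, 2, ... to each match."""
    return {str(i): name for i, name in enumerate(matches, start=1)}


def build_prompt(numbered: Dict[str, str]) -> str:
    """Text of the voice prompt read out to the user."""
    return " ".join(f"{number}. {name}" for number, name in numbered.items())


def select_contact(numbered: Dict[str, str], spoken_number: str) -> Optional[str]:
    """Role of the numbered voice information receiving unit plus lookup.

    Returns the contact to call, or None if the number is not on the list.
    """
    return numbered.get(spoken_number.strip())
```

With two matches the prompt reads "1. <first name> 2. <second name>", and saying "2" selects the second contact for the calling unit.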

In addition, based on actual needs, the above device embodiment may be modified or equivalently replaced to obtain a better technical effect. For example, the device embodiment may further comprise a keyword frequency increasing unit configured to increase, after the mobile terminal is operated, the frequency of the keyword corresponding to that operation in the keyword library under its operation category; the keyword library retrieval unit is then specifically configured to search the keyword library in descending order of keyword frequency when retrieving the keyword library under the operation category according to the voice keyword. Adding this unit can increase retrieval speed. As another example, the device embodiment may further comprise a keyword updating unit 205 configured to update, after the mobile terminal is operated and when a preset condition is met, the keyword library under the operation category according to the operation result.

It should be noted that, for ease of description, the above embodiments of this specification and their variants each emphasize what differs from the other embodiments or variants; for the parts that are identical or similar, the descriptions may be referred to one another. In particular, the several improvements of the device embodiment are described relatively briefly because they are substantially similar to the method embodiments; for the relevant parts, refer to the description of the method embodiments. The units of the device embodiment described above may or may not be physically separate; they may be located in one place or distributed across multiple network environments. In practical applications, some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment, and those of ordinary skill in the art can understand and implement it without creative effort.

The above is only the specific implementation of the present application. It should be pointed out that those of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements shall also be regarded as falling within the protection scope of the present application.

Claims

1. A speech recognition method for a mobile terminal, characterized in that the method comprises:

receiving a trigger message for an operation category to be operated on the mobile terminal, the operation category being a category divided according to the business functions of the mobile terminal and the usage scope of the mobile terminal user, the operation categories comprising: a contacts category, an applications category, a music category and a web search category; receiving voice keyword information and determining a voice keyword from the voice keyword information;

retrieving the keyword library under the operation category to be operated according to the voice keyword, and returning a retrieval result;

judging whether the gravitational acceleration component on the Z axis monitored by a first listener is within the range of 0 to 4 gravitational acceleration units, whether the gravitational acceleration component on the X and Y axes is within the range of 4 to 10 gravitational acceleration units, and whether the distance monitored by a second listener is zero, wherein the X and Y axes lie in the plane of the mobile terminal panel, the Z axis is perpendicular to the plane formed by the X and Y axes, the first listener is a listener registered for the gravity sensor after the sensor service is obtained, and the second listener is a listener registered for the distance sensor after the sensor service is obtained; if all are yes, determining that the trigger message for the operation category to be operated on the mobile terminal has been received, the operation category being contacts; wherein said receiving voice keyword information, determining a voice keyword from the voice keyword information, retrieving the keyword library under the operation category to be operated according to the voice keyword, and returning a retrieval result comprises:

receiving voice keyword information containing a contact, determining a contact keyword from the voice keyword information, retrieving the contact library according to the contact keyword, returning the retrieved contact and calling that contact.

2. The method according to claim 1, characterized in that said receiving a trigger message for an operation category to be operated on the mobile terminal specifically comprises:

presenting an operation category window on the screen of the mobile terminal, and, when the label corresponding to an operation category in the operation category window is clicked or becomes the focus, determining that the trigger message for the operation category to be operated on the mobile terminal has been received.

3. The method according to claim 2, characterized in that the labels corresponding to the operation categories in the operation category window comprise a contacts label for realizing the communication business function, an applications label for realizing the application business function, a music label for realizing the music business function and/or a web search label for realizing the online search business function.

4. The method according to claim 1, characterized in that, when multiple contacts are retrieved according to the contact keyword, each contact is numbered, numbered voice information is received, and the contact corresponding to the numbered voice information is called.

5. The method according to claim 1, characterized in that, after the mobile terminal is operated, the frequency of the keyword corresponding to the operation in the keyword library under its operation category is increased, and, when the keyword library under the operation category is retrieved according to the voice keyword, the keyword library is searched in descending order of keyword frequency.

6. The method according to claim 1, characterized in that, after the mobile terminal is operated and when a preset condition is met, the voice keyword library under the operation category is updated according to the result of the operation.

7. A speech recognition device for a mobile terminal, characterized in that the device comprises: a trigger message receiving unit, a voice keyword information receiving unit, a voice keyword recognition unit and a keyword library retrieval unit, wherein:

the trigger message receiving unit is configured to receive a trigger message for an operation category to be operated on the mobile terminal, the operation category being a category divided according to the business functions of the mobile terminal and the usage scope of the mobile terminal user, the operation categories comprising: a contacts category, an applications category, a music category and a web search category;

the keyword library retrieval unit is configured to retrieve the keyword library under the operation category to be operated according to the voice keyword, and to return a retrieval result;

the trigger message receiving unit specifically comprises a monitoring result judgment subunit and a trigger message receiving subunit, wherein:

the monitoring result judgment subunit is configured to judge whether the gravitational acceleration component on the Z axis monitored by a first listener is within the range of 0 to 4 gravitational acceleration units, whether the gravitational acceleration component on the X and Y axes is within the range of 4 to 10 gravitational acceleration units, and whether the distance monitored by a second listener is zero, wherein the X and Y axes lie in the plane of the mobile terminal panel, the Z axis is perpendicular to the plane formed by the X and Y axes, the first listener is a listener registered for the gravity sensor after the sensor service is obtained, and the second listener is a listener registered for the distance sensor after the sensor service is obtained;

the trigger message receiving subunit is configured to determine, when all judgment results are yes, that the trigger message for the operation category to be operated on the mobile terminal has been received, the operation category being contacts;

the voice keyword information receiving unit is specifically configured to receive voice keyword information containing a contact, the voice keyword recognition unit is specifically configured to determine a contact keyword from the voice keyword information, and the keyword library retrieval unit is specifically configured to retrieve the contact library according to the contact keyword and return the retrieved contact.

8. The device according to claim 7, characterized in that the trigger message receiving unit specifically comprises an operation category window presenting subunit and a trigger message receiving subunit, wherein:

the operation category window presenting subunit is configured to present an operation category window on the screen of the mobile terminal;

the trigger message receiving subunit is configured to receive the trigger message for the operation category to be operated on the mobile terminal when the label corresponding to an operation category in the operation category window is clicked or becomes the focus.

9. The device according to claim 7, characterized in that the device further comprises a contact numbering unit and a numbered voice information receiving unit, wherein: the contact numbering unit is configured to number each contact when multiple contacts are retrieved according to the contact keyword; the numbered voice information receiving unit is configured to receive numbered voice information; and the calling unit is specifically configured to call the contact corresponding to the numbered voice information.

10. The device according to claim 7, characterized in that the device further comprises a keyword frequency increasing unit configured to increase, after the mobile terminal is operated, the frequency of the keyword corresponding to the operation in the keyword library under its operation category, and the keyword library retrieval unit is specifically configured to search the keyword library in descending order of keyword frequency when retrieving the keyword library under the operation category according to the voice keyword.

11. The device according to claim 7, characterized in that the device further comprises a keyword updating unit configured to update, after the mobile terminal is operated and when a preset condition is met, the keyword library under the operation category according to the result of the operation.