CN103280217B - Speech recognition method for a mobile terminal and device thereof - Google Patents

Speech recognition method for a mobile terminal and device thereof

Info

Publication number
CN103280217B
CN103280217B CN201310157943.0A CN201310157943A CN103280217B CN 103280217 B CN103280217 B CN 103280217B CN 201310157943 A CN201310157943 A CN 201310157943A CN 103280217 B CN103280217 B CN 103280217B
Authority
CN
China
Prior art keywords
class
voice
mobile terminal
keyword
contact person
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310157943.0A
Other languages
Chinese (zh)
Other versions
CN103280217A (en
Inventor
罗永浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Hammer Technology (beijing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hammer Technology (beijing) Co Ltd
Priority to CN201310157943.0A
Publication of CN103280217A
Priority to US14/787,926 (US9502035B2)
Priority to PCT/CN2014/076180 (WO2014177015A1)
Application granted
Publication of CN103280217B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/26 Devices for calling a subscriber
    • H04M1/27 Devices whereby a plurality of signals may be stored simultaneously
    • H04M1/271 Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72448 User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
    • H04M1/72454 User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W8/00 Network data management
    • H04W8/18 Processing of user or subscriber data, e.g. subscribed services, user preferences or user profiles; Transfer of user or subscriber data
    • H04W8/183 Processing at user equipment or user record carrier
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0635 Training updating or merging of old and new templates; Mean values; Weighting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2250/00 Details of telephonic subscriber devices
    • H04M2250/74 Details of telephonic subscriber devices with voice recognition means

Abstract

An embodiment of the present application discloses a speech recognition method for a mobile terminal. The method comprises: receiving a trigger message for an operation category to be operated on the mobile terminal, the operation categories being divided according to the service functions of the mobile terminal; receiving voice keyword information and determining a voice keyword from it; and searching the keyword library under the operation category to be operated according to the voice keyword, and returning the search result. The application also discloses a speech recognition device for a mobile terminal. The embodiments of the present application can improve the efficiency and accuracy of speech recognition.

Description

Speech recognition method for a mobile terminal and device thereof
Technical field
The present application relates to the technical field of information processing, and in particular to a speech recognition method based on a mobile terminal and a corresponding device.
Background
Using a mobile terminal inevitably involves human-computer interaction. The most common mode of interaction on smart mobile terminals is touching the screen with a finger, with the terminal's built-in sensors detecting the press to realize the interaction. After Apple added the Siri voice assistant to the iPhone product line, human-computer interaction shifted from traditional physical touch toward voice control, in which the user instructs the mobile terminal in spoken language to carry out the task the user wants. This kind of speech recognition lets the user give instructions freely in natural language to voice assistant software; after the relevant component of the mobile terminal receives an instruction, the voice assistant software performs speech recognition and semantic analysis locally and/or on a cloud server, and responds according to the results.
However, existing speech recognition technology, particularly semantic analysis, is immature, so recognition accuracy is low. The error rate of recognition and analysis is especially high for multi-word input, long sentences, and multiple sentences; the results often differ greatly from what the user actually wants, and the user has to repeat the input and keep correcting the results. This seriously impairs the accuracy and speed of speech recognition on mobile terminals.
Summary of the invention
To solve the above technical problem, the embodiments of the present application provide a speech recognition method for a mobile terminal and a corresponding device, so as to improve the accuracy and speed of speech recognition on the mobile terminal.
The speech recognition method for a mobile terminal provided by the present application comprises:
Receiving a trigger message for an operation category to be operated on the mobile terminal, the operation categories being divided according to the service functions of the mobile terminal and the scope of use of the mobile terminal user; the operation categories comprise: a contact category, an application category, a music category, and a web search category;
Receiving voice keyword information, determining a voice keyword from the voice keyword information, searching the keyword library under the operation category to be operated according to the voice keyword, and returning the search result;
Receiving the trigger message for the operation category to be operated specifically comprises:
Judging whether the gravitational acceleration component on the Z axis reported by a first listener is within the range of 0 to 4 gravitational acceleration units, whether the gravitational acceleration components on the X and Y axes are within the range of 4 to 10 gravitational acceleration units, and whether the distance reported by a second listener is zero, where the X and Y axes span the plane of the mobile terminal's panel, the Z axis is perpendicular to that plane, the first listener is a listener registered on the gravity sensor after the sensor service is obtained, and the second listener is a listener registered on the distance sensor after the sensor service is obtained; if so, determining that a trigger message for the operation category to be operated has been received, the operation category being the contact category; receiving the voice keyword information, determining the voice keyword from it, searching the keyword library under the operation category to be operated according to the voice keyword, and returning the search result then comprise:
Receiving voice keyword information containing a contact, determining a contact keyword from the voice keyword information, searching the contact library according to the contact keyword, returning the retrieved contact, and calling that contact.
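The sensor condition above can be sketched as a pure function over one gravity reading and one distance reading. This is a minimal illustration, not the patent's implementation. The names `GravitySample` and `is_contact_trigger` are hypothetical, "gravitational acceleration units" is assumed to mean m/s², and the X and Y components are assumed to be checked separately against the raw (signed) values, since the text does not say whether the X/Y condition applies per axis or to their combination.

```python
from dataclasses import dataclass

# Thresholds quoted from the text; the unit (m/s^2) is an assumption.
Z_RANGE = (0.0, 4.0)
XY_RANGE = (4.0, 10.0)


@dataclass(frozen=True)
class GravitySample:
    """One reading delivered to the (hypothetical) gravity-sensor listener."""
    x: float
    y: float
    z: float


def in_range(value: float, bounds: tuple[float, float]) -> bool:
    """Inclusive range check."""
    low, high = bounds
    return low <= value <= high


def is_contact_trigger(gravity: GravitySample, distance: float) -> bool:
    """Return True when the readings match the conditions stated in the text."""
    return (
        in_range(gravity.z, Z_RANGE)
        and in_range(gravity.x, XY_RANGE)  # assumption: X and Y checked separately
        and in_range(gravity.y, XY_RANGE)
        and distance == 0                  # distance sensor reports zero
    )
```

A caller would treat a `True` result as the trigger message for the contact category and then start listening for the contact keyword.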
Preferably, receiving the trigger message for the operation category to be operated specifically comprises:
Presenting an operation category window on the screen of the mobile terminal, and, when the label corresponding to an operation category in the window is clicked or becomes the focus, determining that a trigger message for the operation category to be operated has been received.
Further preferably, the labels corresponding to operation categories in the operation category window comprise a contact label for the communication service function, an application label for the application service function, a music label for the music service function, and/or a web search label for the online search service function.
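The label-based trigger can be illustrated with a small lookup from label to operation category. The sketch below is only an assumption about how such a mapping might look: the label identifiers, the event names `"click"` and `"focus"`, and the function `on_label_event` are hypothetical and do not come from the patent.

```python
from enum import Enum
from typing import Optional


class Category(Enum):
    CONTACT = "contact"
    APPLICATION = "application"
    MUSIC = "music"
    WEB_SEARCH = "web search"


# Hypothetical label identifiers shown in the operation category window.
LABEL_TO_CATEGORY = {
    "contact_label": Category.CONTACT,
    "application_label": Category.APPLICATION,
    "music_label": Category.MUSIC,
    "web_search_label": Category.WEB_SEARCH,
}


def on_label_event(label_id: str, event: str) -> Optional[Category]:
    """Treat a click or a focus change on a known label as a trigger message."""
    if event not in ("click", "focus"):
        return None
    return LABEL_TO_CATEGORY.get(label_id)
```

Returning `None` for unknown labels or other events keeps the recognizer idle until a category has actually been selected.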
Further preferably, when searching by the contact keyword retrieves multiple contacts, each contact is numbered, voice information stating a number is received, and the contact corresponding to that number is called.
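The numbering step can be sketched as follows. The example assumes the spoken number has already been transcribed into a digit string, which the text does not specify; the function names and the sample contact entries are hypothetical.

```python
from typing import Optional


def number_contacts(matches: list[str]) -> dict[int, str]:
    """Assign 1-based numbers to the contacts matched by the keyword."""
    return {i: name for i, name in enumerate(matches, start=1)}


def pick_contact(numbered: dict[int, str], spoken_number: str) -> Optional[str]:
    """Return the contact for the spoken number, or None if it is not valid."""
    try:
        return numbered.get(int(spoken_number))
    except ValueError:
        # The transcription was not a number at all.
        return None
```

For example, two matches for the same keyword would be presented as 1 and 2, and saying "2" selects the second one for calling.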
Preferably, after the mobile terminal has been operated, the frequency of the keyword corresponding to that operation in the keyword library under its operation category is increased; when the keyword library under the operation category to be operated is searched according to the voice keyword, the library is searched in descending order of keyword frequency.
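One way to picture the frequency-ordered search is a per-category library that counts how often each keyword has been used and visits the most-used keywords first. This is a sketch under assumptions: the class `KeywordLibrary` and its methods are hypothetical, and exact string equality stands in for whatever matching the recognizer actually performs.

```python
from collections import Counter
from typing import Optional


class KeywordLibrary:
    """Keyword library for one operation category, searched most-used first."""

    def __init__(self, keywords: list[str]) -> None:
        self._freq = Counter({k: 0 for k in keywords})

    def record_operation(self, keyword: str) -> None:
        """Increase a keyword's frequency after an operation that used it."""
        if keyword in self._freq:
            self._freq[keyword] += 1

    def search_order(self) -> list[str]:
        """Keywords in descending order of frequency (ties keep insertion order)."""
        return sorted(self._freq, key=lambda k: -self._freq[k])

    def search(self, voice_keyword: str) -> Optional[str]:
        """Compare against keywords in frequency order and return the first hit."""
        for candidate in self.search_order():
            if candidate == voice_keyword:
                return candidate
        return None
```

With this ordering, keywords the user operates on often are compared first, so a sequential search tends to reach them sooner.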
Preferably, after the mobile terminal has been operated, the keyword library under the operation category is updated according to the result of the operation when a preset condition is met.
The speech recognition device for a mobile terminal provided by the present application comprises: a trigger message receiving unit, a voice keyword information receiving unit, a voice keyword recognition unit, and a keyword library retrieval unit, wherein:
The trigger message receiving unit is configured to receive a trigger message for an operation category to be operated on the mobile terminal, the operation categories being divided according to the service functions of the mobile terminal and the scope of use of the mobile terminal user, and comprising: a contact category, an application category, a music category, and a web search category;
The voice keyword information receiving unit is configured to receive voice keyword information;
The voice keyword recognition unit is configured to determine a voice keyword from the voice keyword information;
The keyword library retrieval unit is configured to search the keyword library under the operation category to be operated according to the voice keyword and return the search result;
The trigger message receiving unit specifically comprises a listening result judgment subunit and a trigger message receiving subunit, wherein:
The listening result judgment subunit is configured to judge whether the gravitational acceleration component on the Z axis reported by the first listener is within the range of 0 to 4 gravitational acceleration units, whether the gravitational acceleration components on the X and Y axes are within the range of 4 to 10 gravitational acceleration units, and whether the distance reported by the second listener is zero, where the X and Y axes span the plane of the mobile terminal's panel, the Z axis is perpendicular to that plane, the first listener is a listener registered on the gravity sensor after the sensor service is obtained, and the second listener is a listener registered on the distance sensor after the sensor service is obtained;
The trigger message receiving subunit is configured to determine, when the judgment result is yes, that a trigger message for the operation category to be operated has been received, the operation category being the contact category;
The voice keyword information receiving unit is specifically configured to receive voice keyword information containing a contact, the voice keyword recognition unit is specifically configured to determine a contact keyword from the voice keyword information, and the keyword library retrieval unit is specifically configured to search the contact library according to the contact keyword and return the retrieved contact;
The device further comprises a calling unit configured to call the retrieved contact.
Preferably, the trigger message receiving unit specifically comprises an operation category window presenting subunit and a trigger message receiving subunit, wherein:
The operation category window presenting subunit is configured to present an operation category window on the screen of the mobile terminal;
The trigger message receiving subunit is configured to receive the trigger message for the operation category to be operated when the label corresponding to an operation category in the operation category window is clicked or becomes the focus.
Further preferably, the device also comprises a contact numbering unit and a numbered voice information receiving unit, wherein: the contact numbering unit is configured to number each contact when searching by the contact keyword retrieves multiple contacts; the numbered voice information receiving unit is configured to receive voice information stating a number; and the calling unit is specifically configured to call the contact corresponding to that number.
Preferably, the device also comprises a keyword frequency increasing unit configured to increase, after the mobile terminal has been operated, the frequency of the keyword corresponding to that operation in the keyword library under its operation category; the keyword library retrieval unit is specifically configured to search the keyword library in descending order of keyword frequency when searching the keyword library under the operation category to be operated according to the voice keyword.
Preferably, the device also comprises a keyword updating unit configured to update, after the mobile terminal has been operated, the keyword library under the operation category according to the result of the operation when a preset condition is met.
After receiving a trigger message for an operation category divided according to the service functions of the mobile terminal, the embodiments of the present application receive voice keyword information, determine the voice keyword from the voice keyword information, search the corresponding keyword library according to the voice keyword, and return the search result. Compared with existing speech recognition technology, because the operation categories are divided according to service function, each keyword library corresponds to only one operation category. First, when searching by voice keyword, the search is limited to the keyword library corresponding to the operation to be performed on the mobile terminal, which reduces the number of objects to process and suits the limited processing power of mobile terminals. Second, with fewer objects to process, the search takes less time, which improves the efficiency of speech recognition. Third, with fewer objects to process, the probability of duplicate or ambiguous keywords falls, which improves the accuracy of speech recognition. Moreover, the embodiments receive voice information in the form of voice keyword information rather than ordinary natural language, avoiding multi-word input, long sentences, and multiple sentences. On the one hand, this makes it easier to extract the keyword from the voice information, which improves efficiency; on the other hand, the result is obtained by matching the keyword extracted from the voice keyword information against the keyword library, which helps improve accuracy.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of an embodiment of the speech recognition method for a mobile terminal of the present application;
Fig. 2 is a structural block diagram of an embodiment of the speech recognition device for a mobile terminal of the present application.
Detailed description
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort shall fall within the scope of protection of the present application.
Referring to Fig. 1, the figure shows the flow of an embodiment of the speech recognition method for a mobile terminal of the present application. The flow comprises:
Step S101: receive a trigger message for an operation category to be operated on the mobile terminal, the operation categories being divided according to the service functions of the mobile terminal;
With the development of information technology, mobile terminals no longer have only the traditional communication function but also many new service functions, such as web search, audio and video playback, and games. These service functions differ in nature, and the operation modes and operation instructions by which the user realizes each of them differ as well. Nevertheless, the various operations that realize the same service function usually have something in common, so this embodiment divides the possible operations on the mobile terminal into categories in advance according to service function. This division gives the subsequent speech recognition process a clear target. The embodiment does not limit the number or types of operation categories, as long as they meet the needs of the actual application. For example, according to the service functions of the mobile terminal itself and the scope of use of the mobile terminal user, the following categories can be defined: a contact category, which stores information such as a contact's name, telephone number, and personal characteristics; when a contact is recognized by speech, the contact's information can be viewed, the contact can be called, a text message can be sent to the contact, and so on; an application category, which records application-related information such as program name, icon, and storage location; when an application is recognized by speech, its basic attributes can be viewed and various operations can be performed on it, such as launching, uninstalling, deleting, and updating; a music category, which records information such as song title, singer name, and album name; when a piece of music is recognized by speech, its basic attributes can be viewed and various operations can be performed on it, such as playing, moving, and deleting; and a web search category, which realizes the web search function.
Step S102: receive voice keyword information, and determine a voice keyword from the voice keyword information;
If the user needs to control or operate the mobile terminal by voice, the speech recognition engine can be started and kept in a working state; when speech recognition is needed, voice keyword information is received through the engine. The voice information received in this embodiment is voice content centered on a keyword; it need not be ordinary natural language with a complete sentence meaning. For example, to phone Zhang, the prior art requires saying "Call Zhang", whereas in this embodiment, once the operation category has been determined to be "contact", the user can simply say "Zhang": giving only the keyword of the operation is enough to control the mobile terminal to perform the corresponding operation.
After the voice keyword information is received, the voice keyword must be determined from it. A user's voice information is usually not exactly the voice keyword alone; it may contain, for example, transitional sounds and filler sounds. For speech recognition these are noise and must be removed from the voice keyword information so that the voice keyword can be extracted. The voice keyword corresponds directly to a keyword in the keyword library and thus to a particular operation instruction.
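The keyword extraction described above can be sketched on text, assuming the audio has already been transcribed into words (the patent does not describe the acoustic processing). The filler list, the function name `extract_keyword`, and the sample names are hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical filler and transition sounds to discard as noise.
FILLERS = {"um", "uh", "er", "well", "hmm"}
PUNCTUATION = ",.!?"


def extract_keyword(transcript: str, library: set[str]) -> Optional[str]:
    """Drop filler tokens, then return the library keyword the rest matches."""
    tokens = [t.strip(PUNCTUATION) for t in transcript.lower().split()]
    cleaned = " ".join(t for t in tokens if t and t not in FILLERS)
    # Match case-insensitively but return the keyword as stored in the library.
    by_lower = {k.lower(): k for k in library}
    return by_lower.get(cleaned)
```

If nothing in the library matches what remains after the fillers are removed, the function returns `None` rather than guessing a keyword.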
Step S103: search the keyword library under the operation category to be operated according to the voice keyword, and return the search result;
After the voice keyword is determined in the preceding step, it is used to search the keyword library corresponding to the operation category to be operated, and the search result is returned. Once the search result is obtained, it can be triggered to carry out the corresponding operation on the mobile terminal.
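Step S103 and the follow-up operation can be sketched as a lookup in the triggered category's library followed by a dispatch to that category's operation. Everything concrete here is an assumption made for illustration: the library contents, the package name and file path, and the operations, which only return a description string instead of launching or playing anything.

```python
from typing import Callable, Optional

# Hypothetical per-category keyword libraries: keyword -> stored entry.
LIBRARIES: dict[str, dict[str, str]] = {
    "application": {"Calculator": "com.example.calculator"},
    "music": {"Morning Song": "music/morning_song.mp3"},
}

# Hypothetical operations; real ones would start an app or begin playback.
OPERATIONS: dict[str, Callable[[str], str]] = {
    "application": lambda package: f"launching {package}",
    "music": lambda path: f"playing {path}",
}


def retrieve(category: str, keyword: str) -> Optional[str]:
    """Step S103: search only the library of the triggered category."""
    return LIBRARIES.get(category, {}).get(keyword)


def retrieve_and_operate(category: str, keyword: str) -> Optional[str]:
    """Use the search result to trigger the category's operation."""
    result = retrieve(category, keyword)
    if result is None:
        return None
    return OPERATIONS[category](result)
```

Keeping retrieval and operation separate mirrors the text: the search result is returned first, and triggering the operation from it is an optional next step.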
It should be noted that, in actual operation, steps S101 and S102 of this embodiment can run in parallel, or S102 can come before S101. That is, the user can first trigger the operation category and then input the voice keyword, as described above; the voice keyword can be received first and the trigger for the operation category afterwards; or the voice keyword information can be received at the same time as the trigger. The order of the two does not affect achieving the purpose of the present application, and a suitable one can be chosen according to application needs.
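The order independence of S101 and S102 can be illustrated with a small holder that accepts the category trigger and the voice keyword in either order and reports when both are available. The class `RecognitionSession` and its method names are hypothetical; this is one possible arrangement, not the one the patent prescribes.

```python
from typing import Optional


class RecognitionSession:
    """Collects the category trigger and the voice keyword in either order."""

    def __init__(self) -> None:
        self.category: Optional[str] = None
        self.keyword: Optional[str] = None

    def on_trigger(self, category: str) -> Optional[tuple[str, str]]:
        """Record the S101 trigger; returns the pair if S102 already happened."""
        self.category = category
        return self._ready()

    def on_voice_keyword(self, keyword: str) -> Optional[tuple[str, str]]:
        """Record the S102 keyword; returns the pair if S101 already happened."""
        self.keyword = keyword
        return self._ready()

    def _ready(self) -> Optional[tuple[str, str]]:
        """Return (category, keyword) once both are known, else None."""
        if self.category is None or self.keyword is None:
            return None
        return (self.category, self.keyword)
```

Whichever call arrives second returns the complete pair, at which point the search of step S103 can start.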
After receiving a trigger message for an operation category divided according to the service functions of the mobile terminal, this embodiment receives voice keyword information, determines the voice keyword from the voice keyword information, searches the corresponding keyword library according to the voice keyword, and returns the search result. Compared with existing speech recognition technology, the embodiment achieves the following technical effects:
(1) Because the operation categories are divided according to service function, each keyword library corresponds to only one operation category. This differs from the single speech recognition library used by existing speech recognition, which contains operations of every nature and mode. When searching by voice keyword, the search is therefore limited to the keyword library corresponding to the operation to be performed on the mobile terminal, which reduces the number of objects to process and suits the limited processing power of mobile terminals. For example, suppose an existing speech recognition library contains 100 voice operation instructions. This embodiment divides these 100 instructions into categories and assigns those that realize the "contact" function to one category containing 10 voice operation instructions. When the user only needs the contact function, the voice search is triggered under this category and only these 10 instructions need to be searched, so the amount of processing is greatly reduced.
(2) Because the search involves fewer objects, with the processing power of the mobile terminal unchanged the search completes much sooner, and the result corresponding to the user's voice keyword can be returned in a shorter time, which improves the efficiency of speech recognition. Continuing the previous example, suppose that checking each voice operation instruction takes 0.01 s and that the voice word spoken by the user is at position 80. With the existing approach, the instruction is found only after 80 comparisons in the library of 100 voice operation instructions, taking 0.8 s. If the search is instead restricted to the 10 voice operation instructions that realize the contact function, it takes at most 0.1 s. The search time is thus greatly shortened, which improves the efficiency of speech recognition.
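The arithmetic in this example can be reproduced with a short sketch of a sequential search. The 0.01 s per comparison, the 100-instruction library, the 10-instruction contact category, and position 80 come from the text; the instruction names and the choice of which ten instructions form the contact category are made up for illustration.

```python
SECONDS_PER_COMPARISON = 0.01  # per-instruction cost assumed in the text's example


def comparisons_until_match(library: list[str], keyword: str) -> int:
    """Number of sequential comparisons a linear search makes before a hit."""
    return library.index(keyword) + 1


# Hypothetical libraries: 100 instructions overall, 10 of them for contacts.
full_library = [f"instruction-{i}" for i in range(1, 101)]
contact_library = full_library[75:85]  # instructions 76 to 85

target = "instruction-80"
full_comparisons = comparisons_until_match(full_library, target)
full_time = full_comparisons * SECONDS_PER_COMPARISON

# Worst case inside the category: every one of its 10 instructions is compared.
worst_contact_time = len(contact_library) * SECONDS_PER_COMPARISON
```

Here `full_comparisons` is 80, so the unrestricted search takes about 0.8 s, while the restricted search is bounded by 10 comparisons, or about 0.1 s.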
(3) Because the search involves fewer objects, the probability of duplicate or ambiguous keywords falls, which improves the accuracy of speech recognition. For example, suppose the user says "Zhang". Among the 100 voice operation instructions above, two entries for "Zhang" may be found: one is the name of a contact stored on the user's mobile terminal, and the other is the name of a singer stored in the user's music library. In other words, the voice word is duplicated and ambiguous, and the system cannot tell whether the user wants to call the "Zhang" in the phone book or listen to the songs by "Zhang" in the music library. If the system defaults to the former, the user may actually have wanted the latter; if it defaults to the latter, the user may actually have wanted the former. In this embodiment, however, the user has specified the operation category in advance: if the specified category is "contact", saying "Zhang" means the user wants to phone Zhang; if the specified category is "music", saying "Zhang" means the user wants to listen to Zhang's songs. Speech recognition can therefore be performed accurately.
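The ambiguity in this example can be made concrete by comparing a search over one combined library with a search scoped to the specified category. The entries and function names below are hypothetical and serve only to illustrate the point.

```python
# Hypothetical entries: the same name stored as a contact and as a singer.
ENTRIES = [
    ("contact", "Zhang"),
    ("music", "Zhang"),
    ("contact", "Li"),
]


def search_all(keyword: str) -> list[tuple[str, str]]:
    """Search one combined library, as in the existing approach described."""
    return [entry for entry in ENTRIES if entry[1] == keyword]


def search_in_category(category: str, keyword: str) -> list[tuple[str, str]]:
    """Search only the library of the category the user specified."""
    return [entry for entry in ENTRIES if entry == (category, keyword)]
```

The combined search returns two candidates for "Zhang" and cannot choose between them, whereas the scoped search returns exactly one entry for either category.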
(4) the present embodiment form with voice key word information in the time of receiving speech information receives, and is no longerCommon natural language, has avoided many words, long sentence and many, is more prone to from voice messaging on the one handExtract keyword, and then improved the efficiency of speech recognition; On the other hand by believing from voice keywordThe keyword extracting in breath mates to obtain with keywords database and returns results, and is conducive to improve speech recognitionAccuracy.
The foregoing embodiment mentions that a trigger message for the operation category to be operated on the mobile terminal needs to be received. In practical applications, the trigger message can be received in many ways. For example, when the user needs to control the mobile terminal through the speech recognition engine, an operation category window is presented on the mobile terminal screen, and the window displays various operation category labels. The category labels may include: a contact label for the communication service function, an application label for the application service function, a music label for the music service function, a web search label for the online search service function, and so on. When the user clicks one of these category labels, or when the focus moves to a category label, a trigger event (trigger message) is generated in the system; when this trigger event is detected, the trigger message for the operation category is considered to have been received. As another example, when the user has enabled the automatic update function for applications and a new version of an application is found on the network, the mobile terminal receives an update notification. Receiving this notification can be treated as receiving a trigger message for the "application" operation category, so that the user's voice instruction can then be received to update or not update the application. Besides treating a touch event or network event as the trigger message for an operation category, whether such a trigger message has been received can also be determined from certain habitual actions that the user performs with the mobile terminal.
A common action is the user placing the phone at the ear. This action indicates that the user wants to call a contact, so in this case a trigger message for the "contact" category can be considered to have been received. The detailed process of this trigger mode is as follows:
When the speech recognition engine initializes, it obtains the system's sensor service and registers a listener for the gravity sensor and a listener for the distance sensor. The gravity sensor provides the components of gravitational acceleration in three dimensions (x, y, z). When the phone lies horizontally, the gravitational acceleration value along the z axis tends toward 9.8 and the x-axis and y-axis components tend toward 0. The voice assistant application monitors the values returned by the gravity acceleration sensor in real time. When the phone is horizontal or slightly tilted (that is, when the user is holding the phone flat in the normal way), the z-axis component tends toward 7; at the same time, the application checks that the value returned by the distance sensor is non-zero (that is, nothing is blocking the phone's distance sensor). When both conditions are met, the whole flow is initialized and the initialization time is recorded. While the user is bringing the phone toward the ear, the distance sensor keeps returning a non-zero value (nothing is blocking it), and the state at this point is working.
When the user places the phone at the ear, the z-axis value tends toward 2 (note that any value within 0 to 4 gravitational acceleration units can achieve the purpose of the present application), and the sum of the absolute values of the x-axis and y-axis components tends toward 7 (this value may range from 4 to 10). Considering that the x axis is tilted at an angle when the user holds the phone at the ear, the absolute value of the x-axis component should be greater than 2. If these conditions are met and the system is in the working state, the system state is set to WAIT_PROXI. In this state the system waits for the distance sensor to return 0 (the face blocks the distance sensor); once 0 is returned, the program starts the dialing operation for calling a contact. If, before the distance sensor returns 0, the whole process from initialization to WAIT_PROXI has exceeded 2 seconds, the action recognition is judged to have failed. After the contact dialing function starts, the user can directly say the contact's name, and the system reads the matching contacts from the phone's contact list according to the recognition result. If there are multiple matching contacts, the system prompts the user by voice, for example "1. Chen So-and-so; 2. Liu So-and-so", and the user only needs to say "1" or "2" to select which contact to dial. After the user selects, the system prompts the user that it is dialing and directly dials the selected contact. If there is only one matching contact, the system directly prompts the user that it is dialing and places the call.
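The gesture-based trigger described above can be sketched as a small state machine. This is an illustration under stated assumptions, not the patent's implementation: only the state name WAIT_PROXI, the working state, the 0 to 4 and 4 to 10 ranges, the |x| > 2 condition and the 2-second limit come from the text. The other state names, the class and function names, the reading of "tends toward 7" as z >= 7, and the choice to check the time limit on every sensor sample are assumptions.

```python
from enum import Enum, auto

TIMEOUT_S = 2.0  # limit from initialization until the distance sensor returns 0


class State(Enum):
    IDLE = auto()        # assumed name: flow not yet initialized
    WORKING = auto()     # phone held flat, nothing blocking the distance sensor
    WAIT_PROXI = auto()  # phone at the ear, waiting for the distance sensor to return 0
    DIALING = auto()     # assumed name: contact dialing function started
    FAILED = auto()      # assumed name: action recognition failed


def is_held_flat(z, proximity):
    # Assumption: "the z component tends toward 7" is read as z >= 7.
    return z >= 7.0 and proximity != 0


def is_at_ear(x, y, z):
    return 0.0 <= z <= 4.0 and 4.0 <= abs(x) + abs(y) <= 10.0 and abs(x) > 2.0


class EarGestureDetector:
    def __init__(self):
        self.state = State.IDLE
        self.start_time = None

    def on_sample(self, t, x, y, z, proximity):
        """Feed one sensor sample (time in seconds, gravity components, distance value)."""
        in_progress = self.state in (State.WORKING, State.WAIT_PROXI)
        if in_progress and t - self.start_time > TIMEOUT_S:
            self.state = State.FAILED
        elif self.state is State.IDLE:
            if is_held_flat(z, proximity):
                self.state = State.WORKING
                self.start_time = t  # record the initialization time
        elif self.state is State.WORKING:
            if proximity != 0 and is_at_ear(x, y, z):
                self.state = State.WAIT_PROXI
        elif self.state is State.WAIT_PROXI:
            if proximity == 0:  # the face blocks the distance sensor
                self.state = State.DIALING
        return self.state


detector = EarGestureDetector()
detector.on_sample(0.0, 0.0, 0.5, 9.0, 5)   # held flat
detector.on_sample(0.5, 5.0, 2.0, 2.0, 5)   # raised toward the ear
print(detector.on_sample(1.0, 5.0, 2.0, 2.0, 0))  # face covers the sensor
```

On a real device the samples would come from the registered gravity and distance sensor listeners; here they are passed in by hand so the transitions can be followed step by step.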
The above embodiments do not limit how the keyword library under the operation category is retrieved once the voice keyword has been obtained, and this does not affect achieving the purpose of the present application. However, a user who uses the speech recognition function over a long period inevitably forms certain regular habits, and these habits can be applied to the retrieval of the keyword library. For example, when the mobile terminal is often made to perform a certain operation, the user's demand for that operation is relatively frequent. A counter can then be set up to record, each time the mobile terminal performs an operation, the total number of times that operation has been performed (its frequency). This total is stored as an attribute of the keyword corresponding to the operation in the keyword library. When retrieving according to a voice keyword, the keyword library is searched in descending order of keyword frequency. Because the user often performs the operation, its frequency is necessarily higher and its keyword is necessarily nearer the front of the keyword library, so searching in descending order obtains the retrieval result sooner. In addition, after the mobile terminal has been operated, the voice keyword library under the operation category can be updated according to the operation result when a preset condition is met. For example, if a person is added to the contact list, the voice keyword library needs to be updated by inserting the added contact as a keyword. The update may take place each time a contact is added, or each time the phone restarts; this can be set according to actual conditions, and the update operation is triggered when the preset condition is met.
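The frequency-ordered retrieval and the library update described above can be sketched as follows. This is a minimal Python illustration; the class name `KeywordLibrary`, its method names and the contact names are hypothetical, and the update here is applied immediately rather than on a preset condition such as a restart.

```python
class KeywordLibrary:
    """Keyword library for one operation category, with a per-keyword counter."""

    def __init__(self, keywords):
        # Frequency (total number of performed operations) stored per keyword.
        self.frequency = {keyword: 0 for keyword in keywords}

    def record_operation(self, keyword):
        """Increase the counter after the corresponding operation was performed."""
        self.frequency[keyword] = self.frequency.get(keyword, 0) + 1

    def add_keyword(self, keyword):
        """Update the library, e.g. after a new contact was added."""
        self.frequency.setdefault(keyword, 0)

    def search_order(self):
        """Keywords sorted by descending frequency (ties keep insertion order)."""
        return sorted(self.frequency, key=lambda kw: -self.frequency[kw])

    def retrieve(self, spoken):
        """Return (keyword, comparisons made), or (None, comparisons) if absent."""
        for comparisons, keyword in enumerate(self.search_order(), start=1):
            if keyword == spoken:
                return keyword, comparisons
        return None, len(self.frequency)


contacts = KeywordLibrary(["Chen", "Liu", "Zhang"])
print(contacts.retrieve("Zhang"))   # found last while all counters are equal
for _ in range(3):
    contacts.record_operation("Zhang")
print(contacts.retrieve("Zhang"))   # found first once it is the most frequent
```

Sorting on every lookup keeps the sketch short; an implementation concerned with speed would more likely keep the library ordered as counters change.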
The foregoing describes in detail the method embodiment of mobile terminal speech recognition of the present application. Correspondingly, the present application also provides a device embodiment of mobile terminal speech recognition. Referring to Fig. 2, the figure shows a structural block diagram of the device for mobile terminal speech recognition of the present application. The device comprises: a trigger message receiving unit 201, a voice keyword information receiving unit 202, a voice keyword recognition unit 203 and a keyword library retrieval unit 204, wherein:
The trigger message receiving unit 201 is configured to receive a trigger message for the operation category to be operated on the mobile terminal, the operation category being a category divided according to the service functions of the mobile terminal;
The voice keyword information receiving unit 202 is configured to receive voice keyword information;
The voice keyword recognition unit 203 is configured to determine a voice keyword from the voice keyword information;
The keyword library retrieval unit 204 is configured to retrieve, according to the voice keyword, the keyword library under the operation category to be operated, and to return the retrieval result.
The working process of the above device embodiment is as follows: the trigger message receiving unit 201 receives the trigger message for the operation category to be operated on the mobile terminal; the voice keyword information receiving unit 202 receives voice keyword information, and the voice keyword recognition unit 203 determines the voice keyword from the voice keyword information; then the keyword library retrieval unit 204 retrieves, according to the voice keyword, the keyword library under the operation category to be operated, and returns the retrieval result.
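The cooperation of units 201 to 204 can be sketched as a single class with one method per unit. This is only an illustration under assumptions: the class and method names are hypothetical, the voice keyword information is modelled as already transcribed text (acoustic recognition is not modelled), and matching is a plain substring test.

```python
from typing import Dict, List, Optional


class SpeechRecognitionDevice:
    def __init__(self, libraries: Dict[str, List[str]]):
        self.libraries = libraries            # one keyword library per operation category
        self.category: Optional[str] = None   # category set by the trigger message

    def receive_trigger(self, category: str) -> None:
        """Role of the trigger message receiving unit 201."""
        if category not in self.libraries:
            raise ValueError(f"unknown operation category: {category}")
        self.category = category

    def receive_voice_keyword_info(self, info: str) -> str:
        """Role of the voice keyword information receiving unit 202."""
        return info

    def recognize_keyword(self, info: str) -> str:
        """Role of the voice keyword recognition unit 203 (stand-in: trim whitespace)."""
        return info.strip()

    def retrieve(self, keyword: str) -> List[str]:
        """Role of the keyword library retrieval unit 204."""
        if self.category is None:
            raise RuntimeError("no trigger message received yet")
        return [entry for entry in self.libraries[self.category] if keyword in entry]

    def handle(self, category: str, info: str) -> List[str]:
        """Run the units in the order given in the working process."""
        self.receive_trigger(category)
        received = self.receive_voice_keyword_info(info)
        keyword = self.recognize_keyword(received)
        return self.retrieve(keyword)


device = SpeechRecognitionDevice({
    "contact": ["Zhang", "Chen"],
    "music": ["Zhang - some song"],
})
print(device.handle("contact", " Zhang "))
print(device.handle("music", "Zhang"))
```

Because each call to `handle` first fixes the category, the same spoken keyword yields a contact in one case and a music entry in the other.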
After receiving the trigger message for an operation category divided according to the service functions of the mobile terminal, this device embodiment receives voice keyword information, determines the voice keyword from the voice keyword information, retrieves the corresponding keyword library according to the voice keyword, and returns the retrieval result. Compared with existing speech recognition technology, this device embodiment divides the operation categories according to service function, so that each keyword library corresponds to only one operation category. First, when retrieving according to the voice keyword, the objects to be processed are limited to the keyword library corresponding to the operation on the mobile terminal, which reduces the number of objects to process and suits the relatively weak processing capability of a mobile terminal. Second, the reduced number of objects shortens the retrieval time, which improves the efficiency of speech recognition. Third, the reduced number of objects lowers the probability of keyword repetition and ambiguity, which improves the accuracy of speech recognition. Moreover, this device embodiment receives voice information in the form of voice keyword information rather than ordinary natural language, which avoids many words, long sentences and many sentences. On the one hand, it is easier to extract the keyword from the voice information, which improves the efficiency of speech recognition; on the other hand, the result is obtained by matching the keyword extracted from the voice keyword information against the keyword library, which helps improve the accuracy of speech recognition.
In practical applications there are multiple ways of triggering an operation category, and the specific structure of the trigger message receiving unit may differ from one way to another. Two modes are given below; those skilled in the art can derive other implementations from these two modes:
Mode one: reception of the operation category trigger message is determined by popping up a window and receiving the user's click or a focus move. In this mode, the trigger message receiving unit 201 may comprise an operation category window presenting subunit 2011 and a trigger message receiving subunit 2012, wherein:
The operation category window presenting subunit 2011 is configured to present an operation category window on the mobile terminal screen;
The trigger message receiving subunit 2012 is configured to receive the trigger message for the operation category to be operated on the mobile terminal when the label corresponding to an operation category in the operation category window is clicked or becomes the focus.
Mode two: reception of the operation category trigger message is confirmed by using sensors to identify the type of the user's action. In this mode, the trigger message receiving unit specifically comprises a listening result judging subunit and a trigger message receiving subunit, wherein:
The listening result judging subunit is configured to judge whether the gravitational acceleration component on the Z axis detected by the first listener is 2, whether the gravitational acceleration component on the X and Y axes is 7, and whether the distance detected by the second listener is zero. The X and Y axes lie in the plane of the mobile terminal panel, and the Z axis is perpendicular to the plane formed by the X and Y axes. The first listener is the listener registered for the gravity sensor after the sensor service is obtained, and the second listener is the listener registered for the distance sensor after the sensor service is obtained;
The trigger message receiving subunit is configured to determine, when the judgment result is yes, that the trigger message for the operation category to be operated on the mobile terminal has been received, the operation category being contact.
In the second mode, the other functional units change correspondingly: the voice keyword information receiving unit is specifically configured to receive voice keyword information containing a contact; the voice keyword recognition unit is specifically configured to determine a contact keyword from the voice keyword information; and the keyword library retrieval unit is specifically configured to retrieve the contact library according to the contact keyword and return the retrieved contact. The above device embodiment also comprises a calling unit configured to call the retrieved contact. Further, the above device embodiment also comprises a contact numbering unit and a numbered voice information receiving unit, wherein: the contact numbering unit is configured to number each contact when multiple contacts are retrieved according to the contact keyword; the numbered voice information receiving unit is configured to receive numbered voice information; and the calling unit is specifically configured to call the contact corresponding to the numbered voice information.
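The interplay of the contact numbering unit, the numbered voice information receiving unit and the calling unit can be sketched as below. The function names and contact names are hypothetical, and the spoken number is assumed to arrive as recognized text such as "1" or "2".

```python
def number_contacts(matches):
    """Assign the numbers "1", "2", ... to the retrieved contacts."""
    return {str(index): name for index, name in enumerate(matches, start=1)}


def choose_contact(matches, spoken_number=None):
    """Return the contact to call, or None if no valid choice can be made yet."""
    if not matches:
        return None
    if len(matches) == 1:
        return matches[0]          # a single match is called directly
    return number_contacts(matches).get(spoken_number)


matches = ["Chen So-and-so", "Liu So-and-so"]
print(number_contacts(matches))      # numbered list read out to the user
print(choose_contact(matches, "2"))  # the user answers with a number
```

Returning None for a missing or invalid number leaves it to the caller to repeat the prompt, which the text does not specify.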
In addition, some variations or equivalent replacements can be made to the above device embodiment according to actual needs, in order to obtain better technical effects. For example, the above device embodiment may also comprise a keyword frequency increasing unit, configured to increase, after the mobile terminal is operated, the frequency of the keyword corresponding to the operation in the keyword library under its operation category; the keyword library retrieval unit is then specifically configured to search the keyword library in descending order of keyword frequency when retrieving the keyword library under the operation category according to the voice keyword. Adding this unit can improve the retrieval speed. As another example, the above device embodiment may also comprise a keyword updating unit 205, configured to update, after the mobile terminal is operated, the keyword library under the operation category according to the operation result when a preset condition is met.
It should be noted that, for ease of description, the above embodiments of this specification and their various variations each emphasize the differences from the other embodiments or variations, and the same or similar parts of the different cases may be referred to one another. In particular, the several improvements of the device embodiment are described relatively briefly because they are substantially similar to the method embodiment; for the relevant parts, refer to the description of the method embodiment. The units of the device embodiment described above may or may not be physically separate; they may be located in one place or distributed across multiple network environments. In practical applications, some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment, which those of ordinary skill in the art can understand and implement without creative effort.
The above is only a specific implementation of the present application. It should be pointed out that those of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims (11)

1. A speech recognition method for a mobile terminal, characterized in that the method comprises:
receiving a trigger message for an operation category to be operated on the mobile terminal, the operation category being a category divided according to the service functions of the mobile terminal and the scope of use of the mobile terminal user, the operation categories comprising: a contact category, an application category, a music category and a web search category;
receiving voice keyword information, and determining a voice keyword from the voice keyword information;
retrieving, according to the voice keyword, the keyword library under the operation category to be operated, and returning a retrieval result;
wherein receiving the trigger message for the operation category to be operated on the mobile terminal specifically comprises:
judging whether the gravitational acceleration component on the Z axis detected by a first listener is within the range of 0 to 4 gravitational acceleration units, whether the gravitational acceleration component on the X and Y axes is within the range of 4 to 10 gravitational acceleration units, and whether the distance detected by a second listener is zero, wherein the X and Y axes lie in the plane of the mobile terminal panel, the Z axis is perpendicular to the plane formed by the X and Y axes, the first listener is the listener registered for the gravity sensor after the sensor service is obtained, and the second listener is the listener registered for the distance sensor after the sensor service is obtained; and, if all judgments are yes, determining that the trigger message for the operation category to be operated on the mobile terminal has been received, the operation category being contact;
and wherein receiving voice keyword information, determining a voice keyword from the voice keyword information, retrieving, according to the voice keyword, the keyword library under the operation category to be operated, and returning a retrieval result comprises:
receiving voice keyword information containing a contact, determining a contact keyword from the voice keyword information, retrieving the contact library according to the contact keyword, returning the retrieved contact, and calling that contact.
2. The method according to claim 1, characterized in that receiving the trigger message for the operation category to be operated on the mobile terminal specifically comprises:
presenting an operation category window on the mobile terminal screen, and, when the label corresponding to an operation category in the operation category window is clicked or becomes the focus, determining that the trigger message for the operation category to be operated on the mobile terminal has been received.
3. The method according to claim 2, characterized in that the labels corresponding to the operation categories in the operation category window comprise a contact label for the communication service function, an application label for the application service function, a music label for the music service function and/or a web search label for the online search service function.
4. The method according to claim 1, characterized in that, when multiple contacts are retrieved according to the contact keyword, each contact is numbered, numbered voice information is received, and the contact corresponding to the numbered voice information is called.
5. The method according to claim 1, characterized in that, after the mobile terminal is operated, the frequency of the keyword corresponding to the operation in the keyword library under its operation category is increased, and, when the keyword library under the operation category is retrieved according to the voice keyword, the keyword library is searched in descending order of keyword frequency.
6. The method according to claim 1, characterized in that, after the mobile terminal is operated, the voice keyword library under the operation category is updated according to the result of the operation when a preset condition is met.
7. A speech recognition device for a mobile terminal, characterized in that the device comprises: a trigger message receiving unit, a voice keyword information receiving unit, a voice keyword recognition unit and a keyword library retrieval unit, wherein:
the trigger message receiving unit is configured to receive a trigger message for an operation category to be operated on the mobile terminal, the operation category being a category divided according to the service functions of the mobile terminal and the scope of use of the mobile terminal user, the operation categories comprising: a contact category, an application category, a music category and a web search category;
the voice keyword information receiving unit is configured to receive voice keyword information;
the voice keyword recognition unit is configured to determine a voice keyword from the voice keyword information;
the keyword library retrieval unit is configured to retrieve, according to the voice keyword, the keyword library under the operation category to be operated, and to return a retrieval result;
the trigger message receiving unit specifically comprises a listening result judging subunit and a trigger message receiving subunit, wherein:
the listening result judging subunit is configured to judge whether the gravitational acceleration component on the Z axis detected by a first listener is within the range of 0 to 4 gravitational acceleration units, whether the gravitational acceleration component on the X and Y axes is within the range of 4 to 10 gravitational acceleration units, and whether the distance detected by a second listener is zero, wherein the X and Y axes lie in the plane of the mobile terminal panel, the Z axis is perpendicular to the plane formed by the X and Y axes, the first listener is the listener registered for the gravity sensor after the sensor service is obtained, and the second listener is the listener registered for the distance sensor after the sensor service is obtained;
the trigger message receiving subunit is configured to determine, when the judgment result is yes, that the trigger message for the operation category to be operated on the mobile terminal has been received, the operation category being contact;
the voice keyword information receiving unit is specifically configured to receive voice keyword information containing a contact, the voice keyword recognition unit is specifically configured to determine a contact keyword from the voice keyword information, and the keyword library retrieval unit is specifically configured to retrieve the contact library according to the contact keyword and return the retrieved contact;
the device also comprises a calling unit configured to call the retrieved contact.
8. The device according to claim 7, characterized in that the trigger message receiving unit specifically comprises an operation category window presenting subunit and a trigger message receiving subunit, wherein:
the operation category window presenting subunit is configured to present an operation category window on the mobile terminal screen;
the trigger message receiving subunit is configured to receive the trigger message for the operation category to be operated on the mobile terminal when the label corresponding to an operation category in the operation category window is clicked or becomes the focus.
9. The device according to claim 7, characterized in that the device also comprises a contact numbering unit and a numbered voice information receiving unit, wherein: the contact numbering unit is configured to number each contact when multiple contacts are retrieved according to the contact keyword; the numbered voice information receiving unit is configured to receive numbered voice information; and the calling unit is specifically configured to call the contact corresponding to the numbered voice information.
10. The device according to claim 7, characterized in that the device also comprises a keyword frequency increasing unit configured to increase, after the mobile terminal is operated, the frequency of the keyword corresponding to the operation in the keyword library under its operation category; and the keyword library retrieval unit is specifically configured to search the keyword library in descending order of keyword frequency when retrieving the keyword library under the operation category according to the voice keyword.
11. The device according to claim 7, characterized in that the device also comprises a keyword updating unit configured to update, after the mobile terminal is operated, the keyword library under the operation category according to the result of the operation when a preset condition is met.
CN201310157943.0A 2013-05-02 2013-05-02 A kind of audio recognition method of mobile terminal and device thereof Active CN103280217B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201310157943.0A CN103280217B (en) 2013-05-02 2013-05-02 A kind of audio recognition method of mobile terminal and device thereof
US14/787,926 US9502035B2 (en) 2013-05-02 2014-04-25 Voice recognition method for mobile terminal and device thereof
PCT/CN2014/076180 WO2014177015A1 (en) 2013-05-02 2014-04-25 Voice recognition method for mobile terminal and device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310157943.0A CN103280217B (en) 2013-05-02 2013-05-02 A kind of audio recognition method of mobile terminal and device thereof

Publications (2)

Publication Number Publication Date
CN103280217A CN103280217A (en) 2013-09-04
CN103280217B true CN103280217B (en) 2016-05-04

Family

ID=49062712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310157943.0A Active CN103280217B (en) 2013-05-02 2013-05-02 A kind of audio recognition method of mobile terminal and device thereof

Country Status (3)

Country Link
US (1) US9502035B2 (en)
CN (1) CN103280217B (en)
WO (1) WO2014177015A1 (en)

DK179496B1 (en) 2017-05-12 2019-01-15 Apple Inc. USER-SPECIFIC Acoustic Models
DK179745B1 (en) 2017-05-12 2019-05-01 Apple Inc. SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT
DK201770429A1 (en) 2017-05-12 2018-12-14 Apple Inc. Low-latency intelligent automated assistant
DK179549B1 (en) 2017-05-16 2019-02-12 Apple Inc. Far-field extension for digital assistant services
US10303715B2 (en) 2017-05-16 2019-05-28 Apple Inc. Intelligent automated assistant for media exploration
US20180336892A1 (en) 2017-05-16 2018-11-22 Apple Inc. Detecting a trigger of a digital assistant
US10311144B2 (en) 2017-05-16 2019-06-04 Apple Inc. Emoji word sense disambiguation
CN107564517A (en) * 2017-07-05 2018-01-09 百度在线网络技术(北京)有限公司 Voice awakening method, equipment and system, cloud server and computer-readable recording medium
CN107731231B (en) * 2017-09-15 2020-08-14 瑞芯微电子股份有限公司 Method for supporting multi-cloud-end voice service and storage device
US10592604B2 (en) 2018-03-12 2020-03-17 Apple Inc. Inverse text normalization for automatic speech recognition
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
CN108665900B (en) 2018-04-23 2020-03-03 百度在线网络技术(北京)有限公司 Cloud wake-up method and system, terminal and computer readable storage medium
US10674427B2 (en) * 2018-05-01 2020-06-02 GM Global Technology Operations LLC System and method to select and operate a mobile device through a telematics unit
US11145294B2 (en) 2018-05-07 2021-10-12 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
US10892996B2 (en) 2018-06-01 2021-01-12 Apple Inc. Variable latency device coordination
DK180639B1 (en) 2018-06-01 2021-11-04 Apple Inc DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT
DK201870355A1 (en) 2018-06-01 2019-12-16 Apple Inc. Virtual assistant operation in multi-device environments
DK179822B1 (en) 2018-06-01 2019-07-12 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10496705B1 (en) 2018-06-03 2019-12-03 Apple Inc. Accelerated task performance
CN109120774A (en) * 2018-06-29 2019-01-01 深圳市九洲电器有限公司 Terminal applies voice control method and system
CN108962261A (en) * 2018-08-08 2018-12-07 联想(北京)有限公司 Information processing method, information processing unit and bluetooth headset
CN108984800B (en) * 2018-08-22 2020-10-16 广东小天才科技有限公司 Voice question searching method and terminal equipment
US11010561B2 (en) 2018-09-27 2021-05-18 Apple Inc. Sentiment prediction from textual data
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US11170166B2 (en) 2018-09-28 2021-11-09 Apple Inc. Neural typographical error modeling via generative adversarial networks
US10839159B2 (en) 2018-09-28 2020-11-17 Apple Inc. Named entity normalization in a spoken dialog system
CN110970032A (en) * 2018-09-28 2020-04-07 深圳市冠旭电子股份有限公司 Sound box voice interaction control method and device
US11475898B2 (en) 2018-10-26 2022-10-18 Apple Inc. Low-latency multi-speaker speech recognition
KR102590914B1 (en) * 2018-12-14 2023-10-19 삼성전자주식회사 Electronic apparatus and Method for contolling the electronic apparatus thereof
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
CN109918040B (en) * 2019-03-15 2022-08-16 阿波罗智联(北京)科技有限公司 Voice instruction distribution method and device, electronic equipment and computer readable medium
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11475884B2 (en) 2019-05-06 2022-10-18 Apple Inc. Reducing digital assistant latency when a language is incorrectly determined
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11423908B2 (en) 2019-05-06 2022-08-23 Apple Inc. Interpreting spoken requests
DK201970509A1 (en) 2019-05-06 2021-01-15 Apple Inc Spoken notifications
US11140099B2 (en) 2019-05-21 2021-10-05 Apple Inc. Providing message response suggestions
US11289073B2 (en) 2019-05-31 2022-03-29 Apple Inc. Device text to speech
DK180129B1 (en) 2019-05-31 2020-06-02 Apple Inc. User activity shortcut suggestions
US11496600B2 (en) 2019-05-31 2022-11-08 Apple Inc. Remote execution of machine-learned models
DK201970511A1 (en) 2019-05-31 2021-02-15 Apple Inc Voice identification in digital assistant systems
US11468890B2 (en) 2019-06-01 2022-10-11 Apple Inc. Methods and user interfaces for voice-based control of electronic devices
US11360641B2 (en) 2019-06-01 2022-06-14 Apple Inc. Increasing the relevance of new available information
CN110561453B (en) * 2019-09-16 2020-09-29 北京觅机科技有限公司 Guided accompanying reading method of drawing robot
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators
US11615790B1 (en) * 2019-09-30 2023-03-28 Amazon Technologies, Inc. Disambiguating contacts using relationship data
US11043220B1 (en) 2020-05-11 2021-06-22 Apple Inc. Digital assistant hardware abstraction
US11061543B1 (en) 2020-05-11 2021-07-13 Apple Inc. Providing relevant data items based on context
US11755276B2 (en) 2020-05-12 2023-09-12 Apple Inc. Reducing description length based on confidence
US11917092B2 (en) * 2020-06-04 2024-02-27 Syntiant Systems and methods for detecting voice commands to generate a peer-to-peer communication link
US11490204B2 (en) 2020-07-20 2022-11-01 Apple Inc. Multi-device audio adjustment coordination
US11438683B2 (en) 2020-07-21 2022-09-06 Apple Inc. User identification using headphones
CN112199033B (en) * 2020-09-30 2023-06-20 北京搜狗科技发展有限公司 Voice input method and device and electronic equipment
CN113838467B (en) * 2021-08-02 2023-11-14 北京百度网讯科技有限公司 Voice processing method and device and electronic equipment
CN115659302B (en) * 2022-09-22 2023-07-14 北京睿家科技有限公司 Method and device for determining missing detection personnel, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853253A (en) * 2009-03-30 2010-10-06 三星电子株式会社 Equipment and method for managing multimedia contents in mobile terminal
CN102591932A (en) * 2011-12-23 2012-07-18 优视科技有限公司 Voice search method, voice search system, mobile terminal and transfer server
CN103020069A (en) * 2011-09-22 2013-04-03 联想(北京)有限公司 Method, device and electronic equipment for searching data
CN103077176A (en) * 2012-01-13 2013-05-01 北京飞漫软件技术有限公司 Method of carrying out quick search in browser according to type of key words

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6449496B1 (en) * 1999-02-08 2002-09-10 Qualcomm Incorporated Voice recognition user interface for telephone handsets
US6741963B1 (en) * 2000-06-21 2004-05-25 International Business Machines Corporation Method of managing a speech cache
US7246063B2 (en) * 2002-02-15 2007-07-17 Sap Aktiengesellschaft Adapting a user interface for voice control
JP2004341033A (en) * 2003-05-13 2004-12-02 Matsushita Electric Ind Co Ltd Voice mediated activating unit and its method
KR20050028150A (en) * 2003-09-17 2005-03-22 삼성전자주식회사 Mobile terminal and method for providing user-interface using voice signal
CN1801846A (en) * 2004-12-30 2006-07-12 中国科学院自动化研究所 Method for earphone full-voice handset dialing interaction application
US20100105435A1 (en) 2007-01-12 2010-04-29 Panasonic Corporation Method for controlling voice-recognition function of portable terminal and radiocommunications system
DE102008051756A1 (en) 2007-11-12 2009-05-14 Volkswagen Ag Multimodal user interface of a driver assistance system for entering and presenting information
US8837901B2 (en) * 2008-04-06 2014-09-16 Taser International, Inc. Systems and methods for a recorder user interface
US20130132079A1 (en) * 2011-11-17 2013-05-23 Microsoft Corporation Interactive speech recognition
CN102663016B (en) * 2012-03-21 2015-12-16 上海触乐信息科技有限公司 Electronic equipment inputs system and method thereof that candidate frame carries out inputting Information expansion

Also Published As

Publication number Publication date
WO2014177015A1 (en) 2014-11-06
US9502035B2 (en) 2016-11-22
US20160098991A1 (en) 2016-04-07
CN103280217A (en) 2013-09-04

Similar Documents

Publication Publication Date Title
CN103280217B (en) A kind of audio recognition method of mobile terminal and device thereof
US10657966B2 (en) Better resolution when referencing to concepts
US10431204B2 (en) Method and apparatus for discovering trending terms in speech requests
US9633653B1 (en) Context-based utterance recognition
US9280595B2 (en) Application query conversion
CN109522419B (en) Session information completion method and device
JP5851507B2 (en) Method and apparatus for internet search
CN110148416A (en) Audio recognition method, device, equipment and storage medium
CN107145571B (en) Searching method and device
CN107209905A (en) For personalized and task completion service, correspondence spends theme and sorted out
WO2020186828A1 (en) Quick jumping method and apparatus for application program, and electronic device and storage medium
CN107436691A (en) A kind of input method carries out method, client, server and the device of error correction
CN108920649B (en) Information recommendation method, device, equipment and medium
US10073828B2 (en) Updating language databases using crowd-sourced input
CN109903773A (en) Audio-frequency processing method, device and storage medium
CN106663113B (en) Saving and retrieving locations of objects
CN110532354A (en) The search method and device of content
CN107544684A (en) A kind of candidate word display methods and device
CN107092424A (en) A kind of display methods of error correction, device and the device of the display for error correction
CN109144458A (en) For executing the electronic equipment for inputting corresponding operation with voice
JP2023506087A (en) Voice Wakeup Method and Apparatus for Skills
CN108197105A (en) Natural language processing method, apparatus, storage medium and electronic equipment
KR102307380B1 (en) Natural language processing based call center support system and method
US11868678B2 (en) User interface sound emanation activity classification
CN110059491A (en) Data import monitoring method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee
CP01 Change in the name or title of a patent holder

Address after: 100080 Office B1208, 12th Floor, No. 19 Zhongguancun Street, Haidian District, Beijing

Patentee after: Hammer technology (Beijing) Limited by Share Ltd

Address before: 100080 Office B1208, 12th Floor, No. 19 Zhongguancun Street, Haidian District, Beijing

Patentee before: Hammer technology (Beijing) Co., Ltd.

TR01 Transfer of patent right

Effective date of registration: 20190117

Address after: 100041 B-0035, 2nd Floor, Building 3, No. 30 Shixing Street, Shijingshan District, Beijing

Patentee after: BEIJING ZIJIE TIAODONG NETWORK TECHNOLOGY CO., LTD.

Address before: 100080 Office B1208, 12th Floor, No. 19 Zhongguancun Street, Haidian District, Beijing

Patentee before: Hammer technology (Beijing) Limited by Share Ltd

TR01 Transfer of patent right