CN103984415A - Information processing method and electronic equipment - Google Patents

Information processing method and electronic equipment

Info

Publication number
CN103984415A
Authority
CN
China
Prior art keywords
data
image
input voice
personal pronoun
identity identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410211567.3A
Other languages
Chinese (zh)
Other versions
CN103984415B (en)
Inventor
杨振奕
王科
徐琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201410211567.3A
Publication of CN103984415A
Application granted
Publication of CN103984415B
Legal status: Active (granted)


Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses an information processing method and electronic equipment, solving the technical problem that existing electronic equipment cannot correctly recognize personal pronouns in a voice command. The method is applied to electronic equipment and comprises: obtaining input voice; recognizing the input voice with a speech recognition engine; obtaining first data when the input voice is recognized as containing a personal pronoun; determining, based on the first data, the referent that the personal pronoun refers to; and executing an operation instruction based on the referent, wherein the operation instruction is the instruction corresponding to the input voice that the speech recognition engine identifies after recognizing the input voice.

Description

Information processing method and electronic equipment
Technical field
The present invention relates to the field of electronic technology, and in particular to an information processing method and electronic equipment.
Background technology
At present, intelligent electronic devices such as tablet computers, smartphones and smartwatches can all recognize and execute a user's voice commands, which enriches the ways in which users interact with electronic equipment and brings convenience to users.
However, the inventors of the present application have found that the above prior art has at least the following technical problem:
A voice command obtained by the electronic equipment may contain a personal pronoun, and the electronic equipment has difficulty determining the referent of that personal pronoun, so that the user's voice command is not executed correctly.
Summary of the invention
The present application provides an information processing method and electronic equipment to solve the technical problem that electronic equipment in the prior art cannot correctly recognize personal pronouns in voice commands, thereby improving the ability of electronic equipment to recognize and execute voice commands and improving the user experience.
In one aspect, an embodiment of the present application provides an information processing method applied to electronic equipment, the method comprising: obtaining input voice; recognizing the input voice with a speech recognition engine; obtaining first data when the input voice is recognized as containing a personal pronoun; determining, based on the first data, the referent that the personal pronoun refers to; and executing an operation instruction based on the referent, wherein the operation instruction is the instruction corresponding to the input voice that the speech recognition engine identifies after recognizing the input voice.
Optionally, before determining, based on the first data, the referent that the personal pronoun refers to, the method further comprises: when the input voice is recognized as containing the personal pronoun, loading an identity recognition module so that the identity recognition module is in a running state. Determining, based on the first data, the referent that the personal pronoun refers to comprises: recognizing the first data with the identity recognition module and determining a first identity identifier corresponding to the first data; and taking the first identity identifier as the referent that the personal pronoun refers to.
Optionally, the identity recognition module comprises a face recognition module, and obtaining the first data when the input voice is recognized as containing a personal pronoun comprises: when the input voice is recognized as containing the personal pronoun, loading an image acquisition module so that the image acquisition module is in a running state; and obtaining a first image as the first data with the image acquisition module, the first image containing a person object. Recognizing the first data with the identity recognition module and determining the first identity identifier corresponding to the first data comprises: recognizing the first image with the face recognition module and producing a recognition result, the recognition result being the first identity identifier corresponding to the person object in the first image.
Optionally, the identity recognition module comprises a voiceprint recognition module, and obtaining the first data when the input voice is recognized as containing a personal pronoun comprises: when the input voice is recognized as containing a first-type personal pronoun, loading a voiceprint extraction module so that the voiceprint extraction module is in a running state; and extracting voiceprint data from the input voice as the first data with the voiceprint extraction module. Recognizing the first data with the identity recognition module and determining the first identity identifier corresponding to the first data comprises: recognizing the voiceprint data with the voiceprint recognition module and determining the first identity identifier corresponding to the voiceprint data.
Optionally, obtaining the first data when the input voice is recognized as containing a personal pronoun comprises: when the input voice is recognized as containing the personal pronoun and the instruction corresponding to the input voice identified by the speech recognition engine is a picture search instruction, loading an image acquisition module so that the image acquisition module is in a running state; and obtaining a first image as the first data with the image acquisition module, the first image containing a person object. Determining, based on the first data, the referent that the personal pronoun refers to comprises: determining the first image as the referent that the personal pronoun refers to. Executing the operation instruction based on the referent comprises: extracting a feature signature from the first image; and comparing the feature signature with a picture database to determine M pictures in the picture database that match the feature signature, M being a natural number.
Optionally, obtaining the first data when the input voice is recognized as containing a personal pronoun comprises: when the input voice is recognized as containing a personal pronoun and the personal pronoun belongs to the second type of personal pronouns, starting an image acquisition module so that the image acquisition module is in a running state; acquiring N images with the image acquisition module, N being a positive integer; determining a directional pointing gesture from the N images; and obtaining, with the image acquisition module and based on the directional pointing gesture, a first image as the first data.
In another aspect, an embodiment of the present application provides electronic equipment comprising: a voice acquisition unit for obtaining input voice; a voice recognition unit for recognizing the input voice with a speech recognition engine; a data acquisition unit for obtaining first data when the input voice is recognized as containing a personal pronoun; a determining unit for determining, based on the first data, the referent that the personal pronoun refers to; and an instruction execution unit for executing an operation instruction based on the referent, wherein the operation instruction is the instruction corresponding to the input voice that the speech recognition engine identifies after recognizing the input voice.
Optionally, the electronic equipment further comprises a first loading unit for loading an identity recognition module when the input voice is recognized as containing the personal pronoun, so that the identity recognition module is in a running state. The determining unit comprises an identity determining unit for recognizing the first data with the identity recognition module, determining a first identity identifier corresponding to the first data, and taking the first identity identifier as the referent that the personal pronoun refers to.
Optionally, the identity recognition module comprises a face recognition module. The data acquisition unit comprises a first image obtaining unit for loading an image acquisition module when the input voice is recognized as containing the personal pronoun, so that the image acquisition module is in a running state, and for obtaining a first image as the first data with the image acquisition module, the first image containing a person object. The identity determining unit comprises a first determining subunit for recognizing the first image with the face recognition module and producing a recognition result, the recognition result being the first identity identifier corresponding to the person object in the first image, and for taking the first identity identifier as the referent that the personal pronoun refers to.
Optionally, the identity recognition module comprises a voiceprint recognition module. The data acquisition unit comprises a voiceprint obtaining unit for loading a voiceprint extraction module when the input voice is recognized as containing the personal pronoun, so that the voiceprint extraction module is in a running state, and for extracting voiceprint data from the input voice as the first data with the voiceprint extraction module. The identity determining unit comprises a second determining subunit for recognizing the voiceprint data with the voiceprint recognition module, determining the first identity identifier corresponding to the voiceprint data, and taking the first identity identifier as the referent that the personal pronoun refers to.
Optionally, the data acquisition unit comprises a second image obtaining unit for loading an image acquisition module when the input voice is recognized as containing the personal pronoun and the instruction corresponding to the input voice identified by the speech recognition engine is a picture search instruction, so that the image acquisition module is in a running state, and for obtaining a first image as the first data with the image acquisition module, the first image containing a person object. The determining unit is configured to determine the first image as the referent that the personal pronoun refers to. The instruction execution unit is configured to extract a feature signature from the first image and compare the feature signature with the picture database to determine M pictures in the picture database that match the feature signature, M being a natural number.
Further, the data acquisition unit comprises a third image obtaining unit for starting an image acquisition module when the input voice is recognized as containing a personal pronoun and the personal pronoun belongs to the second type of personal pronouns, so that the image acquisition module is in a running state, for acquiring N images with the image acquisition module, for determining a directional pointing gesture from the N images, and for controlling the image acquisition module to obtain, based on the directional pointing gesture, a first image as the first data, N being a positive integer.
The one or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
In the embodiments of the present application, when the input voice is recognized as containing a personal pronoun, the referent that the personal pronoun refers to can be determined by obtaining the first data, and the operation instruction corresponding to the input voice can then be executed correctly according to the determined referent. This solves the technical problem that electronic equipment cannot correctly recognize personal pronouns in voice commands, improves the ability of electronic equipment to recognize and execute voice commands, and improves the user experience.
Brief description of the drawings
Fig. 1 is a flowchart of the information processing method in an embodiment of the present application;
Fig. 2 is a detailed flowchart of step 104 in an embodiment of the present application;
Fig. 3 is a detailed flowchart of step 103 in an embodiment of the present application;
Fig. 4 is a flowchart corresponding to Example 1 in an embodiment of the present application;
Fig. 5 is a flowchart corresponding to Example 2 in an embodiment of the present application;
Fig. 6 is a flowchart corresponding to Example 3 in an embodiment of the present application;
Fig. 7 is a functional block diagram of the electronic equipment in an embodiment of the present application.
Detailed description of the embodiments
The present application provides an information processing method and electronic equipment to solve the technical problem that electronic equipment in the prior art cannot correctly recognize personal pronouns in voice commands, thereby improving the ability of electronic equipment to recognize and execute voice commands and improving the user experience.
In the embodiments of the present application, the electronic equipment may be a smart device such as a smartphone, smartwatch, tablet computer, smart TV, smart refrigerator or smart car. When the electronic equipment recognizes that the user's input voice contains a personal pronoun, it obtains first data to determine the referent that the personal pronoun refers to, so that the electronic equipment can correctly execute the operation instruction corresponding to the input voice according to the determined referent. The first data may be an image acquired by an image acquisition module, or biometric data of the user. The technical solution of the embodiments of the present application solves the technical problem that electronic equipment cannot correctly recognize personal pronouns in voice commands, improves the ability of electronic equipment to recognize and execute voice commands, and improves the user experience.
The technical solution of the present application is described in detail below with reference to the drawings and specific embodiments. It should be understood that the embodiments of the present application and the specific features in those embodiments are detailed explanations of the technical solution of the present application rather than limitations on it, and that, where there is no conflict, the embodiments of the present application and the technical features therein may be combined with one another.
Referring to Fig. 1, the information processing method provided by an embodiment of the present application is applied to electronic equipment and comprises the following steps:
Step 101: obtain input voice.
Specifically, the electronic equipment may obtain the user's input voice through a voice input unit.
Step 102: recognize the input voice with a speech recognition engine.
Specifically, the speech recognition engine may be local to the electronic equipment or located in the cloud; the electronic equipment may access a cloud server to call the cloud speech recognition engine to perform speech recognition.
Step 103: when the input voice is recognized as containing a personal pronoun, obtain first data.
Specifically, the first data may be image data containing a person object, or the user's biometric data such as voiceprint data, fingerprint data or iris data.
Step 104: determine, based on the first data, the referent that the personal pronoun refers to.
Specifically, step 104 can be implemented in two ways:
Way 1: determine, with an identity recognition module, the identity identifier corresponding to the first data; this identity identifier is the referent of the personal pronoun.
Way 2: determine that the first data itself is the referent. For example, when the instruction corresponding to the recognized input voice is to make the electronic equipment retrieve an image corresponding to the referent of the personal pronoun, the acquired image data (i.e. the first data) can be matched against the pictures in a picture library to retrieve the picture corresponding to the image data; in this case the image data itself is the referent. As another example, when the instruction corresponding to the recognized input voice is to make the electronic equipment retrieve a voice file corresponding to the referent of the personal pronoun, the voiceprint data of the obtained input voice (i.e. the first data) can be matched against the voice files in a sound library to retrieve the voice file corresponding to the voiceprint data; in this case the voiceprint data itself is the referent.
Step 105: execute the operation instruction based on the referent, wherein the operation instruction is the instruction corresponding to the input voice that the speech recognition engine identifies after recognizing the input voice.
Specifically, the operation instruction is an instruction generated by the electronic equipment from the recognized input voice after the speech recognition engine has recognized the input voice. The generation of this operation instruction may occur at any moment after step 102 and before step 105, which is not limited by the embodiments of the present application.
The operation instruction includes at least the following types: first, the electronic equipment retrieves a certain class of files or folders corresponding to the referent of the personal pronoun; second, the electronic equipment communicates with a terminal corresponding to the referent of the personal pronoun, for example, when the input voice is 'send the photo to him', the operation instruction makes the electronic equipment send the currently displayed photo to the referent of 'him'; third, the electronic equipment logs in to a local account or network account corresponding to the referent of the personal pronoun.
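As an illustration only, the flow of steps 101 through 105 can be sketched in Python as follows. The keyword lists and all function names (find_personal_pronoun, process_voice_command and the callables it receives) are assumptions made for the sketch, not part of the claimed method; speech recognition, first-data acquisition and referent resolution are passed in as callables rather than implemented.

```python
# Illustrative sketch of steps 101-105; names and keyword lists are assumptions.

FIRST_TYPE_PRONOUNS = {"i", "me", "my", "mine"}            # pronouns referring to the speaker
SECOND_TYPE_PRONOUNS = {"he", "him", "his", "she", "her"}  # pronouns referring to someone else


def find_personal_pronoun(recognized_text):
    """Return (pronoun, type) if the recognized text contains a personal pronoun."""
    for word in recognized_text.lower().split():
        word = word.strip(".,!?")
        if word in FIRST_TYPE_PRONOUNS:
            return word, "first"
        if word in SECOND_TYPE_PRONOUNS:
            return word, "second"
    return None


def process_voice_command(recognized_text, obtain_first_data, resolve_referent, execute):
    """Steps 103-105: obtain first data, determine the referent, execute the instruction."""
    hit = find_personal_pronoun(recognized_text)      # after step 102: is a pronoun present?
    if hit is None:
        return execute(recognized_text, referent=None)
    pronoun, pronoun_type = hit
    first_data = obtain_first_data(pronoun_type)      # step 103: image or biometric data
    referent = resolve_referent(first_data)           # step 104: identity identifier or the data itself
    return execute(recognized_text, referent=referent)   # step 105
```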
In the above technical solution of the embodiments of the present application, when the input voice is recognized as containing a personal pronoun, the referent that the personal pronoun refers to can be determined by obtaining the first data, and the operation instruction corresponding to the input voice can then be executed correctly according to the determined referent. This solves the technical problem that electronic equipment cannot correctly recognize personal pronouns in voice commands, improves the ability of electronic equipment to recognize and execute voice commands, and improves the user experience.
Further, before step 104, the information processing method also comprises: when the input voice is recognized as containing the personal pronoun, loading an identity recognition module so that the identity recognition module is in a running state.
Step 104, determining based on the first data the referent that the personal pronoun refers to, comprises the following steps, referring to Fig. 2:
Step 1041: recognize the first data with the identity recognition module and determine the first identity identifier corresponding to the first data;
Step 1042: take the first identity identifier as the referent that the personal pronoun refers to.
Specifically, an identity identifier is a mark that characterizes a user's identity, and may be the user's name, ID card number, network account (such as a WeChat ID or QQ number), and so on. The identity recognition module may be a processing chip or a single-chip microcomputer; it determines the first identity identifier corresponding to the first data, and this first identity identifier is the referent of the personal pronoun.
In the embodiments of the present application, the identity recognition module determines the identity identifier corresponding to the first data by comparing the first data with the data in a corresponding feature database, in which each piece of feature data corresponds to an identity identifier. Therefore, when the first data is determined to match a piece of feature data in the feature database, the identity identifier corresponding to the first data can be determined to be the identity identifier corresponding to that feature data.
The feature database may reside locally on the electronic equipment or be kept on a server that the electronic equipment can access. The feature data in the feature database that corresponds to the first data is determined as follows: compare the first data with the feature data in the corresponding feature database and find the feature data whose match rate with the first data exceeds a set threshold; the identity corresponding to that feature data is the identity corresponding to the personal pronoun. If more than one piece of feature data exceeds the threshold, the identity corresponding to the feature data with the highest match rate is taken as the identity corresponding to the personal pronoun.
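A minimal sketch of the threshold-based matching just described is given below. The representation of the feature database as a plain dictionary, the injected similarity function and the example threshold of 0.8 are assumptions, not values taken from the patent.

```python
# Sketch of matching the first data against a feature database with a threshold.

def match_identity(first_data_features, feature_db, similarity, threshold=0.8):
    """Return the identity identifier of the feature data whose match rate with the
    first data exceeds the threshold; if several exceed it, pick the highest match."""
    best_identity, best_score = None, threshold
    for identity, stored_features in feature_db.items():
        score = similarity(first_data_features, stored_features)
        if score > best_score:
            best_identity, best_score = identity, score
    return best_identity  # None when no feature data exceeds the threshold
```

The face recognition and voiceprint cases described next both follow this same matching pattern, differing only in how the first data and the feature database are produced.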
Further, depending on the type of the first data, the technical solution of determining the identity identifier corresponding to the first data with the identity recognition module includes at least the following cases:
Case 1: the first data is an image, and the identity recognition module determines the corresponding identity identifier from the image.
Specifically, the identity recognition module comprises a face recognition module, and step 103 comprises the following steps:
when the input voice is recognized as containing the personal pronoun, load an image acquisition module so that the image acquisition module is in a running state;
obtain a first image as the first data with the image acquisition module, the first image containing a person object.
Step 1041 comprises: recognize the first image with the face recognition module and produce a recognition result, the recognition result being the first identity identifier corresponding to the person object in the first image.
Specifically, the image acquisition module may be a camera on the electronic equipment, and the first image may be a picture or a video file acquired by the image acquisition module. The face recognition module extracts face features from the first image and compares them with the face feature data in a face feature database, in which each piece of face feature data corresponds to an identity identifier; when the face features of the first image are determined to correspond to a piece of face feature data in the face feature database, the identity identifier corresponding to the first image can be determined accordingly.
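Reusing the match_identity sketch above, Case 1 might look as follows. The camera object and the face-feature extractor stand in for the image acquisition module and the face recognition module, and their interfaces are assumptions.

```python
# Sketch of Case 1: the first data is a first image and the identity identifier
# is determined through face recognition.

def resolve_referent_by_face(camera, extract_face_features, face_feature_db,
                             similarity, threshold=0.8):
    first_image = camera.capture()                      # image acquisition module
    face_features = extract_face_features(first_image)  # face recognition module: feature extraction
    if face_features is None:
        return None                                     # no person object found in the first image
    # compare against the face feature database (one identity identifier per entry)
    return match_identity(face_features, face_feature_db, similarity, threshold)
```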
Case 2: the first data is the voiceprint data of the input voice, and the identity recognition module determines the corresponding identity identifier from the voiceprint data.
Specifically, step 103 comprises the following steps: when the input voice is recognized as containing the personal pronoun, load a voiceprint extraction module so that the voiceprint extraction module is in a running state; extract voiceprint data from the input voice as the first data with the voiceprint extraction module.
Step 1041 comprises: recognize the voiceprint data with the voiceprint recognition module and determine the first identity identifier corresponding to the voiceprint data.
Specifically, first-type personal pronouns correspond to first-person pronouns such as 'I', 'me' and 'my'. When a first-type personal pronoun is detected, the referent of the personal pronoun can be determined to be the user who produced the input voice, so the user's identity identifier can be determined from the voiceprint data of the input voice.
The voiceprint extraction module is a speech processing module that extracts voiceprint data from the input voice according to a certain mathematical model. The voiceprint recognition module compares the voiceprint feature data in the voiceprint data with a voiceprint feature database, in which each piece of voiceprint feature data corresponds to an identity identifier; when the voiceprint data is determined to correspond to a piece of voiceprint feature data in the voiceprint feature database, the identity identifier corresponding to the voiceprint data can be determined accordingly.
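Case 2 follows the same pattern with voiceprint data extracted from the input voice itself; the extractor and the voiceprint feature database below are again stand-ins for the modules described above, reusing the match_identity sketch, and the interfaces are assumptions.

```python
# Sketch of Case 2: the first data is voiceprint data of the input voice,
# used when a first-type personal pronoun such as "I" or "my" is detected.

def resolve_referent_by_voiceprint(input_audio, extract_voiceprint,
                                   voiceprint_db, similarity, threshold=0.8):
    voiceprint = extract_voiceprint(input_audio)  # voiceprint extraction module
    # voiceprint recognition module: compare with the voiceprint feature database
    return match_identity(voiceprint, voiceprint_db, similarity, threshold)
```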
Case 3: besides voiceprint data, when the input voice is recognized as containing a personal pronoun, the electronic equipment may also acquire, through a corresponding data acquisition unit, other biometric features of the user such as a fingerprint or an iris as the first data. The identity recognition module can then determine the corresponding identity identifier from the fingerprint, iris or other biometric data.
Further, step 103 comprises the following steps: when the input voice is recognized as containing the personal pronoun and the instruction corresponding to the input voice identified by the speech recognition engine is a picture search instruction, load an image acquisition module so that the image acquisition module is in a running state;
obtain a first image as the first data with the image acquisition module, the first image containing a person object.
Step 104 comprises: determine the first image as the referent that the personal pronoun refers to.
Step 105 comprises the following steps:
extract a feature signature from the first image;
compare the feature signature with a picture database and determine M pictures in the picture database that match the feature signature, M being a natural number.
Specifically, a picture search instruction is an instruction that makes the electronic equipment search for the pictures corresponding to the personal pronoun. When the operation instruction is a picture search instruction, a first image containing the corresponding person object can be acquired with the image acquisition module, and a feature signature is then extracted from the first image; the feature signature may be the face features of the person object, or clothing features, hairstyle features and so on. The feature signature is then compared with all the pictures in the picture database of the electronic equipment to determine the M pictures in the picture database that match the feature signature.
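A sketch of this picture-search path follows. The picture database is represented as a dictionary of precomputed signatures, and the signature extractor, similarity function and threshold are assumptions made for the sketch.

```python
# Sketch of the picture search instruction: extract a feature signature from the
# first image and return the M pictures in the picture database that match it.

def search_matching_pictures(first_image, picture_db, extract_signature,
                             similarity, threshold=0.8):
    signature = extract_signature(first_image)  # e.g. face, clothing or hairstyle features
    matches = []
    for picture_id, stored_signature in picture_db.items():
        score = similarity(signature, stored_signature)
        if score > threshold:
            matches.append((score, picture_id))
    matches.sort(key=lambda m: m[0], reverse=True)     # best matches first
    return [picture_id for _, picture_id in matches]   # the M matching pictures
```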
In practice, when the operation instruction is determined to be a picture search instruction, the technical solution of the foregoing Case 1 can also be adopted: first recognize the identity identifier corresponding to the first image with the face recognition module, and then retrieve the pictures in the picture library corresponding to that identity identifier. If the face recognition module fails to recognize the identity identifier of the first image, the technical solution of directly comparing the first image with the images in the picture library can then be adopted.
In addition, the above technical solution is also applicable when the operation instruction is to make the electronic equipment search for a voice file corresponding to a first-type personal pronoun: voiceprint data can be extracted from the input voice with the voiceprint extraction module and matched against the voice files in a sound library to determine the voice file that matches the voiceprint data.
Further, in the embodiments of the present application, obtaining the first image corresponding to the personal pronoun with the image acquisition module includes at least the following two manners:
Manner 1: when the personal pronoun is recognized as a first-type personal pronoun, the personal pronoun corresponds to the current user, and an image of the current user is obtained directly with the image acquisition module (for example, a front-facing camera).
Manner 2: referring to Fig. 3, step 103 comprises the following steps:
Step 1031: when the input voice is recognized as containing a personal pronoun and the personal pronoun belongs to the second type of personal pronouns, start the image acquisition module so that the image acquisition module is in a running state;
Step 1032: acquire N images with the image acquisition module, N being a positive integer;
Step 1033: determine a directional pointing gesture from the N images;
Step 1034: the image acquisition module obtains, based on the directional pointing gesture, a first image as the first data.
Specifically, second-type personal pronouns are non-first-person pronouns and correspond to individuals other than the current user, for example 'he' and 'her'. When a second-type personal pronoun is recognized in the input voice, in order for the electronic equipment to obtain the image data corresponding to this personal pronoun, one or more images (or a video file) from which a directional pointing gesture can be determined are first obtained with the image acquisition module, for example by shooting the direction of the user's gesture with a front-facing camera, or the direction of the user's visual focus, or the direction of motion of the user's finger or arm; the pointing direction associated with this personal pronoun can then be determined from these images.
After the directional pointing gesture is determined, the image acquisition module of the electronic equipment is controlled to acquire at the acquisition position corresponding to the directional pointing gesture, so that a first image containing the individual corresponding to the personal pronoun can be obtained. This can in turn be done in two manners: first, the image acquisition unit is moved to the corresponding acquisition position according to the directional pointing gesture to complete the acquisition; second, the electronic equipment comprises multiple image acquisition modules, or the image acquisition module has multiple acquisition windows, and the image acquisition module or acquisition window corresponding to the directional pointing gesture is controlled to complete the image acquisition.
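Manner 2 can be sketched as below. The camera interface (capture, aim) and the gesture estimator are assumed stand-ins for the image acquisition module and the gesture analysis described above; the default of five frames is arbitrary.

```python
# Sketch of Manner 2: acquire N images, determine a directional pointing gesture,
# then acquire the first image at the corresponding acquisition position.

def acquire_first_image_by_gesture(camera, estimate_pointing_direction, n_frames=5):
    frames = [camera.capture() for _ in range(n_frames)]  # the N images
    direction = estimate_pointing_direction(frames)       # gesture, finger-motion or gaze direction
    if direction is None:
        return None                                       # no directional pointing gesture found
    camera.aim(direction)    # move the acquisition unit or select the matching acquisition window
    return camera.capture()  # first image containing the individual being pointed at
```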
Further, in the embodiments of the present application, second-type personal pronouns may also include appellation pronouns, such as 'Mr. Liu', 'Teacher Liang' or 'Manager Zhang'. The electronic equipment is often unable to determine the referent corresponding to such an appellation from the speech content alone (or can only determine several possible referents but not the unique, correct one); in this case, the technical solutions applicable to second-type personal pronouns in any of the above technical solutions of the present application can be adopted, and examples are not given one by one here.
Further, in the embodiments of the present application, when the input voice contains two or more personal pronouns, they can be processed by repeating and/or combining the above technical solutions. For example, when the input voice contains 'me and her ...', the referent of 'me' can be determined from the voiceprint data of the input voice with the voiceprint recognition module, or from the N images obtained in step 1031; then, after steps 1032 and 1033 are executed to obtain the image corresponding to 'her', the referent of 'her' is determined from that image.
The technical solution of the present application is explained below with specific examples:
Example 1, referring to Fig. 4, comprises the following steps:
Step 201: the electronic equipment obtains the user's input voice: 'search for my photos';
Step 202: recognize the input voice with the speech recognition engine;
Step 203: the input voice is recognized as containing the first-type personal pronoun 'my'; load the voiceprint extraction module and extract voiceprint data from the input voice with the voiceprint extraction module;
Step 204: the voiceprint recognition module compares the voiceprint data with the voiceprint feature database and recognizes that the identity identifier corresponding to the voiceprint data is 'Li Ming';
Step 205: determine that 'Li Ming' is the referent of 'my';
Step 206: execute the instruction corresponding to the input voice and search the picture library for the photos associated with 'Li Ming'. The generation of the instruction may occur at any moment after step 202 and before step 206; the ways in which 'Li Ming' is associated with a photo include 'Li Ming' appearing in the photo's name, 'Li Ming' being added to the photo's attribute list, and so on.
Example 2, referring to Fig. 5, comprises the following steps:
After the above step 202, execute step 207: the input voice is recognized as containing the first-type personal pronoun 'my'; load the image acquisition module and obtain a first image containing the current user;
Step 208: determine that the first image is the referent of 'my';
Step 209: execute the instruction corresponding to the input voice: extract face features from the first image, compare them with the face features of each photo in the picture library, and search out the photos whose face features match. The generation of the instruction may occur at any moment after step 202 and before step 209.
Example 3, referring to Fig. 6, comprises the following steps:
Step 301: while playing local music, the electronic equipment obtains the user's input voice: 'send the song to Mr. Li';
Step 302: recognize the input voice with the speech recognition engine;
Step 303: the input voice is recognized as containing 'Mr. Li'; start the image acquisition module;
Step 304: obtain N images with the image acquisition module, where the N images may be N frames of a video segment;
Step 305: determine a directional pointing gesture from the N images, where the directional pointing gesture may be determined from the direction of the user's gesture or the direction of finger motion in the N images;
Step 306: determine the acquisition position of the image acquisition module from the directional pointing gesture and acquire an image at the determined acquisition position, i.e. the first image;
Step 307: recognize the first image with the face recognition engine and determine that the identity identifier corresponding to the first image is 'Li Ming'. The face recognition engine works as follows: extract face features from the first image, compare them with the face feature database, and determine that the identity identifier corresponding to the matching face feature data is 'Li Ming';
Step 308: determine that 'Li Ming' is the referent of 'Mr. Li';
Step 309: execute the operation instruction corresponding to the input voice and send the locally playing song, through the corresponding network service, to the network service address corresponding to 'Li Ming'. The generation of the instruction may occur at any moment after step 302 and before step 309; the network services include MMS, e-mail, WeChat and so on, and the network service address of 'Li Ming' corresponds respectively to a mobile phone number, an e-mail address or a WeChat ID.
The above are three examples corresponding to several of the scenarios in the technical solution of the embodiments of the present application; similarly, the specific applications of the remaining technical solutions are not exemplified one by one in the present application.
Referring to Fig. 7, an embodiment of the present application provides electronic equipment, which may be a smart device such as a smartphone, smartwatch, tablet computer, smart TV, smart refrigerator or smart car. The electronic equipment comprises the following units (a sketch of how they cooperate follows the list):
A voice acquisition unit 10 for obtaining input voice;
A voice recognition unit 20 for recognizing the input voice with a speech recognition engine;
A data acquisition unit 30 for obtaining first data when the input voice is recognized as containing a personal pronoun;
A determining unit 40 for determining, based on the first data, the referent that the personal pronoun refers to;
An instruction execution unit 50 for executing an operation instruction based on the referent, wherein the operation instruction is the instruction corresponding to the input voice that the speech recognition engine identifies after recognizing the input voice.
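As an illustration of how the units of Fig. 7 might cooperate, a minimal composition is sketched below; the class, attribute and method names are assumptions and not the claimed structure of the electronic equipment.

```python
# Illustrative composition of the functional units 10-50 of Fig. 7.

class ElectronicEquipment:
    def __init__(self, voice_acquisition_unit, voice_recognition_unit,
                 data_acquisition_unit, determining_unit, instruction_execution_unit):
        self.voice_acquisition_unit = voice_acquisition_unit          # unit 10
        self.voice_recognition_unit = voice_recognition_unit          # unit 20
        self.data_acquisition_unit = data_acquisition_unit            # unit 30
        self.determining_unit = determining_unit                      # unit 40
        self.instruction_execution_unit = instruction_execution_unit  # unit 50

    def handle_voice_command(self):
        audio = self.voice_acquisition_unit.obtain()                      # obtain input voice
        text, instruction = self.voice_recognition_unit.recognize(audio)  # speech recognition engine
        first_data = self.data_acquisition_unit.obtain(text, audio)       # image or biometric data
        referent = self.determining_unit.resolve(first_data)              # referent of the pronoun
        return self.instruction_execution_unit.execute(instruction, referent)
```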
In the above technical solution of the embodiments of the present application, when the input voice is recognized as containing a personal pronoun, the referent that the personal pronoun refers to can be determined by obtaining the first data, and the operation instruction corresponding to the input voice can then be executed correctly according to the determined referent. This solves the technical problem that electronic equipment cannot correctly recognize personal pronouns in voice commands, improves the ability of electronic equipment to recognize and execute voice commands, and improves the user experience.
Further, the electronic equipment also comprises a first loading unit for loading an identity recognition module when the input voice is recognized as containing the personal pronoun, so that the identity recognition module is in a running state;
The determining unit 40 comprises an identity determining unit for recognizing the first data with the identity recognition module, determining the first identity identifier corresponding to the first data, and taking the first identity identifier as the referent that the personal pronoun refers to.
Further, the identity recognition module comprises a face recognition module, and the data acquisition unit 30 comprises a first image obtaining unit for loading an image acquisition module when the input voice is recognized as containing the personal pronoun, so that the image acquisition module is in a running state, and for obtaining a first image as the first data with the image acquisition module, the first image containing a person object;
The identity determining unit comprises a first determining subunit for recognizing the first image with the face recognition module and producing a recognition result, the recognition result being the first identity identifier corresponding to the person object in the first image, and for taking the first identity identifier as the referent that the personal pronoun refers to.
Further, the identity recognition module comprises a voiceprint recognition module; the data acquisition unit 30 comprises a voiceprint obtaining unit for loading a voiceprint extraction module when the input voice is recognized as containing the personal pronoun, so that the voiceprint extraction module is in a running state, and for extracting voiceprint data from the input voice as the first data with the voiceprint extraction module;
The identity determining unit comprises a second determining subunit for recognizing the voiceprint data with the voiceprint recognition module, determining the first identity identifier corresponding to the voiceprint data, and taking the first identity identifier as the referent that the personal pronoun refers to.
Further, the data acquisition unit comprises a second image obtaining unit for loading an image acquisition module when the input voice is recognized as containing the personal pronoun and the instruction corresponding to the input voice identified by the speech recognition engine is a picture search instruction, so that the image acquisition module is in a running state, and for obtaining a first image as the first data with the image acquisition module, the first image containing a person object;
The determining unit 40 is configured to determine the first image as the referent that the personal pronoun refers to;
The instruction execution unit 50 is configured to extract a feature signature from the first image and compare the feature signature with the picture database to determine M pictures in the picture database that match the feature signature, M being a natural number.
Further, the data acquisition unit comprises a third image obtaining unit for starting an image acquisition module when the input voice is recognized as containing a personal pronoun and the personal pronoun belongs to the second type of personal pronouns, so that the image acquisition module is in a running state, for acquiring N images with the image acquisition module, for determining a directional pointing gesture from the N images, and for controlling the image acquisition module to obtain, based on the directional pointing gesture, a first image as the first data, N being a positive integer.
The various processing manners in the information processing method and the specific examples of the foregoing embodiments are equally applicable to the electronic equipment of this embodiment. Through the detailed description of the information processing method in the foregoing embodiments, those skilled in the art can clearly understand how the electronic equipment of this embodiment is implemented, so for brevity of the specification it is not described in detail here.
The one or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
In the embodiments of the present application, when the input voice is recognized as containing a personal pronoun, the referent that the personal pronoun refers to can be determined by obtaining the first data, and the operation instruction corresponding to the input voice can then be executed correctly according to the determined referent. This solves the technical problem that electronic equipment cannot correctly recognize personal pronouns in voice commands, improves the ability of electronic equipment to recognize and execute voice commands, and improves the user experience.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, the device (system) and the computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Specifically, the computer program instructions corresponding to the information processing method in the embodiments of the present application may be stored on a storage medium such as an optical disk, a hard disk or a USB flash drive. When the computer program instructions corresponding to the information processing method on the storage medium are read or executed by electronic equipment, they comprise the following steps:
Obtain input voice;
Recognize the input voice with a speech recognition engine;
When the input voice is recognized as containing a personal pronoun, obtain first data;
Determine, based on the first data, the referent that the personal pronoun refers to;
Execute the operation instruction based on the referent, wherein the operation instruction is the instruction corresponding to the input voice that the speech recognition engine identifies after recognizing the input voice.
Optionally, the storage medium also stores other computer instructions which are executed before the computer instructions corresponding to the step of determining, based on the first data, the referent that the personal pronoun refers to, and which, when executed, comprise the following step:
When the input voice is recognized as containing the personal pronoun, load an identity recognition module so that the identity recognition module is in a running state;
The computer instructions stored on the storage medium corresponding to the step of determining, based on the first data, the referent that the personal pronoun refers to, when executed, specifically comprise the following steps:
Recognize the first data with the identity recognition module and determine the first identity identifier corresponding to the first data;
Take the first identity identifier as the referent that the personal pronoun refers to.
Optionally, the identity recognition module comprises a face recognition module, and the computer instructions stored on the storage medium corresponding to the step of obtaining the first data when the input voice is recognized as containing a personal pronoun, when executed, specifically comprise the following steps:
When the input voice is recognized as containing the personal pronoun, load an image acquisition module so that the image acquisition module is in a running state;
Obtain a first image as the first data with the image acquisition module, the first image containing a person object;
The computer instructions stored on the storage medium corresponding to the step of recognizing the first data with the identity recognition module and determining the first identity identifier corresponding to the first data, when executed, specifically comprise the following step:
Recognize the first image with the face recognition module and produce a recognition result, the recognition result being the first identity identifier corresponding to the person object in the first image.
Optionally, the identity recognition module comprises a voiceprint recognition module, and the computer instructions stored on the storage medium corresponding to the step of obtaining the first data when the input voice is recognized as containing a personal pronoun, when executed, specifically comprise the following steps:
When the input voice is recognized as containing a first-type personal pronoun, load a voiceprint extraction module so that the voiceprint extraction module is in a running state;
Extract voiceprint data from the input voice as the first data with the voiceprint extraction module;
The computer instructions stored on the storage medium corresponding to the step of recognizing the first data with the identity recognition module and determining the first identity identifier corresponding to the first data, when executed, specifically comprise the following step:
Recognize the voiceprint data with the voiceprint recognition module and determine the first identity identifier corresponding to the voiceprint data.
Optionally, the computer instructions stored on the storage medium corresponding to the step of obtaining the first data when the input voice is recognized as containing a personal pronoun, when executed, specifically comprise the following steps:
When the input voice is recognized as containing the personal pronoun and the instruction corresponding to the input voice identified by the speech recognition engine is a picture search instruction, load an image acquisition module so that the image acquisition module is in a running state;
Obtain a first image as the first data with the image acquisition module, the first image containing a person object;
The computer instructions stored on the storage medium corresponding to the step of determining, based on the first data, the referent that the personal pronoun refers to, when executed, specifically comprise the following step:
Determine the first image as the referent that the personal pronoun refers to;
The computer instructions stored on the storage medium corresponding to the step of executing the operation instruction based on the referent, when executed, specifically comprise the following steps:
Extract a feature signature from the first image;
Compare the feature signature with a picture database and determine M pictures in the picture database that match the feature signature, M being a natural number.
The computer instructions stored on the storage medium corresponding to the step of obtaining the first data when the input voice is recognized as containing a personal pronoun, when executed, specifically comprise the following steps:
When the input voice is recognized as containing a personal pronoun and the personal pronoun belongs to the second type of personal pronouns, start an image acquisition module so that the image acquisition module is in a running state;
Acquire N images with the image acquisition module, N being a positive integer;
Determine a directional pointing gesture from the N images;
The image acquisition module obtains, based on the directional pointing gesture, a first image as the first data.
Although the preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Obviously, those skilled in the art can make various changes and modifications to the present application without departing from the spirit and scope of the present application. Thus, if these changes and modifications of the present application fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to include these changes and modifications.

Claims (12)

1. An information processing method applied to electronic equipment, the method comprising:
obtaining input voice;
recognizing the input voice with a speech recognition engine;
when the input voice is recognized as containing a personal pronoun, obtaining first data;
determining, based on the first data, the referent that the personal pronoun refers to;
executing an operation instruction based on the referent, wherein the operation instruction is the instruction corresponding to the input voice that the speech recognition engine identifies after recognizing the input voice.
2. The method of claim 1, wherein, before determining, based on the first data, the referent that the personal pronoun refers to, the method further comprises:
when the input voice is recognized as containing the personal pronoun, loading an identity recognition module so that the identity recognition module is in a running state;
and wherein determining, based on the first data, the referent that the personal pronoun refers to comprises:
recognizing the first data with the identity recognition module and determining a first identity identifier corresponding to the first data;
taking the first identity identifier as the referent that the personal pronoun refers to.
3. The method of claim 2, wherein the identity recognition module comprises a face recognition module, and obtaining the first data when the input voice is recognized as containing a personal pronoun comprises:
when the input voice is recognized as containing the personal pronoun, loading an image acquisition module so that the image acquisition module is in a running state;
obtaining a first image as the first data with the image acquisition module, the first image containing a person object;
and wherein recognizing the first data with the identity recognition module and determining the first identity identifier corresponding to the first data comprises:
recognizing the first image with the face recognition module and producing a recognition result, the recognition result being the first identity identifier corresponding to the person object in the first image.
4. The method according to claim 2, wherein the identity identification module comprises a voiceprint recognition module, and the obtaining the first data when it is identified that the input voice comprises the personal pronoun comprises:
when it is identified that the input voice comprises a first type of personal pronoun, loading a voiceprint extraction module so that the voiceprint extraction module is in a running state;
extracting voiceprint data from the input voice through the voiceprint extraction module as the first data;
and the identifying the first data through the identity identification module and determining the first identity identifier corresponding to the first data comprises:
identifying the voiceprint data through the voiceprint recognition module, and determining the first identity identifier corresponding to the voiceprint data.
5. The method according to claim 1, wherein the obtaining the first data when it is identified that the input voice comprises the personal pronoun comprises:
when it is identified that the input voice comprises the personal pronoun and the instruction corresponding to the input voice identified by the speech recognition engine is a picture search instruction, loading an image capture module so that the image capture module is in a running state;
acquiring a first image as the first data through the image capture module, the first image comprising a person object;
the determining, based on the first data, the coreference object referred to by the personal pronoun comprises:
determining the first image as the coreference object referred to by the personal pronoun;
and the executing the operation instruction based on the coreference object comprises:
extracting a feature identifier from the first image;
comparing the feature identifier with a picture database, and determining, from the picture database, M pictures matching the feature identifier, where M is a natural number.
6. The method according to claim 1, wherein the obtaining the first data when it is identified that the input voice comprises the personal pronoun comprises:
when it is identified that the input voice comprises a personal pronoun and the personal pronoun belongs to a second type of personal pronoun, starting an image capture module so that the image capture module is in a running state;
collecting N images through the image capture module, where N is a positive integer;
determining, from the N images, a directional pointing gesture;
acquiring, by the image capture module based on the directional pointing gesture, a first image as the first data.
7. An electronic device, comprising:
a voice acquisition unit, configured to obtain an input voice;
a voice recognition unit, configured to identify the input voice through a speech recognition engine;
a data acquisition unit, configured to obtain first data when it is identified that the input voice comprises a personal pronoun;
a determining unit, configured to determine, based on the first data, a coreference object referred to by the personal pronoun;
an instruction execution unit, configured to execute an operation instruction based on the coreference object, wherein the operation instruction is an instruction corresponding to the input voice that is identified by the speech recognition engine after the input voice has been identified through the speech recognition engine.
8. The electronic device according to claim 7, wherein the electronic device further comprises:
a first loading unit, configured to load an identity identification module when it is identified that the input voice comprises the personal pronoun, so that the identity identification module is in a running state;
and the determining unit comprises an identity identifier determining unit, configured to identify the first data through the identity identification module, determine a first identity identifier corresponding to the first data, and take the first identity identifier as the coreference object referred to by the personal pronoun.
9. The electronic device according to claim 8, wherein the identity identification module comprises a face recognition module;
the data acquisition unit comprises a first image acquisition unit, configured to load an image capture module when it is identified that the input voice comprises the personal pronoun, so that the image capture module is in a running state, and to acquire a first image as the first data through the image capture module, the first image comprising a person object;
and the identity identifier determining unit comprises a first determining subunit, configured to identify the first image through the face recognition module and produce a recognition result, the recognition result being the first identity identifier corresponding to the person object in the first image, and to take the first identity identifier as the coreference object referred to by the personal pronoun.
10. The electronic device according to claim 8, wherein the identity identification module comprises a voiceprint recognition module;
the data acquisition unit comprises a voiceprint acquisition unit, configured to load a voiceprint extraction module when it is identified that the input voice comprises the personal pronoun, so that the voiceprint extraction module is in a running state, and to extract voiceprint data from the input voice through the voiceprint extraction module as the first data;
and the identity identifier determining unit comprises a second determining subunit, configured to identify the voiceprint data through the voiceprint recognition module, determine the first identity identifier corresponding to the voiceprint data, and take the first identity identifier as the coreference object referred to by the personal pronoun.
11. The electronic device according to claim 7, wherein the data acquisition unit comprises a second image acquisition unit, configured to load an image capture module when it is identified that the input voice comprises the personal pronoun and the instruction corresponding to the input voice identified by the speech recognition engine is a picture search instruction, so that the image capture module is in a running state, and to acquire a first image as the first data through the image capture module, the first image comprising a person object;
the determining unit is configured to determine the first image as the coreference object referred to by the personal pronoun;
and the instruction execution unit is specifically configured to extract a feature identifier from the first image, compare the feature identifier with the picture database, and determine, from the picture database, M pictures matching the feature identifier, where M is a natural number.
12. The electronic device according to claim 7, wherein the data acquisition unit comprises a third image acquisition unit, configured to start an image capture module when it is identified that the input voice comprises a personal pronoun and the personal pronoun belongs to a second type of personal pronoun, so that the image capture module is in a running state, to collect N images through the image capture module, to determine, from the N images, a directional pointing gesture, and to control the image capture module to acquire, based on the directional pointing gesture, a first image as the first data, where N is a positive integer.
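To make the method of claims 1 to 4 concrete, a minimal end-to-end sketch follows. The grouping of pronouns into a first type (speaker-referring) and a second type (other-referring), and every helper named below (recognize_speech, capture_image, recognize_face, extract_voiceprint, recognize_voiceprint, execute), are assumptions introduced for illustration and are not interfaces defined by the disclosure.

# Hypothetical sketch of the claimed method: resolve a personal pronoun in a
# voice command to a coreference object, then execute the recognized command.
# All helpers below are assumed stubs, not APIs from the original disclosure.

from typing import Optional, Tuple

FIRST_TYPE = {"i", "me", "my"}                 # pronouns referring to the speaker (assumption)
SECOND_TYPE = {"he", "she", "him", "her"}      # pronouns referring to another person (assumption)

def recognize_speech(audio: bytes) -> Tuple[str, str]:
    """Assumed speech engine: returns (transcript, operation instruction)."""
    ...

def capture_image() -> bytes: ...               # load the image capture module, acquire a first image
def recognize_face(image: bytes) -> str: ...    # face recognition -> first identity identifier
def extract_voiceprint(audio: bytes) -> bytes: ...
def recognize_voiceprint(vp: bytes) -> str: ... # voiceprint recognition -> first identity identifier
def execute(instruction: str, target: Optional[str]) -> None: ...

def find_pronoun(transcript: str) -> Optional[str]:
    for token in transcript.lower().split():
        if token in FIRST_TYPE or token in SECOND_TYPE:
            return token
    return None

def handle_voice_command(audio: bytes) -> None:
    transcript, instruction = recognize_speech(audio)   # identify the input voice
    pronoun = find_pronoun(transcript)
    if pronoun is None:
        execute(instruction, target=None)
        return
    if pronoun in FIRST_TYPE:
        # first data = voiceprint data extracted from the input voice itself
        identity = recognize_voiceprint(extract_voiceprint(audio))
    else:
        # first data = an image of the person object captured by the camera
        identity = recognize_face(capture_image())
    # the first identity identifier stands in for the coreference object
    execute(instruction, target=identity)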
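For the picture-search branch of claims 5 and 11, the following sketch assumes the feature identifier is an embedding vector compared by cosine similarity; the embed() function and the similarity threshold are illustrative choices, not part of the claims.

# Hypothetical sketch of the picture-search branch: extract a feature identifier
# from the first image and return the M database pictures that match it.

import numpy as np
from typing import Dict, List

def embed(image: bytes) -> np.ndarray:
    """Assumed feature extractor mapping an image to a feature vector."""
    ...

def search_pictures(first_image: bytes,
                    picture_db: Dict[str, np.ndarray],
                    threshold: float = 0.8) -> List[str]:
    """Return the names of the M pictures whose stored feature vectors match
    the feature identifier extracted from the first image."""
    query = embed(first_image)
    query = query / np.linalg.norm(query)
    matches = []
    for name, vec in picture_db.items():
        score = float(np.dot(query, vec / np.linalg.norm(vec)))  # cosine similarity
        if score >= threshold:            # picture "matches the feature identifier"
            matches.append(name)
    return matches                        # M pictures; M is a natural number and may be zero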
CN201410211567.3A 2014-05-19 2014-05-19 A kind of information processing method and electronic equipment Active CN103984415B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410211567.3A CN103984415B (en) 2014-05-19 2014-05-19 A kind of information processing method and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410211567.3A CN103984415B (en) 2014-05-19 2014-05-19 A kind of information processing method and electronic equipment

Publications (2)

Publication Number Publication Date
CN103984415A true CN103984415A (en) 2014-08-13
CN103984415B CN103984415B (en) 2017-08-29

Family

ID=51276425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410211567.3A Active CN103984415B (en) 2014-05-19 2014-05-19 A kind of information processing method and electronic equipment

Country Status (1)

Country Link
CN (1) CN103984415B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140046891A1 (en) * 2012-01-25 2014-02-13 Sarah Banas Sapient or Sentient Artificial Intelligence
US20130217297A1 (en) * 2012-02-21 2013-08-22 Makoto Araki Toy having voice recognition and method for using same
CN102662961A (en) * 2012-03-08 2012-09-12 北京百舜华年文化传播有限公司 Method, apparatus and terminal unit for matching semantics with image

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104714640A (en) * 2015-02-06 2015-06-17 上海语镜汽车信息技术有限公司 Vehicle-mounted terminal device based on gesture control and cloud computation technology with voice interaction and high-definition image obtaining functions
CN109524853A (en) * 2018-10-23 2019-03-26 珠海市杰理科技股份有限公司 Gesture identification socket and socket control method
CN110516083A (en) * 2019-08-30 2019-11-29 京东方科技集团股份有限公司 Photograph album management method, storage medium and electronic equipment
CN110516083B (en) * 2019-08-30 2022-07-12 京东方科技集团股份有限公司 Album management method, storage medium and electronic device
US11580971B2 (en) 2019-08-30 2023-02-14 Boe Technology Group Co., Ltd. Photo album management method, storage medium and electronic device
CN114063856A (en) * 2021-11-17 2022-02-18 塔米智能科技(北京)有限公司 Identity registration method, device, equipment and medium

Also Published As

Publication number Publication date
CN103984415B (en) 2017-08-29

Similar Documents

Publication Publication Date Title
CN108121816B (en) Picture classification method and device, storage medium and electronic equipment
CN106098063B (en) Voice control method, terminal device and server
KR20190072563A (en) Method and apparatus for detecting facial live varnish, and electronic device
CN109034069B (en) Method and apparatus for generating information
EP2890088A1 (en) Method for displaying schedule reminding information, terminal device and cloud server
CN104050449A (en) Face recognition method and device
CN104854539A (en) Object searching method and device
JP2021034003A (en) Human object recognition method, apparatus, electronic device, storage medium, and program
CN112395390B (en) Training corpus generation method of intention recognition model and related equipment thereof
CN109194689B (en) Abnormal behavior recognition method, device, server and storage medium
US20140232748A1 (en) Device, method and computer readable recording medium for operating the same
US11503110B2 (en) Method for presenting schedule reminder information, terminal device, and cloud server
CN109190654A (en) The training method and device of human face recognition model
CN112669876B (en) Emotion recognition method, emotion recognition device, computer equipment and storage medium
CN112995757B (en) Video clipping method and device
CN109215037A (en) Destination image partition method, device and terminal device
CN105631404A (en) Method and device for clustering pictures
CN108345612A (en) A kind of question processing method and device, a kind of device for issue handling
CN113254491A (en) Information recommendation method and device, computer equipment and storage medium
CN103984415A (en) Information processing method and electronic equipment
CN103019369A (en) Electronic device and method for playing documents based on facial expressions
CN110070016A (en) A kind of robot control method, device and storage medium
CN103984931A (en) Information processing method and first electronic equipment
CN111160251A (en) Living body identification method and device
CN108052506B (en) Natural language processing method, device, storage medium and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant