CN109032039A

CN109032039A - Method and device for voice control

Info

Publication number: CN109032039A
Application number: CN201811031798.0A
Authority: CN
Inventors: 许超
Original assignee: Beijing Yushanzhi Information Technology Co Ltd
Current assignee: Mobvoi Innovation Technology Co Ltd
Priority date: 2018-09-05
Filing date: 2018-09-05
Publication date: 2018-12-18
Anticipated expiration: 2038-09-05
Also published as: CN109032039B

Abstract

Embodiments of the invention disclose a method and device for voice control that avoid the misoperation caused when multiple devices respond simultaneously to a user's voice instruction. The method comprises: obtaining a user posture image, the user posture image being captured at a first moment by at least one acquisition device located in a preset space; determining, according to the user posture image, the target controlled device that the user intends to control from among at least one controlled device in the preset space; and controlling the target controlled device to respond to the voice instruction input by the user at the first moment.

Description

Method and device for voice control

Technical field

The present invention relates to the field of terminal applications, and in particular to a method and device for voice control.

Background

Traditionally, multiple devices are controlled separately, each with its own remote control. These remote controls are usually not interchangeable, and operating them is cumbersome. Voice control emerged to allow devices to be controlled in a simpler, more natural way.

Currently, to support voice control, a controlled device is equipped with a camera or a voice device so that it can perform visual recognition or speech recognition. In a real application environment, however, the same space may contain multiple devices that support voice control, all of them equipped with cameras and voice software, which easily leads to misoperation during voice control.

Summary of the invention

In view of this, embodiments of the present invention provide a method and device for voice control, the main purpose of which is to avoid the misoperation caused when multiple devices respond simultaneously to a user's voice instruction.

According to a first aspect of the embodiments of the present invention, a method for voice control is provided, comprising: obtaining a user posture image, the user posture image being captured at a first moment by at least one acquisition device located in a preset space; determining, according to the user posture image, the target controlled device that the user intends to control from among at least one controlled device in the preset space; and controlling the target controlled device to respond to the voice instruction input by the user at the first moment.

In an embodiment of the present invention, determining, according to the user posture image, the target controlled device that the user intends to control from among the at least one controlled device in the preset space comprises: determining, according to the user posture image, the body angle of the user, the face angle of the user, and/or the sight angle of the user; and determining, according to the body angle, the face angle, and/or the sight angle of the user, the controlled device that the user is facing among the at least one controlled device as the target controlled device.

In an embodiment of the present invention, obtaining the user posture image comprises: receiving at least one image from the at least one acquisition device; determining, from the at least one image, the images whose timestamp is the first moment; performing target detection on the images whose timestamp is the first moment according to a preset target user model, to determine the images that contain a target user, the target user being a user of the at least one controlled device; and determining the images that contain the target user as the user posture image.

In an embodiment of the present invention, controlling the target controlled device to respond to the voice instruction input by the user at the first moment comprises: sending a control instruction to the target controlled device, the control instruction instructing the target controlled device to respond to the voice instruction input by the user at the first moment.

In an embodiment of the present invention, obtaining the user posture image comprises: obtaining the voice instruction input by the user at the first moment; identifying the user who input the voice instruction according to a preset user voiceprint model; and, when the identified user is a legitimate user, capturing the user posture image.

In an embodiment of the present invention, controlling the target controlled device to respond to the voice instruction input by the user at the first moment comprises: performing speech recognition on the voice instruction; and responding to the voice instruction by executing the corresponding target operation.

According to a second aspect of the embodiments of the present invention, a device for voice control is provided, comprising: an obtaining unit, configured to obtain a user posture image, the user posture image being captured at a first moment by at least one acquisition device located in a preset space; a determination unit, configured to determine, according to the user posture image, the target controlled device that the user intends to control from among at least one controlled device in the preset space; and a control unit, configured to control the target controlled device to respond to the voice instruction input by the user at the first moment.

In an embodiment of the present invention, the determination unit is specifically configured to: determine, according to the user posture image, the body angle of the user, the face angle of the user, and/or the sight angle of the user; and determine, according to the body angle, the face angle, and/or the sight angle of the user, the controlled device that the user is facing among the at least one controlled device as the target controlled device.

According to a third aspect of the embodiments of the present invention, an electronic device is provided, comprising: at least one processor; and at least one memory and a bus connected to the processor; wherein the processor and the memory communicate with each other through the bus, and the processor is configured to call program instructions in the memory to execute the method for voice control described in one or more of the technical solutions above.

According to a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided. The computer-readable storage medium stores computer instructions that cause a computer to execute the method for voice control described in one or more of the technical solutions above.

With the above technical solution, the method and device for voice control provided by the embodiments of the present invention first obtain a user posture image captured at a first moment by at least one acquisition device located in a preset space; then determine, according to the user posture image, the target controlled device that the user intends to control from among at least one controlled device in the preset space; and finally control the target controlled device to respond to the voice instruction input by the user at the first moment. That is, in the embodiments of the present invention, the target controlled device that the user intends to control is determined among multiple controlled devices according to the user posture image, and only that device is controlled to respond to the user's voice instruction, thereby avoiding the misoperation caused when multiple devices respond simultaneously to the user's voice instruction.

The above description is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features, and advantages of the present invention are easier to understand, specific embodiments of the present invention are set out below.

Brief description of the drawings

Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numbers denote the same parts. In the drawings:

Fig. 1 is a schematic flowchart of the voice control method in an embodiment of the present invention;

Fig. 2 is a schematic diagram of the preset space in an embodiment of the present invention;

Fig. 3 is a schematic flowchart of determining the target controlled device in an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of the voice control apparatus in an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of the electronic device in an embodiment of the present invention.

Detailed description of the embodiments

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.

In embodiments of the present invention, a single preset space, such as a living room, a bedroom, an office, or a vehicle interior, may contain at least one electronic device at the same time, such as a smartphone, smartwatch, tablet computer, laptop, smart air conditioner, network camera, or smart speaker, and the user can control these electronic devices by inputting voice instructions. However, because multiple electronic devices are present in the same space, the user may actually want to control device A while device B responds to the voice instruction, resulting in misoperation.

To solve this problem, in embodiments of the present invention, the electronic devices in the same preset space are divided, according to whether they have an image acquisition function, into acquisition devices and controlled devices. An acquisition device captures images of the preset space; a controlled device responds to the user's voice instruction and has a voice acquisition function. In practice, the same electronic device may be both an acquisition device and a controlled device, such as a smart TV, network camera, smartphone, or smartwatch; or it may be only a controlled device, such as a smart speaker or smart air conditioner. The embodiments of the present invention do not specifically limit this.

In practical applications, the electronic devices in the same preset space may communicate directly or indirectly. For example, these electronic devices may log in to a background server with the same user account and then communicate through the background server; alternatively, to improve communication efficiency, they may communicate through an intelligent gateway provided in the preset space; furthermore, to improve communication efficiency still further, they may communicate using wireless communication technologies such as Zigbee or Wi-Fi (Wireless Fidelity). Of course, other communication methods may also be used between these electronic devices; the embodiments of the present invention do not specifically limit this.

Further, an embodiment of the present invention provides a method for voice control. The method can be applied to a voice control apparatus, which can in turn be deployed in, for example, the background server, intelligent gateway, acquisition device, or controlled device described above.

Fig. 1 is a schematic flowchart of the voice control method in an embodiment of the present invention. Referring to Fig. 1, the method includes:

S101: obtain a user posture image;

The user posture image is captured at the first moment by at least one acquisition device located in the preset space.

Here, when the user performs voice control in the preset space at the first moment, the user speaks a voice instruction. The electronic devices in the preset space that have a voice acquisition function, i.e. the controlled devices described above, then receive the voice instruction. If the controlled device and the acquisition device are the same device, the controlled device controls itself to capture an image of the preset space from the first moment. If the controlled device and the acquisition device are different devices, the controlled device sends an acquisition instruction to the acquisition device, and the acquisition device, on receiving the acquisition instruction, responds by capturing an image of the preset space. Because the user is in the preset space at the first moment, the images captured by the acquisition device may include an image of the user, i.e. the user posture image. Finally, the acquisition device sends the captured user posture image to the voice control apparatus, which thereby obtains the user posture image.

S102: according to the user posture image, determine the target controlled device that the user intends to control from among at least one controlled device in the preset space;

Here, after obtaining the user posture image, the voice control apparatus performs recognition on it to obtain the user's posture information, such as the body angle of the user, the face angle of the user, and/or the sight angle of the user. From this posture information, the apparatus can determine the direction the user was facing at the first moment and, in turn, the controlled device the user was facing. That controlled device is the target controlled device that the user intends to control.

S103: control the target controlled device to respond to the voice instruction input by the user at the first moment.

Here, once the voice control apparatus has determined the target controlled device, it can control the target controlled device to respond to the voice instruction input by the user at the first moment. If the voice control apparatus is deployed in the background server or the intelligent gateway, it sends a control instruction to the target controlled device, and the target controlled device executes the control instruction and responds to the voice instruction it received from the user at the first moment. If the voice control apparatus is deployed in a device other than the target controlled device, it likewise sends a control instruction to the target controlled device, which executes the control instruction and responds to the voice instruction it received from the user at the first moment. If the voice control apparatus is deployed in the target controlled device itself, it controls that device to respond to the voice instruction it received from the user at the first moment.
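The flow of S102 and S103 can be sketched in Python as follows. This is only an illustrative sketch, not the implementation described in the embodiments: it assumes the posture image from S101 has already been reduced to a single user heading in degrees, and the `ControlledDevice` class, its `bearing_deg` field, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlledDevice:
    name: str
    bearing_deg: float  # hypothetical: direction of the device as seen from the user


def angular_gap(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two directions, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def determine_target(user_heading_deg, devices):
    """S102 (simplified): pick the controlled device the user is most nearly facing."""
    return min(devices, key=lambda d: angular_gap(user_heading_deg, d.bearing_deg))


def control_response(target, devices, instruction):
    """S103: only the target device is told to respond; the others receive nothing."""
    return {d.name: (instruction if d == target else None) for d in devices}


devices = [ControlledDevice("smart_tv", 0.0), ControlledDevice("smart_speaker", 90.0)]
target = determine_target(80.0, devices)  # heading assumed to be derived from the S101 image
responses = control_response(target, devices, "How is the weather today?")
```

In this sketch every controlled device may have heard the instruction, but only the entry for the target device carries it, which mirrors the goal of having a single device respond.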

The method for voice control described above is illustrated below with a specific example.

For example, Fig. 2 is a schematic diagram of the preset space in an embodiment of the present invention. Referring to Fig. 2, a network camera 201, a smart TV 202, and a smart speaker 203 are installed in the preset space 200. The network camera 201 and the smart TV 202 are each provided with a camera, and the network camera 201, the smart TV 202, and the smart speaker 203 are each provided with a microphone.

First, the user 204 stands in the preset space 200 facing the smart speaker 203 and inputs a voice instruction, such as "How is the weather today?" The smart TV 202 and the smart speaker 203 both receive the voice instruction input by the user. The smart TV 202 and/or the smart speaker 203 then send an acquisition instruction to all acquisition devices, and the network camera 201 and the smart TV 202 each capture images with their own cameras. The network camera 201 sends the user posture image it captured to the smart TV 202. The smart TV 202 recognizes the user's posture from the user posture images captured by itself and by the network camera 201, obtains the user's posture information, and determines from that information that the user is facing the smart speaker 203, and therefore that the smart speaker 203 is the target controlled device that the user intends to control. Next, the smart TV 202 controls the smart speaker 203 to respond to the voice instruction "How is the weather today?" input by the user at the first moment. Through speech recognition, the smart speaker 203 obtains the reply to "How is the weather today?", namely "Today, Beijing, sunny, 35 °C to 25 °C", converts the reply into a voice signal, and outputs it.

This completes the user's voice control process.

Based on the foregoing embodiment, Fig. 3 is a schematic flowchart of determining the target controlled device in an embodiment of the present invention. Referring to Fig. 3, S102 above may specifically include:

S301: according to the user posture image, determine the body angle of the user, the face angle of the user, and/or the sight angle of the user;

S302: according to the body angle of the user, the face angle of the user, and/or the sight angle of the user, determine the controlled device that the user is facing among the at least one controlled device as the target controlled device.

In a specific implementation, the voice control apparatus can use an image analysis algorithm to analyze the user's posture image, so as to determine whether the user is facing a given acquisition device and, in turn, which target controlled device the user is actually facing.

Specifically, the angle of the user's body, the angle of the face, and/or the sight angle are determined from the user's posture image. For example, a face in an image can be oriented at many angles. A frontal orientation is called a ten-tenths face, in which the subject appears left-right symmetric in the picture. In a nine-tenths face, the facial orientation forms an 18-degree side angle with the taking lens; an eight-tenths face is a 36-degree side angle; a seven-tenths face is a 54-degree side angle; a six-tenths face is a 72-degree side angle; and a five-tenths face is a 90-degree side angle, at which point only one of the subject's eyes appears in the picture, i.e. a full profile. An image analysis algorithm can be used to analyze the symmetry of the left and right sides of the face image and thereby judge whether the face is substantially facing this device. If the angle of the user's body, the angle of the face, and/or the sight angle substantially face this device, the user is considered to intend to control this device.
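The facet scale above follows a simple arithmetic pattern: each tenth away from a frontal view adds 18 degrees of side angle. A minimal Python sketch of that mapping is shown here; the function name and the `FACET_TABLE` constant are illustrative and do not come from the original text.

```python
def side_angle_from_facet(facet: int) -> int:
    """Side angle in degrees between the face and the taking lens for a given facet.

    facet 10 is a frontal face (0 degrees); facet 5 is a full profile (90 degrees).
    """
    if not 5 <= facet <= 10:
        raise ValueError("facet must be between 5 and 10")
    return (10 - facet) * 18


# The full scale listed in the text, from frontal to full profile.
FACET_TABLE = {facet: side_angle_from_facet(facet) for facet in range(10, 4, -1)}
```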

In other embodiments of the present invention, the angles that the user's body, face, and/or line of sight form with the taking lens can be recognized. The angle between the direction of the plane of the taking lens and the direction of the plane of the user's body, the plane of the face, and/or the plane perpendicular to the line of sight is taken as the user's orientation angle. When the user's orientation angle is less than or equal to a preset angle threshold, the user is considered to be substantially facing this device. The angle threshold can be set to 5° to 10°.
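One way to compute such an orientation angle is the standard angle-between-vectors formula. The sketch below makes two assumptions that the text does not state: the user's facing direction and the direction from the user toward the lens are already available as vectors, and a 10° threshold is chosen from the stated 5° to 10° range. All names are hypothetical.

```python
import math


def angle_between_deg(u, v) -> float:
    """Angle in degrees between two non-zero vectors of equal dimension."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def is_facing_device(user_direction, direction_to_lens, threshold_deg: float = 10.0) -> bool:
    """True when the orientation angle is within the preset angle threshold."""
    return angle_between_deg(user_direction, direction_to_lens) <= threshold_deg
```

For example, `is_facing_device((1.0, 0.1), (1.0, 0.0))` is true because the two directions differ by under 6 degrees, while perpendicular directions are rejected.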

If the difference between any two of the body orientation angle, the face orientation angle, and the sight orientation angle is greater than or equal to a preset angle-difference threshold, then, provided the sight orientation angle is less than or equal to the preset angle threshold, the user is judged to intend to control the controlled device. If the angles that the body, the face, and the line of sight form with the taking lens differ from one another, the angle formed by the line of sight and the taking lens is used preferentially as the user's orientation angle, because the sight angle best reflects a person's conscious intent. When the angle data are inconsistent, giving priority to the sight angle identifies the user's control intent most accurately.

Further, when the user's sight angle cannot be determined from the captured posture image, the user is judged to intend to control the controlled device if the face orientation angle is less than or equal to the preset angle threshold. If the angle formed by the line of sight and the taking lens cannot be recognized, for example because of lighting, the angle formed by the face and the taking lens is used as the user's orientation angle; if the angle formed by the face and the taking lens cannot be recognized either, for example because of lighting, the angle formed by the body and the taking lens is used as the user's orientation angle. That is, the three kinds of angle data rank, from highest to lowest priority, as sight angle, face angle, and body angle, and higher-priority data reflect the user's control intent more accurately.
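The priority order (sight, then face, then body) can be sketched as a simple fallback chain. In this Python sketch an angle that could not be recognized is represented as `None`, and the default threshold of 10° is taken from the 5° to 10° range mentioned earlier; the function names are illustrative only.

```python
from typing import Optional


def orientation_angle(sight: Optional[float],
                      face: Optional[float],
                      body: Optional[float]) -> Optional[float]:
    """Return the highest-priority available angle: sight, then face, then body."""
    for angle in (sight, face, body):
        if angle is not None:
            return angle
    return None  # nothing could be recognized from the posture image


def has_control_intent(sight: Optional[float],
                       face: Optional[float],
                       body: Optional[float],
                       threshold_deg: float = 10.0) -> bool:
    """True when the chosen orientation angle is within the preset angle threshold."""
    angle = orientation_angle(sight, face, body)
    return angle is not None and abs(angle) <= threshold_deg
```

Because the sight angle is always preferred when it is available, this chain also covers the case where the three angles disagree: a user whose body is turned 50° away but whose gaze is within 3° of the lens is still treated as intending to control the device.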

Of course, the voice control apparatus may also use other methods to determine the target controlled device according to the user posture image; the embodiments of the present invention do not specifically limit this.

In embodiments of the present invention, to reduce the amount of analysis and computation that the voice control apparatus performs on user posture images, S101 may include: receiving at least one image from at least one acquisition device; determining, from the at least one image, the images whose timestamp is the first moment; performing target detection on the images whose timestamp is the first moment according to a preset target user model, to determine the images that contain a target user, the target user being a user of the at least one controlled device; and determining the images that contain the target user as the user posture image.

Specifically, the voice control apparatus receives at least one image from at least one acquisition device and first selects the images whose timestamp is the first moment. It then performs target detection on those images according to the preset target user model and determines which of them contain a target user. The target user here can be a user preset as having permission to use the at least one controlled device. Finally, the images determined to contain the target user are taken as the user posture image.
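The two-stage filtering above (timestamp first, target detection second) can be sketched as follows. The image representation as a dictionary and the `contains_target_user` callback are assumptions made for illustration; the callback stands in for whatever detector is built from the preset target user model, which the text does not specify.

```python
from typing import Callable, Iterable, List


def select_user_posture_images(images: Iterable[dict],
                               first_moment: float,
                               contains_target_user: Callable[[dict], bool]) -> List[dict]:
    """Keep only images stamped with the first moment that also contain a target user."""
    # Stage 1: cheap timestamp filter, so the detector runs on fewer images.
    at_first_moment = [img for img in images if img["timestamp"] == first_moment]
    # Stage 2: target detection (hypothetical callback) on the remaining images.
    return [img for img in at_first_moment if contains_target_user(img)]


# Usage with a stand-in detector that checks a precomputed list of detected user names.
images = [
    {"timestamp": 100.0, "source": "camera_201", "users": ["alice"]},
    {"timestamp": 100.0, "source": "tv_202", "users": []},
    {"timestamp": 101.0, "source": "camera_201", "users": ["alice"]},
]
posture_images = select_user_posture_images(
    images, 100.0, lambda img: "alice" in img["users"]
)
```

Running the timestamp filter first is what reduces the analysis load: the comparatively expensive detection step only sees images from the moment the instruction was spoken.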

Alternatively, in other embodiments of the present invention, if the controlled device and the acquisition device are the same device, S101 may include: obtaining the voice instruction input by the user at the first moment; identifying the user who input the voice instruction according to a preset user voiceprint model; and, when the identified user is a legitimate user, capturing the user posture image.

Here, the acquisition device can verify the user's identity before capturing the user posture image. That is, after receiving the voice instruction input by the user at the first moment, the acquisition device performs voiceprint recognition on the voice instruction according to the preset user voiceprint model and determines the identity of the user who input it. When the identified user is a legitimate user, the acquisition device controls itself to capture the user posture image; when the identified user is not a legitimate user, the acquisition device makes no response. In this way, the controlled device responds only to the voice instructions of specific users, avoiding misoperation by other users.
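The gating step can be sketched as below. The text does not say how the voiceprint model compares voices, so this sketch assumes, purely for illustration, that each voiceprint is a numeric feature vector and that cosine similarity against enrolled vectors with a 0.8 threshold decides legitimacy. The function names, the vector representation, and the threshold are all hypothetical.

```python
import math
from typing import Callable, Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_legitimate_user(voiceprint: Sequence[float],
                       enrolled: Dict[str, Sequence[float]],
                       threshold: float = 0.8) -> bool:
    """True if the voiceprint is close enough to any enrolled (legitimate) user."""
    return any(cosine_similarity(voiceprint, ref) >= threshold for ref in enrolled.values())


def on_voice_instruction(voiceprint: Sequence[float],
                         enrolled: Dict[str, Sequence[float]],
                         capture_image: Callable[[], object]) -> Optional[object]:
    """Capture the user posture image only for legitimate users; otherwise do nothing."""
    if not is_legitimate_user(voiceprint, enrolled):
        return None  # illegitimate user: the acquisition device makes no response
    return capture_image()
```

Checking identity before capture means an unrecognized speaker never triggers image acquisition at all, which is the behavior the paragraph above describes.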

Further, in the above case, i.e. when the controlled device and the acquisition device are the same device, S103 may include: performing speech recognition on the voice instruction, and responding to the voice instruction by executing the corresponding target operation.

Here, the controlled device performs speech recognition on the voice instruction it received from the user, then responds to the recognized voice instruction by executing the target operation corresponding to it.

Based on the same inventive concept, an embodiment of the present invention provides a device for voice control. This voice control apparatus is consistent with the voice control apparatus described in one or more of the embodiments above.

Fig. 4 is a schematic structural diagram of the voice control apparatus in an embodiment of the present invention. Referring to Fig. 4, the voice control apparatus 400 includes: an obtaining unit 401, configured to obtain a user posture image, the user posture image being captured at the first moment by at least one acquisition device located in the preset space; a determination unit 402, configured to determine, according to the user posture image, the target controlled device that the user intends to control from among at least one controlled device in the preset space; and a control unit 403, configured to control the target controlled device to respond to the voice instruction input by the user at the first moment.

In an embodiment of the present invention, the determination unit is specifically configured to: determine, according to the user posture image, the body angle of the user, the face angle of the user, and/or the sight angle of the user; and determine, according to the body angle, the face angle, and/or the sight angle of the user, the controlled device that the user is facing among the at least one controlled device as the target controlled device.

In an embodiment of the present invention, the obtaining unit is specifically configured to: receive at least one image from at least one acquisition device; determine, from the at least one image, the images whose timestamp is the first moment; perform target detection on the images whose timestamp is the first moment according to a preset target user model, to determine the images that contain a target user, the target user being a user of the at least one controlled device; and determine the images that contain the target user as the user posture image.

In an embodiment of the present invention, the control unit is specifically configured to send a control instruction to the target controlled device, the control instruction instructing the target controlled device to respond to the voice instruction input by the user at the first moment.

In an embodiment of the present invention, when the voice control apparatus is provided on the target controlled device, the obtaining unit is further specifically configured to: obtain the voice instruction input by the user at the first moment; identify the user who input the voice instruction according to a preset user voiceprint model; and, when the identified user is a legitimate user, capture the user posture image.

Further, the control unit is further specifically configured to perform speech recognition on the voice instruction, and to respond to the voice instruction by executing the corresponding target operation.

It should be noted that the description of the apparatus embodiment above is similar to the description of the method embodiment and has similar beneficial effects. For technical details not disclosed in the apparatus embodiment of the present invention, refer to the description of the method embodiment.

Based on the same inventive concept, an embodiment of the present invention provides an electronic device that is consistent with the electronic device described in one or more of the embodiments above.

Fig. 5 is a schematic structural diagram of the electronic device in an embodiment of the present invention. Referring to Fig. 5, the electronic device 500 includes: at least one processor 501; and at least one memory 502 and a bus 503 connected to the processor 501. The processor 501 and the memory 502 communicate with each other through the bus 503, and the processor 501 is configured to call program instructions in the memory 502 to execute the steps of the method for voice control described in one or more of the embodiments above.

It should be noted that the description of the electronic device embodiment above is similar to the description of the apparatus embodiment and has similar beneficial effects. For technical details not disclosed in the electronic device embodiment of the present invention, refer to the description of the apparatus embodiment.

Based on the same inventive concept, an embodiment of the present invention provides a computer-readable storage medium. The computer-readable storage medium stores computer instructions that cause a computer to execute the steps of the method for voice control described in one or more of the embodiments above.

As can be seen from the above, in the embodiments of the present invention, the target controlled device that the user intends to control is determined among multiple controlled devices according to the user posture image, and that device is then controlled to respond to the user's voice instruction, thereby avoiding the misoperation caused when multiple devices respond simultaneously to the user's voice instruction.

Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means. The instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims

1. A voice control method, characterized by comprising:

obtaining a user pose image, the user pose image being acquired at a first moment by at least one acquisition device located in a preset space;

determining, according to the user pose image, a target controlled device that the user intends to control from among at least one controlled device in the preset space; and

controlling the target controlled device to respond to a voice instruction input by the user at the first moment.

2. The method according to claim 1, characterized in that determining, according to the user pose image, the target controlled device that the user intends to control from among the at least one controlled device in the preset space comprises:

determining, according to the user pose image, a body angle of the user, a face angle of the user, and/or a line-of-sight angle of the user; and

determining, according to the body angle of the user, the face angle of the user, and/or the line-of-sight angle of the user, the controlled device that the user is facing among the at least one controlled device as the target controlled device.
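The angle-based selection recited above can be illustrated with a short Python sketch. It is a simplified example under stated assumptions, not the claimed implementation: the room is treated as a 2D plane, device positions are known, the line-of-sight angle is preferred over the face angle and the face angle over the body angle when several are available, and the device with the smallest angular gap to that direction is taken as the one the user faces. All function names are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def facing_angle(body: Optional[float] = None,
                 face: Optional[float] = None,
                 gaze: Optional[float] = None) -> float:
    """Pick one direction (degrees) from whichever angles are available.

    Assumption: gaze is preferred over face, and face over body.
    """
    for angle in (gaze, face, body):
        if angle is not None:
            return angle
    raise ValueError("at least one angle is required")


def angular_gap(a: float, b: float) -> float:
    """Smallest absolute difference between two directions, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def device_user_faces(user_pos: Point, direction_deg: float,
                      devices: Dict[str, Point]) -> str:
    """Return the device whose bearing from the user is closest to the facing direction."""
    def gap(item: Tuple[str, Point]) -> float:
        _, (x, y) = item
        # Bearing from the user to the device, measured from the +x axis.
        bearing = math.degrees(math.atan2(y - user_pos[1], x - user_pos[0]))
        return angular_gap(direction_deg, bearing)

    return min(devices.items(), key=gap)[0]
```

For example, with a television at `(1.0, 0.0)` and a speaker at `(0.0, 1.0)` relative to a user at the origin, a facing direction of 80 degrees selects the speaker, while 350 degrees selects the television.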

3. The method according to claim 1, characterized in that obtaining the user pose image comprises:

receiving at least one image from the at least one acquisition device;

determining, from the at least one image, the images whose timestamp is the first moment;

performing target detection on the images whose timestamp is the first moment according to a preset target user model, and determining an image containing a target user, the target user being a user of the at least one controlled device; and

determining the image containing the target user as the user pose image.
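The two filtering steps recited above (timestamp matching, then target detection) can be sketched in Python as follows. The `Frame` record, the `contains_target_user` callback standing in for the preset target user model, and the optional `tolerance` parameter are all hypothetical; with the default tolerance of zero the timestamp must equal the first moment exactly, as the claim states.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Frame:
    """Hypothetical image record received from an acquisition device."""
    device_id: str
    timestamp: float
    pixels: object  # image payload; its format is left open here


def user_pose_images(frames: Iterable[Frame],
                     first_moment: float,
                     contains_target_user: Callable[[Frame], bool],
                     tolerance: float = 0.0) -> List[Frame]:
    """Keep frames taken at the first moment that contain the target user."""
    # Step 1: keep only frames whose timestamp matches the first moment.
    at_moment = [f for f in frames
                 if abs(f.timestamp - first_moment) <= tolerance]
    # Step 2: run target detection (the callback) on the remaining frames.
    return [f for f in at_moment if contains_target_user(f)]
```

Filtering by timestamp first keeps the detector, which is typically the more expensive step, from running on frames that cannot correspond to the voice instruction.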

4. The method according to claim 1, characterized in that controlling the target controlled device to respond to the voice instruction input by the user at the first moment comprises:

sending a control instruction to the target controlled device, the control instruction being used to instruct the target controlled device to respond to the voice instruction input by the user at the first moment.

5. The method according to claim 1, characterized in that obtaining the user pose image comprises:

obtaining the voice instruction input by the user at the first moment;

identifying, according to a preset user voiceprint model, the user who input the voice instruction; and

acquiring the user pose image when the user is identified as a legitimate user.
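The voiceprint gate recited above can be illustrated as follows. The claim does not specify how the preset user voiceprint model works, so this Python sketch makes its own assumptions: each enrolled user is represented by a fixed-length embedding vector, speakers are compared by cosine similarity, and a similarity threshold of 0.8 decides whether a speaker is legitimate. The function names, the embedding representation, and the threshold value are all hypothetical.

```python
import math
from typing import Callable, Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_speaker(embedding: Sequence[float],
                     enrolled: Dict[str, Sequence[float]],
                     threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching enrolled user, or None if nobody reaches the threshold."""
    best_user, best_score = None, threshold
    for user, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user


def acquire_pose_image_if_legitimate(embedding: Sequence[float],
                                     enrolled: Dict[str, Sequence[float]],
                                     capture: Callable[[], object],
                                     threshold: float = 0.8) -> Optional[object]:
    """Trigger image acquisition only when the speaker is an enrolled (legitimate) user."""
    if identify_speaker(embedding, enrolled, threshold) is None:
        return None  # unknown speaker: no image is acquired
    return capture()
```

Checking the voiceprint before acquiring the image means that speech from an unrecognized speaker never triggers the cameras or the later device-selection steps.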

6. The method according to claim 5, characterized in that controlling the target controlled device to respond to the voice instruction input by the user at the first moment comprises:

performing speech recognition on the voice instruction; and

responding to the voice instruction by executing a corresponding target operation.

7. A voice control device, characterized by comprising:

an obtaining unit, configured to obtain a user pose image, the user pose image being acquired at a first moment by at least one acquisition device located in a preset space;

a determination unit, configured to determine, according to the user pose image, a target controlled device that the user intends to control from among at least one controlled device in the preset space; and

a control unit, configured to control the target controlled device to respond to a voice instruction input by the user at the first moment.

8. The device according to claim 7, characterized in that the determination unit is specifically configured to: determine, according to the user pose image, a body angle of the user, a face angle of the user, and/or a line-of-sight angle of the user; and determine, according to the body angle of the user, the face angle of the user, and/or the line-of-sight angle of the user, the controlled device that the user is facing among the at least one controlled device as the target controlled device.

9. An electronic device, characterized by comprising:

at least one processor;

and at least one memory and a bus connected to the processor; wherein

the processor and the memory communicate with each other via the bus; and

the processor is configured to call program instructions in the memory to execute the voice control method according to any one of claims 1 to 6.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions, and the computer instructions cause a computer to execute the voice control method according to any one of claims 1 to 6.