CN109377995B

CN109377995B - Method and device for controlling equipment

Info

Publication number: CN109377995B
Application number: CN201811381967.3A
Authority: CN
Inventors: 韩雪; 王慧君; 毛跃辉; 陶梦春
Original assignee: Gree Electric Appliances Inc of Zhuhai
Current assignee: Gree Electric Appliances Inc of Zhuhai
Priority date: 2018-11-20
Filing date: 2018-11-20
Publication date: 2021-06-01
Anticipated expiration: 2038-11-20
Also published as: CN109377995A

Abstract

The invention discloses a method and a device for controlling equipment, which address the problem in the prior art that the voice control command obtained by analysis is not accurate enough when a smart home device is controlled. In the method, a determined voice control command is first matched with the lip language mouth shape information captured when a user issues a voice control command for the smart home device; a first voice control command is determined according to the matching result; and the smart home device is controlled according to the first voice control command. Because the command that controls the smart home device is determined from the result of matching the voice control command against the user's lip language mouth shape, the accuracy of extracting the voice control command can be improved.

Description

Method and device for controlling equipment

Technical Field

The present invention relates to the field of wireless communications technologies, and in particular, to a method and an apparatus for controlling a device.

Background

A smart home is an ecosystem that takes the house as a platform and connects the various devices in it through Internet of Things technology to make them intelligent. It provides functions such as intelligent lighting control, intelligent appliance control, a security monitoring system, intelligent background music, intelligent video sharing, a video intercom system, and a home theater system.

The intelligent home integrates facilities related to home life by utilizing a comprehensive wiring technology, a network communication technology, a safety precaution technology, an automatic control technology and an audio and video technology, constructs an efficient management system for residential facilities and family schedule affairs, improves home safety, convenience, comfortableness and artistry, and realizes an environment-friendly and energy-saving living environment.

In an existing smart home environment, when a target user controls a device in the smart home system by voice, the smart home system server directly analyzes the collected voice information containing the user's voice control command, extracts that command, and controls the corresponding smart home device according to it. In daily home life, however, a home is usually equipped with several smart home devices and usually has several users. While one user is controlling a smart home device, other users may be controlling other devices or simply talking, so the voice information collected by the server is very complex. If the voice command is extracted directly from it, noise interference lowers the accuracy of analyzing and recognizing the user's voice control command.

In summary, when the smart home device is controlled, the accuracy of the user control voice command obtained through analysis is not high.

Disclosure of Invention

The invention provides a method and a device for controlling equipment, which are used for solving the problem that the accuracy of a voice control command obtained by analysis is not high when intelligent household equipment is controlled in the prior art.

In a first aspect, an embodiment of the present invention provides a method for controlling a device, where the method includes:

matching the determined voice control command with lip language mouth shape information when a user sends the voice control command aiming at the intelligent household equipment, and determining a first voice control command according to a matching result;

and controlling the intelligent household equipment according to the first voice control command.

According to the method, firstly, the determined voice control command is matched with lip language mouth shape information when a user sends the voice control command aiming at the intelligent household equipment, the first voice control command is determined according to the matching result, and the intelligent household equipment is controlled according to the first voice control command. The voice control command is matched with the lip language mouth shape when the user sends the voice control command aiming at the intelligent household equipment, and the voice control command for controlling the intelligent household equipment is determined according to the matching result, so that the accuracy of extracting the voice control command can be improved.

In one possible implementation, the matching result is determined by:

judging whether the matching degree of the determined voice control command and lip language mouth shape information when the voice control command aiming at the intelligent household equipment is sent by the user is smaller than a threshold value or not, if so, determining that noise exists in the determined voice control command;

otherwise, determining that the determined voice control command is free of noise.

This implementation provides a way to determine whether noise exists in the voice control command: the matching degree between the lip language mouth shape information and the voice control command is used to judge more accurately whether the command contains noise.

In a possible implementation manner, the determining the first voice control command according to the matching result includes:

if the determined voice control command has noise, determining a first voice control command according to the voice word number information corresponding to the determined voice control command and the lip language mouth shape information;

and if the determined voice control command has no noise, taking the determined voice control command as the first voice control command.

This implementation determines the first voice control command according to the matching result: if noise exists in the determined voice control command, the first voice control command is determined according to the voice word number information corresponding to the determined voice control command and the lip language mouth shape information; if no noise exists, the determined voice control command is used as the first voice control command. The accuracy of analyzing the voice control command can thereby be improved.

In a possible implementation manner, the determining a first voice control command according to the voice word number information and the lip language mouth shape information corresponding to the determined voice control command includes:

if the voice word number information corresponding to the determined voice control command is larger than the mouth shape conversion frequency information corresponding to the lip language mouth shape, discarding the voice control command which is not matched with the lip language mouth shape information in the determined voice control command to obtain a second voice control command;

and if the voice word number information corresponding to the second voice control command is larger than the mouth shape conversion frequency information corresponding to the lip language mouth shape information, and the matching degree of the second voice control command and the lip language mouth shape information when the voice control instruction aiming at the intelligent household equipment is sent by the user is smaller than a threshold value, replacing the voice control command which cannot be matched with the lip language mouth shape information with replacement word information corresponding to the lip language mouth shape information according to a replacement principle to obtain a first voice control command.

This implementation compares the voice word number information corresponding to the voice control command with the mouth shape conversion frequency information corresponding to the lip language mouth shape captured when the user addresses the smart home device, discards the parts of the voice control command that do not match the lip language mouth shape information, and replaces the parts that still cannot be matched according to a replacement principle. Noise in the voice control command is thus filtered by using the lip language mouth shape information, and the accuracy of analyzing the voice control command is improved.

and if the voice word number information corresponding to the determined voice control command is not larger than the mouth shape conversion frequency information corresponding to the lip language mouth shape information, replacing the voice control command which cannot be matched with the lip language mouth shape information with replacement word information corresponding to the lip language mouth shape information according to a replacement principle to obtain a first voice control command.

According to the method, another method for filtering noise in the voice control command is provided, if the voice word number information corresponding to the determined voice control command is not larger than the mouth shape conversion frequency information corresponding to the lip language mouth shape information, the voice control command which cannot be matched with the lip language mouth shape information is replaced according to a replacement principle, so that the noise in the voice control command is filtered by combining the lip language mouth shape information, and the accuracy of analyzing the voice control command is improved.

In one possible implementation, the voice word number information corresponding to the determined voice control command is determined by:

and analyzing the acquired voice information according to a voice recognition model, wherein the voice recognition model is obtained by training a neural network according to the voice information, the voice control command and the voice word number information.

According to the method, the acquired voice information is analyzed according to the voice recognition model, and the voice recognition model is obtained through neural network training according to the voice information, the voice control command and the voice word number information, so that the voice control command and the voice word number information corresponding to the voice information can be analyzed.

In a possible implementation manner, the mouth shape transformation frequency information and the replacement word information corresponding to the lip language mouth shape information are determined by the following method:

and analyzing the obtained lip language mouth shape information according to an image recognition model, wherein the image recognition model is obtained through neural network training according to the lip language mouth shape information, the mouth shape transformation frequency information and the replacement word information.

According to the method, the obtained lip language mouth shape information is analyzed according to the image recognition model, and the image recognition model is obtained through neural network training according to the lip language mouth shape information, the mouth shape transformation frequency information and the replacement word information, so that the mouth shape transformation frequency and the replacement word information corresponding to the lip language mouth shape information can be obtained when the lip language mouth shape information is analyzed.

In a second aspect, an embodiment of the present invention provides an apparatus for controlling a device, where the apparatus includes: at least one processing unit and at least one memory unit, wherein the memory unit stores program code that, when executed by the processing unit, causes the processing unit to perform the following:

In a possible implementation manner, the processing unit is specifically configured to:

In a third aspect, an embodiment of the present invention provides an apparatus for controlling a device, where the apparatus includes:

a determination module, configured to match the determined voice control command with lip language mouth shape information captured when a user issues the voice control command for the smart home device, and to determine a first voice control command according to a matching result;

a control module, configured to control the smart home device according to the first voice control command.

In a fourth aspect, an embodiment of the present invention further provides a computer storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method in the first aspect.

In addition, for technical effects brought by any one implementation manner of the second aspect to the fourth aspect, reference may be made to technical effects brought by different implementation manners of the first aspect, and details are not described here.

These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described hereinafter.

Drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a flowchart of a method for controlling a device according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a first control device according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a second control device according to an embodiment of the present invention;

Fig. 4 is a flowchart of a complete method for controlling a device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

With the spread of intelligent devices, smart homes are gradually entering people's lives. When a user wants to control a smart home device by voice, the collected voice information must be analyzed to obtain a voice control command, and the smart home device is then controlled according to that command. If the collected voice information contains only the user's speech addressed to the smart home device to be controlled, the device can be controlled according to the voice control command analyzed from it. If the voice information also contains other speech, for example other users' speech addressed to other smart home devices or conversations among other users, the analyzed voice control command may be inaccurate, and the smart home device may then be controlled incorrectly.

For example, user A wants to turn on air conditioner 1 and says "turn on the air conditioner", while at the same time user B wants air conditioner 2 to dehumidify and says "dehumidify". Both utterances are captured when the voice information is collected, so for air conditioner 1 the voice control command analyzed from the collected voice information is "turn on the air conditioner, dehumidify". This analyzed command is erroneous; in other words, it contains noise.

If noise exists in the voice control command analyzed according to the collected voice information, the noise in the voice control command needs to be filtered, and then the intelligent household equipment is controlled according to the filtered voice control command.

In the embodiments of the present invention, the executing entity may be a server.

Voice information can be collected through a microphone, and lip language mouth shape information can be collected through a camera.

The application scenario described in the embodiments of the present invention is intended to illustrate the technical solutions of the embodiments more clearly and does not limit them. A person skilled in the art will recognize that, as new service scenarios emerge, the technical solutions provided in the embodiments are equally applicable to similar technical problems.

In view of the foregoing application scenarios, an embodiment of the present invention provides a method for controlling a device, and as shown in fig. 1, the method specifically includes the following steps:

step 100, matching the determined voice control command with lip language mouth shape information when a user sends a voice control command for the intelligent household equipment, and determining a first voice control command according to a matching result;

step 101, controlling the intelligent household equipment according to a first voice control command.

In the embodiment of the invention, the determined voice control command is matched with lip language mouth shape information of the voice control command of the intelligent household equipment sent by a user, a first voice control command is determined according to a matching result, and the intelligent household equipment is controlled according to the first voice control command. The determined voice control command is matched with the lip language mouth shape information when the user sends the voice control command aiming at the intelligent household equipment, and the voice control command for controlling the intelligent household equipment is determined according to the matching result, so that the accuracy of extracting the voice control command can be improved.
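As a rough illustration only, the two steps can be sketched in Python. The names `control_device`, `determine_first_command` and `send_to_device` are hypothetical and do not appear in the original text; the two callables stand in for the matching logic and the device interface, which this passage does not specify.

```python
from typing import Callable, List


def control_device(
    parsed_command: str,
    lip_info: object,
    determine_first_command: Callable[[str, object], str],
    send_to_device: Callable[[str], None],
) -> str:
    """Run step 100 (determine the first command) and step 101 (control)."""
    # Step 100: match the parsed command against the lip information.
    first_command = determine_first_command(parsed_command, lip_info)
    # Step 101: control the device with the resulting first command.
    send_to_device(first_command)
    return first_command


# Usage with stand-in callables (both are placeholders for illustration).
sent: List[str] = []
result = control_device(
    "turn on the air conditioner, dehumidify",
    lip_info=None,
    determine_first_command=lambda cmd, lips: cmd.split(",")[0],
    send_to_device=sent.append,
)
print(result)  # turn on the air conditioner
```

Passing the two stages in as callables keeps the sketch independent of any particular recognition model or device protocol.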

In implementation, before determining the voice control command, if the microphone and the camera of the smart home device are in a sleep state, the user needs to wake up the microphone and the camera.

The method for waking up the microphone and the camera of the smart home device may be performed by using a wake-up word, for example, when a user sends a wake-up word of "air conditioner 1", the microphone and the camera connected to the air conditioner 1 are woken up; remote control wake-up may also be used, such as waking up the microphone and camera using a remote control.
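A minimal sketch of wake-word handling, under the assumption that each device's microphone and camera are tracked by a simple in-memory record. The `Sensors` class, the `SENSORS_BY_WAKE_WORD` registry, and the exact-match rule are illustrative assumptions, not part of the described embodiment.

```python
from dataclasses import dataclass


@dataclass
class Sensors:
    """Microphone and camera attached to one smart home device."""
    microphone_awake: bool = False
    camera_awake: bool = False


# Hypothetical registry: wake-up word -> sensors of that device.
SENSORS_BY_WAKE_WORD = {
    "air conditioner 1": Sensors(),
    "air conditioner 2": Sensors(),
}


def wake_up(spoken_text: str) -> bool:
    """Wake the sensors whose wake-up word was spoken; report success."""
    sensors = SENSORS_BY_WAKE_WORD.get(spoken_text.strip().lower())
    if sensors is None:
        return False  # no device is registered under this wake-up word
    sensors.microphone_awake = True
    sensors.camera_awake = True
    return True


woke = wake_up("Air conditioner 1")
print(woke)  # True
```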

After the microphone and the camera are awakened, the microphone can collect voice information, and the camera collects lip language mouth shape information of a user aiming at the intelligent household equipment.

It should be noted that the voice information collected by the microphone is whatever voice information the microphone can pick up. For example, user A says "turn on the air conditioner 1" and user B says "dehumidify"; if the microphone can pick up the speech of both user A and user B, the voice information it collects contains both "turn on the air conditioner 1" and "dehumidify".

When the camera collects lip language mouth shape information, it collects that of the user who is addressing the smart home device to be controlled. For example, if user A wants to control intelligent air conditioner 1, user A needs to stand within the field of view of the camera connected to intelligent air conditioner 1 and then speak.

The following describes the speech information analysis and lip language mouth shape information analysis, respectively.

The microphone sends the voice information to the server after collecting the voice information, and the server analyzes the voice information after receiving the voice information.

Specifically, when analyzing the voice information, the obtained voice information may be analyzed according to a voice recognition model, where the voice recognition model is obtained by training through a neural network according to the voice information, the voice control command, and the voice word number information.

It should be noted that, the construction of the voice recognition model requires a large amount of voice information, voice control commands and voice word number information, the voice recognition model is obtained after the training of the neural network, and the voice control commands and the voice word number information corresponding to the voice information are obtained after the voice information is input into the voice recognition model.

And after analyzing the acquired voice information, the server obtains a voice control command and voice word number information corresponding to the voice information.

For example, the microphone transmits the collected voice information of "turning on the air conditioner 1" to the server, and the server analyzes the voice information after receiving the voice information, and obtains that the control command corresponding to the voice information is "on" and the number of words of voice information corresponding to the voice information is 2.

When the server analyzes the acquired voice information, there is a possibility that the analysis fails, for example, the voice information collected by a microphone connected to the air conditioner is "turn on a television", and when the server analyzes the voice information, the analysis fails, and the server may push a message that the voice analysis fails to the user.

The above is the analysis of the voice information by the server, and the following is the analysis of the lip language mouth shape information by the server.

After the lip language mouth shape information is collected by the camera, the lip language mouth shape information is sent to the server, and after the lip language mouth shape information is received by the server, the lip language mouth shape information is analyzed.

Specifically, when analyzing the lip language mouth shape information, the obtained lip language mouth shape information may be analyzed according to an image recognition model, where the image recognition model is obtained through neural network training according to the lip language mouth shape information, the mouth shape transformation frequency information, and the replacement word information.

It should be noted that, the construction of the image recognition model requires a large amount of lip language mouth shape information, mouth shape transformation frequency information, and replacement word information, the image recognition model is obtained after training of the neural network, and after the lip language mouth shape information is input into the image recognition model, the mouth shape transformation frequency information corresponding to the lip language mouth shape information and the replacement word information corresponding to the lip language mouth shape are obtained.

After analyzing the obtained lip language mouth shape information, the server obtains the mouth shape conversion frequency information and the replacement word information corresponding to the lip language mouth shape information.
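The outputs of the two analyses can be pictured as two small records. The sketch below is an assumption about how such results might be represented; the class and field names are hypothetical, and the lip-reading values are made up for illustration (only the speech values "on" and 2 come from the earlier example).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SpeechAnalysis:
    """Result of analyzing voice information with the voice recognition model."""
    command: str      # voice control command
    word_count: int   # voice word number information


@dataclass(frozen=True)
class LipAnalysis:
    """Result of analyzing lip images with the image recognition model."""
    mouth_shape_changes: int            # mouth shape conversion frequency information
    replacement_words: Tuple[str, ...]  # replacement word information


speech = SpeechAnalysis(command="on", word_count=2)
lips = LipAnalysis(mouth_shape_changes=2, replacement_words=("on",))  # made-up values
print(speech.word_count > lips.mouth_shape_changes)  # False
```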

After the server acquires the voice information, firstly, the voice information is analyzed into a voice control command, if the analysis is successful, the voice control command acquired through the analysis is matched with the lip language mouth shape information acquired when a user sends the voice control command for the intelligent home equipment, a first voice control command is determined according to a matching result, and the intelligent home equipment is controlled according to the first voice control command.

In implementation, there are two possible matching results. First, if the matching degree between the determined voice control command and the lip language mouth shape information captured when the user issues the voice control command for the smart home device is less than a threshold, it is determined that noise exists in the determined voice control command;

second, if the matching degree is not less than the threshold, it is determined that no noise exists in the voice control command.

For example, the voice control command analyzed by the server is "turn on the air conditioner 1". When this command is matched with the lip language mouth shape information captured from the user addressing the smart home device, the matching degree is 90%; if the threshold is 80%, it is determined that no noise exists in the voice control command;

for another example, the voice control command analyzed by the server is "turn on the air conditioner 1", and when it is matched with the captured lip language mouth shape information the matching degree is 70%; if the threshold is 80%, it is determined that noise exists in the voice control command.
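The threshold comparison in these two examples can be written as a one-line check. This is a sketch only: the function name `has_noise` is hypothetical, matching degrees are expressed as fractions, and the 80% default is taken from the examples rather than prescribed by the method. How the matching degree itself is computed is not specified here.

```python
def has_noise(matching_degree: float, threshold: float = 0.80) -> bool:
    """Return True when the matching degree is less than the threshold."""
    return matching_degree < threshold


print(has_noise(0.90))  # False: 90% is not less than 80%, so no noise
print(has_noise(0.70))  # True: 70% is less than 80%, so noise exists
```

A matching degree exactly equal to the threshold counts as "no noise", because the condition in the text is "less than".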

If no noise exists in the determined voice control command, taking the voice control command as a first voice control command, namely controlling the intelligent household equipment according to the voice control command;

if noise exists in the determined voice control command, the noise is filtered according to the voice word number information corresponding to the determined voice control command and the lip language mouth shape information, the voice control command with the noise filtered out is used as the first voice control command, and the smart home device is controlled according to it.

In the embodiment of the present invention, two cases are used when noise is filtered for a voice control command with noise, where in the first case, the voice word number information corresponding to the voice control command is greater than the mouth shape conversion frequency information corresponding to the lip language mouth shape, and in the second case, the voice word number information corresponding to the voice control command is not greater than the mouth shape conversion frequency information corresponding to the lip language mouth shape information, and the following respectively describes the two cases of filtering noise in the voice control command.
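The choice between the two cases depends only on comparing the two counts, which can be sketched as follows. The function name and the string labels are illustrative assumptions.

```python
def select_filter_case(word_count: int, mouth_shape_changes: int) -> str:
    """Pick the noise-filtering case from the two counts."""
    if word_count > mouth_shape_changes:
        return "case one: discard unmatched parts"
    return "case two: replace unmatched parts"


print(select_filter_case(7, 5))  # case one: discard unmatched parts
print(select_filter_case(7, 7))  # case two: replace unmatched parts
```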

In case one, the voice word number information corresponding to the voice control command is larger than the mouth shape conversion times information corresponding to the lip language mouth shape information.

And if the voice word number information corresponding to the voice control command is larger than the mouth shape conversion frequency information corresponding to the lip language mouth shape information, discarding the voice control command which is not matched with the lip language mouth shape information in the voice control command.

For example, suppose the voice control command is "turn on the air conditioner A to dehumidify", whose voice word number information is 7, and the mouth shape conversion frequency corresponding to the lip language mouth shape information is 5. The voice word number is greater than the mouth shape conversion frequency, so the voice control command is matched with the lip language mouth shape information and the part that does not match is discarded. If the unmatched part is "dehumidify", it is discarded; the voice control command that remains is "turn on the air conditioner A", which is the voice control command after the noise is filtered.
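One way to picture the discarding step is as a sequence alignment. The sketch below assumes, purely for illustration, that lip reading yields a sequence of units comparable with the units of the parsed speech; the text does not say how the unmatched part is located. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in alignment method, and the seven-unit split of the command is made up so that the counts agree with the example (7 and 5).

```python
from difflib import SequenceMatcher
from typing import List, Sequence


def discard_unmatched(speech_units: Sequence[str], lip_units: Sequence[str]) -> List[str]:
    """Keep only the speech units that align with the lip-read units."""
    matcher = SequenceMatcher(None, speech_units, lip_units, autojunk=False)
    kept: List[str] = []
    for block in matcher.get_matching_blocks():
        # Each block marks a run of speech units that also appears in lip_units.
        kept.extend(speech_units[block.a:block.a + block.size])
    return kept


# Illustrative units only: 7 for the speech, 5 for the lip reading.
speech = ["turn", "on", "air", "conditioner", "A", "de", "humidify"]
lips = ["turn", "on", "air", "conditioner", "A"]
print(discard_unmatched(speech, lips))
# ['turn', 'on', 'air', 'conditioner', 'A']
```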

The voice control command that remains after discarding is the second voice control command. If the matching degree between the second voice control command and the lip language mouth shape information is not less than the threshold, the second voice control command is used as the first voice control command, and the smart home device is controlled according to the first voice control command;

if the matching degree between the second voice control command and the lip language mouth shape information is less than the threshold, the part of the voice control command that cannot be matched with the lip language mouth shape information is replaced, according to a replacement principle, with the replacement word information corresponding to the lip language mouth shape information that it failed to match, to obtain the first voice control command, and the smart home device is controlled according to the first voice control command.

In case two, the voice word number information corresponding to the voice control command is not greater than the mouth shape conversion frequency information corresponding to the lip language mouth shape information.

And if the voice word number information corresponding to the voice control command is not larger than the mouth shape conversion frequency information corresponding to the lip language mouth shape information, replacing the voice control command which cannot be matched with the lip language mouth shape information with replacement word information corresponding to the lip language mouth shape information according to a replacement principle.

It should be noted that the replacement principle is to determine whether the meaning of the voice control command to be replaced is similar to that of the replacement word, if so, the replacement is performed, and if not, the voice parsing fails.

The following illustrates how to filter noise in the voice control command when the voice word number information corresponding to the voice control command is equal to the mouth shape conversion frequency information corresponding to the lip language mouth shape information.

For example, suppose the voice control command is "turn on the air conditioner A to increase", so that the voice word number information corresponding to the voice control command is 7, and the mouth shape conversion frequency corresponding to the lip language mouth shape information is also 7. The voice word number information is then equal to the mouth shape conversion frequency information, and the voice control command needs to be matched with the lip language mouth shape information. If the word "increase" in the voice control command does not match the lip language mouth shape information, then correspondingly a part of the lip language mouth shape information does not match the voice control command, and the replacement word corresponding to that unmatched lip language mouth shape information is evaluated according to the replacement principle. For example, if the replacement word is "dehumidification" and the server determines that the meanings of "increase" and "dehumidification" are not similar, the voice control command is not replaced and the voice information analysis fails;

if instead the replacement word corresponding to the lip language mouth shape information that does not match the voice control command is "heightening", and the server judges that the meanings of "increase" and "heightening" are similar, "increase" is replaced with "heightening". The voice control command after replacement, that is, the voice control command after noise filtering, is "turn on the air conditioner A heightening".

The replacement word information is obtained according to a large number of experimental results and is stored in the server in advance.

It should be noted that the server may determine whether the voice control command and the replacement word have similar meanings according to data stored in the server; for example, if the similarity between the meanings of "increase" and "heightening" is 90%, the replacement may be performed.
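The replacement principle described above can be sketched as follows. This is only an illustration under stated assumptions: the `SIMILARITY` table, the 0.9 cutoff, and the function names are hypothetical stand-ins for the data the server stores in advance, and the description does not specify how the similarity itself is computed.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical similarity data stored on the server in advance (illustrative values).
SIMILARITY: Dict[FrozenSet[str], float] = {
    frozenset({"increase", "heightening"}): 0.9,
    frozenset({"increase", "dehumidification"}): 0.1,
}


def meaning_similarity(word: str, candidate: str) -> float:
    """Look up how similar two words are in meaning; unknown pairs count as 0."""
    if word == candidate:
        return 1.0
    return SIMILARITY.get(frozenset({word, candidate}), 0.0)


def apply_replacement(
    words: List[str],
    index: int,
    candidate: str,
    min_similarity: float = 0.9,  # assumed cutoff, taken from the 90% example
) -> Optional[List[str]]:
    """Replace words[index] with the candidate if their meanings are similar.

    Returns the new command, or None when the meanings are not similar,
    which corresponds to the voice parsing failing.
    """
    if meaning_similarity(words[index], candidate) < min_similarity:
        return None
    replaced = list(words)  # leave the caller's command untouched
    replaced[index] = candidate
    return replaced
```

Returning `None` rather than raising keeps the "parsing fails" outcome an ordinary result that the caller can handle, for example by asking the user to repeat the command.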

It should be noted that, when the voice word number information corresponding to the voice control command is compared with the mouth shape conversion frequency information corresponding to the lip language mouth shape information, if the voice word number information is smaller than the mouth shape conversion frequency information, the voice control command may either be replaced according to the replacement principle, or the voice information analysis may be considered to have failed.

According to the embodiment of the invention, after the acquired voice information is parsed into a voice control command, the parsed voice control command is first matched with the lip language mouth shape information captured when the user issues the voice control command for the intelligent household equipment. If the matching result indicates that the parsed voice control command contains no noise, the intelligent household equipment is controlled according to the parsed voice control command. If the matching result indicates that noise exists in the parsed voice control command, the noise is filtered according to the voice word number information corresponding to the parsed voice control command and the lip language mouth shape information, and the intelligent household equipment is then controlled according to the voice control command after noise filtering. In this way, the accuracy of extracting the voice control command can be improved.

Based on the same inventive concept, the embodiment of the present invention further provides an apparatus for controlling a device. Since this apparatus corresponds to the method for controlling a device provided in the embodiment of the present invention, and the principle by which the apparatus solves the problem is similar to that of the method, the implementation of the apparatus may refer to the implementation of the method, and repeated details are omitted.

As shown in fig. 2, an apparatus for controlling a device according to a first embodiment of the present invention includes: at least one processing unit 200 and at least one storage unit 201, wherein the storage unit 201 stores program code that, when executed by the processing unit, causes the processing unit 200 to perform the following:

Optionally, the processing unit 200 is specifically configured to:

if the voice word number information corresponding to the determined voice control command is greater than the mouth shape conversion frequency information corresponding to the lip language mouth shape information, discard the part of the determined voice control command that is not matched with the lip language mouth shape information to obtain a second voice control command;

and if the voice word number information corresponding to the second voice control command is equal to the mouth shape conversion frequency information corresponding to the lip language mouth shape information, and the matching degree between the second voice control command and the lip language mouth shape information captured when the user issues the voice control command for the intelligent household equipment is smaller than a threshold value, replace the part of the voice control command that cannot be matched with the lip language mouth shape information with the replacement word information corresponding to the lip language mouth shape information according to the replacement principle, to obtain a first voice control command.
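The discarding operation that produces the second voice control command can be illustrated with the sketch below. It assumes that the command is available as a list of words, that the lip language mouth shape information is a list of mouth shape labels, and that a hypothetical table `shape_of` gives the mouth shape expected for each word; none of these representations is prescribed by the description. The sketch drops only as many unmatched words as there are surplus words, so that any remaining mismatch is left for the replacement step.

```python
from typing import Dict, List


def discard_unmatched(
    words: List[str],
    shapes: List[str],
    shape_of: Dict[str, str],  # hypothetical: word -> expected mouth shape label
) -> List[str]:
    """Drop surplus words whose expected mouth shape does not match the lips.

    Walks the command left to right against the observed mouth shapes. A word
    is discarded only if it fails to match AND the command still has more
    words than there are mouth shape conversions.
    """
    surplus = len(words) - len(shapes)
    kept: List[str] = []
    i = 0  # position in the observed mouth shape sequence
    for word in words:
        matches = i < len(shapes) and shape_of.get(word) == shapes[i]
        if not matches and surplus > 0:
            surplus -= 1  # treat this word as noise
            continue
        kept.append(word)
        i += 1
    return kept
```

This greedy walk is a simplification: a misrecognized word that appears before a noise word would be dropped in its place, so a practical implementation would likely need a proper sequence alignment.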

Optionally, the processing unit 200 is specifically configured to:

As shown in fig. 3, an apparatus for controlling a device according to a second embodiment of the present invention includes a determination module 300 and a control module 301:

the determination module 300 is configured to match the determined voice control command with the lip language mouth shape information captured when a user issues the voice control command for the intelligent household equipment, and to determine a first voice control command according to the matching result;

the control module 301 is configured to control the intelligent household equipment according to the first voice control command.

Optionally, the determination module 300 is specifically configured to:

judge whether the matching degree between the determined voice control command and the lip language mouth shape information captured when the user issues the voice control command for the intelligent household equipment is smaller than a threshold value, and if so, determine that noise exists in the determined voice control command;
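As an illustration of the noise judgment above, the sketch below computes a matching degree and compares it with a threshold. The way the matching degree is defined here (the fraction of positions at which a word's expected mouth shape equals the observed one), the `shape_of` table, and the default threshold of 0.9 are all assumptions made for the example; the description leaves these details open.

```python
from typing import Dict, List


def matching_degree(
    words: List[str],
    shapes: List[str],
    shape_of: Dict[str, str],  # hypothetical: word -> expected mouth shape label
) -> float:
    """Fraction of positions where the word's expected mouth shape was observed."""
    longest = max(len(words), len(shapes))
    if longest == 0:
        return 0.0  # nothing to compare
    hits = sum(1 for w, s in zip(words, shapes) if shape_of.get(w) == s)
    return hits / longest


def has_noise(
    words: List[str],
    shapes: List[str],
    shape_of: Dict[str, str],
    threshold: float = 0.9,  # assumed value
) -> bool:
    """Noise is judged to exist when the matching degree is below the threshold."""
    return matching_degree(words, shapes, shape_of) < threshold
```

Dividing by the longer of the two sequences means that both extra words and extra mouth shapes lower the matching degree.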

Optionally, the determination module 300 is specifically configured to:

As shown in fig. 4, a complete method for controlling a device according to an embodiment of the present invention includes the following steps:

Step 400, acquiring voice information and lip language mouth shape information of a user for the intelligent household equipment;

Step 401, analyzing the voice information and the lip language mouth shape information;

Step 402, judging whether a voice control command has been parsed; if so, executing step 403, otherwise exiting;

Step 403, matching the voice control command with the lip language mouth shape information;

Step 404, judging whether the matching result is greater than a threshold value; if so, executing step 411, otherwise executing step 405;

Step 405, comparing the voice word number information obtained by analyzing the voice information with the mouth shape conversion frequency information obtained by analyzing the lip language mouth shape information;

Step 406, judging whether the voice word number information is greater than the mouth shape conversion frequency information; if so, executing step 407, otherwise executing step 408;

Step 407, discarding the part of the voice control command that is not matched with the lip language mouth shape information, and returning to step 403;

Step 408, judging whether the voice word number information is equal to the mouth shape conversion frequency information; if so, executing step 409, otherwise exiting;

Step 409, replacing the part of the voice control command that cannot be matched with the lip language mouth shape information with the replacement word information corresponding to the lip language mouth shape information according to the replacement principle;

Step 410, obtaining a final voice control command;

Step 411, controlling the intelligent household equipment according to the final voice control command.
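The flow of fig. 4 from the parsed command onward can be sketched as a single function. This is an illustrative reading only: the word and mouth shape lists, the `shape_of` and `replacement_for` tables, the `similar` callback, and the threshold value are hypothetical, and the threshold test uses the "not less than" wording given earlier in the description.

```python
from typing import Callable, Dict, List, Optional


def degree(words: List[str], shapes: List[str], shape_of: Dict[str, str]) -> float:
    """Assumed matching degree: share of positions whose mouth shapes agree."""
    longest = max(len(words), len(shapes))
    hits = sum(1 for w, s in zip(words, shapes) if shape_of.get(w) == s)
    return hits / longest if longest else 0.0


def final_command(
    words: List[str],                 # parsed voice control command
    shapes: List[str],                # observed mouth shape sequence
    shape_of: Dict[str, str],         # hypothetical: word -> expected mouth shape
    replacement_for: Dict[str, str],  # hypothetical: mouth shape -> replacement word
    similar: Callable[[str, str], bool],  # the replacement principle
    threshold: float = 0.9,           # assumed value
) -> Optional[List[str]]:
    """Return the command used to control the equipment, or None on exit."""
    if not words:                                        # step 402
        return None
    while True:
        if degree(words, shapes, shape_of) >= threshold:  # steps 403-404
            return words                                  # proceed to step 411
        if len(words) > len(shapes):                      # steps 405-406
            surplus, kept, i = len(words) - len(shapes), [], 0
            for w in words:                               # step 407
                ok = i < len(shapes) and shape_of.get(w) == shapes[i]
                if not ok and surplus > 0:
                    surplus -= 1                          # discard as noise
                    continue
                kept.append(w)
                i += 1
            words = kept
            continue                                      # back to step 403
        if len(words) != len(shapes):                     # step 408
            return None
        result: List[str] = []
        for w, s in zip(words, shapes):                   # step 409
            if shape_of.get(w) == s:
                result.append(w)
                continue
            candidate = replacement_for.get(s)
            if candidate is None or not similar(w, candidate):
                return None                               # parsing fails
            result.append(candidate)
        return result                                     # step 410
```

The loop always terminates in this sketch: the discard step reduces the command to exactly as many words as there are mouth shape conversions, after which the function either returns the command or attempts the replacement once.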

The present application is described above with reference to block diagrams and/or flowchart illustrations of methods, apparatus (systems) and/or computer program products according to embodiments of the application. It will be understood that one block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, and/or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks.

Accordingly, the subject application may also be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.). Further, the present application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. In the context of this application, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A method of controlling a device, the method comprising:

controlling the intelligent household equipment according to the first voice control command;

the determining the first voice control command according to the matching result includes:

2. The method of claim 1, wherein the match result is determined by:

3. The method of claim 1, wherein the determining a first voice control command according to the voice word number information corresponding to the determined voice control command and the lip language mouth shape information comprises:

4. The method of claim 1, wherein the determining a first voice control command according to the voice word number information corresponding to the determined voice control command and the lip language mouth shape information comprises:

5. The method of claim 3 or 4, wherein the information on the number of speech words corresponding to the determined speech control command is determined by:

6. The method according to claim 3 or 4, wherein the mouth shape conversion times information and the alternative word information corresponding to the lip language mouth shape information are determined by:

7. An apparatus for controlling a device, the apparatus comprising: at least one processing unit and at least one memory unit, wherein the memory unit stores program code that, when executed by the processing unit, causes the processing unit to perform the following:

wherein the processing unit is specifically configured to:

8. The apparatus as claimed in claim 7, wherein said processing unit is specifically configured to:

9. The apparatus as claimed in claim 7, wherein said processing unit is specifically configured to:

10. The apparatus as claimed in claim 7, wherein said processing unit is specifically configured to:

11. The apparatus according to claim 9 or 10, wherein the processing unit is specifically configured to:

12. The apparatus according to claim 9 or 10, wherein the processing unit is specifically configured to: