CN111367415A - Equipment control method and device, computer equipment and medium - Google Patents

Equipment control method and device, computer equipment and medium Download PDF

Info

Publication number
CN111367415A
CN111367415A (application CN202010186785.1A)
Authority
CN
China
Prior art keywords
gesture
target video
image
target
control instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010186785.1A
Other languages
Chinese (zh)
Other versions
CN111367415B (en)
Inventor
何吉波
谭北平
谭志鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beijing Mininglamp Software System Co ltd
Original Assignee
Tsinghua University
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Beijing Mininglamp Software System Co ltd filed Critical Tsinghua University
Priority to CN202010186785.1A priority Critical patent/CN111367415B/en
Publication of CN111367415A publication Critical patent/CN111367415A/en
Application granted granted Critical
Publication of CN111367415B publication Critical patent/CN111367415B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application provides a device control method and apparatus, a computer device, and a medium. The method includes: acquiring a target video for controlling a target device; determining a gesture pattern and a gesture motion trajectory appearing in the target video according to a plurality of designated frame images in the target video; determining a control instruction corresponding to the target video according to the gesture pattern and the gesture motion trajectory; and controlling the target device according to the control instruction. In the embodiments of the application, the gesture pattern and the gesture motion trajectory are determined from the acquired video, and the control instruction determined from them controls the target device, so the user does not need to wear a data glove matched with the target device; the cumbersome steps of controlling the target device are reduced, and the convenience of controlling the target device is improved.

Description

Equipment control method and device, computer equipment and medium
Technical Field
The application relates to the field of intelligent recognition, and in particular to a device control method and apparatus, a computer device, and a medium.
Background
With the development of science and technology, remote control has gradually come into public view. Remote control enables interaction between people and devices: a device can work according to a person's instructions without the person touching it. Remote control can improve productivity to a great extent and increase work efficiency.
Generally, people wear a data glove to realize remote control. The data glove is provided with a plurality of sensors through which the gestures of the hand wearing the glove can be recognized, and the corresponding device is remotely controlled according to the recognized gestures. This method of controlling a device requires a designated data glove to be worn; without the designated data glove the device cannot be controlled, which reduces the convenience of controlling the device.
Disclosure of Invention
In view of the above, an object of the present application is to provide a device control method and apparatus, a computer device, and a medium, which address the problem in the prior art of how to conveniently perform remote control of a device.
In a first aspect, an embodiment of the present application provides a method for controlling a device, including:
acquiring a target video for controlling a target device;
determining a gesture pattern and a gesture motion trajectory appearing in the target video according to a plurality of designated frame images in the target video;
determining a control instruction corresponding to the target video according to the gesture pattern and the gesture motion trajectory;
and controlling the target device according to the control instruction.
Optionally, after acquiring the target video for controlling the target device and before determining the gesture pattern and gesture motion trajectory appearing in the target video according to the plurality of designated frame images in the target video, the method further includes:
performing foreground extraction on each frame of image in the target video based on the color value of each part in the image to determine an area image where a hand is located in the frame of image;
and determining whether the target video is an effective target video or not according to the area of the region image where the hand is located in each frame image of the target video.
Optionally, the determining, according to the plurality of designated frame images in the target video, a gesture pattern and a gesture motion trajectory appearing in the target video includes:
determining the contour of the hand and the position information of a positioning point in each frame image in the target video;
determining the gesture pattern according to the similarity between the contour of the hand in each frame image and the contours of the candidate standard gestures;
and determining the gesture motion trajectory according to the position information and the time information of the positioning point in each frame image.
Optionally, the position of the positioning point includes any one or more of the following positions:
thumb tip position, index finger tip position, middle finger tip position, ring finger tip position, little finger tip position, and centroid position.
Optionally, after determining the control instruction corresponding to the target video according to the gesture pattern and the gesture motion trajectory, and before controlling the target device according to the control instruction, the method further includes:
sending the control instruction to a user terminal so that the user terminal prompts the control instruction according to a message prompting mode;
and receiving a reply instruction of the user terminal to the prompted control instruction.
Optionally, the message prompting manner includes any one or more of the following manners:
a text prompt mode, an image prompt mode and a broadcast prompt mode.
In a second aspect, an embodiment of the present application provides a control apparatus for a device, including:
the acquisition module is used for acquiring a target video for controlling a target device;
the first determination module is used for determining a gesture pattern and a gesture motion trajectory appearing in the target video according to a plurality of designated frame images in the target video;
the second determination module is used for determining a control instruction corresponding to the target video according to the gesture pattern and the gesture motion trajectory;
and the control module is used for controlling the target device according to the control instruction.
Optionally, the apparatus further includes:
the extraction module is used for carrying out foreground extraction on each frame of image in the target video based on the color value of each part in the image so as to determine an area image where a hand is located in the frame of image;
and the judging module is used for determining whether the target video is an effective target video according to the area of the region image of the hand in each frame image of the target video.
In a third aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of the above method when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, performs the steps of the above method.
In the device control method provided by the application, a target video for controlling the target device is first acquired; secondly, a gesture pattern and a gesture motion trajectory appearing in the target video are determined according to the designated multi-frame images in the target video; thirdly, a control instruction corresponding to the target video is determined according to the gesture pattern and the gesture motion trajectory; and finally, the target device is controlled according to the control instruction.
According to the embodiment of the application, the gesture pattern and the gesture movement track are determined in the acquired video, the control instruction is determined according to the determined gesture pattern and the determined gesture movement track, the target device is controlled, a user does not need to wear data gloves matched with the target device, the fussy steps of controlling the target device are reduced, and convenience of controlling the target device is improved.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting the scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic flowchart of a method for controlling an apparatus according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of a method for determining a gesture style and a gesture motion trajectory according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a control device of an apparatus according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a computer device 400 according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. The components of the embodiments, as generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments is not intended to limit the scope of the claimed application but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art from the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
At present, during human-computer interaction, a user generally wears a data glove matched with the target device (the device to be controlled). The gestures made by the user while wearing the data glove can be recognized, and the target device is controlled through the recognized gestures so that it completes the corresponding actions.
The control of the target device comprises the following steps:
step 1, acquiring gestures made by a hand wearing a data glove;
step 2, determining a control instruction corresponding to the gesture in a database according to the gesture;
and 3, controlling the target equipment according to the control instruction.
In this method of controlling the target device, the target device can be controlled only by wearing the data glove matched with it, and the glove must fully recognize the gestures made by the hand, so it needs to be provided with many sensors. A data glove provided with many sensors is expensive to manufacture and cannot fit the user's hand perfectly, which degrades the user experience. Moreover, every time the user controls the target device, the corresponding data glove must be worn, which reduces both the convenience and the efficiency of controlling the target device.
For the above reasons, an embodiment of the present application provides a method for controlling a device, as shown in fig. 1, including the following steps:
s101, acquiring a target video for controlling target equipment;
s102, determining a gesture style and a gesture motion track appearing in the target video according to the designated multi-frame images in the target video;
s103, determining a control instruction corresponding to the target video according to the gesture style and the gesture motion track;
and S104, controlling the target equipment according to the control instruction.
In step S101 above, the target device refers to a device that needs to be controlled, such as a mechanical device or other controllable equipment. The target video refers to a video that can be used to control the target device; it needs to contain hand motions, and the device for capturing it can be a camera, a smart device, or the like.
Specifically, the scheme of the application requires an image pickup device for shooting the target video; the target video is acquired through this device, and the subsequent steps S102 to S104 can be executed only after the target video is acquired.
In step S102, the designated multi-frame images may be frame images extracted from the target video at a preset time interval, where the preset time interval may be 1 second, 2 seconds, 3 seconds, or the like; the application is not limited here. The gesture pattern refers to a hand shape formed by the configuration of the fingers and includes any one of a fist pattern, a one-finger protrusion pattern, a two-finger protrusion pattern, a three-finger protrusion pattern, a four-finger protrusion pattern, a five-finger protrusion pattern, and the like;
wherein the one-finger protrusion pattern includes any one of: a thumb protrusion pattern, an index finger protrusion pattern, a middle finger protrusion pattern, a ring finger protrusion pattern, a little finger protrusion pattern, and the like;
the two-finger protrusion pattern includes any one of: an index finger and thumb protrusion pattern, an index finger and middle finger protrusion pattern, a middle finger and ring finger protrusion pattern, and a thumb and pinky protrusion pattern, etc.;
the three finger protrusion pattern includes any one of: protruding patterns of an index finger, a middle finger and a ring finger, protruding patterns of the middle finger, the ring finger and a little finger and the like;
the four finger protrusion pattern includes any one of: the protruding patterns of the index finger, the middle finger, the ring finger and the little finger, etc.;
the five-finger protrusion pattern is the pattern in which the thumb, index finger, middle finger, ring finger, and little finger all protrude.
The gesture motion trajectory refers to the path along which the hand moves in the target video while the gesture pattern remains unchanged, and includes any one of upward translation, downward translation, leftward translation, rightward translation, clockwise rotation in the horizontal plane, counterclockwise rotation in the horizontal plane, clockwise rotation in a plane perpendicular to the horizontal, counterclockwise rotation in a plane perpendicular to the horizontal, and the like.
Specifically, in the designated multi-frame images of the target video, the region image where the hand is located and the position of the hand are determined in each frame image. The gesture pattern can be determined from the contour of the region image where the hand is located, and after the hand positions in the frame images are sorted by time, the gesture motion trajectory can be determined.
For example, suppose 5 frame images are extracted from a target video. In every frame the gesture pattern is the index-finger protrusion pattern, while the distance between the hand and the bottom edge of the frame image grows from 2 centimeters in the first frame to 3, 4, 5, and finally 6 centimeters in the fifth frame. From these 5 frame images, the gesture pattern in the target video is the index-finger protrusion pattern and the gesture motion trajectory is an upward translation.
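As an illustration of the frame-extraction step described in step S102, a minimal sketch in Python with OpenCV follows. The helper name, the 1-second default interval, and the use of OpenCV are assumptions for illustration, not part of the patent:

```python
# Illustrative sketch: extract one frame per preset time interval from the
# target video. The interval value and fallback FPS are assumed.
import cv2

def extract_designated_frames(video_path, interval_seconds=1.0):
    """Return one frame image per `interval_seconds` of the target video."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unreadable
    step = max(1, int(round(fps * interval_seconds)))
    frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:  # keep one frame per preset interval
            frames.append(frame)
        index += 1
    cap.release()
    return frames
```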
In step S103, the control instruction refers to an instruction that can be used to control the target device; different combinations of gesture patterns and gesture motion trajectories correspond to different control instructions.
Specifically, after the gesture pattern and the gesture motion trajectory are determined, the control instruction associated with them is screened out from a control instruction database, in which gesture patterns, gesture motion trajectories, and control instructions are stored in association.
For example, in the control instruction database, a thumb-protrusion pattern with an upward translation represents a forward-movement instruction; an index-finger-protrusion pattern with an upward translation represents a backward-movement instruction; a middle-finger-protrusion pattern with an upward translation represents a leftward-movement instruction; a ring-finger-protrusion pattern with an upward translation represents a rightward-movement instruction; and a little-finger-protrusion pattern with an upward translation represents a pivot-rotation instruction. When the gesture pattern detected in the target video is the index-finger-protrusion pattern and the gesture motion trajectory is an upward translation, the control instruction represented by the target video can be determined from the control instruction database to be the backward-movement instruction.
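A minimal sketch of this lookup is shown below. The dictionary-based "database" and the pattern, trajectory, and instruction names are illustrative assumptions mirroring the example above; the patent does not prescribe a data structure:

```python
# Assumed in-memory stand-in for the control instruction database, keyed by
# (gesture pattern, gesture motion trajectory) pairs.
CONTROL_INSTRUCTION_DB = {
    ("thumb_protrusion", "translate_up"): "move_forward",
    ("index_protrusion", "translate_up"): "move_backward",
    ("middle_protrusion", "translate_up"): "move_left",
    ("ring_protrusion", "translate_up"): "move_right",
    ("little_protrusion", "translate_up"): "pivot_rotate",
}

def lookup_instruction(gesture_pattern, trajectory):
    """Screen out the instruction associated with the detected combination."""
    return CONTROL_INSTRUCTION_DB.get((gesture_pattern, trajectory))

# Example: lookup_instruction("index_protrusion", "translate_up")
# returns "move_backward", matching the example in the text.
```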
In step S104, after the control instruction is determined, the control instruction needs to be sent to the target device, and the target device can perform a corresponding action according to the control instruction.
According to the scheme, through the above four steps, the gesture pattern and the gesture motion trajectory are determined from the acquired video, the control instruction is determined from them, and the target device is controlled. The user does not need to wear a data glove matched with the target device, the cumbersome steps of controlling the target device are reduced, and the convenience of controlling the target device is improved.
When a video is shot in a natural environment, many objects may appear in it, so the target video of the application may contain not only the user's hand but also the user's arm and other objects in the user's environment. When the gesture pattern and gesture motion trajectory are recognized in such a target video, objects other than the user's hand interfere with the recognition, so the interference of these other objects needs to be removed from the target video. The following steps may be adopted:
step 105, performing foreground extraction on each frame of image in the target video based on the color value of each part in the image to determine an area image where a hand is located in the frame of image;
and step 106, determining whether the target video is the effective target video according to the area of the region image of the hand in each frame image of the target video.
In step 105 above, the color value refers to the color feature value of a pixel point and may be represented as a red-green-blue color space value or as a hue-saturation-lightness color space value. The color value of each part refers to the color value of each pixel point. Foreground extraction refers to distinguishing the image of the region where the hand is located from the image of the background region in each frame image and removing the background region.
In step 105, foreground extraction is performed on each frame image to remove its background region and retain only the region image where the hand is located, which reduces the interference of the background with gesture pattern and gesture motion trajectory recognition.
The foreground extraction of each frame image in the application comprises the following steps:
step 1051, carrying out graying processing on the frame image;
step 1052, performing binarization processing on the grayed frame image according to a first color threshold;
step 1053, carrying out noise reduction processing on the binarized frame image to obtain a binary image;
and step 1054, removing the background in the binary image according to a second color threshold to retain the image of the region where the hand is located.
In step 1051, graying refers to adjusting the color of each pixel point to gray, that is, adjusting the red, green, and blue color space values of the pixel point to equal values so that the pixel appears gray.
Specifically, graying of the frame image can be implemented by any one of the following algorithms: a component algorithm, a maximum algorithm, an average algorithm, a weighted average algorithm, and the like. Graying an image with any of these algorithms is a common technique in the prior art and is not described in detail here.
In step 1052, the first color threshold refers to a range of color values corresponding to all pixel points in the area image where the hand is located, and the binarization processing refers to unifying the color values of the pixel points in the area image where the hand is located into one color value and unifying the color values of the background area image into another color value, for example, setting the color value of the area image where the hand is located to 0 and setting the color value of the background area image to 255.
Specifically, the range of the color values of the pixel points in the area image where the hand is located (i.e., the first color threshold) is determined, the color values of the pixel points belonging to the first color threshold are unified into one color value, and the color values of the pixel points not belonging to the first color threshold are unified into another color value.
For example, when the color value is a color space value of hue, saturation, and lightness, the color value of the pixel of the image of the area where the hand is located is in a range of hue (2, 28) and saturation (50, 200), the color value of the pixel whose color value is in the above range is set to 0, and the color value of the pixel that does not belong to the above color value range is set to 255.
In step 1053, the noise reduction process is to smooth the edge of the region image.
Specifically, the binarized frame image is first median-filtered so that its edges are processed more cleanly. Then morphological dilation and erosion are performed, which eliminates noise inside the hand region image (for example, small patches with the background color value whose area is far smaller than the gesture region) and noise in the background image (for example, small patches with the hand's color value whose area is far smaller than the gesture region). Finally, Gaussian filtering is performed so that the edge of the region image where the hand is located becomes smoother.
In step 1054, the second color threshold refers to a color value of a pixel of the image of the region where the hand is located (e.g., the color value is 0).
Specifically, according to the color values in the frame image, the image of the region where the hand is located can be separated from the image of the background region, and the image region matching the second color threshold is retained.
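One possible rendering of steps 1051 to 1054 in Python with OpenCV is sketched below. It folds the graying and binarization into a single HSV range threshold built from the hue (2, 28) and saturation (50, 200) example above; the kernel sizes, the value range, and the function name are assumptions rather than the patent's implementation:

```python
import cv2
import numpy as np

def extract_hand_region(frame_bgr):
    # Steps 1051-1052: binarize with the first color threshold, here an
    # assumed skin-like HSV range; foreground (hand) pixels become 255.
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array([2, 50, 0]), np.array([28, 200, 255]))
    # Step 1053: noise reduction - median filter, then morphological opening
    # and closing (erosion/dilation), then Gaussian smoothing of the edges.
    mask = cv2.medianBlur(mask, 5)
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # noise in background
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # noise inside hand
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    # Step 1054: remove the background, keeping only the hand region image.
    return cv2.bitwise_and(frame_bgr, frame_bgr, mask=mask)
```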
In the above step 106, the valid target video refers to a video capable of recognizing a gesture pattern and a gesture motion trajectory.
From the determined image of the region where the hand is located, its area can be calculated, and whether the target video is a valid target video can be determined from the ratio of this area to the area of the frame image and a preset ratio range. The preset ratio range is set in advance according to the actual situation: if the ratio belongs to the preset range, the current video is determined to be a valid target video; otherwise it is determined to be invalid. When the area of the hand region image is too small, the recognized gesture pattern may deviate; when it is too large, the frame image may contain only part of the gesture, so the gesture pattern cannot be recognized. Therefore, judging whether the target video is valid from the area of the hand region image can improve the success rate of gesture pattern recognition.
For example, suppose the area of one frame image is 100 square centimeters, the preset ratio range is 60% to 80%, and five frame images G, H, J, K, and L are extracted from the target video, in which the areas of the region images where the hand is located are 70, 80, 75, 70, and 65 square centimeters respectively. The ratios of the hand region area to the frame area are then 70%, 80%, 75%, 70%, and 65% respectively. Since the ratio in every frame image is within the preset ratio range, the target video is a valid target video.
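A sketch of this validity check follows, assuming the hand region of each designated frame is already available as a binary mask (for instance, from the foreground extraction sketched earlier). The 60% to 80% bounds follow the example above, and the function name is illustrative:

```python
import cv2

def is_valid_target_video(hand_masks, lo=0.60, hi=0.80):
    """hand_masks: one binary mask per designated frame (hand pixels != 0)."""
    for mask in hand_masks:
        frame_area = mask.shape[0] * mask.shape[1]
        hand_area = cv2.countNonZero(mask)
        ratio = hand_area / frame_area
        if not (lo <= ratio <= hi):  # one out-of-range frame invalidates it
            return False
    return True
```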
A video contains both a gesture pattern and a gesture motion trajectory, which are determined in the target video as shown in fig. 2; step S102 includes:
s1031, aiming at each frame of image in the target video, determining the outline of the hand and the position information of the positioning point in the frame of image;
s1032, determining a gesture style according to the similarity between the outline of the hand in each frame of image and the outline of the candidate standard gesture;
and S1033, determining a gesture motion track according to the position information and the time information of the positioning point in each frame of image.
In step S1031, the contour of the hand is defined by the boundary of the region image in which the hand is located. The positioning point refers to a designated point for determining the position of the hand; specifically, it may be a certain point on the hand and may include any one or more of the following positions: the thumb fingertip position, index finger fingertip position, middle finger fingertip position, ring finger fingertip position, little finger fingertip position, centroid position, and the like. For the centroid, the minimum rectangle capable of containing the region image of the hand is determined from the boundary of that region image, and the intersection point of the rectangle's two diagonals is taken as the centroid position. The position information of the positioning points can be determined by a positioning point determination model, which is obtained by training on a large amount of training data. The training of the positioning point determination model may comprise the following steps:
step 11, acquiring a positioning point training sample set; the positioning point training sample set comprises a plurality of positioning point training samples;
and step 12, for each training sample, taking the unmarked gesture image as a positive sample of the positioning point determination model to be trained and the gesture image marked with positioning points as a negative sample, and training the positioning point determination model accordingly.
In step 11, the positioning point training sample set includes a plurality of positioning point training samples, each of which includes an unmarked gesture image and a gesture image marked with positioning points. A gesture image is a photo containing the image of the region where a hand is located, and the positioning points marked in it may be the thumb fingertip position, index finger fingertip position, middle finger fingertip position, ring finger fingertip position, little finger fingertip position, and centroid position.
In step 12, after a frame image is input, the position information of the positioning point in it can be determined by the positioning point determination model obtained through this training.
Specifically, in a frame image, the contour of the hand can be determined from the boundary line of the image of the region where the hand is located, and the frame image is input to the positioning point determination model to acquire the position information of the positioning point in the frame image.
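For the centroid positioning point specifically, a sketch is given below under the assumption that the hand region is available as a binary mask. OpenCV's axis-aligned bounding rectangle stands in for the "minimum rectangle", whose diagonals intersect at its center:

```python
import cv2

def centroid_position(hand_mask):
    """Return the diagonal intersection of the minimal rectangle around the hand."""
    contours, _ = cv2.findContours(hand_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)  # largest region taken as the hand
    x, y, w, h = cv2.boundingRect(hand)
    return (x + w / 2.0, y + h / 2.0)  # rectangle center = diagonal intersection
```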
In step S1032, the control instruction database contains a plurality of candidate gesture contours, each representing a gesture pattern. The similarity between the hand contour and each candidate gesture contour is calculated, the candidate contours meeting a similarity threshold are selected, and among them the one with the highest similarity to the hand contour is determined; the gesture pattern represented by that candidate contour is the gesture pattern represented by the hand contour.
For example, let the hand contour be A and let there be 5 candidate gesture contours B, C, D, E, and F, with a similarity threshold of 60% to 100%. Suppose the calculated similarities between A and B, C, D, E, and F are 50%, 70%, 30%, 80%, and 20% respectively. The candidate gesture contours meeting the similarity threshold are then C and E, and the one with the maximum similarity is E, so the gesture pattern represented by candidate gesture contour E is the gesture pattern represented by the contour of the hand.
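One way to realize this screening is OpenCV's cv2.matchShapes, sketched below. matchShapes returns a distance (lower means more similar), so it is converted here to an ad hoc similarity score; that conversion and the 60% threshold are assumptions, not the patent's similarity metric:

```python
import cv2

def best_matching_pattern(hand_contour, candidates, threshold=0.60):
    """candidates: dict mapping gesture pattern name -> standard contour."""
    best_name, best_sim = None, threshold
    for name, std_contour in candidates.items():
        dist = cv2.matchShapes(hand_contour, std_contour,
                               cv2.CONTOURS_MATCH_I1, 0.0)
        sim = 1.0 / (1.0 + dist)  # map distance to (0, 1]; ad hoc choice
        if sim >= best_sim:       # keep the candidate with highest similarity
            best_name, best_sim = name, sim
    return best_name
```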
In step S1033, the position information of the positioning points in the frame images is sorted according to the time information, and the line connecting the sorted positioning point positions forms the gesture motion trajectory.
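A sketch of this trajectory step, reduced to the four translation cases for brevity (the rotations would require additional geometry, and the classification rule is an illustrative assumption):

```python
def classify_trajectory(timed_points):
    """timed_points: list of (timestamp, (x, y)) for one positioning point."""
    pts = [p for _, p in sorted(timed_points)]  # sort by time information
    dx = pts[-1][0] - pts[0][0]
    dy = pts[-1][1] - pts[0][1]  # image y axis grows downward
    if abs(dx) >= abs(dy):
        return "translate_right" if dx > 0 else "translate_left"
    return "translate_up" if dy < 0 else "translate_down"
```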
When the target device is controlled, a control instruction is simply sent to it, and upon receiving the instruction the target device executes the corresponding action. However, the target device is a machine: it cannot think like a human and has no ability to adapt to unexpected situations. When the target device performs high-risk work and receives an erroneous control instruction, it may endanger the lives of workers; when it handles expensive articles and receives an erroneous control instruction, it may damage them. Therefore, sending a control instruction to the target device needs to be handled more rigorously, and the method further includes the following steps:
step 107, sending the control instruction to the user terminal so that the user terminal prompts the control instruction according to a message prompting mode;
and step 108, receiving a reply instruction of the user terminal aiming at the prompted control instruction.
In step 107, the user terminal refers to a device that can be used to receive the control instruction; it may be a mobile phone, a computer, a tablet computer, or the like. The message prompting mode refers to the way in which the control instruction is prompted and may include any one or more of the following: a text prompt mode, an image prompt mode, and a broadcast prompt mode.
Specifically, the control instruction is prompted at the user terminal in a preset message prompting mode, so that the user can review the control instruction again and check whether it is correct.
For example, if the control instruction is that the target device moves forward, then after the control instruction is sent to the user terminal, the user terminal may pop up a message box displaying "control the target device to move forward" with confirm and cancel buttons below it.
In the above step 108, the reply instruction refers to a reply to the message prompted in the user terminal, and the reply instruction may include confirmation and cancellation.
Specifically, after seeing the prompt message on the user terminal, the user may reply to it: if the user checks that the control instruction is correct, the user confirms it and the instruction continues to be executed; if the user finds the instruction wrong, the user cancels it so that it is not sent to the target device. This can improve the accuracy of control instructions, reduce the threat to life safety caused by the target device receiving a wrong instruction, and reduce the property loss such an instruction could cause.
Continuing the example in which the user terminal prompts the control instruction: the user judges according to the content of the prompt message. If the user confirms the content is correct and clicks the confirm button, the control instruction continues to be sent to the target device; if the user finds the content wrong and clicks the cancel button, the control instruction is not sent to the target device.
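The confirm/cancel flow can be sketched as follows. The transport between the user terminal and the target device is abstracted into callbacks, and all names are illustrative rather than from the patent:

```python
def confirm_and_dispatch(instruction, prompt_user, send_to_device):
    """prompt_user returns True (confirm) or False (cancel)."""
    if prompt_user(f"Control target device: {instruction}?"):
        send_to_device(instruction)   # user confirmed: continue execution
        return True
    return False                      # user cancelled: instruction withheld

# Example wiring with a console prompt standing in for the user terminal:
# confirm_and_dispatch("move_forward",
#                      prompt_user=lambda m: input(m + " [y/n] ") == "y",
#                      send_to_device=print)
```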
In the application, the gesture pattern and gesture motion trajectory are determined from the target video, and the target device is controlled according to the control instruction determined from them. The target device can be controlled without the user wearing a data glove exactly matched to it, which improves control efficiency, and omitting the data glove also reduces production cost. Because the control instruction is determined by the combination of gesture pattern and gesture motion trajectory, many kinds of control instructions can be defined, making the instruction set richer, letting the user control the target device more comprehensively, and improving both the control efficiency and the working efficiency of the target device. Foreground extraction based on color values removes the background from the video, reducing interference with the determination of the gesture pattern and trajectory and improving their accuracy. By calculating the area of the region image of the hand, it can be determined whether the target video is valid; the gesture pattern and trajectory are recognized only after the target video is confirmed valid and are not recognized in invalid videos, which reduces unnecessary work and improves the efficiency of recognizing the gesture pattern and trajectory.
As shown in fig. 3, an embodiment of the present application provides a control apparatus for a device, including:
an obtaining module 301, configured to obtain a target video for controlling a target device;
the first determining module 302 is configured to determine a gesture pattern and a gesture motion trajectory occurring in the target video according to a plurality of frames of images specified in the target video;
the second determining module 303 is configured to determine a control instruction corresponding to the target video according to the gesture pattern and the gesture motion trajectory;
and the control module 304 is configured to control the target device according to the control instruction.
Optionally, the apparatus further includes:
the extraction module is used for carrying out foreground extraction on each frame of image in the target video based on the color value of each part in the image so as to determine an area image where a hand is located in the frame of image;
and the judging module is used for determining whether the target video is the effective target video according to the area of the region image of the hand in each frame image of the target video.
Optionally, the first determining module 302 includes:
the first determining unit is used for determining the contour of the hand and the position information of a positioning point in each frame image in the target video;
the second determining unit is used for determining the gesture pattern according to the similarity between the contour of the hand in each frame image and the contours of the candidate standard gestures;
and the third determining unit is used for determining the gesture motion trajectory according to the position information and the time information of the positioning point in each frame image.
Optionally, the positioning point position includes any one or more of the following positions:
thumb tip position, index finger tip position, middle finger tip position, ring finger tip position, little finger tip position, and centroid position.
Optionally, the apparatus further includes:
the prompting module is used for sending the control instruction to the user terminal so that the user terminal prompts the control instruction according to a message prompting mode;
and the reply module is used for receiving a reply instruction of the user terminal aiming at the prompted control instruction.
Optionally, the message prompting mode includes any one or more of the following modes:
a text prompt mode, an image prompt mode and a broadcast prompt mode.
Corresponding to the control method of the device in fig. 1, an embodiment of the present application further provides a computer device 400, as shown in fig. 4. The device includes a memory 401, a processor 402, and a computer program stored in the memory 401 and executable on the processor 402, where the processor 402 implements the control method of the device when executing the computer program.
Specifically, the memory 401 and the processor 402 can be a general-purpose memory and processor, which are not specifically limited here. When the processor 402 runs a computer program stored in the memory 401, the control method of the device can be executed, solving the problem in the prior art of how to conveniently perform remote control of a device. In the application, the gesture pattern and gesture motion trajectory are determined from the acquired video, the control instruction is determined from them, and the target device is controlled; the user does not need to wear a data glove matched with the target device, the cumbersome steps of controlling the target device are reduced, and the convenience of controlling the target device is improved.
Corresponding to the control method of the apparatus in fig. 1, the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program is executed by a processor to perform the steps of the control method of the apparatus.
Specifically, the storage medium can be a general storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the control method of the device can be executed, solving the problem in the prior art of how to conveniently perform remote control of a device. In the application, the gesture pattern and gesture motion trajectory are determined from the obtained video, the control instruction is determined from them, and the target device is controlled; the user does not need to wear a data glove matched with the target device, the cumbersome steps of controlling the target device are reduced, and the convenience of controlling the target device is improved.
In the embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and there may be other divisions in actual implementation; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, devices, or units, and may be electrical, mechanical, or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application, or the portion that substantially contributes to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should be noted that like reference numbers and letters refer to like items in the following figures; thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures. Moreover, the terms "first", "second", "third", etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present application, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can still modify or easily conceive of changes to the technical solutions described in the foregoing embodiments, or make equivalent substitutions for some technical features within the technical scope disclosed in the present application; such modifications, changes, or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of controlling a device, comprising:
acquiring a target video for controlling a target device;
determining a gesture style and a gesture motion track appearing in the target video according to a plurality of designated frame images in the target video;
determining a control instruction corresponding to the target video according to the gesture style and the gesture motion track;
and controlling the target equipment according to the control instruction.
2. The method according to claim 1, wherein after acquiring a target video acting on a mechanical device, before determining a gesture pattern and a gesture motion trail appearing in the target video according to a plurality of frames of images specified in the target video, the method further comprises:
performing foreground extraction on each frame of image in the target video based on the color value of each part in the image to determine an area image where a hand is located in the frame of image;
and determining whether the target video is an effective target video or not according to the area of the region image where the hand is located in each frame image of the target video.
3. The method according to claim 1, wherein the determining the gesture pattern and the gesture motion trail appearing in the target video according to the plurality of frames of images specified in the target video comprises:
determining the outline of the hand and the position information of a positioning point in each frame of image in the target video;
determining the gesture style according to the similarity between the outline of the hand in each frame of image and the outline of the candidate standard gesture;
and determining the gesture motion track according to the position information and the time information of the positioning point in each frame of image.
4. The method of claim 3, wherein the location points comprise any one or more of the following positions:
thumb tip position, index finger tip position, middle finger tip position, ring finger tip position, little finger tip position, and centroid position.
5. The method according to claim 1, after determining a control instruction corresponding to the target video according to the gesture pattern and the gesture motion trajectory, and before controlling the target device according to the control instruction, further comprising:
sending the control instruction to a user terminal so that the user terminal prompts the control instruction according to a message prompting mode;
and receiving a reply instruction of the user terminal to the prompted control instruction.
6. The method of claim 5, wherein the message prompting means comprises any one or more of the following means:
a text prompt mode, an image prompt mode and a broadcast prompt mode.
7. A control apparatus of a device, characterized by comprising:
the acquisition module is used for acquiring a target video for controlling target equipment;
the first determination module is used for determining a gesture style and a gesture motion track appearing in the target video according to a plurality of frames of images appointed in the target video;
the second determining module is used for determining a control instruction corresponding to the target video according to the gesture style and the gesture motion track;
and the control module is used for controlling the target equipment according to the control instruction.
8. The apparatus of claim 7, further comprising:
the extraction module is used for carrying out foreground extraction on each frame of image in the target video based on the color value of each part in the image so as to determine an area image where a hand is located in the frame of image;
and the judging module is used for determining whether the target video is an effective target video according to the area of the region image of the hand in each frame image of the target video.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of the preceding claims 1-6 are implemented by the processor when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the steps of the method of any one of the preceding claims 1 to 6.
CN202010186785.1A 2020-03-17 2020-03-17 Equipment control method and device, computer equipment and medium Active CN111367415B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010186785.1A CN111367415B (en) 2020-03-17 2020-03-17 Equipment control method and device, computer equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010186785.1A CN111367415B (en) 2020-03-17 2020-03-17 Equipment control method and device, computer equipment and medium

Publications (2)

Publication Number Publication Date
CN111367415A (en) 2020-07-03
CN111367415B CN111367415B (en) 2024-01-23

Family

ID=71206863

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010186785.1A Active CN111367415B (en) 2020-03-17 2020-03-17 Equipment control method and device, computer equipment and medium

Country Status (1)

Country Link
CN (1) CN111367415B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103366188A (en) * 2013-07-08 2013-10-23 中科创达软件股份有限公司 Gesture tracking method adopting fist detection as auxiliary information
CN103530613A (en) * 2013-10-15 2014-01-22 无锡易视腾科技有限公司 Target person hand gesture interaction method based on monocular video sequence
WO2017084319A1 (en) * 2015-11-18 2017-05-26 乐视控股(北京)有限公司 Gesture recognition method and virtual reality display output device
CN107633229A (en) * 2017-09-21 2018-01-26 北京智芯原动科技有限公司 Method for detecting human face and device based on convolutional neural networks
CN107729854A (en) * 2017-10-25 2018-02-23 南京阿凡达机器人科技有限公司 A kind of gesture identification method of robot, system and robot
CN107813310A (en) * 2017-11-22 2018-03-20 浙江优迈德智能装备有限公司 One kind is based on the more gesture robot control methods of binocular vision
US20200012351A1 (en) * 2018-07-04 2020-01-09 Baidu Online Network Technology (Beijing) Co., Ltd. Method, device and readable storage medium for processing control instruction based on gesture recognition
CN109117766A (en) * 2018-07-30 2019-01-01 上海斐讯数据通信技术有限公司 A kind of dynamic gesture identification method and system
CN109814717A (en) * 2019-01-29 2019-05-28 珠海格力电器股份有限公司 A kind of home equipment control method, device, control equipment and readable storage medium storing program for executing

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111857346A (en) * 2020-07-23 2020-10-30 上海纯米电子科技有限公司 Gesture control method and device
CN111913585A (en) * 2020-09-21 2020-11-10 北京百度网讯科技有限公司 Gesture recognition method, device, equipment and storage medium
CN113885710A (en) * 2021-11-02 2022-01-04 珠海格力电器股份有限公司 Control method and control device of intelligent equipment and intelligent system
CN113885710B (en) * 2021-11-02 2023-12-08 珠海格力电器股份有限公司 Control method and control device of intelligent equipment and intelligent system
WO2024002022A1 (en) * 2022-06-27 2024-01-04 影石创新科技股份有限公司 Photographing composition method and apparatus, and computer device and storage medium

Also Published As

Publication number Publication date
CN111367415B (en) 2024-01-23

Similar Documents

Publication Publication Date Title
CN111367415A (en) Equipment control method and device, computer equipment and medium
Xu A real-time hand gesture recognition and human-computer interaction system
US9460339B2 (en) Combined color image and depth processing
Bhuyan et al. Fingertip detection for hand pose recognition
CN106933340B (en) Gesture motion recognition method, control method and device and wrist type equipment
EP2584495A2 (en) Image processing method and apparatus for detecting target
CN103984928A (en) Finger gesture recognition method based on field depth image
JP6066093B2 (en) Finger shape estimation device, finger shape estimation method, and finger shape estimation program
CN112154402A (en) Wearable device, control method thereof, gesture recognition method and control system
CN102402289A (en) Mouse recognition method for gesture based on machine vision
Qi et al. Computer vision-based hand gesture recognition for human-robot interaction: a review
CN110298314A (en) The recognition methods of gesture area and device
Lee et al. Robust fingertip extraction with improved skin color segmentation for finger gesture recognition in Human-robot interaction
CN105975158A (en) Virtual reality interaction method and device
CN114495273A (en) Robot gesture teleoperation method and related device
Gu et al. Hand gesture interface based on improved adaptive hand area detection and contour signature
CN113282164A (en) Processing method and device
JP2868449B2 (en) Hand gesture recognition device
Pun et al. Real-time hand gesture recognition using motion tracking
CN106933341B (en) Method and device for determining region of finger in image and wrist type equipment
Czupryna et al. Real-time vision pointer interface
CN117008491A (en) Intelligent gesture control system and method
CN115061577B (en) Hand projection interaction method, system and storage medium
CN115826764A (en) Gesture control method and system based on thumb
Ooi et al. Cooking behavior recognition using egocentric vision for cooking navigation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant