CN113642413A - Control method, apparatus, device and medium

Info

Publication number
CN113642413A
Authority
CN
China
Prior art keywords
human hand
target
image
target image
target area
Legal status
Pending
Application number
CN202110807879.0A
Other languages
Chinese (zh)
Inventor
孙红伟
朱理森
王翔
Current Assignee
New Line Technology Co ltd
Original Assignee
New Line Technology Co ltd
Application filed by New Line Technology Co ltd
Priority to CN202110807879.0A
Publication of CN113642413A
Status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/017: Gesture based interaction, e.g. based on a set of recognized hand gestures

Abstract

The application discloses a control method, apparatus, device, and medium, belonging to the technical field of artificial intelligence. The control method comprises the following steps: acquiring an image sequence comprising a gesture; for a target image in the image sequence, determining a change trend of the human hand from the images preceding the target image; predicting a target position of the human hand in the target image according to the change trend and the position of the human hand in the image immediately preceding the target image; determining a target area of the human hand in the target image according to the target position; recognizing the target area; in the case that feature information of the human hand is obtained by recognizing the target area, recognizing the feature information to obtain a gesture control command corresponding to the feature information; and executing the gesture control command. The control method, apparatus, device, and medium can improve gesture control efficiency.

Description

Control method, apparatus, device and medium
Technical Field
The application belongs to the technical field of artificial intelligence, and particularly relates to a control method, apparatus, device, and medium.
Background
With the development of artificial intelligence interaction technology, gesture control is increasingly applied in fields such as vehicle-mounted systems, smart homes, Virtual Reality (VR) interaction, and smartphones.
A gesture control system in the related art includes five modules: an image acquisition module, a human hand detection module, a human hand feature recognition module, a gesture command recognition module, and a command execution module. The five modules run in series, so each cycle of the gesture control process takes the sum of the running times of the five modules.
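For illustration, a minimal sketch of such a serial pipeline follows; the function names and the five-stage decomposition are assumptions for illustration, not the patent's reference implementation:

```python
# Minimal sketch of the serial five-module pipeline (illustrative assumption).
import time

def run_gesture_cycle(capture, detect_hand, recognize_features,
                      recognize_command, execute_command):
    """One control cycle: the five modules run strictly in series, so the
    cycle time is the sum of the five module running times."""
    start = time.perf_counter()
    frame = capture()                              # image acquisition module
    region = detect_hand(frame)                    # human hand detection module
    features = recognize_features(frame, region)   # hand feature recognition module
    command = recognize_command(features)          # gesture command recognition module
    execute_command(command)                       # command execution module
    return time.perf_counter() - start             # t1 + t2 + t3 + t4 + t5
```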
However, during gesture control, determining the position of the human hand in each image with the human hand detection module takes a long time, which results in low gesture control efficiency.
Disclosure of Invention
An object of the embodiments of the present application is to provide a control method, apparatus, device, and medium, which can solve the problem of low gesture control efficiency.
In a first aspect, an embodiment of the present application provides a control method, including:
acquiring an image sequence comprising a gesture;
for a target image in the image sequence, determining a change trend of the human hand from the images preceding the target image;
predicting a target position of the human hand in the target image according to the change trend and the position of the human hand in the image immediately preceding the target image;
determining a target area of the human hand in the target image according to the target position;
recognizing the target area;
in the case that feature information of the human hand is obtained by recognizing the target area, recognizing the feature information to obtain a gesture control command corresponding to the feature information;
and executing the gesture control command.
In a second aspect, an embodiment of the present application provides a control apparatus, including:
an acquisition module, configured to acquire an image sequence comprising a gesture;
a first determining module, configured to determine, for a target image in the image sequence, a change trend of the human hand from the images preceding the target image;
a prediction module, configured to predict a target position of the human hand in the target image according to the change trend and the position of the human hand in the image immediately preceding the target image;
a second determining module, configured to determine a target area of the human hand in the target image according to the target position;
a first recognition module, configured to recognize the target area;
a second recognition module, configured to, in the case that feature information of the human hand is obtained by recognizing the target area, recognize the feature information to obtain a gesture control command corresponding to the feature information;
and an execution module, configured to execute the gesture control command.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored in the memory and executable on the processor, where the program or instructions, when executed by the processor, implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the steps of the method according to the first aspect.
In the embodiments of the application, for an image in an image sequence comprising a gesture, a change trend of the human hand is determined from the images preceding that image; a target position of the human hand in the image is predicted according to the change trend and the position of the human hand in the immediately preceding image; a target area of the human hand in the image is determined according to the target position; the target area is recognized; in the case that feature information of the human hand is obtained by recognizing the target area, the feature information is recognized to obtain the corresponding gesture control command; and the gesture control command is executed to control the corresponding controlled object. Because the target position of the human hand in the image is predicted from the change trend and the position of the human hand in the preceding image, predicting the position takes less time than determining it with the human hand detection module. The speed of determining the position of the human hand in the image can therefore be increased, and gesture control efficiency improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below; those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a control method provided in an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a control device provided in an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;
Fig. 4 is a hardware configuration diagram of an electronic device implementing an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present disclosure.
The terms "first," "second," and the like in the description and claims of the present application are used to distinguish between similar elements and not necessarily to describe a particular sequential or chronological order. It should be appreciated that the data so used may be interchanged under appropriate circumstances, so that the embodiments of the application can be practiced in sequences other than those illustrated or described herein; moreover, "first," "second," and the like are generally used in a generic sense and do not limit the number of objects, e.g., the first object can be one or more than one. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the preceding and following objects.
The following describes in detail a control method, an apparatus, a device, and a medium provided in the embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a control method provided in an embodiment of the present application. The control method may include:
S101: acquiring an image sequence comprising a gesture;
S102: for a target image in the image sequence, determining a change trend of the human hand from the images preceding the target image;
S103: predicting a target position of the human hand in the target image according to the change trend and the position of the human hand in the image immediately preceding the target image;
S104: determining a target area of the human hand in the target image according to the target position;
S105: recognizing the target area;
S106: in the case that feature information of the human hand is obtained by recognizing the target area, recognizing the feature information to obtain a gesture control command corresponding to the feature information;
S107: executing the gesture control command.
Specific implementations of the above steps will be described in detail below.
In the embodiments of the application, for an image in an image sequence comprising a gesture, a change trend of the human hand is determined from the images preceding that image; a target position of the human hand in the image is predicted according to the change trend and the position of the human hand in the immediately preceding image; a target area of the human hand in the image is determined according to the target position; the target area is recognized; in the case that feature information of the human hand is obtained by recognizing the target area, the feature information is recognized to obtain the corresponding gesture control command; and the gesture control command is executed to control the corresponding controlled object. Because the target position of the human hand in the image is predicted from the change trend and the position of the human hand in the preceding image, predicting the position takes less time than determining it with the human hand detection module. The speed of determining the position of the human hand in the image can therefore be increased, and gesture control efficiency improved.
In some possible implementations of the embodiments of the present application, the image sequence in S101 may be multiple frames of images acquired by a camera.
In some possible implementations of the embodiments of the present application, in S102, the images in the image sequence may be taken as the target image in turn. In practical applications, the target image in S102 is the frame currently acquired by the camera.
In some possible implementations of the embodiments of the present application, the change trend in S102 includes a position change trend and a posture change trend.
In some possible implementations of the embodiments of the present application, in S103, the target position of the human hand in the target image may be predicted based on a motion function, a motion model, a historical trajectory, or the like. The position prediction algorithms employed include, but are not limited to, Markov chain-based algorithms, hidden Markov model-based algorithms, and neural network-based algorithms.
In some possible implementations of the embodiments of the present application, S103 may include: predicting the positions of the key points of the human hand in the target image, where the position of a key point refers to its coordinates in the target image. In gesture control, 21 human hand key points are typically used.
In the embodiments of the application, the positions of the 21 key points of the human hand in the target image can be predicted, thereby obtaining the position of the human hand in the target image.
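As one concrete possibility, a constant-velocity extrapolation of each of the 21 key points can serve as the predictor; the patent leaves the predictor open (motion function, motion model, or historical trajectory), so this particular form is an illustrative assumption:

```python
# Minimal sketch: predict the 21 hand key points in the target image by
# linear extrapolation from the two preceding frames (illustrative assumption;
# motion-model or history-based predictors are equally admissible).
import numpy as np

NUM_KEYPOINTS = 21  # hand key points typically used for gesture control

def predict_keypoints(prev: np.ndarray, prev_prev: np.ndarray) -> np.ndarray:
    """prev, prev_prev: (21, 2) arrays of (x, y) key point coordinates in the
    image immediately preceding the target image and the one before it. The
    per-point displacement acts as the change trend; adding it to the most
    recent position gives the predicted target position."""
    trend = prev - prev_prev        # change trend of the human hand
    return prev + trend             # predicted key point positions
```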
In some possible implementations of the embodiments of the present application, S104 may include: determining the minimum circumscribed graphic area of the target position in the target image as the target area of the human hand in the target image.
In some possible implementations of the embodiments of the present application, the minimum circumscribed graphic area may be a minimum circumscribed circle area or a minimum circumscribed rectangle area.
In some possible implementations of the embodiments of the application, when the minimum circumscribed graphic area is the minimum circumscribed rectangle area, the sides of the minimum circumscribed rectangle of the target position may be parallel to the corresponding sides of the image.
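A minimal sketch of computing such an axis-aligned minimum circumscribed rectangle from the predicted key points follows; the optional margin parameter is an assumption added for illustration:

```python
# Minimal sketch: axis-aligned minimum circumscribed rectangle of the
# predicted key points, with sides parallel to the image borders.
import numpy as np

def min_bounding_rect(keypoints: np.ndarray, margin: int = 0):
    """keypoints: (21, 2) predicted (x, y) coordinates.
    Returns (x0, y0, x1, y1), the target area of the human hand."""
    x0, y0 = keypoints.min(axis=0) - margin
    x1, y1 = keypoints.max(axis=0) + margin
    return int(x0), int(y0), int(x1), int(y1)
```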
In some possible implementations of the embodiments of the present application, the feature information of the human hand in S105 includes, but is not limited to, coordinate information of a plurality of key points of the human hand in the image.
In some possible implementations of the embodiments of the present application, in S105, the image within the target area may be input to the human hand feature recognition module, which outputs the feature information of the human hand.
In some possible implementations of the embodiments of the present application, in S106, the feature information of the human hand may be input to the gesture command recognition module, which outputs the gesture control command corresponding to the feature information; the gesture control command is then executed to control the controlled object.
In some possible implementations of the embodiments of the present application, the control method provided in the embodiments of the present application may further include: correcting the target position by using the feature information, where the corrected target position is used to predict the position of the human hand in the image immediately following the target image.
Illustratively, take the 11th frame in the image sequence as an example. The target position of the human hand in the 11th frame is predicted according to the position of the human hand in the 10th frame and the change trend of the human hand in the images before the 11th frame. The image within the target area of the 11th frame, determined from the target position, is input to the human hand feature recognition module, which outputs the feature information of the human hand in the 11th frame. The target position of the human hand in the 11th frame is then corrected according to this feature information. For the 12th frame, the corrected target position of the human hand in the 11th frame is used to predict the target position of the human hand in the 12th frame.
Specifically, for each key point of the human hand, suppose that, based on its coordinates in the 10th frame, its coordinates in the 11th frame are predicted to be (X1, Y1), while the human hand feature recognition model outputs its coordinates in the 11th frame as (X2, Y2). The coordinates of the key point are then corrected from (X1, Y1) to (X2, Y2). When predicting the coordinates of the key point in the 12th frame, the corrected coordinates (X2, Y2) are used.
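A minimal sketch of this correction step, reusing the predictor state from the earlier sketch; the state layout and function names are illustrative assumptions:

```python
# Minimal sketch: recognized coordinates (X2, Y2) overwrite the predicted
# (X1, Y1) and seed the prediction for the next frame.
def update_tracker_state(state: dict, predicted, recognized):
    """state holds the two most recent key point sets consumed by
    predict_keypoints(). If recognition succeeded, the recognized coordinates
    replace the predicted ones before being stored for the next prediction."""
    corrected = recognized if recognized is not None else predicted
    state["prev_prev"], state["prev"] = state["prev"], corrected
    return corrected
```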
In the embodiments of the application, correcting the position of the hand in the image improves the accuracy of subsequent predictions of the hand's position, which in turn improves the accuracy of gesture control.
In some possible implementations of the embodiments of the present application, the control method provided in the embodiments of the present application further includes: in the case that recognizing the target area does not yield feature information of the human hand, detecting the target area of the human hand in the target image, and then continuing with the step of recognizing the target area to obtain the feature information of the human hand.
Specifically, when the target position of the human hand in the target image has been predicted from the position and change trend of the human hand in the immediately preceding image, and the image within the target area determined from that target position is input to the human hand feature recognition module but the module outputs no feature information, the target image is input to the human hand detection module. If the human hand detection module does not output a target area, there is no human hand in the target image. If it does output a target area, the image within that target area is input to the human hand feature recognition module, which then outputs the feature information of the human hand.
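A minimal sketch of this fallback path; the module interfaces (return values, None for failure) are illustrative assumptions:

```python
# Minimal sketch: fall back to the human hand detection module when the
# predicted region yields no hand features.
def locate_and_recognize(frame, predicted_region, recognize_features, detect_hand):
    features = recognize_features(frame, predicted_region)
    if features is not None:
        return predicted_region, features   # the prediction was good enough
    region = detect_hand(frame)             # fall back to full-image detection
    if region is None:
        return None, None                   # no human hand in the target image
    return region, recognize_features(frame, region)
```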
In the embodiments of the application, when the predicted target area yields no feature information from the human hand feature recognition module, the target image is input to the human hand detection module to obtain the target area of the human hand; the target area is then recognized to obtain the feature information, the feature information is recognized to obtain the corresponding gesture control command, and the gesture control command is executed to control the corresponding controlled object. This can improve the accuracy of gesture control.
In some possible implementations of the embodiments of the present application, the control method provided in the embodiments of the present application further includes: in the case that recognizing the target area does not yield feature information of the human hand, adjusting the model parameters of the position prediction model used to predict the target position; and re-predicting the target position of the human hand in the target image according to the adjusted model parameters, the change trend, and the position of the human hand in the image immediately preceding the target image, until a preset condition is met.
In some possible implementations of the embodiments of the present application, the model parameters include, but are not limited to, the prediction fuzziness (degree of blur), the prediction range, and the like.
In some possible implementations of the embodiments of the present application, the preset condition includes, but is not limited to: the number of model parameter adjustments reaching a preset number, the time spent predicting the target position of the human hand in the target image reaching a preset duration, the human hand detection module outputting a detection result for the target image, and the like.
For example, when the target position of the human hand in the target image has been predicted from the position and change trend of the human hand in the immediately preceding image, and the image within the target area determined from that prediction yields no feature information from the human hand feature recognition module, the degree of blur of the position prediction model may be increased so that the target position of the human hand in the target image can be predicted again.
As a further example, in the same situation, the prediction range of the position prediction model may be expanded. For instance, when the region of the human hand in the immediately preceding image lies within an M1 × N1 rectangle with coordinates (x, y) as the top-left vertex, the target position of the human hand is first predicted within the M1 × N1 rectangle with (x, y) as the top-left vertex in the target image. If the image within the target area determined from that prediction yields no feature information from the human hand feature recognition module, the prediction range is adjusted, and the target position of the human hand is re-predicted within an M2 × N2 rectangle with (x, y) as the top-left vertex, where M2 is greater than M1 and N2 is greater than N1.
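A minimal sketch of the range adjustment from the example above; the growth factor is an illustrative assumption:

```python
# Minimal sketch: enlarge the prediction window from M1 x N1 to M2 x N2
# around the same top-left vertex (x, y), then re-predict within it.
def expand_prediction_range(window, factor: float = 1.5):
    """window: (x, y, M, N) rectangle with top-left vertex (x, y).
    Returns (x, y, M2, N2) with M2 > M1 and N2 > N1."""
    x, y, m, n = window
    return (x, y, int(m * factor) + 1, int(n * factor) + 1)
```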
In the embodiments of the application, when the predicted target area yields no feature information from the human hand feature recognition module, the model parameters of the position prediction model are adjusted and the target position of the human hand in the target image is re-predicted according to the adjusted model parameters, which can improve the accuracy of gesture control.
In some possible implementations of the embodiments of the present application, the control method provided in the embodiments of the present application further includes: detecting the target area of the human hand in the target image in parallel with predicting the target position of the human hand in the target image; stopping the prediction of the target position in the case that the target area has been detected but the target position has not yet been predicted; and stopping the detection of the target area in the case that the target position has been predicted but the target area has not yet been detected.
In some possible implementations of the embodiments of the present application, two threads may be started simultaneously: the first thread predicts the target position of the human hand in the target image based on the position of the human hand in the image immediately preceding the target image and the change trend; the second thread detects the target position of the human hand in the target image through the human hand detection module.
When the first thread obtains the target position of the human hand in the target image before the second thread does, the target position has already been obtained, so the second thread is no longer needed and can be stopped.
When the first thread does not obtain the target position, the model parameters of the position prediction model are adjusted, and the first thread re-determines the target position using the adjusted model parameters. If the second thread obtains the target position while the first thread is still predicting, the first thread is no longer needed and can be stopped. Conversely, if the first thread obtains the target position while the second thread is still detecting, the second thread can be stopped.
It can be understood that the time T1 to obtain the target position of the human hand in the target image using the first thread is T1 = t1 + n·t6, where t1 is the time to acquire the target image, t6 is the time of each prediction, and n is the number of prediction iterations; the time T2 to obtain the target position using the second thread is T2 = t1 + t2, where t2 is the time for the human hand detection module to detect the target image. The time to obtain the target position of the human hand in the target image is therefore at most T2, and in practice it will not reach T2 in most cases. Because the two threads run simultaneously, on one hand the gesture recognition time is shortened as much as possible, improving gesture control efficiency; on the other hand, the continuity of system operation is ensured, giving the user a smooth experience.
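A minimal sketch of the two-thread race described above; the Event-based cancellation and all names are illustrative assumptions, not the patent's reference implementation:

```python
# Minimal sketch: the first thread predicts the target position in a loop,
# adjusting the predictor between attempts (total time roughly t1 + n*t6);
# the second thread runs the human hand detection module (roughly t1 + t2).
# Whichever finishes first cancels the other.
import threading

def locate_hand(frame, predict_once, adjust_predictor, detect_hand):
    result = {}
    done = threading.Event()

    def predictor():  # first thread: iterative position prediction
        while not done.is_set():
            position = predict_once(frame)
            if position is not None:
                result.setdefault("position", position)
                done.set()          # target position found: stop detection
                return
            adjust_predictor()      # e.g. raise degree of blur, widen range

    def detector():  # second thread: human hand detection module
        position = detect_hand(frame)
        if position is not None:
            result.setdefault("position", position)
        done.set()                  # stop the prediction loop either way

    threads = [threading.Thread(target=predictor),
               threading.Thread(target=detector)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result.get("position")   # None means no hand in the target image
```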
Illustratively, the following description takes an image sequence comprising 30 frames as an example.
In the 30 frames, a human hand appears starting from the 10th frame and disappears at the 25th frame.
For the 1st frame, since there is no image before it, the change trend of the human hand is determined to be none. Likewise, since there is no preceding image, the target position of the human hand in the 1st frame cannot be predicted from the position and change trend of the human hand in a preceding image. The 1st frame is therefore input to the human hand detection module, and the detection result is: no human hand.
For the 2nd to 9th frames, the process is similar to that of the 1st frame; refer to the description of the 1st frame, which is not repeated here.
For the 10th frame, since the images before it (the 1st to 9th frames) contain no human hand, the change trend of the human hand over the 1st to 9th frames is determined to be none. And since the 9th frame contains no human hand, the target position of the human hand in the 10th frame cannot be predicted from the position and change trend of the human hand in the 9th frame. The 10th frame is therefore input to the human hand detection module to obtain the target position of the human hand in the 10th frame. The image within the target area of the 10th frame, determined from the target position, is input to the human hand feature recognition module to obtain the feature information of the human hand in the 10th frame; the feature information is recognized to obtain the corresponding gesture control command, and the gesture control command is executed.
For the 11th frame, since the target position of the human hand in the 10th frame has been determined, the change trend of the human hand is determined from the 10th frame, and the target position of the human hand in the 11th frame is predicted from the position of the human hand in the 10th frame and the change trend. The image within the target area of the 11th frame, determined from the target position, is input to the human hand feature recognition module to obtain the feature information of the human hand in the 11th frame; the feature information is recognized to obtain the corresponding gesture control command, and the gesture control command is executed. The target position of the human hand in the 11th frame is then corrected using this feature information to obtain the corrected target position.
If the image within the predicted target area of the 11th frame yields no feature information from the human hand feature recognition module, the 11th frame may be input to the human hand detection module to obtain the target position of the human hand in the 11th frame. The image within the target area determined from that position is then input to the human hand feature recognition module to obtain the feature information; the feature information is recognized to obtain the corresponding gesture control command, and the gesture control command is executed.
Alternatively, if the image within the predicted target area of the 11th frame yields no feature information from the human hand feature recognition module, the model parameters of the position prediction model, such as the prediction fuzziness and the prediction range, may be adjusted, and the target position of the human hand in the 11th frame re-predicted according to the adjusted parameters, the position of the human hand in the 10th frame, and the change trend, until the preset condition is met.
Alternatively, while the target position of the human hand in the 11th frame is being predicted from the position and change trend of the human hand in the 10th frame, the human hand detection module may detect the target area of the human hand in the 11th frame in parallel. When the target position has been predicted and the human hand detection module has not yet output a target area, the detection is stopped; when the human hand detection module has output a target area and the target position has not yet been predicted, the prediction is stopped.
For the 12th to 30th frames, the process is similar to that of the 11th frame; refer to the description of the 11th frame, which is not repeated here.
In addition, for any one of the 25th to 30th frames, when the human hand feature recognition module cannot recognize feature information of the human hand from the image within the predicted target area of that frame, and inputting the frame to the human hand detection module detects no hand, there is no human hand in that frame. From that frame to the 30th frame, the process is similar to that of the 1st frame; refer to the description of the 1st frame, which is not repeated here.
In the control method provided in the embodiments of the present application, the execution subject may be a control device, or a control module within the control device for executing the control method. In the embodiments of the present application, a control device executing the control method is taken as an example to describe the control device provided in the embodiments of the present application.
Fig. 2 is a schematic structural diagram of a control device according to an embodiment of the present application. The control device 200 may include:
an acquisition module 201, configured to acquire an image sequence comprising a gesture;
a first determining module 202, configured to determine, for a target image in the image sequence, a change trend of the human hand from the images preceding the target image;
a prediction module 203, configured to predict a target position of the human hand in the target image according to the change trend and the position of the human hand in the image immediately preceding the target image;
a second determining module 204, configured to determine a target area of the human hand in the target image according to the target position;
a first recognition module 205, configured to recognize the target area;
a second recognition module 206, configured to, in the case that feature information of the human hand is obtained by recognizing the target area, recognize the feature information to obtain a gesture control command corresponding to the feature information;
and an execution module 207, configured to execute the gesture control command.
In the embodiments of the application, for an image in an image sequence comprising a gesture, a change trend of the human hand is determined from the images preceding that image; a target position of the human hand in the image is predicted according to the change trend and the position of the human hand in the immediately preceding image; a target area of the human hand in the image is determined according to the target position; the target area is recognized; in the case that feature information of the human hand is obtained by recognizing the target area, the feature information is recognized to obtain the corresponding gesture control command; and the gesture control command is executed to control the corresponding controlled object. Because the target position of the human hand in the image is predicted from the change trend and the position of the human hand in the preceding image, predicting the position takes less time than determining it with the human hand detection module. The speed of determining the position of the human hand in the image can therefore be increased, and gesture control efficiency improved.
In some possible implementations of the embodiments of the present application, the control device 200 further includes:
and the correction module is used for correcting the target position by utilizing the characteristic information, wherein the corrected target position is used for predicting the position of the human hand in a subsequent image adjacent to the target image.
In the embodiments of the application, correcting the position of the hand in the image improves the accuracy of subsequent predictions of the hand's position, which in turn improves the accuracy of gesture control.
In some possible implementations of the embodiments of the present application, the control device 200 further includes:
the first detection module is configured to detect a target area of the human hand in the target image and trigger the first recognition module 205 when the feature information of the human hand is not obtained in the recognition target area.
In the embodiments of the application, when the target position of the human hand in the target image is predicted from the position and change trend of the human hand in the image immediately preceding the target image, and recognizing the target area determined from the target position yields no feature information of the human hand, the target image is input to the human hand detection module to obtain the target area of the human hand in the target image; the target area is then recognized to obtain the feature information, the feature information is recognized to obtain the corresponding gesture control command, and the gesture control command is executed to control the corresponding controlled object. This can improve the accuracy of gesture control.
In some possible implementations of the embodiments of the present application, the control device 200 further includes:
the adjusting module is used for adjusting model parameters of a position prediction model for predicting the target position under the condition that the characteristic information of the human hand is not obtained in the target area;
the prediction module 203 is further configured to:
and predicting the target position of the human hand in the target image again according to the adjusted model parameters, the position of the human hand in the previous image adjacent to the target image and the change trend until the preset condition is met.
In the embodiments of the application, when the target position of the human hand in the target image is predicted from the position and change trend of the human hand in the image immediately preceding the target image, and recognizing the target area determined from the target position yields no feature information of the human hand, the model parameters of the position prediction model are adjusted and the target position is re-predicted according to the adjusted parameters, which can improve the accuracy of gesture control.
In some possible implementations of the embodiments of the present application, the control device 200 further includes:
the second detection module is used for detecting a target area of the human hand in the target image in parallel with predicting the target position of the human hand in the target image;
the first stopping module is used for stopping predicting the target position of the human hand in the target image under the condition that the target area is detected and the target position is not predicted;
and the second stopping module is used for stopping detecting the target area of the human hand in the target image under the condition that the target position is predicted and the target area is not detected.
In the embodiments of the application, on one hand, the gesture recognition time can be shortened as much as possible and gesture control efficiency improved; on the other hand, the continuity of system operation can be ensured, giving the user a smooth experience.
The control device in the embodiments of the present application may be a standalone device, or a component, integrated circuit, or chip in a terminal. The device may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a Personal Digital Assistant (PDA); the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine, or a self-service machine. The embodiments of the present application are not specifically limited in this respect.
The control device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system (Android), an iOS operating system, or other possible operating systems, which is not specifically limited in the embodiments of the present application.
The control device provided in the embodiment of the present application can implement each process in the control method embodiment of Fig. 1, and is not described here again to avoid repetition.
Optionally, as shown in Fig. 3, an embodiment of the present application further provides an electronic device 300, which includes a processor 301, a memory 302, and a program or instructions stored in the memory 302 and executable on the processor 301. The program or instructions, when executed by the processor 301, implement each process of the control method embodiment and can achieve the same technical effect; the details are not repeated here to avoid repetition.
It should be noted that the electronic devices in the embodiments of the present application include the mobile electronic devices and the non-mobile electronic devices described above.
In some possible implementations of the embodiments of the present application, the processor 301 may include a Central Processing Unit (CPU) or an Application-Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
In some possible implementations of the embodiments of the present application, the memory 302 may include Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, or other electrical, optical, or physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions, and when the software is executed (e.g., by one or more processors), it is operable to perform the operations described with reference to the control methods according to the embodiments of the application.
Fig. 4 is a hardware configuration diagram of an electronic device implementing an embodiment of the present application.
The electronic device 400 includes, but is not limited to: a radio frequency unit 401, a network module 402, an audio output unit 403, an input unit 404, a sensor 405, a display unit 406, a user input unit 407, an interface unit 408, a memory 409, and a processor 410.
Those skilled in the art will appreciate that the electronic device 400 may further include a power source (e.g., a battery) for supplying power to the various components; the power source may be logically connected to the processor 410 through a power management system, so as to manage charging, discharging, and power consumption through the power management system. The electronic device structure shown in Fig. 4 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than shown, combine some components, or arrange components differently, which is not described further here.
Wherein the processor 410 is configured to: acquire an image sequence comprising a gesture; for a target image in the image sequence, determine a change trend of the human hand from the images preceding the target image; predict a target position of the human hand in the target image according to the change trend and the position of the human hand in the image immediately preceding the target image; determine a target area of the human hand in the target image according to the target position; recognize the target area; in the case that feature information of the human hand is obtained by recognizing the target area, recognize the feature information to obtain a gesture control command corresponding to the feature information; and execute the gesture control command.
In the embodiments of the application, for an image in an image sequence comprising a gesture, a change trend of the human hand is determined from the images preceding that image; a target position of the human hand in the image is predicted according to the change trend and the position of the human hand in the immediately preceding image; a target area of the human hand in the image is determined according to the target position; the target area is recognized; in the case that feature information of the human hand is obtained by recognizing the target area, the feature information is recognized to obtain the corresponding gesture control command; and the gesture control command is executed to control the corresponding controlled object. Because the target position of the human hand in the image is predicted from the change trend and the position of the human hand in the preceding image, predicting the position takes less time than determining it with the human hand detection module. The speed of determining the position of the human hand in the image can therefore be increased, and gesture control efficiency improved.
In some possible implementations of embodiments of the present application, the processor 410 is further configured to:
and correcting the target position using the feature information, wherein the corrected target position is used for predicting the position of the human hand in a subsequent image adjacent to the target image.
In the embodiment of the application, the position of the hand in the image is corrected, so that the accuracy of subsequent prediction of the position of the hand can be improved, and the accuracy of gesture control is further improved.
In some possible implementations of embodiments of the present application, the processor 410 is further configured to:
and in the case that the characteristic information of the human hand is not obtained in the recognition of the target area, detecting the target area of the human hand in the target image.
In the embodiment of the application, when the target position of a human hand in a target image is predicted according to the position and the variation trend of the human hand in a previous image adjacent to the target image, and the characteristic information of the human hand is not recognized in a target area of the human hand determined by recognizing the target position in the target image, the target area of the human hand in the target image is obtained by inputting the target image into a human hand detection module, the target area is further recognized, the characteristic information of the human hand is obtained, the characteristic information is recognized, a gesture control command corresponding to the characteristic information is obtained, and the gesture control command is executed to control a corresponding controlled object. The accuracy of gesture control can be improved.
In some possible implementations of embodiments of the present application, the processor 410 is further configured to:
adjust the model parameters of the position prediction model used to predict the target position, in the case that recognizing the target area does not yield feature information of the human hand;
and re-predict the target position of the human hand in the target image according to the adjusted model parameters, the change trend, and the position of the human hand in the image immediately preceding the target image, until the preset condition is met.
In the embodiments of the application, when the target position of the human hand in the target image is predicted from the position and change trend of the human hand in the image immediately preceding the target image, and recognizing the target area determined from the target position yields no feature information of the human hand, the model parameters of the position prediction model are adjusted and the target position is re-predicted according to the adjusted parameters, which can improve the accuracy of gesture control.
In some possible implementations of embodiments of the present application, the processor 410 is further configured to:
the second detection module is used for detecting a target area of the human hand in the target image in parallel with predicting the target position of the human hand in the target image;
the first stopping module is used for stopping predicting the target position of the human hand in the target image under the condition that the target area is detected and the target position is not predicted;
and the second stopping module is used for stopping detecting the target area of the human hand in the target image under the condition that the target position is predicted and the target area is not detected.
In the embodiments of the application, on one hand, the gesture recognition time can be shortened as much as possible and gesture control efficiency improved; on the other hand, the continuity of system operation can be ensured, giving the user a smooth experience.
It should be understood that, in the embodiments of the present application, the input unit 404 may include a Graphics Processing Unit (GPU) 4041 and a microphone 4042; the graphics processor 4041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 406 may include a display panel 4061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 407 includes a touch panel 4071 and other input devices 4072. The touch panel 4071, also referred to as a touch screen, may include two parts: a touch detection device and a touch controller. Other input devices 4072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described in detail here. The memory 409 may be used to store software programs as well as various data, including but not limited to application programs and an operating system. The processor 410 may integrate an application processor, which primarily handles the operating system, user interfaces, and applications, and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor may alternatively not be integrated into the processor 410.
An embodiment of the present application further provides a computer-readable storage medium storing a program or instructions which, when executed by a processor, implement the processes of the above control method embodiment and achieve the same technical effect. To avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the above embodiment. Examples of the computer-readable storage medium include non-transitory computer-readable storage media such as a ROM, a RAM, a magnetic disk, and an optical disk.
An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the processes of the above control method embodiment and achieve the same technical effect; to avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-on-chip, a system chip, a chip system, or a system-on-a-chip.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ……" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed; the functions may also be performed in a substantially simultaneous manner or in a reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can certainly also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application may be embodied in the form of a computer software product, which is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present application.
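For orientation only, the following sketch strings the claimed steps together in software, in line with the software implementation just described. Every helper it takes (detect_hand, estimate_trend, recognize_region, command_for, execute) is a hypothetical placeholder rather than anything specified by this application.

```python
# A minimal end-to-end sketch of the claimed control flow: detect the hand
# once to bootstrap, then predict each new position from the previous one
# plus the change trend, recognize only the predicted region, and execute
# the gesture command that the recognized features map to.
def control_loop(image_sequence, predictor, detect_hand, estimate_trend,
                 recognize_region, command_for, execute):
    previous_images = []
    last_pos = None
    for target_image in image_sequence:
        if last_pos is None:
            # Bootstrap frame: no prior position, so detect in the full image.
            last_pos = detect_hand(target_image)
        else:
            trend = estimate_trend(previous_images)      # change trend so far
            target_pos = predictor.predict(last_pos, trend)
            features = recognize_region(target_image, target_pos)
            if features is not None:
                execute(command_for(features))           # gesture control command
                last_pos = target_pos                    # basis for the next frame
        previous_images.append(target_image)
```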
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (12)

1. A control method, characterized in that the method comprises:
acquiring a sequence of images comprising a gesture;
for a target image in the image sequence, determining the change trend of the human hand in the image before the target image according to the image before the target image;
predicting a target position of the human hand in the target image according to the position of the human hand in a previous image adjacent to the target image and the variation trend;
determining a target area of the human hand in the target image according to the target position;
identifying the target area;
under the condition that the characteristic information of the human hand is obtained by identifying the target area, identifying the characteristic information to obtain a gesture control command corresponding to the characteristic information;
and executing the gesture control command.
2. The method of claim 1, wherein after the characteristic information of the human hand is obtained by identifying the target area, the method further comprises:
correcting the target position using the characteristic information, wherein the corrected target position is used to predict a position of the human hand in a subsequent image adjacent to the target image.
3. The method of claim 1, further comprising:
under the condition that the characteristic information of the human hand is not obtained by identifying the target area, detecting the target area of the human hand in the target image; and continuing to execute the step of identifying the target area to obtain the characteristic information of the human hand.
4. The method of claim 1, further comprising:
under the condition that the characteristic information of the human hand is not obtained by identifying the target area, adjusting model parameters of a position prediction model for predicting the target position;
and according to the adjusted model parameters, the position of the human hand in the previous image adjacent to the target image and the change trend, predicting the target position of the human hand in the target image again until a preset condition is met.
5. The method of claim 1, further comprising:
performing detection of a target region of the human hand in the target image in parallel with the predicting of the target position of the human hand in the target image;
stopping the predicting of the target position of the human hand in the target image if the target area is detected and the target position is not predicted;
and stopping detecting the target area of the human hand in the target image when the target position is predicted and the target area is not detected.
6. A control device, characterized in that the device comprises:
an acquisition module for acquiring an image sequence comprising a gesture;
the first determination module is used for determining the change trend of the human hand in the image before the target image according to the image before the target image aiming at the target image in the image sequence;
the prediction module is used for predicting the target position of the human hand in the target image according to the position of the human hand in the previous image adjacent to the target image and the change trend;
the second determination module is used for determining a target area of the human hand in the target image according to the target position;
the first identification module is used for identifying the target area;
the second identification module is used for identifying the characteristic information under the condition that the characteristic information of the human hand is obtained by identifying the target area, and obtaining a gesture control command corresponding to the characteristic information;
and the execution module is used for executing the gesture control command.
7. The apparatus of claim 6, further comprising:
a correction module for correcting the target position using the feature information, wherein the corrected target position is used for predicting a position of the human hand in a subsequent image adjacent to the target image.
8. The apparatus of claim 6, further comprising:
the first detection module is used for detecting the target area of the human hand in the target image and triggering the first recognition module under the condition that the characteristic information of the human hand is not obtained in the recognition of the target area.
9. The apparatus of claim 6, further comprising:
the adjusting module is used for adjusting model parameters of a position prediction model for predicting the target position under the condition that the characteristic information of the human hand is not obtained by identifying the target area;
the prediction module is further to:
and according to the adjusted model parameters, the position of the human hand in the previous image adjacent to the target image and the change trend, predicting the target position of the human hand in the target image again until a preset condition is met.
10. The apparatus of claim 6, further comprising:
a second detection module for performing detection of a target region of the human hand in the target image in parallel with the prediction of the target position of the human hand in the target image;
a first stopping module, configured to stop the prediction of the target position of the human hand in the target image if the target area is detected and the target position is not predicted;
and the second stopping module is used for stopping detecting the target area of the human hand in the target image under the condition that the target position is predicted and the target area is not detected.
11. An electronic device, characterized in that the electronic device comprises: a processor, a memory, and a program or instructions stored on the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the control method according to any one of claims 1 to 5.
12. A computer-readable storage medium, on which a program or instructions are stored, which, when executed by a processor, implement the steps of the control method according to any one of claims 1 to 5.
CN202110807879.0A 2021-07-16 2021-07-16 Control method, apparatus, device and medium Pending CN113642413A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110807879.0A CN113642413A (en) 2021-07-16 2021-07-16 Control method, apparatus, device and medium

Publications (1)

Publication Number Publication Date
CN113642413A true CN113642413A (en) 2021-11-12

Family

ID=78417588

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110807879.0A Pending CN113642413A (en) 2021-07-16 2021-07-16 Control method, apparatus, device and medium

Country Status (1)

Country Link
CN (1) CN113642413A (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160085310A1 (en) * 2014-09-23 2016-03-24 Microsoft Corporation Tracking hand/body pose
CN104992171A (en) * 2015-08-04 2015-10-21 易视腾科技有限公司 Method and system for gesture recognition and man-machine interaction based on 2D video sequence
CN107169411A (en) * 2017-04-07 2017-09-15 南京邮电大学 A kind of real-time dynamic gesture identification method based on key frame and boundary constraint DTW
US20190354194A1 (en) * 2017-12-22 2019-11-21 Beijing Sensetime Technology Development Co., Ltd Methods and apparatuses for recognizing dynamic gesture, and control methods and apparatuses using gesture interaction
CN108229391A (en) * 2018-01-02 2018-06-29 京东方科技集团股份有限公司 Gesture identifying device and its server, gesture recognition system, gesture identification method
US20190204930A1 (en) * 2018-01-02 2019-07-04 Boe Technology Group Co., Ltd. Gesture recognition device, gesture recognition method, and gesture recognition system
CN109255324A (en) * 2018-09-05 2019-01-22 北京航空航天大学青岛研究院 Gesture processing method, interaction control method and equipment
CN111382644A (en) * 2018-12-29 2020-07-07 Tcl集团股份有限公司 Gesture recognition method and device, terminal equipment and computer readable storage medium
CN112489077A (en) * 2019-09-12 2021-03-12 阿里巴巴集团控股有限公司 Target tracking method and device and computer system
CN111860346A (en) * 2020-07-22 2020-10-30 苏州臻迪智能科技有限公司 Dynamic gesture recognition method and device, electronic equipment and storage medium
CN112527113A (en) * 2020-12-09 2021-03-19 北京地平线信息技术有限公司 Method and apparatus for training gesture recognition and gesture recognition network, medium, and device

Similar Documents

Publication Publication Date Title
CN108960163B (en) Gesture recognition method, device, equipment and storage medium
CN108596092B (en) Gesture recognition method, device, equipment and storage medium
KR102230630B1 (en) Rapid gesture re-engagement
US20130141326A1 (en) Gesture detecting method, gesture detecting system and computer readable storage medium
US8965051B2 (en) Method and apparatus for providing hand detection
JP7181375B2 (en) Target object motion recognition method, device and electronic device
GB2589996A (en) Multiple Face Tracking Method For Facial Special Effect, Apparatus And Electronic Device
CN102103457B (en) Briefing operating system and method
CN112364799A (en) Gesture recognition method and device
CN112507918A (en) Gesture recognition method
KR20160079531A (en) Method and apparatus for processing gesture input
CN113655929A (en) Interface display adaptation processing method and device and electronic equipment
CN104035714A (en) Event processing method, device and equipment based on Android system
CN103679130B (en) Hand method for tracing, hand tracing equipment and gesture recognition system
CN112783406B (en) Operation execution method and device and electronic equipment
EP3696715A1 (en) Pose recognition method and device
US20170085784A1 (en) Method for image capturing and an electronic device using the method
CN107977147B (en) Sliding track display method and device
EP4307087A1 (en) Gesture recognition method and apparatus, device, and medium
CN113642413A (en) Control method, apparatus, device and medium
CN112788244B (en) Shooting method, shooting device and electronic equipment
CN114648556A (en) Visual tracking method and device and electronic equipment
CN114089868A (en) Touch operation method and device and electronic equipment
CN113253884A (en) Touch method, touch device and electronic equipment
CN114255513A (en) Gesture recognition method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination