WO2022040994A1 - 手势识别方法及装置 (Gesture recognition method and device) - Google Patents

手势识别方法及装置 (Gesture recognition method and device)

Info

Publication number
WO2022040994A1
WO2022040994A1 · PCT/CN2020/111493
Authority
WO
WIPO (PCT)
Prior art keywords
area
gesture
target
candidate
joint point
Prior art date
Application number
PCT/CN2020/111493
Other languages
English (en)
French (fr)
Inventor
聂谷洪
施泽浩
王帅
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2020/111493 priority Critical patent/WO2022040994A1/zh
Priority to CN202080006664.2A priority patent/CN113168533A/zh
Publication of WO2022040994A1 publication Critical patent/WO2022040994A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm

Definitions

  • the present application relates to the technical field of image processing, and in particular, to a gesture recognition method and device.
  • gesture interaction may take place in a multi-person scene.
  • the same picture can include multiple people, and multiple people correspond to multiple target areas.
  • a target area associated with the gesture area may also be determined as required, so as to know the target object to which the gesture belongs.
  • the target object in the target area closest to the gesture area is determined as the target object to which the gesture belongs.
  • the embodiments of the present application provide a gesture recognition method and device, which are used to solve the problem in the prior art that the manner of determining gesture attribution leads to wrong determinations in many scenarios.
  • an embodiment of the present application provides a gesture recognition method, including:
  • the human body joint point distribution map including a first joint point and a second joint point
  • the target region associated with the gesture region is determined according to the target region corresponding to the second joint point in the target human body joint point distribution map.
  • an embodiment of the present application provides a gesture recognition method, the method further comprising:
  • a target area associated with the gesture area is determined in the candidate area.
  • an embodiment of the present application provides a gesture recognition device, including: a memory and a processor;
  • the memory for storing program codes
  • the processor calls the program code, and when the program code is executed, is configured to perform the following operations:
  • the human body joint point distribution map including a first joint point and a second joint point
  • the target region associated with the gesture region is determined according to the target region corresponding to the second joint point in the target human body joint point distribution map.
  • an embodiment of the present application provides a gesture recognition device, including: a memory and a processor;
  • the memory for storing program codes
  • the processor calls the program code, and when the program code is executed, is configured to perform the following operations:
  • a target area associated with the gesture area is determined in the candidate area.
  • an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, the computer program includes at least one piece of code, and the at least one piece of code can be executed by a computer to control the computer to perform the method according to any one of the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, the computer program includes at least one piece of code, and the at least one piece of code can be executed by a computer to control the computer to perform the method according to any one of the second aspect.
  • an embodiment of the present application provides a computer program, which, when the computer program is executed by a computer, is used to implement the method described in any one of the first aspect.
  • an embodiment of the present application provides a computer program, which is used to implement the method according to any one of the second aspect when the computer program is executed by a computer.
  • gesture attribution is determined based on the target human body joint point distribution map. Since the target human body joint point distribution map is credible and includes both the first joint point and the second joint point, the target object to which the gesture belongs can be better determined from it. Compared with the conventional technique of directly using, among the multiple target areas, the target area closest to the gesture area as the target area associated with the gesture area, this can reduce errors in determining gesture attribution.
  • FIG. 1 is a schematic diagram of an application scenario of a gesture recognition method provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a gesture recognition method provided by an embodiment of the present application
  • FIG. 3 is a schematic diagram of a gesture area and a target area in an image provided by an embodiment of the present application
  • FIG. 4 is a schematic diagram of a target human body joint point provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of positions of multiple joint points in an image provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a gesture recognition method provided by another embodiment of the present application.
  • FIG. 7 is a schematic diagram of determining a first candidate region based on a dividing line according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of determining a first candidate region based on a dividing line according to another embodiment of the present application.
  • FIG. 9 is a schematic diagram of determining a first candidate region based on a dividing line according to another embodiment of the present application.
  • FIG. 10 is a schematic flowchart of a gesture recognition method provided by another embodiment of the present application.
  • FIG. 11 is a schematic flowchart of a gesture recognition method provided by another embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a gesture recognition device provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a gesture recognition apparatus provided by another embodiment of the present application.
  • the gesture recognition method provided in this embodiment of the present application may be applied to the application scenario shown in FIG. 1 , and the application scenario may include an image acquisition device 11 and a gesture recognition device 12 .
  • the image capturing device 11 is used for capturing images, and the image capturing device may be, for example, a camera or the like.
  • the gesture recognition device 12 may process the image collected by the image acquisition device 11 by using the gesture recognition method provided in the embodiment of the present application to determine the target object to which the gesture in the image belongs.
  • the image acquisition device 11 and the gesture recognition device 12 may be integrated into the same device, which may be, for example, a movable device provided with a camera, such as a handheld pan/tilt, a drone, and the like.
  • the device may also be other types of devices, which are not limited in this application.
  • the image capture device 11 and the gesture recognition device 12 may be located in different devices, for example, the image capture device 11 may be located in a terminal, and the gesture recognition device 12 may be located in a server.
  • the image acquisition device 11 and the gesture recognition device 12 may also be located in other types of devices, which are not limited in this application.
  • in some multi-person scenarios, such as tracking, photographing, or video recording, it is necessary to determine the target object to which a gesture belongs, so that that object can be tracked, photographed, or recorded.
  • the area where the gesture is located in the image is hereinafter referred to as the gesture area, and the areas where the multiple targets are located are referred to as target areas.
  • the target area associated with the gesture area can be determined from among the multiple target areas.
  • it should be noted that the target object in the target area associated with the gesture area is the target object to which the gesture in the gesture area belongs.
  • the target may be a human face, a head and shoulders, a body, or the like.
  • the target area closest to the gesture area among the multiple target areas is directly used as the target area associated with the gesture area.
  • this method determines gesture attribution incorrectly in many scenarios. For example, suppose one person is standing behind another in the image, the faces of both are recognized as targets, and the person standing in front makes a gesture. If the distance between that person's face area a1 and the gesture area is greater than the distance between the other person's face area b1 and the gesture area, then face area b1 will be directly used as the target area associated with the gesture area because its distance to the gesture area is smaller. The gesture made by the person standing in front is thus misrecognized as belonging to the face of the person standing behind.
  • in gesture interaction applications such as automatic following, if the wrong following target is determined, a movable platform such as a gimbal or an unmanned vehicle will adjust the pose of its camera according to the movement of the wrong target, so the wrong target appears in the picture, giving the user a bad following experience.
  • another example is a photographing application: if the wrong photographing target is determined, the photographing device will focus incorrectly on the wrong target, and the target that actually needs to be photographed will appear out of focus and blurred, giving the user a bad photographing experience.
  • in the gesture recognition method provided by the embodiments of the present application, the positions of multiple joint points in the image are determined, and the gesture area and multiple target areas in the image are identified. Based on the positions of the multiple joint points, at least one human body joint point distribution map is determined, each including a first joint point located in the gesture area and a second joint point located in a target area, and it is then determined whether, among the at least one human body joint point distribution map, there is a target human body joint point distribution map that satisfies a preset condition.
  • FIG. 2 is a schematic flowchart of a gesture recognition method provided by an embodiment of the present application.
  • the execution body of this embodiment may be the gesture recognition apparatus 12 in FIG. 1 , and may specifically be a processor of the gesture recognition apparatus 12 .
  • the method of this embodiment may include:
  • Step 21: Determine the positions of a plurality of joint points in the image, and identify a gesture area and a plurality of target areas in the image, where the plurality of joint points include a first joint point located in the gesture area and a second joint point located in a target area.
  • the target area may include a face area, a head and shoulder area, or a body area.
  • the face area may include a face frame
  • the head and shoulders area may include a head and shoulders frame
  • the body area may include a body frame.
  • the gesture area may include a gesture frame. Taking the target areas as face frames and the gesture area as a gesture frame as an example, the gesture area and the multiple target areas in the recognized image can be as shown in FIG. 3, where the dashed frames A1 and A2 are face frames and the dashed frame A3 is the gesture frame. Note that FIG. 3 takes two targets as an example, and that the specific manner of identifying the gesture area and the target areas in the image is not limited in this embodiment of the present application.
  • a preset joint point detection algorithm may be used to perform joint point detection processing on the entire image to determine the positions of multiple joint points in the image.
  • the preset joint point detection algorithm may be used to detect joint points of multiple preset joint types.
  • the plurality of preset joint types at least include a joint type of the hand, a joint type of the head, and joint types of the body parts connecting the head and the hand.
  • the joint types of the hand may include the wrist joint. Considering that finger joints are numerous and difficult to identify, using the wrist joint can simplify the implementation.
  • Types of joints in the head may include head joints and/or cervical spine joints.
  • the types of joints used to connect the body parts of the head and hands may include shoulder joints and elbow joints.
  • the joint point distribution of the lower body can also be obtained.
  • the multiple preset joint types may also include lumbar joints, knee joints, ankle joints, and the like.
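As an illustrative sketch only (the application specifies no code, and these names are hypothetical), the preset joint types described above could be modeled as an enumeration, with the upper-body subset used for gesture attribution:

```python
from enum import Enum

class JointType(Enum):
    # Hand: the wrist stands in for the many, hard-to-detect finger joints
    WRIST = "wrist"
    # Head joints
    HEAD = "head"
    NECK = "neck"        # cervical spine joint
    # Joints connecting the head and the hand through the body
    SHOULDER = "shoulder"
    ELBOW = "elbow"
    # Optional lower-body joints
    WAIST = "waist"
    KNEE = "knee"
    ANKLE = "ankle"

# The five types needed to link a gesture (wrist) back to a head/face target
UPPER_BODY_TYPES = [JointType.HEAD, JointType.NECK, JointType.SHOULDER,
                    JointType.ELBOW, JointType.WRIST]
```

A detector configured with only `UPPER_BODY_TYPES` would skip the optional lower-body joints, matching the simplification discussed above.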
  • a preset joint point detection algorithm may be used to perform joint point detection processing on a part of the image, so as to determine the positions of multiple joint points in the image.
  • compared with joint point detection on the entire image, joint point detection on a partial area is beneficial for reducing the amount of calculation and saving computing resources.
  • when the target area is a face area (or a head and shoulder area), the partial area may be larger than the target area, so as to capture the joint points between the hand and the head (or the head and shoulders).
  • the method provided by the embodiment of the present application may further include: determining a region of interest according to the identified target region; and detecting the plurality of joint points in the region of interest.
  • the region of interest is the aforementioned partial region.
  • the area of interest may be determined by performing area expansion on the basis of the target area.
  • when the target area covers more joint types than the preset joint types, the area of interest may be determined by shrinking the target area.
  • the target area may be directly used as the area of interest.
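The region-of-interest expansion described above can be sketched as follows. This is a minimal illustration under assumed conventions (boxes as `(x1, y1, x2, y2)` tuples, a scale factor around the box center); the function name and signature are hypothetical, not from the application:

```python
def expand_region(box, scale, img_w, img_h):
    """Expand a (x1, y1, x2, y2) target box by `scale` around its center,
    clamped to the image bounds, to form a region of interest for
    joint point detection."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = (x2 - x1) * scale, (y2 - y1) * scale
    return (max(0, cx - w / 2), max(0, cy - h / 2),
            min(img_w, cx + w / 2), min(img_h, cy + h / 2))
```

With `scale > 1` the ROI grows (e.g. from a face box toward the shoulders and arms); with `scale < 1` it shrinks, as in the case where the target area already covers more joint types than needed; `scale == 1` uses the target area directly.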
  • the type of the first joint point may be a wrist joint.
  • the type of the second joint point may be one or more of a head joint, a cervical vertebra joint, a shoulder joint, and an elbow joint.
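A simple way to classify detected joints into first joint points (in the gesture area) and second joint points (in a target area) is a point-in-box test, sketched below. The string type labels and function names are illustrative assumptions, not from the application:

```python
def inside(box, pt):
    """True if point (x, y) lies inside box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    x, y = pt
    return x1 <= x <= x2 and y1 <= y <= y2

def split_joints(joints, gesture_box, target_boxes):
    """joints: list of (type, (x, y)) detections.
    Returns (first_joints, second_joints): wrists inside the gesture
    area, and head/neck/shoulder/elbow joints inside any target area."""
    first = [j for j in joints
             if j[0] == "wrist" and inside(gesture_box, j[1])]
    second = [j for j in joints
              if j[0] in ("head", "neck", "shoulder", "elbow")
              and any(inside(b, j[1]) for b in target_boxes)]
    return first, second
```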
  • Step 22 Determine at least one human body joint point distribution map based on the positions of the plurality of joint points in the image, where the human body joint point distribution map includes a first joint point and a second joint point.
  • the at least one human body joint point distribution map is a connection diagram obtained by connecting the joint points in the image, and includes a first joint point located in the gesture area and a second joint point located in a target area.
  • the at least one human body joint point distribution map may include a credible human body joint point distribution map and/or an unreliable human body joint point distribution map.
  • only a credible human body joint point distribution map should be used for gesture attribution recognition. Based on this, after the at least one human body joint point distribution map is determined in step 22, step 23 may be performed.
  • Step 23 From the at least one human body joint point distribution map, determine whether there is a target human body joint point distribution map that satisfies a preset condition.
  • the target human body joint point distribution map that meets the preset conditions can be understood as a credible human body joint point distribution map. Since the target human body joint point distribution map is credible and includes the first joint point and the second joint point, step 24 may be further performed to determine gesture attribution based on the target human body joint point distribution map.
  • the obtained target human body joint point distribution map can be as shown in FIG. 4. Its joint points include joint point 1 to joint point 8, where joint point 1 is the head joint, joint point 2 is the neck joint, joint points 3 and 4 are the shoulder joints, joint points 5 and 6 are the elbow joints, and joint points 7 and 8 are the wrist joints.
  • here, the multiple preset joint types include the head joint, neck joint, shoulder joint, elbow joint, and wrist joint as an example.
  • when no target human body joint point distribution map satisfies the preset condition, the at least one human body joint point distribution map is unreliable; a gesture attribution determined from an unreliable human body joint point distribution map must also be unreliable, so the distribution maps cannot be used to determine gesture attribution.
  • in this case, the closest distance principle may be used instead to determine gesture attribution, that is, among the multiple target areas, the target area closest to the gesture area is used as the target area associated with the gesture area.
  • in this way, the target object to which the gesture belongs is determined from a credible human body joint point distribution map whenever one including the first joint point and the second joint point can be obtained, and the closest distance principle is used only in the special case where no such credible distribution map exists. Therefore, compared with always using the closest distance principle as in the conventional technology, this method can reduce errors in determining gesture attribution.
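The closest-distance fallback described above can be sketched as follows. This is a minimal illustration (center-to-center distance between axis-aligned boxes; the function names and box format are assumptions, not from the application):

```python
import math

def center(box):
    """Center point of a (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def nearest_target(gesture_box, target_boxes):
    """Fallback closest-distance principle: return the target box whose
    center is closest to the gesture box center."""
    gc = center(gesture_box)
    return min(target_boxes, key=lambda b: math.dist(gc, center(b)))
```

As the failure case earlier in the document shows, this fallback can attribute a gesture to the wrong person when two people overlap, which is precisely why the method prefers the joint point distribution map when a credible one exists.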
  • Step 24 when there is a target human body joint point distribution map that satisfies the preset condition, according to the target area corresponding to the second joint point in the target human body joint point distribution map, determine the target associated with the gesture area. area.
  • the number of target human body joint point distribution maps may be one. In that case, determining the target area associated with the gesture area according to the target area corresponding to the second joint point in the target human body joint point distribution map may specifically include: determining the target area corresponding to the second joint point in the target human body joint point distribution map as the target area associated with the gesture area.
  • taking FIG. 4 as an example, the target area corresponding to the second joint point (i.e., joint point 1) in the target human body joint point distribution map is the face frame A2, and the joint point in the gesture frame A3 (i.e., joint point 7) is the first joint point.
  • the gesture frame A3 is therefore associated with the face frame A2, whereas under the closest distance principle of the conventional technology the gesture frame A3 would be mistakenly associated with the face frame A1.
  • the number of target human body joint point distribution maps may also be more than one. If so, an optimal target human body joint point distribution map may be further determined from among them, and the target area corresponding to the second joint point in that distribution map is determined as the target area associated with the gesture area.
  • a method of determining a human body joint point distribution map from a plurality of human body joint point distribution maps may be performed using the closest distance principle.
  • a manner of determining a human body joint point distribution map from multiple human body joint point distribution maps may be based on one or more of left and right hand attributes, size attributes, and position attributes of the gesture area.
  • in the gesture recognition method provided by the embodiments of the present application, the positions of multiple joint points in the image are determined, the gesture area and multiple target areas are identified, at least one human body joint point distribution map including a first joint point located in the gesture area and a second joint point located in a target area is determined, and it is checked whether a target human body joint point distribution map satisfying the preset condition exists. If so, the target area associated with the gesture area is determined according to the target area corresponding to the second joint point in the target human body joint point distribution map, so that gesture attribution is determined based on the distribution map.
  • since the target human body joint point distribution map is credible and includes the first joint point and the second joint point, the target object to which the gesture belongs can be better determined from it. Compared with the conventional technique of directly using, among the multiple target areas, the target area closest to the gesture area as the target area associated with the gesture area, this can reduce errors in determining gesture attribution.
  • determining at least one human body joint point distribution map based on the positions of the plurality of joint points in the image may specifically include: taking at least one joint point from the joint point set corresponding to each joint type and combining them to obtain multiple groups of joint points; and connecting the joint points in each group to obtain the at least one human body joint point distribution map and its confidence level.
  • the multiple joint points may include joint point 1 ′-joint point 8 ′ and joint point 1 to joint point 8 ,
  • the joint point 1 and the joint point 1' are the head joint
  • the joint point 2 and the joint point 2' are the neck joint
  • joint point 3, joint point 3', joint point 4 and joint point 4' are shoulder joints
  • the joint point 5 , joint point 5', joint point 6 and joint point 6' are elbow joints
  • joint point 7, joint point 7', joint point 8 and joint point 8' are wrist joints.
  • the head joint can correspond to the joint point set a, and the joint point set a can include joint point 1 and joint point 1';
  • the neck joint can correspond to the joint point set b, and the joint point set b can include joint point 2 and joint point 2'
  • the shoulder joint can correspond to the joint point set c, and the joint point set c can include the joint point 3, the joint point 3', the joint point 4 and the joint point 4';
  • the elbow joint can correspond to the joint point set d, and the joint point set d can include the joint point Point 5, joint point 5', joint point 6 and joint point 6';
  • the wrist joint may correspond to joint point set e, and joint point set e may include joint point 7, joint point 7', joint point 8 and joint point 8'.
  • the foregoing plurality of preset joint types can distinguish left shoulder joints and right shoulder joints. Further, the left shoulder joint can correspond to one joint point set, and the right shoulder joint can correspond to another joint point set.
  • for example, joint point 1' can be taken from joint point set a, joint point 2 from joint point set b, joint points 3 and 4 from joint point set c, joint points 5' and 6' from joint point set d, and joint points 7 and 7' from joint point set e, to obtain one group of joint points.
  • other groups of joint points can be obtained by combining other joint points. It can be understood that one of the multiple groups of joint points may include joint point 1 to joint point 7 .
  • since joint point 7 is the wrist joint point located in the gesture area, joint point 7 is a key joint point, while joint point 7', joint point 8 and joint point 8', although also wrist joints, are not located in the gesture area and are therefore non-critical joint points.
  • optionally, joint point set e may include only joint point 7, or joint point 7', joint point 8 and joint point 8' may simply not be taken from set e when combining joint points.
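The combination step above can be sketched with a Cartesian product over the per-type joint point sets. This simplifies "at least one joint point per set" to exactly one per set (so, e.g., left and right shoulders would use separate sets, as the document notes is possible); the function name and data layout are assumptions, not from the application:

```python
from itertools import product

def candidate_groups(joint_sets):
    """joint_sets: dict mapping joint type -> list of candidate joint
    points of that type. Returns every group formed by taking one
    joint point per type, as a list of {type: point} dicts."""
    types = list(joint_sets)
    return [dict(zip(types, combo))
            for combo in product(*(joint_sets[t] for t in types))]
```

Pruning non-critical wrist candidates (keeping only the wrist inside the gesture area, as described above) shrinks the wrist set to one element and thus sharply reduces the number of groups to score.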
  • the joint points in each group of joint points may be connected to obtain the at least one human joint point distribution map and its confidence level.
  • the confidence level expresses the degree of credibility: the lower the confidence level, the less reliable the distribution map, and the higher the confidence level, the more reliable it is.
  • when the confidence level of a human body joint point distribution map is greater than a preset threshold, the distribution map can be regarded as credible; when it is less than the preset threshold, the distribution map is not credible.
  • the confidence of the individual joint points affects, to a certain extent, the confidence level of the human body joint point distribution map.
  • so that the confidence level effectively represents the credibility of the human body joint point distribution map, it can be related to the confidence of the joint points the map includes.
  • the higher the confidence level of the joint points included in a human body joint point distribution map the higher the confidence level of the human body joint point distribution map.
  • the lower the confidence level of the joint points included in a human body joint point distribution map the lower the confidence level of the human body joint point distribution map.
  • the degree of confidence of the human body joint point distribution map can also be related to whether the angle of the connection between the joint points of the human body meets the requirements of the motion angle of the human body joint points. If there is a situation in a human body joint point distribution map that the connection between the joint points does not meet the motion angle requirements of the human body joint points, the confidence level of the human body joint point distribution map is less than the preset threshold, that is, the human body joint point distribution map is unreliable .
  • so that the confidence level effectively represents the credibility of the human body joint point distribution map, it can also be related to the number of joint point types involved.
  • the greater the number of types of joint points involved in a human body joint point distribution map the higher the confidence level of the human body joint point distribution map.
  • the smaller the number of joint point types involved in a human body joint point distribution map the lower the confidence level of the human body joint point distribution map can be.
  • the human body joint point distribution map in which the number of the involved joint point types is equal to the number of the aforementioned preset joint point types can be considered as a complete human body joint point distribution map.
  • a human body joint point distribution map in which the number of involved joint point types is less than the number of the aforementioned preset joint point types can be considered as an incomplete human body joint point distribution map. If a human body joint point distribution map is incomplete, the confidence level of the human body joint point distribution map may be less than a preset threshold, that is, the human body joint point distribution map is unreliable.
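The three confidence factors described above (per-joint confidence, anatomically valid connection angles, and completeness of joint types) can be combined in many ways; the application does not specify a formula. One hypothetical scoring sketch:

```python
def map_confidence(joint_confs, n_types_present, n_types_expected,
                   angles_ok=True):
    """Illustrative confidence for a candidate distribution map:
    the mean per-joint confidence, scaled by completeness of joint
    types, and zeroed when any inter-joint connection violates the
    motion angle limits of human joints."""
    if not joint_confs or not angles_ok:
        return 0.0
    mean_conf = sum(joint_confs) / len(joint_confs)
    completeness = n_types_present / n_types_expected
    return mean_conf * completeness
```

Under this sketch an incomplete map (fewer joint types than the preset set) or one with impossible limb angles scores low, matching the document's statement that such maps fall below the preset threshold and are treated as unreliable.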
  • the connecting the joint points in each group of joint points to obtain the at least one human body joint point distribution map and its confidence may specifically include: according to a preset joint point connecting strategy, The joint points in each group of joint points are connected to obtain the at least one human joint point distribution map and its confidence level.
  • the preset joint point connection strategy is consistent with the connection mode between the real joints of the human body. For example, since the wrist joint passes through the forearm and the elbow joint, the preset joint point connection strategy may include the connection between the joint point of the wrist joint type and the joint point of the elbow joint type. For another example, since the elbow joint is connected with the shoulder joint through the upper arm, the preset joint point connection strategy may also include the connection between the joint point of the elbow joint type and the joint point of the shoulder joint type.
  • the preset joint point connection strategy may be, for example, a strategy of connecting in sequence in the order of "head joint ⁇ neck joint ⁇ shoulder joint ⁇ elbow joint ⁇ wrist joint".
  • a group of joint points includes joint point 1 to joint point 8 in FIG. 5 , and the joint points in each group of joint points are connected according to the joint point connection strategy.
  • the connection can be as follows: first connect joint point 1 to joint point 2, then connect joint point 2 to joint point 3 and joint point 4 respectively, then connect joint point 3 to joint point 5 and joint point 4 to joint point 6, and finally connect joint point 5 to joint point 7 and joint point 6 to joint point 8.
  • the preset joint point connection strategy may be a process of connecting in sequence in the order of “wrist joint ⁇ elbow joint ⁇ shoulder joint ⁇ neck joint ⁇ head joint”.
  • a group of joint points includes joint point 1 to joint point 8 in FIG. 5 , and the joint points in each group of joint points are connected according to the joint point connection strategy.
  • the connection can be as follows: first connect joint point 7 to joint point 5 and joint point 8 to joint point 6, then connect joint point 5 to joint point 3 and joint point 6 to joint point 4, then connect joint point 3 and joint point 4 to joint point 2 respectively, and finally connect joint point 2 to joint point 1.
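The preset connection strategy above amounts to a fixed edge list over the joint indices of FIG. 5 (the same edges result whether traversed head-to-wrist or wrist-to-head). A minimal sketch, with names assumed for illustration:

```python
# Edges of the skeleton, following "head -> neck -> shoulder -> elbow
# -> wrist"; indices match the numbering used for FIG. 5.
CONNECTION_STRATEGY = [
    (1, 2),          # head -> neck
    (2, 3), (2, 4),  # neck -> left/right shoulder
    (3, 5), (4, 6),  # shoulder -> elbow
    (5, 7), (6, 8),  # elbow -> wrist
]

def connect(group):
    """group: dict mapping joint index -> (x, y) position.
    Returns the strategy edges whose both endpoints are present."""
    return [(a, b) for a, b in CONNECTION_STRATEGY
            if a in group and b in group]
```

A group missing some joint types yields fewer edges, i.e. an incomplete distribution map, which the confidence scoring described above would penalize.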
  • the connecting of the joint points in each group of joint points to obtain the at least one human body joint point distribution map and its confidence level may specifically include: determining the connection modes of the joint points in each group of joint points in an exhaustive manner, and performing the connections to obtain the at least one human body joint point distribution map and its confidence level.
  • Connecting in an exhaustive manner is no longer limited to connecting joint points of one specific joint type with joint points of another specific joint type; instead, all possible joint point connection modes are exhausted. It should be noted that, when connections are performed in this exhaustive manner, many of the resulting connection modes are inconsistent with the connections between real human joints, so the confidence of the human body joint point distribution maps obtained from those connection modes is very low. That is, such human body joint point distribution maps are unreliable, and therefore the determination result of the target human body joint point distribution map is not affected.
  • a plurality of human body joint point distribution maps and their confidence levels can be obtained for each group of joint points.
  • from the multiple human body joint point distribution maps corresponding to each group of joint points, the human body joint point distribution maps that include both the first joint point and the second joint point can further be determined, that is, the at least one human body joint point distribution map and its confidence level.
  • the determining, from the at least one human body joint point distribution map, whether there is a target human body joint point distribution map that satisfies a preset condition may specifically include: acquiring the maximum confidence among the at least one human body joint point distribution map; when the maximum confidence is greater than a preset threshold, determining that a target human body joint point distribution map satisfying the preset condition exists, the target human body joint point distribution map being the human body joint point distribution map with the maximum confidence; and when the maximum confidence is less than the preset threshold, determining that no target human body joint point distribution map satisfying the preset condition exists.
  • each human body joint point distribution map in the at least one human body joint point distribution map has a certain confidence, and the human body joint point distribution map with the largest confidence among them is the most likely to be credible.
  • When the maximum confidence is greater than the preset threshold, it indicates that the human body joint point distribution map with the maximum confidence is credible, and that map can be used as the target human body joint point distribution map for gesture attribution recognition; it can thereby be determined that a target human body joint point distribution map satisfying the preset condition exists in the at least one human body joint point distribution map.
  • When the maximum confidence is less than the preset threshold, it indicates that the human body joint point distribution map with the maximum confidence is unreliable, and that map cannot be used as the target human body joint point distribution map for gesture attribution recognition; it can thereby be determined that no target human body joint point distribution map satisfying the preset condition exists in the at least one human body joint point distribution map.
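  • The selection of the target distribution map by maximum confidence can be sketched as follows. This is an illustrative sketch only; the tuple representation and the threshold value are assumptions, not from the disclosure:

```python
def select_target_map(maps_with_conf, threshold=0.5):
    """maps_with_conf: list of (distribution_map, confidence) tuples.
    Returns the map with maximum confidence if that confidence exceeds the
    preset threshold, otherwise None (no target map satisfies the condition)."""
    if not maps_with_conf:
        return None
    best_map, best_conf = max(maps_with_conf, key=lambda mc: mc[1])
    return best_map if best_conf > threshold else None
```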
  • FIG. 6 is a schematic flowchart of a gesture recognition method provided by another embodiment of the present application.
  • based on the foregoing embodiment, this embodiment mainly describes an optional implementation for the case where no target human body joint point distribution map satisfying the preset condition exists. As shown in FIG. 6, the method of this embodiment may include:
  • Step 61 Determine the positions of multiple joint points in the image, and identify a gesture area and multiple target areas in the image, where the multiple joint points in the image include a first joint point located in the gesture area and a second joint point located in the target area.
  • Step 62 Determine at least one human joint point distribution map based on the positions of the plurality of joint points in the image, where the human body joint point distribution map includes a first joint point and a second joint point.
  • Step 63 From the at least one human body joint point distribution map, determine whether there is a target human body joint point distribution map that satisfies a preset condition.
  • Step 64 When there is a target human body joint point distribution map that satisfies the preset condition, determine, according to the target area corresponding to the second joint point in the target human body joint point distribution map, the target area associated with the gesture area.
  • steps 61 to 64 are similar to the aforementioned steps 21 to 24, and are not repeated here.
  • Step 65 When the target human body joint point distribution map that meets the preset condition does not exist, determine a first candidate area that meets the first condition among the multiple target areas.
  • Step 66 when the number of the first candidate regions is one, determine that the gesture region is associated with the first candidate region.
  • Step 67 When the number of the first candidate regions is multiple, determine that the first candidate region closest to the gesture region is associated with the gesture region.
  • target areas that are obviously not associated with the gesture area may first be removed from the plurality of target areas to obtain one or more candidate areas (i.e., the first candidate areas) that may be associated with the gesture area, and then the target area associated with the gesture area may be further determined from the first candidate areas. Since the first condition narrows the range of target areas used to determine the association, compared with the conventional technique of directly taking the identified target area closest to the gesture area as the target area associated with the gesture area, this can reduce erroneous determination of gesture attribution.
  • The following describes several manners in which a first candidate area that meets the first condition among the multiple target areas may be determined.
  • Method 1: Considering the perspective effect in image capture, whereby nearer objects appear larger and farther objects appear smaller, multiple targets at significantly different distances from the image acquisition device exhibit obvious size differences in the captured image. Moreover, when a user performs gesture interaction, the distance between the hand and the body is usually short, so target areas that are obviously not associated with the gesture area can be removed based on the size attribute of the gesture area. Therefore, in one embodiment, the first attribute may include a size attribute.
  • the determining of the first candidate area that meets the first condition among the multiple target areas may specifically include: determining, according to the relationship between the size of the target area and the size of the gesture area, the first candidate area that meets the first condition among the multiple target areas.
  • the size can be represented by, for example, length and width, diagonal line, area, and the like.
  • the first condition includes that the ratio of the size of the target area and the size of the gesture area conforms to a preset relationship.
  • the preset relationship may be that the ratio of the size of the target area to the size of the gesture area is smaller than the first ratio threshold, so as to remove the target area that is significantly larger than the size of the gesture area. And/or, the preset relationship may be that the ratio of the size of the target area to the size of the gesture area is greater than the second ratio threshold, so as to remove the target area that is significantly smaller than the size of the gesture area.
  • the first proportional threshold and the second proportional threshold can be determined according to experiments, for example.
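  • A size-ratio filter of this kind can be sketched as follows. This is an illustrative sketch only: the (x, y, w, h) box format, the use of pixel area as the size, and the threshold values are assumptions, not values from the disclosure:

```python
def size_filter(target_areas, gesture_area, r_min=1.5, r_max=40.0):
    """Keep target areas whose area ratio to the gesture area lies strictly
    between the second ratio threshold (r_min) and the first ratio threshold
    (r_max). Boxes are (x, y, w, h); size is measured as pixel area w * h."""
    g_size = gesture_area[2] * gesture_area[3]
    kept = []
    for area in target_areas:
        ratio = (area[2] * area[3]) / g_size
        if r_min < ratio < r_max:  # removes areas far larger or smaller
            kept.append(area)
    return kept
```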
  • the preset relationship may also be determined by a preset first model. The first model may output, based on the input size of the target area and size of the gesture area, a first output result indicating the degree of association between the two sizes. The first model may be, for example, a neural network model, and the first output result may be, for example, 0 or 1, where 0 indicates that the ratio does not conform to the preset relationship and 1 indicates that the ratio conforms to the preset relationship.
  • Method 2: Considering that the hand is a part of the body, the distance between a gesture area and a target area belonging to the same human body is limited by the characteristics of the body structure. Therefore, target areas that are obviously not associated with the gesture area can be removed based on the position attribute of the gesture area. Accordingly, in another embodiment, the first attribute may include a position attribute.
  • the determining of the first candidate area that meets the first condition among the multiple target areas may specifically include: determining, according to the position of the target area and the position of the gesture area, the first candidate area that meets the first condition.
  • the position may be, for example, a pixel position, for example, the position of the gesture area may be the pixel position of the center pixel of the gesture area.
  • the first condition may be used to remove the target area that is obviously inconsistent with the position of the gesture area.
  • the first condition includes that the distance between the target area and the gesture area is less than a preset distance threshold. Wherein, the distance threshold can be determined according to experiments, for example.
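  • A position-based filter of this kind can be sketched as follows. This is an illustrative sketch: the (x, y, w, h) box format, the use of center-to-center Euclidean distance, and the threshold value are assumptions:

```python
import math

def box_center(box):
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def distance_filter(target_areas, gesture_area, max_dist=200.0):
    """Keep target areas whose center-to-center pixel distance from the
    gesture area is below a preset distance threshold (value assumed)."""
    gx, gy = box_center(gesture_area)
    kept = []
    for area in target_areas:
        tx, ty = box_center(area)
        if math.hypot(tx - gx, ty - gy) < max_dist:
            kept.append(area)
    return kept
```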
  • Method 3: Considering that the content in the image has a left-right mirror relationship, and that when a user performs gesture interaction the left hand is usually on the left side of the body and the right hand on the right side, target areas that are obviously not associated with the gesture area can be removed based on the left/right hand attribute of the gesture area. Accordingly, in one embodiment, the first attribute may include a left/right hand attribute.
  • the determining of the first candidate area that meets the first condition among the multiple target areas may specifically include: determining the left/right hand attribute of the gesture area, where the left/right hand attribute of the gesture area is left hand or right hand; and determining, according to the left/right hand attribute of the gesture area, a first candidate area that meets the first condition among the multiple target areas.
  • the first condition may be used to remove the target area that is obviously inconsistent with the left and right hand attributes of the gesture area.
  • the determining, according to the left/right hand attribute of the gesture area, the first candidate area that meets the first condition among the multiple target areas may specifically include: determining a dividing line according to the left/right hand attribute of the gesture area; and when any target area is located on the first-direction side of the dividing line, determining that target area as a first candidate area, the first direction being determined by the left/right hand attribute of the gesture area.
  • the relationship between the first direction and the left/right hand attribute of the gesture area may be: when the left/right hand attribute of the gesture area is left hand, the direction to the left of the gesture area is used as the first direction; when the left/right hand attribute of the gesture area is right hand, the direction to the right of the gesture area is used as the first direction.
  • the schematic diagram of determining the first candidate area based on the dividing line can be as shown in FIG. 7, wherein the dotted frames A4 and A5 are face frames, the dotted frame A6 is the gesture frame, the dotted line represents the dividing line, and the direction of the arrow indicates the first direction.
  • the target area located on the first-direction side of the dividing line is determined as a first candidate area, whereby the face area represented by the dotted frame A4 can be excluded.
  • the dividing line is a straight line in a vertical direction as an example.
  • the dividing line may also be in other forms, which is not limited in this application.
  • determining a dividing line according to the left/right hand attribute of the gesture area may specifically include: when the left/right hand attribute of the gesture area is left hand, determining the dividing line according to the right boundary of the gesture area and an offset in the second direction; and when the left/right hand attribute of the gesture area is right hand, determining the dividing line according to the left boundary of the gesture area and an offset in the second direction, where the second direction is the direction opposite to the first direction and the offset is greater than or equal to zero. In the above manner, excluding a correctly associated target area can be avoided.
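  • The dividing-line construction can be sketched as follows. This is an assumed geometric reading of the text (a vertical line, with the offset shifting the line in the second direction so that the kept region is enlarged), not the patent's exact formulation; the box format and function names are assumptions:

```python
def dividing_line_x(gesture_area, hand, offset=0.0):
    """Vertical dividing line for a gesture box (x, y, w, h).
    Left hand: first direction is left, so the line is the right boundary
    shifted by `offset` in the second (rightward) direction.
    Right hand: the line is the left boundary shifted left by `offset`."""
    x, _, w, _ = gesture_area
    if hand == "left":
        return x + w + offset
    return x - offset

def first_candidates_by_side(target_areas, gesture_area, hand, offset=0.0):
    """Keep target areas lying entirely on the first-direction side:
    left of the line for a left hand, right of the line for a right hand."""
    line = dividing_line_x(gesture_area, hand, offset)
    if hand == "left":
        return [a for a in target_areas if a[0] + a[2] <= line]
    return [a for a in target_areas if a[0] >= line]
```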
  • the schematic diagram of determining the first candidate area based on the dividing line can be as shown in FIG. 8, wherein the dotted frames A7 and A8 are face frames, the dotted frame A9 is the gesture frame, the dotted line represents the dividing line, the direction of the right arrow indicates the first direction, and the direction of the left arrow indicates the second direction.
  • In this way, the situation where the first candidate area does not include the target area actually associated with the gesture area (the face area represented by the dotted frame A8) can be avoided. It should be noted that, in FIG. 8, the dividing line is a straight line in the vertical direction, and the offset is greater than 0 as an example.
  • the determining, according to the left/right hand attribute of the gesture area, the first candidate area that meets the first condition among the multiple target areas may specifically include: determining a dividing line according to the left/right hand attribute of the gesture area; and when a proportion of pixels of any target area greater than or equal to a preset ratio is located on the first-direction side of the dividing line, determining that target area as a first candidate area, the first direction being determined by the left/right hand attribute of the gesture area. It should be noted that, for the specific content of the first direction and the dividing line, reference may be made to the previous description, and details are not repeated here.
  • as shown in FIG. 9, the dividing line is determined by the left boundary of the gesture area as an example, wherein the direction of the right arrow indicates the first direction and the direction of the left arrow indicates the second direction.
  • assuming the preset ratio is 50%, since more than 50% of the pixels of the face frame A11 are located on the first-direction side of the dividing line, the face frame A11 can be determined as a first candidate area.
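  • For axis-aligned boxes, this preset-ratio test reduces to the fraction of the box's width (equivalently, of its pixels) on the first-direction side of the vertical dividing line. A sketch under that assumption, with assumed names and box format:

```python
def fraction_on_first_side(box, line_x, hand):
    """Fraction of a box's area (equivalently, of its pixels, for an
    axis-aligned box) on the first-direction side of a vertical line:
    the left side for a left hand, the right side for a right hand."""
    x, _, w, _ = box
    if hand == "left":
        overlap = min(max(line_x - x, 0.0), w)
    else:
        overlap = min(max((x + w) - line_x, 0.0), w)
    return overlap / w

def candidates_by_ratio(target_areas, line_x, hand, min_ratio=0.5):
    """Keep target areas with at least `min_ratio` of their pixels on the
    first-direction side of the dividing line (preset ratio assumed 50%)."""
    return [a for a in target_areas
            if fraction_on_first_side(a, line_x, hand) >= min_ratio]
```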
  • In this way, even when the gesture is located in front of the target, the target area can still be determined as a first candidate area.
  • the problem that the first candidate region does not include the target region (the face region represented by the dashed box A9) that is actually associated with the gesture region can also be reduced.
  • the dividing line is a straight line in the vertical direction, and the offset is greater than 0 as an example.
  • In this embodiment, when no target human body joint point distribution map satisfying the preset condition exists, target areas that are obviously not associated with the gesture area are first removed from the multiple target areas through the first condition to obtain the first candidate areas. When the number of first candidate areas is one, the gesture area is determined to be associated with that first candidate area; when the number of first candidate areas is multiple, the first candidate area closest to the gesture area is determined to be associated with the gesture area. Compared with the conventional technique of directly taking the identified target area closest to the gesture area as the target area associated with the gesture area, this can reduce erroneous determination of gesture attribution.
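  • Steps 65 to 67 can be sketched as a filter-then-nearest fallback. This is an illustrative sketch only: the box format, the pluggable `first_condition` predicate, and the center-distance metric are assumptions:

```python
import math

def box_center(box):
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def associate_gesture(target_areas, gesture_area, first_condition):
    """Fallback association used when no credible joint point distribution
    map exists: keep target areas satisfying the first condition, then return
    the single remaining candidate, or the nearest one if several remain."""
    candidates = [a for a in target_areas if first_condition(a, gesture_area)]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    gx, gy = box_center(gesture_area)
    return min(candidates,
               key=lambda a: math.hypot(box_center(a)[0] - gx,
                                        box_center(a)[1] - gy))
```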
  • FIG. 10 is a schematic flowchart of a gesture recognition method provided by another embodiment of the present application. Based on the embodiment shown in FIG. 6, this embodiment mainly describes another optional implementation for the case where the number of first candidate areas is multiple. As shown in FIG. 10, the method of this embodiment may include:
  • Step 101 Determine the positions of multiple joint points in the image, and identify a gesture area and multiple target areas in the image, where the multiple joint points in the image include a first joint point located in the gesture area and a second joint point located in the target area.
  • Step 102 Determine at least one human body joint point distribution map based on the positions of the plurality of joint points in the image, where the human body joint point distribution map includes a first joint point and a second joint point.
  • Step 103 From the at least one human body joint point distribution map, determine whether there is a target human body joint point distribution map that satisfies a preset condition.
  • Step 104 When there is a target human body joint point distribution map that satisfies the preset condition, determine, according to the target area corresponding to the second joint point in the target human body joint point distribution map, the target area associated with the gesture area.
  • Step 105 When the target human body joint point distribution map that meets the preset condition does not exist, determine a first candidate area that meets the first condition among the multiple target areas.
  • Step 106 when the number of the first candidate regions is one, determine that the gesture region is associated with the first candidate region.
  • steps 101 to 106 are similar to the aforementioned steps 61 to 66, and are not repeated here.
  • Step 107 When the number of the first candidate regions is multiple, determine a second candidate region that meets the second condition among the multiple first candidate regions.
  • Step 108 When the number of the second candidate regions is one, determine that the gesture region is associated with the second candidate region.
  • Step 109 When the number of the second candidate regions is multiple, determine that the second candidate region closest to the gesture region is associated with the gesture region.
  • In step 105, based on the first condition, target areas that are obviously not associated with the gesture area may be removed from the multiple target areas to obtain one or more candidate areas that may be associated with the gesture area (i.e., the first candidate areas).
  • In step 107, when the number of first candidate areas is multiple, target areas that are obviously not associated with the gesture area can be further removed from the multiple first candidate areas based on the second condition, to obtain one or more second candidate areas that may be associated with the gesture area. Since the second condition further narrows the range of target areas relevant to determining the association with the gesture area, compared with the method shown in FIG. 6, this is beneficial for further reducing erroneous determination of gesture attribution.
  • the second condition is different from the first condition.
  • the first condition is related to a first attribute of the gesture area
  • the second condition is related to a second attribute of the gesture area
  • the second attribute is different from the first attribute.
  • on the basis of determining the first candidate area that meets the first condition according to the size of the target area and the size of the gesture area, the second candidate area that meets the second condition can be further determined according to the position of the first candidate area and the position of the gesture area.
  • Thereby, a candidate area determination manner can be obtained in which the first candidate area is determined from the multiple target areas based on the size attribute, and then the second candidate area is determined from the multiple first candidate areas based on the position attribute.
  • It can be understood that, by exchanging the order of the first condition and the second condition, a candidate area determination manner can be obtained in which the first candidate area is determined from the multiple target areas based on the position attribute, and then the second candidate area is determined from the multiple first candidate areas based on the size attribute.
  • the determining of the second candidate area that meets the second condition among the plurality of first candidate areas may specifically include: determining, according to the position of the first target area and the position of the gesture area, a second candidate area that meets the second condition.
  • the second condition includes that the distance between the first target area and the gesture area is smaller than a preset distance threshold.
  • the manner of determining the second candidate area that meets the second condition according to the position of the first target area and the position of the gesture area is similar to the manner, described in the foregoing Method 2, of determining the first candidate area that meets the first condition according to the position of the target area and the position of the gesture area, and details are not repeated here.
  • the determining of the second candidate area that meets the second condition among the plurality of first candidate areas may specifically include: determining, according to the size of the first target area and the size of the gesture area, a second candidate area that meets the second condition.
  • the second condition includes that the ratio of the size of the first target area and the size of the gesture area conforms to a preset relationship.
  • the manner of determining the second candidate area that meets the second condition according to the size of the first target area and the size of the gesture area is similar to the manner, described in the foregoing Method 1, of determining the first candidate area that meets the first condition according to the size of the target area and the size of the gesture area, and details are not repeated here.
  • on the basis of determining the first candidate area that meets the first condition according to the size and/or position of the target area and the gesture area, a second candidate area that meets the second condition may be further determined according to the left/right hand attribute of the gesture area. Thereby, a candidate area determination manner can be obtained in which the first candidate area is determined from the multiple target areas based on the size attribute, and then the second candidate area is determined from the multiple first candidate areas based on the left/right hand attribute. It can be understood that, by exchanging the order of the first condition and the second condition, a candidate area determination manner can be obtained in which the first candidate area is determined from the multiple target areas based on the left/right hand attribute, and then the second candidate area is determined from the multiple first candidate areas based on the size attribute.
  • the determining of a second candidate area that meets the second condition among the plurality of first candidate areas may specifically include: determining the left/right hand attribute of the gesture area, where the left/right hand attribute of the gesture area is left hand or right hand; and determining, according to the left/right hand attribute of the gesture area, a second candidate area that meets the second condition among the plurality of first candidate areas.
  • the determining, according to the left/right hand attribute of the gesture area, a second candidate area that meets the second condition among the plurality of first candidate areas includes: determining a dividing line according to the left/right hand attribute of the gesture area; and when a proportion of pixels of any first candidate area greater than or equal to a preset ratio is located on the first-direction side of the dividing line, determining that first candidate area as a second candidate area, the first direction being determined by the left/right hand attribute of the gesture area.
  • the manner of determining the second candidate area that meets the second condition according to the left/right hand attribute of the gesture area is similar to the manner, described in the foregoing Method 3, of determining the first candidate area that meets the first condition according to the left/right hand attribute of the gesture area, and details are not repeated here.
  • similarly, on the basis of determining the first candidate area that meets the first condition according to the position of the target area and the position of the gesture area, a second candidate area that meets the second condition may be further determined according to the left/right hand attribute of the gesture area. Thereby, a candidate area determination manner can be obtained in which the first candidate area is determined from the multiple target areas based on the position attribute, and then the second candidate area is determined from the multiple first candidate areas based on the left/right hand attribute. It can be understood that, by exchanging the order of the first condition and the second condition, a candidate area determination manner can be obtained in which the first candidate area is determined from the multiple target areas based on the left/right hand attribute, and then the second candidate area is determined from the multiple first candidate areas based on the position attribute.
  • a third candidate region that meets the third condition may be further determined, and a region associated with the gesture region is determined in the third candidate region.
  • the third condition is a condition different from both the first condition and the second condition.
  • In this embodiment, when no target human body joint point distribution map satisfying the preset condition exists, a first candidate area that meets the first condition is determined among the multiple target areas; when the number of first candidate areas is one, the gesture area is determined to be associated with that first candidate area; when the number of first candidate areas is multiple, a second candidate area that meets the second condition is determined among the multiple first candidate areas; and when the number of second candidate areas is multiple, the second candidate area closest to the gesture area is determined to be associated with the gesture area. In this way, target areas that are obviously not associated with the gesture area are first removed from the multiple target areas through the first condition to obtain the first candidate areas, target areas that are obviously not associated with the gesture area are further removed from the first candidate areas through the second condition to obtain the second candidate areas, and the target area associated with the gesture area is then determined from the second candidate areas. Since the range of target areas used to determine the association with the gesture area is successively narrowed through the first condition and the second condition, the range of candidate areas can be further narrowed, which is beneficial for further reducing erroneous determination of gesture attribution.
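  • The successive narrowing through the first and second conditions (and optionally further conditions) can be sketched as a cascade. This is an illustrative sketch only: the early exit when exactly one candidate remains, the behavior of keeping the previous candidates when a condition would remove everything, and the nearest-distance tiebreak are assumptions:

```python
import math

def box_center(box):
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def cascade_associate(target_areas, gesture_area, conditions):
    """Apply the first condition, second condition, ... in turn, stopping as
    soon as exactly one candidate remains; if several remain after all
    conditions, return the candidate nearest the gesture area."""
    candidates = list(target_areas)
    for cond in conditions:
        filtered = [a for a in candidates if cond(a, gesture_area)]
        if not filtered:
            break  # keep the previous candidates instead of discarding all
        candidates = filtered
        if len(candidates) == 1:
            return candidates[0]
    if not candidates:
        return None
    gx, gy = box_center(gesture_area)
    return min(candidates,
               key=lambda a: math.hypot(box_center(a)[0] - gx,
                                        box_center(a)[1] - gy))
```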
  • the following steps may be further included: when the target area associated with the gesture area is determined, tracking, photographing or recording the target object in the target area.
  • In this way, the functions of tracking, photographing or recording the target object to which the gesture belongs are realized, which is beneficial for improving the user experience.
  • the following steps may also be included: when the target object is tracked or recorded, the region of interest is continuously updated; when a gesture is recognized in the region of interest, the tracking or recording is stopped. Therefore, in the process of tracking or recording, the user can stop the tracking or recording through gesture control, which is convenient for the user to stop the tracking or recording, and is beneficial to improve the user experience.
  • the stopping of tracking or recording may specifically include: when the gesture is recognized in the region of interest, determining whether the gesture is associated with the target object; and when the gesture is associated with the target object, stopping the tracking or recording. Whether the gesture is associated with the target object is whether the gesture belongs to the target object, that is, whether the gesture area where the gesture is located is associated with the object area where the target object is located. For the specific manner of determining whether the gesture is associated with the target object, reference may be made to the foregoing embodiments, and details are not repeated here.
  • Since the start and stop of tracking or recording are usually controlled by the same person, by stopping the tracking or recording only when the gesture is associated with the target object, after the tracking or recording process is started by a gesture associated with the target object, the target object can further control, by gesture, the stopping of the tracking or recording. That is, the target that starts and the target that stops the tracking or recording are the same target, which is beneficial for reducing false triggering by gestures.
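  • The stop-control logic above can be sketched as a loop over frames. This is an illustrative sketch only; all callables (`update_roi`, `recognize_gesture`, `is_associated`) are assumed interfaces, not names from the patent:

```python
def run_tracking(frames, target_area, recognize_gesture, is_associated,
                 update_roi):
    """While tracking/recording, continuously update the region of interest;
    stop only when a gesture recognized inside it is associated with the
    tracked target, and return the frame at which tracking stopped."""
    roi = target_area
    for frame in frames:
        roi = update_roi(frame, roi)             # continuously update the ROI
        gesture = recognize_gesture(frame, roi)  # None if no gesture found
        if gesture is not None and is_associated(gesture, roi):
            return frame                         # stop tracking/recording
    return None  # tracking ran to the end without a stop gesture
```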
  • FIG. 11 is a schematic flowchart of a gesture recognition method provided by another embodiment of the present application.
  • the execution body of this embodiment may be the gesture recognition apparatus 12 in FIG. 1 , and may specifically be the processor of the gesture recognition apparatus 12 .
  • the method of this embodiment may include:
  • Step 111 Identify multiple target areas and gesture areas in the image.
  • Step 112 Determine the attribute of the gesture area, and determine a candidate area among the multiple target areas according to the attribute of the gesture area.
  • Step 113 Determine a target area associated with the gesture area in the candidate area.
  • the attribute may be used to remove, from the multiple target areas, target areas that are clearly not associated with the gesture area.
  • the attributes include one or more of the following: a position attribute, a size attribute, or a left/right hand attribute.
  • the determining a candidate area among the multiple target areas according to the attribute of the gesture area may specifically include: determining, according to a first attribute of the gesture area, a first candidate area among the multiple target areas that meets a first condition.
  • the determining, according to the first attribute of the gesture area, the first candidate area that meets the first condition among the multiple target areas may specifically include: determining, according to the size of the target area and the size of the gesture area, a first candidate area among the multiple target areas that meets the first condition.
  • the determining, according to the first attribute of the gesture area, the first candidate area that meets the first condition among the multiple target areas may also specifically include: determining, according to the position of the target area and the position of the gesture area, a first candidate area among the multiple target areas that meets the first condition.
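The two first-condition variants above (size-based and position-based) can be combined into one filter. A hedged sketch, assuming `(x, y, w, h)` boxes; the ratio and distance thresholds are illustrative placeholders, since the patent leaves them to be determined experimentally:

```python
def box_center(box):
    x, y, w, h = box
    return (x + w / 2, y + h / 2)

def first_candidates(target_boxes, gesture_box,
                     min_ratio=1.0, max_ratio=16.0, max_dist=300.0):
    """Keep target areas whose size ratio to the gesture area lies in a
    preset range and whose center is within a preset distance of the
    gesture area's center. All thresholds are assumed values."""
    gx, gy = box_center(gesture_box)
    g_area = gesture_box[2] * gesture_box[3]
    kept = []
    for t in target_boxes:
        ratio = (t[2] * t[3]) / g_area                       # size condition
        tx, ty = box_center(t)
        dist = ((tx - gx) ** 2 + (ty - gy) ** 2) ** 0.5      # position condition
        if min_ratio <= ratio <= max_ratio and dist <= max_dist:
            kept.append(t)
    return kept
```

Targets that are far too large, far too small, or far away relative to the gesture box are discarded, which is exactly the "clearly not associated" pruning the attribute is for.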
  • the determining a target area associated with the gesture area in the candidate area may specifically include: when the number of first candidate areas is one, determining that the gesture area is associated with the first candidate area.
  • the determining a target area associated with the gesture area in the candidate area may further include: when the number of first candidate areas is multiple, determining that the first candidate area closest to the gesture area is associated with the gesture area.
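The nearest-distance fallback for multiple first candidate areas might look like this; boxes are assumed `(x, y, w, h)` tuples and distance is measured between box centers:

```python
def nearest_candidate(candidates, gesture_box):
    """Among several remaining first candidate areas, associate the
    gesture with the one whose center is closest to the gesture area's
    center (squared distance suffices for ranking)."""
    gx = gesture_box[0] + gesture_box[2] / 2
    gy = gesture_box[1] + gesture_box[3] / 2
    def sq_dist(box):
        cx = box[0] + box[2] / 2
        cy = box[1] + box[3] / 2
        return (cx - gx) ** 2 + (cy - gy) ** 2
    return min(candidates, key=sq_dist)
```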
  • the determining candidate areas among the multiple target areas according to the attribute of the gesture area may further include: when the number of first candidate areas is multiple, determining, according to a second attribute of the gesture area, a second candidate area among the multiple first candidate areas that meets a second condition.
  • the determining, according to the second attribute of the gesture area, the second candidate area that meets the second condition among the multiple first candidate areas may specifically include: determining, according to the left/right hand attribute of the gesture area, a second candidate area that meets the second condition among the multiple first candidate areas.
  • the determining, according to the left/right hand attribute of the gesture area, a second candidate area that meets the second condition among the multiple first candidate areas may specifically include: determining a dividing line according to the left/right hand attribute of the gesture area; when a proportion, greater than or equal to a preset ratio, of the pixels of any first candidate area is located on the first-direction side of the dividing line, determining that first candidate area to be a second candidate area; the first direction is determined by the left/right hand attribute of the gesture area.
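The dividing-line test can be sketched as below. As assumptions for illustration: boxes are `(x, y, w, h)` tuples, the dividing line is vertical, and the pixel proportion is approximated by the fraction of the candidate box's width lying on the first-direction side. For a right hand the line sits at the gesture box's left edge shifted by a non-negative offset and the first direction points right; the left-hand case is mirrored.

```python
def is_second_candidate(candidate, gesture_box, handedness,
                        preset_ratio=0.5, offset=0.0):
    """True if at least `preset_ratio` of the candidate box lies on the
    first-direction side of the dividing line derived from handedness."""
    x, _, w, _ = candidate
    gx, _, gw, _ = gesture_box
    if handedness == "right":
        line = gx - offset               # gesture's left edge, shifted left
        on_side = max(0.0, (x + w) - max(x, line))   # width right of line
    else:                                # "left"
        line = gx + gw + offset          # gesture's right edge, shifted right
        on_side = max(0.0, min(x + w, line) - x)     # width left of line
    return on_side / w >= preset_ratio
```

With `offset > 0`, a target standing slightly behind the gesturing hand is not wrongly excluded, which is the motivation the embodiments give for the offset.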
  • the determining a target area associated with the gesture area in the candidate area may specifically include: when the number of second candidate areas is one, determining that the gesture area is associated with the second candidate area.
  • the determining a target area associated with the gesture area in the candidate area may further include: when the number of second candidate areas is multiple, determining that the second candidate area closest to the gesture area is associated with the gesture area. Or, optionally, when there are multiple second candidate areas, a third candidate area that meets a third condition may be further determined, and an area associated with the gesture area is determined among the third candidate areas.
  • the third condition is a condition different from both the first condition and the second condition.
  • the method of this embodiment of the present application may further include: when a target area associated with the gesture area is determined, tracking, photographing or recording a target object in the target area.
  • in the gesture recognition method of this embodiment, multiple target areas and a gesture area in an image are identified, the attribute of the gesture area is determined, candidate areas among the multiple target areas are determined according to that attribute, and a target area associated with the gesture area is determined among the candidate areas. Thus, when there is no target human body joint point distribution map that meets the preset condition, the candidate areas are obtained by removing, from the multiple target areas, those clearly not associated with the gesture area, and the associated target area is then determined among the candidates. Because determining candidate areas narrows the range used to further determine the target area associated with the gesture area, this approach can reduce errors in determining gesture attribution, compared with the traditional technique of directly taking, among all identified target areas, the one closest to the gesture area as the associated target area.
  • FIG. 12 is a schematic structural diagram of a gesture recognition apparatus provided by an embodiment of the present application.
  • the apparatus 120 may include: a processor 121 and a memory 122 .
  • the memory 122 is configured to store program code;
  • the processor 121 calls the program code, and when the program code is executed, is configured to perform the following operations:
  • determining positions of multiple joint points in an image, and identifying a gesture area and multiple target areas in the image, the multiple joint points including a first joint point located in the gesture area and a second joint point located in the target area;
  • determining, based on the positions of the multiple joint points in the image, at least one human body joint point distribution map, the human body joint point distribution map including a first joint point and a second joint point;
  • determining, from the at least one human body joint point distribution map, whether there is a target human body joint point distribution map that meets a preset condition;
  • when there is, determining the target area associated with the gesture area according to the target area corresponding to the second joint point in the target human body joint point distribution map.
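The selection step among candidate distribution maps could be sketched as follows. The dict keys (`confidence`, `gesture_id`, `target_id`) and the threshold value are assumptions for illustration; returning `None` signals the fallback to attribute-based candidate filtering described elsewhere in this document.

```python
def associate_by_skeleton(skeletons, conf_threshold=0.6):
    """Each candidate skeleton map carries a confidence score, the id of
    the target area holding its second joint point (e.g. head) and the
    id of the gesture area holding its first joint point (wrist).
    Return (gesture_id, target_id) from the most confident map if it
    clears the preset threshold, else None."""
    if not skeletons:
        return None
    best = max(skeletons, key=lambda s: s["confidence"])
    if best["confidence"] > conf_threshold:
        return (best["gesture_id"], best["target_id"])
    return None
```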
  • FIG. 13 is a schematic structural diagram of a gesture recognition apparatus provided by another embodiment of the present application.
  • the apparatus 130 may include: a processor 131 and a memory 132 .
  • the memory 132 is configured to store program code;
  • the processor 131 calls the program code, and when the program code is executed, is configured to perform the following operations:
  • identifying multiple target areas and a gesture area in an image;
  • determining an attribute of the gesture area, and determining candidate areas among the multiple target areas according to the attribute of the gesture area;
  • determining, among the candidate areas, one target area associated with the gesture area.

Abstract

A gesture recognition method and device. The method includes: determining positions of multiple joint points in an image, and identifying a gesture area and multiple target areas in the image; determining, based on the positions of the multiple joint points in the image, at least one human body joint point distribution map, the distribution map including a first joint point located in the gesture area and a second joint point located in the target area; determining, from the at least one human body joint point distribution map, whether there is a target human body joint point distribution map that meets a preset condition; and if so, determining the target area associated with the gesture area according to the target area corresponding to the second joint point in the target human body joint point distribution map. The present application can reduce errors in determining gesture attribution.

Description

Gesture recognition method and device. Technical field
The present application relates to the technical field of image processing, and in particular to a gesture recognition method and device.
Background
At present, with the continuous development of human-computer interaction, gesture interaction is being applied more and more widely.
Gesture interaction may take place in a multi-person scene, in which the same picture can include multiple people corresponding to multiple target areas. For a gesture area identified in the picture, besides determining the gesture it contains, a target area associated with the gesture area may also be determined as required, so as to know the target object to which the gesture belongs. Usually, the target object in the target area closest to the gesture area is determined as the target object to which the gesture belongs.
However, the above manner of determining gesture attribution produces incorrect attribution in many scenarios.
Summary
Embodiments of the present application provide a gesture recognition method and device, to solve the problem in the prior art that the existing manner of determining gesture attribution yields incorrect attribution in many scenarios.
In a first aspect, an embodiment of the present application provides a gesture recognition method, including:
determining positions of multiple joint points in an image, and identifying a gesture area and multiple target areas in the image, the multiple joint points in the image including a first joint point located in the gesture area and a second joint point located in the target area;
determining, based on the positions of the multiple joint points in the image, at least one human body joint point distribution map, the human body joint point distribution map including a first joint point and a second joint point;
determining, from the at least one human body joint point distribution map, whether there is a target human body joint point distribution map that meets a preset condition;
when there is a target human body joint point distribution map that meets the preset condition, determining the target area associated with the gesture area according to the target area corresponding to the second joint point in the target human body joint point distribution map.
In a second aspect, an embodiment of the present application provides a gesture recognition method, including:
identifying multiple target areas and a gesture area in an image;
determining an attribute of the gesture area, and determining candidate areas among the multiple target areas according to the attribute of the gesture area;
determining, among the candidate areas, one target area associated with the gesture area.
In a third aspect, an embodiment of the present application provides a gesture recognition device, including a memory and a processor;
the memory is configured to store program code;
the processor calls the program code, and when the program code is executed, is configured to perform the following operations:
determining positions of multiple joint points in an image, and identifying a gesture area and multiple target areas in the image, the multiple joint points in the image including a first joint point located in the gesture area and a second joint point located in the target area;
determining, based on the positions of the multiple joint points in the image, at least one human body joint point distribution map, the human body joint point distribution map including a first joint point and a second joint point;
determining, from the at least one human body joint point distribution map, whether there is a target human body joint point distribution map that meets a preset condition;
when there is a target human body joint point distribution map that meets the preset condition, determining the target area associated with the gesture area according to the target area corresponding to the second joint point in the target human body joint point distribution map.
In a fourth aspect, an embodiment of the present application provides a gesture recognition device, including a memory and a processor;
the memory is configured to store program code;
the processor calls the program code, and when the program code is executed, is configured to perform the following operations:
identifying multiple target areas and a gesture area in an image;
determining an attribute of the gesture area, and determining candidate areas among the multiple target areas according to the attribute of the gesture area;
determining, among the candidate areas, one target area associated with the gesture area.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, the computer program including at least one piece of code executable by a computer to control the computer to perform the method according to any one of the first aspect.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, the computer program including at least one piece of code executable by a computer to control the computer to perform the method according to any one of the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer program which, when executed by a computer, is used to implement the method according to any one of the first aspect.
In an eighth aspect, an embodiment of the present application provides a computer program which, when executed by a computer, is used to implement the method according to any one of the second aspect.
Embodiments of the present application provide a gesture recognition method and device. Positions of multiple joint points in an image are determined, and a gesture area and multiple target areas in the image are identified; based on the positions of the multiple joint points, at least one human body joint point distribution map including a first joint point located in the gesture area and a second joint point located in the target area is determined; whether there is a target human body joint point distribution map meeting a preset condition is determined from the at least one distribution map; and if so, the target area associated with the gesture area is determined according to the gesture area corresponding to the first joint point and the target area corresponding to the second joint point in the target human body joint point distribution map. Gesture attribution is thus determined based on the target human body joint point distribution map. Because the target distribution map is credible and includes both the first joint point and the second joint point, the target object to which the gesture belongs can be determined well, and compared with the traditional technique of directly taking the target area closest to the gesture area among the multiple target areas as the associated target area, errors in determining gesture attribution can be reduced.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario of the gesture recognition method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a gesture recognition method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a gesture area and target areas in an image provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of target human body joint points provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of positions of multiple joint points in an image provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of a gesture recognition method provided by another embodiment of the present application;
FIG. 7 is a schematic diagram of determining a first candidate area based on a dividing line provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of determining a first candidate area based on a dividing line provided by another embodiment of the present application;
FIG. 9 is a schematic diagram of determining a first candidate area based on a dividing line provided by yet another embodiment of the present application;
FIG. 10 is a schematic flowchart of a gesture recognition method provided by yet another embodiment of the present application;
FIG. 11 is a schematic flowchart of a gesture recognition method provided by yet another embodiment of the present application;
FIG. 12 is a schematic structural diagram of a gesture recognition device provided by an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a gesture recognition device provided by another embodiment of the present application.
具体实施方式
为使本申请实施例的目的、技术方案和优点更加清楚,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请实施例提供的手势识别方法可以应用于如图1所示的应用场景,该应用场景可以包括图像采集装置11和手势识别装置12。其中,图像采集装置11用于采集图像,图像采集装置例如可以为摄像头等。手势识别装置12可以对图像采集装置11采集得到的图像采用本申请实施例提供的手势识别方法进行处理,以确定图像中手势所归属的目标对象。
一个实施例中,图像采集装置11和手势识别装置12可以集成于同一设备,该设备例如可以是设置有摄像头的可移动设备,例如手持云台、无人机等。当然,在其他实施例中,该设备还可以为其他类型设备,本申请对此不做限 定。
另一个实施例中,图像采集装置11和手势识别装置12可以分别位于不同的设备,例如图像采集装置11可以位于终端,手势识别装置12可以位于服务器。当然,在其他实施例中,图像采集装置11和手势识别装置12还可以分别位于其他类型设备,本申请对此不做限定。
一些多人场景下,会需要确定手势的归属,例如,多人场景下的跟踪、拍照或录像等,需要确定出手势归属的目标对象,以针对手势所归属目标对象进行跟踪、拍照或录像等。在多人场景下,可以先识别出图像中的手势所在的区域(以下简称为手势区域)和多个目标所在的区域(以下简称为目标区域),然后从多个目标区域中确定出与手势区域相关联的目标区域。需要说明的是,手势区域相关联目标区域中的目标对象即为手势区域中的手势所归属的目标对象。本申请实施例中,目标可以为人脸、头肩、身体等。
传统技术中,是直接将多个目标区域中距离手势区域最近的目标区域作为手势区域相关联的目标区域。然而,这种方式在很多场景下存在手势归属确定错误的问题。例如,假设图像中一人身后站着另一人,两人的人脸均被识别为目标,站在前面的人作出手势,如果图像中这个人的人脸区域a1与手势区域之间的距离大于另一人的人脸区域b1与该手势区域之间的距离,则由于手势区域与人脸区域b1之间的距离较小,因此会直接将人脸区域b1作为手势区域相关联的目标区域,从而出现将站在前面的一人作出的手势误识别为是归属于站在后面的另一人的人脸的问题。又例如,假设图像中两人并排站立,两人的人脸均被识别为目标,站在左侧的一人使用右手作出手势,如果图像中这个人的人脸区域a2与手势区域之间的距离大于另一人的人脸区域b2与该手势区域的距离,则由于手势区域与人脸区域b2之间的距离较小,因此会将人脸区域b2作为手势区域相关联的目标区域,从而出现将站在左侧的一人作出的手势误识别为是归属于站在右侧的另一人的人脸的问题。在手势交互应用中,例如自动跟随应用,如果确定到错误的跟随目标上,云台、无人车等可移动平台会根据错误目标的移动而调整拍摄装置的位姿,而真正需要跟随的目标会出画面,给用户带来不好的跟随体验。又例如拍摄应用,如果确定错误的拍摄目标,拍摄装置会根据错误目标而对焦错误,而真正需要拍摄的目标会出现失焦和模糊的视觉效果,给用户带来不好的拍摄体验。
本申请实施例提供的手势识别方法,通过确定多个关节点在图像中的位 置,并识别图像中的手势区域和多个目标区域,基于多个关节点在图像中的位置,确定出包括位于手势区域中的第一关节点和位于目标区域中的第二关节点的至少一个人体关节点分布图,从至少一个人体关节点分布图中确定是否存在满足预设条件的目标人体关节点分布图;如果是,则根据目标人体关节点中的第一关节点所对应的手势区域和第二关节点所对应的目标区域,确定手势区域相关联的目标区域,实现了基于目标人体关节点分布图确定手势归属,减少了出现手势归属确定错误的情况。
下面结合附图,对本申请的一些实施方式作详细说明。在不冲突的情况下,下述的实施例及实施例中的特征可以相互组合。
图2为本申请一实施例提供的手势识别方法的流程示意图,本实施例的执行主体可以为图1中的手势识别装置12,具体可以为手势识别装置12的处理器。如图2所示,本实施例的方法可以包括:
步骤21,确定多个关节点在图像中的位置,并识别所述图像中的手势区域和多个目标区域,所述图像中的所述多个关节点包括位于所述手势区域中的第一关节点和位于所述目标区域中的第二关节点。
本步骤中,示例性的,所述目标区域可以包括人脸区域、头肩区域或身体区域。其中,人脸区域可以包括人脸框,头肩区域可以包括头肩框,身体区域可以包括身体框。示例性的,所述手势区域可以包括手势框。以目标区域为人脸框,手势区域为手势框为例,所识别的图像中的手势区域和多个目标区域可以如图3所示,其中,虚线框A1和A2是人脸框,虚线框A3是手势框。需要说明的是,图3中以目标的数量为两个为例。需要说明的是,关于识别图像中手势区域和目标区域的具体方式,本申请实施例不做限定。
一个实施例中,可以采用预设的关节点检测算法,对整个图像进行关节点检测处理,以确定多个关节点在图像中的位置。
所述预设的关节点检测算法可以用于检测多个预设关节类型的关节点。为了能够基于图像中的关节点分布情况进行手势归属识别,所述多个预设关节类型至少包括手部的关节类型、头部的关节类型以及用于连接头部和手部的身体部位的关节类型。一个实施例中,手部的关节类型可以包括手腕关节,考虑到手指关节较多且识别难度大,通过使用手腕关节可以简化实现。头部的关节类型可以包括头部关节和/或颈椎关节。用于连接头部和手部的身体部位的关节类型可以包括肩关节和肘关节。可选的,为了能够获得更加完整的 人体关节点分布图,还可以获得下半身的关节点分布情况,基于此,所述多个预设关节类型还可以包括腰椎关节、膝关节、脚腕关节等。
另一个实施例中,可以采用预设的关节点检测算法,对图像中的部分区域进行关节点检测处理,以确定多个关节点在图像中的位置。通过针对部分区域进行关节点检测处理,与针对整个图像进行关节点检测相比,有利于减小计算量,节省计算资源。其中,在目标区域为人脸区域(或头肩区域)时,所述部分区域可以较目标区域大,以获得手部与头部(或头肩)之间的关节点情况。
基于此,本申请实施例提供的方法还可以包括:根据识别到的所述目标区域确定感兴趣区域;在感兴趣区域中检测所述多个关节点。其中,所述感兴趣区域即为前述的部分区域。一个实施例中,当目标区域涉及的关节类型较预设关节类型不足时,可以通过在目标区域的基础上进行区域扩大的方式确定感兴趣区域。另一个实施例中,当目标区域涉及的关节类型较预设关节类型过多时,可以通过在目标区域的基础上进行区域缩小的方式确定感兴趣区域。又一个实施例中,当目标区域涉及的关节类型的数量与预设关节类型的数量相同时,可以直接将目标区域作为感兴趣区域。
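The preceding passage describes deriving the region of interest by enlarging (or shrinking) the detected target area so that it covers the joint types to be detected. A minimal Python sketch of the enlargement case; the scale factor and image dimensions are assumed values, not from the patent:

```python
def roi_from_target(target_box, scale=2.0, img_w=1920, img_h=1080):
    """Expand a detected face/head-shoulder box (x, y, w, h) around its
    center into a larger region of interest able to cover shoulder,
    elbow and wrist joints, clamped to the image bounds."""
    x, y, w, h = target_box
    cx, cy = x + w / 2, y + h / 2
    nw, nh = w * scale, h * scale
    nx = max(0.0, cx - nw / 2)
    ny = max(0.0, cy - nh / 2)
    return (nx, ny, min(nw, img_w - nx), min(nh, img_h - ny))
```

Running joint detection only inside such a region, rather than over the whole frame, is what gives the computational saving the text mentions.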
一个实施例中,第一关节点的类型可以为手腕关节。第二关节点的类型可以为头部关节、颈椎关节、肩部关节、手肘关节中的一种或多种。
步骤22,基于所述多个关节点在所述图像中的位置,确定出至少一个人体关节点分布图,所述人体关节点分布图包括第一关节点和第二关节点。
本步骤中,所述至少一个人体关节点分布图是通过对所述图像中的关节点进行连线所得到,且包括位于手势区域中的第一关节点和位于目标区域中的第二关节点的连线图。
其中,所述至少一个人体关节点分布图中可能包括可信的人体关节点分布图和/或不可信的人体关节点分布图。为了使得基于人体关节点分布进行手势归属识别能够获得较好的准确性,需要使用可信的人体关节点分布图来进行手势归属识别。基于此,在步骤22确定出的至少一个人体关节点分布图之后可以继续执行步骤23。
步骤23,从所述至少一个人体关节点分布图中,确定是否存在满足预设条件的目标人体关节点分布图。
其中,满足预设条件的目标人体关节点分布图,可以理解为可信的人体 关节点分布图。由于目标人体关节点分布图可信,且包括了第一关节点和第二关节点,由此可以进一步执行步骤24,以基于目标人体关节点分布图确定手势归属。例如,对于图3经过步骤21-步骤23处理之后,所得到的目标人体关节点分布图可以如图4所示,目标人体分布图的关节点包括关节点1至关节点8,其中,关节点1为头关节,关节点2为颈关节,关节点3和关节点4为肩关节,关节点5和关节点6为肘关节、关节点7和关节点8为手腕关节。需要说明的是,图4中以所述多个预设关节类型包括头关节、颈关节、肩关节、肘关节和手腕关节为例。
如果至少一个人体关节点分布图中不存在满足预设条件的目标人体关节点分布图,则表示至少一个关节点分布图均不可信,由于基于不可信的人体关节点分布图确定手势归属的确定结果必定也是不可信的,因此无法使用人体关节点分布图的方式确定手势归属。针对于一些特殊场景,会出现至少一个人体关节点分布图中不存在满足预设条件的目标人体关节点分布图的情况,例如,在前置拍摄模式下,由于所获得的图像中通常只有人脸和手势,缺少了一些组成关节点分布图所必要的关节点信息,基于非常有限的关节点无法确定出可以相信的人体关节点分布图,因此会导致至少一个人体关节点分布图中不存在满足预设条件的目标人体关节点分布图。
一个实施例中,针对于所述至少一个人体关节点分布图中不存在满足预设条件的目标人体关节点分布图的情况,可以进一步采用最近距离原则进行手势归属确定,即将多个目标区域中距离手势区域最近的目标区域作为手势区域相关联的目标区域。此种方式,由于针对一些场景已能够基于人体关节点分布图较好的确定出手势所归属的目标对象,针对于不能够得到可信且包括第一关节点和第二关节点的人体关节点分布图的特殊场景,才使用最近距离原则,因此此种方式与传统技术中使用最近距离原则相比,能够减少出现手势归属确定错误的情况。
步骤24,当存在满足所述预设条件的目标人体关节点分布图时,根据所述目标人体关节点分布图中的第二关节点所对应的目标区域,确定所述手势区域相关联的目标区域。
一个实施例中,所述目标人体关节点分布图的个数可以为一个;所述根据所述目标人体关节点中的第一关节点所对应的手势区域和第二关节点所对应的目标区域,确定所述手势区域相关联的目标区域,具体可以包括:将所 述目标人体关节点分布图中所述第二关节点对应的目标区域确定为所述手势区域相关联的目标区域。例如,参考图4,由于目标人体关节点分布图中第二关节点(即关节点1)对应的目标区域为人脸框A2,因此可以确定人脸框A2为手势框A3相关联,手势框A3中的关节点(即关节点7)即为第一关节点。另外,参考图3可以看出,由于人脸框A1与手势框A3之间的距离小于人脸框A2与手势框A3之间的距离,采用传统技术的方式会将手势框A3误识别为与人脸框A1相关联。
可选的,所述目标人体关节点分布图的个数还可能为多个。如果为多个,则可以进一步从多个目标人体关节点分布图中确定出一个最优的目标人体关节点分布图,并将该目标人体关节点分布图中所述第二关节点对应的目标区域确定为所述手势区域相关联的目标区域。一个实施例中,可以采用最近距离原则从多个人体关节点分布图中确定出一个人体关节点分布图的方式。另一个实施例中,可以基于手势区域的左右手属性、尺寸属性、位置属性等中的一个或多个,从多个人体关节点分布图中确定出一个人体关节点分布图的方式。需要说明的是,基于左右手属性、位置属性和/或尺寸属性从多个人体关节点分布图中选择一个人体关节点分布图的方式,可以参见后续实施例的相关描述,在此不再赘述。
本申请实施例提供的手势识别方法,通过确定多个关节点在图像中的位置,并识别图像中的手势区域和多个目标区域,基于多个关节点在图像中的位置,确定出包括位于手势区域中的第一关节点和位于目标区域中的第二关节点的至少一个人体关节点分布图,从至少一个人体关节点分布图中确定是否存在满足预设条件的目标人体关节点分布图,如果是,则根据目标人体关节点中的第一关节点所对应的手势区域和第二关节点所对应的目标区域,确定手势区域相关联的目标区域,实现了基于目标人体关节点分布图确定手势归属,由于目标人体关节点分布图是可信且包括第一关节点和第二关节点的人体关节点分布图,因此基于目标人体关节点分布图能够较好的确定出手势所归属的目标对象,与传统技术中直接将多个目标区域中距离手势区域最近的目标区域作为手势区域相关联的目标区域相比,能够减少出现手势归属确定错误的情况。
在上述实施例的基础上,可选的,所述基于所述多个关节点在所述图像中的位置,确定出至少一个人体关节点分布图,具体可以包括:从所述图像 的不同类型关节点对应的关节点集合中,分别取出至少一个关节点进行组合,得到多组关节点;对各组关节点中的关节点进行连线,得到所述至少一个人体关节点分布图及其置信度。
以图3所示的图像中的多个关节点如图5所示为例,参考图5,该多个关节点可以包括关节点1’-关节点8’以及关节点1至关节点8,其中,关节点1和关节点1’为头关节,关节点2和关节点2’为颈关节,关节点3、关节点3’、关节点4和关节点4’为肩关节,关节点5、关节点5’、关节点6和关节点6’为肘关节,关节点7、关节点7’、关节点8和关节点8’为手腕关节。基于此,头关节可以对应关节点集合a,关节点集合a可以包括关节点1和关节点1’;颈关节可以对应关节点集合b,关节点集合b可以包括关节点2和关节点2’;肩关节可以对应关节点集合c,关节点集合c可以包括关节点3、关节点3’、关节点4和关节点4’;肘关节可以对应关节点集合d,关节点集合d可以包括关节点5、关节点5’、关节点6和关节点6’;手腕关节可以对应关节点集合e,关节点集合e可以包括关节点7、关节点7’、关节点8和关节点8’。
需要说明的是,在其他实施例中,前述多个预设关节类型可以区分左肩关节和右肩关节,进一步的,左肩关节可以对应一个关节点集合,右肩关节可以对应另一个关节点集合。
在获得关节点集合a-关节点集合e之后,可以从关节点集合a-关节点集合e中分别取出至少一个关节点进行组合,得到多组关节点。例如,可以从关节点集合a中取出关节点1’,从关节点集合b中取出关节点2,从关节点集合c中取出关节点3和4,从关节点集合d中取出关节点5’和6’,从关节点集合e中取出关节点7和7’,得到一组关节点。类似的,可以通过其他的关节点组合方式,得到其他组关节点。可以理解的是,其中多组关节点中的其中一组可以包括关节点1至关节点7。
由于关节点7是手势区域中手腕关节的关节点,因此关节点7是关键关节点,而同为手腕关节的关节点7’、关节点8和关节点8’并不是手势区域中手腕关节的关节点,因此关节点7’、关节点8和关节点8’是非关键关节点。为了减少计算量,关节点集合e中可以只包括关节点7,或者可以不从关节集合e中取出关节点7’、关节点8和关节点8’进行关节点的组合。
在获得多组关节点之后,可以对各组关节点中的关节点进行连线,得到所述至少一个人体关节点分布图及其置信度。置信度用于表示可信的程度。 置信度越低可以表示越不可信,置信度越高可以表示越可信。在一人体关节点分布图的置信度大于预设阈值时可以表示该人体关节点分布图是可信的,在一人体关节点内分布图的置信度小于预设阈值时可以表示该人体关节点分布图是不可信的。
其中,考虑到人体关节点分布图是基于关节点得到的,关节点的置信度一定程度上会影响人体关节点分布图的人体关节点分布图的置信度,因此为了使得人体关节点分布图的置信度能够有效表征人体关节点分布图可信的程度,人体关节点分布图的置信度可以与其包括的关节点的置信度有关。一人体关节点分布图其包括的关节点的置信度越高,则该人体关节点分布图的置信度可以越高。相反的,一人体关节点分布图其包括的关节点的置信度越低,则该人体关节点分布图的置信度可以越低。
考虑到人体关节点的运动角度是受限的,不符合运动角度要求的关节点连线必定是不可信的,因此为了使得人体关节点分布图的置信度能够有效表征人体关节点分布图可信的程度,人体关节点分布图的置信度还可以与其关节点之间连线的角度是否符合人体关节点运动角度要求有关。如果一人体关节点分布图存在关节点之间的连线不符合人体关节点运动角度要求的情况,则该人体关节点分布图的置信度小于预设阈值,即该人体关节点分布图不可信。
考虑到人体关节点分布图所涉及关节点类型的数量越少,其所缺失的有效关节信息越多,进行关节点连线所得到的人体关节点分布图错误的概率越大,因此为了使得人体关节点分布图的置信度能够有效表征人体关节点分布图可信的程度,人体关节点分布图的置信度还可以与其所涉及关节点类型的数量有关。一人体关节点分布图所涉及关节点类型的数量越多,则该人体关节点分布图的置信度可以越高。相反的,一人体关节点分布图所涉及关节点类型的数量越少,则该人体关节点分布图的置信度可以越低。所涉及关节点类型的数量等于前述预设关节点类型的数量的人体关节点分布图,可以认为是完整的人体关节点分布图。所涉及关节点类型的数量小于前述预设关节点类型的数量的人体关节点分布图,可以认为是不完整的人体关节点分布图。如果一人体关节点分布图是不完整的,则该人体关节点分布图的置信度可以小于预设阈值,即该人体关节点分布图不可信。
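The passage above names three factors that shape a distribution map's confidence: the confidence of its joint points, whether the angles of the connections between joints respect human motion limits, and how complete the set of involved joint types is relative to the preset types. One hedged way to combine them, averaging joint confidences, treating an invalid angle as a hard veto, and scaling by completeness; the weighting scheme itself is an assumption for illustration:

```python
def skeleton_confidence(joint_confs, angles_ok, n_types, n_preset_types):
    """Confidence of one candidate distribution map.
    joint_confs: per-joint detection confidences;
    angles_ok: False if any bone angle violates human joint limits;
    n_types / n_preset_types: completeness of the joint-type set."""
    if not angles_ok or not joint_confs:
        return 0.0                       # invalid angles make the map untrustworthy
    avg = sum(joint_confs) / len(joint_confs)
    completeness = n_types / n_preset_types
    return avg * completeness
```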
一个实施例中,所述对各组关节点中的关节点进行连线,得到所述至少 一个人体关节点分布图及其置信度,具体可以包括:按照预设的关节点连线策略,对各组关节点中的关节点进行连线,得到所述至少一个人体关节点分布图及其置信度。所述预设的关节点连线策略与人体真实关节之间的连接方式一致。例如,由于手腕关节是通过小臂与肘关节,因此预设的关节点连线策略中可以包括手腕关节类型的关节点与肘关节类型的关节点连接。又例如,由于肘关节是通过大臂与肩关节连接,因此预设的关节点连线策略中还可以包括肘关节类型的关节点与肩关节类型的关节点连接。通过按照预设的关节点连线策略,对各组关节点中的关节点进行连线,以得到所述至少一个人体关节点分布图及其置信度,有利于减少计算量。
示例性的,所述预设的关节点连线策略例如可以为按照“头关节→颈关节→肩关节→肘关节→手腕关节”的顺序依次连线的策略。一组关节点包括图5中的关节点1至关节点8,则按照该关节点连线策略对各组关节点中的关节点进行连线具体可以为:首先将关节点1与关节点2进行连线,然后将关节点2分别与关节点3和关节点4进行连线,之后将关节点3与关节点5进行连线并将关节点4与关节点6进行连接,最后将关节点5与关节点7进行连线并将关节点6与关节点8进行连线。
示例性的,所述预设的关节点连线策略可以为按照“手腕关节→肘关节→肩关节→颈关节→头关节”的顺序依次连线的处理。一组关节点包括图5中的关节点1至关节点8,则按照该关节点连线策略对各组关节点中的关节点进行连线具体可以为:首先将关节点7与关节点5进行连线并将关节点8与关节点6进行连线,然后将关节点5与关节点3进行连线并将关节点6与关节点4进行连线,之后将关节点3和关节点4分别与关节点2进行连线,最后将关节点2与关节点1进行连线。
可以理解的是,按照预设的关节点连线策略对各组关节点中的关节点进行连线的方式可以得到多个人体关节点分布图及其置信度,多组人体关节点分布图可以与多组关节点对应,从多个人体关节点分布图中进一步的可以确定出同时包括第一关节点和第二关节点的人体关节点分布图,即所述至少一个人体关节点分布图及其置信度。
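The preset connection strategy described above (head to neck to shoulders to elbows to wrists, mirroring how real human joints connect) can be sketched as a fixed list of joint-type pairs; the joint names used here are assumptions for illustration:

```python
# Preset connection strategy, consistent with real anatomy:
# head-neck, neck-shoulders, shoulder-elbow (upper arm), elbow-wrist (forearm).
CONNECTION_ORDER = [
    ("head", "neck"),
    ("neck", "l_shoulder"), ("neck", "r_shoulder"),
    ("l_shoulder", "l_elbow"), ("r_shoulder", "r_elbow"),
    ("l_elbow", "l_wrist"), ("r_elbow", "r_wrist"),
]

def connect_joints(joints):
    """joints: dict mapping joint type to an (x, y) position. Returns the
    list of connections (bones) for which both endpoints were detected."""
    return [(joints[a], joints[b]) for a, b in CONNECTION_ORDER
            if a in joints and b in joints]
```

Restricting connections to this fixed list, instead of exhaustively trying every pairing, is what keeps the computation small in the strategy-based variant.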
另一个实施例中,所述对各组关节点中的关节点进行连线,得到所述至少一个人体关节点分布图及其置信度,具体可以包括:通过穷举方式确定各组关节点中关节点的连线方式并进行连线,得到所述至少一个人体关节点分 布图及其置信度。通过穷举方式进行连线,不再局限于特定的某个类型关节的关节点与某个类型关节的关节点进行关节点连线,而是穷举所有可能的关节点连线方式需要说明的是,采用穷举方式进行连接,由于其中很多的连线方式与人体真实关节之间的连接方式不一致的,由此通过这些连线方式所得到的人体关节点分布图的置信度也非常低,即通过这些连线方式所得到的人体关节点分布图也是不可信的,从而不会影响目标人体关节点分布图的确定结果。
可以理解的是,通过穷举方式确定各组关节点中关节点的连线方式并进行连线的方式,针对各组关节点可以得到多个人体关节点分布图及其置信度,从各组关节点分别对应的多个人体关节点分布图中进一步的可以确定出同时包括第一关节点和第二关节点的人体关节点分布图,即所述至少一个人体关节点分布图及其置信度。
在上述实施例的基础上,所述从所述至少一个人体关节点分布图中,确定是否存在满足预设条件的目标人体关节点分布图,具体可以包括:获取所述至少一个人体关节点分布图中的最大置信度;当所述最大置信度大于预设阈值时,确定存在满足预设条件的目标人体关节点分布图;所述目标人体关节点分布图为具有最大置信度的人体关节点分布图;当所述最大置信度小于预设阈值时,确定不存在满足预设条件的目标人体关节点分布图。
其中,所述至少一个人体关节点分布图中的各人体关节点分布图均具有某一个置信度,所述至少一个人体关节点分布图中具有最大置信度的人体关节点分布图,可能是可信的人体关节点分布图,也可能不是。在该最大置信度大于预设阈值时,可以表示该具有最大置信度的人体关节点分布图是可信的,具有最大置信度的人体关节点分布图能够作为用于进行手势归属识别的目标人体关节点分布图,由此可以确定所述至少一个人体关节点分布图中存在满足预设条件的目标人体关节点分布图。在该最大置信度小于预设阈值时,可以表示该具有最大置信度的人体关节点分布图是不可信的,具有最大置信度的人体关节点分布图并不能作为用于进行手势归属识别的目标关节点分布图,由此可以确定所述至少一个人体关节点分布图中不存在满足预设条件的目标人体关节点分布图。
图6为本申请另一实施例提供的手势识别方法的流程示意图,本实施例在图2所示实施例的基础上,主要描述了当不存在所述满足所述预设条件的目标 人体关节点分布图时的一种可选实现方式。如图6所示,本实施例的方法可以包括:
步骤61,确定多个关节点在图像中的位置,并识别所述图像中的手势区域和多个目标区域,所述图像中的所述多个关节点包括位于所述手势区域中的第一关节点和位于所述目标区域中的第二关节点。
步骤62,基于所述多个关节点在所述图像中的位置,确定出至少一个人体关节点分布图,所述人体关节点分布图包括第一关节点和第二关节点。
步骤63,从所述至少一个人体关节点分布图中,确定是否存在满足预设条件的目标人体关节点分布图。
步骤64,当存在满足所述预设条件的目标人体关节点分布图时,根据所述目标人体关节点分布图中的第二关节点所对应的目标区域,确定所述手势区域相关联的目标区域。
需要说明的是,步骤61-步骤64与前述的步骤21-步骤24类似,在此不再赘述。
步骤65,当不存在所述满足所述预设条件的目标人体关节点分布图时,确定所述多个目标区域中符合第一条件的第一候选区域。
步骤66,当所述第一候选区域的个数为一个时,确定所述手势区域与所述第一候选区域关联。
步骤67,当所述第一候选区域的个数为多个时,确定与所述手势区域距离最近的第一候选区域与所述手势区域相关联。
在步骤65中,基于所述第一条件,可以从所述多个目标区域中去除掉明显与手势区域不关联的目标区域,得到可能与手势区域关联的候选区域(即第一候选区域),进而可以从第一候选区域中进一步确定出与手势区域关联的目标区域。由于通过第一条件可以缩小用于进一步确定手势区域相关联的目标区域的范围,与传统技术中直接将识别出的多个目标区域中距离手势区域最近的目标区域作为手势区域相关联的目标区域相比,能够减少出现手势归属确定错误的情况。
示例性的,可以基于手势区域的第一属性,确定多个目标区域中符合第一条件的第一候选区域。
方式一,考虑到图像拍摄存在近大远小的现象,即与图像采集装置之间距离差异较大的多个目标,在图像采集装置所采集到的图像中的尺寸存在明 显的大小差异,且通常用户使用手势交互时手距离身体的距离较近,因此可以基于手势区域的尺寸属性,去除掉明显与手势区域不关联的目标区域,由此在一个实施例中第一属性可以包括尺寸属性。
基于此,所述确定所述多个目标区域中符合第一条件的第一候选区域,具体可以包括:根据所述目标区域的尺寸和所述手势区域的尺寸的关系,确定符合所述第一条件的第一候选区域。
其中,尺寸例如可以通过长宽、对角线或面积等进行表示。一个实施例中,所述第一条件包括所述目标区域的尺寸和所述手势区域的尺寸的比例符合预设关系。
可选的,预设关系可以为目标区域的尺寸与手势区域的尺寸的比例小于第一比例阈值,以去除掉与手势区域的尺寸相比明显过大目标区域。和/或,预设关系可以为目标区域的尺寸与手势区域的尺寸的比例大于第二比例阈值,以去除掉尺寸与手势区域的尺寸相比明显过小目标区域。其中,第一比例阈值和第二比例阈值例如可以根据实验确定。
或者,可选的,预设关系可以由预设的第一模型决定,该第一模型可以基于输入的一目标区域的尺寸和一手势区域的尺寸,输出用于表示该目标区域的尺寸与该手势区域的尺寸的关联程度的第一输出结果。示例性的,该第一模型可以为神经网络模型,第一输出结果例如可以为0或者1,0可以表示比例不符合预设关系,1可以表示比例符合预设关系。
方式二,考虑到手作为身体的一部分,手势区域与其属于同一人体的目标区域之间的距离受限于身体结构特点,因此可以基于手势区域的位置属性,去除掉明显与手势区域不关联的目标区域,由此在另一个实施例中第一属性可以包括位置属性。
基于此,所述确定所述多个目标区域中符合第一条件的第一候选区域,具体可以包括:根据所述目标区域的位置和所述手势区域的位置,确定符合所述第一条件的第一候选区域。
其中,位置例如可以为像素位置,例如手势区域的位置可以为手势区域的中心像素的像素位置。相应的,所述第一条件可以用于去除掉与手势区域的位置明显不符合的目标区域。一个实施例中,所述第一条件包括所述目标区域和所述手势区域的距离小于预设的距离阈值。其中,所述距离阈值例如可以根据实验确定。
方式三,考虑到图像中的内容存在左右镜像关系,且通常用户使用手势进行交互时左手在身体左侧,右手在身体的右侧,因此可以基于手势区域的左右手属性,去除掉明显与手势区域不关联的目标区域,由此在一个实施例中第一属性可以包括左右手属性。
基于此,所述确定所述多个目标区域中符合第一条件的第一候选区域,具体可以包括:确定所述手势区域的左右手属性,所述手势区域的左右手属性为左手或右手;根据所述手势区域的左右手属性,确定所述多个目标区域中符合第一条件的第一候选区域。其中,所述第一条件可以用于去除掉与手势区域的左右手属性明显不相符的目标区域。
示例性的,所述根据所述手势区域的左右手属性,确定所述多个目标区域中符合第一条件的第一候选区域,具体可以包括:根据所述手势区域的左右手属性,确定一分割线;当任一所述目标区域位于所述分割线的第一方向的一侧时,确定目标区域为第一候选区域;所述第一方向由所述手势区域的左右手属性确定。
在所述图像中,,所述第一方向与所述手势区域的左右手属性的关系可以为:当所述手势区域的左右手属性为左手时,以所述手势区域的左侧方向作为所述第一方向;当所述手势区域的左右手属性为右手时,以所述手势区域的右侧方向作为所述第一方向。
以手势区域的左右手属性为右手为例,基于分割线确定第一候选区域的示意图可以如图7所示,其中,虚线框A4和A5为人脸框,虚线框A6为手势框,虚线表示分割线,箭头方向表示第一方向。参考图7可以看出,通过在手势区域的左右手属性为右手时,将位于分割线的第一方向一侧的目标区域确定为第一候选区域,可以排除掉虚线框A4所表示的人脸区域。
需要说明的是,图7中以分割线为竖直方向的直线为例,在其他实施例中,分割线还可以为其他形式,本申请对此不做限定。
需要说明的是,图7中以分割线为手势区域的左边界所在的直线为例。另一个实施例中,所述根据所述手势区域的左右手属性,确定一分割线,具体可以包括:当所述手势区域的左右手属性为左手时,以所述手势区域的右边界与第二方向的偏移量确定所述分割线;当所述手势区域的左右手属性为右手时,以所述手势区域的左边界与第二方向的偏移量确定;其中,所述第一方向与第二方向的相反;所述偏移量大于等于零。通过以上方式,能避免 将正确关联的目标区域给排除掉。
以手势区域的左右手属性为右手为例,基于分割线确定第一候选区域的示意图可以如图8所示,其中,虚线框A7和A8为人脸框,虚线框A9为手势框,虚线表示分割线,向右箭头方向表示第一方向,向左箭头方向表示第二方向。参考图8可以看出,通过在手势区域的左右手属性为右手时,以所述手势区域的左边界与第二方向的偏移量确定,在手势位于目标前方场景的下可以减小出现第一候选区域中未包括手势区域真实关联的目标区域(虚线框A8表示的人脸区域)的问题。需要说明的是,图8中以分割线为竖直方向的直线,且偏移量大于0为例。
示例性的,所述根据所述手势区域的左右手属性,确定所述多个目标区域中符合第一条件的第一候选区域,具体可以包括:根据所述手势区域的左右手属性,确定一分割线;当任一所述目标区域大于或等于预设比例的像素位于所述分割线的第一方向的一侧时,确定该目标区域为第一候选区域;所述第一方向由所述手势区域的左右手属性确定。需要说明的是,关于第一方向以及分割线的具体内容,以参见之前的描述,在此不再赘述。
以手势区域的左右手属性为右手为例,分割线是以手势区域的左边界确定为例,如图9所示,其中,虚线框A10和A11为人脸框,虚线框A12为手势框,虚线表示分割线,向右箭头方向表示第一方向,向左箭头方向表示第二方向。参考图9,如果预设比例为50%,则由于人脸框A11大于50%的像素位于分割线的第一方向的一侧,因此可以将人脸框A11确定为第一候选区域。参考图9可以看出,通过任一所述目标区域的预设比例的像素位于所述分割线的第一方向的一侧时,确定该目标区域为第一候选区域,在手势位于目标前方的场景下也可以减小出现第一候选区域中未包括手势区域真实关联的目标区域(虚线框A9表示的人脸区域)的问题。需要说明的是,图9中以分割线为竖直方向的直线,且偏移量大于0为例。
本申请实施例提供的手势识别方法,通过当不存在满足预设条件的目标人体关节点分布图时,确定多个目标区域中符合第一条件的第一候选区域,当第一候选区域的个数为一个时确定手势区域与第一候选区域关联,当第一候选区域的个数为多个时,确定与手势区域距离最近的第一候选区域与手势区域相关联,实现了在不存在满足预设条件的目标人体关节点分布图情况下,先通过第一条件去除掉多个目标区域中明显与手势区域不关联的目标区域得 到第一候选区域,再在第一候选区域中确定手势区域相关联的目标区域,由于通过第一条件可以缩小用于进一步确定手势区域相关联的目标区域的范围,与传统技术中直接将识别出的多个目标区域中距离手势区域最近的目标区域作为手势区域相关联的目标区域相比,能够减少出现手势归属确定错误的情况。
图10为本申请又一实施例提供的手势识别方法的流程示意图,本实施例在图6所示实施例的基础上,主要描述了当第一候选区域的个数为多个时的另一种可选实现方式。如图10所示,本实施例的方法可以包括:
步骤101,确定多个关节点在图像中的位置,并识别所述图像中的手势区域和多个目标区域,所述图像中的所述多个关节点包括位于所述手势区域中的第一关节点和位于所述目标区域中的第二关节点。
步骤102,基于所述多个关节点在所述图像中的位置,确定出至少一个人体关节点分布图,所述人体关节点分布图包括第一关节点和第二关节点。
步骤103,从所述至少一个人体关节点分布图中,确定是否存在满足预设条件的目标人体关节点分布图。
步骤104,当存在满足所述预设条件的目标人体关节点分布图时,根据所述目标人体关节点分布图中的第二关节点所对应的目标区域,确定所述手势区域相关联的目标区域。
步骤105,当不存在所述满足所述预设条件的目标人体关节点分布图时,确定所述多个目标区域中符合第一条件的第一候选区域。
步骤106,当所述第一候选区域的个数为一个时,确定所述手势区域与所述第一候选区域关联。
需要说明的是,步骤101-步骤106与前述的步骤61-步骤64类似,在此不再赘述。
步骤107,当所述第一候选区域的个数为多个时,确定多个所述第一候选区域中符合第二条件的第二候选区域。
步骤108,当所述第二候选区域的个数为一个时,确定所述手势区域与所述第二候选区域相关联。
步骤109,当所述第二候选区域的个数为多个时,确定与所述手势区域距离最近的第二候选区域与所述手势区域相关联。
在步骤105中,基于所述第一条件,可以从所述多个目标区域中去除掉 明显与手势区域不关联的目标区域,得到可能与手势区域关联的一个或多个候选区域(即第一候选区域)。步骤107中,在第一候选区域的个数为多个时,基于所述第二条件,可以进一步从多个所述第一候选区域中去除掉明显与手势区域不关联的目标区域,得到可能与手势区域关联的一个或多个第二候选区域。由于通过第二条件可以进一步缩小用于确定手势区域相关的目标区域的范围,因此与图6所示的方法相比,有利于进一步减少出现手势归属确定错误的情况。
可以理解的是,所述第二条件与所述第一条件不同。所述第一条件与所述手势区域的第一属性相关,所述第二条件与所述手势区域的第二属性,第二属性与第一属性不同。
一个实施例中,在根据所述目标区域的尺寸和所述手势区域的尺寸的关系,确定符合所述第一条件的第一候选区域的基础上,可以进一步根据所述第一目标区域的位置和所述手势区域的位置,确定符合所述第二条件的第二候选区域。由此可以得到先基于尺寸属性从多个候选区域中确定第一候选区域,再基于位置属性从多个第一候选区域中确定第二候选区域的候选区域确定方式。可以理解的是,交换第一条件和第二条件的顺序,可以得到先基于位置属性从多个候选区域中确定第一候选区域,再基于尺寸属性从多个第一候选区域中确定第二候选区域的候选区域确定方式。
示例性的,所述确定多个所述第一候选区域中符合第二条件的第二候选区域,具体可以包括:根据多个所述第一目标区域的位置和所述手势区域的位置,确定符合所述第二条件的第二候选区域。
示例性的,所述第二条件包括所述第一目标区域和所述手势区域的距离小于预设的距离阈值。
需要说明的是,根据第一目标区域的位置和手势区域的位置确定符合第二条件的第二候选区域的方式,与前述方式二所述的根据目标区域的位置和手势区域的位置确定符合第一条件的第一候选区域的方式类似,在此不再赘述。
示例性的,所述确定多个所述第一候选区域中符合第二条件的第二候选区域,具体可以包括:根据所述第一目标区域的尺寸和所述手势区域的尺寸,确定符合所述第二条件的第二候选区域。
示例性的,所述第二条件包括所述第一目标区域的尺寸和所述手势区域 的尺寸的比例符合预设关系。
需要说明的是,根据第一目标区域的尺寸和手势区域的尺寸确定符合第二条件的第二候选区域的方式,与前述方式一所述的根据目标区域的尺寸和手势区域的尺寸确定符合第一条件的第一候选区域的方式类似,在此不再赘述。
另一个实施例中,在根据所述目标区域和所述手势区域的尺寸和/或位置,确定符合所述第一条件的第一候选区域的基础上,可以进一步根据手势区域的左右手属性,确定符合所述第二条件的第二候选区域。由此可以得到先基于尺寸属性从多个候选区域中确定第一候选区域,再基于左右手属性从多个第一候选区域中确定第二候选区域的候选区域确定方式。可以理解的是,交换第一条件和第二条件的顺序,可以得到先基于左右手属性从多个候选区域中确定第一候选区域,再基于尺寸属性从多个第一候选区域中确定第二候选区域的候选区域确定方式。
示例性的,所述确定多个所述第一候选区域中符合第二条件的第二候选区域,具体可以包括:确定所述手势区域的左右手属性;所述手势区域的左右手属性为左手或右手;根据所述手势区域的左右手属性,确定多个所述第一候选区域中符合第二条件的第二候选区域。
示例性的,所述根据所述手势区域的左右手属性,确定多个所述第一候选区域中符合第二条件的第二候选区域,包括:根据所述手势区域的左右手属性,确定一分割线;当任一所述第一候选区域大于或等于预设比例的像素位于所述分割线的第一方向的一侧时,确定该第一候选区域为第二候选区域;所述第一方向由所述手势区域的左右手属性确定。
需要说明的是,根据手势区域的左右手确定符合第二条件的第二候选区域的方式,与前述方式三所述的根据手势区域的左右手属性确定符合第一条件的第一候选区域的方式类似,在此不再赘述。
又一个实施例中,在根据所述目标区域的位置和所述手势区域的位置,确定符合所述第一条件的第一候选区域的基础上,可以进一步根据手势区域的左右手属性,确定符合所述第二条件的第二候选区域。由此可以得到先基于位置属性从多个候选区域中确定第一候选区域,再基于左右手属性从多个第一候选区域中确定第二候选区域的候选区域确定方式。可以理解的是,交换第一条件和第二条件的顺序,可以得到先基于左右手属性从多个候选区域 中确定第一候选区域,再基于位置属性从多个第一候选区域中确定第二候选区域的候选区域确定方式。
可替换的,当第二候选区域的个数为多个时,也可以进一步确定符合第三条件的第三候选区域,并在第三候选区域中确定与手势区域相关联的区域。第三条件是与第一条件和第二条件均不相同的条件。
本申请实施例提供的方法,通过当不存在满足预设条件的目标人体关节点分布图时,确定多个目标区域中符合第一条件的第一候选区域,当第一候选区域的个数为一个时,确定手势区域与第一候选区域关联,当第一候选区域的个数为多个时,确定多个第一候选区域中符合第二条件的第二候选区域,当第二候选区域的个数为多个时,确定与手势区域距离最近的第二候选区域与手势区域相关联,实现了在不存在满足预设条件的目标人体关节点分布图情况下,先通过第一条件去除掉多个目标区域中明显与手势区域不关联的目标区域得到第一候选区域,再在第一候选区域中先通过第二条件进一步去除掉明显与手势区域不关联的目标区域得到第二候选区域,之后在第二候选区域中确定出手势区域相关联的目标区域,由于通过第一条件和第二条件依次缩小用于确定手势区域相关的目标区域的范围,能进一步缩小候选区域的范围,有利于进一步减少出现手势归属确定错误的情况。
可选的,在上述方法实施例的基础上,还可以包括如下步骤:当确定所述手势区域相关联的目标区域时,对所述目标区域中的目标对象进行跟踪、拍照或录像。从而实现了多人场景下,针对手势所归属目标对象的跟踪、拍照或录像功能,有利于提高用户的使用体验。
进一步可选的,还可以包括如下步骤:当对所述目标对象进行跟踪或录像时,不断更新感兴趣区域;当在感兴趣区域中识别到手势时,停止进行跟踪或录像。从而实现了在跟踪或录像过程中,用户能够通过手势控制停止跟踪或录像,方便用户进行跟踪或录像的停止操作,有利于提高用户的使用体验。
一个实施例中,所述当在感兴趣区域中识别到手势时,停止进行跟踪或录像,具体可以包括:当在感兴趣区域识别到手势时,确定所述手势是否与所述目标对象相关联;当所述手势与所述目标对象相关联时,停止进行跟踪或录像。其中,所述手势是否与目标对象相关联,即所述手势是否归属于所述目标对象,所述手势所在的手势区域是否与所述目标对象所在的对象区域 相关联。关于确定手势是否与目标对象相关联的具体方式可以参见前述实施例,在此不再赘述。
考虑到跟踪或录像的启停控制通常为同一人,通过当手势与目标对象相关联时,停止进行跟踪或录像,实现了在由与目标对象相关联的一手势启动跟踪或录像过程之后,进一步可以由该目标对象通过手势控制停止跟踪或录像,即启停跟踪或录像的目标为同一目标,有利于减少手势误触发。
图11为本申请又一实施例提供的手势识别方法的流程示意图,本实施例的执行主体可以为图1中的手势识别装置12,具体可以为手势识别装置12的处理器。如图11所示,本实施例的方法可以包括:
步骤111,识别图像中的多个目标区域和手势区域。
步骤112,确定所述手势区域的属性,并根据所述手势区域的属性确定所述多个目标区域中的候选区域。
步骤113,在所述候选区域中确定一个与所述手势区域相关联的目标区域。
所述属性可以为用于从多个目标区域中去除掉明显与手势区域不相关联的目标区域。示例性的,所述属性包括下述中的一种或多种:位置属性、尺寸属性或左右手属性。
示例性的,所述根据所述手势区域的属性确定所述多个目标区域中的候选区域,具体可以包括:根据所述手势区域的第一属性,确定所述多个目标区域中符合第一条件的第一候选区域。
一个实施例中,所述根据所述手势区域的第一属性,确定所述多个目标区域中符合第一条件的第一候选区域,具体可以包括:根据所述目标区域的尺寸以及所述手势区域的尺寸,确定所述多个目标区域中符合所述第一条件的第一候选区域。
另一个实施例中,所述根据所述手势区域的第一属性,确定所述多个目标区域中符合第一条件的第一候选区域,具体可以包括:根据所述目标区域的位置和所述手势区域的位置,确定所述多个目标区域中符合所述第一条件的第一候选区域。
需要说明的是,关于根据手势区域的第一属性,确定多个目标区域中符合第一条件的第一候选区域的方式,可以参见前述实施例的相关描述,在此不再赘述。
在确定所述第一候选区域的情况下,所述在所述候选区域中确定一个与所述手势区域相关联的目标区域,具体可以包括:当所述第一候选区域的个数为一个时,确定所述手势区域与所述第一候选区域相关联。可选的,所述在所述候选区域中确定一个与所述手势区域相关联的目标区域还可以包括:当所述第一候选区域的个数为多个时,确定与所述手势区域距离最近的第一候选区域与所述手势区域相关联。
或者,可选的,所述根据所述手势区域的属性确定所述多个目标区域中的候选区域,还可以包括:当所述第一候选区域的个数为多个时,根据所述手势区域的第二属性,确定多个所述第一候选区域中符合第二条件的第二候选区域。
一个实施例中,所述根据所述手势区域的第二属性,确定多个所述第一候选区域中符合第二条件的第二候选区域具体可以包括:根据所述手势区域的左右手属性,确定多个所述第一候选区域中符合第二条件的第二候选区域。
示例性的,所述根据所述手势区域的左右手属性,确定多个所述第一候选区域中符合第二条件的第二候选区域,具体可以包括:根据所述手势区域的左右手属性,确定一分割线;当任一所述第一候选区域大于或等于预设比例的像素位于所述分割线的第一方向的一侧时,确定该第一候选区域为第二候选区域;所述第一方向由所述手势区域的左右手属性确定。
需要说明的是,关于根据手势区域的第二属性,确定多个第一目标区域中符合第二条件的第二候选区域的方式,可以参见前述实施例的相关描述,在此不再赘述。
在确定所述第二候选区域的情况下,所述在所述候选区域中确定一个与所述手势区域相关联的目标区域,具体可以包括:当所述第二候选区域的个数为一个时,确定所述手势区域与所述第二候选区域相关联。
可选的,所述在所述候选区域中确定一个与所述手势区域相关联的目标区域,还可以包括:当所述第二候选区域的个数为多个时,确定与所述手势区域距离最近的第二候选区域与所述手势区域相关联。或者,可选的,当第二候选区域的个数为多个时,也可以进一步确定符合第三条件的第三候选区域,并在第三候选区域中确定与手势区域相关联的区域。第三条件是与第一条件和第二条件均不相同的条件。
类似于之前的实施例,本申请实施例的方法还可以包括:当确定所述手 势区域相关联的目标区域时,对所述目标区域中的目标对象进行跟踪、拍照或录像。
本申请实施例提供的手势识别方法,通过识别图像中的多个目标区域和手势区域,确定手势区域的属性,并根据手势区域的属性确定多个目标区域中的候选区域,在候选区域中确定一个与手势区域相关联的目标区域,实现了在不存在满足预设条件的目标人体关节点分布图情况下,先通过去除掉多个目标区域中明显与手势区域不关联的目标区域得到候选区域,再在候选区域中确定一个与手势区域相关联的目标区域,由于通过确定候选区域缩小了用于进一步确定手势区域相关联的目标区域的范围,与传统技术中直接将识别出的多个目标区域中距离手势区域最近的目标区域作为手势区域相关联的目标区域相比,能够减少出现手势归属确定错误的情况。
图12为本申请一实施例提供的手势识别装置的结构示意图,如图12所示,该装置120可以包括:处理器121和存储器122。
所述存储器122,用于存储程序代码;
所述处理器121,调用所述程序代码,当程序代码被执行时,用于执行以下操作:
确定多个关节点在图像中的位置,并识别所述图像中的手势区域和多个目标区域,所述图像中的所述多个关节点包括位于所述手势区域中的第一关节点和位于所述目标区域中的第二关节点;
基于所述多个关节点在所述图像中的位置,确定出至少一个人体关节点分布图,所述人体关节点分布图包括第一关节点和第二关节点;
从所述至少一个人体关节点分布图中,确定是否存在满足预设条件的目标人体关节点分布图;
当存在满足所述预设条件的目标人体关节点分布图时,根据所述目标人体关节点分布图中的第二关节点所对应的目标区域,确定所述手势区域相关联的目标区域。
本实施例提供的手势识别装置,可以用于执行前述图2、图6所示方法实施例的技术方案,其实现原理和技术效果与方法实施例类似,在此不再赘述。
图13为本申请另一实施例提供的手势识别装置的结构示意图,如图13所示,该装置130可以包括:处理器131和存储器132。
所述存储器132,用于存储程序代码;
所述处理器131,调用所述程序代码,当程序代码被执行时,用于执行以下操作:
识别图像中的多个目标区域和手势区域;
确定所述手势区域的属性,并根据所述手势区域的属性确定所述多个目标区域中的候选区域;
在所述候选区域中确定一个与所述手势区域相关联的目标区域。
本实施例提供的手势识别装置,可以用于执行前述图11所示方法实施例的技术方案,其实现原理和技术效果与方法实施例类似,在此不再赘述。
本领域普通技术人员可以理解:实现上述各方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成。前述的程序可以存储于一计算机可读取存储介质中。该程序在执行时,执行包括上述各方法实施例的步骤;而前述的存储介质包括:ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
最后应说明的是:以上各实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述各实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分或者全部技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。

Claims (82)

  1. 一种手势识别方法,其特征在于,包括:
    确定多个关节点在图像中的位置,并识别所述图像中的手势区域和多个目标区域,所述图像中的所述多个关节点包括位于所述手势区域中的第一关节点和位于所述目标区域中的第二关节点;
    基于所述多个关节点在所述图像中的位置,确定出至少一个人体关节点分布图,所述人体关节点分布图包括第一关节点和第二关节点;
    从所述至少一个人体关节点分布图中,确定是否存在满足预设条件的目标人体关节点分布图;
    当存在满足所述预设条件的目标人体关节点分布图时,根据所述目标人体关节点分布图中的第二关节点所对应的目标区域,确定所述手势区域相关联的目标区域。
  2. 根据权利要求1所述的方法,其特征在于,所述基于所述多个关节点在所述图像中的位置,确定出至少一个人体关节点分布图,包括:
    从所述图像的不同类型关节点对应的关节点集合中,分别取出至少一个关节点进行组合,得到多组关节点;
    对各组关节点中的关节点进行连线,得到所述至少一个人体关节点分布图及其置信度;
    从所述至少一个人体关节点分布图中,确定是否存在满足预设条件的目标人体关节点分布图包括:
    获取所述至少一个人体关节点分布图中的最大置信度;
    当所述最大置信度大于预设阈值时,确定存在满足预设条件的目标人体关节点分布图;所述目标人体关节点分布图为具有最大置信度的人体关节点分布图;
    当所述最大置信度小于预设阈值时,确定不存在满足预设条件的目标人体关节点分布图。
  3. 根据权利要求2所述的方法,其特征在于,所述对各组关节点中的关节点进行连线,得到所述至少一个人体关节点分布图及其置信度,包括:
    按照预设的关节点连线策略,对各组关节点中的关节点进行连线,得到所述至少一个人体关节点分布图及其置信度。
  4. 根据权利要求2所述的方法,其特征在于,所述对各组关节点中的关 节点进行连线,得到所述至少一个人体关节点分布图及其置信度,包括:
    通过穷举方式确定各组关节点中关节点的连线方式并进行连线,得到所述至少一个人体关节点分布图及其置信度。
  5. 根据权利要求2所述的方法,其特征在于,单个人体关节点分布图的置信度与其包括的关节点的置信度、关节点之间连线的角度是否符合人体关节点运动角度要求,以及所涉及关节点的类型数量相关。
  6. 根据权利要求1-5任一项所述的方法,其特征在于,所述目标人体关节点分布图的个数为一个;所述根据所述目标人体关节点中的第一关节点所对应的手势区域和第二关节点所对应的目标区域,确定所述手势区域相关联的目标区域,包括:
    将所述目标人体关节点分布图中所述第二关节点对应的目标区域确定为所述手势区域相关联的目标区域。
  7. 根据权利要求1-5任一项所述的方法,其特征在于,所述方法包括:
    当不存在所述满足所述预设条件的目标人体关节点分布图时,确定所述多个目标区域中符合第一条件的第一候选区域;当所述第一候选区域的个数为一个时,确定所述手势区域与所述第一候选区域关联。
  8. 根据权利要求7所述的方法,其特征在于,所述方法还包括:
    当所述第一候选区域的个数为多个时,确定多个所述第一候选区域中符合第二条件的第二候选区域;
    当所述第二候选区域的个数为一个时,确定所述手势区域与所述第二候选区域相关联。
  9. 根据权利要求8所述的方法,其特征在于,所述方法还包括:
    当所述第二候选区域的个数为多个时,确定与所述手势区域距离最近的第二候选区域与所述手势区域相关联。
  10. 根据权利要求7-9任一项所述的方法,其特征在于,所述确定所述多个目标区域中符合第一条件的第一候选区域,包括:
    根据所述目标区域的尺寸和所述手势区域的尺寸的关系,确定符合所述第一条件的第一候选区域。
  11. 根据权利要求10所述的方法,其特征在于,所述第一条件包括所述目标区域的尺寸和所述手势区域的尺寸的比例符合预设关系。
  12. 根据权利要求7-9任一项所述的方法,其特征在于,所述确定所述多 个目标区域中符合第一条件的第一候选区域,包括:
    根据所述目标区域的位置和所述手势区域的位置,确定符合所述第一条件的第一候选区域。
  13. 根据权利要求12所述的方法,其特征在于,所述第一条件包括所述目标区域和所述手势区域的距离小于预设的距离阈值。
  14. 根据权利要求8所述的方法,其特征在于,所述确定多个所述第一候选区域中符合第二条件的第二候选区域,包括:
    确定所述手势区域的左右手属性;所述手势区域的左右手属性为左手或右手;
    根据所述手势区域的左右手属性,确定多个所述第一候选区域中符合第二条件的第二候选区域。
  15. 根据权利要求14所述的方法,其特征在于,所述根据所述手势区域的左右手属性,确定多个所述第一候选区域中符合第二条件的第二候选区域,包括:
    根据所述手势区域的左右手属性,确定一分割线;
    当任一所述第一候选区域大于或等于预设比例的像素位于所述分割线的第一方向的一侧时,确定该第一候选区域为第二候选区域;所述第一方向由所述手势区域的左右手属性确定。
  16. 根据权利要求15所述的方法,其特征在于,当所述手势区域的左右手属性为左手时,以所述手势区域的左侧方向作为所述第一方向;
    当所述手势区域的左右手属性为右手时,以所述手势区域的右侧方向作为所述第一方向。
  17. 根据权利要求15所述的方法,其特征在于,所述根据所述手势区域的左右手属性,确定一分割线,包括:
    当所述手势区域的左右手属性为左手时,以所述手势区域的右边界与第二方向的偏移量确定所述分割线;
    当所述手势区域的左右手属性为右手时,以所述手势区域的左边界与第二方向的偏移量确定;其中,所述第一方向与第二方向的相反;所述偏移量大于等于零。
  18. The method according to any one of claims 1-17, wherein the method further comprises:
    determining a region of interest according to the identified target areas;
    detecting the plurality of joint points in the region of interest.
  19. The method according to any one of claims 1-17, wherein the method further comprises:
    when the target area associated with the gesture area is determined, tracking, photographing, or video recording the target object in the target area.
  20. The method according to claim 19, wherein the method further comprises:
    continuously updating the region of interest while the target object is being tracked or video recorded;
    stopping the tracking or video recording when a gesture is recognized in the region of interest.
  21. The method according to claim 20, wherein the stopping the tracking or video recording when a gesture is recognized in the region of interest comprises:
    when a gesture is recognized in the region of interest, determining whether the gesture is associated with the target object;
    when the gesture is associated with the target object, stopping the tracking or video recording.
  22. The method according to claim 1, wherein the target area comprises a face box or a head-shoulder box.
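The control flow of claims 19-21 (start tracking on association; keep refreshing the region of interest; stop only when a gesture recognized in the ROI is associated with the tracked object) amounts to a small state machine. The class and method names below are invented for illustration and are not part of the patent.

```python
class GestureTracker:
    """Minimal state machine for the start/stop logic of claims 19-21."""

    def __init__(self):
        self.tracking = False
        self.target = None

    def on_association(self, target):
        # A gesture was associated with a target area: start tracking it.
        self.target = target
        self.tracking = True

    def on_frame(self, roi_gesture, associated_target):
        """Process one frame while tracking.

        roi_gesture: gesture recognized inside the ROI this frame, or None.
        associated_target: target that gesture was associated with, or None.
        """
        if not self.tracking:
            return
        # In a full system the ROI would be re-derived here from the
        # tracked target's current position (claim 20).
        if roi_gesture is not None and associated_target == self.target:
            self.tracking = False  # stop tracking/recording (claim 21)
```

Note that a gesture associated with some other person leaves the tracker running, matching the check in claim 21.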
  23. A gesture recognition method, wherein the method comprises:
    identifying a plurality of target areas and a gesture area in an image;
    determining an attribute of the gesture area, and determining candidate areas among the plurality of target areas according to the attribute of the gesture area;
    determining, among the candidate areas, one target area associated with the gesture area.
  24. The method according to claim 23, wherein the determining candidate areas among the plurality of target areas according to the attribute of the gesture area comprises:
    determining, according to a first attribute of the gesture area, a first candidate area that meets a first condition among the plurality of target areas; and the determining, among the candidate areas, one target area associated with the gesture area comprises: when the number of first candidate areas is one, determining that the gesture area is associated with the first candidate area.
  25. The method according to claim 24, wherein the determining candidate areas among the plurality of target areas according to the attribute of the gesture area further comprises:
    when the number of first candidate areas is more than one, determining, according to a second attribute of the gesture area, a second candidate area that meets a second condition among the plurality of first candidate areas;
    and the determining, among the candidate areas, one target area associated with the gesture area comprises: when the number of second candidate areas is one, determining that the gesture area is associated with the second candidate area.
  26. The method according to claim 25, wherein the determining, among the candidate areas, one target area associated with the gesture area further comprises:
    when the number of second candidate areas is more than one, determining that the second candidate area closest to the gesture area is associated with the gesture area.
  27. The method according to claim 24, wherein the determining, according to the first attribute of the gesture area, a first candidate area that meets the first condition among the plurality of target areas comprises:
    determining the first candidate area that meets the first condition among the plurality of target areas according to the size of the target area and the size of the gesture area.
  28. The method according to claim 27, wherein the first condition comprises that the ratio of the size of the target area to the size of the gesture area conforms to a preset relationship.
  29. The method according to claim 24, wherein the determining, according to the first attribute of the gesture area, a first candidate area that meets the first condition among the plurality of target areas comprises:
    determining the first candidate area that meets the first condition among the plurality of target areas according to the position of the target area and the position of the gesture area.
  30. The method according to claim 29, wherein the first condition comprises that the distance between the target area and the gesture area is less than a preset distance threshold.
  31. The method according to any one of claims 25-30, wherein the determining, according to the second attribute of the gesture area, a second candidate area that meets the second condition among the plurality of first candidate areas comprises:
    determining, according to the left/right hand attribute of the gesture area, the second candidate area that meets the second condition among the plurality of first candidate areas.
  32. The method according to claim 31, wherein the determining, according to the left/right hand attribute of the gesture area, the second candidate area that meets the second condition among the plurality of first candidate areas comprises:
    determining a dividing line according to the left/right hand attribute of the gesture area;
    when pixels of any first candidate area in a proportion greater than or equal to a preset proportion are located on the side of the dividing line in a first direction, determining that first candidate area to be a second candidate area, the first direction being determined by the left/right hand attribute of the gesture area.
  33. The method according to claim 32, wherein when the left/right hand attribute of the gesture area is left hand, the left-side direction of the gesture area is taken as the first direction; and
    when the left/right hand attribute of the gesture area is right hand, the right-side direction of the gesture area is taken as the first direction.
  34. The method according to claim 32, wherein the determining a dividing line according to the left/right hand attribute of the gesture area comprises:
    when the left/right hand attribute of the gesture area is left hand, determining the dividing line from the right boundary of the gesture area and an offset in a second direction;
    when the left/right hand attribute of the gesture area is right hand, determining the dividing line from the left boundary of the gesture area and an offset in the second direction; wherein the first direction is opposite to the second direction, and the offset is greater than or equal to zero.
  35. The method according to any one of claims 23-34, wherein the method further comprises:
    when the target area associated with the gesture area is determined, tracking, photographing, or video recording the target object in the target area.
  36. The method according to claim 35, wherein the method further comprises:
    continuously updating the region of interest while the target object is being tracked or video recorded;
    stopping the tracking or video recording when a gesture is recognized in the region of interest.
  37. The method according to claim 36, wherein the stopping the tracking or video recording when a gesture is recognized in the region of interest comprises:
    when a gesture is recognized in the region of interest, determining whether the gesture is associated with the target object;
    when the gesture is associated with the target object, stopping the tracking or video recording.
  38. The method according to claim 23, wherein the target area comprises a face box or a head-shoulder box.
  39. The method according to claim 23, wherein the attribute comprises one or more of the following: a position attribute, a size attribute, or a left/right hand attribute.
  40. A gesture recognition apparatus, comprising: a memory and a processor;
    the memory being configured to store program code;
    the processor invoking the program code, and being configured, when the program code is executed, to perform the following operations:
    determining positions of a plurality of joint points in an image, and identifying a gesture area and a plurality of target areas in the image, the plurality of joint points in the image comprising a first joint point located in the gesture area and a second joint point located in a target area;
    determining at least one human body joint point distribution map based on the positions of the plurality of joint points in the image, the human body joint point distribution map comprising a first joint point and a second joint point;
    determining, from the at least one human body joint point distribution map, whether there is a target human body joint point distribution map satisfying a preset condition;
    when there is a target human body joint point distribution map satisfying the preset condition, determining the target area associated with the gesture area according to the target area corresponding to the second joint point in the target human body joint point distribution map.
  41. The apparatus according to claim 40, wherein the processor being configured to determine at least one human body joint point distribution map based on the positions of the plurality of joint points in the image specifically comprises:
    taking out at least one joint point from each of the joint point sets corresponding to different types of joint points in the image and combining them, to obtain multiple groups of joint points;
    connecting the joint points in each group of joint points, to obtain the at least one human body joint point distribution map and its confidence;
    and the determining, from the at least one human body joint point distribution map, whether there is a target human body joint point distribution map satisfying a preset condition comprises:
    obtaining the maximum confidence among the at least one human body joint point distribution map;
    when the maximum confidence is greater than a preset threshold, determining that there is a target human body joint point distribution map satisfying the preset condition, the target human body joint point distribution map being the human body joint point distribution map with the maximum confidence;
    when the maximum confidence is less than the preset threshold, determining that there is no target human body joint point distribution map satisfying the preset condition.
  42. The apparatus according to claim 41, wherein the processor being configured to connect the joint points in each group of joint points to obtain the at least one human body joint point distribution map and its confidence specifically comprises:
    connecting the joint points in each group of joint points according to a preset joint point connection strategy, to obtain the at least one human body joint point distribution map and its confidence.
  43. The apparatus according to claim 41, wherein the processor being configured to connect the joint points in each group of joint points to obtain the at least one human body joint point distribution map and its confidence specifically comprises:
    determining, by exhaustive enumeration, the connection manners of the joint points in each group of joint points and connecting them, to obtain the at least one human body joint point distribution map and its confidence.
  44. The apparatus according to claim 41, wherein the confidence of a single human body joint point distribution map is related to the confidences of the joint points it comprises, to whether the angles of the connections between the joint points meet the motion angle requirements of human joints, and to the number of joint point types involved.
  45. The apparatus according to any one of claims 40-44, wherein the number of target human body joint point distribution maps is one; and the processor being configured to determine, according to the gesture area corresponding to the first joint point and the target area corresponding to the second joint point in the target human body joint point distribution map, the target area associated with the gesture area specifically comprises:
    determining the target area corresponding to the second joint point in the target human body joint point distribution map as the target area associated with the gesture area.
  46. The apparatus according to any one of claims 40-44, wherein the processor is further configured to:
    when there is no target human body joint point distribution map satisfying the preset condition, determine a first candidate area that meets a first condition among the plurality of target areas; and when the number of first candidate areas is one, determine that the gesture area is associated with the first candidate area.
  47. The apparatus according to claim 46, wherein the processor is further configured to:
    when the number of first candidate areas is more than one, determine a second candidate area that meets a second condition among the plurality of first candidate areas;
    when the number of second candidate areas is one, determine that the gesture area is associated with the second candidate area.
  48. The apparatus according to claim 47, wherein the processor is further configured to:
    when the number of second candidate areas is more than one, determine that the second candidate area closest to the gesture area is associated with the gesture area.
  49. The apparatus according to any one of claims 46-48, wherein the processor being configured to determine a first candidate area that meets the first condition among the plurality of target areas specifically comprises:
    determining the first candidate area that meets the first condition according to the relationship between the size of the target area and the size of the gesture area.
  50. The apparatus according to claim 49, wherein the first condition comprises that the ratio of the size of the target area to the size of the gesture area conforms to a preset relationship.
  51. The apparatus according to any one of claims 46-48, wherein the processor being configured to determine a first candidate area that meets the first condition among the plurality of target areas specifically comprises:
    determining the first candidate area that meets the first condition according to the position of the target area and the position of the gesture area.
  52. The apparatus according to claim 51, wherein the first condition comprises that the distance between the target area and the gesture area is less than a preset distance threshold.
  53. The apparatus according to claim 47, wherein the processor being configured to determine a second candidate area that meets the second condition among the plurality of first candidate areas specifically comprises:
    determining a left/right hand attribute of the gesture area, the left/right hand attribute of the gesture area being left hand or right hand;
    determining, according to the left/right hand attribute of the gesture area, the second candidate area that meets the second condition among the plurality of first candidate areas.
  54. The apparatus according to claim 53, wherein the processor being configured to determine, according to the left/right hand attribute of the gesture area, the second candidate area that meets the second condition among the plurality of first candidate areas specifically comprises:
    determining a dividing line according to the left/right hand attribute of the gesture area;
    when pixels of any first candidate area in a proportion greater than or equal to a preset proportion are located on the side of the dividing line in a first direction, determining that first candidate area to be a second candidate area, the first direction being determined by the left/right hand attribute of the gesture area.
  55. The apparatus according to claim 54, wherein when the left/right hand attribute of the gesture area is left hand, the left-side direction of the gesture area is taken as the first direction; and
    when the left/right hand attribute of the gesture area is right hand, the right-side direction of the gesture area is taken as the first direction.
  56. The apparatus according to claim 54, wherein the processor being configured to determine a dividing line according to the left/right hand attribute of the gesture area specifically comprises:
    when the left/right hand attribute of the gesture area is left hand, determining the dividing line from the right boundary of the gesture area and an offset in a second direction;
    when the left/right hand attribute of the gesture area is right hand, determining the dividing line from the left boundary of the gesture area and an offset in the second direction; wherein the first direction is opposite to the second direction, and the offset is greater than or equal to zero.
  57. The apparatus according to any one of claims 40-56, wherein the processor is further configured to:
    determine a region of interest according to the identified target areas;
    detect the plurality of joint points in the region of interest.
  58. The apparatus according to any one of claims 40-56, wherein the processor is further configured to:
    when the target area associated with the gesture area is determined, track, photograph, or video record the target object in the target area.
  59. The apparatus according to claim 58, wherein the processor is further configured to:
    continuously update the region of interest while the target object is being tracked or video recorded;
    stop the tracking or video recording when a gesture is recognized in the region of interest.
  60. The apparatus according to claim 59, wherein the processor being configured to stop the tracking or video recording when a gesture is recognized in the region of interest specifically comprises:
    when a gesture is recognized in the region of interest, determining whether the gesture is associated with the target object;
    when the gesture is associated with the target object, stopping the tracking or video recording.
  61. The apparatus according to claim 40, wherein the target area comprises a face box or a head-shoulder box.
  62. A gesture recognition apparatus, comprising: a memory and a processor;
    the memory being configured to store program code;
    the processor invoking the program code, and being configured, when the program code is executed, to perform the following operations:
    identifying a plurality of target areas and a gesture area in an image;
    determining an attribute of the gesture area, and determining candidate areas among the plurality of target areas according to the attribute of the gesture area;
    determining, among the candidate areas, one target area associated with the gesture area.
  63. The apparatus according to claim 62, wherein the processor being configured to determine candidate areas among the plurality of target areas according to the attribute of the gesture area specifically comprises:
    determining, according to a first attribute of the gesture area, a first candidate area that meets a first condition among the plurality of target areas; and the determining, among the candidate areas, one target area associated with the gesture area comprises: when the number of first candidate areas is one, determining that the gesture area is associated with the first candidate area.
  64. The apparatus according to claim 63, wherein the processor is further configured to:
    when the number of first candidate areas is more than one, determine, according to a second attribute of the gesture area, a second candidate area that meets a second condition among the plurality of first candidate areas;
    and the processor being configured to determine, among the candidate areas, one target area associated with the gesture area specifically comprises: when the number of second candidate areas is one, determining that the gesture area is associated with the second candidate area.
  65. The apparatus according to claim 64, wherein the processor is further configured to:
    when the number of second candidate areas is more than one, determine that the second candidate area closest to the gesture area is associated with the gesture area.
  66. The apparatus according to claim 63, wherein the processor being configured to determine, according to the first attribute of the gesture area, a first candidate area that meets the first condition among the plurality of target areas specifically comprises:
    determining the first candidate area that meets the first condition among the plurality of target areas according to the relationship between the size of the target area and the size of the gesture area.
  67. The apparatus according to claim 66, wherein the first condition comprises that the ratio of the size of the target area to the size of the gesture area conforms to a preset relationship.
  68. The apparatus according to claim 63, wherein the processor being configured to determine, according to the first attribute of the gesture area, a first candidate area that meets the first condition among the plurality of target areas specifically comprises:
    determining the first candidate area that meets the first condition among the plurality of target areas according to the position of the target area and the position of the gesture area.
  69. The apparatus according to claim 68, wherein the first condition comprises that the distance between the target area and the gesture area is less than a preset distance threshold.
  70. The apparatus according to any one of claims 64-69, wherein the processor being configured to determine, according to the second attribute of the gesture area, a second candidate area that meets the second condition among the plurality of first candidate areas specifically comprises:
    determining, according to the left/right hand attribute of the gesture area, the second candidate area that meets the second condition among the plurality of first candidate areas.
  71. The apparatus according to claim 70, wherein the processor being configured to determine, according to the left/right hand attribute of the gesture area, the second candidate area that meets the second condition among the plurality of first candidate areas specifically comprises:
    determining a dividing line according to the left/right hand attribute of the gesture area;
    when pixels of any first candidate area in a proportion greater than or equal to a preset proportion are located on the side of the dividing line in a first direction, determining that first candidate area to be a second candidate area, the first direction being determined by the left/right hand attribute of the gesture area.
  72. The apparatus according to claim 71, wherein when the left/right hand attribute of the gesture area is left hand, the left-side direction of the gesture area is taken as the first direction; and
    when the left/right hand attribute of the gesture area is right hand, the right-side direction of the gesture area is taken as the first direction.
  73. The apparatus according to claim 71, wherein the processor being configured to determine a dividing line according to the left/right hand attribute of the gesture area specifically comprises:
    when the left/right hand attribute of the gesture area is left hand, determining the dividing line from the right boundary of the gesture area and an offset in a second direction;
    when the left/right hand attribute of the gesture area is right hand, determining the dividing line from the left boundary of the gesture area and an offset in the second direction; wherein the first direction is opposite to the second direction, and the offset is greater than or equal to zero.
  74. The apparatus according to any one of claims 62-73, wherein the processor is further configured to:
    when the target area associated with the gesture area is determined, track, photograph, or video record the target object in the target area.
  75. The apparatus according to claim 74, wherein the processor is further configured to:
    continuously update the region of interest while the target object is being tracked or video recorded;
    stop the tracking or video recording when a gesture is recognized in the region of interest.
  76. The apparatus according to claim 75, wherein the stopping the tracking or video recording when a gesture is recognized in the region of interest comprises:
    when a gesture is recognized in the region of interest, determining whether the gesture is associated with the target object;
    when the gesture is associated with the target object, stopping the tracking or video recording.
  77. The apparatus according to claim 62, wherein the target area comprises a face box or a head-shoulder box.
  78. The apparatus according to claim 62, wherein the attribute comprises one or more of the following: a position attribute, a size attribute, or a left/right hand attribute.
  79. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program comprises at least one piece of code, and the at least one piece of code is executable by a computer to control the computer to perform the method according to any one of claims 1-22.
  80. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program comprises at least one piece of code, and the at least one piece of code is executable by a computer to control the computer to perform the method according to any one of claims 23-39.
  81. A computer program, wherein when the computer program is executed by a computer, it is used to implement the method according to any one of claims 1-22.
  82. A computer program, wherein when the computer program is executed by a computer, it is used to implement the method according to any one of claims 23-39.
PCT/CN2020/111493 2020-08-26 2020-08-26 Gesture recognition method and apparatus WO2022040994A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/111493 WO2022040994A1 (zh) 2020-08-26 2020-08-26 Gesture recognition method and apparatus
CN202080006664.2A CN113168533A (zh) 2020-08-26 2020-08-26 Gesture recognition method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/111493 WO2022040994A1 (zh) 2020-08-26 2020-08-26 Gesture recognition method and apparatus

Publications (1)

Publication Number Publication Date
WO2022040994A1 (zh)

Family

ID=76881223

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/111493 WO2022040994A1 (zh) 2020-08-26 2020-08-26 Gesture recognition method and apparatus

Country Status (2)

Country Link
CN (1) CN113168533A (zh)
WO (1) WO2022040994A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130158712A1 (en) * 2011-12-16 2013-06-20 Samsung Electronics Co., Ltd. Walking robot and control method thereof
CN108960211A (zh) * 2018-08-10 2018-12-07 罗普特(厦门)科技集团有限公司 Multi-target human body posture detection method and system
CN111199207A (zh) * 2019-12-31 2020-05-26 华南农业大学 Two-dimensional multi-person pose estimation method based on a deep residual neural network
CN111368751A (zh) * 2020-03-06 2020-07-03 Oppo广东移动通信有限公司 Image processing method and apparatus, storage medium, and electronic device


Also Published As

Publication number Publication date
CN113168533A (zh) 2021-07-23

Similar Documents

Publication Publication Date Title
WO2021227360A1 (zh) Interactive video projection method and apparatus, device, and storage medium
CN111862296B (zh) Three-dimensional reconstruction method, apparatus and system, model training method, and storage medium
WO2020015492A1 (zh) Method and apparatus for identifying key time points in a video, computer device, and storage medium
WO2017133605A1 (zh) Face tracking method and apparatus, and intelligent terminal
CN107948517B (zh) Preview image blurring method, apparatus, and device
WO2018112788A1 (zh) Image processing method and device
WO2019033574A1 (zh) Electronic apparatus, dynamic video face recognition method, system, and storage medium
JP4951498B2 (ja) Face image recognition apparatus, face image recognition method, face image recognition program, and recording medium storing the program
CN112287868B (zh) Human body action recognition method and apparatus
US10991124B2 (en) Determination apparatus and method for gaze angle
CN112287867B (zh) Multi-camera human body action recognition method and apparatus
JP6515039B2 (ja) Program, apparatus, and method for calculating the normal vector of a planar object appearing in successive captured images
CN111353336B (zh) Image processing method, apparatus, and device
US10666858B2 (en) Deep-learning-based system to assist camera autofocus
WO2019157922A1 (zh) Image processing method and apparatus, and AR device
WO2023273372A1 (zh) Method and apparatus for determining a gesture recognition object
JP6349448B1 (ja) Information processing apparatus, information processing program, and information processing method
JP2002342762A (ja) Object tracking method
CN114463781A (zh) Method, apparatus, and device for determining a trigger gesture
WO2019144296A1 (zh) Control method and apparatus for a movable platform, and movable platform
JP6798609B2 (ja) Video analysis apparatus, video analysis method, and program
WO2022021093A1 (zh) Photographing method, photographing apparatus, and storage medium
WO2022040994A1 (zh) Gesture recognition method and apparatus
Makris et al. Robust 3d human pose estimation guided by filtered subsets of body keypoints
JP2019040592A (ja) Information processing apparatus, information processing program, and information processing method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20950666

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20950666

Country of ref document: EP

Kind code of ref document: A1