WO2022198819A1 - Image recognition-based device control method and apparatus, electronic device, and computer readable storage medium

Info

Publication number
WO2022198819A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
hand
image
detected
gesture
Prior art date
Application number
PCT/CN2021/102478
Other languages
French (fr)
Chinese (zh)
Inventor
孔祥晖
Original Assignee
北京市商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司
Publication of WO2022198819A1

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 Programme-control systems
    • G05B19/02 Programme-control systems electric
    • G05B19/04 Programme control other than numerical control, i.e. in sequence controllers or logic controllers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Definitions

  • the present disclosure relates to the technical field of computer vision, and in particular, to an image recognition-based device control method, apparatus, electronic device, and computer-readable storage medium.
  • gestures have become an important means of human-computer interaction due to their intuitive and natural characteristics. Therefore, gesture recognition based on computer vision has become a research focus in the field of human-computer interaction.
  • the user's gesture category can be determined through the acquired image, and the target device can be controlled by using the determined gesture category.
  • there may be interference between gestures of different users, which reduces the accuracy of image recognition of the main control user's gesture and, in turn, the control accuracy of the target device.
  • the embodiments of the present disclosure provide at least an image recognition-based device control method, device, electronic device, and computer-readable storage medium, which can improve the accuracy of image recognition, thereby improving the accuracy of target device control based on the image recognition result.
  • the present disclosure provides a device control method based on image recognition, including:
  • performing hand detection on the acquired first to-be-detected image, and determining the hand detection information of the target hand matching the preset gesture category;
  • based on the hand detection information of the target hand, performing limb tracking detection on the target limb connected to the target hand in the acquired second to-be-detected image, and determining the gesture recognition result of the target hand in the second to-be-detected image, wherein the second to-be-detected image is an image obtained after the first to-be-detected image;
  • controlling the target device based on the gesture recognition result.
  • the hand detection information of the target hand that matches the preset gesture category is determined; based on the hand detection information of the target hand, limb tracking detection is performed on the target limb connected to the target hand in the acquired second to-be-detected image, and the gesture recognition result of the target hand in the second to-be-detected image is determined.
  • a target hand that is difficult to track and detect directly can thus be tracked by means of limb tracking, and the target device can then be controlled based on the gesture recognition result.
  • limb tracking is carried out for the purpose of tracking the target hand, and the gesture recognition result of the target hand in the second to-be-detected image is obtained from the limb tracking result. When image recognition is performed on the gesture of the target user whose target hand controls the target device, this effectively reduces the interference generated by other users' hand movements, which improves the accuracy of image recognition and thereby the control accuracy of the target device.
  • the target user who controls the target device can be effectively identified among multiple users and, to a certain extent, when both hands of the target user are moving, one of them can be selected as the target hand so that the target device is controlled accurately. It should be noted that, if some control operations are triggered by the user's two hands each performing a corresponding action, the technical solution provided by the present disclosure can lock onto the target user and control the target device based on the movements of the target user's two hands.
  • before controlling the target device based on the gesture recognition result, the method further includes:
  • when the target hand satisfies the cut-off condition, re-determining the hand detection information of the target hand matching the preset gesture category.
  • the hand detection information of the target hand matching the preset gesture category can be re-determined, so that at least one user in the second image to be detected can control the target device in real time.
  • the target hand satisfying the cut-off condition includes one or more of the following:
  • the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category, and the invalid gesture category includes at least one of the following: the gesture category does not match the preset gesture category, and the target hand has not moved;
  • in the case where the second image to be detected includes multiple frames, the number of frames in which the gesture category indicated by the gesture recognition result of the target hand is the invalid gesture category is greater than or equal to the number threshold, and/or the duration for which it is the invalid gesture category is greater than or equal to the duration threshold;
  • the gesture category indicated by the gesture recognition result of the target hand is a valid gesture category, and the valid gesture category is used to instruct to re-determine the target hand and/or hand detection information.
  • performing hand detection on the acquired first image to be detected includes:
  • limb detection is performed on the acquired first image to be detected to obtain limb detection information;
  • based on the limb detection information, hand detection is performed on the first image to be detected, and the hand detection information of the target hand associated with the limb is determined.
  • limb detection can be performed on the first image to be detected first to determine the limb detection information; hand detection is then performed on the first image to be detected based on the limb detection information, so that the hand detection information of the target hand associated with the limb can be determined more accurately.
  • performing hand detection on the acquired first image to be detected includes:
  • the hand detection information for the target hand associated with the limb is determined.
  • the hand detection information of the target hand associated with the limb can be determined through the distance between the hand and the limb, and the determination process is simple and easy to implement.
  • controlling the target device includes at least one of the following:
  • adjusting the working mode of the target device, where the working mode includes turning off or turning on at least part of the functions of the target device;
  • the volume of the target device can be adjusted, the target device can be turned off, the display position of the movement logo in the display interface of the target device can be adjusted, and so on, which realizes flexible control of the target device.
  • the method further includes:
  • the default gesture category of the target user is taken as the preset gesture category of the target user, where an interfering user is a user whose horizontal distance from the target user is smaller than the distance threshold corresponding to the target user.
  • it also includes:
  • the default gesture category of the target user is adjusted, and the adjusted default gesture category is used as the preset gesture category of the target user.
  • adjusting the default gesture category includes at least one of the following operations: adding categories to the default gesture category, adding gesture categories used to control at least one function of the target device, and changing the movement detection of the gesture category to movement detection of the hand detection frame.
  • each user when multiple users are included in the first image to be detected, each user can be regarded as a target user, and the target user can be determined based on the target joint position information of the target user and the target joint position information of other users.
  • the gesture fault-tolerance mechanism corresponding to the target user can be adjusted, that is, the adjusted default gesture category can be used as the preset gesture category of the target user, which alleviates the influence of the interfering user on the gesture category detection of the target user.
  • the distance threshold corresponding to the target user is determined according to the following steps:
  • the distance threshold corresponding to the target user is determined.
  • the intermediate distance representing the shoulder width of the target user can be determined based on the determined position information of the first joint point and the position information of the second joint point, and then the distance threshold of the target user can be determined based on the intermediate distance corresponding to the target user.
  • different users correspond to different distance thresholds. By determining the corresponding distance threshold for each target user, it can be more accurately judged whether other users will cause interference to the target user.
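  • The per-user threshold described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes the first and second joint points are the two shoulder joints, that the threshold is a simple multiple of the shoulder width (`scale` is a hypothetical parameter), and that each user's horizontal position is given by the x coordinate of one reference joint.

```python
from math import hypot


def shoulder_width(first_joint, second_joint):
    """Intermediate distance: Euclidean distance between two (x, y) joint points."""
    return hypot(first_joint[0] - second_joint[0], first_joint[1] - second_joint[1])


def distance_threshold(first_joint, second_joint, scale=1.5):
    """Per-user threshold derived from shoulder width; `scale` is an assumed factor."""
    return scale * shoulder_width(first_joint, second_joint)


def has_interfering_user(target_x, other_xs, threshold):
    """True if any other user's horizontal distance to the target user is below the threshold."""
    return any(abs(x - target_x) < threshold for x in other_xs)


# Hypothetical pixel coordinates for one target user and two bystanders.
threshold = distance_threshold((100, 200), (160, 200))
interfered = has_interfering_user(130, [200, 400], threshold)
```

  • Because the threshold scales with each user's own shoulder width, a user standing far from the camera (and therefore small in the image) gets a proportionally smaller threshold.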
  • the present disclosure provides a device control device based on image recognition, including:
  • a first determining module configured to perform hand detection on the acquired first image to be detected, and determine the hand detection information of the target hand matching the preset gesture category;
  • the detection module is configured to, based on the hand detection information of the target hand, perform limb tracking detection on the target limb connected to the target hand in the acquired second image to be detected, and determine the gesture recognition result of the target hand in the second to-be-detected image; wherein, the second to-be-detected image is an image obtained after the first to-be-detected image;
  • the control module is configured to control the target device based on the gesture recognition result.
  • before the target device is controlled based on the gesture recognition result, the apparatus further includes a second determining module configured to:
  • the hand detection information of the target hand matching the preset gesture category is re-determined.
  • the target hand satisfying the cut-off condition includes one or more of the following:
  • the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category, and the invalid gesture category includes at least one of the following: the gesture category does not match the preset gesture category, and the target hand has not moved;
  • in the case where the second image to be detected includes multiple frames, the number of frames in which the gesture category indicated by the gesture recognition result of the target hand is the invalid gesture category is greater than or equal to the number threshold, and/or the duration for which it is the invalid gesture category is greater than or equal to the duration threshold;
  • the gesture category indicated by the gesture recognition result of the target hand is a valid gesture category, and the valid gesture category is used to instruct to re-determine the target hand and/or hand detection information.
  • when performing hand detection on the acquired first image to be detected, the first determination module is configured to:
  • perform hand detection on the first image to be detected, and determine the hand detection information of the target hand associated with the limb.
  • when performing hand detection on the acquired first image to be detected, the first determination module is configured to:
  • determine the hand detection information of the target hand associated with the limb.
  • when controlling the target device, the control module is configured to perform at least one of the following:
  • adjusting the working mode of the target device, where the working mode includes turning off or turning on at least part of the functions of the target device;
  • the apparatus further includes an adjustment module, which is configured to:
  • take the default gesture category of the target user as the preset gesture category of the target user, where an interfering user is a user whose horizontal distance from the target user is smaller than the distance threshold corresponding to the target user.
  • the adjustment module is further configured to:
  • the default gesture category of the target user is adjusted, and the adjusted default gesture category is used as the preset gesture category of the target user.
  • adjusting the default gesture category includes at least one of the following operations: adding categories to the default gesture category, adding gesture categories used to control at least one function of the target device, and changing the movement detection of the gesture category to movement detection of the hand detection frame.
  • the apparatus further includes a distance threshold determination module, the distance threshold determination module is configured to determine the distance threshold corresponding to the target user according to the following steps:
  • the distance threshold corresponding to the target user is determined.
  • the present disclosure provides an electronic device, comprising: a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the image recognition-based device control method according to the first aspect or any one of the implementation manners is executed.
  • the present disclosure provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the image recognition-based device control method according to the first aspect or any one of the above-mentioned embodiments is executed.
  • the present disclosure provides a computer program, comprising computer-readable code, and when the computer-readable code is executed in an electronic device, the processor in the electronic device implements the image recognition-based device control method described in the above-mentioned first aspect or any embodiment.
  • FIG. 1 shows a schematic flowchart of an image recognition-based device control method provided by an embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of a limb joint point and a hand detection frame in an image recognition-based device control method provided by an embodiment of the present disclosure
  • FIG. 3 shows a schematic structural diagram of an image recognition-based device control apparatus provided by an embodiment of the present disclosure
  • FIG. 4 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • the user's gesture category can be determined through the acquired image, and the target device can be controlled by using the determined gesture category.
  • there may be interference between gestures of different users, which reduces the effect of controlling the target device through human-computer interaction.
  • an embodiment of the present disclosure provides a device control scheme based on image recognition.
  • the execution subject of the device control method based on image recognition provided by the embodiments of the present disclosure is generally a computer device with a certain computing capability.
  • the computer device may be, for example, user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, etc.
  • the image recognition-based device control method may be implemented by a processor calling computer-readable instructions stored in a memory.
  • FIG. 1 is a schematic flowchart of an image recognition-based device control method provided by an embodiment of the present disclosure
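  • the method includes S101-S103, wherein:
  • S101: perform hand detection on the acquired first image to be detected, and determine the hand detection information of the target hand matching the preset gesture category;
  • S102: based on the hand detection information of the target hand, perform limb tracking detection on the target limb connected to the target hand in the acquired second image to be detected, and determine the gesture recognition result of the target hand in the second image to be detected, wherein the second image to be detected is an image obtained after the first image to be detected;
  • S103: control the target device based on the gesture recognition result.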
  • the hand detection information refers to the detected feature information of the target hand matching the preset gesture category in the first image to be detected, which may include hand position information, gesture category, hand identification information, and the like.
  • the hand position information may be the coordinate information of the vertices of the hand detection frame corresponding to the target hand in the image coordinate system corresponding to the first image to be detected, or the coordinate information of the contour region corresponding to the target hand in the image coordinate system corresponding to the first image to be detected, etc.
  • the gesture category may be the category of the gesture action of the target hand on the first image to be detected, for example, the gesture category may be the category of the gesture action of "ok".
  • the hand identification information may be any identification matched for the target hand, and the identification information may be composed of numbers, characters, patterns, etc., for example, the hand identification information may be the left hand a1.
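  • One possible in-memory representation of the hand detection information described above is sketched below. The class name, field names, and the `(x_min, y_min, x_max, y_max)` box layout are illustrative assumptions; the disclosure only states that the information may include hand position information, a gesture category, and hand identification information.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HandDetection:
    """Hypothetical container for one detected hand in an image."""
    hand_id: str                             # hand identification, e.g. "left hand a1"
    gesture: str                             # gesture category, e.g. "ok"
    box: Tuple[float, float, float, float]   # assumed layout: (x_min, y_min, x_max, y_max)

    @property
    def center(self) -> Tuple[float, float]:
        """Center of the hand detection frame in image coordinates."""
        x_min, y_min, x_max, y_max = self.box
        return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def matches_preset(detection: HandDetection, preset_gestures) -> bool:
    """True if the detected gesture category is one of the preset gesture categories."""
    return detection.gesture in preset_gestures


detection = HandDetection("left hand a1", "ok", (10, 20, 50, 80))
is_target_candidate = matches_preset(detection, {"ok"})
```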
  • the first to-be-detected image and the second to-be-detected image may be two frames of video images that are adjacent in time sequence in the video stream, or two frames of video images that are adjacent in time sequence in the video sequence obtained by sampling the original video stream.
  • the time difference between the acquisition moments of the first to-be-detected image and the second to-be-detected image is relatively small, so the difference between the acquired video images can be regarded as small and does not affect the image recognition analysis and processing based on the first to-be-detected image and the second to-be-detected image.
  • the hand detection information of the target hand that matches the preset gesture category is determined; based on the hand detection information of the target hand, limb tracking detection is performed on the target limb connected to the target hand in the acquired second to-be-detected image, and the gesture recognition result of the target hand in the second to-be-detected image is determined.
  • a target hand that is difficult to track and detect directly can thus be tracked by means of limb tracking, and the target device can then be controlled based on the gesture recognition result.
  • limb tracking is carried out for the purpose of tracking the target hand, and the gesture recognition result of the target hand in the second to-be-detected image is obtained from the limb tracking result. When image recognition is performed on the gesture of the target user whose target hand controls the target device, this effectively reduces the interference generated by other users' hand movements, which improves the accuracy of image recognition and thereby the control accuracy of the target device.
  • the target user who controls the target device can be effectively identified among multiple users and, to a certain extent, when both hands of the target user are moving, one of them can be selected as the target hand so that the target device is controlled accurately. It should be noted that, if some control operations are triggered by the user's two hands each performing a corresponding action, the technical solution provided by the present disclosure can lock onto the target user and control the target device based on the movements of the target user's two hands.
  • the first image to be detected may be the current image of the set target area, and the target area is any scene area set for controlling the target device.
  • a camera device may be set on the target device, or a camera device may also be set in a surrounding area of the target device, so that the camera device can acquire the first image to be detected of the target area corresponding to the target device.
  • the photographing area corresponding to the imaging device includes a target area, that is, the target area is located within the photographing range of the imaging device.
  • hand detection can be performed on the first image to be detected to determine the hand detection information of the target hand matching the preset gesture category.
  • the preset gesture category can be a set gesture action category, and the set gesture action can be used to control the target device.
  • when the gesture category information of multiple users matches the preset gesture category, the target user can be determined from among those users according to the position information of each user's limb center point; for example, the user whose limb center point is located in the middle of the first image to be detected is selected as the target user, and the target user's hand is used as the target hand.
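  • The center-based selection in the example above might look like the following sketch. It assumes that "located in the middle" means horizontally closest to the image center, and that limb center points are available as a mapping from a user identifier to an `(x, y)` pixel position; both are illustrative assumptions.

```python
def select_target_user(limb_centers, image_width):
    """Return the id of the user whose limb center is horizontally nearest the image center.

    limb_centers: dict mapping a user id to an (x, y) limb center point (assumed format).
    Returns None when no user is present.
    """
    if not limb_centers:
        return None
    mid_x = image_width / 2
    return min(limb_centers, key=lambda uid: abs(limb_centers[uid][0] - mid_x))


# Hypothetical 1280-pixel-wide frame with three users.
target = select_target_user(
    {"u1": (100, 300), "u2": (620, 310), "u3": (1100, 290)}, image_width=1280
)
```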
  • performing hand detection on the acquired first image to be detected includes:
  • S1011 Perform limb detection on the acquired first image to be detected to obtain limb detection information.
  • limb detection may be performed on the first image to be detected, and the limb detection information of each user included in the first image to be detected is determined.
  • the limb detection information may include position information of a plurality of limb joint points, a limb identification corresponding to the user (the limb identification may be associated with the hand identification information included in the hand detection information), etc.; or the limb detection information may include the user's limb contour information, where the limb contour information includes position information of multiple limb contour points.
  • the limb detection information may be the limb detection information of the user's half body.
  • the limb joint points may be image key points extracted from the identified limb images of each user by performing limb detection on the first image to be detected by an image detection method.
  • if the user's limb identification exists in a historical to-be-detected image before the first to-be-detected image, the limb identification tracked and determined in the historical to-be-detected image is used as the user's limb identification in the first to-be-detected image; if the user's limb identification does not exist in the historical to-be-detected image before the first to-be-detected image, a corresponding limb identification is generated for the user.
  • the limb detection information of at least one user can be used to perform hand detection on the first image to be detected, and the hand detection information of the target hand associated with the limb can be determined.
  • the hand region image of the hand associated with the limb on the first to-be-detected image can be determined according to the limb detection information, and hand detection can be performed on the hand region image to obtain the hand detection information associated with the limb;
  • then, according to the gesture category included in the hand detection information, the target hand matching the preset gesture category is determined.
  • the constructed first neural network may be trained so that the trained first neural network satisfies a first preset condition, for example, the loss value of the trained first neural network is smaller than a set loss threshold; the trained first neural network is used to perform limb detection on the first image to be detected, and determine the limb detection information of at least one user included in the first image to be detected.
  • the number of the limb joint points and the positions of the limb joint points included in the limb detection information can be set as required. For example, the number of limb joint points can be 14, 17, etc.
  • the second neural network for detecting the hand can also be trained, so that the trained second neural network satisfies the second preset condition; the trained second neural network can then be used, based on the limb detection information, to perform hand detection on the first image to be detected and determine the hand detection information of the target hand associated with the limb.
  • limb detection can be performed on the first image to be detected first to determine the limb detection information; hand detection is then performed on the first image to be detected based on the limb detection information, so that the hand detection information of the target hand associated with the limb can be determined more accurately.
  • performing hand detection on the acquired first image to be detected includes:
  • a first neural network may be used to perform limb detection on the first image to be detected to obtain limb detection information of at least one user;
  • a second neural network may be used to perform hand detection on the first image to be detected to obtain hand detection information corresponding to at least one hand. The target hand is determined according to the gesture category indicated by the hand detection information.
  • the distance between the hand and each limb is determined according to the position information of the limb center point indicated by the limb detection information and the position information of the hand center point indicated by the hand detection information; the limb with the shortest distance from the target hand is then associated with the target hand, that is, the hand detection information of the target hand associated with the limb is obtained.
  • the hand detection information of the target hand associated with the limb can be determined through the distance between the hand and the limb, and the determination process is simple and easy to implement.
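  • The shortest-distance association described above can be sketched as follows. The function name and the dictionary input format are illustrative assumptions; the center points are assumed to be `(x, y)` pixel coordinates in the same image coordinate system.

```python
from math import dist


def associate_hand_with_limb(hand_center, limb_centers):
    """Return the id of the limb whose center point is nearest to the hand center.

    hand_center: (x, y) center of the target hand's detection frame.
    limb_centers: dict mapping a limb id to its (x, y) center point (assumed format).
    Returns None when no limb was detected.
    """
    if not limb_centers:
        return None
    return min(limb_centers, key=lambda limb_id: dist(hand_center, limb_centers[limb_id]))


# Hypothetical centers: the hand is much closer to limb_1 than to limb_2.
limb_id = associate_hand_with_limb((300, 400), {"limb_1": (320, 350), "limb_2": (700, 360)})
```

  • A nearest-center rule is cheap to compute but can mis-associate hands when users stand close together, which is one reason the limb-first variant described earlier may be more accurate.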
  • the limb joint point information of the target user in FIG. 2 may include head vertex 5, head center point 4, neck joint point 3, left shoulder joint point 9, right shoulder joint point 6, left elbow joint point 10, right elbow joint point 7.
  • the hand detection frame can include four of the right hand detection frame.
  • the gesture recognition result includes, but is not limited to, gesture category, hand position information, and the like.
  • the second to-be-detected image is one or more frames of images acquired after the first to-be-detected image.
  • before controlling the target device based on the gesture recognition result, the method further includes:
  • when the gesture recognition result satisfies the cut-off condition, re-determining the hand detection information of the target hand matching the preset gesture category.
  • the gesture recognition result satisfying the cut-off condition includes one or more of the following:
  • Condition 1: the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category, and the invalid gesture category includes at least one of the following: the gesture category does not match the preset gesture category, and the target hand has not moved;
  • Condition 2: in the case where the second image to be detected includes multiple frames, the number of frames in which the gesture category indicated by the gesture recognition result of the target hand is the invalid gesture category is greater than or equal to the number threshold, and/or the duration for which it is the invalid gesture category is greater than or equal to the duration threshold;
  • Condition 3: the gesture category indicated by the gesture recognition result of the target hand is a valid gesture category, and the valid gesture category is used to instruct to re-determine the target hand and/or hand detection information.
  • the gesture recognition result of the target hand can be detected in real time to determine whether it satisfies the cut-off condition. When the gesture recognition result is detected to satisfy the cut-off condition, the target hand no longer controls the target device, so the hand detection information of the target hand that matches the preset gesture category can be re-determined, so that at least one user in the second to-be-detected image can control the target device in real time.
  • the hand detection information of the target hand matching the preset gesture category is re-determined, so as to use the re-determined gesture recognition result of the target hand to control the target device.
  • the cut-off condition includes but is not limited to one or more of the first condition, the second condition, and the third condition.
  • the cut-off condition may also include: if the hand detection information of the target hand cannot be detected in the second image to be detected, the hand detection information of the target hand matching the preset gesture category is re-determined.
  • for condition 1: if the gesture category indicated by the gesture recognition result of the target hand in the second image to be detected does not match the preset gesture category, and/or the gesture recognition result of the target hand in the second image to be detected indicates that the target hand has not moved, it is determined that condition 1 is satisfied.
  • it may be determined whether the target hand moves according to the position information of the target hand in the multiple frames of the second images to be detected.
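  • A simple way to decide movement from per-frame positions is sketched below. It assumes the position of the target hand in each frame is reduced to the `(x, y)` center of its detection frame, and it uses a hypothetical pixel tolerance `min_shift` to ignore detection jitter; neither detail is specified in the disclosure.

```python
from math import dist


def hand_moved(centers, min_shift=5.0):
    """True if the hand center shifts by at least `min_shift` pixels between any two consecutive frames.

    centers: list of (x, y) hand centers, one per frame, in time order.
    min_shift: assumed jitter tolerance in pixels.
    """
    return any(dist(a, b) >= min_shift for a, b in zip(centers, centers[1:]))


still = hand_moved([(100, 100), (101, 100), (102, 101)])   # small jitter only
moving = hand_moved([(100, 100), (120, 100)])              # 20-pixel shift
```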
  • for condition 2: when it is detected that the target hand does not move in N consecutive frames of the second image to be detected, where N is a positive integer greater than or equal to the number threshold, it is determined that condition 2 is satisfied; alternatively, condition 2 is determined to be satisfied when the gesture category of the target hand in N consecutive frames of the second image to be detected does not match the preset gesture category, and N is greater than or equal to the number threshold.
  • the number threshold may be set as required, for example, the number threshold may be 3, 5, 10, and so on.
  • it is determined that the second condition is satisfied when the duration of the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category and the duration is greater than or equal to the duration threshold.
  • the duration threshold can be set according to actual needs.
  • a cut-off gesture category may be preset, and the cut-off gesture category is used to instruct the target hand and/or hand detection information to be re-determined.
  • the cut-off gesture category may be a thumbs-up gesture category. When the gesture category of the target hand is thumbs up, it is determined that the target hand satisfies the third condition.
  • after the gesture recognition result of the target hand in the second image to be detected is determined, the target device can be controlled according to the gesture recognition result.
  • the target device may be a smart TV, a smart display screen, or the like.
  • controlling the target device includes at least one of the following: adjusting the volume of the target device; adjusting a working mode of the target device, where the working mode includes turning off or turning on at least some functions of the target device; displaying a movement identifier in the display interface of the target device, or adjusting the display position of the movement identifier in the display interface; reducing or enlarging at least part of the content displayed in the display interface; and sliding or jumping the display interface.
  • in this way, the volume of the target device can be controlled, the target device can be turned off, the display position of the movement identifier in the display interface of the target device can be adjusted, and so on, realizing flexible control of the target device.
  • the first target gesture category may be the gesture category of raising the index finger and the middle finger.
  • if the gesture category of the target hand is the gesture category of raising the index finger and the middle finger, it can be determined that the target hand has triggered the function of adjusting the volume of the target device; the volume can then be increased or decreased according to the moving direction and moving distance of the target hand, and the increased or decreased volume value can be determined.
  • if it is detected that the target hand moves from bottom to top, this indicates that the volume of the target device is to be increased, and the increased volume value can be determined according to the bottom-to-top moving distance and the current volume value; if it is detected that the target hand moves from top to bottom, this indicates that the volume of the target device is to be decreased, and the decreased volume value can be determined according to the top-to-bottom moving distance and the current volume value.
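
The volume adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `adjust_volume`, the linear `gain` from pixels to volume units, and the 0 to 100 volume range are all hypothetical. The text only states that the new volume is determined from the moving direction, the moving distance, and the current volume value. Image coordinates are assumed to have y increasing downward.

```python
def adjust_volume(current_volume: float, y_start: float, y_end: float,
                  gain: float = 0.1, lo: float = 0.0, hi: float = 100.0) -> float:
    """Return the new volume after the target hand moves from y_start to y_end.

    Assumption: image y grows downward, so a bottom-to-top movement
    gives y_end < y_start and raises the volume.
    """
    upward_distance = y_start - y_end          # > 0: moved up, < 0: moved down
    new_volume = current_volume + gain * upward_distance
    return max(lo, min(hi, new_volume))        # clamp to the assumed volume range
```

Clamping keeps a long swipe from pushing the value outside the range the device accepts; a non-linear mapping would fit the description equally well.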
  • the second target gesture category may be the OK gesture category.
  • if the gesture category of the target hand is the OK gesture category, it can be determined that the target hand has triggered the function of turning off the target device, and the target device can then be turned off in response to the function triggered by the user.
  • the third target gesture category may be the gesture category of raising the index finger. If the gesture category of the target hand indicated by the gesture recognition result is the gesture category of raising the index finger, it can be determined that the target user has triggered the click function at the target display position of the target device that matches the current position of the target hand, and the target device can be controlled to display the content that matches the target display position, or the display interface can be controlled to slide or jump.
  • the method further includes:
  • Step 1: determine the target joint point position information of each user in the first to-be-detected image;
  • Step 2: take each user in the first to-be-detected image in turn as the target user and, based on the target joint point position information of the target user, determine the horizontal distance between the target joint point of the target user and the target joint point of each of the other users among the multiple users;
  • Step 3: when it is determined based on the horizontal distance that there is no interfering user among the other users, take the default gesture category of the target user as the preset gesture category of the target user, where the interfering users include users whose horizontal distance is less than the distance threshold corresponding to the target user;
  • Step 4: when it is determined based on the horizontal distance that there is an interfering user among the other users, adjust the default gesture category of the target user, and use the adjusted default gesture category as the preset gesture category of the target user.
  • adjusting the default gesture category includes at least one of the following operations: adding types of default gesture categories, adding types of gesture categories used to control at least one function of the target device, and changing the movement detection of the gesture category to movement detection of the hand detection frame.
  • when multiple users are included in the first image to be detected, each user can be regarded as a target user, and whether an interfering user exists for the target user can be determined based on the target joint point position information of the target user and the target joint point position information of the other users.
  • when an interfering user exists, the gesture fault-tolerance mechanism corresponding to the target user can be adjusted, that is, the adjusted default gesture category is used as the preset gesture category of the target user, which can alleviate the influence of the interfering user on the gesture category detection of the target user.
  • limb detection can be performed on the first image to be detected to determine the limb detection information of each user in the first image to be detected. The limb detection information can include target joint point position information, so that the joint point position information of each user is obtained.
  • the target joint point can be selected as required; for example, the target joint point can be the center point of the limb, that is, the center point 12 of the half-body limb in FIG. 2, or the center point 0 of the crotch in FIG. 2.
  • each user in the first to-be-detected image can be used as a target user, and, based on the target joint point position information of the target user, the horizontal distance between the target joint point of the target user and the target joint point of each of the other users among the multiple users can be determined; that is, the abscissa values indicated by the target joint point position information of the target user and of the other user are subtracted to obtain the horizontal distance between their target joint points.
  • after Step 2, it can be determined whether there is an interfering user among the other users: if not, Step 3 is performed; if so, Step 4 is performed.
  • if the horizontal distance between another user and the target user is less than the distance threshold corresponding to the target user, that user is determined to be an interfering user; if the horizontal distance between another user and the target user is greater than or equal to the distance threshold corresponding to the target user, that user is determined not to be an interfering user.
  • the distance threshold corresponding to the target user can be determined according to the following steps A1 to A3:
  • Step A1 determining the position information of the first joint point and the position information of the second joint point of the target user
  • Step A2 based on the position information of the first joint point and the position information of the second joint point, determine the intermediate distance used to represent the shoulder width of the target user;
  • Step A3 Determine the distance threshold corresponding to the target user based on the intermediate distance corresponding to the target user.
  • the first joint point may be the left shoulder joint point 9 in FIG. 2, and the second joint point may be the neck joint point 3 in FIG. 2; or, the first joint point may be the right shoulder joint point 6 in FIG. 2, and the second joint point may be the neck joint point 3 in FIG. 2; or, the first joint point may be the right shoulder joint point 6 in FIG. 2, and the second joint point may be the left shoulder joint point 9 in FIG. 2.
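  • based on the position information of the first joint point and the position information of the second joint point, the intermediate distance used to represent the shoulder width of the target user can be determined; for example, the abscissa values indicated by the position information of the two joint points are subtracted to determine the intermediate distance.
  • the distance threshold corresponding to the target user is then determined based on the intermediate distance.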
  • the determined intermediate distance may be used as the distance threshold corresponding to the target user; alternatively, the determined intermediate distance may be reduced or enlarged, and the reduced or enlarged intermediate distance may be used as the distance threshold corresponding to the target user.
  • the intermediate distance representing the shoulder width of the target user can be determined based on the determined position information of the first joint point and the position information of the second joint point, and then the distance threshold of the target user can be determined based on the intermediate distance corresponding to the target user.
  • different users correspond to different distance thresholds. By determining the corresponding distance threshold for each target user, it can be more accurately judged whether other users will cause interference to the target user.
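
Steps A1 to A3 and the interfering-user check can be sketched as follows, using the definition given in Step 3 (an interfering user is one whose horizontal distance to the target user is less than the target user's distance threshold). The names `UserJoints`, `distance_threshold`, `interfering_users`, and the `scale` factor are hypothetical and chosen for illustration; `scale` stands in for the optional reduction or enlargement of the intermediate distance.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserJoints:
    """Hypothetical container for the abscissas (pixels) of one user's joints."""
    user_id: str
    target_joint_x: float   # e.g. center point of the limb
    first_joint_x: float    # e.g. right shoulder joint point
    second_joint_x: float   # e.g. left shoulder joint point


def distance_threshold(user: UserJoints, scale: float = 1.0) -> float:
    """Steps A1 to A3: shoulder-width proxy, optionally reduced or enlarged."""
    intermediate = abs(user.first_joint_x - user.second_joint_x)
    return scale * intermediate


def interfering_users(target: UserJoints, others: List[UserJoints],
                      scale: float = 1.0) -> List[str]:
    """IDs of users whose horizontal distance to the target is below its threshold."""
    threshold = distance_threshold(target, scale)
    return [o.user_id for o in others
            if abs(o.target_joint_x - target.target_joint_x) < threshold]
```

Because the threshold is derived from each target user's own shoulder width, a user standing farther from the camera (and so appearing narrower) gets a proportionally smaller threshold.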
  • in Step 3, if the target user has no interfering user, the default gesture category of the target user can be used as the preset gesture category of the target user, and there is no need to adjust the default gesture category.
  • in Step 4, if it is determined that the target user has an interfering user, the default gesture category corresponding to the target user may be adjusted, and the adjusted default gesture category is used as the preset gesture category of the target user.
  • the types of default gesture categories can be added. For example, if the default gesture category before adjustment is a dynamic one-finger circle gesture, the adjusted default gesture categories can include: the one-finger circle gesture category and the fist circle gesture category.
  • the types of gesture categories used to control at least one function of the target device can be added. For example, if the first target gesture category for controlling the volume of the target device before the addition is the gesture category of raising the index finger and the middle finger, the first target gesture categories for controlling the volume of the target device after the addition may include: the gesture category of raising the index finger and the middle finger, the palm gesture category, the gesture category of raising three fingers, and the like.
  • the types of cut-off gesture categories may also be added. For example, if the cut-off gesture category before the addition is the thumbs-up gesture category, the cut-off gesture categories after the addition may be the thumbs-up gesture category, the gesture category of raising the index finger, the gesture category of raising the little finger, and the like.
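
The adjustment of the default gesture categories can be sketched as a small configuration transform. This is an illustrative sketch only: `GestureConfig`, `preset_config`, the string labels for gesture categories, and the `track_by_box` flag are hypothetical names, and the specific categories added are simply the examples given in the text.

```python
import copy
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class GestureConfig:
    wake: Set[str]                    # default (preset) gesture categories
    functions: Dict[str, Set[str]]    # function name -> gesture categories that trigger it
    cutoff: Set[str]                  # cut-off gesture categories
    track_by_box: bool = False        # False: movement detection follows the gesture category


def preset_config(default: GestureConfig, has_interfering_user: bool) -> GestureConfig:
    """Return the preset configuration for a target user (Steps 3 and 4)."""
    if not has_interfering_user:
        return default                              # Step 3: use the default unchanged
    adjusted = copy.deepcopy(default)               # Step 4: widen the accepted gestures
    adjusted.wake |= {"fist_circle"}
    adjusted.functions.setdefault("volume", set()).update({"palm", "three_fingers_up"})
    adjusted.cutoff |= {"index_up", "little_finger_up"}
    adjusted.track_by_box = True                    # follow the hand detection frame instead
    return adjusted
```

Working on a deep copy keeps the shared default untouched, so each target user can receive an independently adjusted configuration.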
  • the movement detection of the gesture category can also be changed to movement detection of the hand detection frame; that is, before the adjustment, the real-time movement of the gesture category is detected, and the display position of the movement identifier on the target device is determined based on the detection result of the gesture category.
  • for example, before the adjustment, the target hand may be detected first and the current gesture category corresponding to the target hand determined; when the current gesture category matches the set movement gesture category, the hand position of the target hand is determined, and the display position of the movement identifier on the target device is determined based on the hand position of the target hand; when the current gesture category does not match the set movement gesture category, the hand position of the target hand is not determined.
  • the hand position of the target hand can be the position of the center point of the hand detection frame corresponding to the target hand, or the position of a hand center point set on the target hand.
  • after the adjustment, the real-time movement of the hand detection frame can be detected, and the display position of the movement identifier on the target device can be determined based on the detection result of the hand detection frame.
  • for example, the position information of the hand detection frame of the target hand may be determined, and the display position of the movement identifier on the target device is determined based on the position information of the hand detection frame (for example, the position information of the center point of the hand detection frame); in this case, there is no need to detect the current gesture category of the target hand.
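
The two movement-detection modes can be contrasted in a short sketch. The function names, the `(x_min, y_min, x_max, y_max)` box layout, the `"index_up"` label for the set movement gesture category, and the linear scaling from image coordinates to screen coordinates are assumptions made for illustration; the text only specifies which input decides whether a position is produced.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max), image pixels


def box_center(box: Box) -> Tuple[float, float]:
    x0, y0, x1, y1 = box
    return (x0 + x1) / 2.0, (y0 + y1) / 2.0


def identifier_position(box: Box, gesture: Optional[str],
                        image_size: Tuple[int, int], screen_size: Tuple[int, int],
                        track_by_box: bool,
                        move_gesture: str = "index_up") -> Optional[Tuple[int, int]]:
    """Display position of the movement identifier, or None if it is not updated."""
    # Before adjustment: a position is produced only while the movement gesture is shown.
    if not track_by_box and gesture != move_gesture:
        return None
    # After adjustment: the hand detection frame alone decides the position.
    cx, cy = box_center(box)
    (iw, ih), (sw, sh) = image_size, screen_size
    return round(cx / iw * sw), round(cy / ih * sh)
```

With `track_by_box=True` the gesture classifier can be skipped entirely for this function, which is the saving the text points out.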
  • an embodiment of the present disclosure also provides an image recognition-based device control apparatus.
  • referring to the schematic architecture diagram of an image recognition-based device control apparatus provided by an embodiment of the present disclosure, the apparatus includes a first determination module 301, a detection module 302, and a control module 303, wherein:
  • the first determining module 301 is configured to perform hand detection on the acquired first image to be detected, and determine the hand detection information of the target hand matching the preset gesture category;
  • the detection module 302 is configured to perform, based on the hand detection information of the target hand, limb tracking detection on the target limb connected to the target hand in the acquired second image to be detected, and determine the gesture recognition result of the target hand in the second to-be-detected image; wherein the second to-be-detected image is an image acquired after the first to-be-detected image;
  • the control module 303 is configured to control the target device based on the gesture recognition result.
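  • in a possible implementation, the apparatus further includes a second determination module 304 configured to, before the target device is controlled based on the gesture recognition result:
  • detect whether the target hand satisfies the cut-off condition, and, when it is detected that the gesture recognition result satisfies the cut-off condition, re-determine, in the second image to be detected, the hand detection information of the target hand matching the preset gesture category.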
  • the target hand satisfying the cut-off condition includes one or more of the following:
  • in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category, where the invalid gesture category includes at least one of the following: the gesture category does not match the preset gesture category, and the target hand has not moved;
  • in the case where the second image to be detected includes multiple frames, the number of frames in which the gesture category indicated by the gesture recognition result of the target hand is the invalid gesture category is greater than or equal to the number threshold, and/or the duration is greater than or equal to the duration threshold;
  • the gesture category indicated by the gesture recognition result of the target hand is a valid gesture category, and the valid gesture category is used to instruct to re-determine the target hand and/or hand detection information.
  • the first determination module 301, when performing hand detection on the acquired first image to be detected, is configured to:
  • perform limb detection on the acquired first image to be detected to obtain limb detection information; and, based on the limb detection information, perform hand detection on the first image to be detected and determine the hand detection information of the target hand associated with the limb.
  • the control module 303, when controlling the target device, is configured to perform at least one of the following:
  • adjusting the working mode of the target device, where the working mode includes turning off or turning on at least some functions of the target device;
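  • the apparatus further includes an adjustment module 305, which is configured to:
  • determine the target joint point position information of each user in the first to-be-detected image; take each user as a target user and determine the horizontal distance between the target joint point of the target user and the target joint points of the other users; and, when it is determined based on the horizontal distance that there is no interfering user among the other users, take the default gesture category of the target user as the preset gesture category of the target user, where the interfering users include users whose horizontal distance is smaller than the distance threshold corresponding to the target user.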
  • the adjustment module 305 is further configured to:
  • the default gesture category of the target user is adjusted, and the adjusted default gesture category is used as the preset gesture category of the target user.
  • adjusting the default gesture category includes at least one of the following operations: adding types of default gesture categories, adding types of gesture categories used to control at least one function of the target device, and changing the movement detection of the gesture category to movement detection of the hand detection frame.
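  • the apparatus further includes a distance threshold determination module 306, which is configured to determine the distance threshold corresponding to the target user according to the following steps:
  • determine the position information of the first joint point and the position information of the second joint point of the target user; determine, based on this position information, the intermediate distance used to represent the shoulder width of the target user; and, based on the intermediate distance corresponding to the target user, determine the distance threshold corresponding to the target user.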
  • the functions or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments.
  • for specific implementations, reference may be made to the descriptions in the above method embodiments, which are not repeated here.
  • an embodiment of the present disclosure also provides an electronic device.
  • referring to the schematic structural diagram of an electronic device provided by an embodiment of the present disclosure, the electronic device includes a processor 401, a memory 402, and a bus 403.
  • the memory 402 is configured to store execution instructions and includes an internal memory 4021 and an external memory 4022; the internal memory 4021 is configured to temporarily store operation data of the processor 401 and data exchanged with the external memory 4022, such as a hard disk.
  • the processor 401 exchanges data with the external memory 4022 through the internal memory 4021.
  • the processor 401 and the memory 402 communicate through the bus 403, so that the processor 401 executes the following instructions:
  • perform hand detection on the acquired first image to be detected, and determine the hand detection information of the target hand matching the preset gesture category;
  • based on the hand detection information of the target hand, perform limb tracking detection on the target limb connected to the target hand in the acquired second to-be-detected image, and determine the gesture recognition result of the target hand in the second to-be-detected image, wherein the second to-be-detected image is an image obtained after the first to-be-detected image;
  • based on the gesture recognition result, control the target device.
  • embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the image recognition-based device control method described in the foregoing method embodiments is executed.
  • the computer program product of the image recognition-based device control method provided by the embodiments of the present disclosure includes a computer-readable storage medium storing computer-readable code, and the instructions included in the computer-readable code can be used to execute the image recognition-based device control method described in the above method embodiments.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
  • by performing hand detection on the first image to be detected, the hand detection information of the target hand that matches the preset gesture category is determined, and, based on the hand detection information of the target hand, limb tracking detection is performed on the target limb connected to the target hand in the acquired second image to be detected, to determine the gesture recognition result of the target hand in the second to-be-detected image.
  • in this way, a target hand that is difficult to track and detect directly can be tracked by means of limb tracking, and the target device can then be controlled based on the gesture recognition result.
  • among the hands of many users, or between the two hands of the same user, the target hand is locked and, relying on the unique match between a limb and its hand, limb tracking is carried out for the purpose of tracking the target hand; the gesture recognition result of the target hand in the second to-be-detected image is then obtained from the limb tracking result. This effectively reduces the interference caused by other users' hand movements when performing image recognition on the gestures with which the target user corresponding to the target hand controls the target device, improves the accuracy of image recognition, and thereby improves the control accuracy of the target device.
  • with the technical solution provided by the present disclosure, the target user who controls the target device can be effectively distinguished among multiple users and, to a certain extent, when both hands of the target user are moving, one of them can be selected as the target hand so that the target device is controlled accurately. It should be noted that, if some control operations are triggered by the user's two hands each performing a corresponding action, the technical solution provided by the present disclosure can lock the target user and control the target device based on the hand movements of the target user's two hands.

Abstract

The present disclosure provides an image recognition-based device control method and apparatus, an electronic device, and a computer readable storage medium. The method comprises: performing hand detection on an obtained first image to be tested, and determining hand detection information of a target hand matching a preset gesture category; on the basis of the hand detection information of the target hand, performing limb tracking detection on a target limb connected to the target hand in an obtained second image to be tested, and determining a gesture recognition result of the target hand in said second image, said second image being an image acquired after said first image; and controlling a target device on the basis of the gesture recognition result.

Description

Image recognition-based device control method and apparatus, electronic device, and computer-readable storage medium
Cross-Reference to Related Applications
The present disclosure is based on, and claims priority to, the Chinese patent application with application number 202110301465.0, an application date of March 22, 2020, and the title "Device control method, apparatus, electronic device and storage medium". The entire contents of this Chinese patent application are hereby incorporated by reference into the present disclosure.
Technical Field
The present disclosure relates to the technical field of computer vision, and in particular to an image recognition-based device control method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the development of science and technology, people continually place new requirements on the level and quality of human-computer interaction. Because gestures are intuitive and natural, they have become an important means of human-computer interaction, and gesture recognition based on computer vision has therefore become a research focus in the field of human-computer interaction.
Generally, a user's gesture category can be determined from an acquired image, and the determined gesture category can be used to control a target device. However, when there are multiple users in the human-computer interaction scene, the gestures of different users may interfere with one another, which reduces the accuracy of image recognition of the main controlling user's gestures and, in turn, the control accuracy of the target device.
Summary
The embodiments of the present disclosure provide at least an image recognition-based device control method and apparatus, an electronic device, and a computer-readable storage medium, which can improve the accuracy of image recognition and thereby improve the accuracy of controlling a target device based on the image recognition result.
In a first aspect, the present disclosure provides an image recognition-based device control method, including:
performing hand detection on an acquired first image to be detected, and determining the hand detection information of a target hand matching a preset gesture category;
based on the hand detection information of the target hand, performing limb tracking detection on the target limb connected to the target hand in an acquired second image to be detected, and determining the gesture recognition result of the target hand in the second image to be detected, wherein the second image to be detected is an image acquired after the first image to be detected;
controlling a target device based on the gesture recognition result.
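
The three steps of the method can be sketched as a frame loop. This is a skeleton under stated assumptions rather than the disclosed implementation: the detector, the limb tracker, and the device controller are passed in as callables, and all names (`run_control`, `detect_target_hand`, `track_and_recognize`, `control_device`) are hypothetical.

```python
from typing import Any, Callable, Iterable, Optional


def run_control(frames: Iterable[Any],
                detect_target_hand: Callable[[Any], Optional[Any]],
                track_and_recognize: Callable[[Any, Any], Optional[str]],
                control_device: Callable[[str], None]) -> None:
    """Lock a target hand on one frame, then follow it on later frames."""
    hand_info = None
    for frame in frames:
        if hand_info is None:
            # First image to be detected: look for a hand showing the preset gesture.
            hand_info = detect_target_hand(frame)
            continue
        # Second image to be detected: track the limb connected to the target
        # hand and recognize the gesture of that hand.
        gesture = track_and_recognize(frame, hand_info)
        if gesture is not None:
            control_device(gesture)
```

For example, with stub callables that lock onto a hand at frame 1 and echo the frame index as the gesture, frames 2 and 3 are the ones that reach the device.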
In the above method, hand detection is performed on the first image to be detected to determine the hand detection information of the target hand that matches the preset gesture category, and, based on the hand detection information of the target hand, limb tracking detection is performed on the target limb connected to the target hand in the acquired second image to be detected, to determine the gesture recognition result of the target hand in the second image to be detected. In this way, a target hand that is difficult to track and detect directly can be tracked by means of limb tracking, and the target device can then be controlled based on the gesture recognition result. Among the hands of many users, or between the two hands of the same user, the target hand is locked and, relying on the unique match between a limb and its hand, limb tracking is carried out for the purpose of tracking the target hand; the gesture recognition result of the target hand in the second image to be detected is then obtained from the limb tracking result. This effectively reduces the interference caused by other users' hand movements when performing image recognition on the gestures with which the target user corresponding to the target hand controls the target device, improves the accuracy of image recognition, and thereby improves the control accuracy of the target device.
It can be seen that, with the technical solution provided by the present disclosure, the target user who controls the target device can be effectively distinguished among multiple users and, to a certain extent, when both hands of the target user are moving, one of them can be selected as the target hand so that the target device is controlled accurately. It should be noted that, if some control operations are triggered by the user's two hands each performing a corresponding action, the technical solution provided by the present disclosure can lock the target user and control the target device based on the hand movements of the target user's two hands.
In a possible implementation, before the controlling of the target device based on the gesture recognition result, the method further includes:
detecting whether the target hand satisfies a cut-off condition;
when it is detected that the gesture recognition result satisfies the cut-off condition, re-determining, in the second image to be detected, the hand detection information of the target hand matching the preset gesture category.
Here, when it is detected that the gesture recognition result satisfies the cut-off condition, this indicates that the target hand of the target user is no longer controlling the target device, and the hand detection information of the target hand matching the preset gesture category can be re-determined, so that at least one user in the second image to be detected can control the target device in real time.
In a possible implementation, the target hand satisfying the cutoff condition includes one or more of the following:
in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category, where the invalid gesture category includes at least one of the following: the gesture category does not match the preset gesture category, and the target hand has not moved;
when the second image to be detected includes multiple frames, the number of frames in which the gesture category indicated by the gesture recognition result of the target hand is the invalid gesture category is greater than or equal to a number threshold, and/or the duration of such frames is greater than or equal to a duration threshold;
in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is a valid gesture category, and the valid gesture category is used to instruct re-determination of the target hand and/or the hand detection information.
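The three cutoff conditions above can be sketched as a single check. The following is a minimal Python sketch, not the disclosed implementation: the `FrameResult` structure, the function names, the `reset_gestures` parameter, and the choice to count only the trailing run of invalid frames are all assumptions made for illustration. The text itself only requires that the number and/or duration of invalid frames reach a threshold.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FrameResult:
    """Per-frame recognition result for the target hand (hypothetical structure)."""
    gesture: Optional[str]  # recognized gesture category, or None if nothing was recognized
    moved: bool             # whether the target hand moved relative to the previous frame
    timestamp: float        # capture time in seconds


def is_invalid(result, preset_gestures):
    # Assumption: a frame is invalid if the gesture does not match a preset
    # category or if the hand did not move.
    return result.gesture not in preset_gestures or not result.moved


def cutoff_reached(results: Sequence[FrameResult], preset_gestures,
                   reset_gestures=frozenset(),
                   count_threshold=None, duration_threshold=None):
    """Return True if the target hand should be re-determined."""
    if not results:
        return False
    # Third condition: a valid gesture that explicitly requests re-selection.
    if results[-1].gesture in reset_gestures:
        return True
    # Collect the trailing run of invalid frames, newest first.
    run = []
    for r in reversed(results):
        if not is_invalid(r, preset_gestures):
            break
        run.append(r)
    if not run:
        return False
    # First condition: with no thresholds configured, one invalid frame is enough.
    if count_threshold is None and duration_threshold is None:
        return True
    # Second condition: frame count and/or duration of the invalid run.
    if count_threshold is not None and len(run) >= count_threshold:
        return True
    if duration_threshold is not None:
        if run[0].timestamp - run[-1].timestamp >= duration_threshold:
            return True
    return False
```

Passing both thresholds gives the "and/or" behaviour as a logical OR; a caller that needs both to hold would combine two separate calls.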
In a possible implementation, performing hand detection on the acquired first image to be detected includes:
performing limb detection on the acquired first image to be detected to obtain limb detection information;
performing hand detection on the first image to be detected based on the limb detection information, and determining the hand detection information of the target hand associated with the limb.
Because a hand is relatively difficult to track and detect in an image, whereas a limb is easier to track and detect, and because the hand is connected to the limb, limb detection can first be performed on the first image to be detected to determine the limb detection information. Hand detection is then performed on the first image to be detected based on the limb detection information, so that the hand detection information of the target hand associated with the limb can be determined more accurately.
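One way to read "hand detection based on the limb detection information" is to restrict the hand detector to a region around the end of the detected arm. A minimal sketch under that assumption follows; the elbow and wrist keypoints, the square region, and the `scale` factor are hypothetical choices that the disclosure does not specify.

```python
import math


def hand_search_region(elbow, wrist, image_size, scale=0.5):
    """Return (x1, y1, x2, y2): a square around the wrist, clipped to the image.

    elbow, wrist: (x, y) keypoints from limb detection (assumed inputs).
    image_size:   (width, height) of the image to be detected.
    scale:        half-side of the square as a fraction of forearm length
                  (an illustrative assumption).
    """
    width, height = image_size
    forearm = math.hypot(wrist[0] - elbow[0], wrist[1] - elbow[1])
    half = scale * forearm
    x1 = max(0, wrist[0] - half)
    y1 = max(0, wrist[1] - half)
    x2 = min(width, wrist[0] + half)
    y2 = min(height, wrist[1] + half)
    return (x1, y1, x2, y2)
```

A hand detector would then be run only inside the returned region, which ties any hand it finds to the limb that produced the region.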
In a possible implementation, performing hand detection on the acquired first image to be detected includes:
performing limb detection and hand detection separately on the acquired first image to be detected to obtain limb detection information and the hand detection information;
determining the distance between the hand and the limb based on the limb detection information and the hand detection information;
determining, based on the distance, the hand detection information of the target hand associated with the limb.
Here, the hand detection information of the target hand associated with the limb can be determined from the distance between the hand and the limb, and this determination process is simple and easy to implement.
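The distance-based association described above can be sketched as a nearest-neighbour match. This is an illustrative sketch only: representing each limb by its wrist keypoint, using the Euclidean distance to the centre of the hand detection box, and rejecting matches beyond `max_distance` are assumptions, since the disclosure only states that the association is based on the distance between the hand and the limb.

```python
import math


def box_center(box):
    """Centre of an (x1, y1, x2, y2) detection box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def associate_hands(wrists, hand_boxes, max_distance):
    """Map each detected hand to the nearest limb.

    wrists:       dict limb_id -> (x, y) wrist keypoint (assumed limb representation)
    hand_boxes:   dict hand_id -> (x1, y1, x2, y2) hand detection box
    max_distance: hands farther than this from every wrist stay unassociated
    """
    pairs = {}
    for hand_id, box in hand_boxes.items():
        cx, cy = box_center(box)
        best_id, best_d = None, max_distance
        for limb_id, (wx, wy) in wrists.items():
            d = math.hypot(cx - wx, cy - wy)
            if d <= best_d:
                best_id, best_d = limb_id, d
        if best_id is not None:
            pairs[hand_id] = best_id
    return pairs
```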
In a possible implementation, controlling the target device includes at least one of the following:
adjusting the volume of the target device;
adjusting the working mode of the target device, where the working mode includes turning off or turning on at least some functions of the target device;
displaying a movement indicator in the display interface of the target device, or adjusting the display position of the movement indicator in the display interface;
zooming out or zooming in on at least part of the content displayed in the display interface;
sliding or jumping of the display interface.
Here, based on the gesture recognition result, the volume of the target device can be controlled, the target device can be turned off, the display position of the movement indicator in the display interface of the target device can be adjusted, and so on, which enables flexible control of the target device.
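As a rough illustration of how a gesture recognition result might be turned into the control operations listed above, the sketch below dispatches on a gesture name. The gesture names, the `DeviceState` fields, and the step sizes are all hypothetical; the disclosure does not fix which gesture triggers which operation.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    """Toy model of a target device (hypothetical fields)."""
    volume: int = 5          # 0..10
    muted: bool = False      # stands in for "turning a function on or off"
    cursor: tuple = (0, 0)   # display position of the movement indicator
    zoom: float = 1.0        # zoom factor of the displayed content


def apply_gesture(state, gesture, dx=0, dy=0):
    """Apply one recognized gesture to the device state and return the state."""
    if gesture == "thumb_up":
        state.volume = min(10, state.volume + 1)
    elif gesture == "thumb_down":
        state.volume = max(0, state.volume - 1)
    elif gesture == "fist":
        state.muted = not state.muted
    elif gesture == "point":
        # dx, dy: displacement of the tracked hand between frames
        state.cursor = (state.cursor[0] + dx, state.cursor[1] + dy)
    elif gesture == "pinch_out":
        state.zoom *= 2.0
    elif gesture == "pinch_in":
        state.zoom /= 2.0
    # Unknown gestures leave the device unchanged.
    return state
```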
In a possible implementation, when the first image to be detected includes multiple users, before performing, based on the hand detection information of the target hand, limb tracking detection on the target limb connected to the target hand in the acquired second image to be detected, the method further includes:
determining the target joint point position information of each user in the first image to be detected;
taking each user in the first image to be detected as a target user and, based on the target joint point position information of the target user, determining the horizontal distance between the target joint point of the target user and the target joint point of each of the other users among the multiple users;
when it is determined, based on the horizontal distance, that no interfering user exists among the other users, using the default gesture category of the target user as the preset gesture category of the target user, where interfering users include users whose horizontal distance is less than the distance threshold corresponding to the target user.
In a possible implementation, the method further includes:
when it is determined, based on the horizontal distance, that an interfering user exists among the other users, adjusting the default gesture category of the target user and using the adjusted default gesture category as the preset gesture category of the target user, where adjusting the default gesture category includes at least one of the following operations: increasing the number of kinds of default gesture categories, increasing the number of kinds of gesture categories used to control at least one function of the target device, and changing the movement detection of the gesture category to movement detection of the hand detection box.
In the above implementation, when the first image to be detected includes multiple users, each user can be taken as a target user, and the horizontal distance between the target joint point of the target user and the target joint points of the other users is determined based on the target joint point position information of the target user and of the other users. When it is determined, based on the horizontal distance, that an interfering user exists among the other users, the gesture fault-tolerance mechanism corresponding to the target user can be adjusted; that is, the adjusted default gesture category is used as the preset gesture category of the target user, which mitigates the effect of the interfering user on gesture category detection for the target user.
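The interference check described above reduces to comparing horizontal distances against a per-user threshold. The sketch below is illustrative only: the dictionary-based inputs and the function names are assumptions, and `select_preset` simply switches between a default and an already adjusted gesture set without modelling how the adjustment itself is made.

```python
def find_interfering_users(joint_x, thresholds):
    """For each user, list the other users that count as interfering.

    joint_x:    dict user_id -> x coordinate of that user's target joint point
    thresholds: dict user_id -> distance threshold of that user
    """
    result = {}
    for target, tx in joint_x.items():
        result[target] = [
            other for other, ox in joint_x.items()
            if other != target and abs(ox - tx) < thresholds[target]
        ]
    return result


def select_preset(default_gestures, adjusted_gestures, interferers):
    """Use the default gesture set unless an interfering user is present."""
    return set(adjusted_gestures) if interferers else set(default_gestures)
```

Because the threshold belongs to the target user, the relation need not be symmetric: user B may interfere with user A while A does not interfere with B.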
In a possible implementation, the distance threshold corresponding to the target user is determined according to the following steps:
determining the position information of a first joint point and the position information of a second joint point of the target user;
determining, based on the position information of the first joint point and the position information of the second joint point, an intermediate distance that represents the shoulder width of the target user;
determining, based on the intermediate distance, the distance threshold corresponding to the target user.
With the above method, the intermediate distance representing the shoulder width of the target user can be determined from the position information of the first joint point and of the second joint point, and the distance threshold of the target user can then be determined from that intermediate distance. Different users correspond to different distance thresholds; by determining a distance threshold for each target user, it can be judged more accurately whether other users will interfere with the target user.
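The steps above can be sketched as follows. This is a minimal example under stated assumptions: the first and second joint points are taken to be the left and right shoulder keypoints, the intermediate distance is their Euclidean distance in image coordinates, and the threshold is a fixed multiple (`scale`) of that distance. The disclosure only says the threshold is determined "based on" the intermediate distance, so the linear scaling is hypothetical.

```python
import math


def shoulder_width(first_joint, second_joint):
    """Intermediate distance between two (x, y) shoulder keypoints."""
    return math.hypot(first_joint[0] - second_joint[0],
                      first_joint[1] - second_joint[1])


def distance_threshold(first_joint, second_joint, scale=1.5):
    """Per-user distance threshold as a multiple of the shoulder width.

    scale is an illustrative assumption, not a value from the disclosure.
    """
    return scale * shoulder_width(first_joint, second_joint)
```

Scaling by shoulder width makes the threshold larger for users who appear bigger in the image, for example because they are closer to the camera.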
For the effects of the following apparatus, electronic device, and so on, refer to the description of the method above; they are not repeated here.
In a second aspect, the present disclosure provides an image recognition-based device control apparatus, including:
a first determining module, configured to perform hand detection on an acquired first image to be detected and determine the hand detection information of a target hand matching a preset gesture category;
a detection module, configured to perform, based on the hand detection information of the target hand, limb tracking detection on the target limb connected to the target hand in an acquired second image to be detected, and to determine the gesture recognition result of the target hand in the second image to be detected, where the second image to be detected is an image acquired after the first image to be detected;
a control module, configured to control a target device based on the gesture recognition result.
In a possible implementation, the apparatus further includes a second determining module configured to, before the target device is controlled based on the gesture recognition result:
detect whether the target hand satisfies a cutoff condition;
when it is detected that the gesture recognition result satisfies the cutoff condition, re-determine, in the second image to be detected, the hand detection information of a target hand matching the preset gesture category.
In a possible implementation, the target hand satisfying the cutoff condition includes one or more of the following:
in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category, where the invalid gesture category includes at least one of the following: the gesture category does not match the preset gesture category, and the target hand has not moved;
when the second image to be detected includes multiple frames, the number of frames in which the gesture category indicated by the gesture recognition result of the target hand is the invalid gesture category is greater than or equal to a number threshold, and/or the duration of such frames is greater than or equal to a duration threshold;
in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is a valid gesture category, and the valid gesture category is used to instruct re-determination of the target hand and/or the hand detection information.
In a possible implementation, the first determining module, when performing hand detection on the acquired first image to be detected, is configured to:
perform limb detection on the acquired first image to be detected to obtain limb detection information;
perform hand detection on the first image to be detected based on the limb detection information, and determine the hand detection information of the target hand associated with the limb.
In a possible implementation, the first determining module, when performing hand detection on the acquired first image to be detected, is configured to:
perform limb detection and hand detection separately on the acquired first image to be detected to obtain limb detection information and the hand detection information;
determine the distance between the hand and the limb based on the limb detection information and the hand detection information;
determine, based on the distance, the hand detection information of the target hand associated with the limb.
In a possible implementation, controlling the target device by the control module includes at least one of the following:
adjusting the volume of the target device;
adjusting the working mode of the target device, where the working mode includes turning off or turning on at least some functions of the target device;
displaying a movement indicator in the display interface of the target device, or adjusting the display position of the movement indicator in the display interface;
zooming out or zooming in on at least part of the content displayed in the display interface;
sliding or jumping of the display interface.
In a possible implementation, when the first image to be detected includes multiple users, the apparatus further includes an adjustment module configured to, before limb tracking detection is performed, based on the hand detection information of the target hand, on the target limb connected to the target hand in the acquired second image to be detected:
determine the target joint point position information of each user in the first image to be detected;
take each user in the first image to be detected as a target user and, based on the target joint point position information of the target user, determine the horizontal distance between the target joint point of the target user and the target joint point of each of the other users among the multiple users;
when it is determined, based on the horizontal distance, that no interfering user exists among the other users, use the default gesture category of the target user as the preset gesture category of the target user, where interfering users include users whose horizontal distance is less than the distance threshold corresponding to the target user.
In a possible implementation, the adjustment module is further configured to:
when it is determined, based on the horizontal distance, that an interfering user exists among the other users, adjust the default gesture category of the target user and use the adjusted default gesture category as the preset gesture category of the target user, where adjusting the default gesture category includes at least one of the following operations: increasing the number of kinds of default gesture categories, increasing the number of kinds of gesture categories used to control at least one function of the target device, and changing the movement detection of the gesture category to movement detection of the hand detection box.
In a possible implementation, the apparatus further includes a distance threshold determining module, configured to determine the distance threshold corresponding to the target user according to the following steps:
determining the position information of a first joint point and the position information of a second joint point of the target user;
determining, based on the position information of the first joint point and the position information of the second joint point, an intermediate distance that represents the shoulder width of the target user;
determining, based on the intermediate distance, the distance threshold corresponding to the target user.
In a third aspect, the present disclosure provides an electronic device, including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the image recognition-based device control method according to the first aspect or any implementation thereof.
In a fourth aspect, the present disclosure provides a computer-readable storage medium storing a computer program that, when run by a processor, performs the image recognition-based device control method according to the first aspect or any implementation thereof.
In a fifth aspect, the present disclosure provides a computer program including computer-readable code, where, when the computer-readable code runs in an electronic device, a processor in the electronic device executes it to implement the image recognition-based device control method according to the first aspect or any implementation thereof.
To make the above objects, features, and advantages of the present disclosure more apparent and understandable, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of Drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings required in the embodiments are briefly introduced below. The drawings are incorporated into and constitute a part of this specification; they show embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting its scope. A person of ordinary skill in the art can derive other related drawings from these drawings without creative effort.
FIG. 1 shows a schematic flowchart of an image recognition-based device control method provided by an embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of limb joint points and a hand detection box in an image recognition-based device control method provided by an embodiment of the present disclosure;
FIG. 3 shows a schematic architecture diagram of an image recognition-based device control apparatus provided by an embodiment of the present disclosure;
FIG. 4 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. All other embodiments obtained by a person skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
Generally, a user's gesture category can be determined from an acquired image, and the determined gesture category can be used to control a target device. However, when multiple users are present in a human-computer interaction scene, the gestures of different users may interfere with one another, which degrades control of the target device through human-computer interaction. To solve this problem and improve control of the target device based on human-computer interaction, the embodiments of the present disclosure provide a device control scheme based on image recognition.
The defects of the above solutions were all identified by the inventor through practice and careful study. Therefore, both the process of discovering the above problems and the solutions that the present disclosure proposes below for them should be regarded as contributions made by the inventor to the present disclosure.
The technical solutions in the present disclosure are described clearly and completely below with reference to the drawings in the present disclosure. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the present disclosure, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. All other embodiments obtained by a person skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
To facilitate understanding of the embodiments of the present disclosure, an image recognition-based device control method disclosed in the embodiments is first introduced in detail. The method is generally executed by a computer device with a certain computing capability, for example a terminal device, a server, or another processing device. The terminal device may be user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. In some possible implementations, the image recognition-based device control method may be implemented by a processor calling computer-readable instructions stored in a memory.
FIG. 1 is a schematic flowchart of the image recognition-based device control method provided by an embodiment of the present disclosure. The method includes S101-S103, where:
S101, performing hand detection on an acquired first image to be detected, and determining the hand detection information of a target hand matching a preset gesture category;
S102, performing, based on the hand detection information of the target hand, limb tracking detection on the target limb connected to the target hand in an acquired second image to be detected, and determining the gesture recognition result of the target hand in the second image to be detected, where the second image to be detected is an image acquired after the first image to be detected;
S103, controlling a target device based on the gesture recognition result.
Here, the hand detection information refers to the detected feature information, in the first image to be detected, of the target hand matching the preset gesture category, and may include hand position information, a gesture category, hand identification information, and the like. For example, the hand position information may be the coordinates of the vertices of the hand detection box corresponding to the target hand in the image coordinate system of the first image to be detected, or the coordinates of the contour region corresponding to the target hand in that image coordinate system. The gesture category may be the category of the gesture made by the target hand in the first image to be detected; for example, the gesture category may be the "ok" gesture. The hand identification information may be any identifier assigned to the target hand and may consist of numbers, text, patterns, and the like; for example, the hand identification information may be "left hand a1".
The first image to be detected and the second image to be detected may be two temporally adjacent frames in a video stream, or two temporally adjacent frames in a video sequence obtained from the original video stream by frame extraction, sampling, or the like.
In practical applications, if other images exist between the first image to be detected and the second image to be detected, the changes of each object in those other images can usually be ignored. For example, when the time difference between the capture times of the first and second images to be detected is small, the differences between the captured video images can be regarded as small and will not affect the image recognition analysis and processing based on the first image to be detected and the second image to be detected.
In the above method, hand detection is performed on the first image to be detected to determine the hand detection information of the target hand matching the preset gesture category. Based on that hand detection information, limb tracking detection is performed on the target limb connected to the target hand in the acquired second image to be detected, and the gesture recognition result of the target hand in the second image to be detected is determined. In this way, limb tracking is used to follow a target hand that is otherwise difficult to track and detect, and the target device can then be controlled based on the gesture recognition result. Among the hands of many users, or between the two hands of the same user, the target hand is locked, and the unique correspondence between a limb and its hand is used to track the limb for the purpose of tracking the target hand. The gesture recognition result of the target hand in the second image to be detected is obtained from the limb tracking result. This effectively reduces the interference caused by other users' hand movements when recognizing the gestures with which the target user, to whom the target hand belongs, controls the target device, which improves the accuracy of image recognition and therefore the precision of target device control.
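The flow of S101-S103 described above can be sketched as a loop over frames. This is a structural sketch only, not the disclosed implementation: the detector, tracker, and recognizer are passed in as hypothetical callables, hands are represented as plain dictionaries, and dropping the target when the limb is lost is an assumption added to keep the loop well defined.

```python
def control_loop(frames, detect_hands, track_limb, recognize_gesture,
                 control, preset_gestures):
    """Run S101-S103 over a sequence of frames.

    detect_hands(frame)            -> list of dicts with at least a "gesture" key
    track_limb(frame, target)      -> limb handle, or None if the limb is lost
    recognize_gesture(frame, limb) -> gesture recognition result
    control(result)                -> applies the result to the target device
    All four callables are hypothetical placeholders.
    """
    target = None
    for frame in frames:
        if target is None:
            # S101: find a hand whose gesture matches a preset category.
            for hand in detect_hands(frame):
                if hand["gesture"] in preset_gestures:
                    target = hand
                    break
            continue
        # S102: follow the limb connected to the target hand in a later frame.
        limb = track_limb(frame, target)
        if limb is None:
            target = None  # assumption: re-select a target hand on the next frame
            continue
        result = recognize_gesture(frame, limb)
        # S103: control the target device with the recognition result.
        control(result)
```

Only the frame in which the target hand is selected plays the role of the first image to be detected; every later frame is treated as a second image to be detected until the target is lost.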
由此可见,采用本公开提供的技术方案,可以有效甄别多个用户中用于控制目标设备的目标用户,并在一定程度上,当目标用户的两个手部都存在手部动作的情况下,择一确定目标手部,以对目标设备进行准确控制。需要说明的是,若部分控制操作是通过用户的两个手部分别执行相应动作来触控的,那么采用本公开提供的技术方案可以锁定目标用户,并基于目标用户的两个手部对应的手部动作来实现目标设备的控制。It can be seen that, by adopting the technical solution provided by the present disclosure, the target user used to control the target device among multiple users can be effectively identified, and to a certain extent, when both hands of the target user have hand movements , and choose one to determine the target hand to accurately control the target device. It should be noted that, if part of the control operations are touched by the user's two hands performing corresponding actions respectively, then the technical solution provided by the present disclosure can lock the target user, and based on the corresponding two hands of the target user, the target user can be locked. Hand movements to achieve control of the target device.
S101 to S103 are described in detail below.
For S101:
Here, the first image to be detected may be the current image of a configured target area, where the target area is any scene area configured for controlling the target device. In some embodiments, a camera device may be installed on the target device or in the area surrounding the target device, so that the camera device can acquire the first image to be detected of the target area corresponding to the target device. The shooting area of the camera device contains the target area, that is, the target area lies within the shooting range of the camera device.
Hand detection is performed on the first image to be detected to obtain the hand detection information of each user included in the first image to be detected. Then, according to the gesture category information indicated by each user's hand detection information, the hand detection information of the target hand that matches the preset gesture category is determined.
The preset gesture category may be the category of a configured gesture action, and the configured gesture action may be used to control the target device. For example, the preset gesture category may be the category of the "OK" gesture or the category of the "finger heart" gesture.
If the hand detection information of multiple users in the first image to be detected indicates a gesture category identical to the preset gesture category, the target user may be determined from among those users according to the limb center point position information of each user. For example, the user whose limb center point is located in the middle of the first image to be detected is selected as the target user, and the target user's hand is taken as the target hand.
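By way of illustration only, the selection of a target user among several users showing the preset gesture can be sketched as follows. The `UserDetection` record, the `select_target_user` function, and the reading of "in the middle of the image" as the smallest Euclidean distance to the image center are assumptions made for this sketch and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UserDetection:
    limb_id: int                      # hypothetical per-user limb identifier
    gesture: str                      # gesture category from hand detection
    limb_center: Tuple[float, float]  # (x, y) of the limb center point, in pixels


def select_target_user(
    users: List[UserDetection],
    preset_gesture: str,
    image_size: Tuple[int, int],
) -> Optional[UserDetection]:
    """Pick the matching user whose limb center is nearest the image center."""
    cx, cy = image_size[0] / 2, image_size[1] / 2
    # Keep only users whose gesture category equals the preset category.
    candidates = [u for u in users if u.gesture == preset_gesture]
    if not candidates:
        return None
    # Assumption: "middle of the image" means smallest squared distance to center.
    return min(
        candidates,
        key=lambda u: (u.limb_center[0] - cx) ** 2 + (u.limb_center[1] - cy) ** 2,
    )
```

Other tie-breaking rules, such as using only the horizontal offset from the image center, would fit the description equally well.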
In an optional implementation, performing hand detection on the acquired first image to be detected includes:
S1011: performing limb detection on the acquired first image to be detected to obtain limb detection information.
S1012: based on the limb detection information, performing hand detection on the first image to be detected, and determining the hand detection information of the target hand associated with the limb.
Here, limb detection may first be performed on the first image to be detected to determine the limb detection information of each user included in it. The limb detection information may include position information of multiple limb joint points, a limb identifier corresponding to the user (which may be associated with the hand identifier information included in the hand detection information), and so on. Alternatively, the limb detection information may include the user's limb contour information, which includes position information of multiple limb contour points. The limb detection information may be detection information for the user's half body. The limb joint points may be image key points extracted from the recognized limb image of each user when an image detection method performs limb detection on the first image to be detected.
If the user's limb identifier exists in a historical image to be detected that precedes the first image to be detected, the limb identifier determined for that user by tracking in the historical image is used as the user's limb identifier in the first image to be detected. If no limb identifier exists for the user in the historical images to be detected, a corresponding limb identifier is generated for the user.
The limb detection information of at least one user may then be used to perform hand detection on the first image to be detected and determine the hand detection information of the target hand associated with the limb. For example, the hand region image, in the first image to be detected, of the hand associated with the limb may be determined from the limb detection information, and hand detection may be performed on that hand region image to obtain the hand detection information associated with the limb. The target hand matching the preset gesture category is then determined according to the gesture category included in the hand detection information.
In some embodiments, a constructed first neural network may be trained until the trained network satisfies a first preset condition, for example, until its loss value is smaller than a configured loss threshold. The trained first neural network is used to perform limb detection on the first image to be detected and determine the limb detection information of at least one user included in it. The number and positions of the limb joint points included in the limb detection information can be set as required; for example, the number of limb joint points may be 14 or 17. A second neural network for hand detection may also be trained until it satisfies a second preset condition. The trained second neural network can then be used, based on the limb detection information, to perform hand detection on the first image to be detected and determine the hand detection information of the target hand associated with the limb.
Because it is relatively difficult to track and detect a hand in an image, whereas limb tracking and detection is easier to achieve and the hand is connected to the limb, limb detection can be performed on the first image to be detected first to determine the limb detection information, and hand detection can then be performed based on that information. In this way the hand detection information of the target hand associated with the limb can be determined more accurately.
In a possible implementation, performing hand detection on the acquired first image to be detected includes:
S1013: performing limb detection and hand detection separately on the acquired first image to be detected to obtain limb detection information and the hand detection information;
S1014: determining the distance between the hand and the limb based on the limb detection information and the hand detection information;
S1015: based on the distance, determining the hand detection information of the target hand associated with the limb.
For example, the first neural network may be used to perform limb detection on the first image to be detected to obtain the limb detection information of at least one user, and the second neural network may be used to perform hand detection on the first image to be detected to obtain the hand detection information of at least one hand. The target hand is determined according to the gesture category indicated by the hand detection information.
The distance between a hand and a limb is then determined according to the limb center point position information indicated by the limb detection information and the hand center point position information indicated by the hand detection information. The limb with the shortest distance to the target hand is associated with the target hand, which yields the hand detection information of the target hand associated with the limb.
Here, the hand detection information of the target hand associated with the limb is determined from the distance between the hand and the limb, so the determination process is simple and easy to implement.
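The distance-based association of S1014 and S1015 can be sketched as below. This is a minimal illustration that assumes the limb and hand center points are available as pixel coordinates and that the distance is Euclidean; the function name `associate_hand_with_limb` and the dictionary layout are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def associate_hand_with_limb(
    hand_center: Point,
    limb_centers: Dict[int, Point],
) -> Optional[int]:
    """Return the identifier of the limb whose center is closest to the hand."""
    if not limb_centers:
        return None
    # Assumption: Euclidean distance between the two center points.
    return min(
        limb_centers,
        key=lambda limb_id: math.dist(hand_center, limb_centers[limb_id]),
    )
```

For instance, with limb centers `{1: (200, 300), 2: (600, 310)}` and a target hand centered at `(560, 250)`, the hand is associated with limb 2.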
Refer to FIG. 2, a schematic diagram of limb joint points and hand detection frames in an image recognition-based device control method. In FIG. 2, the limb joint point information of the target user may include head vertex 5, head center point 4, neck joint point 3, left shoulder joint point 9, right shoulder joint point 6, left elbow joint point 10, right elbow joint point 7, left wrist joint point 11, right wrist joint point 8, half-body limb center point 12, hip joint point 1, hip joint point 2, and hip center point 0. The hand detection frames may include the four vertices 13, 15, 16, 17 and the center point 14 of the right-hand detection frame, and the four vertices 18, 20, 21, 22 and the center point 19 of the left-hand detection frame.
For S102:
The user corresponding to the target hand is taken as the target user who controls the target device. Based on the hand detection information of the target user's target hand, limb tracking detection is performed on the target limb connected to the target hand in the acquired second image to be detected to determine the target user's limb information in the second image to be detected, and the gesture recognition result of the target hand in the second image to be detected is determined according to that limb information. The gesture recognition result includes, but is not limited to, the gesture category and hand position information.
The second image to be detected is one or more frames of images acquired after the first image to be detected.
In an optional implementation, before controlling the target device based on the gesture recognition result, the method further includes:
1. Detecting whether the gesture recognition result satisfies a cut-off condition;
2. When the gesture recognition result is detected to satisfy the cut-off condition, re-determining, in the second image to be detected, the hand detection information of a target hand matching the preset gesture category. The gesture recognition result satisfies the cut-off condition in one or more of the following cases:
Condition 1: in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category, where the invalid gesture category includes at least one of the following: the gesture category does not match the preset gesture category, and the target hand has not moved;
Condition 2: when the second image to be detected includes multiple frames, the number of frames in which the gesture category indicated by the gesture recognition result of the target hand is the invalid gesture category is greater than or equal to a number threshold, and/or the duration is greater than or equal to a duration threshold;
Condition 3: in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is a valid gesture category, and that valid gesture category is used to instruct re-determination of the target hand and/or the hand detection information.
In implementation, the gesture recognition result of the target hand can be checked in real time to determine whether it satisfies the cut-off condition. When it does, the target hand is no longer controlling the target device, so the hand detection information of a target hand matching the preset gesture category can be re-determined, allowing at least one user in the second image to be detected to control the target device in real time.
When the gesture recognition result is detected to satisfy the cut-off condition, the hand detection information of a target hand matching the preset gesture category is re-determined in the second image to be detected, so that the gesture recognition result of the re-determined target hand is used to control the target device.
The cut-off condition includes, but is not limited to, one or more of Condition 1, Condition 2, and Condition 3. For example, the cut-off condition may also include: if the hand detection information of the target hand cannot be detected in the second image to be detected, the hand detection information of a target hand matching the preset gesture category is re-determined.
Under Condition 1, the condition is determined to be satisfied if, in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand does not match the preset gesture category, and/or the gesture recognition result indicates that the target hand has not moved. For example, whether the target hand has moved can be determined from the position information of the target hand in multiple frames of the second image to be detected.
Under Condition 2, the condition is determined to be satisfied when the target hand is detected not to have moved in N consecutive frames of the second image to be detected and N is greater than or equal to the number threshold, where N is a positive integer; or when the gesture category of the target hand is detected not to match the preset gesture category in N consecutive frames and N is greater than or equal to the number threshold. The number threshold can be set as required, for example 3, 5, or 10. Alternatively, Condition 2 is determined to be satisfied when the duration for which the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category is greater than or equal to the duration threshold. The duration threshold can be set according to actual needs.
Under Condition 3, a cut-off gesture category may be configured in advance to instruct re-determination of the target hand and/or the hand detection information. For example, the cut-off gesture category may be the thumbs-up gesture category; when the gesture category of the target hand is detected to be thumbs-up, the target hand is determined to satisfy Condition 3.
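The cut-off check can be sketched as follows, purely as an illustration. The sketch combines the frame-count form of Condition 2, Condition 3, and the additional "hand no longer detected" case; the `FrameResult` record, the function names, the pixel tolerance `move_eps`, and the gesture labels are all assumptions introduced here rather than details of the disclosure.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Set, Tuple


@dataclass
class FrameResult:
    gesture: Optional[str]                   # None when the target hand is not detected
    position: Optional[Tuple[float, float]]  # hand center in pixels, if detected


def _is_invalid(frames: Sequence[FrameResult], i: int,
                preset: Set[str], move_eps: float) -> bool:
    """Invalid gesture: category mismatch, or no movement since the previous frame."""
    cur = frames[i]
    if cur.gesture is None or cur.gesture not in preset:
        return True
    if i == 0 or cur.position is None or frames[i - 1].position is None:
        return False
    return math.dist(cur.position, frames[i - 1].position) < move_eps


def should_reselect_target(frames: Sequence[FrameResult], preset: Set[str],
                           cutoff_gesture: str, count_threshold: int,
                           move_eps: float = 2.0) -> bool:
    """Return True when the target hand should be re-determined."""
    if not frames:
        return False
    latest = frames[-1]
    if latest.gesture is None:            # hand can no longer be detected
        return True
    if latest.gesture == cutoff_gesture:  # Condition 3: explicit cut-off gesture
        return True
    # Condition 2: count the trailing run of consecutive invalid frames.
    invalid_run = 0
    for i in range(len(frames) - 1, -1, -1):
        if not _is_invalid(frames, i, preset, move_eps):
            break
        invalid_run += 1
    return invalid_run >= count_threshold
```

A duration-based variant of Condition 2 would follow the same pattern, with timestamps on each frame in place of the frame count.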
For S103:
After the gesture recognition result of the target hand in the second image to be detected is determined, the target device can be controlled according to it. The target device may be a smart TV, a smart display screen, or the like.
In an optional implementation, controlling the target device includes at least one of the following: adjusting the volume of the target device; adjusting the working mode of the target device, where the working mode includes turning off or turning on at least some functions of the target device; displaying a movement marker in the display interface of the target device, or adjusting the display position of the movement marker in the display interface; zooming out or zooming in on at least part of the content displayed in the display interface; and sliding or jumping the display interface.
Here, based on the gesture recognition result, the volume of the target device, the turning off of the target device, the display position of the movement marker in the display interface, and so on can be controlled, which enables flexible control of the target device.
Adjusting the volume of the target device based on the gesture recognition result is described by way of example. Suppose a first target gesture category is configured for volume control, for example the gesture of raising the index and middle fingers. If the gesture recognition result indicates that the gesture category of the target hand is that gesture, it can be determined that the target hand has triggered the volume adjustment function of the target device. Whether the volume is increased or decreased, and the resulting volume value, can then be determined from the moving direction and distance of the target hand. For example, if the target hand is detected moving from bottom to top, the volume of the target device is to be increased, and the increased volume value can be determined from the upward movement distance and the current volume; if the target hand is detected moving from top to bottom, the volume is to be decreased, and the decreased volume value can be determined from the downward movement distance and the current volume value.
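One possible mapping from vertical hand movement to a new volume value is sketched below. The linear gain `units_per_pixel`, the 0 to 100 volume range, and the function name `adjust_volume` are assumptions for illustration; the disclosure only states that the new value depends on the movement direction, the movement distance, and the current volume.

```python
def adjust_volume(current: int, y_start: float, y_end: float,
                  units_per_pixel: float = 0.2,
                  lo: int = 0, hi: int = 100) -> int:
    """Map vertical hand travel to a new volume value.

    Image coordinates are assumed to grow downward, so an upward hand
    movement (y_end < y_start) gives a positive change.
    """
    delta = (y_start - y_end) * units_per_pixel
    # Clamp to the assumed volume range of the target device.
    return max(lo, min(hi, round(current + delta)))
```

With these assumed defaults, moving the hand 100 pixels upward from a volume of 50 yields 70, and moving it 100 pixels downward yields 30.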
Adjusting the working mode of the target device based on the gesture recognition result is described by way of example. Suppose a second target gesture category is configured for turning off the target device, for example the OK gesture. If the gesture recognition result indicates that the gesture category of the target hand is the OK gesture, it can be determined that the target hand has triggered the function of turning off the target device, and the target device can be turned off in response to the function triggered by the user.
The display position of the movement marker on the target device may also be determined based on the position information of the target hand indicated by the gesture recognition result, and the display interface of the target device may be controlled to display the movement marker at that display position. The movement marker may be a moving cursor or the like.
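As a hedged sketch of how a hand position in the image might be converted to a display position for the movement marker, the following uses simple proportional scaling. The function name `hand_to_display`, the optional horizontal mirroring (useful when the camera faces the user), and the clamping to the screen bounds are assumptions of this sketch.

```python
from typing import Tuple


def hand_to_display(hand_xy: Tuple[float, float],
                    image_size: Tuple[int, int],
                    display_size: Tuple[int, int],
                    mirror: bool = True) -> Tuple[int, int]:
    """Scale a hand position in image pixels to a cursor position on the display."""
    nx = hand_xy[0] / image_size[0]   # normalized horizontal position
    ny = hand_xy[1] / image_size[1]   # normalized vertical position
    if mirror:
        # Assumption: a user-facing camera sees the scene left-right flipped.
        nx = 1.0 - nx
    px = min(display_size[0] - 1, max(0, int(nx * display_size[0])))
    py = min(display_size[1] - 1, max(0, int(ny * display_size[1])))
    return px, py
```

In practice a smoothing filter over successive frames would likely be needed to keep the marker steady, which this sketch omits.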
If the gesture category in the gesture recognition result is the same as a third target gesture category that corresponds to a click, for example the gesture of raising the index finger, it can be determined that the target user has triggered the click function at the target display position of the target device that matches the current position of the target hand. The target device can then be controlled to present the content that corresponds to the click operation and matches the target display position, and to slide or jump the display interface.
When the first image to be detected includes multiple users who are close to one another, their gestures may interfere with each other. When interference between users is detected, the fault-tolerance mechanism of preset gesture category detection can be adjusted.
In an optional implementation, when the first image to be detected includes multiple users, before performing limb tracking detection on the target limb connected to the target hand in the acquired second image to be detected based on the hand detection information of the target hand, the method further includes:
Step one: determining the target joint point position information of each user in the first image to be detected;
Step two: taking each user in the first image to be detected as a target user and, based on the target joint point position information of the target user, determining the horizontal distance between the target joint point of the target user and the target joint points of the other users among the multiple users;
Step three: when it is determined, based on the horizontal distance, that no interfering user exists among the other users, taking the default gesture category of the target user as the preset gesture category of the target user, where an interfering user is a user whose horizontal distance is less than the distance threshold corresponding to the target user.
Step four: when it is determined, based on the horizontal distance, that an interfering user exists among the other users, adjusting the default gesture category of the target user and taking the adjusted default gesture category as the preset gesture category of the target user, where adjusting the default gesture category includes at least one of the following operations: increasing the kinds of default gesture categories, increasing the kinds of gesture categories used to control at least one function of the target device, and changing the movement detection of the gesture category to movement detection of the hand detection frame.
In the above implementation, when the first image to be detected includes multiple users, each user can be taken as the target user, and the horizontal distance between the target joint points of the target user and of the other users is determined from their target joint point position information. When it is determined from the horizontal distance that an interfering user exists among the other users, the gesture fault-tolerance mechanism of the target user can be adjusted; that is, the adjusted default gesture category is used as the preset gesture category of the target user, which mitigates the effect of the interfering user on the detection of the target user's gesture category.
For step one, limb detection may be performed on the first image to be detected to determine the limb detection information of each user in it. The limb detection information may include the target joint point position information, which gives the joint point position information of each user. The target joint point can be selected as required; for example, it may be the limb center point, that is, the half-body limb center point 12 in FIG. 2, or the hip center point 0 in FIG. 2.
For step two, each user in the first image to be detected may be taken as the target user, and the horizontal distance between the target joint point of the target user and the target joint points of the other users is determined based on the target joint point position information. That is, the abscissa values indicated by the target joint point position information of the target user and of each other user are subtracted to obtain the horizontal distance between them.
Whether an interfering user exists among the other users can then be determined based on the horizontal distance between the target user and the other users. If none exists, step three is performed; if one exists, step four is performed. Consistent with the definition in step three, another user is determined to be an interfering user when the horizontal distance between that user and the target user is less than the distance threshold corresponding to the target user, and is determined not to be an interfering user when the horizontal distance is greater than or equal to that distance threshold.
The distance threshold corresponding to the target user may be determined according to the following steps A1 to A3:
Step A1: determining the position information of a first joint point and the position information of a second joint point of the target user;
Step A2: based on the position information of the first joint point and the position information of the second joint point, determining an intermediate distance that represents the shoulder width of the target user;
Step A3: based on the intermediate distance corresponding to the target user, determining the distance threshold corresponding to the target user.
Exemplarily, the first joint point may be the left shoulder joint point 9 in FIG. 2 and the second joint point may be the neck joint point 3 in FIG. 2; or the first joint point may be the right shoulder joint point 6 in FIG. 2 and the second joint point may be the neck joint point 3 in FIG. 2; or the first joint point may be the right shoulder joint point 6 in FIG. 2 and the second joint point may be the left shoulder joint point 9 in FIG. 2.
The intermediate distance used to represent the shoulder width of the target user may then be determined based on the position information of the first joint point and the position information of the second joint point. For example, the abscissa value indicated by the position information of the first joint point and the abscissa value indicated by the position information of the second joint point may be subtracted from one another to determine the intermediate distance.
Finally, the distance threshold corresponding to the target user is determined based on the intermediate distance corresponding to the target user. For example, the determined intermediate distance may be used directly as the distance threshold corresponding to the target user; alternatively, the determined intermediate distance may be scaled down or up, and the scaled intermediate distance may be used as the distance threshold corresponding to the target user.
With the above method, the intermediate distance representing the shoulder width of the target user can be determined based on the determined position information of the first joint point and the second joint point, and the distance threshold of the target user can then be determined based on that intermediate distance. Different users correspond to different distance thresholds; by determining a corresponding distance threshold for each target user, it can be judged more accurately whether other users will interfere with the target user.
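Steps A2 and A3 can be illustrated with a minimal sketch. The names `Joint`, `intermediate_distance`, `distance_threshold`, and the `scale` parameter are hypothetical and do not appear in the disclosure; taking the absolute value of the abscissa difference is also an assumption, made so that the result does not depend on which joint point is listed first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Joint:
    """A detected joint point in image coordinates (hypothetical structure)."""
    x: float  # abscissa (horizontal pixel coordinate)
    y: float  # ordinate (vertical pixel coordinate)


def intermediate_distance(first: Joint, second: Joint) -> float:
    """Step A2: difference of the two abscissa values, used as a shoulder-width proxy."""
    # Assumption: the absolute value is taken so the order of the joints is irrelevant.
    return abs(first.x - second.x)


def distance_threshold(first: Joint, second: Joint, scale: float = 1.0) -> float:
    """Step A3: use the intermediate distance directly (scale=1.0) or scaled down/up."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return scale * intermediate_distance(first, second)
```

For instance, with a right shoulder joint point at x = 120 and a left shoulder joint point at x = 200, the unscaled threshold would be 80 pixels, and a scale of 1.5 would enlarge it to 120 pixels.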
In step three, if no interfering user exists for the target user, the default gesture category of the target user may be used as the preset gesture category of the target user, and the default gesture category does not need to be adjusted. In step four, if it is determined that an interfering user exists for the target user, the default gesture category corresponding to the target user may be adjusted, and the adjusted default gesture category is used as the preset gesture category of the target user.
For example, the kinds of default gesture categories may be increased. For instance, the default gesture category before adjustment is the dynamic gesture of circling with a single finger, and the default gesture categories after adjustment may include the gesture category of circling with a single finger, the gesture category of circling with a fist, and so on.
As another example, the kinds of gesture categories used to control at least one function of the target device may be increased. For instance, before the increase, the first target gesture category for controlling the volume of the target device is the gesture category of a raised index finger and middle finger; after the increase, the first target gesture categories for controlling the volume of the target device may include the gesture category of a raised index finger and middle finger, the palm gesture category, the gesture category of three raised fingers, and so on.
Alternatively, the kinds of cutoff gesture categories may be increased. For instance, before the increase, the cutoff gesture category is the raised-thumb gesture category; after the increase, the cutoff gesture categories may be the raised-thumb gesture category, the raised-index-finger gesture category, the raised-little-finger gesture category, and so on.
As yet another example, movement detection of the gesture category may be adjusted to movement detection of the hand detection box. That is, before adjustment, the real-time movement of the gesture category is detected, and the display position of the movement indicator on the target device is determined based on the detection result for the gesture category. In some embodiments, before adjustment, the target hand may first be detected to determine the current gesture category corresponding to the target hand. When the current gesture category matches the set movement gesture category, the hand position of the target hand is determined, and the display position of the movement indicator on the target device is determined based on that hand position. When the current gesture category does not match the set movement gesture category, the step of determining the hand position of the target hand is not performed, that is, the movement of the movement indicator on the display device cannot be controlled at this time. The hand position of the target hand may be the position of the center point of the hand detection box corresponding to the target hand, or may be the position of a hand center point set on the target hand.
After adjustment, the real-time movement of the hand detection box may be detected, and the display position of the movement indicator on the target device is determined based on the detection result for the hand detection box. In some embodiments, the position information of the hand detection box of the target hand may be determined, and the display position of the movement indicator on the target device is determined based on the position information of the hand detection box (for example, the position information of the center point of the hand detection box). In this case, the current gesture category of the target hand does not need to be detected.
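The difference between the two modes can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the function names, the `(x_min, y_min, x_max, y_max)` box layout, and the linear mapping from image coordinates to screen coordinates are all hypothetical, since the disclosure does not specify how a hand position is converted into a display position.

```python
def box_center(box):
    """Center point of a hand detection box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def indicator_position(box, image_size, screen_size):
    """Map the box center linearly from image pixels to screen pixels (assumed mapping)."""
    cx, cy = box_center(box)
    img_w, img_h = image_size
    scr_w, scr_h = screen_size
    # Normalize to [0, 1] and clamp so the indicator stays on screen.
    nx = min(max(cx / img_w, 0.0), 1.0)
    ny = min(max(cy / img_h, 0.0), 1.0)
    return (min(int(nx * scr_w), scr_w - 1), min(int(ny * scr_h), scr_h - 1))


def update_indicator(box, gesture, image_size, screen_size, movement_gesture=None):
    """Return the new indicator position, or None if it must not move.

    movement_gesture set   -> mode before adjustment: the current gesture
                              category must match the movement gesture category.
    movement_gesture None  -> mode after adjustment: only the hand detection
                              box is used and the gesture category is ignored.
    """
    if movement_gesture is not None and gesture != movement_gesture:
        return None
    return indicator_position(box, image_size, screen_size)
```

In the adjusted mode the gesture classifier is bypassed for cursor movement, which is what makes the scheme tolerant of a partially occluded or misclassified hand when another user stands close by.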
Those skilled in the art will understand that the order in which the steps of the above method are written does not imply a strict execution order and does not limit the implementation process in any way; the execution order of the steps should be determined by their functions and possible internal logic.
Based on the same concept, an embodiment of the present disclosure further provides an image recognition-based device control apparatus. FIG. 3 is a schematic architecture diagram of the image recognition-based device control apparatus provided by an embodiment of the present disclosure, which includes a first determination module 301, a detection module 302, and a control module 303, wherein:
the first determination module 301 is configured to perform hand detection on an acquired first image to be detected, and determine hand detection information of a target hand matching a preset gesture category;
the detection module 302 is configured to, based on the hand detection information of the target hand, perform limb tracking detection on a target limb connected to the target hand in an acquired second image to be detected, and determine a gesture recognition result of the target hand in the second image to be detected, wherein the second image to be detected is an image acquired after the first image to be detected;
the control module 303 is configured to control a target device based on the gesture recognition result.
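The cooperation of the three modules can be sketched as a small pipeline. The class name `GestureController` and the three injected callables are hypothetical; the actual detectors and the device interface are not specified here, so they are passed in as plain functions and the sketch only shows the order in which they are invoked.

```python
from typing import Any, Callable, Iterable, Optional


class GestureController:
    """Wires together stand-ins for modules 301, 302, and 303 (illustrative only)."""

    def __init__(
        self,
        detect_hand: Callable[[Any], Optional[Any]],
        track_and_recognize: Callable[[Any, Any], Optional[Any]],
        control_device: Callable[[Any], None],
    ) -> None:
        self.detect_hand = detect_hand                  # role of module 301
        self.track_and_recognize = track_and_recognize  # role of module 302
        self.control_device = control_device            # role of module 303

    def run(self, frames: Iterable[Any]) -> None:
        target = None
        for frame in frames:
            if target is None:
                # First image to be detected: look for a hand that matches
                # the preset gesture category.
                target = self.detect_hand(frame)
                continue
            # Second image to be detected: track the limb connected to the
            # target hand and recognize the gesture of that hand.
            result = self.track_and_recognize(frame, target)
            if result is not None:
                self.control_device(result)
```

Injecting the detectors keeps the control flow testable without any model weights, for example by passing stub functions that return canned results.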
In a possible implementation, the apparatus further includes a second determination module 304 which, before the target device is controlled based on the gesture recognition result, is configured to:
detect whether the target hand satisfies a cutoff condition;
when it is detected that the gesture recognition result satisfies the cutoff condition, re-determine, in the second image to be detected, the hand detection information of the target hand matching the preset gesture category.
In a possible implementation, the target hand satisfying the cutoff condition includes one or more of the following:
in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category, where the invalid gesture category includes at least one of the following: the gesture category does not match the preset gesture category, and the target hand has not moved;
when the second image to be detected includes multiple frames, the number of frames in which the gesture category indicated by the gesture recognition result of the target hand is the invalid gesture category is greater than or equal to a number threshold, and/or the duration for which it is the invalid gesture category is greater than or equal to a duration threshold;
in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is a valid gesture category, and the valid gesture category is used to instruct re-determination of the target hand and/or the hand detection information.
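The three cutoff conditions above can be sketched as a single check. The names `FrameResult`, `is_invalid`, `cutoff_reached`, and `reset_gestures` are hypothetical. One assumption goes beyond the text: invalid frames are counted as a consecutive run ending at the most recent frame, because the disclosure does not state whether the counted frames must be consecutive.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set


@dataclass(frozen=True)
class FrameResult:
    gesture: str      # gesture category recognized for the target hand
    moved: bool       # whether the target hand moved in this frame
    timestamp: float  # capture time in seconds


def is_invalid(result: FrameResult, preset: Set[str]) -> bool:
    """Invalid: the category does not match a preset category, or the hand did not move."""
    return result.gesture not in preset or not result.moved


def cutoff_reached(
    results: Sequence[FrameResult],
    preset: Set[str],
    reset_gestures: Set[str] = frozenset(),
    count_threshold: Optional[int] = None,
    duration_threshold: Optional[float] = None,
) -> bool:
    if not results:
        return False
    # Third condition: a valid gesture that explicitly asks for re-determination.
    if results[-1].gesture in reset_gestures:
        return True
    # Collect the trailing run of invalid frames (assumption: consecutive frames).
    run = []
    for result in reversed(results):
        if not is_invalid(result, preset):
            break
        run.append(result)
    if not run:
        return False
    # First condition: with no thresholds configured, one invalid frame suffices.
    if count_threshold is None and duration_threshold is None:
        return True
    # Second condition: frame count and/or duration of the invalid run.
    if count_threshold is not None and len(run) >= count_threshold:
        return True
    if duration_threshold is not None:
        return run[0].timestamp - run[-1].timestamp >= duration_threshold
    return False
```

When the check returns true, the caller would go back to hand detection and re-determine the target hand instead of controlling the target device.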
In a possible implementation, when performing hand detection on the acquired first image to be detected, the first determination module 301 is configured to:
perform limb detection on the acquired first image to be detected to obtain limb detection information;
based on the limb detection information, perform hand detection on the first image to be detected, and determine the hand detection information of the target hand associated with the limb.
In a possible implementation, when performing hand detection on the acquired first image to be detected, the first determination module 301 is configured to:
perform limb detection and hand detection separately on the acquired first image to be detected to obtain limb detection information and the hand detection information;
based on the limb detection information and the hand detection information, determine the distance between the hand and the limb;
based on the distance, determine the hand detection information of the target hand associated with the limb.
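The distance-based association in the second implementation can be sketched as a nearest-neighbor match. The sketch assumes, hypothetically, that the limb detection information supplies a wrist keypoint, that hands are reported as `(x_min, y_min, x_max, y_max)` boxes, and that a hand is only accepted within `max_distance` pixels; none of these details is fixed by the text.

```python
import math


def nearest_hand(wrist, hand_boxes, max_distance):
    """Return the index of the hand box closest to the wrist, or None.

    wrist        -- (x, y) of the limb's wrist keypoint (assumed to be available)
    hand_boxes   -- list of (x_min, y_min, x_max, y_max) hand detection boxes
    max_distance -- hands farther away than this are not associated with the limb
    """
    best_index = None
    best_distance = None
    for index, (x_min, y_min, x_max, y_max) in enumerate(hand_boxes):
        center_x = (x_min + x_max) / 2.0
        center_y = (y_min + y_max) / 2.0
        distance = math.hypot(center_x - wrist[0], center_y - wrist[1])
        if distance > max_distance:
            continue
        if best_distance is None or distance < best_distance:
            best_index, best_distance = index, distance
    return best_index
```

Because each limb selects at most one hand, the match gives the one-to-one pairing between limb and hand that the later limb tracking relies on.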
In a possible implementation, when controlling the target device, the control module 303 performs at least one of the following:
adjusting the volume of the target device;
adjusting the working mode of the target device, where the working mode includes turning off or turning on at least part of the functions of the target device;
displaying a movement indicator in the display interface of the target device, or adjusting the display position of the movement indicator in the display interface;
shrinking or enlarging at least part of the display content in the display interface;
sliding or jumping of the display interface.
In a possible implementation, when the first image to be detected includes multiple users, the apparatus further includes an adjustment module 305 which, before limb tracking detection is performed, based on the hand detection information of the target hand, on the target limb connected to the target hand in the acquired second image to be detected, is configured to:
determine target joint point position information of each user in the first image to be detected;
take each user in the first image to be detected as a target user and, based on the target joint point position information of the target user, determine the horizontal distance between the target joint point of the target user and the target joint point of each other user among the multiple users;
when it is determined based on the horizontal distance that no interfering user exists among the other users, use the default gesture category of the target user as the preset gesture category of the target user, where the interfering user includes a user whose horizontal distance is less than the distance threshold corresponding to the target user.
In a possible implementation, the adjustment module 305 is further configured to:
when it is determined based on the horizontal distance that an interfering user exists among the other users, adjust the default gesture category of the target user and use the adjusted default gesture category as the preset gesture category of the target user, where adjusting the default gesture category includes at least one of the following operations: increasing the kinds of default gesture categories, increasing the kinds of gesture categories used to control at least one function of the target device, and adjusting movement detection of the gesture category to movement detection of the hand detection box.
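The behavior of the adjustment module can be sketched as follows. The function names, the dictionary-based inputs, and the example gesture names are hypothetical, and only one of the listed adjustments (increasing the kinds of default gesture categories) is shown.

```python
def interfering_users(target_id, joint_x, thresholds):
    """Users whose target joint point lies closer to the target user than the threshold.

    joint_x    -- maps user id to the abscissa of that user's target joint point
    thresholds -- maps user id to that user's own distance threshold
    """
    target_x = joint_x[target_id]
    limit = thresholds[target_id]
    return [
        user_id
        for user_id, x in joint_x.items()
        if user_id != target_id and abs(x - target_x) < limit
    ]


def preset_gestures(target_id, joint_x, thresholds, default, extra):
    """Keep the default categories, or enlarge them when an interfering user exists."""
    if interfering_users(target_id, joint_x, thresholds):
        return set(default) | set(extra)
    return set(default)


# Hypothetical scene: users "a" and "b" stand 50 px apart, user "c" is far away.
joint_x = {"a": 100.0, "b": 150.0, "c": 400.0}
thresholds = {"a": 80.0, "b": 80.0, "c": 80.0}
default = {"one_finger_circle"}
extra = {"fist_circle"}
print(preset_gestures("a", joint_x, thresholds, default, extra))
print(preset_gestures("c", joint_x, thresholds, default, extra))
```

Each user is evaluated against that user's own threshold, so with unequal thresholds a broad-shouldered user may treat a neighbor as interfering even when the neighbor, with a smaller threshold, does not treat that user as interfering.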
In a possible implementation, the apparatus further includes a distance threshold determination module 306, which is configured to determine the distance threshold corresponding to the target user according to the following steps:
determining the position information of a first joint point and the position information of a second joint point corresponding to the target user;
based on the position information of the first joint point and the position information of the second joint point, determining an intermediate distance used to represent the shoulder width of the target user;
based on the intermediate distance corresponding to the target user, determining the distance threshold corresponding to the target user.
In some embodiments, the functions or modules of the apparatus provided by the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments. For their implementation, reference may be made to the description of the above method embodiments, which is not repeated here for brevity.
Based on the same technical concept, an embodiment of the present disclosure further provides an electronic device. FIG. 4 is a schematic structural diagram of the electronic device provided by an embodiment of the present disclosure, which includes a processor 401, a memory 402, and a bus 403. The memory 402 is configured to store execution instructions and includes an internal memory 4021 and an external memory 4022. The internal memory 4021, also called internal storage, is configured to temporarily store operation data of the processor 401 and data exchanged with the external memory 4022 such as a hard disk; the processor 401 exchanges data with the external memory 4022 through the internal memory 4021. When the electronic device 400 runs, the processor 401 communicates with the memory 402 through the bus 403, so that the processor 401 executes the following instructions:
performing hand detection on an acquired first image to be detected, and determining hand detection information of a target hand matching a preset gesture category;
based on the hand detection information of the target hand, performing limb tracking detection on a target limb connected to the target hand in an acquired second image to be detected, and determining a gesture recognition result of the target hand in the second image to be detected, wherein the second image to be detected is an image acquired after the first image to be detected;
controlling a target device based on the gesture recognition result.
In addition, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the image recognition-based device control method described in the above method embodiments is executed.
The computer program product of the image recognition-based device control method provided by the embodiments of the present disclosure includes a computer-readable storage medium storing computer-readable code, and the instructions included in the computer-readable code can be used to execute the image recognition-based device control method described in the above method embodiments.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and apparatus described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the several embodiments provided by the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are only specific embodiments of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed herein shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Industrial Applicability
Hand detection is performed on the first image to be detected to determine the hand detection information of the target hand matching the preset gesture category, and, based on the hand detection information of the target hand, limb tracking detection is performed on the target limb connected to the target hand in the acquired second image to be detected to determine the gesture recognition result of the target hand in the second image to be detected. In this way, limb tracking can be used to follow a target hand that is otherwise difficult to track and detect, and the target device can then be controlled based on the gesture recognition result. Among the hands of many users, or between the two hands of the same user, the target hand is locked, and, relying on the unique match between a limb and its hand, limb tracking is performed for the purpose of tracking the target hand. The gesture recognition result of the target hand in the second image to be detected is obtained on the basis of the limb tracking result. This effectively reduces the interference caused by other users' hand movements when image recognition is performed on the gesture with which the target user corresponding to the target hand controls the target device, improves the accuracy of image recognition, and thereby improves the control accuracy of the target device.
It can be seen that the technical solution provided by the present disclosure can effectively identify, among multiple users, the target user who controls the target device and, to a certain extent, when both hands of the target user are making hand movements, select one of them as the target hand so that the target device can be controlled accurately. It should be noted that if some control operations are triggered by the user's two hands each performing a corresponding action, the technical solution provided by the present disclosure can lock onto the target user and control the target device based on the hand movements of the target user's two hands.

Claims (13)

  1. An image recognition-based device control method, comprising:
    performing hand detection on an acquired first image to be detected, and determining hand detection information of a target hand matching a preset gesture category;
    based on the hand detection information of the target hand, performing limb tracking detection on a target limb connected to the target hand in an acquired second image to be detected, and determining a gesture recognition result of the target hand in the second image to be detected, wherein the second image to be detected is an image acquired after the first image to be detected;
    controlling a target device based on the gesture recognition result.
  2. The method according to claim 1, wherein before the controlling a target device based on the gesture recognition result, the method further comprises:
    detecting whether the gesture recognition result satisfies a cutoff condition;
    when it is detected that the gesture recognition result satisfies the cutoff condition, re-determining, in the second image to be detected, the hand detection information of the target hand matching the preset gesture category.
  3. The method according to claim 2, wherein the gesture recognition result satisfying the cutoff condition comprises one or more of the following:
    in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is an invalid gesture category, wherein the invalid gesture category comprises at least one of the following: the gesture category does not match the preset gesture category, and the target hand has not moved;
    when the second image to be detected comprises multiple frames, the number of frames in which the gesture category indicated by the gesture recognition result of the target hand is the invalid gesture category is greater than or equal to a number threshold, and/or the duration for which it is the invalid gesture category is greater than or equal to a duration threshold;
    in the second image to be detected, the gesture category indicated by the gesture recognition result of the target hand is a valid gesture category, and the valid gesture category is used to instruct re-determination of the target hand and/or the hand detection information.
  4. The method according to any one of claims 1 to 3, wherein the performing hand detection on an acquired first image to be detected comprises:
    performing limb detection on the acquired first image to be detected to obtain limb detection information;
    based on the limb detection information, performing hand detection on the first image to be detected, and determining the hand detection information of the target hand associated with the limb.
  5. The method according to any one of claims 1 to 3, wherein the performing hand detection on an acquired first image to be detected comprises:
    performing limb detection and hand detection separately on the acquired first image to be detected to obtain limb detection information and the hand detection information;
    based on the limb detection information and the hand detection information, determining the distance between the hand and the limb;
    based on the distance, determining the hand detection information of the target hand associated with the limb.
  6. The method according to any one of claims 1 to 5, wherein the controlling a target device comprises at least one of the following:
    adjusting the volume of the target device;
    adjusting the working mode of the target device, wherein the working mode comprises turning off or turning on at least part of the functions of the target device;
    displaying a movement indicator in the display interface of the target device, or adjusting the display position of the movement indicator in the display interface;
    shrinking or enlarging at least part of the display content in the display interface;
    sliding or jumping of the display interface.
  7. The method according to any one of claims 1 to 6, wherein when the first image to be detected comprises multiple users, before the performing, based on the hand detection information of the target hand, limb tracking detection on a target limb connected to the target hand in an acquired second image to be detected, the method further comprises:
    determining target joint point position information of each user in the first image to be detected;
    taking each user in the first image to be detected as a target user and, based on the target joint point position information of the target user, determining the horizontal distance between the target joint point of the target user and the target joint point of each other user among the multiple users;
    when it is determined based on the horizontal distance that no interfering user exists among the other users, using the default gesture category of the target user as the preset gesture category of the target user, wherein the interfering user comprises a user whose horizontal distance is less than the distance threshold corresponding to the target user.
  8. The method according to claim 7, further comprising:
    when it is determined based on the horizontal distance that an interfering user exists among the other users, adjusting the default gesture category of the target user and using the adjusted default gesture category as the preset gesture category of the target user, wherein adjusting the default gesture category comprises at least one of the following operations: increasing the kinds of default gesture categories, increasing the kinds of gesture categories used to control at least one function of the target device, and adjusting movement detection of the gesture category to movement detection of the hand detection box.
  9. The method according to claim 7 or 8, wherein the distance threshold corresponding to the target user is determined according to the following steps:
    determining position information of a first joint point and position information of a second joint point of the target user;
    determining, based on the position information of the first joint point and the position information of the second joint point, an intermediate distance that represents the shoulder width of the target user;
    determining, based on the intermediate distance, the distance threshold corresponding to the target user.
  10. An image recognition-based device control apparatus, comprising:
    a first determining module, configured to perform hand detection on an acquired first image to be detected, and determine hand detection information of a target hand matching a preset gesture category;
    a detection module, configured to perform, based on the hand detection information of the target hand, limb tracking detection on a target limb connected to the target hand in an acquired second image to be detected, and determine a gesture recognition result of the target hand in the second image to be detected, wherein the second image to be detected is an image acquired after the first image to be detected;
    a control module, configured to control a target device based on the gesture recognition result.
  11. An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor and the memory communicate through the bus; and the machine-readable instructions, when executed by the processor, perform the image recognition-based device control method according to any one of claims 1 to 9.
  12. A computer-readable storage medium storing a computer program which, when run by a processor, performs the image recognition-based device control method according to any one of claims 1 to 9.
  13. A computer program, comprising computer-readable code, wherein, when the computer-readable code runs in an electronic device, a processor in the electronic device executes the code to implement the image recognition-based device control method according to any one of claims 1 to 9.
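
The multi-user handling recited in claims 7 to 9 can be illustrated with a minimal sketch. It is not the applicant's implementation: the `User` fields, the choice of the two shoulders as the first and second joint points, the scale factor `THRESHOLD_SCALE`, and the gesture names are all assumptions made for illustration. Of the adjustments listed in claim 8, only the addition of gesture categories is shown.

```python
import math
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels

# Hypothetical values; the claims do not specify them.
THRESHOLD_SCALE = 1.5
DEFAULT_GESTURES: FrozenSet[str] = frozenset({"ok", "palm"})
EXTRA_GESTURES: FrozenSet[str] = frozenset({"fist"})


@dataclass
class User:
    user_id: str
    target_joint: Point    # joint used for the horizontal-distance test (assumed)
    left_shoulder: Point   # "first joint point" (assumed to be a shoulder)
    right_shoulder: Point  # "second joint point" (assumed to be a shoulder)


def distance_threshold(user: User) -> float:
    """Claim 9: derive the threshold from an intermediate shoulder-width distance."""
    dx = user.left_shoulder[0] - user.right_shoulder[0]
    dy = user.left_shoulder[1] - user.right_shoulder[1]
    shoulder_width = math.hypot(dx, dy)
    return THRESHOLD_SCALE * shoulder_width


def interfering_users(target: User, users: List[User]) -> List[str]:
    """Claim 7: other users whose horizontal distance is below the target's threshold."""
    limit = distance_threshold(target)
    return [
        other.user_id
        for other in users
        if other.user_id != target.user_id
        and abs(other.target_joint[0] - target.target_joint[0]) < limit
    ]


def preset_gestures(users: List[User]) -> Dict[str, FrozenSet[str]]:
    """Treat each user in turn as the target user and choose a preset gesture set."""
    result: Dict[str, FrozenSet[str]] = {}
    for target in users:
        if interfering_users(target, users):
            # Claim 8 (one option): enlarge the default gesture set.
            result[target.user_id] = DEFAULT_GESTURES | EXTRA_GESTURES
        else:
            # Claim 7: no interfering user, so keep the default set.
            result[target.user_id] = DEFAULT_GESTURES
    return result


users = [
    User("a", (100.0, 200.0), (80.0, 210.0), (120.0, 210.0)),
    User("b", (130.0, 200.0), (110.0, 210.0), (150.0, 210.0)),
    User("c", (400.0, 200.0), (380.0, 210.0), (420.0, 210.0)),
]
presets = preset_gestures(users)
```

In this example, users "a" and "b" stand close enough to interfere with each other and would receive the enlarged gesture set, while user "c" keeps the default set. Scaling the threshold by shoulder width makes the test roughly independent of how far each user is from the camera, which is one plausible reason for the per-user threshold in claim 9.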
PCT/CN2021/102478 2021-03-22 2021-06-25 Image recognition-based device control method and apparatus, electronic device, and computer readable storage medium WO2022198819A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110301465.0 2021-03-22
CN202110301465.0A CN113031464B (en) 2021-03-22 2021-03-22 Device control method, device, electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2022198819A1 (en) 2022-09-29

Family

ID=76472174

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/102478 WO2022198819A1 (en) 2021-03-22 2021-06-25 Image recognition-based device control method and apparatus, electronic device, and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN113031464B (en)
WO (1) WO2022198819A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113791548A (en) * 2021-09-26 2021-12-14 北京市商汤科技开发有限公司 Device control method, device, electronic device and storage medium
CN114766955A (en) * 2022-05-07 2022-07-22 深圳市恒致云科技有限公司 Press control method and device, intelligent closestool, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710071A (en) * 2018-12-26 2019-05-03 青岛小鸟看看科技有限公司 A kind of screen control method and device
US20200142495A1 (en) * 2018-11-05 2020-05-07 Eyesight Mobile Technologies Ltd. Gesture recognition control device
CN111580652A (en) * 2020-05-06 2020-08-25 Oppo广东移动通信有限公司 Control method and device for video playing, augmented reality equipment and storage medium
CN111736693A (en) * 2020-06-09 2020-10-02 海尔优家智能科技(北京)有限公司 Gesture control method and device of intelligent equipment
CN112166435A (en) * 2019-12-23 2021-01-01 商汤国际私人有限公司 Target tracking method and device, electronic equipment and storage medium
CN112270302A (en) * 2020-11-17 2021-01-26 支付宝(杭州)信息技术有限公司 Limb control method and device and electronic equipment

Family Cites Families (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8166421B2 (en) * 2008-01-14 2012-04-24 Primesense Ltd. Three-dimensional user interface
WO2013126905A2 (en) * 2012-02-24 2013-08-29 Moscarillo Thomas J Gesture recognition devices and methods
GB201305812D0 (en) * 2013-03-28 2013-05-15 Univ Warwick Gesture tracking and classification
JP2015056141A (en) * 2013-09-13 2015-03-23 ソニー株式会社 Information processing device and information processing method
CN104123007B (en) * 2014-07-29 2017-01-11 电子科技大学 Multidimensional weighted 3D recognition method for dynamic gestures
CN104536562B (en) * 2014-12-11 2017-12-15 北京工业大学 A kind of document transmission method based on body-sensing technology and cloud computing
EP3139247A1 (en) * 2015-09-03 2017-03-08 Siemens Aktiengesellschaft Method of and system for performing buyoff operations in a computer-managed production facility
JP2017097577A (en) * 2015-11-24 2017-06-01 キヤノン株式会社 Posture estimation method and posture estimation device
CN105912974A (en) * 2015-12-18 2016-08-31 乐视致新电子科技(天津)有限公司 Gesture identification method and apparatus
CN106296741A (en) * 2016-08-15 2017-01-04 常熟理工学院 Cell high-speed motion feature mask method in nanoscopic image
CN106843469B (en) * 2016-12-27 2020-09-04 广东小天才科技有限公司 Method for controlling wearable device to give time and wearable device
CN107358149B (en) * 2017-05-27 2020-09-22 深圳市深网视界科技有限公司 Human body posture detection method and device
CN107765573A (en) * 2017-10-19 2018-03-06 美的集团股份有限公司 Control method and household electrical appliance, the storage medium of a kind of household electrical appliance
CN107832736B (en) * 2017-11-24 2020-10-27 南京华捷艾米软件科技有限公司 Real-time human body action recognition method and real-time human body action recognition device
CN108229324B (en) * 2017-11-30 2021-01-26 北京市商汤科技开发有限公司 Gesture tracking method and device, electronic equipment and computer storage medium
CN109918975B (en) * 2017-12-13 2022-10-21 腾讯科技(深圳)有限公司 Augmented reality processing method, object identification method and terminal
CN108629283B (en) * 2018-04-02 2022-04-08 北京小米移动软件有限公司 Face tracking method, device, equipment and storage medium
CN108846853A (en) * 2018-04-26 2018-11-20 武汉幻视智能科技有限公司 A kind of teaching behavior analysis method and device based on target following and attitude detection
WO2019216593A1 (en) * 2018-05-11 2019-11-14 Samsung Electronics Co., Ltd. Method and apparatus for pose processing
CN109325408A (en) * 2018-08-14 2019-02-12 莆田学院 A kind of gesture judging method and storage medium
CN111079481B (en) * 2018-10-22 2023-09-26 西安邮电大学 Aggressive behavior recognition method based on two-dimensional skeleton information
CN109902588B (en) * 2019-01-29 2021-08-20 北京奇艺世纪科技有限公司 Gesture recognition method and device and computer readable storage medium
CN109977906B (en) * 2019-04-04 2021-06-01 睿魔智能科技(深圳)有限公司 Gesture recognition method and system, computer device and storage medium
CN110213493B (en) * 2019-06-28 2021-03-02 Oppo广东移动通信有限公司 Device imaging method and device, storage medium and electronic device
CN110322760B (en) * 2019-07-08 2020-11-03 北京达佳互联信息技术有限公司 Voice data generation method, device, terminal and storage medium
CN110674712A (en) * 2019-09-11 2020-01-10 苏宁云计算有限公司 Interactive behavior recognition method and device, computer equipment and storage medium
CN110647834B (en) * 2019-09-18 2021-06-25 北京市商汤科技开发有限公司 Human face and human hand correlation detection method and device, electronic equipment and storage medium
CN111103891B (en) * 2019-12-30 2021-03-16 西安交通大学 Unmanned aerial vehicle rapid posture control system and method based on skeleton point detection
CN111273777A (en) * 2020-02-11 2020-06-12 Oppo广东移动通信有限公司 Virtual content control method and device, electronic equipment and storage medium
CN112307896A (en) * 2020-09-27 2021-02-02 青岛邃智信息科技有限公司 Method for detecting lewd behavior abnormity of elevator under community monitoring scene
CN112287869A (en) * 2020-11-10 2021-01-29 上海依图网络科技有限公司 Image data detection method and device
CN112379773A (en) * 2020-11-12 2021-02-19 深圳市洲明科技股份有限公司 Multi-user three-dimensional motion capturing method, storage medium and electronic device
CN112363626B (en) * 2020-11-25 2021-10-01 广东魅视科技股份有限公司 Large screen interaction control method based on human body posture and gesture posture visual recognition
CN112328090B (en) * 2020-11-27 2023-01-31 北京市商汤科技开发有限公司 Gesture recognition method and device, electronic equipment and storage medium
CN112506340B (en) * 2020-11-30 2023-07-25 北京市商汤科技开发有限公司 Equipment control method, device, electronic equipment and storage medium
CN112506342B (en) * 2020-12-04 2022-01-28 郑州中业科技股份有限公司 Man-machine interaction method and system based on dynamic gesture recognition

Also Published As

Publication number Publication date
CN113031464A (en) 2021-06-25
CN113031464B (en) 2022-11-22

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21932435; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 21932435; Country of ref document: EP; Kind code of ref document: A1)