CN113128435A - Hand region segmentation method, device, medium and computer equipment in image - Google Patents

Hand region segmentation method, device, medium and computer equipment in image

Info

Publication number
CN113128435A
CN113128435A CN202110458968.9A
Authority
CN
China
Prior art keywords
image
hand
skin
region
depth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110458968.9A
Other languages
Chinese (zh)
Other versions
CN113128435B (en)
Inventor
毛凤辉
郭振民
李博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Virtual Reality Institute Co Ltd
Original Assignee
Nanchang Virtual Reality Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Virtual Reality Institute Co Ltd filed Critical Nanchang Virtual Reality Institute Co Ltd
Priority to CN202110458968.9A priority Critical patent/CN113128435B/en
Publication of CN113128435A publication Critical patent/CN113128435A/en
Application granted granted Critical
Publication of CN113128435B publication Critical patent/CN113128435B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 — Movements or behaviour, e.g. gesture recognition
    • G06V40/28 — Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G06V40/10 — Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 — Static hand or arm
    • G06V10/00 — Arrangements for image or video recognition or understanding
    • G06V10/20 — Image preprocessing
    • G06V10/28 — Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns

Abstract

The invention discloses a method, an apparatus, a medium and computer equipment for segmenting a hand region in an image, wherein the method comprises the following steps: performing hand target detection on the target image through a trained hand detection model; performing skin region segmentation to obtain a skin region image containing only skin; according to the average depth of each skin region, selecting the skin region with the smallest average depth as the seed region for hand target detection; circularly traversing the second absolute value of the difference between the depth value of each non-skin pixel at the skin edge and the depth value of the adjacent skin pixel, and merging the pixel into the hand region image if the second absolute value is smaller than the skin pixel threshold; and converting the depth hand image into a binary image, converting the binary image into a three-channel RGB image, and restoring the segmented hand region into a gesture image corresponding to the graphic information of the target image. The method and the apparatus overcome the technical defect in the prior art that inaccurate segmentation of the gesture region impairs the accuracy and stability of gesture recognition.

Description

Hand region segmentation method, device, medium and computer equipment in image
Technical Field
The invention relates to the technical field of image processing, in particular to a method, a device, a medium and computer equipment for segmenting a hand region in an image.
Background
The current gesture region segmentation methods mainly fall into two categories: hand skin segmentation in color spaces such as RGB, HSV and YCrCb, and motion-based gesture segmentation such as the optical flow method and the inter-frame difference method.
However, the methods based on skin detection are easily affected by illumination: the hand regions segmented under different illumination conditions differ greatly, so the algorithm robustness is low, and they are also disturbed by the skin of other body parts. The gesture segmentation methods based on motion detection have a low segmentation rate for static gestures and are likewise affected by illumination. Inaccurate segmentation directly degrades the accuracy and stability of later gesture recognition.
Disclosure of Invention
The invention mainly aims to provide a method and a device for segmenting a hand region in an image and a readable storage medium, and aims to solve the technical defects that the accuracy and the stability of gesture recognition are influenced by inaccurate segmentation of a gesture region in the prior art.
In order to achieve the above object, in one aspect, an embodiment of the present invention provides a method for segmenting a hand region in an image, where the method includes:
performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference value between the depth value of a non-skin pixel at the skin edge of the seed area and the depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
According to one aspect of the above technical solution, the step of performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region specifically includes:
acquiring an RGB image containing a hand, which is acquired by an RGB camera;
inputting the RGB image into the trained hand detection model for hand target detection;
and according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
According to an aspect of the foregoing technical solution, the step of cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image specifically includes:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image, wherein the width and the height of the intermediate image are w and h respectively;
and converting the RGB space of the intermediate image into the YCrCb space, and detecting the skin in the boundary frame by an elliptical skin detection method to obtain the skin area image.
According to an aspect of the foregoing technical solution, a depth image of a skin region is cut out according to an intermediate image and a skin region image, whether skin regions connected in the depth image are an entire region is detected, and the step of separating each skin region in the depth image based on a detection result specifically includes:
circularly traversing the first absolute value of the depth value difference of adjacent pixels in each connected region in the depth image, and judging whether the first absolute value is smaller than a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if it is greater than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region is not separable, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
According to one aspect of the above technical solution, the selection formula of the seed region is:
d_i = (1/N) · Σ_{n=1}^{N} d_n,   d_k = min_i d_i
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k represents the minimum of the region depth means, attained by the k-th region.
According to one aspect of the above technical solution, the step of circularly traversing a second absolute value of a difference between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and if the second absolute value is smaller than a skin pixel threshold, combining the pixel into the seed area to obtain the depth hand image specifically includes:
and traversing the second absolute value of the difference between the depth value of each non-skin pixel at the skin edge of the seed region and the depth value of the adjacent skin pixel; if the second absolute value is smaller than the skin pixel threshold, judging that the pixel belongs to the hand region and merging the pixel into the hand region; traversing again in the next round until the hand skin region no longer grows, thereby completing the traversal and obtaining a depth hand image.
According to one aspect of the above technical solution, converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image, specifically including:
converting the segmented depth hand image into a binary image, wherein the pixel value of a hand area in the binary image is 255, and the pixel value of a non-hand area is 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing digital image logical operation on the three-channel RGB image and the depth image in the cut minimum circumscribed rectangular frame to obtain the RGB image only containing the hand;
and restoring the hand region obtained by segmentation to a gesture image corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values, and the remaining background pixel values are (0, 0, 0).
The method for dividing the hand region in the image has the following beneficial effects:
(1) the hand region is accurately detected through the deep learning target hand detection model, and the hand region is cut out, so that the hand positioning is guaranteed, and the later segmentation calculated amount is reduced.
(2) Combining skin detection with hand depth information can effectively eliminate skin interference from other regions, which skin detection alone cannot do.
(3) The screened skin is used as a seed, and regions missed by skin detection are recovered by searching the edge pixel depth differences of the skin region depth map, so the method is little affected by illumination, the algorithm robustness is strong, and the calculated amount is small.
In another aspect, the present invention further provides an apparatus for segmenting a hand region in an image, the apparatus comprising:
the detection module is used for carrying out hand target detection on the target image through the trained hand detection model so as to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
the segmentation module is used for cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image;
the separation module is used for cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
the combination module is used for calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of a difference value between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
and the conversion module is used for converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
In another aspect, the present invention further provides a readable storage medium, on which a computer program is stored, which when executed by a processor, implements the above-mentioned method for segmenting a hand region in an image.
In another aspect, the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the above-mentioned method for segmenting a hand region in an image.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flowchart illustrating a method for segmenting a hand region in an image according to a first embodiment of the present invention;
FIG. 2 is a diagram illustrating an RGB target image according to a first embodiment of the present invention;
FIG. 3 is a diagram illustrating an RGB target image with a minimum bounding rectangle according to a first embodiment of the present invention;
FIG. 4 is a diagram of an intermediate image1 according to a first embodiment of the present invention;
FIG. 5 is a schematic view of a skin area image2 in accordance with a first embodiment of the present invention;
FIG. 6 is a schematic view of a color converted image3 of a skin area image2 according to a first embodiment of the present invention;
FIG. 7 is a schematic diagram of a non-hand area image4 and a hand area image5 separating skin areas according to the first embodiment of the present invention;
FIG. 8 is a diagram of a depth hand image6 according to the first embodiment of the present invention;
FIG. 9 is a diagram of a binary image7 according to the first embodiment of the present invention;
FIG. 10 is a diagram illustrating a conversion of a binary image7 into an RGB image8 according to a first embodiment of the present invention;
FIG. 11 is a diagram of a gesture image9 according to a first embodiment of the present invention;
FIG. 12 is a block diagram of a hand segmentation apparatus according to a second embodiment of the present invention;
the objects, features and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a first embodiment of the present invention provides a method for segmenting a hand region in an image, including steps S10-S50:
s10, performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
wherein, step S10 specifically includes:
s11, acquiring RGB images which are acquired by the RGB camera and contain hands; the RGB image is shown in FIG. 2;
s12, inputting the RGB images into the trained hand detection model for hand target detection;
and S13, obtaining the graphic information (as shown in FIG. 3) of the minimum circumscribed rectangle of the hand region according to the detection result of the hand target detection, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
Wherein, the coordinates of the upper-left corner point of the minimum circumscribed rectangle can be denoted as (px, py), and the width and height of the minimum circumscribed rectangle are denoted as w and h respectively, in pixels.
S20, according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out the intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
wherein, step S20 specifically includes:
s21, obtaining a bounding box of a minimum bounding rectangle of the hand region through hand target detection, and cutting the hand region from the target image to obtain an intermediate image (such as image1 in FIG. 4), wherein the width and the height of the intermediate image are w and h respectively;
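The cropping in the step above amounts to array slicing over the target image using the rectangle (px, py, w, h). A minimal NumPy sketch, with illustrative function and variable names that are not from the patent:

```python
import numpy as np

def crop_hand_roi(target_image, px, py, w, h):
    """Crop the minimum circumscribed rectangle (px, py, w, h) out of the
    target image; (px, py) is the upper-left corner in pixels."""
    H, W = target_image.shape[:2]
    # Clamp the rectangle to the image bounds before slicing.
    x0, y0 = max(0, px), max(0, py)
    x1, y1 = min(W, px + w), min(H, py + h)
    return target_image[y0:y1, x0:x1]

# Toy 8x8 RGB target image with a bright "hand" patch at (2, 3), size 4x3.
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[3:6, 2:6] = 200
roi = crop_hand_roi(img, px=2, py=3, w=4, h=3)
```

The resulting `roi` is the w×h intermediate image (image1) passed to the skin detection step.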
S22, converting the RGB space of the intermediate image into the YCrCb space, and detecting the skin in the bounding box by an elliptical skin detection method to obtain the skin region image (image2 in FIG. 5; its color-converted image is image3 in FIG. 6).
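The color-space conversion and elliptical skin detection can be sketched as follows. The patent does not give the ellipse parameters; the values below are the commonly cited ones from Hsu et al.'s skin model in the Cb-Cr plane and are purely illustrative:

```python
import numpy as np

# Illustrative ellipse parameters (from the widely used Hsu et al. skin
# model, NOT specified in the patent itself).
_THETA, _CX, _CY = 2.53, 109.38, 152.02
_ECX, _ECY, _A, _B = 1.60, 2.41, 25.39, 14.03

def rgb_to_ycrcb(rgb):
    """Full-range BT.601 RGB -> (Y, Cr, Cb) as float arrays."""
    r, g, b = (rgb[..., i].astype(np.float64) for i in range(3))
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128.0
    cb = (b - y) * 0.564 + 128.0
    return y, cr, cb

def elliptical_skin_mask(rgb):
    """Return a boolean mask: True where (Cb, Cr) falls inside the ellipse."""
    _, cr, cb = rgb_to_ycrcb(rgb)
    c, s = np.cos(_THETA), np.sin(_THETA)
    x = c * (cb - _CX) + s * (cr - _CY)      # rotate into the ellipse axes
    yv = -s * (cb - _CX) + c * (cr - _CY)
    return ((x - _ECX) / _A) ** 2 + ((yv - _ECY) / _B) ** 2 <= 1.0

# A typical skin tone and a clearly non-skin color, as 1x1 RGB images.
skin_px = np.array([[[220, 170, 140]]], dtype=np.uint8)
blue_px = np.array([[[0, 0, 255]]], dtype=np.uint8)
```

Applied to the cropped intermediate image, the mask keeps only pixels whose chrominance lies inside the skin ellipse, giving the skin region image.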
S30, cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area, and separating each skin area in the depth image based on the detection result;
it should be noted that, since the skin area image including the skin is obtained in step S20, skin areas of other non-hand areas (for example, a face area, a neck area, etc., as shown in image4 in fig. 7) may be included in the skin area image, and therefore the skin areas of the non-hand areas need to be marked for processing.
Wherein, step S30 specifically includes:
S31, circularly traversing the first absolute value of the depth value difference of adjacent pixels in each connected region in the depth image, and judging whether the first absolute value is smaller than a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if it is greater than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region is not separable, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
S32, marking different depth regions, and separating the skin regions in the depth image (such as image4 and image5 in FIG. 7) according to the marks;
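Steps S31-S32 can be sketched as a depth-aware flood fill: 4-neighbours join the same region when their depth difference stays within the first pixel threshold, and regions smaller than the second pixel threshold are discarded as background. The threshold values `t1` and `t2` are placeholders, since the patent does not fix them:

```python
import numpy as np
from collections import deque

def label_depth_regions(depth, t1=15, t2=4, background=0):
    """Label connected regions of a depth map. Adjacent pixels whose
    absolute depth difference (the 'first absolute value') is <= t1 join
    the same region; regions with fewer than t2 pixels become background
    (label 0). depth == `background` marks pixels without valid depth."""
    h, w = depth.shape
    labels = np.zeros((h, w), dtype=np.int32)
    next_label = 1
    for sy in range(h):
        for sx in range(w):
            if depth[sy, sx] == background or labels[sy, sx]:
                continue
            # BFS flood fill from this seed pixel.
            region, q = [], deque([(sy, sx)])
            labels[sy, sx] = next_label
            while q:
                y, x = q.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not labels[ny, nx]
                            and depth[ny, nx] != background
                            and abs(int(depth[ny, nx]) - int(depth[y, x])) <= t1):
                        labels[ny, nx] = next_label
                        q.append((ny, nx))
            if len(region) < t2:
                for y, x in region:   # too small: treat as background
                    labels[y, x] = 0
            else:
                next_label += 1
    return labels

# Two touching blobs at clearly different depths separate into two labels.
depth = np.zeros((4, 6))
depth[:, :3] = 50    # nearer blob (e.g. the hand)
depth[:, 3:] = 200   # farther blob (e.g. the face)
labels = label_depth_regions(depth, t1=15, t2=4)
```

The two blobs stay distinct because the 150-unit depth jump at their shared border exceeds `t1`, even though they are connected in the skin mask.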
s40, calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference between the depth value of the skin edge non-skin pixel of the seed area and the depth value of the skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
wherein, the formula of selecting the seed area is as follows:
d_i = (1/N) · Σ_{n=1}^{N} d_n,   d_k = min_i d_i
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k represents the minimum of the region depth means, attained by the k-th region.
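A minimal sketch of this seed-region selection, assuming the regions are given as an integer label map (function and variable names are illustrative):

```python
import numpy as np

def select_seed_region(depth, labels):
    """Implement d_i = (1/N_i) * sum of depths in region i and return the
    label k whose mean depth d_k is minimal, i.e. the closest region,
    which is assumed to be the hand."""
    best_label, best_mean = None, np.inf
    for lab in np.unique(labels):
        if lab == 0:          # label 0 marks background
            continue
        d_i = depth[labels == lab].mean()
        if d_i < best_mean:
            best_label, best_mean = lab, d_i
    return best_label, best_mean

# Region 1 has mean depth 50, region 2 has mean depth 200:
depth = np.array([[50.0, 50.0, 200.0, 200.0]])
labels = np.array([[1, 1, 2, 2]])
k, dk = select_seed_region(depth, labels)
```

Here the nearer region (label 1) wins, matching the formula's choice of the minimum mean depth.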
In this embodiment, step S40 specifically includes:
traversing a second absolute value of a difference value between the depth value of the skin edge non-skin pixel of the seed region and the depth value of the skin pixel, if the second absolute value is smaller than the skin pixel threshold, determining that the pixel belongs to the hand region, merging the pixel into the hand region, and traversing again in the next round until the hand skin region does not grow, so as to obtain a depth hand image (e.g., image6 in fig. 8).
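The region growing described above can be sketched as follows — an unoptimized per-pixel version assuming 4-connectivity, which the patent does not specify, and a placeholder skin pixel threshold `t_skin`:

```python
import numpy as np

def grow_hand_region(depth, seed_mask, t_skin=10):
    """Iteratively merge edge pixels into the seed region: a non-member
    pixel joins when the 'second absolute value' |d(pixel) - d(neighbour
    in region)| is below t_skin; stop when the region no longer grows."""
    mask = seed_mask.copy()
    h, w = depth.shape
    grew = True
    while grew:
        grew = False
        for y in range(h):
            for x in range(w):
                if mask[y, x]:
                    continue
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and mask[ny, nx]
                            and abs(float(depth[y, x]) - float(depth[ny, nx])) < t_skin):
                        mask[y, x] = True
                        grew = True
                        break
    return mask

# Smoothly varying depths join the seed; the 200-deep pixel stays out.
depth = np.array([[50.0, 52.0, 55.0, 200.0]])
seed = np.array([[True, False, False, False]])
mask = grow_hand_region(depth, seed, t_skin=10)
```

This recovers hand pixels that the color-based skin detection missed (e.g. shadowed skin), because membership is decided by depth continuity rather than color.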
And S50, converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image.
Wherein, step S50 specifically includes:
s51, converting the depth hand image (image 6 in fig. 8) into a binary image (image 7 in fig. 9), wherein the pixel value of the hand region in the binary image is 255 and the pixel value of the non-hand region is 0;
S52, converting the binary image into a three-channel RGB image (such as image8 in FIG. 10), wherein the pixel values of all channels are equal, and performing a digital image logical operation on the three-channel RGB image and the depth image within the cut minimum circumscribed rectangular frame to obtain the RGB image containing only the hand;
A digital image logical AND operation is performed on the three-channel RGB image and the skin region image; digital image logical operations are readily understood by those skilled in the art and are not described here again;
S53, restoring the hand region (such as image8 in FIG. 10) obtained by segmentation to the gesture image (such as image9 in FIG. 11) corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values, and the remaining background pixel values are (0, 0, 0).
Specifically, according to the width and height W, H of the target image, the coordinates (px, py) of the upper-left corner point of the minimum circumscribed rectangle, and the width and height w, h of the minimum circumscribed rectangle, the hand image obtained by segmentation is restored to the original image size. That is, a new image (image9) with width and height W, H and RGB values (0, 0, 0) is created, and the RGB image containing only the hand (image8) obtained in step S52 is copied into the newly created image; the coordinates of the upper-left corner point of the hand-only RGB image in the newly created image are (px, py), and its width and height are w and h respectively.
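Steps S51-S53 can be sketched together as follows. The names are illustrative, and the logical AND is taken here with the cropped RGB image, on the assumption that the goal is for hand pixels to keep their original colors:

```python
import numpy as np

def restore_gesture_image(hand_mask, roi_rgb, W, H, px, py):
    """Build the final gesture image: keep original RGB values where
    hand_mask is set, place the w x h result at (px, py) in a black
    W x H canvas; all background pixels stay (0, 0, 0)."""
    # S51: binary image (255 for hand, 0 elsewhere).
    binary = np.where(hand_mask, 255, 0).astype(np.uint8)
    # S52: three-channel image with equal channel values, then a logical
    # AND with the cropped RGB so only hand pixels survive.
    mask3 = np.repeat(binary[:, :, None], 3, axis=2)
    hand_rgb = np.where(mask3 == 255, roi_rgb, 0).astype(np.uint8)
    # S53: paste the w x h hand image back at (px, py) in a black canvas.
    h, w = hand_mask.shape
    canvas = np.zeros((H, W, 3), dtype=np.uint8)   # image9, all (0, 0, 0)
    canvas[py:py + h, px:px + w] = hand_rgb
    return canvas

# Toy example: a 2x2 crop with a diagonal hand mask, restored into 4x4.
mask = np.array([[True, False], [False, True]])
roi = np.zeros((2, 2, 3), dtype=np.uint8)
roi[:] = (10, 20, 30)
gesture = restore_gesture_image(mask, roi, W=4, H=4, px=1, py=1)
```

The output has the original pixel values only at masked positions inside the pasted rectangle; everything else is (0, 0, 0), as the patent specifies for the background.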
According to the method for segmenting the hand region in the image provided by the embodiment, the following beneficial effects are achieved:
(1) the hand region is accurately detected through the deep learning target hand detection model, and the hand region is cut out, so that the hand positioning is guaranteed, and the later segmentation calculated amount is reduced.
(2) Combining skin detection with hand depth information can effectively eliminate skin interference from other regions, which skin detection alone cannot do.
(3) The screened skin is used as a seed, and regions missed by skin detection are recovered by searching the edge pixel depth differences of the skin region depth map, so the method is little affected by illumination, the algorithm robustness is strong, and the calculated amount is small.
A second embodiment of the present invention provides an apparatus for segmenting a hand region in an image, the apparatus including:
the detection module 10 is used for performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
wherein, detection module 10 specifically includes:
acquiring an RGB image containing a hand, collected by an RGB camera; the RGB image is shown in FIG. 2;
inputting the RGB image into the trained hand detection model for hand target detection;
according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region (as shown in FIG. 3), where the graphic information comprises the vertex coordinates of the upper-left corner of the minimum circumscribed rectangle and the width and height of the rectangle.
Wherein, the coordinates of the upper-left corner point of the minimum circumscribed rectangle can be denoted as (px, py), and the width and height of the minimum circumscribed rectangle are denoted as w and h respectively, in pixels.
The segmentation module 20 is configured to cut out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and perform color space conversion on the intermediate image to obtain a skin region image;
the segmentation module 20 specifically includes:
obtaining a bounding box of a minimum bounding rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image (such as image1), wherein the width and the height of the intermediate image are w and h respectively;
the RGB space of the intermediate image is converted into the YCrCb space, and the skin in the bounding box is detected by an elliptical skin detection method to obtain the skin area image (e.g., image2 in fig. 2).
A separation module 30, configured to cut out a depth image of the skin region according to the intermediate image and the skin region image, detect whether the skin regions connected in the depth image are an entire region, and separate each skin region in the depth image based on a detection result;
it should be noted that, since the skin region image including the skin is obtained in the partitioning module, skin regions (e.g., a face region, a neck region, etc.) of other non-hand regions may be included in the skin region image, and therefore the skin region of the non-hand region needs to be marked for processing.
Wherein the separation module 30 comprises:
circularly traversing the first absolute value of the depth value difference of adjacent pixels in each connected region in the depth image, and judging whether the first absolute value is smaller than a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if it is greater than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region is not separable, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
marking different depth regions, and separating the skin regions in the depth image (such as image4 and image5 in FIG. 7) according to the marks;
the combining module 40 is configured to calculate an average depth of each skin area, select a skin area with the smallest average depth as a seed area for hand target detection, circularly traverse a second absolute value of a difference between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combine the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold;
wherein, the formula of selecting the seed area is as follows:
d_i = (1/N) · Σ_{n=1}^{N} d_n,   d_k = min_i d_i
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k represents the minimum of the region depth means, attained by the k-th region.
In this embodiment, the combining module 40 is specifically configured to:
and traversing the second absolute value of the difference between the depth value of each non-skin pixel at the skin edge of the seed region and the depth value of the adjacent skin pixel; if the second absolute value is smaller than the skin pixel threshold, judging that the pixel belongs to the hand region and merging the pixel into the hand region; traversing again in the next round until the hand skin region no longer grows, thereby completing the traversal and obtaining a depth hand image.
And the conversion module 50 is configured to convert the depth hand image obtained by the segmentation into a binary image, convert the binary image into a three-channel RGB image, and restore the hand region obtained by the segmentation into the gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image.
The conversion module 50 specifically includes:
converting the segmented depth hand image (image 6 in fig. 3) into a binary image (image 7 in fig. 3) with a hand region pixel value of 255 and a non-hand region pixel value of 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing digital image logical operation on the three-channel RGB image and the depth image in the cut minimum circumscribed rectangular frame to obtain the RGB image only containing the hand;
a digital-image logical AND operation is performed on the three-channel RGB image and the skin region image; such digital-image logical operations are readily understood by a person skilled in the art and are not described herein;
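For completeness, the mask expansion and logical AND described above amount to the following (a sketch with numpy; the function name is illustrative):

```python
import numpy as np

def hand_rgb_from_mask(binary, rgb_crop):
    """Replicate the 0/255 hand mask into three equal channels and AND it
    with the cropped RGB image, so only hand pixels keep their values."""
    mask3 = np.repeat(binary[:, :, None], 3, axis=2)  # equal channel values
    return np.bitwise_and(rgb_crop, mask3)
```

ANDing with 255 (binary 11111111) leaves a uint8 channel value unchanged, while ANDing with 0 zeroes it, which is exactly the masking effect intended.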
and restoring the hand region (such as image8 in fig. 3) obtained by segmentation into a gesture image (such as image9 in fig. 3) corresponding to the target image graphic information according to the coordinates of the upper left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the background pixels of the gesture image are (0, 0, 0).
Specifically, according to the width and height W, H of the target image, the coordinates (p_x, p_y) of the upper left corner point of the minimum circumscribed rectangle, and the width and height w, h of the minimum circumscribed rectangle, the hand image obtained by segmentation is restored to the original image size: an image with width and height W, H and RGB values (0, 0, 0) is newly created, and the hand-only RGB image (image8) obtained in step S42 is copied into the newly created image (image9), where the upper left corner coordinates of the hand-only RGB image in the newly created image are (p_x, p_y), and its width and height are w and h respectively.
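This restoration step can be sketched as follows (illustrative only; W, H are taken to be the full-frame size and (px, py) the rectangle's top-left corner, as in the description):

```python
import numpy as np

def restore_to_full_frame(hand_rgb, W, H, px, py):
    """Create a black H x W RGB canvas (image9) and paste the hand-only
    crop (image8) at the minimum bounding rectangle's top-left corner."""
    canvas = np.zeros((H, W, 3), dtype=np.uint8)   # background RGB (0, 0, 0)
    h, w = hand_rgb.shape[:2]
    canvas[py:py + h, px:px + w] = hand_rgb
    return canvas
```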
The hand region segmentation device in an image provided by this embodiment achieves the following beneficial effects:
(1) The hand region is accurately detected by the deep-learning hand detection model and cut out, which guarantees reliable hand localization and reduces the amount of computation required for the later segmentation.
(2) Combining skin detection with hand depth information effectively separates skin regions at different depths; skin detection alone cannot eliminate interference from skin in other regions.
(3) The screened skin region is used as a seed, and regions missed by skin detection are recovered by growing along the depth-map edge pixel differences of the skin region; the method is therefore insensitive to illumination, robust, and computationally light.
Furthermore, an embodiment of the present invention also proposes a readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the above-mentioned method.
Furthermore, an embodiment of the present invention also provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of the above method when executing the program.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A method for segmenting a hand region in an image, the method comprising:
performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference value between the depth value of a non-skin pixel at the skin edge of the seed area and the depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
2. The method for segmenting the hand region in the image according to claim 1, wherein the step of performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region specifically comprises:
acquiring an RGB image containing a hand, which is acquired by an RGB camera;
inputting the RGB image into the trained hand detection model for hand target detection;
and according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
3. The method for segmenting a hand region in an image according to claim 2, wherein the step of cutting out an intermediate image corresponding to the minimum bounding rectangle of the hand region in the target image according to the graphic information of the minimum bounding rectangle of the hand region, and performing color space conversion on the intermediate image to obtain the skin region image specifically comprises:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image, wherein the width and the height of the intermediate image are w and h respectively;
and converting the RGB space of the intermediate image into the YCrCb space, and detecting the skin in the boundary frame by an elliptical skin detection method to obtain the skin area image.
4. The method for segmenting a hand region in an image according to claim 3, wherein a depth image of the skin region is cut out from the intermediate image and the skin region image, whether the skin regions connected in the depth image are an entire region or not is detected, and the step of separating each skin region in the depth image based on the detection result is specifically:
circularly traversing first absolute values of the depth value differences of adjacent pixels in each connected region in the depth image, and judging whether each first absolute value is smaller than a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if the first absolute value is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region is not separable, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as a background;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
5. The method for segmenting the hand region in the image according to claim 4, wherein the seed region is selected according to the formula:

d_i = (1/N) * Σ_{n=1}^{N} d_n,    d_k = min_i d_i

wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in the i-th skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the depth mean of the k-th region is minimal.
6. The method as claimed in claim 5, wherein the step of circularly traversing a second absolute value of a difference between a depth value of a skin edge non-skin pixel and a depth value of a skin pixel in the seed region, and if the second absolute value is smaller than a skin pixel threshold, the step of combining the pixel into the seed region to obtain the deep hand image is specifically as follows:
and traversing a second absolute value of the difference between the depth value of each non-skin pixel at the skin edge of the seed region and the depth value of the adjacent skin pixel; if the second absolute value is smaller than the skin pixel threshold, judging that the pixel belongs to the hand region and merging it into the hand region; then traversing again in the next round, until the hand skin region no longer grows, so as to obtain a depth hand image.
7. The method for segmenting a hand region in an image according to claim 6, wherein the depth hand image obtained by segmentation is converted into a binary image, the binary image is converted into a three-channel RGB image, and the hand region obtained by segmentation is restored into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image, specifically comprising:
converting the segmented depth hand image into a binary image, wherein the pixel value of a hand area in the binary image is 255, and the pixel value of a non-hand area is 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing digital image logical operation on the three-channel RGB image and the depth image in the cut minimum circumscribed rectangular frame to obtain the RGB image only containing the hand;
and restoring the hand region obtained by segmentation to a gesture image corresponding to the target image graphic information according to the coordinate of the upper left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB value of the gesture region is the original pixel, and the values of the rest background pixels are (0,0, 0).
8. An apparatus for segmenting a hand region in an image, the apparatus comprising:
the detection module is used for carrying out hand target detection on the target image through the trained hand detection model so as to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
the segmentation module is used for cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image;
the separation module is used for cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
the combination module is used for calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of a difference value between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
and the conversion module is used for converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
9. A readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-7 when executing the program.
CN202110458968.9A 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image Active CN113128435B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110458968.9A CN113128435B (en) 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image

Publications (2)

Publication Number Publication Date
CN113128435A true CN113128435A (en) 2021-07-16
CN113128435B CN113128435B (en) 2022-11-22

Family

ID=76780179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110458968.9A Active CN113128435B (en) 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image

Country Status (1)

Country Link
CN (1) CN113128435B (en)

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102324019A (en) * 2011-08-12 2012-01-18 浙江大学 Method and system for automatically extracting gesture candidate region in video sequence
CN102402687A (en) * 2010-09-13 2012-04-04 三星电子株式会社 Method and device for detecting rigid body part direction based on depth information
CN103890782A (en) * 2011-10-18 2014-06-25 诺基亚公司 Methods and apparatuses for gesture recognition
US20160014392A1 (en) * 2014-07-11 2016-01-14 Microsoft Technology Licensing, Llc. Camera system and method for hair segmentation
CN105893925A (en) * 2015-12-01 2016-08-24 乐视致新电子科技(天津)有限公司 Human hand detection method based on complexion and device
CN106097354A (en) * 2016-06-16 2016-11-09 南昌航空大学 A kind of combining adaptive Gauss Face Detection and the hand images dividing method of region growing
CN106682571A (en) * 2016-11-08 2017-05-17 中国民航大学 Skin color segmentation and wavelet transformation-based face detection method
CN107408205A (en) * 2015-03-11 2017-11-28 微软技术许可有限责任公司 Foreground and background is distinguished with infrared imaging
CN108256421A (en) * 2017-12-05 2018-07-06 盈盛资讯科技有限公司 A kind of dynamic gesture sequence real-time identification method, system and device
CN108564070A (en) * 2018-05-07 2018-09-21 京东方科技集团股份有限公司 Method for extracting gesture and its device
CN108647597A (en) * 2018-04-27 2018-10-12 京东方科技集团股份有限公司 A kind of wrist recognition methods, gesture identification method, device and electronic equipment
CN109214297A (en) * 2018-08-09 2019-01-15 华南理工大学 A kind of static gesture identification method of combination depth information and Skin Color Information
CN109684959A (en) * 2018-12-14 2019-04-26 武汉大学 The recognition methods of video gesture based on Face Detection and deep learning and device
CN110335342A (en) * 2019-06-12 2019-10-15 清华大学 It is a kind of for immersing the hand model Real-time Generation of mode simulator
CN111553891A (en) * 2020-04-23 2020-08-18 大连理工大学 Handheld object existence detection method
CN111831123A (en) * 2020-07-23 2020-10-27 山东大学 Gesture interaction method and system suitable for desktop mixed reality environment
CN112085855A (en) * 2020-09-09 2020-12-15 南昌虚拟现实研究院股份有限公司 Interactive image editing method and device, storage medium and computer equipment
CN112232332A (en) * 2020-12-17 2021-01-15 四川圣点世纪科技有限公司 Non-contact palm detection method based on video sequence
CN112509117A (en) * 2020-11-30 2021-03-16 清华大学 Hand three-dimensional model reconstruction method and device, electronic equipment and storage medium
CN112686231A (en) * 2021-03-15 2021-04-20 南昌虚拟现实研究院股份有限公司 Dynamic gesture recognition method and device, readable storage medium and computer equipment

Also Published As

Publication number Publication date
CN113128435B (en) 2022-11-22

Similar Documents

Publication Publication Date Title
CN113781402B (en) Method and device for detecting scratch defects on chip surface and computer equipment
CN110717489B (en) Method, device and storage medium for identifying text region of OSD (on Screen display)
CN109978839B (en) Method for detecting wafer low-texture defects
CN105184763B (en) Image processing method and device
US10748023B2 (en) Region-of-interest detection apparatus, region-of-interest detection method, and recording medium
CN109308465B (en) Table line detection method, device, equipment and computer readable medium
US20230009564A1 (en) Character segmentation method and apparatus, and computer-readable storage medium
WO2007061779A1 (en) Shadow detection in images
US9524445B2 (en) Methods and systems for suppressing non-document-boundary contours in an image
US20170178341A1 (en) Single Parameter Segmentation of Images
CN112308854A (en) Automatic detection method and system for chip surface flaws and electronic equipment
CN113609984A (en) Pointer instrument reading identification method and device and electronic equipment
CN112733823B (en) Method and device for extracting key frame for gesture recognition and readable storage medium
CN113128435B (en) Hand region segmentation method, device, medium and computer equipment in image
Fang et al. 1-D barcode localization in complex background
CN115049713A (en) Image registration method, device, equipment and readable storage medium
CN108765456A (en) Method for tracking target, system based on linear edge feature
CN115187744A (en) Cabinet identification method based on laser point cloud
CN114140620A (en) Object straight line contour detection method
CN111476800A (en) Character region detection method and device based on morphological operation
JP4253265B2 (en) Shadow detection apparatus, shadow detection method and shadow detection program, image processing apparatus using shadow detection apparatus, image processing method using shadow detection method, and image processing program using shadow detection program
JP2004094427A (en) Slip image processor and program for realizing the same device
CN117351011B (en) Screen defect detection method, apparatus, and readable storage medium
CN112101139B (en) Human shape detection method, device, equipment and storage medium
CN109271986B (en) Digital identification method based on Second-Confirm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant