CN113128435B - Hand region segmentation method, device, medium and computer equipment in image - Google Patents

Hand region segmentation method, device, medium and computer equipment in image

Info

Publication number
CN113128435B
CN113128435B (application number CN202110458968.9A)
Authority
CN
China
Prior art keywords
image
hand
skin
depth
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110458968.9A
Other languages
Chinese (zh)
Other versions
CN113128435A (en)
Inventor
毛凤辉
郭振民
李博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Virtual Reality Institute Co Ltd
Original Assignee
Nanchang Virtual Reality Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Virtual Reality Institute Co Ltd filed Critical Nanchang Virtual Reality Institute Co Ltd
Priority to CN202110458968.9A priority Critical patent/CN113128435B/en
Publication of CN113128435A publication Critical patent/CN113128435A/en
Application granted granted Critical
Publication of CN113128435B publication Critical patent/CN113128435B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/28Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, a device, a medium and computer equipment for segmenting a hand region in an image. The method comprises the following steps: performing hand target detection on a target image through a trained hand detection model; performing skin region segmentation to obtain a skin region image containing only skin; according to the average depth of each skin region, selecting the skin region with the minimum average depth as the seed region for hand target detection; circularly traversing a second absolute value of the difference between the depth value of each non-skin pixel at the skin edge and the depth value of the adjacent skin pixel, and merging the pixel into the hand region image if the second absolute value is smaller than the skin pixel threshold; and converting the resulting depth hand image into a binary image, converting the binary image into a three-channel RGB image, and restoring the segmented hand region into a gesture image corresponding to the graphic information of the target image. The method and the device can overcome the technical defect in the prior art that inaccurate segmentation of the gesture region impairs the accuracy and stability of gesture recognition.

Description

Hand region segmentation method, device, medium and computer equipment in image
Technical Field
The invention relates to the technical field of image processing, in particular to a method, a device, a medium and computer equipment for segmenting a hand region in an image.
Background
Current gesture region segmentation methods mainly include skin-color-based hand segmentation, for example in the RGB, HSV and YCrCb color spaces, and motion-detection-based gesture segmentation, for example the optical flow method and the inter-frame difference method.
However, methods based on skin detection are easily affected by illumination: hand regions segmented under different illumination conditions differ greatly, so the robustness of the algorithm is low, and such methods are also disturbed by the skin of other body parts. Gesture segmentation methods based on motion detection have a low segmentation rate for static gestures and are likewise affected by illumination, and the resulting inaccurate segmentation directly impairs the accuracy and stability of later-stage gesture recognition.
Disclosure of Invention
The invention mainly aims to provide a method and a device for segmenting a hand region in an image, and a readable storage medium, so as to overcome the technical defect in the prior art that inaccurate segmentation of the gesture region impairs the accuracy and stability of gesture recognition.
In order to achieve the above object, in one aspect, an embodiment of the present invention provides a method for segmenting a hand region in an image, where the method includes:
performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area, and separating each skin area in the depth image based on a detection result;
calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference value between the depth value of a non-skin pixel at the skin edge of the seed area and the depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
According to one aspect of the above technical solution, the step of performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region specifically includes:
acquiring an RGB image containing a hand, which is acquired by an RGB camera;
inputting the RGB image into the trained hand detection model for hand target detection;
and according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
According to one aspect of the foregoing technical solution, the step of cutting out an intermediate image corresponding to the minimum bounding rectangle of the hand region in the target image according to the graphic information of the minimum bounding rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image specifically includes:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image, wherein the width and the height of the intermediate image are w and h respectively;
and converting the RGB space of the intermediate image into YCrCb space, and detecting the skin in the boundary frame by an elliptical skin detection method to obtain the skin area image.
According to an aspect of the foregoing technical solution, a depth image of a skin region is cut out according to an intermediate image and a skin region image, whether skin regions connected in the depth image are an entire region is detected, and the step of separating each skin region in the depth image based on a detection result specifically includes:
circularly traversing first absolute values of the depth value differences of adjacent pixels in each connected region in the depth image, and judging whether each first absolute value is smaller than a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if the first absolute value is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region cannot be separated, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
According to one aspect of the above technical solution, the selection formula of the seed region is:
d_i = (1/N) Σ_{n=1}^{N} d_n ,   d_k = min_i(d_i)
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the k-th region has the minimum depth mean.
According to one aspect of the above technical solution, the step of circularly traversing a second absolute value of a difference between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and if the second absolute value is smaller than a skin pixel threshold, combining the pixel into the seed area to obtain the depth hand image specifically includes:
and traversing a second absolute value of the difference value between the depth value of the skin edge non-skin pixel of the seed area and the depth value of the skin pixel, if the second absolute value is smaller than the skin pixel threshold value, judging that the pixel belongs to the hand area, merging the pixel into the hand area, and traversing again in the next round until the hand skin area is not increased to finish the traversal so as to obtain a depth hand image.
According to one aspect of the above technical solution, converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image, specifically including:
converting the segmented depth hand image into a binary image, wherein the pixel value of a hand area in the binary image is 255, and the pixel value of a non-hand area is 0;
converting the binary image into a three-channel RGB image in which the pixel values of all channels are equal, and performing a digital image logical operation with the image in the cropped minimum circumscribed rectangular frame to obtain an RGB image containing only the hand;
and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values, and the values of the remaining background pixels are (0, 0, 0).
The method for dividing the hand region in the image has the following beneficial effects:
(1) The hand region is accurately detected by the deep-learning hand detection model and then cropped out, which ensures reliable hand localization and reduces the amount of computation required for later segmentation.
(2) Combining skin detection with hand depth information can effectively separate skin regions belonging to different body parts and eliminate their interference, which skin detection alone cannot do.
(3) The screened skin is used as the seed, and regions missed by skin detection are progressively recovered through the edge pixel depth differences of the skin-region depth map, so the method is little affected by illumination, highly robust and computationally light.
In another aspect, the present invention further provides an apparatus for segmenting a hand region in an image, the apparatus comprising:
the detection module is used for carrying out hand target detection on the target image through the trained hand detection model so as to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
the segmentation module is used for cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image;
the separation module is used for cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on a detection result;
the combination module is used for calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of a difference value between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
and the conversion module is used for converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
In another aspect, the present invention further provides a readable storage medium, on which a computer program is stored, which when executed by a processor, implements the above-mentioned method for segmenting a hand region in an image.
In another aspect, the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the above-mentioned method for segmenting a hand region in an image.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flowchart illustrating a method for segmenting a hand region in an image according to a first embodiment of the present invention;
FIG. 2 is a diagram illustrating an RGB target image according to a first embodiment of the present invention;
FIG. 3 is a diagram illustrating an RGB target image with a minimum bounding rectangle according to a first embodiment of the present invention;
FIG. 4 is a diagram of an intermediate image1 according to a first embodiment of the present invention;
FIG. 5 is a diagram of a skin area image2 according to a first embodiment of the present invention;
FIG. 6 is a schematic view of an image3 of a skin area image2 after color conversion according to a first embodiment of the present invention;
FIG. 7 is a schematic view of a non-hand area image4 and a hand area image5 for separating skin areas according to the first embodiment of the present invention;
FIG. 8 is a schematic view of a deep hand image6 according to the first embodiment of the present invention;
FIG. 9 is a diagram of a binary image7 according to the first embodiment of the present invention;
FIG. 10 is a diagram illustrating a conversion of a binary image7 into an RGB image8 according to a first embodiment of the present invention;
FIG. 11 is a schematic diagram of a gesture image9 according to the first embodiment of the present invention;
FIG. 12 is a block diagram of a hand segmentation apparatus according to a second embodiment of the present invention;
the objects, features and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Referring to fig. 1, a first embodiment of the present invention provides a method for segmenting a hand region in an image, including steps S10-S50:
s10, performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
wherein, step S10 specifically includes:
s11, acquiring an RGB image which is acquired by an RGB camera and contains a hand part; the RGB image is shown in FIG. 2;
s12, inputting the RGB image into the trained hand detection model to perform hand target detection;
and S13, obtaining the graphic information (shown in figure 3) of the minimum circumscribed rectangle of the hand region according to the detection result of the hand target detection, wherein the graphic information comprises the top left corner vertex coordinate of the minimum circumscribed rectangle, and the width and the height of the rectangle.
The coordinates of the upper-left corner point of the minimum circumscribed rectangle are denoted (p_x, p_y), and the width and height of the minimum circumscribed rectangle are denoted w and h, respectively, in pixels.
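For illustration only, the following Python sketch shows how steps S11-S13 could be wired together. The hand_detector callable is a hypothetical stand-in for the trained hand detection model (the patent does not specify its implementation), the file name is a placeholder, and the return convention follows the (p_x, p_y, w, h) notation above.

```python
import cv2

# Minimal sketch of steps S11-S13. `hand_detector` is a hypothetical wrapper around the
# trained hand detection model; it is assumed to return the minimum circumscribed
# rectangle of the hand as (p_x, p_y, w, h) in pixels, matching the notation above.
def detect_hand_bbox(hand_detector, image_path="target.jpg"):
    target = cv2.imread(image_path)        # color target image from the RGB camera (FIG. 2); OpenCV loads it in BGR order
    px, py, w, h = hand_detector(target)   # graphic information of the minimum circumscribed rectangle (FIG. 3)
    return target, (px, py, w, h)
```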
S20, according to the graphic information of the minimum circumscribed rectangle of the hand area, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand area in the target image, and performing color space conversion on the intermediate image to obtain a skin area image;
wherein, step S20 specifically includes:
s21, obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and segmenting the hand region from a target image to obtain an intermediate image (such as image1 in FIG. 4), wherein the width and the height of the intermediate image are w and h respectively;
s22, converting the RGB space (e.g. image2 in fig. 5) of the intermediate image into the YCrCb space (e.g. image3 in fig. 6), and detecting the skin in the bounding box by an elliptical skin detection method to obtain the skin region image (e.g. image3 in fig. 6).
S30, cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area, and separating each skin area in the depth image based on the detection result;
it should be noted that, since the skin region image including the skin is obtained in step S20, skin regions of other non-hand regions (for example, a face region, a neck region, and the like, as shown in image4 in fig. 7) may be included in the skin region image, and therefore the skin regions of the non-hand regions need to be marked for processing.
Wherein, step S30 specifically includes:
s31, circularly traversing first absolute values of depth value difference values of adjacent pixels in each connected region in the depth image, judging whether the first absolute values are smaller than a first pixel threshold value, if the first absolute values are smaller than or equal to the first pixel threshold value, the adjacent pixels are the same parts, if the first absolute values are larger than the first pixel threshold value, the adjacent pixels are different parts, if the connected regions are not separable, marking the connected regions into a complete region, counting the total pixel number of each marked region, and if the total pixel is smaller than a second pixel threshold value, directly taking the region as a background;
s32, marking different depth areas, and separating skin areas (such as image4 and image5 in the image 7) in the depth image according to the marks;
s40, calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference value between the depth value of a non-skin pixel at the skin edge of the seed area and the depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
wherein, the formula of selecting the seed area is as follows:
d_i = (1/N) Σ_{n=1}^{N} d_n ,   d_k = min_i(d_i)
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the k-th region has the minimum depth mean.
In this embodiment, step S40 specifically includes:
traversing a second absolute value of a difference value between the depth value of the skin edge non-skin pixel of the seed region and the depth value of the skin pixel, if the second absolute value is smaller than the skin pixel threshold, determining that the pixel belongs to the hand region, merging the pixel into the hand region, and traversing again in the next round until the hand skin region does not grow, so as to obtain a depth hand image (e.g., image6 in fig. 8).
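Step S40 can be sketched as the region growing below: the skin region with the smallest mean depth d_i is taken as the seed, and edge pixels are merged while the depth difference stays below the skin pixel threshold. The value of t_skin is an assumed tuning parameter, not one given in the patent.

```python
import numpy as np
from collections import deque

# Sketch of step S40: pick the seed region by minimum mean depth, then grow it by
# absorbing neighboring pixels whose depth difference is below the skin pixel threshold.
def grow_hand_region(depth, labels, t_skin=25):
    region_ids = [r for r in np.unique(labels) if r != 0]
    # d_i = (1/N) * sum(d_n): the region with the smallest mean depth becomes the seed
    seed_id = min(region_ids, key=lambda r: depth[labels == r].mean())

    hand = labels == seed_id                                    # seed region mask
    h, w = depth.shape
    frontier = deque(zip(*np.nonzero(hand)))
    while frontier:                                             # stop once the region no longer grows
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not hand[ny, nx]
                    and abs(int(depth[ny, nx]) - int(depth[y, x])) < t_skin):
                hand[ny, nx] = True                             # edge pixel merged into the hand region
                frontier.append((ny, nx))
    return hand                                                 # boolean mask of the depth hand image (FIG. 8)
```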
And S50, converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
Wherein, step S50 specifically includes:
s51, converting the depth hand image (image 6 in figure 8) into a binary image (image 7 in figure 9), wherein the pixel value of the hand area in the binary image is 255, and the pixel value of the non-hand area is 0;
s52, converting the binary image into a three-channel RGB image (such as image8 in FIG. 10), wherein the pixel values of all channels are equal, and performing digital image logical operation on the depth image in the cut minimum circumscribed rectangular frame to obtain an RGB image only containing hands;
wherein, digital image logic and operation is performed on the RGB images of the three channels and the skin area images, and the digital image logic operation is easily understood by a person skilled in the art and is not repeated herein;
s53, restoring the hand region (such as image8 in the figure 3) obtained by segmentation to the gesture image (such as image9 in the figure 11) corresponding to the target image graphic information according to the coordinate of the upper left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB value of the gesture region is the original pixel value, and the other background pixel values are (0, 0).
Specifically, according to the width W and height H of the target image, the coordinates (p_x, p_y) of the upper-left corner point of the minimum circumscribed rectangle, and the width w and height h of the minimum circumscribed rectangle, the segmented hand image is restored to the original image size: a new image with width W, height H and RGB values (0, 0, 0) is created, and the hand-only RGB image (image8) obtained in step S52 is copied into the new image (image9), with the upper-left corner of the hand-only RGB image located at (p_x, p_y) in the new image and its width and height equal to w and h, respectively.
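Steps S51-S53 can be sketched as below. Here image1 is the cropped color image inside the minimum circumscribed rectangle and is assumed to be the operand of the logical AND, and target_shape is the shape of the original target image; both assumptions follow the embodiment but are not the only possible reading.

```python
import cv2
import numpy as np

# Sketch of steps S51-S53: binarize the hand mask, expand it to three channels,
# AND it with the cropped color image, and paste the result back at (p_x, p_y).
def restore_gesture_image(hand_mask, image1, target_shape, px, py):
    image7 = np.where(hand_mask, 255, 0).astype(np.uint8)      # binary image: hand = 255, rest = 0 (FIG. 9)
    mask_rgb = cv2.merge([image7, image7, image7])             # three channels with equal pixel values
    image8 = cv2.bitwise_and(image1, mask_rgb)                 # image containing only the hand (FIG. 10)

    H, W = target_shape[:2]
    image9 = np.zeros((H, W, 3), dtype=np.uint8)               # background pixels are (0, 0, 0)
    h, w = image8.shape[:2]
    image9[py:py + h, px:px + w] = image8                      # hand pasted back at (p_x, p_y) (FIG. 11)
    return image9
```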
According to the method for segmenting the hand region in the image provided by the embodiment, the following beneficial effects are achieved:
(1) The hand region is accurately detected by the deep-learning hand detection model and then cropped out, which ensures reliable hand localization and reduces the amount of computation required for later segmentation.
(2) Combining skin detection with hand depth information can effectively separate skin regions belonging to different body parts and eliminate their interference, which skin detection alone cannot do.
(3) The screened skin is used as the seed, and regions missed by skin detection are progressively recovered through the edge pixel depth differences of the skin-region depth map, so the method is little affected by illumination, highly robust and computationally light.
A second embodiment of the present invention provides an apparatus for segmenting a hand region in an image, the apparatus including:
the detection module 10 is used for performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
wherein, detection module 10 specifically includes:
acquiring an RGB image containing a hand, which is acquired by an RGB camera; an RGB image such as the RGB image in FIG. 2;
inputting the RGB image into the trained hand detection model for hand target detection;
according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region (as shown in FIG. 3), where the graphic information includes the vertex coordinates of the upper-left corner of the minimum circumscribed rectangle and the width and height of the rectangle.
The coordinates of the top-left corner point of the minimum circumscribed rectangle are denoted (p_x, p_y), and the width and height of the minimum circumscribed rectangle are denoted w and h, respectively, in pixels.
The segmentation module 20 is configured to cut out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and perform color space conversion on the intermediate image to obtain a skin region image;
the segmentation module 20 specifically includes:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image (such as image 1), wherein the width and the height of the intermediate image are w and h respectively;
converting the RGB space of the intermediate image into YCrCb space, and detecting the skin within the bounding box by the elliptical skin detection method to obtain the skin region image (e.g., image2 in FIG. 5).
A separation module 30, configured to cut out a depth image of the skin region according to the intermediate image and the skin region image, detect whether the skin regions connected in the depth image are an entire region, and separate each skin region in the depth image based on a detection result;
it should be noted that, since the skin region image including the skin is obtained in the partitioning module, skin regions (e.g., a face region, a neck region, etc.) of other non-hand regions may also be included in the skin region image, and therefore the skin region of the non-hand region needs to be marked for processing.
Wherein, the separation module 30 includes:
circularly traversing first absolute values of the depth value differences of adjacent pixels in each connected region in the depth image, and judging whether each first absolute value is smaller than a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if the first absolute value is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region cannot be separated, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
marking the different depth regions, and separating the skin regions in the depth image according to the marks (e.g., image4 and image5 in FIG. 7);
a combination module 40, configured to calculate an average depth of each skin area, select a skin area with the smallest average depth as a seed area for hand target detection, cycle through a second absolute value of a difference between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combine the pixel into the seed area to obtain a deep hand image if the second absolute value is smaller than a skin pixel threshold;
the seed region selection formula is as follows:
d_i = (1/N) Σ_{n=1}^{N} d_n ,   d_k = min_i(d_i)
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the k-th region has the minimum depth mean.
In this embodiment, the combining module 40 is specifically configured to:
and traversing a second absolute value of the difference value between the depth value of the skin edge non-skin pixel of the seed area and the depth value of the skin pixel, if the second absolute value is smaller than the skin pixel threshold value, judging that the pixel belongs to the hand area, merging the pixel into the hand area, and traversing again in the next round until the hand skin area is not increased to finish the traversal so as to obtain a depth hand image.
And the conversion module 50 is configured to convert the depth hand image obtained by the segmentation into a binary image, convert the binary image into a three-channel RGB image, and restore the hand region obtained by the segmentation into the gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image.
The conversion module 50 specifically includes:
converting the segmented depth hand image (image6 in FIG. 8) into a binary image (image7 in FIG. 9), wherein the pixel value of the hand region in the binary image is 255, and the pixel value of the non-hand region is 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing a digital image logical operation between the three-channel RGB image and the image in the cropped minimum circumscribed rectangular frame to obtain an RGB image containing only the hand;
wherein a digital image logical AND operation is performed between the three-channel RGB image and the skin region image; digital image logical operations are readily understood by those skilled in the art and are not repeated here;
and restoring the hand region obtained by segmentation (e.g., image8 in FIG. 10) into a gesture image (e.g., image9 in FIG. 11) corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values, and the remaining background pixel values are (0, 0, 0).
Specifically, according to the width W and height H of the target image, the coordinates (p_x, p_y) of the upper-left corner point of the minimum circumscribed rectangle, and the width w and height h of the minimum circumscribed rectangle, the segmented hand image is restored to the original image size: a new image with width W, height H and RGB values (0, 0, 0) is created, and the hand-only RGB image (image8) obtained in step S52 is copied into the new image (image9), with the upper-left corner of the hand-only RGB image located at (p_x, p_y) in the new image and its width and height equal to w and h, respectively.
According to the hand region segmentation device in the image provided by the embodiment, the following beneficial effects are achieved:
(1) The hand region is accurately detected by the deep-learning hand detection model and then cropped out, which ensures reliable hand localization and reduces the amount of computation required for later segmentation.
(2) Combining skin detection with hand depth information can effectively separate skin regions belonging to different body parts and eliminate their interference, which skin detection alone cannot do.
(3) The screened skin is used as the seed, and regions missed by skin detection are progressively recovered through the edge pixel depth differences of the skin-region depth map, so the method is little affected by illumination, highly robust and computationally light.
Furthermore, an embodiment of the present invention also proposes a readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the above-mentioned method.
Furthermore, an embodiment of the present invention also provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of the above method when executing the program.
The logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following technologies, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description of the specification, reference to the description of "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (9)

1. A method for segmenting a hand region in an image, the method comprising:
performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference value between the depth value of a non-skin pixel at the skin edge of the seed area and the depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are a whole area, and separating each skin area in the depth image based on the detection result specifically comprises the following steps:
circularly traversing first absolute values of the depth value differences of adjacent pixels in each connected region in the depth image, and judging whether each first absolute value is smaller than a first pixel threshold, wherein if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part, and if the first absolute value is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region cannot be separated, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
2. The method for segmenting the hand region in the image according to claim 1, wherein the step of performing the hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region specifically comprises:
acquiring an RGB image containing a hand, which is acquired by an RGB camera;
inputting the RGB image into the trained hand detection model for hand target detection;
and according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
3. The method for segmenting a hand region in an image according to claim 2, wherein the step of cutting out an intermediate image corresponding to the minimum bounding rectangle of the hand region in the target image according to the graphic information of the minimum bounding rectangle of the hand region, and performing color space conversion on the intermediate image to obtain the skin region image specifically comprises:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image, wherein the width and the height of the intermediate image are w and h respectively;
and converting the RGB space of the intermediate image into the YCrCb space, and detecting the skin in the boundary frame by an elliptical skin detection method to obtain the skin area image.
4. The method for segmenting a hand region in an image according to claim 1, wherein the selection formula of the seed region is as follows:
d_i = (1/N) Σ_{n=1}^{N} d_n ,   d_k = min_i(d_i)
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the k-th region has the minimum depth mean.
5. The method as claimed in claim 4, wherein the step of circularly traversing a second absolute value of a difference between a depth value of a skin edge non-skin pixel and a depth value of a skin pixel in the seed region, and if the second absolute value is smaller than a skin pixel threshold, the step of combining the pixel into the seed region to obtain the deep hand image is specifically as follows:
and traversing a second absolute value of the difference value between the depth value of the skin edge non-skin pixel of the seed area and the depth value of the skin pixel, if the second absolute value is smaller than the skin pixel threshold value, judging that the pixel belongs to the hand area, merging the pixel into the hand area, and traversing again in the next round until the hand skin area is not increased to finish the traversal so as to obtain a depth hand image.
6. The method for segmenting a hand region in an image according to claim 5, wherein the depth hand image obtained by segmentation is converted into a binary image, the binary image is converted into a three-channel RGB image, and the hand region obtained by segmentation is restored into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image, specifically comprising:
converting the segmented depth hand image into a binary image, wherein the pixel value of a hand area in the binary image is 255, and the pixel value of a non-hand area is 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing digital image logical operation on the three-channel RGB image and the depth image in the cut minimum circumscribed rectangular frame to obtain the RGB image only containing the hand;
and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values, and the values of the remaining background pixels are (0, 0, 0).
7. An apparatus for segmenting a hand region in an image, the apparatus comprising:
the detection module is used for carrying out hand target detection on the target image through the trained hand detection model so as to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
the segmentation module is used for cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image;
the separation module is used for cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on a detection result;
the combination module is used for calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of a difference value between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
the conversion module is used for converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand area obtained by segmentation into a gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image;
the separation module specifically:
the method for cutting out the depth image of the skin area according to the intermediate image and the skin area image and detecting whether the skin areas connected in the depth image are a whole area or not comprises the following steps of:
circularly traversing first absolute values of the depth value differences of adjacent pixels in each connected region in the depth image, and judging whether each first absolute value is smaller than a first pixel threshold, wherein if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part, and if the first absolute value is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region cannot be separated, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
8. A readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-6.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-6 when executing the program.
CN202110458968.9A 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image Active CN113128435B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110458968.9A CN113128435B (en) 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110458968.9A CN113128435B (en) 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image

Publications (2)

Publication Number Publication Date
CN113128435A CN113128435A (en) 2021-07-16
CN113128435B true CN113128435B (en) 2022-11-22

Family

ID=76780179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110458968.9A Active CN113128435B (en) 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image

Country Status (1)

Country Link
CN (1) CN113128435B (en)

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102324019A (en) * 2011-08-12 2012-01-18 浙江大学 Method and system for automatically extracting gesture candidate region in video sequence
CN102402687A (en) * 2010-09-13 2012-04-04 三星电子株式会社 Method and device for detecting rigid body part direction based on depth information
CN103890782A (en) * 2011-10-18 2014-06-25 诺基亚公司 Methods and apparatuses for gesture recognition
CN105893925A (en) * 2015-12-01 2016-08-24 乐视致新电子科技(天津)有限公司 Human hand detection method based on complexion and device
CN106097354A (en) * 2016-06-16 2016-11-09 南昌航空大学 A kind of combining adaptive Gauss Face Detection and the hand images dividing method of region growing
CN107408205A (en) * 2015-03-11 2017-11-28 微软技术许可有限责任公司 Foreground and background is distinguished with infrared imaging
CN108256421A (en) * 2017-12-05 2018-07-06 盈盛资讯科技有限公司 A kind of dynamic gesture sequence real-time identification method, system and device
CN108647597A (en) * 2018-04-27 2018-10-12 京东方科技集团股份有限公司 A kind of wrist recognition methods, gesture identification method, device and electronic equipment
CN109214297A (en) * 2018-08-09 2019-01-15 华南理工大学 A kind of static gesture identification method of combination depth information and Skin Color Information
CN109684959A (en) * 2018-12-14 2019-04-26 武汉大学 The recognition methods of video gesture based on Face Detection and deep learning and device
CN110335342A (en) * 2019-06-12 2019-10-15 清华大学 It is a kind of for immersing the hand model Real-time Generation of mode simulator
CN111553891A (en) * 2020-04-23 2020-08-18 大连理工大学 Handheld object existence detection method
CN111831123A (en) * 2020-07-23 2020-10-27 山东大学 Gesture interaction method and system suitable for desktop mixed reality environment
CN112085855A (en) * 2020-09-09 2020-12-15 南昌虚拟现实研究院股份有限公司 Interactive image editing method and device, storage medium and computer equipment
CN112232332A (en) * 2020-12-17 2021-01-15 四川圣点世纪科技有限公司 Non-contact palm detection method based on video sequence
CN112509117A (en) * 2020-11-30 2021-03-16 清华大学 Hand three-dimensional model reconstruction method and device, electronic equipment and storage medium
CN112686231A (en) * 2021-03-15 2021-04-20 南昌虚拟现实研究院股份有限公司 Dynamic gesture recognition method and device, readable storage medium and computer equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9767586B2 (en) * 2014-07-11 2017-09-19 Microsoft Technology Licensing, Llc Camera system and method for hair segmentation
CN106682571B (en) * 2016-11-08 2019-09-27 中国民航大学 Method for detecting human face based on skin color segmentation and wavelet transformation
CN108564070B (en) * 2018-05-07 2021-05-11 京东方科技集团股份有限公司 Method and device for extracting gestures

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102402687A (en) * 2010-09-13 2012-04-04 三星电子株式会社 Method and device for detecting rigid body part direction based on depth information
CN102324019A (en) * 2011-08-12 2012-01-18 浙江大学 Method and system for automatically extracting gesture candidate region in video sequence
CN103890782A (en) * 2011-10-18 2014-06-25 诺基亚公司 Methods and apparatuses for gesture recognition
CN107408205A (en) * 2015-03-11 2017-11-28 微软技术许可有限责任公司 Foreground and background is distinguished with infrared imaging
CN105893925A (en) * 2015-12-01 2016-08-24 乐视致新电子科技(天津)有限公司 Human hand detection method based on complexion and device
CN106097354A (en) * 2016-06-16 2016-11-09 南昌航空大学 A kind of combining adaptive Gauss Face Detection and the hand images dividing method of region growing
CN108256421A (en) * 2017-12-05 2018-07-06 盈盛资讯科技有限公司 A kind of dynamic gesture sequence real-time identification method, system and device
CN108647597A (en) * 2018-04-27 2018-10-12 京东方科技集团股份有限公司 A kind of wrist recognition methods, gesture identification method, device and electronic equipment
CN109214297A (en) * 2018-08-09 2019-01-15 华南理工大学 A kind of static gesture identification method of combination depth information and Skin Color Information
CN109684959A (en) * 2018-12-14 2019-04-26 武汉大学 The recognition methods of video gesture based on Face Detection and deep learning and device
CN110335342A (en) * 2019-06-12 2019-10-15 清华大学 It is a kind of for immersing the hand model Real-time Generation of mode simulator
CN111553891A (en) * 2020-04-23 2020-08-18 大连理工大学 Handheld object existence detection method
CN111831123A (en) * 2020-07-23 2020-10-27 山东大学 Gesture interaction method and system suitable for desktop mixed reality environment
CN112085855A (en) * 2020-09-09 2020-12-15 南昌虚拟现实研究院股份有限公司 Interactive image editing method and device, storage medium and computer equipment
CN112509117A (en) * 2020-11-30 2021-03-16 清华大学 Hand three-dimensional model reconstruction method and device, electronic equipment and storage medium
CN112232332A (en) * 2020-12-17 2021-01-15 四川圣点世纪科技有限公司 Non-contact palm detection method based on video sequence
CN112686231A (en) * 2021-03-15 2021-04-20 南昌虚拟现实研究院股份有限公司 Dynamic gesture recognition method and device, readable storage medium and computer equipment

Also Published As

Publication number Publication date
CN113128435A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
CN109978839B (en) Method for detecting wafer low-texture defects
CN110717489B (en) Method, device and storage medium for identifying text region of OSD (on Screen display)
US8411986B2 (en) Systems and methods for segmenation by removal of monochromatic background with limitied intensity variations
US10748023B2 (en) Region-of-interest detection apparatus, region-of-interest detection method, and recording medium
CN113781402A (en) Method and device for detecting chip surface scratch defects and computer equipment
US20230009564A1 (en) Character segmentation method and apparatus, and computer-readable storage medium
CN111832659B (en) Laser marking system and method based on feature point extraction algorithm detection
US20120320433A1 (en) Image processing method, image processing device and scanner
CN113609984A (en) Pointer instrument reading identification method and device and electronic equipment
CN112733823B (en) Method and device for extracting key frame for gesture recognition and readable storage medium
CN112733855B (en) Table structuring method, table recovering device and device with storage function
US8254693B2 (en) Image processing apparatus, image processing method and program
CN113128435B (en) Hand region segmentation method, device, medium and computer equipment in image
CN115049713A (en) Image registration method, device, equipment and readable storage medium
CN112101139B (en) Human shape detection method, device, equipment and storage medium
WO2022056875A1 (en) Method and apparatus for segmenting nameplate image, and computer-readable storage medium
CN114445814A (en) Character region extraction method and computer-readable storage medium
CN114140620A (en) Object straight line contour detection method
CN111476800A (en) Character region detection method and device based on morphological operation
CN112560740A (en) PCA-Kmeans-based visible light remote sensing image change detection method
JP4253265B2 (en) Shadow detection apparatus, shadow detection method and shadow detection program, image processing apparatus using shadow detection apparatus, image processing method using shadow detection method, and image processing program using shadow detection program
JP2004094427A (en) Slip image processor and program for realizing the same device
CN110598697A (en) Container number positioning method based on thickness character positioning
CN117351011B (en) Screen defect detection method, apparatus, and readable storage medium
US11769322B2 (en) Program creation device, object detection system, anchor setting method, and anchor setting program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant