CN113128435A - Hand region segmentation method, device, medium and computer equipment in image - Google Patents
- Publication number
- CN113128435A (application CN202110458968.9A)
- Authority
- CN
- China
- Prior art keywords
- image
- hand
- skin
- region
- depth
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/107—Static hand or arm
Abstract
The invention discloses a method, a device, a medium and computer equipment for segmenting a hand region in an image, wherein the method comprises the following steps: performing hand target detection on a target image through a trained hand detection model; performing skin region segmentation to obtain a skin region image containing only skin; selecting, according to the average depths of the skin areas, the skin area with the minimum average depth as the seed area for hand target detection; cyclically traversing the non-skin pixels at the skin edge, computing for each a second absolute value of the difference between its depth value and the depth value of the adjacent skin pixel, and merging the pixel into the hand region image if the second absolute value is smaller than the skin pixel threshold; converting the depth hand image into a binary image, converting the binary image into a three-channel RGB image, and restoring the segmented hand region into a gesture image corresponding to the graphic information of the target image. The method and the device overcome the technical defect of the prior art that inaccurate segmentation of the gesture region degrades the accuracy and stability of gesture recognition.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a method, a device, a medium and computer equipment for segmenting a hand region in an image.
Background
Current gesture-region segmentation methods fall mainly into two categories: hand skin segmentation in color spaces such as RGB, HSV and YCrCb, and motion-detection-based gesture segmentation such as the optical flow method and the inter-frame difference method.
However, skin-detection-based methods are easily affected by illumination: hand regions segmented under different lighting conditions differ greatly, so the algorithm's robustness is low, and the result is also disturbed by skin from other body parts. Motion-detection-based gesture segmentation performs poorly on static gestures and is likewise affected by illumination. In both cases, inaccurate segmentation directly degrades the accuracy and stability of later gesture recognition.
Disclosure of Invention
The invention mainly aims to provide a method and a device for segmenting a hand region in an image, together with a readable storage medium, so as to overcome the technical defect of the prior art that inaccurate segmentation of the gesture region degrades the accuracy and stability of gesture recognition.
In order to achieve the above object, in one aspect, an embodiment of the present invention provides a method for segmenting a hand region in an image, where the method includes:
performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
calculating the average depth of each skin area, and selecting the skin area with the minimum average depth as the seed area for hand target detection; cyclically traversing the non-skin pixels at the skin edge of the seed area, computing for each a second absolute value of the difference between its depth value and the depth value of the adjacent skin pixel, and merging the pixel into the seed area if the second absolute value is smaller than a skin pixel threshold, so as to obtain a depth hand image;
converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
According to one aspect of the above technical solution, the step of performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region specifically includes:
acquiring an RGB image containing a hand, which is acquired by an RGB camera;
inputting the RGB image into the trained hand detection model for hand target detection;
and according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
According to an aspect of the foregoing technical solution, the step of cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image specifically includes:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image, wherein the width and the height of the intermediate image are w and h respectively;
and converting the RGB space of the intermediate image into the YCrCb space, and detecting the skin in the boundary frame by an elliptical skin detection method to obtain the skin area image.
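As background, the color-space conversion and elliptical skin test described above can be sketched in numpy. The BT.601 conversion coefficients are standard, but the ellipse center and axes below are illustrative placeholders, since the patent does not state its parameters:

```python
import numpy as np

def rgb_to_ycrcb(rgb):
    """Convert an H x W x 3 uint8 RGB image to YCrCb (ITU-R BT.601 coefficients)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128.0
    cb = (b - y) * 0.564 + 128.0
    return np.stack([y, cr, cb], axis=-1)

def elliptical_skin_mask(rgb, center=(155.0, 110.0), axes=(20.0, 15.0)):
    """Boolean skin mask: a pixel is skin if its (Cr, Cb) falls inside an
    ellipse in the Cr-Cb plane. center/axes are illustrative defaults,
    not the patent's (unstated) parameters."""
    ycrcb = rgb_to_ycrcb(rgb)
    cr, cb = ycrcb[..., 1], ycrcb[..., 2]
    return ((cr - center[0]) / axes[0]) ** 2 + ((cb - center[1]) / axes[1]) ** 2 <= 1.0
```

In practice the ellipse parameters are fitted to a skin-sample dataset; the sketch only shows the shape of the test.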
According to an aspect of the foregoing technical solution, a depth image of a skin region is cut out according to an intermediate image and a skin region image, whether skin regions connected in the depth image are an entire region is detected, and the step of separating each skin region in the depth image based on a detection result specifically includes:
circularly traversing the first absolute values of the depth-value differences of adjacent pixels in each connected region of the depth image, and comparing each first absolute value with a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if it is larger than the first pixel threshold, they belong to different parts; a connected region that cannot be separated is marked as one complete region; the total number of pixels in each marked region is then counted, and any region whose total is smaller than a second pixel threshold is treated directly as background;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
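The depth-based separation of connected regions can be sketched as a flood fill in which neighbouring pixels join the same region only when their depth difference stays within the first pixel threshold; the threshold values here are assumed examples, not the patent's:

```python
import numpy as np
from collections import deque

def label_depth_regions(depth, valid_mask, depth_thresh=15, min_pixels=4):
    """4-connected flood-fill labelling: neighbours whose depth difference is at
    most depth_thresh belong to the same part; labelled regions smaller than
    min_pixels are discarded as background (label 0). Thresholds are illustrative."""
    h, w = depth.shape
    labels = np.zeros((h, w), dtype=np.int32)
    next_label = 1
    for sy in range(h):
        for sx in range(w):
            if not valid_mask[sy, sx] or labels[sy, sx]:
                continue
            queue, members = deque([(sy, sx)]), [(sy, sx)]
            labels[sy, sx] = next_label
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and valid_mask[ny, nx]
                            and not labels[ny, nx]
                            and abs(int(depth[ny, nx]) - int(depth[y, x])) <= depth_thresh):
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
                        members.append((ny, nx))
            if len(members) < min_pixels:
                for y, x in members:  # too small: treat as background
                    labels[y, x] = 0
            else:
                next_label += 1
    return labels
```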
According to one aspect of the above technical solution, the selection formula of the seed region is:

d_i = (1/N) * Σ_{n=1}^{N} d_n,    k = argmin_i d_i

wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k denotes the minimal region depth mean, i.e. the k-th region is selected as the seed.
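The seed selection reduces to a per-region mean followed by an argmin; a minimal numpy sketch (function name illustrative):

```python
import numpy as np

def select_seed_region(depth, labels):
    """Compute the mean depth d_i of each labelled skin region (label 0 is
    background) and return the label k whose mean is smallest, i.e. the
    region nearest the camera, per the selection formula above."""
    region_ids = [r for r in np.unique(labels) if r != 0]
    means = {r: depth[labels == r].mean() for r in region_ids}
    k = min(means, key=means.get)
    return k, means
```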
According to one aspect of the above technical solution, the step of circularly traversing a second absolute value of a difference between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and if the second absolute value is smaller than a skin pixel threshold, combining the pixel into the seed area to obtain the depth hand image specifically includes:
and traversing the non-skin pixels at the skin edge of the seed area, computing for each a second absolute value of the difference between its depth value and the depth value of the adjacent skin pixel; if the second absolute value is smaller than the skin pixel threshold, the pixel is judged to belong to the hand region and is merged into it; the traversal is repeated in the next round until the hand skin region no longer grows, and the completed traversal yields the depth hand image.
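This edge-growing step can be sketched as iterative region growing over the depth map; the skin pixel threshold below is an assumed example value:

```python
import numpy as np

def grow_hand_region(depth, seed_mask, skin_thresh=15):
    """Iteratively absorb non-hand border pixels whose depth differs from an
    adjacent hand pixel by less than skin_thresh; stop when a full pass adds
    nothing. skin_thresh is an illustrative value, not the patent's."""
    hand = seed_mask.copy()
    h, w = depth.shape
    grew = True
    while grew:
        grew = False
        ys, xs = np.nonzero(hand)
        for y, x in zip(ys, xs):
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < h and 0 <= nx < w and not hand[ny, nx]
                        and abs(int(depth[ny, nx]) - int(depth[y, x])) < skin_thresh):
                    hand[ny, nx] = True  # merge the pixel into the hand region
                    grew = True
    return hand
```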
According to one aspect of the above technical solution, converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image, specifically including:
converting the segmented depth hand image into a binary image, wherein the pixel value of a hand area in the binary image is 255, and the pixel value of a non-hand area is 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing digital image logical operation on the three-channel RGB image and the depth image in the cut minimum circumscribed rectangular frame to obtain the RGB image only containing the hand;
and restoring the hand region obtained by segmentation to a gesture image corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values, and the remaining background pixel values are (0, 0, 0).
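The binary-to-three-channel conversion and the logical operation described above can be sketched with numpy (here the AND is taken against the cropped RGB patch; the function name is illustrative):

```python
import numpy as np

def mask_to_rgb_hand(binary_mask, rgb_crop):
    """Turn a boolean hand mask into a 255/0 binary image, replicate it into
    three equal channels, and AND it with the cropped RGB patch so only hand
    pixels keep their original values; everything else becomes (0, 0, 0)."""
    binary = np.where(binary_mask, 255, 0).astype(np.uint8)
    three_channel = np.repeat(binary[..., None], 3, axis=-1)  # equal channels
    return np.bitwise_and(three_channel, rgb_crop)            # per-pixel logical AND
```

Because the mask channels are 255 or 0, the bitwise AND either preserves a pixel exactly or zeroes it.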
The method for segmenting the hand region in the image has the following beneficial effects:
(1) The hand region is accurately detected and cropped by the deep-learning hand detection model, which both ensures reliable hand localization and reduces the computation required for later segmentation.
(2) Combining skin detection with hand depth information effectively separates skin regions at different depths; skin detection alone cannot eliminate skin interference from other body parts.
(3) With the screened skin as the seed, regions missed by skin detection are recovered by growing along depth differences at the skin-region edges, so the method is little affected by illumination, robust, and computationally light.
In another aspect, the present invention further provides an apparatus for segmenting a hand region in an image, the apparatus comprising:
the detection module is used for carrying out hand target detection on the target image through the trained hand detection model so as to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
the segmentation module is used for cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image;
the separation module is used for cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
the combination module is used for calculating the average depth of each skin area and selecting the skin area with the minimum average depth as the seed area for hand target detection; it cyclically traverses the non-skin pixels at the skin edge of the seed area, computes for each a second absolute value of the difference between its depth value and the depth value of the adjacent skin pixel, and merges the pixel into the seed area if the second absolute value is smaller than a skin pixel threshold, so as to obtain a depth hand image;
and the conversion module is used for converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
In another aspect, the present invention further provides a readable storage medium, on which a computer program is stored, which when executed by a processor, implements the above-mentioned method for segmenting a hand region in an image.
In another aspect, the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the above-mentioned method for segmenting a hand region in an image.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flowchart illustrating a method for segmenting a hand region in an image according to a first embodiment of the present invention;
FIG. 2 is a diagram illustrating an RGB target image according to a first embodiment of the present invention;
FIG. 3 is a diagram illustrating an RGB target image with a minimum bounding rectangle according to a first embodiment of the present invention;
FIG. 4 is a diagram of an intermediate image1 according to a first embodiment of the present invention;
FIG. 5 is a schematic view of a skin area image2 in accordance with a first embodiment of the present invention;
FIG. 6 is a schematic view of a color converted image3 of a skin area image2 according to a first embodiment of the present invention;
FIG. 7 is a schematic diagram of a non-hand area image4 and a hand area image5 separating skin areas according to the first embodiment of the present invention;
FIG. 8 is a diagram of a depth hand image6 according to the first embodiment of the present invention;
FIG. 9 is a diagram of a binary image7 according to the first embodiment of the present invention;
FIG. 10 is a diagram illustrating a conversion of a binary image7 into an RGB image8 according to a first embodiment of the present invention;
FIG. 11 is a diagram of a gesture image9 according to a first embodiment of the present invention;
FIG. 12 is a block diagram of a hand segmentation apparatus according to a second embodiment of the present invention;
the objects, features and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a first embodiment of the present invention provides a method for segmenting a hand region in an image, including steps S10-S50:
s10, performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
wherein, step S10 specifically includes:
s11, acquiring RGB images which are acquired by the RGB camera and contain hands; the RGB image is shown in FIG. 2;
s12, inputting the RGB images into the trained hand detection model for hand target detection;
and S13, obtaining the graphic information (as shown in FIG. 3) of the minimum circumscribed rectangle of the hand region according to the detection result of the hand target detection, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
Wherein, the coordinates of the upper-left corner point of the minimum circumscribed rectangle can be denoted (p_x, p_y), and the width and height of the minimum circumscribed rectangle are denoted w and h respectively, in pixels.
S20, according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out the intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
wherein, step S20 specifically includes:
s21, obtaining a bounding box of a minimum bounding rectangle of the hand region through hand target detection, and cutting the hand region from the target image to obtain an intermediate image (such as image1 in FIG. 4), wherein the width and the height of the intermediate image are w and h respectively;
s22, converting the RGB space (image 2 in fig. 5) of the intermediate image into YCrCb space (image 3 in fig. 6), and detecting the skin in the bounding box by an elliptical skin detection method to obtain the skin region image (image 3 in fig. 6).
S30, cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area, and separating each skin area in the depth image based on the detection result;
it should be noted that, since the skin area image including the skin is obtained in step S20, skin areas of other non-hand areas (for example, a face area, a neck area, etc., as shown in image4 in fig. 7) may be included in the skin area image, and therefore the skin areas of the non-hand areas need to be marked for processing.
Wherein, step S30 specifically includes:
s31, circularly traversing the first absolute values of the depth-value differences of adjacent pixels in each connected region of the depth image, and comparing each first absolute value with a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if it is larger than the first pixel threshold, they belong to different parts; a connected region that cannot be separated is marked as one complete region; the total number of pixels in each marked region is then counted, and any region whose total is smaller than a second pixel threshold is treated directly as background;
s32, marking different depth areas, and separating the skin areas in the depth image (such as image4 and image5 in FIG. 7) according to the marks;
s40, calculating the average depth of each skin area, and selecting the skin area with the minimum average depth as the seed area for hand target detection; cyclically traversing the non-skin pixels at the skin edge of the seed area, computing for each a second absolute value of the difference between its depth value and the depth value of the adjacent skin pixel, and merging the pixel into the seed area if the second absolute value is smaller than a skin pixel threshold, so as to obtain a depth hand image;
wherein, the formula for selecting the seed area is:

d_i = (1/N) * Σ_{n=1}^{N} d_n,    k = argmin_i d_i

wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k denotes the minimal region depth mean.
In this embodiment, step S40 specifically includes:
traversing the non-skin pixels at the skin edge of the seed region, computing for each a second absolute value of the difference between its depth value and the depth value of the adjacent skin pixel; if the second absolute value is smaller than the skin pixel threshold, the pixel is determined to belong to the hand region and is merged into it; the traversal is repeated in the next round until the hand skin region no longer grows, so as to obtain a depth hand image (e.g., image6 in FIG. 8).
And S50, converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image.
Wherein, step S50 specifically includes:
s51, converting the depth hand image (image 6 in fig. 8) into a binary image (image 7 in fig. 9), wherein the pixel value of the hand region in the binary image is 255 and the pixel value of the non-hand region is 0;
s52, converting the binary image into a three-channel RGB image (such as image8 in FIG. 10) in which the pixel values of all channels are equal, and performing a digital-image logical operation on the three-channel RGB image and the image within the cut minimum circumscribed rectangular frame to obtain the RGB image containing only the hand;
a digital-image logical AND operation is performed on the three-channel RGB image and the skin area image; such logical operations are readily understood by a person skilled in the art and are not described here;
and S53, restoring the hand region obtained by segmentation (such as image8 in FIG. 10) to the gesture image (such as image9 in FIG. 11) corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values and the remaining background pixel values are (0, 0, 0).
Specifically, according to the width and height W and H of the target image, the coordinates (p_x, p_y) of the upper-left corner point of the minimum circumscribed rectangle, and the width and height w and h of the minimum circumscribed rectangle, the segmented hand image is restored to the original image size: a new image of width W and height H is created with all RGB values set to (0, 0, 0), and the hand-only RGB image (image8) obtained in step S52 is copied into the newly created image (image9) so that its upper-left corner lies at (p_x, p_y), with width w and height h.
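The restore step above amounts to pasting the crop back onto a black canvas of the original size; a minimal numpy sketch with an illustrative function name:

```python
import numpy as np

def restore_to_original(hand_rgb, orig_w, orig_h, px, py):
    """Paste the w x h hand-only crop back at (px, py) in a new all-black
    image of the original W x H size, as described above."""
    canvas = np.zeros((orig_h, orig_w, 3), dtype=np.uint8)  # RGB (0,0,0) background
    h, w = hand_rgb.shape[:2]
    canvas[py:py + h, px:px + w] = hand_rgb
    return canvas
```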
The method for segmenting the hand region in the image provided by this embodiment achieves the following beneficial effects:
(1) The hand region is accurately detected and cropped by the deep-learning hand detection model, which both ensures reliable hand localization and reduces the computation required for later segmentation.
(2) Combining skin detection with hand depth information effectively separates skin regions at different depths; skin detection alone cannot eliminate skin interference from other body parts.
(3) With the screened skin as the seed, regions missed by skin detection are recovered by growing along depth differences at the skin-region edges, so the method is little affected by illumination, robust, and computationally light.
A second embodiment of the present invention provides an apparatus for segmenting a hand region in an image, the apparatus including:
the detection module 10 is used for performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
wherein, detection module 10 specifically includes:
acquiring an RGB image containing a hand, which is acquired by an RGB camera, as shown in FIG. 2;
inputting the RGB image into the trained hand detection model for hand target detection;
according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region (as shown in FIG. 3), where the graphic information includes the vertex coordinates of the upper-left corner of the minimum circumscribed rectangle and the width and height of the rectangle.
Wherein, the coordinates of the upper-left corner point of the minimum circumscribed rectangle can be denoted (p_x, p_y), and the width and height of the minimum circumscribed rectangle are denoted w and h respectively, in pixels.
The segmentation module 20 is configured to cut out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and perform color space conversion on the intermediate image to obtain a skin region image;
the segmentation module 20 specifically includes:
obtaining a bounding box of a minimum bounding rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image (such as image1), wherein the width and the height of the intermediate image are w and h respectively;
the RGB space of the intermediate image is converted into the YCrCb space, and the skin in the bounding box is detected by an elliptical skin detection method to obtain the skin area image (e.g., image2 in FIG. 5).
A separation module 30, configured to cut out a depth image of the skin region according to the intermediate image and the skin region image, detect whether the skin regions connected in the depth image are an entire region, and separate each skin region in the depth image based on a detection result;
it should be noted that, since the skin region image including the skin is obtained in the partitioning module, skin regions (e.g., a face region, a neck region, etc.) of other non-hand regions may be included in the skin region image, and therefore the skin region of the non-hand region needs to be marked for processing.
Wherein the separation module 30 comprises:
circularly traversing the first absolute values of the depth-value differences of adjacent pixels in each connected region of the depth image, and comparing each first absolute value with a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if it is larger than the first pixel threshold, they belong to different parts; a connected region that cannot be separated is marked as one complete region; the total number of pixels in each marked region is then counted, and any region whose total is smaller than a second pixel threshold is treated directly as background;
marking different depth areas, and separating skin areas (such as image4 and image5 in the depth image) according to the marks;
the combining module 40 is configured to calculate the average depth of each skin area and select the skin area with the smallest average depth as the seed area for hand target detection; it cyclically traverses the non-skin pixels at the skin edge of the seed area, computes for each a second absolute value of the difference between its depth value and the depth value of the adjacent skin pixel, and merges the pixel into the seed area if the second absolute value is smaller than a skin pixel threshold, so as to obtain a depth hand image;
wherein, the formula for selecting the seed area is:

d_i = (1/N) * Σ_{n=1}^{N} d_n,    k = argmin_i d_i

wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k denotes the minimal region depth mean.
In this embodiment, the combining module 40 is specifically configured to:
and traversing the non-skin pixels at the skin edge of the seed area, computing for each a second absolute value of the difference between its depth value and the depth value of the adjacent skin pixel; if the second absolute value is smaller than the skin pixel threshold, the pixel is judged to belong to the hand region and is merged into it; the traversal is repeated in the next round until the hand skin region no longer grows, and the completed traversal yields the depth hand image.
And the conversion module 50 is configured to convert the depth hand image obtained by the segmentation into a binary image, convert the binary image into a three-channel RGB image, and restore the hand region obtained by the segmentation into the gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image.
The conversion module 50 specifically includes:
converting the segmented depth hand image (image6 in FIG. 8) into a binary image (image7 in FIG. 9) with a hand-region pixel value of 255 and a non-hand-region pixel value of 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing digital image logical operation on the three-channel RGB image and the depth image in the cut minimum circumscribed rectangular frame to obtain the RGB image only containing the hand;
a digital-image logical AND operation is performed on the three-channel RGB image and the skin area image; such logical operations are readily understood by a person skilled in the art and are not described here;
and restoring the hand region obtained by segmentation (such as image8 in FIG. 10) to a gesture image (such as image9 in FIG. 11) corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values and the remaining background pixel values are (0, 0, 0).
Specifically, according to the width and height W and H of the target image, the coordinates (p_x, p_y) of the upper-left corner point of the minimum circumscribed rectangle, and the width and height w and h of the minimum circumscribed rectangle, the segmented hand image is restored to the original image size: a new image of width W and height H is created with all RGB values set to (0, 0, 0), and the hand-only RGB image (image8) obtained above is copied into the newly created image (image9) so that its upper-left corner lies at (p_x, p_y), with width w and height h.
The apparatus for segmenting a hand region in an image provided by this embodiment achieves the following beneficial effects:
(1) The hand region is accurately detected by the deep-learning hand detection model and cropped out, which both guarantees hand localization and reduces the amount of computation in later segmentation.
(2) Combining skin detection with hand depth information effectively separates skin interference from other regions, which skin detection alone cannot eliminate.
(3) Using the screened skin as a seed, regions missed by skin detection are recovered by continuing the search along the edge-pixel depth differences of the skin region in the depth map; the method is therefore insensitive to illumination, robust, and computationally cheap.
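The seed-growing search summarized in effect (3) can be sketched as a breadth-first expansion over the depth map, assuming a depth image and an initial skin mask as inputs (all names and the threshold value are illustrative):

```python
from collections import deque
import numpy as np

def grow_hand_region(depth, seed_mask, skin_thresh=30.0):
    """Grow the seed skin region over neighbouring non-skin pixels whose
    depth differs from the adjacent skin pixel by less than skin_thresh."""
    h, w = depth.shape
    mask = seed_mask.astype(bool).copy()
    frontier = deque(zip(*np.nonzero(mask)))
    while frontier:
        y, x = frontier.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                # Merge the pixel when the absolute depth difference
                # (the "second absolute value") is below the threshold.
                if abs(float(depth[ny, nx]) - float(depth[y, x])) < skin_thresh:
                    mask[ny, nx] = True
                    frontier.append((ny, nx))
    return mask
```

The queue-based expansion terminates exactly when the hand skin region no longer grows, mirroring the loop-until-stable traversal in the description.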
Furthermore, an embodiment of the present invention also proposes a readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the above-mentioned method.
Furthermore, an embodiment of the present invention also provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of the above method when executing the program.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Claims (10)
1. A method for segmenting a hand region in an image, the method comprising:
performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference value between the depth value of a non-skin pixel at the skin edge of the seed area and the depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
2. The method for segmenting the hand region in the image according to claim 1, wherein the step of performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region specifically comprises:
acquiring an RGB image containing a hand, which is acquired by an RGB camera;
inputting the RGB image into the trained hand detection model for hand target detection;
and according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
3. The method for segmenting a hand region in an image according to claim 2, wherein the step of cutting out an intermediate image corresponding to the minimum bounding rectangle of the hand region in the target image according to the graphic information of the minimum bounding rectangle of the hand region, and performing color space conversion on the intermediate image to obtain the skin region image specifically comprises:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image, wherein the width and the height of the intermediate image are w and h respectively;
and converting the RGB space of the intermediate image into the YCrCb space, and detecting the skin in the boundary frame by an elliptical skin detection method to obtain the skin area image.
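The color-space conversion and elliptical skin detection recited in claim 3 can be sketched as follows; the BT.601 conversion coefficients are standard, but the ellipse centre and axes are illustrative placeholders rather than the patent's calibrated values:

```python
import numpy as np

def rgb_to_ycrcb(rgb):
    """Convert an (h, w, 3) uint8 RGB image to YCrCb (ITU-R BT.601)."""
    r, g, b = [rgb[..., i].astype(np.float64) for i in range(3)]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128.0
    cb = (b - y) * 0.564 + 128.0
    return np.dstack([y, cr, cb])

def elliptical_skin_mask(rgb, center=(113.0, 155.6), axes=(23.0, 15.0)):
    """Mark a pixel as skin when its (Cb, Cr) falls inside an ellipse.

    center and axes are illustrative; a real system would fit them to
    training skin samples."""
    ycrcb = rgb_to_ycrcb(rgb)
    cr, cb = ycrcb[..., 1], ycrcb[..., 2]
    cx, cy = center
    a, b = axes
    return ((cb - cx) / a) ** 2 + ((cr - cy) / b) ** 2 <= 1.0
```

Only chrominance (Cb, Cr) enters the ellipse test, which is what makes the model comparatively insensitive to brightness changes.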
4. The method for segmenting a hand region in an image according to claim 3, wherein a depth image of the skin region is cut out from the intermediate image and the skin region image, whether the skin regions connected in the depth image are an entire region or not is detected, and the step of separating each skin region in the depth image based on the detection result is specifically:
circularly traversing the first absolute values of the depth-value differences of adjacent pixels in each connected region of the depth image and comparing each first absolute value with a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if it is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region cannot be separated, marking it as one complete region; counting the total number of pixels in each marked region, and treating a region directly as background if its total number of pixels is smaller than a second pixel threshold;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
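The separation recited in claim 4 can be sketched as a flood fill in which adjacency additionally requires the depth difference to stay within the first pixel threshold, with components smaller than the second threshold discarded as background (names and threshold values are illustrative):

```python
from collections import deque
import numpy as np

def separate_depth_regions(depth, skin_mask, t1=20.0, t2=5):
    """Label skin pixels into parts: 4-neighbours whose depth differs by at
    most t1 share a label; regions smaller than t2 pixels become background."""
    h, w = depth.shape
    labels = np.zeros((h, w), dtype=np.int32)
    next_label = 0
    for sy, sx in zip(*np.nonzero(skin_mask)):
        if labels[sy, sx]:
            continue
        next_label += 1
        comp = [(sy, sx)]
        labels[sy, sx] = next_label
        queue = deque(comp)
        while queue:
            y, x = queue.popleft()
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w and skin_mask[ny, nx]
                        and not labels[ny, nx]
                        and abs(float(depth[ny, nx]) - float(depth[y, x])) <= t1):
                    labels[ny, nx] = next_label
                    comp.append((ny, nx))
                    queue.append((ny, nx))
        if len(comp) < t2:      # too few pixels: treat directly as background
            for y, x in comp:
                labels[y, x] = 0
    return labels
```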
5. The method for segmenting the hand region in the image according to claim 4, wherein the seed region is selected according to the formula:
d_i = (1/N) · Σ_{n=1}^{N} d_n,   d_k = min(d_1, d_2, …, d_i, …)
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the k-th region has the minimum average depth.
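The selection rule can be sketched as computing each labeled region's mean depth and taking the minimum (function and variable names are illustrative):

```python
import numpy as np

def select_seed_region(depth, labels):
    """Return the label of the skin region with the minimum average depth."""
    region_ids = [r for r in np.unique(labels) if r != 0]   # 0 = background
    means = {r: float(depth[labels == r].mean()) for r in region_ids}
    # The k-th region with the minimal depth mean becomes the seed region.
    return min(means, key=means.get)
```

Choosing the nearest region as the seed reflects the assumption that, in a gesture image, the hand lies closer to the camera than other skin-colored areas.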
6. The method as claimed in claim 5, wherein the step of circularly traversing a second absolute value of the difference between the depth value of a non-skin pixel at the skin edge of the seed region and the depth value of a skin pixel, and, if the second absolute value is smaller than the skin pixel threshold, combining the pixel into the seed region to obtain the depth hand image is specifically:
traversing the second absolute value of the difference between the depth value of each non-skin pixel at the skin edge of the seed region and the depth value of the adjacent skin pixel; if the second absolute value is smaller than the skin pixel threshold, judging that the pixel belongs to the hand region and merging it into the hand region; and repeating the traversal in the next round until the hand skin region no longer grows, thereby obtaining the depth hand image.
7. The method for segmenting a hand region in an image according to claim 6, wherein the depth hand image obtained by segmentation is converted into a binary image, the binary image is converted into a three-channel RGB image, and the hand region obtained by segmentation is restored into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image, specifically comprising:
converting the segmented depth hand image into a binary image, wherein the pixel value of a hand area in the binary image is 255, and the pixel value of a non-hand area is 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing digital image logical operation on the three-channel RGB image and the depth image in the cut minimum circumscribed rectangular frame to obtain the RGB image only containing the hand;
and restoring the hand region obtained by segmentation to a gesture image corresponding to the target image graphic information according to the coordinate of the upper left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB value of the gesture region is the original pixel, and the values of the rest background pixels are (0,0, 0).
8. An apparatus for segmenting a hand region in an image, the apparatus comprising:
the detection module is used for carrying out hand target detection on the target image through the trained hand detection model so as to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
the segmentation module is used for cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image;
the separation module is used for cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
the combination module is used for calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of a difference value between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
and the conversion module is used for converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
9. A readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-7 when executing the program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110458968.9A CN113128435B (en) | 2021-04-27 | 2021-04-27 | Hand region segmentation method, device, medium and computer equipment in image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113128435A true CN113128435A (en) | 2021-07-16 |
CN113128435B CN113128435B (en) | 2022-11-22 |
Family
ID=76780179
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110458968.9A Active CN113128435B (en) | 2021-04-27 | 2021-04-27 | Hand region segmentation method, device, medium and computer equipment in image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113128435B (en) |
Citations (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102324019A (en) * | 2011-08-12 | 2012-01-18 | 浙江大学 | Method and system for automatically extracting gesture candidate region in video sequence |
CN102402687A (en) * | 2010-09-13 | 2012-04-04 | 三星电子株式会社 | Method and device for detecting rigid body part direction based on depth information |
CN103890782A (en) * | 2011-10-18 | 2014-06-25 | 诺基亚公司 | Methods and apparatuses for gesture recognition |
US20160014392A1 (en) * | 2014-07-11 | 2016-01-14 | Microsoft Technology Licensing, Llc. | Camera system and method for hair segmentation |
CN105893925A (en) * | 2015-12-01 | 2016-08-24 | 乐视致新电子科技(天津)有限公司 | Human hand detection method based on complexion and device |
CN106097354A (en) * | 2016-06-16 | 2016-11-09 | 南昌航空大学 | A kind of combining adaptive Gauss Face Detection and the hand images dividing method of region growing |
CN106682571A (en) * | 2016-11-08 | 2017-05-17 | 中国民航大学 | Skin color segmentation and wavelet transformation-based face detection method |
CN107408205A (en) * | 2015-03-11 | 2017-11-28 | 微软技术许可有限责任公司 | Foreground and background is distinguished with infrared imaging |
CN108256421A (en) * | 2017-12-05 | 2018-07-06 | 盈盛资讯科技有限公司 | A kind of dynamic gesture sequence real-time identification method, system and device |
CN108564070A (en) * | 2018-05-07 | 2018-09-21 | 京东方科技集团股份有限公司 | Method for extracting gesture and its device |
CN108647597A (en) * | 2018-04-27 | 2018-10-12 | 京东方科技集团股份有限公司 | A kind of wrist recognition methods, gesture identification method, device and electronic equipment |
CN109214297A (en) * | 2018-08-09 | 2019-01-15 | 华南理工大学 | A kind of static gesture identification method of combination depth information and Skin Color Information |
CN109684959A (en) * | 2018-12-14 | 2019-04-26 | 武汉大学 | The recognition methods of video gesture based on Face Detection and deep learning and device |
CN110335342A (en) * | 2019-06-12 | 2019-10-15 | 清华大学 | It is a kind of for immersing the hand model Real-time Generation of mode simulator |
CN111553891A (en) * | 2020-04-23 | 2020-08-18 | 大连理工大学 | Handheld object existence detection method |
CN111831123A (en) * | 2020-07-23 | 2020-10-27 | 山东大学 | Gesture interaction method and system suitable for desktop mixed reality environment |
CN112085855A (en) * | 2020-09-09 | 2020-12-15 | 南昌虚拟现实研究院股份有限公司 | Interactive image editing method and device, storage medium and computer equipment |
CN112232332A (en) * | 2020-12-17 | 2021-01-15 | 四川圣点世纪科技有限公司 | Non-contact palm detection method based on video sequence |
CN112509117A (en) * | 2020-11-30 | 2021-03-16 | 清华大学 | Hand three-dimensional model reconstruction method and device, electronic equipment and storage medium |
CN112686231A (en) * | 2021-03-15 | 2021-04-20 | 南昌虚拟现实研究院股份有限公司 | Dynamic gesture recognition method and device, readable storage medium and computer equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||