CN113128435B - Hand region segmentation method, device, medium and computer equipment in image - Google Patents

Hand region segmentation method, device, medium and computer equipment in image

Info

Publication number
CN113128435B
CN113128435B (application number CN202110458968.9A)
Authority
CN
China
Prior art keywords
image
hand
skin
depth
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110458968.9A
Other languages
Chinese (zh)
Other versions
CN113128435A (en)
Inventor
毛凤辉
郭振民
李博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Virtual Reality Institute Co Ltd
Original Assignee
Nanchang Virtual Reality Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Virtual Reality Institute Co Ltd filed Critical Nanchang Virtual Reality Institute Co Ltd
Priority to CN202110458968.9A priority Critical patent/CN113128435B/en
Publication of CN113128435A publication Critical patent/CN113128435A/en
Application granted granted Critical
Publication of CN113128435B publication Critical patent/CN113128435B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/28Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, a device, a medium and computer equipment for segmenting a hand region in an image. The method comprises the following steps: performing hand target detection on a target image through a trained hand detection model; performing skin region segmentation to obtain a skin region image containing only skin; according to the average depth of each skin region, selecting the skin region with the minimum average depth as the seed region for hand target detection; circularly traversing a second absolute value of the difference between the depth value of each non-skin pixel at the skin edge and the depth value of the adjacent skin pixel, and merging the pixel into the hand region image if the second absolute value is smaller than the skin pixel threshold; and converting the resulting depth hand image into a binary image, converting the binary image into a three-channel RGB image, and restoring the segmented hand region into a gesture image corresponding to the graphic information of the target image. The method and the device can overcome the technical defect in the prior art that inaccurate segmentation of the gesture region impairs the accuracy and stability of gesture recognition.

Description

Hand region segmentation method, device, medium and computer equipment in image
Technical Field
The invention relates to the technical field of image processing, in particular to a method, a device, a medium and computer equipment for segmenting a hand region in an image.
Background
Current gesture region segmentation methods mainly include skin-color-based hand segmentation, for example in the RGB, HSV and YCrCb color spaces, and motion-detection-based gesture segmentation, for example the optical flow method and the inter-frame difference method.
However, methods based on skin detection are easily affected by illumination: hand regions segmented under different illumination conditions differ greatly, so the robustness of the algorithm is low, and such methods are also disturbed by the skin of other body parts. Gesture segmentation methods based on motion detection have a low segmentation rate for static gestures and are likewise affected by illumination, and the resulting inaccurate segmentation directly impairs the accuracy and stability of later-stage gesture recognition.
Disclosure of Invention
The invention mainly aims to provide a method and a device for segmenting a hand region in an image, and a readable storage medium, so as to overcome the technical defect in the prior art that inaccurate segmentation of the gesture region impairs the accuracy and stability of gesture recognition.
In order to achieve the above object, in one aspect, an embodiment of the present invention provides a method for segmenting a hand region in an image, where the method includes:
performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area, and separating each skin area in the depth image based on a detection result;
calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference value between the depth value of a non-skin pixel at the skin edge of the seed area and the depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
According to one aspect of the above technical solution, the step of performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region specifically includes:
acquiring an RGB image containing a hand, which is acquired by an RGB camera;
inputting the RGB image into the trained hand detection model for hand target detection;
and according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
According to one aspect of the foregoing technical solution, the step of cutting out an intermediate image corresponding to the minimum bounding rectangle of the hand region in the target image according to the graphic information of the minimum bounding rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image specifically includes:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image, wherein the width and the height of the intermediate image are w and h respectively;
and converting the RGB space of the intermediate image into YCrCb space, and detecting the skin in the boundary frame by an elliptical skin detection method to obtain the skin area image.
According to an aspect of the foregoing technical solution, a depth image of a skin region is cut out according to an intermediate image and a skin region image, whether skin regions connected in the depth image are an entire region is detected, and the step of separating each skin region in the depth image based on a detection result specifically includes:
circularly traversing first absolute values of the depth value differences of adjacent pixels in each connected region in the depth image, and judging whether each first absolute value is smaller than a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if the first absolute value is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region cannot be separated, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
According to one aspect of the above technical solution, the selection formula of the seed region is:
d_i = (1/N) Σ_{n=1}^{N} d_n ,   d_k = min_i(d_i)
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the k-th region has the minimum depth mean.
According to one aspect of the above technical solution, the step of circularly traversing a second absolute value of a difference between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and if the second absolute value is smaller than a skin pixel threshold, combining the pixel into the seed area to obtain the depth hand image specifically includes:
and traversing a second absolute value of the difference value between the depth value of the skin edge non-skin pixel of the seed area and the depth value of the skin pixel, if the second absolute value is smaller than the skin pixel threshold value, judging that the pixel belongs to the hand area, merging the pixel into the hand area, and traversing again in the next round until the hand skin area is not increased to finish the traversal so as to obtain a depth hand image.
According to one aspect of the above technical solution, converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image, specifically including:
converting the segmented depth hand image into a binary image, wherein the pixel value of a hand area in the binary image is 255, and the pixel value of a non-hand area is 0;
converting the binary image into a three-channel RGB image in which the pixel values of all channels are equal, and performing a digital image logical operation with the image in the cropped minimum circumscribed rectangular frame to obtain an RGB image containing only the hand;
and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values, and the values of the remaining background pixels are (0, 0, 0).
The method for dividing the hand region in the image has the following beneficial effects:
(1) The hand region is accurately detected by the deep-learning hand detection model and then cropped out, which ensures reliable hand localization and reduces the amount of computation required for later segmentation.
(2) Combining skin detection with hand depth information can effectively separate skin regions belonging to different body parts and eliminate their interference, which skin detection alone cannot do.
(3) The screened skin is used as the seed, and regions missed by skin detection are progressively recovered through the edge pixel depth differences of the skin-region depth map, so the method is little affected by illumination, highly robust and computationally light.
In another aspect, the present invention further provides an apparatus for segmenting a hand region in an image, the apparatus comprising:
the detection module is used for carrying out hand target detection on the target image through the trained hand detection model so as to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
the segmentation module is used for cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image;
the separation module is used for cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on a detection result;
the combination module is used for calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of a difference value between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
and the conversion module is used for converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
In another aspect, the present invention further provides a readable storage medium, on which a computer program is stored, which when executed by a processor, implements the above-mentioned method for segmenting a hand region in an image.
In another aspect, the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the above-mentioned method for segmenting a hand region in an image.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flowchart illustrating a method for segmenting a hand region in an image according to a first embodiment of the present invention;
FIG. 2 is a diagram illustrating an RGB target image according to a first embodiment of the present invention;
FIG. 3 is a diagram illustrating an RGB target image with a minimum bounding rectangle according to a first embodiment of the present invention;
FIG. 4 is a diagram of an intermediate image1 according to a first embodiment of the present invention;
FIG. 5 is a diagram of a skin area image2 according to a first embodiment of the present invention;
FIG. 6 is a schematic view of an image3 of a skin area image2 after color conversion according to a first embodiment of the present invention;
FIG. 7 is a schematic view of a non-hand area image4 and a hand area image5 for separating skin areas according to the first embodiment of the present invention;
FIG. 8 is a schematic view of a deep hand image6 according to the first embodiment of the present invention;
FIG. 9 is a diagram of a binary image7 according to the first embodiment of the present invention;
FIG. 10 is a diagram illustrating a conversion of a binary image7 into an RGB image8 according to a first embodiment of the present invention;
FIG. 11 is a schematic diagram of a gesture image9 according to the first embodiment of the present invention;
FIG. 12 is a block diagram of a hand segmentation apparatus according to a second embodiment of the present invention;
the objects, features and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Referring to fig. 1, a first embodiment of the present invention provides a method for segmenting a hand region in an image, including steps S10-S50:
s10, performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
wherein, step S10 specifically includes:
s11, acquiring an RGB image which is acquired by an RGB camera and contains a hand part; the RGB image is shown in FIG. 2;
s12, inputting the RGB image into the trained hand detection model to perform hand target detection;
and S13, obtaining the graphic information (shown in figure 3) of the minimum circumscribed rectangle of the hand region according to the detection result of the hand target detection, wherein the graphic information comprises the top left corner vertex coordinate of the minimum circumscribed rectangle, and the width and the height of the rectangle.
The coordinates of the upper-left corner point of the minimum circumscribed rectangle are denoted (p_x, p_y), and the width and height of the minimum circumscribed rectangle are denoted w and h, respectively, in pixels.
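For illustration only, the following Python sketch shows how steps S11-S13 could be wired together. The hand_detector callable is a hypothetical stand-in for the trained hand detection model (the patent does not specify its implementation), the file name is a placeholder, and the return convention follows the (p_x, p_y, w, h) notation above.

```python
import cv2

# Minimal sketch of steps S11-S13. `hand_detector` is a hypothetical wrapper around the
# trained hand detection model; it is assumed to return the minimum circumscribed
# rectangle of the hand as (p_x, p_y, w, h) in pixels, matching the notation above.
def detect_hand_bbox(hand_detector, image_path="target.jpg"):
    target = cv2.imread(image_path)        # color target image from the RGB camera (FIG. 2); OpenCV loads it in BGR order
    px, py, w, h = hand_detector(target)   # graphic information of the minimum circumscribed rectangle (FIG. 3)
    return target, (px, py, w, h)
```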
S20, according to the graphic information of the minimum circumscribed rectangle of the hand area, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand area in the target image, and performing color space conversion on the intermediate image to obtain a skin area image;
wherein, step S20 specifically includes:
s21, obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and segmenting the hand region from a target image to obtain an intermediate image (such as image1 in FIG. 4), wherein the width and the height of the intermediate image are w and h respectively;
s22, converting the RGB space (e.g. image2 in fig. 5) of the intermediate image into the YCrCb space (e.g. image3 in fig. 6), and detecting the skin in the bounding box by an elliptical skin detection method to obtain the skin region image (e.g. image3 in fig. 6).
S30, cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area, and separating each skin area in the depth image based on the detection result;
it should be noted that, since the skin region image including the skin is obtained in step S20, skin regions of other non-hand regions (for example, a face region, a neck region, and the like, as shown in image4 in fig. 7) may be included in the skin region image, and therefore the skin regions of the non-hand regions need to be marked for processing.
Wherein, step S30 specifically includes:
s31, circularly traversing first absolute values of depth value difference values of adjacent pixels in each connected region in the depth image, judging whether the first absolute values are smaller than a first pixel threshold value, if the first absolute values are smaller than or equal to the first pixel threshold value, the adjacent pixels are the same parts, if the first absolute values are larger than the first pixel threshold value, the adjacent pixels are different parts, if the connected regions are not separable, marking the connected regions into a complete region, counting the total pixel number of each marked region, and if the total pixel is smaller than a second pixel threshold value, directly taking the region as a background;
s32, marking different depth areas, and separating skin areas (such as image4 and image5 in the image 7) in the depth image according to the marks;
s40, calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference value between the depth value of a non-skin pixel at the skin edge of the seed area and the depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
wherein, the formula of selecting the seed area is as follows:
d_i = (1/N) Σ_{n=1}^{N} d_n ,   d_k = min_i(d_i)
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the k-th region has the minimum depth mean.
In this embodiment, step S40 specifically includes:
traversing a second absolute value of a difference value between the depth value of the skin edge non-skin pixel of the seed region and the depth value of the skin pixel, if the second absolute value is smaller than the skin pixel threshold, determining that the pixel belongs to the hand region, merging the pixel into the hand region, and traversing again in the next round until the hand skin region does not grow, so as to obtain a depth hand image (e.g., image6 in fig. 8).
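Step S40 can be sketched as the region growing below: the skin region with the smallest mean depth d_i is taken as the seed, and edge pixels are merged while the depth difference stays below the skin pixel threshold. The value of t_skin is an assumed tuning parameter, not one given in the patent.

```python
import numpy as np
from collections import deque

# Sketch of step S40: pick the seed region by minimum mean depth, then grow it by
# absorbing neighboring pixels whose depth difference is below the skin pixel threshold.
def grow_hand_region(depth, labels, t_skin=25):
    region_ids = [r for r in np.unique(labels) if r != 0]
    # d_i = (1/N) * sum(d_n): the region with the smallest mean depth becomes the seed
    seed_id = min(region_ids, key=lambda r: depth[labels == r].mean())

    hand = labels == seed_id                                    # seed region mask
    h, w = depth.shape
    frontier = deque(zip(*np.nonzero(hand)))
    while frontier:                                             # stop once the region no longer grows
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not hand[ny, nx]
                    and abs(int(depth[ny, nx]) - int(depth[y, x])) < t_skin):
                hand[ny, nx] = True                             # edge pixel merged into the hand region
                frontier.append((ny, nx))
    return hand                                                 # boolean mask of the depth hand image (FIG. 8)
```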
And S50, converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into the gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image.
Wherein, step S50 specifically includes:
s51, converting the depth hand image (image 6 in figure 8) into a binary image (image 7 in figure 9), wherein the pixel value of the hand area in the binary image is 255, and the pixel value of the non-hand area is 0;
s52, converting the binary image into a three-channel RGB image (such as image8 in FIG. 10), wherein the pixel values of all channels are equal, and performing digital image logical operation on the depth image in the cut minimum circumscribed rectangular frame to obtain an RGB image only containing hands;
wherein, digital image logic and operation is performed on the RGB images of the three channels and the skin area images, and the digital image logic operation is easily understood by a person skilled in the art and is not repeated herein;
s53, restoring the hand region (such as image8 in the figure 3) obtained by segmentation to the gesture image (such as image9 in the figure 11) corresponding to the target image graphic information according to the coordinate of the upper left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB value of the gesture region is the original pixel value, and the other background pixel values are (0, 0).
Specifically, according to the width W and height H of the target image, the coordinates (p_x, p_y) of the upper-left corner point of the minimum circumscribed rectangle, and the width w and height h of the minimum circumscribed rectangle, the segmented hand image is restored to the original image size: a new image with width W, height H and RGB values (0, 0, 0) is created, and the hand-only RGB image (image8) obtained in step S52 is copied into the new image (image9), with the upper-left corner of the hand-only RGB image located at (p_x, p_y) in the new image and its width and height equal to w and h, respectively.
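Steps S51-S53 can be sketched as below. Here image1 is the cropped color image inside the minimum circumscribed rectangle and is assumed to be the operand of the logical AND, and target_shape is the shape of the original target image; both assumptions follow the embodiment but are not the only possible reading.

```python
import cv2
import numpy as np

# Sketch of steps S51-S53: binarize the hand mask, expand it to three channels,
# AND it with the cropped color image, and paste the result back at (p_x, p_y).
def restore_gesture_image(hand_mask, image1, target_shape, px, py):
    image7 = np.where(hand_mask, 255, 0).astype(np.uint8)      # binary image: hand = 255, rest = 0 (FIG. 9)
    mask_rgb = cv2.merge([image7, image7, image7])             # three channels with equal pixel values
    image8 = cv2.bitwise_and(image1, mask_rgb)                 # image containing only the hand (FIG. 10)

    H, W = target_shape[:2]
    image9 = np.zeros((H, W, 3), dtype=np.uint8)               # background pixels are (0, 0, 0)
    h, w = image8.shape[:2]
    image9[py:py + h, px:px + w] = image8                      # hand pasted back at (p_x, p_y) (FIG. 11)
    return image9
```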
According to the method for segmenting the hand region in the image provided by the embodiment, the following beneficial effects are achieved:
(1) The hand region is accurately detected by the deep-learning hand detection model and then cropped out, which ensures reliable hand localization and reduces the amount of computation required for later segmentation.
(2) Combining skin detection with hand depth information can effectively separate skin regions belonging to different body parts and eliminate their interference, which skin detection alone cannot do.
(3) The screened skin is used as the seed, and regions missed by skin detection are progressively recovered through the edge pixel depth differences of the skin-region depth map, so the method is little affected by illumination, highly robust and computationally light.
A second embodiment of the present invention provides an apparatus for segmenting a hand region in an image, the apparatus including:
the detection module 10 is used for performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
wherein, detection module 10 specifically includes:
acquiring an RGB image containing a hand, which is acquired by an RGB camera; an RGB image such as the RGB image in FIG. 2;
inputting the RGB image into the trained hand detection model for hand target detection;
according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region (as shown in FIG. 3), where the graphic information includes the vertex coordinates of the upper-left corner of the minimum circumscribed rectangle and the width and height of the rectangle.
The coordinates of the top-left corner point of the minimum circumscribed rectangle are denoted (p_x, p_y), and the width and height of the minimum circumscribed rectangle are denoted w and h, respectively, in pixels.
The segmentation module 20 is configured to cut out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and perform color space conversion on the intermediate image to obtain a skin region image;
the segmentation module 20 specifically includes:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image (such as image 1), wherein the width and the height of the intermediate image are w and h respectively;
converting the RGB space of the intermediate image into YCrCb space, and detecting the skin within the bounding box by the elliptical skin detection method to obtain the skin region image (e.g., image2 in FIG. 5).
A separation module 30, configured to cut out a depth image of the skin region according to the intermediate image and the skin region image, detect whether the skin regions connected in the depth image are an entire region, and separate each skin region in the depth image based on a detection result;
it should be noted that, since the skin region image including the skin is obtained in the partitioning module, skin regions (e.g., a face region, a neck region, etc.) of other non-hand regions may also be included in the skin region image, and therefore the skin region of the non-hand region needs to be marked for processing.
Wherein, the separation module 30 includes:
circularly traversing first absolute values of the depth value differences of adjacent pixels in each connected region in the depth image, and judging whether each first absolute value is smaller than a first pixel threshold: if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part; if the first absolute value is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region cannot be separated, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
marking the different depth regions, and separating the skin regions in the depth image according to the marks (e.g., image4 and image5 in FIG. 7);
a combination module 40, configured to calculate an average depth of each skin area, select a skin area with the smallest average depth as a seed area for hand target detection, cycle through a second absolute value of a difference between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combine the pixel into the seed area to obtain a deep hand image if the second absolute value is smaller than a skin pixel threshold;
the seed region selection formula is as follows:
d_i = (1/N) Σ_{n=1}^{N} d_n ,   d_k = min_i(d_i)
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the k-th region has the minimum depth mean.
In this embodiment, the combining module 40 is specifically configured to:
and traversing a second absolute value of the difference value between the depth value of the skin edge non-skin pixel of the seed area and the depth value of the skin pixel, if the second absolute value is smaller than the skin pixel threshold value, judging that the pixel belongs to the hand area, merging the pixel into the hand area, and traversing again in the next round until the hand skin area is not increased to finish the traversal so as to obtain a depth hand image.
And the conversion module 50 is configured to convert the depth hand image obtained by the segmentation into a binary image, convert the binary image into a three-channel RGB image, and restore the hand region obtained by the segmentation into the gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image.
The conversion module 50 specifically includes:
converting the segmented depth hand image (image6 in FIG. 8) into a binary image (image7 in FIG. 9), wherein the pixel value of the hand region in the binary image is 255, and the pixel value of the non-hand region is 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing a digital image logical operation between the three-channel RGB image and the image in the cropped minimum circumscribed rectangular frame to obtain an RGB image containing only the hand;
wherein a digital image logical AND operation is performed between the three-channel RGB image and the skin region image; digital image logical operations are readily understood by those skilled in the art and are not repeated here;
and restoring the hand region obtained by segmentation (e.g., image8 in FIG. 10) into a gesture image (e.g., image9 in FIG. 11) corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values, and the remaining background pixel values are (0, 0, 0).
Specifically, according to the width W and height H of the target image, the coordinates (p_x, p_y) of the upper-left corner point of the minimum circumscribed rectangle, and the width w and height h of the minimum circumscribed rectangle, the segmented hand image is restored to the original image size: a new image with width W, height H and RGB values (0, 0, 0) is created, and the hand-only RGB image (image8) obtained in step S52 is copied into the new image (image9), with the upper-left corner of the hand-only RGB image located at (p_x, p_y) in the new image and its width and height equal to w and h, respectively.
According to the hand region segmentation device in the image provided by the embodiment, the following beneficial effects are achieved:
(1) The hand region is accurately detected by the deep-learning hand detection model and then cropped out, which ensures reliable hand localization and reduces the amount of computation required for later segmentation.
(2) Combining skin detection with hand depth information can effectively separate skin regions belonging to different body parts and eliminate their interference, which skin detection alone cannot do.
(3) The screened skin is used as the seed, and regions missed by skin detection are progressively recovered through the edge pixel depth differences of the skin-region depth map, so the method is little affected by illumination, highly robust and computationally light.
Furthermore, an embodiment of the present invention also proposes a readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the above-mentioned method.
Furthermore, an embodiment of the present invention also provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of the above method when executing the program.
The logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following technologies, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description of the specification, reference to the description of "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (9)

1. A method for segmenting a hand region in an image, the method comprising:
performing hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
according to the graphic information of the minimum circumscribed rectangle of the hand region, cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image, and performing color space conversion on the intermediate image to obtain a skin region image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on the detection result;
calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of the difference value between the depth value of a non-skin pixel at the skin edge of the seed area and the depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image;
cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are a whole area, and separating each skin area in the depth image based on the detection result specifically comprises the following steps:
circularly traversing first absolute values of the depth value differences of adjacent pixels in each connected region in the depth image, and judging whether each first absolute value is smaller than a first pixel threshold, wherein if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part, and if the first absolute value is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region cannot be separated, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
2. The method for segmenting the hand region in the image according to claim 1, wherein the step of performing the hand target detection on the target image through the trained hand detection model to obtain the graphic information of the minimum circumscribed rectangle of the hand region specifically comprises:
acquiring an RGB image containing a hand, which is acquired by an RGB camera;
inputting the RGB image into the trained hand detection model for hand target detection;
and according to the detection result of the hand target detection, obtaining the graphic information of the minimum circumscribed rectangle of the hand region, wherein the graphic information comprises the vertex coordinates of the upper left corner of the minimum circumscribed rectangle, the width and the height of the rectangle.
3. The method for segmenting a hand region in an image according to claim 2, wherein the step of cutting out an intermediate image corresponding to the minimum bounding rectangle of the hand region in the target image according to the graphic information of the minimum bounding rectangle of the hand region, and performing color space conversion on the intermediate image to obtain the skin region image specifically comprises:
obtaining a bounding box of a minimum circumscribed rectangle of a hand region through hand target detection, and cutting the hand region from a target image to obtain an intermediate image, wherein the width and the height of the intermediate image are w and h respectively;
and converting the RGB space of the intermediate image into the YCrCb space, and detecting the skin in the boundary frame by an elliptical skin detection method to obtain the skin area image.
4. The method for segmenting a hand region in an image according to claim 1, wherein the selection formula of the seed region is as follows:
d_i = (1/N) Σ_{n=1}^{N} d_n ,   d_k = min_i(d_i)
wherein d_i represents the average depth of the i-th skin region, N represents the number of pixels in that skin region, d_n represents the depth corresponding to the n-th pixel, and d_k indicates that the k-th region has the minimum depth mean.
5. The method as claimed in claim 4, wherein the step of circularly traversing a second absolute value of a difference between a depth value of a skin edge non-skin pixel and a depth value of a skin pixel in the seed region, and if the second absolute value is smaller than a skin pixel threshold, the step of combining the pixel into the seed region to obtain the deep hand image is specifically as follows:
and traversing a second absolute value of the difference value between the depth value of the skin edge non-skin pixel of the seed area and the depth value of the skin pixel, if the second absolute value is smaller than the skin pixel threshold value, judging that the pixel belongs to the hand area, merging the pixel into the hand area, and traversing again in the next round until the hand skin area is not increased to finish the traversal so as to obtain a depth hand image.
6. The method for segmenting a hand region in an image according to claim 5, wherein the depth hand image obtained by segmentation is converted into a binary image, the binary image is converted into a three-channel RGB image, and the hand region obtained by segmentation is restored into a gesture image corresponding to the target image graphic information according to the graphic information of the intermediate image, specifically comprising:
converting the segmented depth hand image into a binary image, wherein the pixel value of a hand area in the binary image is 255, and the pixel value of a non-hand area is 0;
converting the binary image into a three-channel RGB image, wherein the pixel values of all channels are equal, and performing digital image logical operation on the three-channel RGB image and the depth image in the cut minimum circumscribed rectangular frame to obtain the RGB image only containing the hand;
and restoring the hand region obtained by segmentation into a gesture image corresponding to the target image graphic information according to the coordinates of the upper-left corner point of the minimum circumscribed rectangle in the intermediate image and the width and height information, wherein the RGB values of the gesture region are the original pixel values, and the values of the remaining background pixels are (0, 0, 0).
7. An apparatus for segmenting a hand region in an image, the apparatus comprising:
the detection module is used for carrying out hand target detection on the target image through the trained hand detection model so as to obtain the graphic information of the minimum circumscribed rectangle of the hand region;
the segmentation module is used for cutting out an intermediate image corresponding to the minimum circumscribed rectangle of the hand region in the target image according to the graphic information of the minimum circumscribed rectangle of the hand region, and performing color space conversion on the intermediate image to obtain a skin region image;
the separation module is used for cutting out a depth image of the skin area according to the intermediate image and the skin area image, detecting whether the skin areas connected in the depth image are an integral area or not, and separating each skin area in the depth image based on a detection result;
the combination module is used for calculating the average depth of each skin area, selecting the skin area with the minimum average depth as a seed area for hand target detection, circularly traversing a second absolute value of a difference value between a depth value of a skin edge non-skin pixel of the seed area and a depth value of a skin pixel, and combining the pixel into the seed area to obtain a depth hand image if the second absolute value is smaller than a skin pixel threshold value;
the conversion module is used for converting the depth hand image obtained by segmentation into a binary image, converting the binary image into a three-channel RGB image, and restoring the hand area obtained by segmentation into a gesture image corresponding to the graphic information of the target image according to the graphic information of the intermediate image;
the separation module specifically:
the method for cutting out the depth image of the skin area according to the intermediate image and the skin area image and detecting whether the skin areas connected in the depth image are a whole area or not comprises the following steps of:
circularly traversing first absolute values of the depth value differences of adjacent pixels in each connected region in the depth image, and judging whether each first absolute value is smaller than a first pixel threshold, wherein if the first absolute value is smaller than or equal to the first pixel threshold, the adjacent pixels belong to the same part, and if the first absolute value is larger than the first pixel threshold, the adjacent pixels belong to different parts; if a connected region cannot be separated, marking the connected region as one complete region; counting the total number of pixels in each marked region, and if the total number of pixels is smaller than a second pixel threshold, directly treating the region as background;
different depth regions are marked and skin regions in the depth image are separated according to the marks.
8. A readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-6.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-6 when executing the program.
CN202110458968.9A 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image Active CN113128435B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110458968.9A CN113128435B (en) 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110458968.9A CN113128435B (en) 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image

Publications (2)

Publication Number Publication Date
CN113128435A CN113128435A (en) 2021-07-16
CN113128435B true CN113128435B (en) 2022-11-22

Family

ID=76780179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110458968.9A Active CN113128435B (en) 2021-04-27 2021-04-27 Hand region segmentation method, device, medium and computer equipment in image

Country Status (1)

Country Link
CN (1) CN113128435B (en)

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102324019A (en) * 2011-08-12 2012-01-18 浙江大学 Method and system for automatically extracting gesture candidate region in video sequence
CN102402687A (en) * 2010-09-13 2012-04-04 三星电子株式会社 Method and device for detecting rigid body part direction based on depth information
CN103890782A (en) * 2011-10-18 2014-06-25 诺基亚公司 Methods and apparatuses for gesture recognition
CN105893925A (en) * 2015-12-01 2016-08-24 乐视致新电子科技(天津)有限公司 Human hand detection method based on complexion and device
CN106097354A (en) * 2016-06-16 2016-11-09 南昌航空大学 A kind of combining adaptive Gauss Face Detection and the hand images dividing method of region growing
CN107408205A (en) * 2015-03-11 2017-11-28 微软技术许可有限责任公司 Foreground and background is distinguished with infrared imaging
CN108256421A (en) * 2017-12-05 2018-07-06 盈盛资讯科技有限公司 A kind of dynamic gesture sequence real-time identification method, system and device
CN108647597A (en) * 2018-04-27 2018-10-12 京东方科技集团股份有限公司 A kind of wrist recognition methods, gesture identification method, device and electronic equipment
CN109214297A (en) * 2018-08-09 2019-01-15 华南理工大学 A kind of static gesture identification method of combination depth information and Skin Color Information
CN109684959A (en) * 2018-12-14 2019-04-26 武汉大学 The recognition methods of video gesture based on Face Detection and deep learning and device
CN110335342A (en) * 2019-06-12 2019-10-15 清华大学 It is a kind of for immersing the hand model Real-time Generation of mode simulator
CN111553891A (en) * 2020-04-23 2020-08-18 大连理工大学 Handheld object existence detection method
CN111831123A (en) * 2020-07-23 2020-10-27 山东大学 Gesture interaction method and system suitable for desktop mixed reality environment
CN112085855A (en) * 2020-09-09 2020-12-15 南昌虚拟现实研究院股份有限公司 Interactive image editing method and device, storage medium and computer equipment
CN112232332A (en) * 2020-12-17 2021-01-15 四川圣点世纪科技有限公司 Non-contact palm detection method based on video sequence
CN112509117A (en) * 2020-11-30 2021-03-16 清华大学 Hand three-dimensional model reconstruction method and device, electronic equipment and storage medium
CN112686231A (en) * 2021-03-15 2021-04-20 南昌虚拟现实研究院股份有限公司 Dynamic gesture recognition method and device, readable storage medium and computer equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9767586B2 (en) * 2014-07-11 2017-09-19 Microsoft Technology Licensing, Llc Camera system and method for hair segmentation
CN106682571B (en) * 2016-11-08 2019-09-27 中国民航大学 Method for detecting human face based on skin color segmentation and wavelet transformation
CN108564070B (en) * 2018-05-07 2021-05-11 京东方科技集团股份有限公司 Method and device for extracting gestures

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102402687A (en) * 2010-09-13 2012-04-04 三星电子株式会社 Method and device for detecting rigid body part direction based on depth information
CN102324019A (en) * 2011-08-12 2012-01-18 浙江大学 Method and system for automatically extracting gesture candidate region in video sequence
CN103890782A (en) * 2011-10-18 2014-06-25 诺基亚公司 Methods and apparatuses for gesture recognition
CN107408205A (en) * 2015-03-11 2017-11-28 微软技术许可有限责任公司 Foreground and background is distinguished with infrared imaging
CN105893925A (en) * 2015-12-01 2016-08-24 乐视致新电子科技(天津)有限公司 Human hand detection method based on complexion and device
CN106097354A (en) * 2016-06-16 2016-11-09 南昌航空大学 A kind of combining adaptive Gauss Face Detection and the hand images dividing method of region growing
CN108256421A (en) * 2017-12-05 2018-07-06 盈盛资讯科技有限公司 A kind of dynamic gesture sequence real-time identification method, system and device
CN108647597A (en) * 2018-04-27 2018-10-12 京东方科技集团股份有限公司 A kind of wrist recognition methods, gesture identification method, device and electronic equipment
CN109214297A (en) * 2018-08-09 2019-01-15 华南理工大学 A kind of static gesture identification method of combination depth information and Skin Color Information
CN109684959A (en) * 2018-12-14 2019-04-26 武汉大学 The recognition methods of video gesture based on Face Detection and deep learning and device
CN110335342A (en) * 2019-06-12 2019-10-15 清华大学 It is a kind of for immersing the hand model Real-time Generation of mode simulator
CN111553891A (en) * 2020-04-23 2020-08-18 大连理工大学 Handheld object existence detection method
CN111831123A (en) * 2020-07-23 2020-10-27 山东大学 Gesture interaction method and system suitable for desktop mixed reality environment
CN112085855A (en) * 2020-09-09 2020-12-15 南昌虚拟现实研究院股份有限公司 Interactive image editing method and device, storage medium and computer equipment
CN112509117A (en) * 2020-11-30 2021-03-16 清华大学 Hand three-dimensional model reconstruction method and device, electronic equipment and storage medium
CN112232332A (en) * 2020-12-17 2021-01-15 四川圣点世纪科技有限公司 Non-contact palm detection method based on video sequence
CN112686231A (en) * 2021-03-15 2021-04-20 南昌虚拟现实研究院股份有限公司 Dynamic gesture recognition method and device, readable storage medium and computer equipment

Also Published As

Publication number Publication date
CN113128435A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
CN109978839B (en) Method for detecting wafer low-texture defects
CN110717489B (en) Method, device and storage medium for identifying text region of OSD (on Screen display)
US8411986B2 (en) Systems and methods for segmenation by removal of monochromatic background with limitied intensity variations
US10748023B2 (en) Region-of-interest detection apparatus, region-of-interest detection method, and recording medium
CN113781402A (en) Method and device for detecting chip surface scratch defects and computer equipment
US20230009564A1 (en) Character segmentation method and apparatus, and computer-readable storage medium
CN111832659B (en) Laser marking system and method based on feature point extraction algorithm detection
US20120320433A1 (en) Image processing method, image processing device and scanner
CN113609984A (en) Pointer instrument reading identification method and device and electronic equipment
CN112733823B (en) Method and device for extracting key frame for gesture recognition and readable storage medium
CN112733855B (en) Table structuring method, table recovering device and device with storage function
US8254693B2 (en) Image processing apparatus, image processing method and program
CN113128435B (en) Hand region segmentation method, device, medium and computer equipment in image
CN115049713A (en) Image registration method, device, equipment and readable storage medium
CN112101139B (en) Human shape detection method, device, equipment and storage medium
WO2022056875A1 (en) Method and apparatus for segmenting nameplate image, and computer-readable storage medium
CN114445814A (en) Character region extraction method and computer-readable storage medium
CN114140620A (en) Object straight line contour detection method
CN111476800A (en) Character region detection method and device based on morphological operation
CN112560740A (en) PCA-Kmeans-based visible light remote sensing image change detection method
JP4253265B2 (en) Shadow detection apparatus, shadow detection method and shadow detection program, image processing apparatus using shadow detection apparatus, image processing method using shadow detection method, and image processing program using shadow detection program
JP2004094427A (en) Slip image processor and program for realizing the same device
CN110598697A (en) Container number positioning method based on thickness character positioning
CN117351011B (en) Screen defect detection method, apparatus, and readable storage medium
US11769322B2 (en) Program creation device, object detection system, anchor setting method, and anchor setting program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant