CN114898096A - Segmentation and annotation method and system for figure image - Google Patents

Segmentation and annotation method and system for figure image

Publication number
CN114898096A
Authority
CN
China
Prior art keywords: image, frequency information, segmentation, segmented, network
Legal status (an assumption, not a legal conclusion; Google has not performed a legal analysis): Pending
Application number
CN202210549650.6A
Other languages
Chinese (zh)
Inventor
赵凯
黄涛
程雯
王先兰
Current Assignee: Individual
Original Assignee: Individual
Application filed by Individual
Priority to CN202210549650.6A
Publication of CN114898096A


Classifications

    • G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06N 3/045: Neural-network architectures; combinations of networks
    • G06N 3/084: Learning methods; backpropagation, e.g. using gradient descent
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands

Abstract

The invention relates to the field of digital image processing, and in particular to a method and system for segmenting and annotating person images, comprising the following steps: acquiring an image set to be annotated and a high-resolution image set; extracting low-frequency and high-frequency information from the high-resolution images to train a reconstruction network; inputting the image set to be annotated into the optimal reconstruction network to obtain reconstructed images; segmenting the reconstructed images through a segmentation network to generate a segmented image set; and checking the segmented image set, then annotating the original images to be annotated that meet the preset requirements to obtain an annotated image set. By extracting the low-frequency and high-frequency information of high-resolution images, the reconstruction network supplements the missing high-frequency information of each image to be annotated without altering the original image, reconstructing it into a clear image. The original segmentation algorithm need not be changed: the robustness and generalization ability of existing segmentation algorithms are enhanced from the image side, so the segmentation precision is higher and the annotated images are more accurate.

Description

Segmentation and annotation method and system for figure image
Technical Field
The invention relates to the technical field of digital image processing, in particular to a method and a system for segmenting and labeling a figure image.
Background
With the development of computer vision technology, many algorithmic networks in the image-processing field are inseparable from datasets; in image segmentation especially, the demand for high-quality segmentation datasets keeps growing. At present, most image segmentation datasets on the market are annotated manually, i.e., with software such as Hitachi segmentation and LabelMe: an annotator first traces contours on the original image to form a pre-segmentation, then manually corrects it on that basis until the annotated image is finalized. This traditional segmentation and annotation workflow is both tedious and very time-consuming; a great deal of time goes into annotation alone, and efficiency is very low.
In the prior art, some developers have introduced segmentation algorithms into image annotation to further improve efficiency while preserving annotation quality: the image to be annotated is first segmented by an algorithm, and the annotator then labels on top of the segmented result. This approach, however, has clear shortcomings and performs poorly in practice. When the image is blurred or of low resolution, the segmentation algorithm readily mis-segments; the annotator must then manually re-trace and re-label the original blurred image, so the algorithm's tendency to mis-segment is not overcome and annotation efficiency is not effectively improved.
Therefore, neither the current manual segmentation-annotation method nor the combination of manual work and algorithms solves the low efficiency of current image annotation.
Disclosure of Invention
The invention provides a method and a system for segmenting and labeling a person image, which are used for overcoming the defects in the prior art.
The invention provides a method for segmenting and annotating a person image, comprising the following steps:
S1, acquiring an image set to be annotated and a high-resolution image set;
S2, performing first sampling on the high-resolution images in the high-resolution image set to acquire the low-frequency information and high-frequency information corresponding to each high-resolution image;
S3, training a reconstruction network based on the low-frequency information and the high-frequency information; obtaining training reconstructed images through the reconstruction network, comparing each training reconstructed image with the corresponding high-resolution image, and outputting the network as the optimal reconstruction network when the peak signal-to-noise ratio and structural similarity between them are highest;
S4, inputting the image set to be annotated into the optimal reconstruction network to acquire the reconstructed image corresponding to each image to be annotated;
S5, segmenting each reconstructed image through a preset segmentation network to distinguish person regions from environment regions, acquiring the corresponding segmented images, and generating a segmented image set;
S6, checking the acquired segmented image set; if any segmented image does not meet the preset requirement, performing secondary processing on the corresponding segmented image; if the preset requirements are met, annotating the original image to be annotated based on the segmented image to obtain the corresponding annotated image set.
Preferably, the first sampling is performed on a plurality of high resolution images in the high resolution image set through wavelet transform, so as to obtain corresponding high frequency information and low frequency information.
According to the segmentation and labeling method for the figure image, provided by the invention, a reconstruction network is trained on the basis of the low-frequency information and the high-frequency information, and the method comprises the following steps:
storing the high-frequency information into an auxiliary variable and inputting the high-frequency information into the reconstruction network;
generating a corresponding low-resolution image based on the low-frequency information, and inputting the image to the reconstruction network;
and fusing the low-frequency information of the low-resolution image and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting the repaired training reconstructed image.
According to the method for segmenting and annotating a person image provided by the invention, inputting the image set to be annotated into the optimal reconstruction network and acquiring the reconstructed image corresponding to each image to be annotated specifically comprises:
inputting each image to be annotated into the optimal reconstruction network and treating it as low-frequency information; generating the auxiliary variable corresponding to the image to be annotated through the optimal reconstruction network; and fusing the low-frequency information of the image to be annotated with the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting the restored reconstructed image.
According to the segmentation and labeling method for the figure image, provided by the invention, a reconstruction network is trained on the basis of the low-frequency information and the high-frequency information to obtain the optimal reconstruction network, and the segmentation and labeling method further comprises the following steps:
acquiring the forward loss and reverse loss generated while training the reconstruction network;
comparing the low-resolution image obtained from the low-frequency information with the low-resolution image obtained by applying image degradation to the high-resolution image, and recording the error between the two as the forward loss;
inputting the low-resolution image obtained from the low-frequency information into the reconstruction network to obtain the corresponding training reconstructed image, comparing it with the original high-resolution image corresponding to the high-frequency information, and recording the error between the two as the reverse loss;
and taking the reconstruction network at which the forward loss and reverse loss are minimal as the optimal reconstruction network.
According to the method for segmenting and labeling the human image, provided by the invention, the obtained segmented image set is checked, and whether any segmented image meets the preset requirement is judged, wherein the method comprises the following steps:
judging whether the character area and the environment area in the segmented image have error segmentation;
judging whether the average intersection ratio and the average pixel precision of the segmented images are lower than preset index values or not;
and if the error segmentation does not exist, and the average intersection ratio and the average pixel precision are not lower than the preset index value, judging that the segmented image meets the preset requirement.
According to the segmentation and annotation method of the figure image, provided by the invention, for any segmentation image, if the segmentation image does not meet the preset requirement, secondary processing is carried out on the corresponding segmentation image, and the segmentation and annotation method comprises the following steps:
s601, forming a rechecking image set by the original reconstructed images corresponding to the segmented images which do not meet the preset requirements;
s602, inputting the review image set into the optimal reconstruction network, outputting a new reconstruction image, segmenting the new reconstruction image through the segmentation network, and outputting a new segmentation image set;
s603, judging whether the new segmented image set meets a preset requirement or not, and if so, outputting the corresponding new segmented image set; if the preset requirement is not met, the steps S601-S603 are repeatedly executed until the preset requirement is met.
According to the segmentation and annotation method for person images provided by the invention, each reconstructed image is segmented through the preset segmentation network to obtain segmented regions, the segmented regions are labeled as person regions and environment regions, the contour line of each labeled region is extracted, and the coordinates of a plurality of points on the contour line are acquired;
a corresponding contour line is then drawn on the image to be annotated based on the coordinates of the points on the outer contour, the contour is filled with a user-preset color to mark the corresponding segmented region, and the annotated image is output.
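The contour-extraction and color-filling step above can be sketched in NumPy. This is a minimal illustration, not the patent's implementation: `boundary_points` and `annotate` are hypothetical helper names, the 4-neighbour contour test stands in for whatever contour tracer the system actually uses, and the color/alpha values are arbitrary.

```python
import numpy as np

def boundary_points(mask):
    """Return (row, col) coordinates of contour pixels of a binary mask.

    A pixel lies on the contour if it belongs to the person region but has
    at least one 4-neighbour outside it (the mask is zero-padded at borders).
    """
    m = mask.astype(bool)
    padded = np.pad(m, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    contour = m & ~interior
    return np.argwhere(contour)

def annotate(image, mask, color=(255, 0, 0), alpha=0.5):
    """Blend a preset colour over the segmented region of the original image."""
    sel = mask.astype(bool)
    out = image.astype(np.float32).copy()
    out[sel] = (1 - alpha) * out[sel] + alpha * np.array(color, np.float32)
    return out.astype(np.uint8)

# Toy 5x5 mask with a 3x3 "person" region: 8 border pixels form the contour,
# the centre pixel is interior.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1
pts = boundary_points(mask)
```

In a real pipeline the point coordinates would be written into the annotation file, and the blended overlay shown to the annotator for review.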
In another aspect, the invention further provides a system for segmenting and annotating person images, comprising a data transmission module, a storage module, a first sampling module, a reconstruction module, a segmentation module and an annotation module;
the data transmission module is used for acquiring an image set to be annotated and a high-resolution image set;
the storage module is used for storing all image data and image sets generated in the segmentation process and the labeling process;
the first sampling module is used for acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
the reconstruction module is used for training a reconstruction network according to the low-frequency information and the high-frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest; acquiring the image set to be annotated from the storage module, and acquiring a reconstructed image corresponding to each image to be annotated through the optimal reconstruction network;
the segmentation module is used for segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, obtaining a corresponding segmented image and generating a segmented image set;
the marking module is used for checking the obtained segmentation image set, and if any segmentation image does not meet the preset requirement, performing secondary processing on the corresponding segmentation image; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image, and outputting a corresponding annotated image set.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method for segmenting and annotating a human image as described in any one of the above.
The segmentation and annotation method and system for the figure image, provided by the invention, have the following technical effects:
(1) By extracting the low-frequency and high-frequency information of high-resolution images and training the reconstruction network on them, the high-frequency information of each image to be annotated is supplemented through the reconstruction network without altering the original image, reconstructing it into a clear image. This benefits the subsequent segmentation and annotation, requires no change to the original segmentation algorithm, and enhances the robustness and generalization ability of existing segmentation algorithms from the image side, so the segmentation precision is higher and the output annotated images are more accurate.
(2) The reconstruction network and segmentation network resolve the long duration and low efficiency of existing manual annotation; in use, large batches of blurred images can be annotated after uploading only a certain amount of training data, greatly shortening annotation time and improving image-processing efficiency.
(3) The acquired segmented images can be checked repeatedly according to the user's requirements, and the images to be annotated can be reconstructed and annotated multiple times through the reconstruction and segmentation networks, improving the accuracy of the image-annotation task.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a method for segmenting and labeling a human image according to the present invention;
fig. 2 is a second flowchart of the method for segmenting and labeling a human image according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1-2, the present invention provides a method for segmenting and labeling a human image, comprising the steps of:
S1, acquiring an image set to be annotated and a high-resolution image set;
S2, performing first sampling on the high-resolution images in the high-resolution image set to acquire the low-frequency information and high-frequency information corresponding to each high-resolution image;
S3, training a reconstruction network based on the low-frequency information and the high-frequency information; obtaining training reconstructed images through the reconstruction network, comparing each training reconstructed image with the corresponding high-resolution image, and outputting the network as the optimal reconstruction network when the peak signal-to-noise ratio and structural similarity between them are highest;
S4, inputting the image set to be annotated into the optimal reconstruction network to acquire the reconstructed image corresponding to each image to be annotated;
S5, segmenting each reconstructed image through a preset segmentation network to distinguish person regions from environment regions, acquiring the corresponding segmented images, and generating a segmented image set;
S6, checking the acquired segmented image set; if any segmented image does not meet the preset requirement, performing secondary processing on the corresponding segmented image; if the preset requirements are met, annotating the original image to be annotated based on the segmented image to obtain the corresponding annotated image set;
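Step S3 selects the "optimal" network by peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). A minimal NumPy sketch of PSNR and of the checkpoint-selection logic follows; SSIM (omitted here for brevity, and available in libraries such as scikit-image) would be combined the same way. The function names and the checkpoint dictionary layout are illustrative assumptions.

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio between a reference image and a test image."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def best_checkpoint(pairs_per_ckpt):
    """Keep the network state with the highest mean PSNR over training pairs.

    pairs_per_ckpt: {checkpoint_name: [(high_res, reconstruction), ...]}
    (SSIM would be averaged in alongside PSNR in the same manner.)
    """
    scores = {name: np.mean([psnr(hr, rec) for hr, rec in pairs])
              for name, pairs in pairs_per_ckpt.items()}
    return max(scores, key=scores.get)
```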
In step S1, the user may input the image set to be annotated and the high-resolution image set used for reconstruction-network training; the image set to be annotated is specifically a set of blurred images.
Blur is caused at capture time by factors such as inaccurate focusing, relative motion between the target and the camera, or camera distortion; for example, rain or snow, dim surroundings, or noise while shooting can blur the image. It also covers blur from a surveillance camera being out of focus, relative motion between the monitored object and the camera, or low camera definition. Distance can likewise cause blur: for example, when a pedestrian is too far from the monitor, the pedestrian gait-sequence images cut from the video have very low resolution and the features in them are hard to distinguish.
High resolution and low resolution are relative concepts: the resolution of a high-resolution image is greater than that of a low-resolution image.
Optionally, the high-resolution image set may further include the segmentation annotation image corresponding to each high-resolution image, used to train the reconstruction network and the segmentation network.
preferably, a plurality of high resolution images in the high resolution image set are subjected to first sampling through wavelet transformation to obtain corresponding high frequency information and low frequency information;
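The wavelet-based first sampling can be sketched with a single-level 2-D Haar transform in plain NumPy. This is only an illustration: the patent does not fix the wavelet family, and the average-based (unnormalized) Haar variant below is an assumption chosen so that the transform is exactly invertible without extra scaling.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar wavelet transform (image sides must be even).

    Returns the low-frequency approximation LL and the three high-frequency
    detail sub-bands (LH, HL, HH), each at half the input resolution.
    """
    x = img.astype(np.float64)
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 4.0  # low-frequency content
    lh = (a + b - c - d) / 4.0  # horizontal details
    hl = (a - b + c - d) / 4.0  # vertical details
    hh = (a - b - c + d) / 4.0  # diagonal details
    return ll, (lh, hl, hh)

def haar_idwt2(ll, details):
    """Exact inverse of haar_dwt2: the first sampling loses no information."""
    lh, hl, hh = details
    a = ll + lh + hl + hh
    b = ll + lh - hl - hh
    c = ll - lh + hl - hh
    d = ll - lh - hl + hh
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    out[0::2, 0::2] = a; out[0::2, 1::2] = b
    out[1::2, 0::2] = c; out[1::2, 1::2] = d
    return out
```

The LL band plays the role of the low-frequency information and the detail bands the high-frequency information fed to the reconstruction network.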
in step S3, training a reconstruction network based on the low frequency information and the high frequency information specifically includes:
storing the high-frequency information into an auxiliary variable and inputting the high-frequency information into the reconstruction network;
generating a corresponding low-resolution image based on the low-frequency information, and inputting the image to the reconstruction network;
fusing the low-frequency information of the low-resolution image and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting a repaired training reconstructed image;
The reconstruction network is built from a plurality of invertible blocks and further separates the low-frequency and high-frequency information of the original high-resolution image, so that the low-frequency information yields a corresponding low-resolution (blurred) image through the network while the high-frequency information is embedded into an auxiliary variable.
Because the network is invertible, a forward pass separates the high-frequency and low-frequency information of the input image; a reverse pass takes a low-resolution image together with an auxiliary variable carrying its high-frequency information, recovers the low-frequency information and the corresponding high-frequency information of the low-resolution image, fuses them through second sampling processing, and outputs the high-resolution image corresponding to the low-resolution input, i.e., its repaired clear image.
in one embodiment, inputting the image set to be annotated into the optimal reconstruction network, and acquiring a reconstructed image corresponding to each image to be annotated specifically includes:
inputting each image to be marked into the optimal reconstruction network, and taking the image to be marked as low-frequency information; generating auxiliary variables corresponding to the image to be marked through the optimal reconstruction network; fusing the low-frequency information of the image to be marked and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting a restored reconstructed image;
the reconstruction network can be regarded as a reversible transformation function and is based on the following expression:
and (3) forward process:
Figure BDA0003654251690000091
the reverse process:
Figure BDA0003654251690000092
wherein a is an auxiliary variable carrying high-frequency information of the image,
Figure BDA0003654251690000093
for low-fraction images generated over a reconstruction network, x L ,x H Respectively representing low-frequency information and high-frequency information corresponding to the original high-resolution image obtained by down-sampling,
Figure BDA0003654251690000094
it is the low frequency information and the high frequency information of the low fraction image obtained by inverse reconstruction,
Figure BDA0003654251690000095
then it is the upsampled repaired high resolution image, f () represents the reconstruction network forward process, and g () represents the upsampling process.
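The patent does not specify the internal structure of the invertible blocks. As a minimal sketch, an additive coupling block (in the style of invertible architectures such as RealNVP or invertible rescaling networks) shows how a forward pass maps (x_L, x_H) to (x̂_L, a) and how the reverse pass recovers the inputs exactly; the tanh sub-network, feature size, and weights below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))

def phi(v, W):
    """A small nonlinear sub-network (here a single tanh layer)."""
    return np.tanh(v @ W)

def couple_forward(x_l, x_h):
    """Additive coupling step: maps (x_L, x_H) to (x_hat_L, a) invertibly."""
    x_hat_l = x_l + phi(x_h, W1)
    a = x_h + phi(x_hat_l, W2)
    return x_hat_l, a

def couple_inverse(x_hat_l, a):
    """Exact inverse: recovers (x_L, x_H) from (x_hat_L, a)."""
    x_h = a - phi(x_hat_l, W2)
    x_l = x_hat_l - phi(x_h, W1)
    return x_l, x_h
```

Stacking several such blocks gives an f that is invertible by construction, which is what lets the same network both degrade (forward) and repair (reverse) an image.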
Preferably, during training of the reconstruction network, the error losses may be used as the criterion for training convergence, and the network is optimized over multiple training rounds, specifically including:
acquiring the forward loss and reverse loss generated while training the reconstruction network;
for the forward loss, the low-frequency and high-frequency information are fed forward through the reconstruction network to obtain the low-resolution image corresponding to the high-resolution image; to better simulate a real low-resolution blurred image, the original high-resolution image is also put directly through image degradation to obtain a degraded low-resolution blurred image, and the error between the low-resolution images obtained in these two ways is called the forward loss of the reconstruction network;
the reverse loss comes from reversely reconstructing the low-resolution blurred image generated in the forward process: the low-resolution blurred image and the auxiliary variable are input into the reconstruction network and then up-sampled to obtain the repaired high-resolution image, i.e., the training reconstructed image; the error between the repaired high-resolution image and the original high-resolution image is called the reverse loss.
The reconstruction network at which the forward and reverse losses are minimal is taken as the optimal reconstruction network;
during training, a reconstructed image generated by the not-yet-converged network is a training reconstructed image;
the image degradation process performs bicubic-interpolation down-sampling on the high-resolution image to obtain a low-resolution image, and then adds noise to the low-resolution image;
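The degradation step can be sketched as follows. Note an assumption: the patent specifies bicubic down-sampling, but the dependency-free stand-in below uses 2x average pooling in its place, followed by Gaussian noise; the scale, noise level, and function name are illustrative.

```python
import numpy as np

def degrade(hr, scale=2, noise_sigma=5.0, seed=0):
    """Down-sample a high-resolution image and add Gaussian noise.

    Average pooling stands in for the patent's bicubic down-sampling;
    the added noise simulates a real low-resolution blurred image.
    """
    h = hr.shape[0] // scale * scale
    w = hr.shape[1] // scale * scale
    x = hr[:h, :w].astype(np.float64)
    lr = x.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    noise = np.random.default_rng(seed).normal(0.0, noise_sigma, lr.shape)
    return np.clip(lr + noise, 0, 255)
```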
Further, the training loss of the reconstruction network can be expressed as:

loss_total = λ₁ · loss_Forw + λ₂ · loss_Reve

where loss_Forw is the forward loss of the reconstruction network, loss_Reve its reverse loss, and λ₁, λ₂ the corresponding coefficients.

Further, loss_Forw can be obtained by:

loss_Forw = l_forw(x̂_L, x̌_L)

and loss_Reve by:

loss_Reve = l_reve(x̂, x)

where l_forw is the forward loss-value function, l_reve the reverse loss-value function, x̂_L the low-resolution image generated by the reconstruction network, x̌_L the low-resolution blurred image obtained from the high-resolution image x through the degradation process, and x̂ the repaired high-resolution image;
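A minimal NumPy sketch of the combined training loss described above. The concrete forms of l_forw and l_reve (taken here as L1 and L2 respectively) and the coefficient values are assumptions, since the patent leaves them unspecified.

```python
import numpy as np

def l1(a, b):
    """Mean absolute error, used here as the forward loss-value function."""
    return np.mean(np.abs(a - b))

def l2(a, b):
    """Mean squared error, used here as the reverse loss-value function."""
    return np.mean((a - b) ** 2)

def total_loss(lr_pred, lr_degraded, hr_pred, hr_orig, lam1=1.0, lam2=1.0):
    """loss_total = lam1 * loss_Forw + lam2 * loss_Reve.

    loss_Forw compares the network's low-resolution output with the degraded
    low-resolution image; loss_Reve compares the repaired high-resolution
    image with the original high-resolution image.
    """
    loss_forw = l1(lr_pred, lr_degraded)
    loss_reve = l2(hr_pred, hr_orig)
    return lam1 * loss_forw + lam2 * loss_reve
```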
Further, in step S4, the blurred image set is repaired with the optimal reconstruction network: specifically, the missing high-frequency variable is supplemented by the optimal reconstruction network, and the high-frequency information and the blurred images to be annotated are input together into the optimal reconstruction network for inverse reconstruction, yielding the clear image set corresponding to the blurred set to be annotated, i.e., the reconstructed images (the repaired blurred image set);
further, in step S5, each reconstructed image is segmented through the preset segmentation network to distinguish person regions from environment regions, the corresponding segmented images are acquired, and a segmented image set is generated;
the segmentation network includes, but is not limited to, one or a combination of mainstream segmentation algorithms such as DeepLabv3+ and Mask R-CNN;
preferably, the method further comprises a segmentation-network training stage before the repaired blurred image set is input into the segmentation network model: the segmentation network is trained with the high-resolution training set uploaded by the user and the corresponding segmentation annotation images to obtain the trained segmentation network model.
In one embodiment, the step of checking the acquired segmented image set to determine whether any segmented image meets preset requirements includes:
judging whether erroneous segmentation exists between the person region and the environment region in the segmented image;

judging whether the average intersection-over-union and the average pixel accuracy of the segmented image are below preset index values;

if no erroneous segmentation exists, and neither the average intersection-over-union nor the average pixel accuracy is below the preset index values, judging that the segmented image meets the preset requirements;
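The two quantitative checks above can be sketched as follows. Mean intersection-over-union and mean pixel accuracy are standard segmentation metrics; the thresholds below are hypothetical stand-ins for the patent's "preset index values":

```python
import numpy as np

# Sketch of the acceptance checks over the two classes
# (0 = environment region, 1 = person region).

def mean_iou(pred, gt, num_classes=2):
    """Mean intersection-over-union across classes present in the union."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union:
            ious.append(inter / union)
    return float(np.mean(ious))

def mean_pixel_accuracy(pred, gt, num_classes=2):
    """Per-class pixel accuracy, averaged over classes present in gt."""
    accs = []
    for c in range(num_classes):
        mask = gt == c
        if mask.any():
            accs.append((pred[mask] == c).mean())
    return float(np.mean(accs))

def meets_requirements(pred, gt, miou_thresh=0.7, mpa_thresh=0.8):
    """Hypothetical thresholds standing in for the preset index values."""
    return (mean_iou(pred, gt) >= miou_thresh
            and mean_pixel_accuracy(pred, gt) >= mpa_thresh)
```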
further, if any segmented image does not meet the preset requirement, secondary processing is carried out on the corresponding segmented image, and the method comprises the following steps:
s601, forming a rechecking image set by the original reconstructed images corresponding to the segmented images which do not meet the preset requirements;
s602, inputting the review image set into the optimal reconstruction network, outputting a new reconstruction image, segmenting the new reconstruction image through the segmentation network, and outputting a new segmentation image set;
s603, judging whether the new segmented image set meets a preset requirement or not, and if so, outputting the corresponding new segmented image set; if the preset requirement is not met, the steps S601-S603 are repeatedly executed until the preset requirement is met;
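Steps S601-S603 can be sketched as a retry loop. All callables and the pass limit below are hypothetical stand-ins, including the manual-processing fallback the specification describes for images that keep failing:

```python
# Pure-Python sketch of the secondary-processing loop (S601-S603).
# reconstruct, segment, and passes_check are hypothetical stand-ins
# for the optimal reconstruction network, the segmentation network,
# and the preset-requirement check.

def secondary_process(failed, reconstruct, segment, passes_check, max_passes=3):
    """failed: reconstructed images whose segmentations failed the check.
    Returns (images, needs_manual)."""
    review_set = list(failed)                                  # S601: review image set
    for _ in range(max_passes):
        review_set = [reconstruct(img) for img in review_set]  # S602: re-reconstruct
        segmented = [segment(img) for img in review_set]       # ...and re-segment
        if all(passes_check(s) for s in segmented):            # S603: check again
            return segmented, False
    # Pass limit exceeded: hand the remaining images over to manual processing.
    return segmented, True
```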
specifically, in one example, before the segmented images are annotated they may be examined by a user or by an algorithm, and images that still fail the preset requirements after the first reconstruction are screened out to form a new image set A for secondary processing;

before the secondary processing, the method further comprises an image retrieval stage over the original image set to be processed: according to the image set A of unqualified images screened by the user, the reconstructed images generated from the image set to be processed during the first repair are traversed, the reconstructed image corresponding to each unqualified image in set A is selected, and a new image set B is formed;

the image set B is then input into the optimal reconstruction network again for secondary processing, including secondary reconstruction and segmentation;
optionally, the user may set the number of reconstruction passes as needed, until a segmented image is obtained that meets the requirements: no erroneous segmentation, and neither the average intersection-over-union nor the average pixel accuracy below the preset index values;
optionally, a threshold may be set on the number of reconstruction passes; an image that still fails the requirements after repeated reconstruction and re-segmentation once this threshold is exceeded is handed over to manual segmentation. Alternatively, the original reconstruction network and the segmentation network may be retrained on an expanded training sample set, thereby improving the accuracy of both networks;

after the images that exceed the reconstruction-pass limit and still fail the preset requirements have been processed manually, the original images together with the manually processed results are used as a supplementary training set for retraining the reconstruction network;
further, each reconstructed image is segmented by the preset segmentation network to obtain segmented regions; the segmented regions are labeled as human-body regions and environment regions, the contour lines of the labeled segmented regions are extracted, and the coordinates of a plurality of points on each contour line are obtained;

based on the coordinates of the points on the outer contour, the corresponding contour line is drawn on the image to be annotated that corresponds to the segmented image, the contour is filled with a user-preset color so as to annotate the corresponding segmented region, and the annotated image is output;

optionally, the user may set the types of segmented regions as needed so as to segment different regions in the image, select the required color for each type, fill the contour range of a segmented region with the preset color, and fill the region outside the contour with black;

optionally, before the contour of the target segmented region is extracted, the region outside the segmented region is changed to a black background; in a specific implementation, contour extraction may be performed with a function such as findContours() in OpenCV, generating a black-and-white contour image together with the coordinate values of the contour positions.
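The specification names OpenCV's findContours() for this step; to keep the sketch dependency-free, the example below locates boundary-point coordinates of a binary person mask directly (a foreground pixel lies on the contour if any of its 4-neighbors falls outside the mask) and fills the annotated region with a preset color against a black background. The color values are illustrative:

```python
import numpy as np

# Dependency-free stand-in for cv2.findContours(): boundary pixels of a
# binary mask, plus the color-fill step described in the specification.

def contour_points(mask):
    """Return (row, col) coordinates of the mask's boundary pixels."""
    m = mask.astype(bool)
    padded = np.pad(m, 1, constant_values=False)
    # A pixel is interior if all four 4-neighbors are also foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    boundary = m & ~interior
    return list(zip(*np.nonzero(boundary)))

def fill_annotation(shape, mask, fg_color=255, bg_color=0):
    """Fill the segmented region with a preset color; black out the rest."""
    out = np.full(shape, bg_color, dtype=np.uint8)
    out[mask.astype(bool)] = fg_color
    return out
```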
On the other hand, the invention also provides a segmentation and annotation system of the figure image, which is characterized by comprising a data transmission module, a storage module, a first sampling module, a reconstruction module, a segmentation module and an annotation module;
the data transmission module is used for acquiring an image set to be annotated and a high-resolution image set;
the storage module is used for storing all image data and image sets generated in the segmentation process and the labeling process;
the first sampling module is used for acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
the reconstruction module is used for training a reconstruction network according to the low-frequency information and the high-frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest; acquiring the image set to be annotated from the storage module, and acquiring a reconstructed image corresponding to each image to be annotated through the optimal reconstruction network;
the segmentation module is used for segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, obtaining a corresponding segmented image and generating a segmented image set;
the marking module is used for checking the obtained segmentation image set, and if any segmentation image does not meet the preset requirement, performing secondary processing on the corresponding segmentation image; if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image, and outputting a corresponding annotated image set;
optionally, the user may finally download the annotated images to a local device through the data transmission module, completing the annotation task.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method for segmenting and labeling a human image provided by the above methods, including:
s1, acquiring an image set to be annotated and a high-resolution image set;
s2, carrying out first sampling on a plurality of high-resolution images in the high-resolution image set, and acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
s3 training a reconstruction network based on the low frequency information and the high frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest;
s4, inputting the image set to be annotated into the optimal reconstruction network, and acquiring a reconstruction image corresponding to each image to be annotated;
s5, segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, acquiring a corresponding segmented image, and generating a segmented image set;
s6, the obtained segmentation image set is checked, and if any segmentation image does not meet the preset requirement, the corresponding segmentation image is subjected to secondary processing; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image to obtain a corresponding annotated image set.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor is implemented to execute the method for segmenting and labeling a human image provided above, including:
s1, acquiring an image set to be annotated and a high-resolution image set;
s2, carrying out first sampling on a plurality of high-resolution images in the high-resolution image set, and acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
s3 training a reconstruction network based on the low frequency information and the high frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest;
s4, inputting the image set to be annotated into the optimal reconstruction network, and acquiring a reconstruction image corresponding to each image to be annotated;
s5, segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, acquiring a corresponding segmented image, and generating a segmented image set;
s6, the obtained segmentation image set is checked, and if any segmentation image does not meet the preset requirement, the corresponding segmentation image is subjected to secondary processing; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image to obtain a corresponding annotated image set.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A segmentation and annotation method for a human image is characterized by comprising the following steps:
s1, acquiring an image set to be annotated and a high-resolution image set;
s2, carrying out first sampling on a plurality of high-resolution images in the high-resolution image set, and acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
s3 training a reconstruction network based on the low frequency information and the high frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest;
s4, inputting the image set to be annotated into the optimal reconstruction network, and acquiring a reconstruction image corresponding to each image to be annotated;
s5, segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, acquiring a corresponding segmented image, and generating a segmented image set;
s6, the obtained segmentation image set is checked, and if any segmentation image does not meet the preset requirement, the corresponding segmentation image is subjected to secondary processing; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image to obtain a corresponding annotated image set.
2. The method of claim 1, wherein the high-frequency information and the low-frequency information are obtained by performing a first sampling on a plurality of high-resolution images in the high-resolution image set through wavelet transform.
3. The method of claim 1, wherein training a reconstruction network based on the low frequency information and the high frequency information comprises:
storing the high-frequency information into an auxiliary variable and inputting the high-frequency information into the reconstruction network;
generating a corresponding low-resolution image based on the low-frequency information, and inputting the image to the reconstruction network;
and fusing the low-frequency information of the low-resolution image and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting the repaired training reconstructed image.
4. The method for segmenting and labeling human images according to claim 3, wherein the step of inputting the image set to be labeled into the optimal reconstruction network to obtain the reconstructed image corresponding to each image to be labeled specifically comprises:
inputting each image to be marked into the optimal reconstruction network, and taking the image to be marked as low-frequency information; generating auxiliary variables corresponding to the image to be marked through the optimal reconstruction network; and fusing the low-frequency information of the image to be marked and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting the restored reconstructed image.
5. The method for segmenting and labeling human images according to claim 1, wherein a reconstruction network is trained based on the low-frequency information and the high-frequency information to obtain the optimal reconstruction network, further comprising:
acquiring forward loss and reverse loss generated in the process of training and reconstructing the network;
comparing a low-resolution image obtained based on the low-frequency information with a low-resolution image obtained by subjecting a high-resolution image to image degradation processing to obtain an error between the low-resolution image and the high-resolution image, and recording the error as the forward loss;
inputting the low-resolution image obtained based on the low-frequency information into the reconstruction network to obtain a corresponding training reconstruction image, comparing the training reconstruction image with an original high-resolution image corresponding to the high-frequency information to obtain an error between the training reconstruction image and the original high-resolution image, and recording the error as the reverse loss;
and taking the corresponding reconstructed network when the forward loss and the reverse loss are minimum as the optimal reconstructed network.
6. The method for segmenting and labeling a human image according to claim 1, wherein the step of checking the obtained segmented image set and judging whether any segmented image meets preset requirements comprises the steps of:
judging whether erroneous segmentation exists between the person region and the environment region in the segmented image;

judging whether the average intersection-over-union and the average pixel accuracy of the segmented image are below preset index values;

and if no erroneous segmentation exists, and neither the average intersection-over-union nor the average pixel accuracy is below the preset index values, judging that the segmented image meets the preset requirements.
7. The method for segmenting and labeling human images as claimed in claim 1 or 6, wherein for any segmented image, if the segmented image does not meet the preset requirement, the corresponding segmented image is secondarily processed, comprising the steps of:
s601, forming a rechecking image set by the original reconstructed images corresponding to the segmented images which do not meet the preset requirements;
s602, inputting the review image set into the optimal reconstruction network, outputting a new reconstruction image, segmenting the new reconstruction image through the segmentation network, and outputting a new segmentation image set;
s603, judging whether the new segmented image set meets a preset requirement or not, and if so, outputting the corresponding new segmented image set; if the preset requirement is not met, the steps S601-S603 are repeatedly executed until the preset requirement is met.
8. The method for segmenting and labeling human images as claimed in claim 1, wherein each reconstructed image is segmented through a preset segmentation network to obtain segmented regions, the segmented regions are labeled as human regions and environmental regions, contour lines of the segmented regions after labeling are extracted, and coordinates of a plurality of point locations on the contour lines are obtained;
and drawing a corresponding contour line on the image to be marked corresponding to the segmented image based on the coordinates of the point positions on the external contour, filling the contour line based on the color preset by the user so as to mark the corresponding segmented region, and outputting a marked image.
9. A segmentation and annotation system for a figure image is characterized by comprising a data transmission module, a storage module, a first sampling module, a reconstruction module, a segmentation module and an annotation module;
the data transmission module is used for acquiring an image set to be annotated and a high-resolution image set;
the storage module is used for storing all image data and image sets generated in the segmentation process and the labeling process;
the first sampling module is used for acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
the reconstruction module is used for training a reconstruction network according to the low-frequency information and the high-frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest; acquiring the image set to be annotated from the storage module, and acquiring a reconstructed image corresponding to each image to be annotated through the optimal reconstruction network;
the segmentation module is used for segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, obtaining a corresponding segmented image and generating a segmented image set;
the marking module is used for checking the obtained segmentation image set, and if any segmentation image does not meet the preset requirement, performing secondary processing on the corresponding segmentation image; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image, and outputting a corresponding annotated image set.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the method for segmenting and labeling a human image according to any one of claims 1 to 8.
CN202210549650.6A 2022-05-20 2022-05-20 Segmentation and annotation method and system for figure image Pending CN114898096A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210549650.6A CN114898096A (en) 2022-05-20 2022-05-20 Segmentation and annotation method and system for figure image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210549650.6A CN114898096A (en) 2022-05-20 2022-05-20 Segmentation and annotation method and system for figure image

Publications (1)

Publication Number Publication Date
CN114898096A 2022-08-12

Family

ID=82722947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210549650.6A Pending CN114898096A (en) 2022-05-20 2022-05-20 Segmentation and annotation method and system for figure image

Country Status (1)

Country Link
CN (1) CN114898096A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115222739A (en) * 2022-09-20 2022-10-21 成都数之联科技股份有限公司 Defect labeling method, device, storage medium, equipment and computer program product
CN115222739B (en) * 2022-09-20 2022-12-02 成都数之联科技股份有限公司 Defect labeling method, device, storage medium, equipment and computer program product

Similar Documents

Publication Publication Date Title
CN108986050B (en) Image and video enhancement method based on multi-branch convolutional neural network
CN109299274B (en) Natural scene text detection method based on full convolution neural network
Li et al. Single image dehazing via conditional generative adversarial network
Engin et al. Cycle-dehaze: Enhanced cyclegan for single image dehazing
CN113362223B (en) Image super-resolution reconstruction method based on attention mechanism and two-channel network
CN108122197B (en) Image super-resolution reconstruction method based on deep learning
CN111340738B (en) Image rain removing method based on multi-scale progressive fusion
CN111754438B (en) Underwater image restoration model based on multi-branch gating fusion and restoration method thereof
CN111105376B (en) Single-exposure high-dynamic-range image generation method based on double-branch neural network
CN111179196B (en) Multi-resolution depth network image highlight removing method based on divide-and-conquer
CN111931857A (en) MSCFF-based low-illumination target detection method
Hu et al. Single image dehazing algorithm based on sky segmentation and optimal transmission maps
CN115578406A (en) CBCT jaw bone region segmentation method and system based on context fusion mechanism
CN114898096A (en) Segmentation and annotation method and system for figure image
CN113128517B (en) Tone mapping image mixed visual feature extraction model establishment and quality evaluation method
Xiao et al. Image hazing algorithm based on generative adversarial networks
CN113178010B (en) High-resolution image shadow region restoration and reconstruction method based on deep learning
CN117408924A (en) Low-light image enhancement method based on multiple semantic feature fusion network
CN116128768B (en) Unsupervised image low-illumination enhancement method with denoising module
CN117333359A (en) Mountain-water painting image super-resolution reconstruction method based on separable convolution network
Luo et al. A fast denoising fusion network using internal and external priors
Li et al. RGSR: A two-step lossy JPG image super-resolution based on noise reduction
CN116245861A (en) Cross multi-scale-based non-reference image quality evaluation method
CN110298809B (en) Image defogging method and device
Qu et al. LEUGAN: low-light image enhancement by unsupervised generative attentional networks

Legal Events

Code Description
PB01: Publication
SE01: Entry into force of request for substantive examination