CN114898096A - Segmentation and annotation method and system for figure image - Google Patents

Segmentation and annotation method and system for figure image

Publication number
CN114898096A
Authority
CN
China
Prior art keywords: image, frequency information, segmentation, segmented, network
Legal status (an assumption, not a legal conclusion; Google has not performed a legal analysis): Pending
Application number
CN202210549650.6A
Other languages
Chinese (zh)
Inventor
赵凯
黄涛
程雯
王先兰
Current Assignee: Individual
Original Assignee: Individual
Application filed by Individual
Priority to CN202210549650.6A
Publication of CN114898096A


Classifications

    • G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06N 3/045: Neural-network architectures; combinations of networks
    • G06N 3/084: Learning methods; backpropagation, e.g. using gradient descent
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands

Abstract

The invention relates to the field of digital image processing, and in particular to a method and system for segmenting and annotating person images, comprising the following steps: acquiring an image set to be annotated and a high-resolution image set; extracting low-frequency and high-frequency information from the high-resolution images to train a reconstruction network; inputting the image set to be annotated into the optimal reconstruction network to obtain reconstructed images; segmenting the reconstructed images through a segmentation network to generate a segmented image set; and checking the segmented image set, then annotating the original images to be annotated that meet the preset requirements to obtain an annotated image set. By extracting the low-frequency and high-frequency information of high-resolution images, the reconstruction network supplements the missing high-frequency information of each image to be annotated without altering the original image, reconstructing it into a clear image. The original segmentation algorithm need not be changed: the robustness and generalization ability of existing segmentation algorithms are enhanced from the image side, so the segmentation precision is higher and the annotated images are more accurate.

Description

Segmentation and annotation method and system for figure image
Technical Field
The invention relates to the technical field of digital image processing, in particular to a method and a system for segmenting and labeling a figure image.
Background
With the development of computer vision technology, many algorithmic networks in the image-processing field are inseparable from datasets; in image segmentation especially, the demand for high-quality segmentation datasets keeps growing. At present, most image segmentation datasets on the market are annotated manually, i.e., with software such as Hitachi segmentation and LabelMe: an annotator first traces contours on the original image to form a pre-segmentation, then manually corrects it on that basis until the annotated image is finalized. This traditional segmentation and annotation workflow is both tedious and very time-consuming; a great deal of time goes into annotation alone, and efficiency is very low.
In the prior art, some developers have introduced segmentation algorithms into image annotation to further improve efficiency while preserving annotation quality: the image to be annotated is first segmented by an algorithm, and the annotator then labels on top of the segmented result. This approach, however, has clear shortcomings and performs poorly in practice. When the image is blurred or of low resolution, the segmentation algorithm readily mis-segments; the annotator must then manually re-trace and re-label the original blurred image, so the algorithm's tendency to mis-segment is not overcome and annotation efficiency is not effectively improved.
Therefore, neither the current manual segmentation-annotation method nor the combination of manual work and algorithms solves the low efficiency of current image annotation.
Disclosure of Invention
The invention provides a method and a system for segmenting and labeling a person image, which are used for overcoming the defects in the prior art.
The invention provides a method for segmenting and annotating a person image, comprising the following steps:
S1, acquiring an image set to be annotated and a high-resolution image set;
S2, performing first sampling on the high-resolution images in the high-resolution image set to acquire the low-frequency information and high-frequency information corresponding to each high-resolution image;
S3, training a reconstruction network based on the low-frequency information and the high-frequency information; obtaining training reconstructed images through the reconstruction network, comparing each training reconstructed image with the corresponding high-resolution image, and outputting the network as the optimal reconstruction network when the peak signal-to-noise ratio and structural similarity between them are highest;
S4, inputting the image set to be annotated into the optimal reconstruction network to acquire the reconstructed image corresponding to each image to be annotated;
S5, segmenting each reconstructed image through a preset segmentation network to distinguish person regions from environment regions, acquiring the corresponding segmented images, and generating a segmented image set;
S6, checking the acquired segmented image set; if any segmented image does not meet the preset requirement, performing secondary processing on the corresponding segmented image; if the preset requirements are met, annotating the original image to be annotated based on the segmented image to obtain the corresponding annotated image set.
Preferably, the first sampling is performed on a plurality of high resolution images in the high resolution image set through wavelet transform, so as to obtain corresponding high frequency information and low frequency information.
According to the segmentation and labeling method for the figure image, provided by the invention, a reconstruction network is trained on the basis of the low-frequency information and the high-frequency information, and the method comprises the following steps:
storing the high-frequency information into an auxiliary variable and inputting the high-frequency information into the reconstruction network;
generating a corresponding low-resolution image based on the low-frequency information, and inputting the image to the reconstruction network;
and fusing the low-frequency information of the low-resolution image and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting the repaired training reconstructed image.
According to the method for segmenting and annotating a person image provided by the invention, inputting the image set to be annotated into the optimal reconstruction network and acquiring the reconstructed image corresponding to each image to be annotated specifically comprises:
inputting each image to be annotated into the optimal reconstruction network and treating it as low-frequency information; generating the auxiliary variable corresponding to the image to be annotated through the optimal reconstruction network; and fusing the low-frequency information of the image to be annotated with the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting the restored reconstructed image.
According to the segmentation and labeling method for the figure image, provided by the invention, a reconstruction network is trained on the basis of the low-frequency information and the high-frequency information to obtain the optimal reconstruction network, and the segmentation and labeling method further comprises the following steps:
acquiring the forward loss and reverse loss generated while training the reconstruction network;
comparing the low-resolution image obtained from the low-frequency information with the low-resolution image obtained by applying image degradation to the high-resolution image, and recording the error between the two as the forward loss;
inputting the low-resolution image obtained from the low-frequency information into the reconstruction network to obtain the corresponding training reconstructed image, comparing it with the original high-resolution image corresponding to the high-frequency information, and recording the error between the two as the reverse loss;
and taking the reconstruction network at which the forward loss and reverse loss are minimal as the optimal reconstruction network.
According to the method for segmenting and labeling the human image, provided by the invention, the obtained segmented image set is checked, and whether any segmented image meets the preset requirement is judged, wherein the method comprises the following steps:
judging whether the character area and the environment area in the segmented image have error segmentation;
judging whether the average intersection ratio and the average pixel precision of the segmented images are lower than preset index values or not;
and if the error segmentation does not exist, and the average intersection ratio and the average pixel precision are not lower than the preset index value, judging that the segmented image meets the preset requirement.
According to the segmentation and annotation method of the figure image, provided by the invention, for any segmentation image, if the segmentation image does not meet the preset requirement, secondary processing is carried out on the corresponding segmentation image, and the segmentation and annotation method comprises the following steps:
s601, forming a rechecking image set by the original reconstructed images corresponding to the segmented images which do not meet the preset requirements;
s602, inputting the review image set into the optimal reconstruction network, outputting a new reconstruction image, segmenting the new reconstruction image through the segmentation network, and outputting a new segmentation image set;
s603, judging whether the new segmented image set meets a preset requirement or not, and if so, outputting the corresponding new segmented image set; if the preset requirement is not met, the steps S601-S603 are repeatedly executed until the preset requirement is met.
According to the segmentation and annotation method for person images provided by the invention, each reconstructed image is segmented through the preset segmentation network to obtain segmented regions, the segmented regions are labeled as person regions and environment regions, the contour line of each labeled region is extracted, and the coordinates of a plurality of points on the contour line are acquired;
a corresponding contour line is then drawn on the image to be annotated based on the coordinates of the points on the outer contour, the contour is filled with a user-preset color to mark the corresponding segmented region, and the annotated image is output.
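The contour-extraction and color-filling step above can be sketched in NumPy. This is a minimal illustration, not the patent's implementation: `boundary_points` and `annotate` are hypothetical helper names, the 4-neighbour contour test stands in for whatever contour tracer the system actually uses, and the color/alpha values are arbitrary.

```python
import numpy as np

def boundary_points(mask):
    """Return (row, col) coordinates of contour pixels of a binary mask.

    A pixel lies on the contour if it belongs to the person region but has
    at least one 4-neighbour outside it (the mask is zero-padded at borders).
    """
    m = mask.astype(bool)
    padded = np.pad(m, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    contour = m & ~interior
    return np.argwhere(contour)

def annotate(image, mask, color=(255, 0, 0), alpha=0.5):
    """Blend a preset colour over the segmented region of the original image."""
    sel = mask.astype(bool)
    out = image.astype(np.float32).copy()
    out[sel] = (1 - alpha) * out[sel] + alpha * np.array(color, np.float32)
    return out.astype(np.uint8)

# Toy 5x5 mask with a 3x3 "person" region: 8 border pixels form the contour,
# the centre pixel is interior.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1
pts = boundary_points(mask)
```

In a real pipeline the point coordinates would be written into the annotation file, and the blended overlay shown to the annotator for review.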
In another aspect, the invention further provides a system for segmenting and annotating person images, comprising a data transmission module, a storage module, a first sampling module, a reconstruction module, a segmentation module and an annotation module;
the data transmission module is used for acquiring an image set to be annotated and a high-resolution image set;
the storage module is used for storing all image data and image sets generated in the segmentation process and the labeling process;
the first sampling module is used for acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
the reconstruction module is used for training a reconstruction network according to the low-frequency information and the high-frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest; acquiring the image set to be annotated from the storage module, and acquiring a reconstructed image corresponding to each image to be annotated through the optimal reconstruction network;
the segmentation module is used for segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, obtaining a corresponding segmented image and generating a segmented image set;
the marking module is used for checking the obtained segmentation image set, and if any segmentation image does not meet the preset requirement, performing secondary processing on the corresponding segmentation image; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image, and outputting a corresponding annotated image set.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method for segmenting and annotating a human image as described in any one of the above.
The segmentation and annotation method and system for the figure image, provided by the invention, have the following technical effects:
(1) By extracting the low-frequency and high-frequency information of high-resolution images and training the reconstruction network on them, the high-frequency information of each image to be annotated is supplemented through the reconstruction network without altering the original image, reconstructing it into a clear image. This benefits the subsequent segmentation and annotation, requires no change to the original segmentation algorithm, and enhances the robustness and generalization ability of existing segmentation algorithms from the image side, so the segmentation precision is higher and the output annotated images are more accurate.
(2) The reconstruction network and segmentation network resolve the long duration and low efficiency of existing manual annotation; in use, large batches of blurred images can be annotated after uploading only a certain amount of training data, greatly shortening annotation time and improving image-processing efficiency.
(3) The acquired segmented images can be checked repeatedly according to the user's requirements, and the images to be annotated can be reconstructed and annotated multiple times through the reconstruction and segmentation networks, improving the accuracy of the image-annotation task.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a method for segmenting and labeling a human image according to the present invention;
fig. 2 is a second flowchart of the method for segmenting and labeling a human image according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1-2, the present invention provides a method for segmenting and labeling a human image, comprising the steps of:
S1, acquiring an image set to be annotated and a high-resolution image set;
S2, performing first sampling on the high-resolution images in the high-resolution image set to acquire the low-frequency information and high-frequency information corresponding to each high-resolution image;
S3, training a reconstruction network based on the low-frequency information and the high-frequency information; obtaining training reconstructed images through the reconstruction network, comparing each training reconstructed image with the corresponding high-resolution image, and outputting the network as the optimal reconstruction network when the peak signal-to-noise ratio and structural similarity between them are highest;
S4, inputting the image set to be annotated into the optimal reconstruction network to acquire the reconstructed image corresponding to each image to be annotated;
S5, segmenting each reconstructed image through a preset segmentation network to distinguish person regions from environment regions, acquiring the corresponding segmented images, and generating a segmented image set;
S6, checking the acquired segmented image set; if any segmented image does not meet the preset requirement, performing secondary processing on the corresponding segmented image; if the preset requirements are met, annotating the original image to be annotated based on the segmented image to obtain the corresponding annotated image set;
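Step S3 selects the "optimal" network by peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). A minimal NumPy sketch of PSNR and of the checkpoint-selection logic follows; SSIM (omitted here for brevity, and available in libraries such as scikit-image) would be combined the same way. The function names and the checkpoint dictionary layout are illustrative assumptions.

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio between a reference image and a test image."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def best_checkpoint(pairs_per_ckpt):
    """Keep the network state with the highest mean PSNR over training pairs.

    pairs_per_ckpt: {checkpoint_name: [(high_res, reconstruction), ...]}
    (SSIM would be averaged in alongside PSNR in the same manner.)
    """
    scores = {name: np.mean([psnr(hr, rec) for hr, rec in pairs])
              for name, pairs in pairs_per_ckpt.items()}
    return max(scores, key=scores.get)
```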
In step S1, the user may input the image set to be annotated and the high-resolution image set used for reconstruction-network training; the image set to be annotated is specifically a set of blurred images.
Blur is caused at capture time by factors such as inaccurate focusing, relative motion between the target and the camera, or camera distortion; for example, rain or snow, dim surroundings, or noise while shooting can blur the image. It also covers blur from a surveillance camera being out of focus, relative motion between the monitored object and the camera, or low camera definition. Distance can likewise cause blur: for example, when a pedestrian is too far from the monitor, the pedestrian gait-sequence images cut from the video have very low resolution and the features in them are hard to distinguish.
High resolution and low resolution are relative concepts: the resolution of a high-resolution image is greater than that of a low-resolution image.
Optionally, the high-resolution image set may further include the segmentation annotation image corresponding to each high-resolution image, used to train the reconstruction network and the segmentation network.
preferably, a plurality of high resolution images in the high resolution image set are subjected to first sampling through wavelet transformation to obtain corresponding high frequency information and low frequency information;
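The wavelet-based first sampling can be sketched with a single-level 2-D Haar transform in plain NumPy. This is only an illustration: the patent does not fix the wavelet family, and the average-based (unnormalized) Haar variant below is an assumption chosen so that the transform is exactly invertible without extra scaling.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar wavelet transform (image sides must be even).

    Returns the low-frequency approximation LL and the three high-frequency
    detail sub-bands (LH, HL, HH), each at half the input resolution.
    """
    x = img.astype(np.float64)
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 4.0  # low-frequency content
    lh = (a + b - c - d) / 4.0  # horizontal details
    hl = (a - b + c - d) / 4.0  # vertical details
    hh = (a - b - c + d) / 4.0  # diagonal details
    return ll, (lh, hl, hh)

def haar_idwt2(ll, details):
    """Exact inverse of haar_dwt2: the first sampling loses no information."""
    lh, hl, hh = details
    a = ll + lh + hl + hh
    b = ll + lh - hl - hh
    c = ll - lh + hl - hh
    d = ll - lh - hl + hh
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    out[0::2, 0::2] = a; out[0::2, 1::2] = b
    out[1::2, 0::2] = c; out[1::2, 1::2] = d
    return out
```

The LL band plays the role of the low-frequency information and the detail bands the high-frequency information fed to the reconstruction network.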
in step S3, training a reconstruction network based on the low frequency information and the high frequency information specifically includes:
storing the high-frequency information into an auxiliary variable and inputting the high-frequency information into the reconstruction network;
generating a corresponding low-resolution image based on the low-frequency information, and inputting the image to the reconstruction network;
fusing the low-frequency information of the low-resolution image and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting a repaired training reconstructed image;
The reconstruction network is built from a plurality of invertible blocks and further separates the low-frequency and high-frequency information of the original high-resolution image, so that the low-frequency information yields a corresponding low-resolution (blurred) image through the network while the high-frequency information is embedded into an auxiliary variable.
Because the network is invertible, a forward pass separates the high-frequency and low-frequency information of the input image; a reverse pass takes a low-resolution image together with an auxiliary variable carrying its high-frequency information, recovers the low-frequency information and the corresponding high-frequency information of the low-resolution image, fuses them through second sampling processing, and outputs the high-resolution image corresponding to the low-resolution input, i.e., its repaired clear image.
in one embodiment, inputting the image set to be annotated into the optimal reconstruction network, and acquiring a reconstructed image corresponding to each image to be annotated specifically includes:
inputting each image to be marked into the optimal reconstruction network, and taking the image to be marked as low-frequency information; generating auxiliary variables corresponding to the image to be marked through the optimal reconstruction network; fusing the low-frequency information of the image to be marked and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting a restored reconstructed image;
the reconstruction network can be regarded as a reversible transformation function and is based on the following expression:
and (3) forward process:
Figure BDA0003654251690000091
the reverse process:
Figure BDA0003654251690000092
wherein a is an auxiliary variable carrying high-frequency information of the image,
Figure BDA0003654251690000093
for low-fraction images generated over a reconstruction network, x L ,x H Respectively representing low-frequency information and high-frequency information corresponding to the original high-resolution image obtained by down-sampling,
Figure BDA0003654251690000094
it is the low frequency information and the high frequency information of the low fraction image obtained by inverse reconstruction,
Figure BDA0003654251690000095
then it is the upsampled repaired high resolution image, f () represents the reconstruction network forward process, and g () represents the upsampling process.
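The patent does not specify the internal structure of the invertible blocks. As a minimal sketch, an additive coupling block (in the style of invertible architectures such as RealNVP or invertible rescaling networks) shows how a forward pass maps (x_L, x_H) to (x̂_L, a) and how the reverse pass recovers the inputs exactly; the tanh sub-network, feature size, and weights below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))

def phi(v, W):
    """A small nonlinear sub-network (here a single tanh layer)."""
    return np.tanh(v @ W)

def couple_forward(x_l, x_h):
    """Additive coupling step: maps (x_L, x_H) to (x_hat_L, a) invertibly."""
    x_hat_l = x_l + phi(x_h, W1)
    a = x_h + phi(x_hat_l, W2)
    return x_hat_l, a

def couple_inverse(x_hat_l, a):
    """Exact inverse: recovers (x_L, x_H) from (x_hat_L, a)."""
    x_h = a - phi(x_hat_l, W2)
    x_l = x_hat_l - phi(x_h, W1)
    return x_l, x_h
```

Stacking several such blocks gives an f that is invertible by construction, which is what lets the same network both degrade (forward) and repair (reverse) an image.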
Preferably, during training of the reconstruction network, the error losses may be used as the criterion for training convergence, and the network is optimized over multiple training rounds, specifically including:
acquiring the forward loss and reverse loss generated while training the reconstruction network;
for the forward loss, the low-frequency and high-frequency information are fed forward through the reconstruction network to obtain the low-resolution image corresponding to the high-resolution image; to better simulate a real low-resolution blurred image, the original high-resolution image is also put directly through image degradation to obtain a degraded low-resolution blurred image, and the error between the low-resolution images obtained in these two ways is called the forward loss of the reconstruction network;
the reverse loss comes from reversely reconstructing the low-resolution blurred image generated in the forward process: the low-resolution blurred image and the auxiliary variable are input into the reconstruction network and then up-sampled to obtain the repaired high-resolution image, i.e., the training reconstructed image; the error between the repaired high-resolution image and the original high-resolution image is called the reverse loss.
The reconstruction network at which the forward and reverse losses are minimal is taken as the optimal reconstruction network;
during training, a reconstructed image generated by the not-yet-converged network is a training reconstructed image;
the image degradation process performs bicubic-interpolation down-sampling on the high-resolution image to obtain a low-resolution image, and then adds noise to the low-resolution image;
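The degradation step can be sketched as follows. Note an assumption: the patent specifies bicubic down-sampling, but the dependency-free stand-in below uses 2x average pooling in its place, followed by Gaussian noise; the scale, noise level, and function name are illustrative.

```python
import numpy as np

def degrade(hr, scale=2, noise_sigma=5.0, seed=0):
    """Down-sample a high-resolution image and add Gaussian noise.

    Average pooling stands in for the patent's bicubic down-sampling;
    the added noise simulates a real low-resolution blurred image.
    """
    h = hr.shape[0] // scale * scale
    w = hr.shape[1] // scale * scale
    x = hr[:h, :w].astype(np.float64)
    lr = x.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    noise = np.random.default_rng(seed).normal(0.0, noise_sigma, lr.shape)
    return np.clip(lr + noise, 0, 255)
```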
Further, the training loss of the reconstruction network can be expressed as:

loss_total = λ₁ · loss_Forw + λ₂ · loss_Reve

where loss_Forw is the forward loss of the reconstruction network, loss_Reve its reverse loss, and λ₁, λ₂ the corresponding coefficients.

Further, loss_Forw can be obtained by:

loss_Forw = l_forw(x̂_L, x̌_L)

and loss_Reve by:

loss_Reve = l_reve(x̂, x)

where l_forw is the forward loss-value function, l_reve the reverse loss-value function, x̂_L the low-resolution image generated by the reconstruction network, x̌_L the low-resolution blurred image obtained from the high-resolution image x through the degradation process, and x̂ the repaired high-resolution image;
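A minimal NumPy sketch of the combined training loss described above. The concrete forms of l_forw and l_reve (taken here as L1 and L2 respectively) and the coefficient values are assumptions, since the patent leaves them unspecified.

```python
import numpy as np

def l1(a, b):
    """Mean absolute error, used here as the forward loss-value function."""
    return np.mean(np.abs(a - b))

def l2(a, b):
    """Mean squared error, used here as the reverse loss-value function."""
    return np.mean((a - b) ** 2)

def total_loss(lr_pred, lr_degraded, hr_pred, hr_orig, lam1=1.0, lam2=1.0):
    """loss_total = lam1 * loss_Forw + lam2 * loss_Reve.

    loss_Forw compares the network's low-resolution output with the degraded
    low-resolution image; loss_Reve compares the repaired high-resolution
    image with the original high-resolution image.
    """
    loss_forw = l1(lr_pred, lr_degraded)
    loss_reve = l2(hr_pred, hr_orig)
    return lam1 * loss_forw + lam2 * loss_reve
```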
Further, in step S4, the blurred image set is repaired with the optimal reconstruction network: specifically, the missing high-frequency variable is supplemented by the optimal reconstruction network, and the high-frequency information and the blurred images to be annotated are input together into the optimal reconstruction network for inverse reconstruction, yielding the clear image set corresponding to the blurred set to be annotated, i.e., the reconstructed images (the repaired blurred image set);
further, in step S5, each reconstructed image is segmented through the preset segmentation network to distinguish person regions from environment regions, the corresponding segmented images are acquired, and a segmented image set is generated;
the segmentation network includes, but is not limited to, one or a combination of mainstream segmentation algorithms such as DeepLabv3+ and Mask R-CNN;
preferably, the method further comprises a segmentation-network training stage before the repaired blurred image set is input into the segmentation network model: the segmentation network is trained with the high-resolution training set uploaded by the user and the corresponding segmentation annotation images to obtain the trained segmentation network model.
In one embodiment, the step of checking the acquired segmented image set to determine whether any segmented image meets preset requirements includes:
judging whether erroneous segmentation exists between the person region and the environment region in the segmented image;

judging whether the average intersection-over-union and the average pixel accuracy of the segmented image are below preset index values;

if no erroneous segmentation exists, and neither the average intersection-over-union nor the average pixel accuracy is below the preset index values, judging that the segmented image meets the preset requirements;
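The two quantitative checks above can be sketched as follows. Mean intersection-over-union and mean pixel accuracy are standard segmentation metrics; the thresholds below are hypothetical stand-ins for the patent's "preset index values":

```python
import numpy as np

# Sketch of the acceptance checks over the two classes
# (0 = environment region, 1 = person region).

def mean_iou(pred, gt, num_classes=2):
    """Mean intersection-over-union across classes present in the union."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union:
            ious.append(inter / union)
    return float(np.mean(ious))

def mean_pixel_accuracy(pred, gt, num_classes=2):
    """Per-class pixel accuracy, averaged over classes present in gt."""
    accs = []
    for c in range(num_classes):
        mask = gt == c
        if mask.any():
            accs.append((pred[mask] == c).mean())
    return float(np.mean(accs))

def meets_requirements(pred, gt, miou_thresh=0.7, mpa_thresh=0.8):
    """Hypothetical thresholds standing in for the preset index values."""
    return (mean_iou(pred, gt) >= miou_thresh
            and mean_pixel_accuracy(pred, gt) >= mpa_thresh)
```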
further, if any segmented image does not meet the preset requirement, secondary processing is carried out on the corresponding segmented image, and the method comprises the following steps:
s601, forming a rechecking image set by the original reconstructed images corresponding to the segmented images which do not meet the preset requirements;
s602, inputting the review image set into the optimal reconstruction network, outputting a new reconstruction image, segmenting the new reconstruction image through the segmentation network, and outputting a new segmentation image set;
s603, judging whether the new segmented image set meets a preset requirement or not, and if so, outputting the corresponding new segmented image set; if the preset requirement is not met, the steps S601-S603 are repeatedly executed until the preset requirement is met;
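Steps S601-S603 can be sketched as a retry loop. All callables and the pass limit below are hypothetical stand-ins, including the manual-processing fallback the specification describes for images that keep failing:

```python
# Pure-Python sketch of the secondary-processing loop (S601-S603).
# reconstruct, segment, and passes_check are hypothetical stand-ins
# for the optimal reconstruction network, the segmentation network,
# and the preset-requirement check.

def secondary_process(failed, reconstruct, segment, passes_check, max_passes=3):
    """failed: reconstructed images whose segmentations failed the check.
    Returns (images, needs_manual)."""
    review_set = list(failed)                                  # S601: review image set
    for _ in range(max_passes):
        review_set = [reconstruct(img) for img in review_set]  # S602: re-reconstruct
        segmented = [segment(img) for img in review_set]       # ...and re-segment
        if all(passes_check(s) for s in segmented):            # S603: check again
            return segmented, False
    # Pass limit exceeded: hand the remaining images over to manual processing.
    return segmented, True
```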
specifically, in one example, before the segmented images are annotated they may be examined by a user or by an algorithm, and images that still fail the preset requirements after the first reconstruction are screened out to form a new image set A for secondary processing;

before the secondary processing, the method further comprises an image retrieval stage over the original image set to be processed: according to the image set A of unqualified images screened by the user, the reconstructed images generated from the image set to be processed during the first repair are traversed, the reconstructed image corresponding to each unqualified image in set A is selected, and a new image set B is formed;

the image set B is then input into the optimal reconstruction network again for secondary processing, including secondary reconstruction and segmentation;
optionally, the user may set the number of reconstruction passes as needed, until a segmented image is obtained that meets the requirements: no erroneous segmentation, and neither the average intersection-over-union nor the average pixel accuracy below the preset index values;
optionally, a threshold may be set on the number of reconstruction passes; an image that still fails the requirements after repeated reconstruction and re-segmentation once this threshold is exceeded is handed over to manual segmentation. Alternatively, the original reconstruction network and the segmentation network may be retrained on an expanded training sample set, thereby improving the accuracy of both networks;

after the images that exceed the reconstruction-pass limit and still fail the preset requirements have been processed manually, the original images together with the manually processed results are used as a supplementary training set for retraining the reconstruction network;
further, each reconstructed image is segmented by the preset segmentation network to obtain segmented regions; the segmented regions are labeled as human-body regions and environment regions, the contour lines of the labeled segmented regions are extracted, and the coordinates of a plurality of points on each contour line are obtained;

based on the coordinates of the points on the outer contour, the corresponding contour line is drawn on the image to be annotated that corresponds to the segmented image, the contour is filled with a user-preset color so as to annotate the corresponding segmented region, and the annotated image is output;

optionally, the user may set the types of segmented regions as needed so as to segment different regions in the image, select the required color for each type, fill the contour range of a segmented region with the preset color, and fill the region outside the contour with black;

optionally, before the contour of the target segmented region is extracted, the region outside the segmented region is changed to a black background; in a specific implementation, contour extraction may be performed with a function such as findContours() in OpenCV, generating a black-and-white contour image together with the coordinate values of the contour positions.
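The specification names OpenCV's findContours() for this step; to keep the sketch dependency-free, the example below locates boundary-point coordinates of a binary person mask directly (a foreground pixel lies on the contour if any of its 4-neighbors falls outside the mask) and fills the annotated region with a preset color against a black background. The color values are illustrative:

```python
import numpy as np

# Dependency-free stand-in for cv2.findContours(): boundary pixels of a
# binary mask, plus the color-fill step described in the specification.

def contour_points(mask):
    """Return (row, col) coordinates of the mask's boundary pixels."""
    m = mask.astype(bool)
    padded = np.pad(m, 1, constant_values=False)
    # A pixel is interior if all four 4-neighbors are also foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    boundary = m & ~interior
    return list(zip(*np.nonzero(boundary)))

def fill_annotation(shape, mask, fg_color=255, bg_color=0):
    """Fill the segmented region with a preset color; black out the rest."""
    out = np.full(shape, bg_color, dtype=np.uint8)
    out[mask.astype(bool)] = fg_color
    return out
```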
On the other hand, the invention also provides a segmentation and annotation system of the figure image, which is characterized by comprising a data transmission module, a storage module, a first sampling module, a reconstruction module, a segmentation module and an annotation module;
the data transmission module is used for acquiring an image set to be annotated and a high-resolution image set;
the storage module is used for storing all image data and image sets generated in the segmentation process and the labeling process;
the first sampling module is used for acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
the reconstruction module is used for training a reconstruction network according to the low-frequency information and the high-frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest; acquiring the image set to be annotated from the storage module, and acquiring a reconstructed image corresponding to each image to be annotated through the optimal reconstruction network;
the segmentation module is used for segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, obtaining a corresponding segmented image and generating a segmented image set;
the marking module is used for checking the obtained segmentation image set, and if any segmentation image does not meet the preset requirement, performing secondary processing on the corresponding segmentation image; if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image, and outputting a corresponding annotated image set;
optionally, the user may finally download the annotated images to a local device through the data transmission module, completing the annotation task.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method for segmenting and labeling a human image provided by the above methods, including:
s1, acquiring an image set to be annotated and a high-resolution image set;
s2, carrying out first sampling on a plurality of high-resolution images in the high-resolution image set, and acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
s3 training a reconstruction network based on the low frequency information and the high frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest;
s4, inputting the image set to be annotated into the optimal reconstruction network, and acquiring a reconstruction image corresponding to each image to be annotated;
s5, segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, acquiring a corresponding segmented image, and generating a segmented image set;
s6, the obtained segmentation image set is checked, and if any segmentation image does not meet the preset requirement, the corresponding segmentation image is subjected to secondary processing; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image to obtain a corresponding annotated image set.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor is implemented to execute the method for segmenting and labeling a human image provided above, including:
s1, acquiring an image set to be annotated and a high-resolution image set;
s2, carrying out first sampling on a plurality of high-resolution images in the high-resolution image set, and acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
s3 training a reconstruction network based on the low frequency information and the high frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest;
s4, inputting the image set to be annotated into the optimal reconstruction network, and acquiring a reconstruction image corresponding to each image to be annotated;
s5, segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, acquiring a corresponding segmented image, and generating a segmented image set;
s6, the obtained segmentation image set is checked, and if any segmentation image does not meet the preset requirement, the corresponding segmentation image is subjected to secondary processing; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image to obtain a corresponding annotated image set.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A segmentation and annotation method for a human image is characterized by comprising the following steps:
s1, acquiring an image set to be annotated and a high-resolution image set;
s2, carrying out first sampling on a plurality of high-resolution images in the high-resolution image set, and acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
s3 training a reconstruction network based on the low frequency information and the high frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest;
s4, inputting the image set to be annotated into the optimal reconstruction network, and acquiring a reconstruction image corresponding to each image to be annotated;
s5, segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, acquiring a corresponding segmented image, and generating a segmented image set;
s6, the obtained segmentation image set is checked, and if any segmentation image does not meet the preset requirement, the corresponding segmentation image is subjected to secondary processing; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image to obtain a corresponding annotated image set.
2. The method of claim 1, wherein the high-frequency information and the low-frequency information are obtained by performing a first sampling on a plurality of high-resolution images in the high-resolution image set through wavelet transform.
3. The method of claim 1, wherein training a reconstruction network based on the low frequency information and the high frequency information comprises:
storing the high-frequency information into an auxiliary variable and inputting the high-frequency information into the reconstruction network;
generating a corresponding low-resolution image based on the low-frequency information, and inputting the image to the reconstruction network;
and fusing the low-frequency information of the low-resolution image and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting the repaired training reconstructed image.
4. The method for segmenting and labeling human images according to claim 3, wherein the step of inputting the image set to be labeled into the optimal reconstruction network to obtain the reconstructed image corresponding to each image to be labeled specifically comprises:
inputting each image to be marked into the optimal reconstruction network, and taking the image to be marked as low-frequency information; generating auxiliary variables corresponding to the image to be marked through the optimal reconstruction network; and fusing the low-frequency information of the image to be marked and the high-frequency information in the corresponding auxiliary variable through second sampling processing, and outputting the restored reconstructed image.
5. The method for segmenting and labeling human images according to claim 1, wherein a reconstruction network is trained based on the low-frequency information and the high-frequency information to obtain the optimal reconstruction network, further comprising:
acquiring forward loss and reverse loss generated in the process of training and reconstructing the network;
comparing a low-resolution image obtained based on the low-frequency information with a low-resolution image obtained by subjecting a high-resolution image to image degradation processing to obtain an error between the low-resolution image and the high-resolution image, and recording the error as the forward loss;
inputting the low-resolution image obtained based on the low-frequency information into the reconstruction network to obtain a corresponding training reconstruction image, comparing the training reconstruction image with an original high-resolution image corresponding to the high-frequency information to obtain an error between the training reconstruction image and the original high-resolution image, and recording the error as the reverse loss;
and taking the corresponding reconstructed network when the forward loss and the reverse loss are minimum as the optimal reconstructed network.
6. The method for segmenting and labeling a human image according to claim 1, wherein the step of checking the obtained segmented image set and judging whether any segmented image meets preset requirements comprises the steps of:
judging whether erroneous segmentation exists between the person region and the environment region in the segmented image;

judging whether the average intersection-over-union and the average pixel accuracy of the segmented image are below preset index values;

and if no erroneous segmentation exists, and neither the average intersection-over-union nor the average pixel accuracy is below the preset index values, judging that the segmented image meets the preset requirements.
7. The method for segmenting and labeling human images as claimed in claim 1 or 6, wherein for any segmented image, if the segmented image does not meet the preset requirement, the corresponding segmented image is secondarily processed, comprising the steps of:
s601, forming a rechecking image set by the original reconstructed images corresponding to the segmented images which do not meet the preset requirements;
s602, inputting the review image set into the optimal reconstruction network, outputting a new reconstruction image, segmenting the new reconstruction image through the segmentation network, and outputting a new segmentation image set;
s603, judging whether the new segmented image set meets a preset requirement or not, and if so, outputting the corresponding new segmented image set; if the preset requirement is not met, the steps S601-S603 are repeatedly executed until the preset requirement is met.
8. The method for segmenting and labeling human images as claimed in claim 1, wherein each reconstructed image is segmented through a preset segmentation network to obtain segmented regions, the segmented regions are labeled as human regions and environmental regions, contour lines of the segmented regions after labeling are extracted, and coordinates of a plurality of point locations on the contour lines are obtained;
and drawing a corresponding contour line on the image to be marked corresponding to the segmented image based on the coordinates of the point positions on the external contour, filling the contour line based on the color preset by the user so as to mark the corresponding segmented region, and outputting a marked image.
9. A segmentation and annotation system for a figure image is characterized by comprising a data transmission module, a storage module, a first sampling module, a reconstruction module, a segmentation module and an annotation module;
the data transmission module is used for acquiring an image set to be annotated and a high-resolution image set;
the storage module is used for storing all image data and image sets generated in the segmentation process and the labeling process;
the first sampling module is used for acquiring low-frequency information and high-frequency information corresponding to each high-resolution image;
the reconstruction module is used for training a reconstruction network according to the low-frequency information and the high-frequency information; obtaining a training reconstructed image through reconstruction network processing, comparing the training reconstructed image with a corresponding high-resolution image, and outputting the corresponding reconstruction network as an optimal reconstruction network when the peak signal-to-noise ratio and the structural similarity of the training reconstructed image and the corresponding high-resolution image are highest; acquiring the image set to be annotated from the storage module, and acquiring a reconstructed image corresponding to each image to be annotated through the optimal reconstruction network;
the segmentation module is used for segmenting each reconstructed image through a preset segmentation network, distinguishing a person region and an environment region, obtaining a corresponding segmented image and generating a segmented image set;
the marking module is used for checking the obtained segmentation image set, and if any segmentation image does not meet the preset requirement, performing secondary processing on the corresponding segmentation image; and if the preset requirements are met, carrying out annotation processing on the original image to be annotated based on the segmented image, and outputting a corresponding annotated image set.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the method for segmenting and labeling a human image according to any one of claims 1 to 8.
CN202210549650.6A 2022-05-20 2022-05-20 Segmentation and annotation method and system for figure image Pending CN114898096A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210549650.6A CN114898096A (en) 2022-05-20 2022-05-20 Segmentation and annotation method and system for figure image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210549650.6A CN114898096A (en) 2022-05-20 2022-05-20 Segmentation and annotation method and system for figure image

Publications (1)

Publication Number Publication Date
CN114898096A 2022-08-12

Family

ID=82722947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210549650.6A Pending CN114898096A (en) 2022-05-20 2022-05-20 Segmentation and annotation method and system for figure image

Country Status (1)

Country Link
CN (1) CN114898096A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115222739A (en) * 2022-09-20 2022-10-21 成都数之联科技股份有限公司 Defect labeling method, device, storage medium, equipment and computer program product
CN115222739B (en) * 2022-09-20 2022-12-02 成都数之联科技股份有限公司 Defect labeling method, device, storage medium, equipment and computer program product

Similar Documents

Publication Publication Date Title
CN108986050B (en) Image and video enhancement method based on multi-branch convolutional neural network
CN109299274B (en) Natural scene text detection method based on full convolution neural network
Li et al. Single image dehazing via conditional generative adversarial network
Engin et al. Cycle-dehaze: Enhanced cyclegan for single image dehazing
CN113362223B (en) Image super-resolution reconstruction method based on attention mechanism and two-channel network
CN108122197B (en) Image super-resolution reconstruction method based on deep learning
CN111340738B (en) Image rain removing method based on multi-scale progressive fusion
CN111754438B (en) Underwater image restoration model based on multi-branch gating fusion and restoration method thereof
CN111105376B (en) Single-exposure high-dynamic-range image generation method based on double-branch neural network
CN111179196B (en) Multi-resolution depth network image highlight removing method based on divide-and-conquer
CN111931857A (en) MSCFF-based low-illumination target detection method
Hu et al. Single image dehazing algorithm based on sky segmentation and optimal transmission maps
CN115578406A (en) CBCT jaw bone region segmentation method and system based on context fusion mechanism
CN114898096A (en) Segmentation and annotation method and system for figure image
CN113128517B (en) Tone mapping image mixed visual feature extraction model establishment and quality evaluation method
Xiao et al. Image hazing algorithm based on generative adversarial networks
CN113178010B (en) High-resolution image shadow region restoration and reconstruction method based on deep learning
CN117408924A (en) Low-light image enhancement method based on multiple semantic feature fusion network
CN116128768B (en) Unsupervised image low-illumination enhancement method with denoising module
CN117333359A (en) Mountain-water painting image super-resolution reconstruction method based on separable convolution network
Luo et al. A fast denoising fusion network using internal and external priors
Li et al. RGSR: A two-step lossy JPG image super-resolution based on noise reduction
CN116245861A (en) Cross multi-scale-based non-reference image quality evaluation method
CN110298809B (en) Image defogging method and device
Qu et al. LEUGAN: low-light image enhancement by unsupervised generative attentional networks

Legal Events

Code Description
PB01: Publication
SE01: Entry into force of request for substantive examination