CN116385466A - Method and system for dividing targets in image based on boundary box weak annotation - Google Patents
- Publication number
- CN116385466A (application CN202310494738.7A)
- Authority
- CN
- China
- Prior art keywords
- pseudo
- pixel
- segmentation
- labels
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a method and a system for segmenting targets in images based on bounding-box weak annotation, relating to the technical field of image segmentation and comprising the following steps: acquiring an image data set and performing bounding-box annotation on it; generating pixel-level pseudo-labels with confidence based on the bounding-box annotations and training a target segmentation model according to the pixel-level pseudo-labels; iteratively optimizing the pixel-level pseudo-labels based on cross-validation and training the target segmentation model with the optimized pseudo-labels; and outputting the optimal model. Compared with pixel-level manual annotation, the method requires no time-consuming and labor-intensive pixel-level labels, uses only simple bounding-box annotations, and saves manpower and material resources. The segmentation performance of the model is improved by simultaneously learning the two tasks of segmentation-map output and bounding-box prediction, and by fusing segmentation maps at the global and local scales. Through the cross-validation pseudo-label noise detection method, erroneous annotations in the pseudo-labels are detected, their influence on model training is reduced, and the accuracy of target segmentation in images is improved.
Description
Technical Field
The invention relates to the technical field of image segmentation, and in particular to a method and a system for segmenting targets in an image based on bounding-box weak annotation.
Background
At present, target segmentation in images is widely applied to image content understanding and scene analysis tasks and has broad application value. In recent years, deep convolutional neural networks have greatly improved target segmentation performance. Among existing methods for segmenting targets in images, the better-performing ones are fully supervised methods based on deep learning, but they require a large amount of manually annotated pixel-level data; the annotation cost is therefore too high, and the methods are difficult to generalize to other application scenarios.
Therefore, a method for segmenting targets in images based on bounding-box weak annotation is provided: it requires no time-consuming pixel-level annotation data, only the bounding boxes of all targets in the image need to be annotated, and a target segmentation model is finally learned from these bounding boxes to perform target segmentation. How to achieve this is a problem to be urgently solved by those skilled in the art.
Disclosure of Invention
In view of the above, the invention provides a method and a system for segmenting targets in an image based on bounding-box weak annotation, which require no time-consuming pixel-level annotation data: only the bounding boxes of all targets in the image are annotated, multi-level pseudo-labels are then generated and optimized through a neural network, and a target segmentation model is finally learned. To achieve the above purpose, the invention adopts the following technical scheme:
a method for dividing targets in an image based on weak annotation of a bounding box comprises the following steps:
acquiring an image data set, and marking a boundary box of the image data set;
generating a pixel-level pseudo label with confidence based on the boundary box label;
training a target segmentation model according to the pixel-level pseudo-labels;
optimizing pixel-level pseudo labels based on cross verification iteration, and training a target segmentation model through the optimized pixel-level pseudo labels to obtain an optimal model;
and inputting the image to be segmented into the optimal model for target segmentation to obtain a segmentation result.
Optionally, performing bounding-box annotation on the image data set includes: annotating the minimum circumscribed rectangular box of each target in the image data set as its bounding box, and merging the bounding boxes of multiple targets into one bounding box when they overlap, wherein the targets lie inside the bounding box and the background region lies outside it.
Optionally, processing the bounding box label to obtain a specific position of the object in the bounding box.
Optionally, the specific steps of processing the bounding box label are as follows:
marking the input image and the boundary box through a color space;
dividing the color space by using a GrabCut algorithm, and generating a pseudo mark according to a division result;
calculating a first segmentation confidence according to the consistency of the segmentation results over the pixels in the pseudo-labels;
calculating a second segmentation confidence according to the area ratio of the minimum circumscribed rectangle to the boundary box mark in the segmentation result;
and coupling the first segmentation confidence and the second segmentation confidence to obtain the confidence of the pseudo-label.
Optionally, the training the target segmentation model according to the pixel-level pseudo-labeling includes:
acquiring an image frame, marking a boundary frame, and marking a pixel level pseudo mark;
constructing a deep neural network model;
training a deep neural network model for target segmentation under the supervision of the pixel-level pseudo-labels;
and constructing a loss function through the loss between the boundary box label and the predicted boundary box, the loss between the global segmentation result and the pseudo label, the loss between the local segmentation result and the pseudo label and the confidence level of the pseudo label to guide the neural network to learn.
Optionally, the optimizing the pixel-level pseudo-labeling based on the cross-validation iteration includes:
randomly dividing the data set T and the pseudo-labels S into two parts T_1, S_1 and T_2, S_2, and training two network models D(T_1, S_1) and D(T_2, S_2);
using the learned network model D(T_2, S_2) to infer on the data set T_1 and obtain the result W_1; computing the degree of difference between the pseudo-labels S_1 and the inferred result W_1, regarding a sample as noise when the degree of difference is greater than a certain threshold, and marking it as noise in S_1; likewise performing noise detection on the pseudo-labels S_2 and marking the noise in S_2;
updating the pseudo-labels S_1 and S_2, merging the two parts of data, and repeating the above steps iteratively until no noise is detected.
Optionally, the method comprises the following steps: replacing the pixel-level pseudo-labels with the optimized pixel-level pseudo-labels, training the deep neural network model for target segmentation under the supervision of the optimized pseudo-labels, iterating for N rounds until the model converges, and outputting the optimal model.
Optionally, a target segmentation system in an image based on a weak label of a bounding box includes:
the acquisition module is used for: for acquiring an image dataset;
and the marking module is used for: the method comprises the steps of performing boundary box labeling on the image data set, and generating pixel-level pseudo labels with confidence on the basis of the boundary box labeling;
the segmentation model construction module: training a target segmentation model according to the pixel-level pseudo-labels;
training module: optimizing pixel-level pseudo labels based on cross verification iteration, and training a target segmentation model through the optimized pixel-level pseudo labels to obtain an optimal model;
and an output module: and inputting the image to be segmented into the optimal model for target segmentation to obtain a segmentation result.
Compared with the prior art, the method and the system for dividing the target in the image based on the weak annotation of the bounding box have the following beneficial effects:
Compared with pixel-level manual annotation, the method for segmenting targets in images based on bounding-box weak annotation requires no time-consuming and labor-intensive pixel-level labels, uses only simple bounding-box annotations, is easier to extend to new application scenarios, and saves a large amount of manpower and material resources. Pixel-level pseudo-label segmentation results generated in different ways, together with their mutual consistency, yield pseudo-labels with confidence that guide the neural network's learning. By simultaneously learning the target segmentation task and the bounding-box prediction task, and by predicting segmentation results at both the global and local scales, the segmentation performance of the model is improved. Through the cross-validation pseudo-label noise detection method, erroneous annotations in the pseudo-labels are detected, their influence on model training is reduced, and the accuracy of target segmentation in images is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a method for dividing targets in an image based on weak annotation of a bounding box.
Fig. 2 is a schematic diagram of bounding box labeling provided in the present invention.
FIG. 3 is a schematic diagram of pseudo label generation with confidence provided by the present invention.
Fig. 4 is a flowchart of cross-validation-based noise detection provided by the present invention.
Fig. 5 is a schematic diagram of a target segmentation model structure provided by the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention discloses a target segmentation method in an image based on a boundary box weak annotation, which comprises the following steps:
acquiring an image data set, and marking a boundary box of the image data set;
generating a pixel-level pseudo label with confidence based on the boundary box label;
training a target segmentation model according to the pixel-level pseudo-labels;
optimizing pixel-level pseudo labels based on cross verification iteration, and training a target segmentation model through the optimized pixel-level pseudo labels to obtain an optimal model;
and inputting the image to be segmented into the optimal model for target segmentation to obtain a segmentation result.
Further, as shown in fig. 2, the image data set is annotated with bounding boxes. Pixel-level annotation has the limitation that its cost is too high and it is difficult to generalize to other application scenarios. Simple bounding-box annotation is therefore adopted: the minimum circumscribed rectangular box of each target in the picture is used as its bounding-box annotation, and the bounding boxes of multiple targets are merged into one if they overlap. The red rectangular box in fig. 2 shows an example of the bounding-box annotation used by the method.
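As a minimal sketch of this annotation rule (function and variable names are illustrative, not from the patent), overlapping boxes in (x1, y1, x2, y2) form can be merged until no pair overlaps:

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), inclusive corners

def boxes_overlap(a: Box, b: Box) -> bool:
    # True when the two rectangles intersect (or touch)
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def merge_overlapping(boxes: List[Box]) -> List[Box]:
    """Repeatedly merge overlapping bounding boxes into their common
    circumscribed rectangle, as the annotation rule above requires."""
    boxes = [tuple(b) for b in boxes]
    changed = True
    while changed:
        changed = False
        out: List[Box] = []
        for b in boxes:
            for i, m in enumerate(out):
                if boxes_overlap(b, m):
                    out[i] = (min(b[0], m[0]), min(b[1], m[1]),
                              max(b[2], m[2]), max(b[3], m[3]))
                    changed = True
                    break
            else:
                out.append(b)
        boxes = out
    return boxes
```

The fixpoint loop handles chains where merging two boxes creates a new overlap with a third.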
Further, the bounding box labels are processed to obtain specific positions of the objects in the bounding box.
Further, the specific steps of processing the bounding box label are as follows:
marking the input image and the boundary box through a color space;
dividing the color space by using a GrabCut algorithm, and generating a pseudo mark according to a division result;
calculating a first segmentation confidence according to the consistency of the segmentation results over the pixels in the pseudo-labels;
calculating a second segmentation confidence according to the area ratio of the minimum circumscribed rectangle to the boundary box mark in the segmentation result;
and coupling the first segmentation confidence and the second segmentation confidence to obtain the confidence of the pseudo-label.
According to the definition of the bounding-box annotation, the region outside the bounding box must be background, and the target lies inside the box; however, the annotation gives no pixel-level location of the target. The method therefore first generates pseudo-labels from the bounding-box annotations, using segmentation results obtained with the GrabCut algorithm.
For an input image I and a bounding-box annotation B, the input image is represented in two color spaces, LAB and RGB, and the GrabCut algorithm is applied in each, yielding two segmentation results S_LAB and S_RGB. Based on these two results, a pseudo-label S is generated as in formula (1): for each pixel i, if both segmentation results are 1, the pseudo-label is 1; if both are 0, the pseudo-label is 0; otherwise, when the two results disagree, the pseudo-label is 0.5:

S_i = 1 if S_LAB,i = S_RGB,i = 1; S_i = 0 if S_LAB,i = S_RGB,i = 0; S_i = 0.5 otherwise. (1)
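A minimal numpy sketch of this fusion rule (names are illustrative; in practice the two binary maps would come from cv2.grabCut runs on the LAB and RGB representations of the image):

```python
import numpy as np

def fuse_pseudo_label(seg_lab: np.ndarray, seg_rgb: np.ndarray) -> np.ndarray:
    """Fuse two binary segmentations into a pseudo-label with values
    1 (both foreground), 0 (both background), and 0.5 (disagreement)."""
    s = np.full(seg_lab.shape, 0.5)
    s[(seg_lab == 1) & (seg_rgb == 1)] = 1.0
    s[(seg_lab == 0) & (seg_rgb == 0)] = 0.0
    return s
```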
In the pseudo-label S, pixels with value 1 or 0 indicate that the two segmentation results agree, while pixels with value 0.5 indicate that they disagree. The proportion of pixels valued 1 or 0 in the pseudo-label therefore measures the consistency of the two segmentation results, and more consistent results are more reliable. This proportion can thus be taken as a confidence for the segmentation result:

C_LAB-RGB = (number of pixels with S_i ≠ 0.5) / (total number of pixels) (2)
A confidence can also be computed for each individual segmentation result. For the two segmentation results S_LAB and S_RGB obtained in the LAB and RGB color spaces, the area ratio between the minimum circumscribed rectangle of each segmentation result and the bounding-box annotation B is computed, giving the ratios C_LAB and C_RGB. The closer a ratio is to 1, the higher the confidence; conversely, the lower it is.
Combining these results gives the confidence of the pseudo-label S:

C = C_LAB-RGB × C_LAB × C_RGB (4)
Fig. 3 is a schematic diagram of pseudo-label generation with confidence. From left to right it shows the image with its bounding box, the LAB segmentation result, the RGB segmentation result, and the pseudo-label. The red rectangular box is the bounding-box annotation; the LAB and RGB segmentation results are those of the GrabCut algorithm in the LAB and RGB color spaces, respectively, and the yellow box is the minimum circumscribed rectangle of each segmentation result. The pseudo-label and its confidence are generated from the two segmentation results; white, gray and black in the pseudo-label represent 1, 0.5 and 0, respectively.
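The confidence computation described above (the consistency proportion and the two area ratios, combined as in formula (4)) can be sketched in numpy as follows; folding the area ratio into [0, 1] via min(r, 1/r) is an assumption of this sketch, since the patent only states that ratios closer to 1 mean higher confidence:

```python
import numpy as np

def consistency_confidence(pseudo: np.ndarray) -> float:
    # proportion of pixels valued 1 or 0, i.e. where the two GrabCut
    # segmentations agreed -- corresponds to C_LAB-RGB
    return float(np.mean(pseudo != 0.5))

def area_ratio_confidence(seg: np.ndarray, bbox) -> float:
    # area ratio between the segmentation's minimum circumscribed
    # rectangle and the annotated bounding box, folded into [0, 1]
    ys, xs = np.nonzero(seg)
    if xs.size == 0:
        return 0.0
    rect_area = (xs.max() - xs.min() + 1) * (ys.max() - ys.min() + 1)
    x1, y1, x2, y2 = bbox
    box_area = (x2 - x1 + 1) * (y2 - y1 + 1)
    r = rect_area / box_area
    return float(min(r, 1.0 / r))

def pseudo_label_confidence(pseudo, seg_lab, seg_rgb, bbox) -> float:
    # formula (4): C = C_LAB-RGB * C_LAB * C_RGB
    return (consistency_confidence(pseudo)
            * area_ratio_confidence(seg_lab, bbox)
            * area_ratio_confidence(seg_rgb, bbox))
```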
Further, the training the target segmentation model according to the pixel-level pseudo-labeling includes:
Under the supervision of the pixel-level pseudo-labels, a deep neural network model for target segmentation is trained; the network structure is shown in fig. 5. The input data comprise an image I, a bounding-box annotation B, and a pixel-level pseudo-label S.
(1) First, an encoder extracts global features. Bounding-box regression is performed on the global features to predict a bounding box B*, and global segmentation is performed at the same time to obtain a global segmentation result S_G*.
(2) The predicted bounding box B* gives the rough position of the target. To obtain a more accurate segmentation result, the image features inside B* are extracted as local features, and segmentation based on these local features yields a local segmentation result S_L*.
(3) The loss function for model training comprises three parts: the loss L(B, B*) between the annotated and predicted bounding boxes, the loss L(S, S_G*) between the global segmentation result and the pseudo-label S, and the loss L(S, S_L*) between the local segmentation result and the pseudo-label S. The pseudo-label confidence C is also factored into the loss function: the greater the confidence, the greater its weight. The total loss function is:

Loss = (L(B, B*) + L(S, S_G*) + L(S, S_L*)) × C (5)
Because the model simultaneously produces segmentation results and predicts the bounding box, it is a multi-task collaborative learning model; learning both tasks at once further improves the feature-learning capability of the neural network. Segmentation results are generated from both global and local features, so the model is also a multi-scale segmentation model.
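A toy numpy sketch of the confidence-weighted total loss in formula (5); masking the uncertain 0.5-valued pixels out of the segmentation loss is an assumption of this sketch, not something the patent states:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    # element-wise binary cross-entropy
    p = np.clip(pred, eps, 1 - eps)
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

def seg_loss(pred, pseudo):
    # supervise only pixels where the pseudo-label is certain (0 or 1)
    mask = pseudo != 0.5
    return float(np.mean(bce(pred[mask], pseudo[mask]))) if mask.any() else 0.0

def total_loss(box_loss, pred_global, pred_local, pseudo, confidence):
    # formula (5): Loss = (L(B, B*) + L(S, S_G*) + L(S, S_L*)) * C
    return (box_loss + seg_loss(pred_global, pseudo)
            + seg_loss(pred_local, pseudo)) * confidence
```

In a real training loop these would be differentiable tensor operations; the numpy version only illustrates how the confidence C scales the summed losses.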
Further, the optimizing pixel-level pseudo-labeling based on cross-validation iteration includes:
In the pixel-level pseudo-labels, although the confidence reflects the probability that a pseudo-label is correct, low-confidence pseudo-labels are unreliable, and even high-confidence pseudo-labels may still contain labeling errors, i.e. labeling noise. The invention therefore further adopts a cross-validation-based labeling-noise elimination mechanism, whose general flow is shown in fig. 4. First, the data set T and the pseudo-labels S are randomly divided into two parts, T_1, S_1 and T_2, S_2, from which two network models D(T_1, S_1) and D(T_2, S_2) are trained. The learned model D(T_2, S_2) then infers on the data set T_1 to obtain the result W_1. Because W_1 is inferred by a model trained on T_2 and S_2, it is relatively independent of the pseudo-labels S_1. The degree of difference between S_1 and W_1 is computed; a sample whose difference exceeds a certain threshold is regarded as noise and marked as such in S_1 (its confidence set to 0). Noise in S_2 is detected and marked in the same way (confidence set to 0). After S_1 and S_2 are updated, the two parts of data are merged, and the above steps are repeated iteratively until no pseudo-label noise is detected.
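The noise-detection step can be sketched as follows; the 1 − IoU difference measure and the 0.5 threshold are illustrative assumptions, since the patent only speaks of "a degree of difference" and "a certain threshold":

```python
import numpy as np

def detect_noisy_pseudo_labels(pseudo_labels, inferred, threshold=0.5):
    """Return indices of samples whose pseudo-label (from S_1) disagrees
    with the cross-model inference (from W_1) by more than `threshold`,
    using 1 - IoU of the foreground masks as the degree of difference."""
    noisy = []
    for i, (s, w) in enumerate(zip(pseudo_labels, inferred)):
        inter = np.sum((s == 1) & (w == 1))
        union = np.sum((s == 1) | (w == 1))
        difference = 1.0 - (inter / union if union else 1.0)
        if difference > threshold:
            noisy.append(i)  # this pseudo-label's confidence is set to 0
    return noisy
```

In the full iteration, this detector would be run on both halves (S_1 against W_1, S_2 against W_2), the halves merged, and the split-train-detect cycle repeated until it returns an empty list.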
Further, starting from the initial pixel-level pseudo-labels with confidence, the pseudo-labels are continually and iteratively optimized through the cross-validation-based pseudo-label noise detection. Once reliable pseudo-labels are obtained, all image data and the confidence-weighted pseudo-labels are input to the multi-task, multi-scale target segmentation model, which is retrained; the resulting target segmentation model is the optimal model.
Further, a system for segmenting a target in an image based on weak annotation of a bounding box comprises:
the acquisition module is used for: for acquiring an image dataset;
and the marking module is used for: the method comprises the steps of performing boundary box labeling on the image data set, and generating pixel-level pseudo labels with confidence on the basis of the boundary box labeling;
the segmentation model construction module: training a target segmentation model according to the pixel-level pseudo-labels;
training module: optimizing pixel-level pseudo labels based on cross verification iteration, and training a target segmentation model through the optimized pixel-level pseudo labels to obtain an optimal model;
and an output module: and inputting the image to be segmented into the optimal model for target segmentation to obtain a segmentation result.
In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (8)
1. The method for dividing the target in the image based on the weak annotation of the bounding box is characterized by comprising the following steps:
acquiring an image data set, and marking a boundary box of the image data set;
generating a pixel-level pseudo label with confidence based on the boundary box label;
training a target segmentation model according to the pixel-level pseudo-labels;
optimizing pixel-level pseudo labels based on cross verification iteration, and training a target segmentation model through the optimized pixel-level pseudo labels to obtain an optimal model;
and inputting the image to be segmented into the optimal model for target segmentation to obtain a segmentation result.
2. The method for segmenting the target in the image based on the weak annotation of the bounding box according to claim 1, wherein performing bounding-box annotation on the image data set comprises: annotating the minimum circumscribed rectangular box of each target in the image data set as its bounding box, and merging the bounding boxes of multiple targets into one bounding box when they overlap, wherein the targets lie inside the bounding box and the background region lies outside it.
3. The method of claim 1, comprising processing the bounding box label to obtain a specific location of the object within the bounding box.
4. The method for segmenting the target in the image based on the weak annotation of the bounding box according to claim 3, wherein the specific steps of processing the annotation of the bounding box are as follows:
marking the input image and the boundary box through a color space;
dividing the color space by using a GrabCut algorithm, and generating a pseudo mark according to a division result;
calculating a first segmentation confidence according to the consistency of the segmentation results over the pixels in the pseudo-labels;
calculating a second segmentation confidence according to the area ratio of the minimum circumscribed rectangle to the boundary box mark in the segmentation result;
and coupling the first segmentation confidence and the second segmentation confidence to obtain the confidence of the pseudo-label.
5. The method for segmenting the target in the image based on the weak annotation of the bounding box according to claim 1, wherein the training the target segmentation model according to the pixel-level pseudo annotation comprises:
acquiring an image frame, marking a boundary frame, and marking a pixel level pseudo mark;
constructing a deep neural network model;
training a deep neural network model for target segmentation under the supervision of the pixel-level pseudo-labels;
and constructing a loss function through the loss between the boundary box label and the predicted boundary box, the loss between the global segmentation result and the pseudo label, the loss between the local segmentation result and the pseudo label and the confidence level of the pseudo label to guide the neural network to learn.
6. The method for segmenting a target in an image based on weak bounding-box annotation according to claim 1, wherein the iterative optimization of the pixel-level pseudo-labels based on cross-validation comprises:
randomly dividing the dataset T and the pseudo-labels S into two parts T1, S1 and T2, S2, and training two network models D(T1, S1) and D(T2, S2);
using the trained network model D(T2, S2) to infer on the dataset T1 and obtain a result W1; computing the difference between the pseudo-labels S1 and the inferred result W1, regarding any pixel whose difference exceeds a given threshold as noise and marking it as noise in S1; similarly performing noise detection on the pseudo-labels S2 and marking the noise in S2;
updating the pseudo-labels S1 and S2, merging the two parts of data, and repeating the above steps iteratively until no noise is detected.
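The cross-validation noise-detection step above can be sketched in NumPy. The per-pixel absolute difference, the 0.5 threshold default, and the `train_fn`/`infer_fn` callables are assumptions for illustration; the claim leaves the difference measure and threshold unspecified.

```python
import numpy as np

def detect_noise(pseudo, inferred, thresh=0.5):
    """Flag a pixel as noise when the pseudo-label and the prediction of
    the model trained on the *other* half disagree beyond a threshold."""
    diff = np.abs(pseudo.astype(float) - np.asarray(inferred, dtype=float))
    return diff > thresh

def cross_validate_labels(T, S, train_fn, infer_fn, thresh=0.5):
    """One round of cross-validation: split (T, S) in half, train a model
    on each half, and flag disagreements on the opposite half."""
    n = len(T) // 2
    halves = [(T[:n], S[:n]), (T[n:], S[n:])]
    noise = []
    for (Ta, Sa), (Tb, Sb) in (halves, halves[::-1]):
        model = train_fn(Tb, Sb)                  # e.g. D(T2, S2)
        preds = [infer_fn(model, x) for x in Ta]  # inference result W1
        noise.extend(detect_noise(s, w, thresh) for s, w in zip(Sa, preds))
    return noise
```

In the full method this round would be repeated, with flagged pixels corrected or removed, until no pixel exceeds the threshold.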
7. The method for segmenting a target in an image based on weak bounding-box annotation according to claim 6, further comprising: replacing the pixel-level pseudo-labels with the optimized pixel-level pseudo-labels, training the target segmentation deep neural network model under the supervision of the optimized pixel-level pseudo-labels, iterating for N rounds until the model converges, and outputting the optimal model.
8. A system for segmenting a target in an image based on weak bounding-box annotation, comprising:
an acquisition module, configured to acquire an image dataset;
an annotation module, configured to perform bounding-box annotation on the image dataset and to generate pixel-level pseudo-labels with confidence on the basis of the bounding-box annotation;
a segmentation model construction module, configured to train a target segmentation model according to the pixel-level pseudo-labels;
a training module, configured to iteratively optimize the pixel-level pseudo-labels based on cross-validation and to train the target segmentation model with the optimized pixel-level pseudo-labels to obtain an optimal model;
and an output module, configured to input an image to be segmented into the optimal model for target segmentation and to obtain a segmentation result.
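The interaction of the claimed modules can be summarized in a short orchestration sketch. The callables `make_pseudo`, `train`, and `refine`, and the fixed round count `n_rounds`, are hypothetical stand-ins; the claims describe iterating until convergence or until no noise remains rather than a fixed number of rounds.

```python
def segment_with_weak_boxes(images, boxes, make_pseudo, train, refine, n_rounds=3):
    """Hypothetical end-to-end flow mirroring the claimed system:
    annotation module -> segmentation model construction -> training module."""
    # annotation module: pseudo-labels from bounding boxes
    pseudo = [make_pseudo(img, box) for img, box in zip(images, boxes)]
    # segmentation model construction: initial training on pseudo-labels
    model = train(images, pseudo)
    # training module: alternate label refinement and retraining
    for _ in range(n_rounds):
        pseudo = refine(images, pseudo, model)  # cross-validation noise removal
        model = train(images, pseudo)
    return model  # output module applies this optimal model to new images
```

With identity refinement and a training function that simply records its labels, the flow returns the model trained on the unchanged pseudo-labels, which makes the control flow easy to verify.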
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310494738.7A CN116385466B (en) | 2023-05-05 | 2023-05-05 | Method and system for dividing targets in image based on boundary box weak annotation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116385466A true CN116385466A (en) | 2023-07-04 |
CN116385466B CN116385466B (en) | 2024-06-21 |
Family
ID=86971138
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310494738.7A Active CN116385466B (en) | 2023-05-05 | 2023-05-05 | Method and system for dividing targets in image based on boundary box weak annotation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116385466B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110298387A (en) * | 2019-06-10 | 2019-10-01 | 天津大学 | Incorporate the deep neural network object detection method of Pixel-level attention mechanism |
CN111242940A (en) * | 2020-01-19 | 2020-06-05 | 复旦大学 | Tongue image segmentation method based on weak supervised learning |
AU2020102091A4 (en) * | 2019-10-17 | 2020-10-08 | Wuhan University Of Science And Technology | Intelligent steel slag detection method and system based on convolutional neural network |
CN112926681A (en) * | 2021-03-29 | 2021-06-08 | 复旦大学 | Target detection method and device based on deep convolutional neural network |
CN113673338A (en) * | 2021-07-16 | 2021-11-19 | 华南理工大学 | Natural scene text image character pixel weak supervision automatic labeling method, system and medium |
CN114169389A (en) * | 2021-10-22 | 2022-03-11 | 福建亿榕信息技术有限公司 | Class-expanded target detection model training method and storage device |
KR20220114320A (en) * | 2021-02-08 | 2022-08-17 | 연세대학교 산학협력단 | Apparatus and Method for Generating Learning Data for Semantic Image Segmentation Based On Weak Supervised Learning |
CN115761574A (en) * | 2022-10-27 | 2023-03-07 | 之江实验室 | Weak surveillance video target segmentation method and device based on frame labeling |
Non-Patent Citations (2)
Title |
---|
LIU SHU; LIN DI; FENG WEI: "Weakly Supervised Instance Segmentation Based on Point Annotation", Signal Processing, no. 09, pages 68 - 77 *
XU SHUKUI; ZHOU HAO: "Weakly Supervised Image Semantic Segmentation from Object Bounding-Box Annotations", Journal of National University of Defense Technology, no. 01, 3 February 2020 (2020-02-03), pages 190 - 196 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117078698A (en) * | 2023-08-22 | 2023-11-17 | 山东第一医科大学第二附属医院 | Peripheral blood vessel image auxiliary segmentation method and system based on deep learning |
CN117078698B (en) * | 2023-08-22 | 2024-03-05 | 山东第一医科大学第二附属医院 | Peripheral blood vessel image auxiliary segmentation method and system based on deep learning |
CN117710734A (en) * | 2023-12-13 | 2024-03-15 | 北京百度网讯科技有限公司 | Method, apparatus, electronic device, and medium for obtaining semantic data |
Also Published As
Publication number | Publication date |
---|---|
CN116385466B (en) | 2024-06-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI742382B (en) | Neural network system for vehicle parts recognition executed by computer, method for vehicle part recognition through neural network system, device and computing equipment for vehicle part recognition | |
CN109886066B (en) | Rapid target detection method based on multi-scale and multi-layer feature fusion | |
CN116385466B (en) | Method and system for dividing targets in image based on boundary box weak annotation | |
WO2020000390A1 (en) | Systems and methods for depth estimation via affinity learned with convolutional spatial propagation networks | |
US10262214B1 (en) | Learning method, learning device for detecting lane by using CNN and testing method, testing device using the same | |
US20210326638A1 (en) | Video panoptic segmentation | |
Li et al. | A robust instance segmentation framework for underground sewer defect detection | |
CN113673338B (en) | Automatic labeling method, system and medium for weak supervision of natural scene text image character pixels | |
US10275667B1 (en) | Learning method, learning device for detecting lane through lane model and testing method, testing device using the same | |
CN111738295B (en) | Image segmentation method and storage medium | |
US11042986B2 (en) | Method for thinning and connection in linear object extraction from an image | |
CN115410059B (en) | Remote sensing image part supervision change detection method and device based on contrast loss | |
US11367206B2 (en) | Edge-guided ranking loss for monocular depth prediction | |
CN116258877A (en) | Land utilization scene similarity change detection method, device, medium and equipment | |
CN116957051A (en) | Remote sensing image weak supervision target detection method for optimizing feature extraction | |
CN116228623B (en) | Metal surface defect detection method, equipment and storage medium based on isomorphism regularization self-supervision attention network | |
CN111612802A (en) | Re-optimization training method based on existing image semantic segmentation model and application | |
CN113222867B (en) | Image data enhancement method and system based on multi-template image | |
CN116030346A (en) | Unpaired weak supervision cloud detection method and system based on Markov discriminator | |
Das et al. | Object Detection on Scene Images: A Novel Approach | |
Long et al. | Spectrum decomposition in Gaussian scale space for uneven illumination image binarization | |
Wang et al. | MCMC algorithm based on Markov random field in image segmentation | |
Fang et al. | PiPo-Net: A Semi-automatic and Polygon-based Annotation Method for Pathological Images | |
CN114998990B (en) | Method and device for identifying safety behaviors of personnel on construction site | |
CN116309245B (en) | Underground drainage pipeline defect intelligent detection method and system based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||