CN113409327A - Instance segmentation improvement method based on ranking and semantic consistency constraints - Google Patents

Instance segmentation improvement method based on ranking and semantic consistency constraints

Info

Publication number
CN113409327A
Authority
CN
China
Prior art keywords
loss
segmentation
mask
stage
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110608265.XA
Other languages
Chinese (zh)
Inventor
王立春
杨臣
王少帆
孔德慧
李敬华
尹宝才
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202110608265.XA priority Critical patent/CN113409327A/en
Publication of CN113409327A publication Critical patent/CN113409327A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20092Interactive image processing based on input by user
    • G06T2207/20104Interactive definition of region of interest [ROI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an instance segmentation improvement method based on ranking and semantic consistency constraints. Aiming at the problem of improving the mask quality of segmented instances, it introduces a ranking loss and a semantic consistency loss for instance segmentation networks: the ranking loss optimizes the selection of sub-regions, and the semantic consistency loss optimizes the semantic segmentation result within each sub-region. Instance segmentation is an important task in the field of computer vision that requires not only distinguishing individual instances but also classifying and localizing them. Current instance segmentation methods produce masks of limited quality, which has a non-negligible negative effect on many practical tasks. The proposed ranking loss and semantic consistency loss can be applied to any existing two-stage or single-stage instance segmentation framework. Experiments on public datasets show that adding the two losses improves the instance segmentation performance of deep networks and, to some extent, the mask quality of the segmented instances.

Description

Instance segmentation improvement method based on ranking and semantic consistency constraints
Technical Field
The invention belongs to the technical field of computer vision image segmentation, and particularly relates to a novel instance segmentation improvement method.
Background
Instance segmentation is a very important task in the field of computer vision. The major tasks of modern computer vision include image classification, object detection, semantic segmentation, and instance segmentation, in increasing order of complexity. Instance segmentation requires not only segmenting an object's mask and identifying its class, but also distinguishing between different instances of the same class; it can therefore be viewed as a technique that solves the object detection and semantic segmentation problems simultaneously.
For a given image, an instance segmentation algorithm predicts a set of labels {C_j, B_j, I_j, S_j}, where C_j is the predicted class of the j-th instance, B_j its predicted location information, I_j its segmentation mask, and S_j the confidence of the predicted class.
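For concreteness, the predicted tuple can be pictured as a simple record. A minimal sketch in Python (the field names are illustrative, not taken from the patent):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class InstancePrediction:
    """One predicted instance; field names are illustrative."""
    category: int     # C_j: predicted class of the j-th instance
    box: np.ndarray   # B_j: predicted location, e.g. two corner points [x1, y1, x2, y2]
    mask: np.ndarray  # I_j: binary segmentation mask of shape (H, W)
    score: float      # S_j: confidence of the predicted class
```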
In recent years, with the development of artificial intelligence, many instance segmentation methods and technologies have emerged. Current mainstream instance segmentation methods can be divided into two-stage and single-stage methods.
Two-stage instance segmentation essentially derives from top-down, two-stage object detection algorithms. As shown in FIG. 1, the first stage is a simple foreground/background classification and regression process whose goal is to extract sub-regions; a sub-region is a rectangular box (prediction box) represented by two pairs of coordinates marking two of its corner points, and it may contain an object instance. In the second stage, for each sub-region extracted in the first stage, the corner coordinates of the rectangular box are refined by regression, the box is classified as an instance and labeled at the pixel level, and a segmentation sub-network produces the instance mask. The representative algorithm of this class is Mask-RCNN, which achieves 37.1% mAP (mean average precision) on the COCO dataset.
According to the existing literature, two-stage instance segmentation algorithms achieve the highest accuracy on public datasets. This class of methods has two drawbacks. First, the two-stage processing scheme limits speed, and the lack of guaranteed real-time performance is the greatest obstacle to practical deployment. Second, the quality of the obtained masks remains uneven: taking Mask-RCNN as an example, the final mask is upsampled from a small 28 × 28 region, so the restored mask is of poor quality and covers too little or too much of the sample region.
Single-stage instance segmentation algorithms are shown in FIG. 2. This class of algorithms generally derives from single-stage object detection algorithms and, compared with two-stage algorithms, has no sub-region extraction process. The representative algorithms of this class are the Yolact series, which achieve 29.8% mAP on the COCO dataset at a speed of 33.5 FPS.
The real-time performance of single-stage algorithms is sufficient for deployment, but their accuracy drops noticeably relative to two-stage algorithms, and the problem of low mask quality is equally serious.
In summary, the best mAP achieved by existing two-stage and single-stage instance segmentation algorithms on public datasets shows that the mask quality of instance segmentation still has room for improvement.
The idea for improving mask quality is to add new loss functions to the framework that impose semantic constraints on the sub-regions that may contain instances and on the pixel-level labels of the instances; these take the concrete form of a ranking loss and a semantic consistency constraint loss. The loss functions defined by the invention can be applied directly to existing two-stage and single-stage instance segmentation algorithms and improve the mAP of the original algorithms. Verification experiments were carried out on Mask-RCNN and Yolact, the representative two-stage and single-stage instance segmentation algorithms, and the results demonstrate the effectiveness of the proposed loss functions.
Disclosure of Invention
Aiming at the poor mask quality produced by existing instance segmentation algorithms, the invention adds new loss functions (a ranking loss and a semantic consistency constraint loss) to existing algorithm frameworks, thereby forming a new instance segmentation method and improving the mask quality of instance segmentation.
The invention provides an instance segmentation improvement method based on ranking and semantic consistency constraints; the implementation schemes on two-stage and single-stage instance segmentation frameworks are introduced in turn.
1. Two-stage network based instance segmentation improvement scheme
As shown in FIG. 3, in the two-stage instance segmentation framework the first stage mainly completes the extraction of regions of interest; here the invention adds the ranking loss, on top of the original classification and regression losses, for the ranking-based selection of sub-regions. For the segmentation operation of the second stage, the semantic consistency loss is added on top of the original classification, regression, and segmentation losses.
2. Single-stage network based instance segmentation improvement scheme
As shown in FIG. 4, in the single-stage algorithm the ranking loss and the semantic consistency loss are added to the regression head and the segmentation head, respectively, on top of the original classification, regression, and segmentation losses.
1. Introduction to the basic model
The invention adds the ranking loss and the semantic consistency loss to the original instance segmentation frameworks.
The two classes of algorithms are described below via their original basic modules, Mask-RCNN and Yolact respectively.
Mask-RCNN basic module:
Feature extraction network Backbone: ResNeXt101 + FPN serves as the backbone; the pre-training weights used by the backbone are a ResNet101 file pre-trained on the ImageNet dataset.
RPN network: generates region proposals. This layer classifies anchors as positive or negative via softmax, then corrects them with bounding-box regression to obtain accurate ROIs (i.e., sub-regions).
Fully connected layer FC: a classification operation yields the instance label, and a regression operation yields the instance box.
Fully convolutional network FCN: performs the mask segmentation operation to obtain the instance mask.
These modules are shown in FIG. 5.
Yolact basic module:
Feature extraction network Backbone: ResNet101 + FPN serves as the backbone, and the pre-training weight file is again a ResNet101 file.
Protonet network: generates k prototype masks for each image. A fully convolutional network follows the FPN output and predicts the set of prototype masks.
Prediction Head: performs the regression and classification operations, outputting sub-regions and their classifications in the same way as the sub-region extraction process of the two-stage method; it simultaneously predicts k linear-combination coefficients used to linearly combine the prototype masks, and a Crop operation on the combination result yields the instance mask. A rough sketch of this step follows.
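The sketch below illustrates the prototype-combination step; the sigmoid activation and the normalized-box Crop are assumptions about details the text does not spell out:

```python
import torch

def assemble_instance_masks(prototypes, coeffs, boxes):
    """Yolact-style mask assembly (a sketch, not the patent's exact operation).

    prototypes: (H, W, k) prototype masks from Protonet
    coeffs:     (n, k) linear-combination coefficients from the prediction head
    boxes:      (n, 4) predicted boxes, normalized [x1, y1, x2, y2]
    """
    # Linear combination of the k prototypes, one mask per instance: (H, W, n)
    masks = torch.sigmoid(prototypes @ coeffs.t())
    h, w, n = masks.shape
    # Crop: zero out mask values outside each instance's predicted box
    for i in range(n):
        x1, y1, x2, y2 = boxes[i]
        x1, x2 = int(x1 * w), int(x2 * w)
        y1, y2 = int(y1 * h), int(y2 * h)
        keep = torch.zeros(h, w)
        keep[y1:y2, x1:x2] = 1.0
        masks[:, :, i] = masks[:, :, i] * keep
    return masks
```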
Sub-region Ranking Loss (SRLoss)
The core idea of this constraint is: the prediction boxes are sorted in non-increasing order of their scores, and positive-sample prediction boxes are encouraged to precede negative-sample prediction boxes as far as possible, so that more accurate sub-regions are obtained.
The ranking loss is defined in formula (1):
[Equation (1), the ranking loss SRLoss, appears as an image in the original and is not reproduced here.]
wherein P is the positive sample set. Positive and negative samples are determined by thresholding the intersection-over-union (IoU) computed between the prior (anchor) boxes and the ground-truth boxes (e.g., with thresholds 0.7 and 0.3, samples above 0.7 are classified as positive, samples below 0.3 as negative, and the remaining samples are excluded from training). Sorting the positive and negative samples by IoU gives the rank r; sorting them by classification confidence score gives the rank sort(r). The higher the positive samples rank, the smaller SRLoss.
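The exact expression behind formula (1) is only available as an image in the original. The sketch below therefore implements the idea as described in the text, penalizing positive samples whose confidence rank lags their IoU rank; it is an assumption about the loss's form, not the patent's definition:

```python
import torch

def sub_region_rank_loss(ious, scores, pos_mask):
    """Sketch of SRLoss: encourage the score ranking to follow the IoU ranking.

    ious:     (N,) IoU of each sampled box with its matched ground-truth box
    scores:   (N,) classification confidence of each box
    pos_mask: (N,) bool, True for positive samples (IoU above 0.7 in the text)
    """
    # r: rank of each sample when sorted by IoU (descending); double argsort gives ranks
    iou_rank = torch.argsort(torch.argsort(ious, descending=True))
    # sort(r): rank of each sample when sorted by confidence (descending)
    score_rank = torch.argsort(torch.argsort(scores, descending=True))
    # Penalize positives whose confidence rank falls behind their IoU rank;
    # the higher the positives rank by score, the smaller the loss.
    # (Ranks are non-differentiable; a practical version needs a differentiable surrogate.)
    gap = (score_rank - iou_rank).clamp(min=0).float()
    return gap[pos_mask].mean()
```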
Sub-region Semantic Consistency Loss (SSCLoss)
The purpose of this loss is to constrain the semantic consistency of the pixels in the mask region: the number of distinct class labels within the sub-region should be as small as possible, and the pixels in the sub-region should belong to the same class as far as possible. The first term counts the segmentation classes and is minimized when the mask quality is best; the second computes the ratio of the number of pixels with the correct semantic class to the total number of pixels in the mask region.
Let M_i denote the number of pixels labeled as class i in the mask region; the final loss functions take the form:
[Equations (2)-(4), the semantic consistency loss terms, appear as images in the original and are not reproduced here; per the text they combine a class-count term over the mask region with an α-weighted correct-class ratio term.]
where c is the total number of classes (c = 80 on the MS COCO dataset) and α is a hyperparameter.
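Equations (2)-(4) are likewise only available as images, so the sketch below implements the two terms described in the text (a class-count term and an α-weighted correct-class ratio); it should be read as an assumption about the loss's structure:

```python
import torch

def semantic_consistency_loss(mask_logits, gt_class, alpha=1.0):
    """Sketch of SSCLoss for one predicted mask region.

    mask_logits: (c, h, w) per-class scores for the pixels inside the mask region
    gt_class:    int, ground-truth class of the instance
    alpha:       hyperparameter weighting the ratio term
    """
    pixel_classes = mask_logits.argmax(dim=0)  # (h, w) predicted class per pixel
    # Term 1: number of distinct labels inside the region, lowest when the mask is clean
    # (a non-differentiable stand-in for the patent's count term)
    n_classes = torch.unique(pixel_classes).numel()
    # Term 2: fraction of pixels whose predicted class matches the instance class
    correct_ratio = (pixel_classes == gt_class).float().mean()
    return n_classes + alpha * (1.0 - correct_ratio)
```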
Model training and testing
During training, the actual flow of Mask-RCNN is shown in FIG. 7: the names of the function blocks are in rectangular boxes, dashed arrows point to the network losses, and the newly added losses are shown in bold. The pre-trained model used by the invention has been trained on the ImageNet dataset.
The standard annotation file includes the fields [id, image_id, category_id, segmentation, area, bbox, iscrowd]: category_id is the category label, segmentation the mask label, and bbox the instance box label. segmentation is polygon-format data (each adjacent pair of values gives the coordinates of one contour edge point of the instance), and the bbox format is [x1, y1, x2, y2]. An illustrative entry follows.
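A sample annotation entry with these fields (all values are made up; the bbox follows the [x1, y1, x2, y2] convention stated above):

```python
annotation = {
    "id": 1,
    "image_id": 42,
    "category_id": 18,  # category label
    "segmentation": [[12.5, 20.0, 60.0, 22.5, 58.0, 80.0]],  # polygon: adjacent value pairs are (x, y) contour points
    "area": 1520.0,
    "bbox": [12.5, 20.0, 60.0, 80.0],  # instance box label, [x1, y1, x2, y2]
    "iscrowd": 0,
}
```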
In the first stage, the loss functions come mainly from training the RPN network. They include the RPN foreground/background classification loss l_rpn_cls, where the true label t ∈ {1, 0, -1}; anchors with true label 0 do not participate in building the loss, and label -1 is converted to 0 for the cross-entropy computation. They also include the RPN box regression loss l_rpn_reg and the ranking loss SRLoss proposed by the invention, denoted l_rpn_rank in this example.
In the second stage, the loss functions are mainly the classification loss l_cls, the regression loss l_reg, the segmentation loss l_seg, and the semantic consistency loss SSCLoss proposed by the invention, denoted l_sc.
The classification loss and the segmentation loss are usually cross-entropy losses, and the regression loss is the Smooth_L1 loss. Their general forms are:
Cross-entropy loss:
L_ce = -(1/n) Σ_i [ y log(y_i) + (1 - y) log(1 - y_i) ]
where y is the true value, y_i is the predicted probability, and n is the number of samples.
Smooth_L1 loss:
L_smooth_L1(x) = 0.5 x^2 if |x| < 1, and |x| - 0.5 otherwise
where y* is the predicted value, y is the label, and x = y* - y is their difference.
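A minimal PyTorch sketch of the two general forms above:

```python
import torch

def cross_entropy_loss(pred_probs, targets):
    """Binary cross-entropy over n samples: targets are the true values y,
    pred_probs the predicted probabilities y_i."""
    eps = 1e-7  # clamp to avoid log(0)
    p = pred_probs.clamp(eps, 1.0 - eps)
    return -(targets * p.log() + (1.0 - targets) * (1.0 - p).log()).mean()

def smooth_l1_loss(pred, target):
    """Smooth_L1 on x = y* - y: 0.5 * x^2 where |x| < 1, |x| - 0.5 otherwise."""
    x = (pred - target).abs()
    return torch.where(x < 1.0, 0.5 * x ** 2, x - 0.5).mean()
```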
The overall training is divided into two parts. First the RPN network part is trained, with the loss function L_1 to be optimized given by:
L_1 = l_rpn_cls + l_rpn_reg + l_rpn_rank (5)
After ROI screening, training enters the second stage, with the loss function L_2 to be optimized given by:
L_2 = l_cls + l_reg + l_seg + l_sc (6)
in the test section, the top 100 highest scoring detection boxes were processed using Mask prediction branches. In order to accelerate the inference efficiency, the network predicts K mask images for each ROI, but only needs to use the mask image with the largest class probability, the mask image resize is returned to the ROI size, binarization is performed at a set threshold value of 0.5, retention is performed when the threshold value is higher than the threshold value, and finally the segmented mask and the original image are subjected to image-level add operation to obtain a final example segmentation visualization result.
The network structure of Yolact, the representative single-stage instance segmentation algorithm, after adding the losses is shown in FIG. 8: the names of the function blocks are in rectangular boxes, dashed arrows point to the network losses, and the newly added losses are shown in bold. The pre-trained model was pre-trained on the ImageNet dataset. Network training and testing proceed essentially as before, except that all losses are optimized directly rather than in two stages; the total losses of the network are:
L = l_cls + l_reg + l_seg + l_segm + l_rank + l_sc
where l_cls and l_reg are the classification and regression losses and l_seg is the segmentation loss; the ranking loss is denoted l_rank and the semantic consistency loss l_sc. In addition to the regression and classification losses, which take the Smooth_L1 form L_smooth_L1 and the cross-entropy form L_ce given above, there is a coarse semantic segmentation loss l_segm, still of cross-entropy form.
Drawings
FIG. 1 Two-stage instance segmentation method
FIG. 2 Single-stage instance segmentation method
FIG. 3 Two-stage network architecture with added ranking and semantic consistency losses
FIG. 4 Single-stage network architecture with added ranking and semantic consistency losses
FIG. 5 Mask-RCNN schematic
FIG. 6 Yolact schematic
FIG. 7 Modified Mask-RCNN structure
FIG. 8 Modified Yolact structure
Detailed Description
The invention conducts experiments on the MS COCO series of datasets (COCO2015 and COCO2016). The COCO dataset is a large, rich object detection, segmentation, and captioning dataset. It is captured mainly from complex everyday scenes with scene understanding as its goal, and objects in the images are localized through accurate segmentation. The images cover 91 object classes, with 328,000 images and 2,500,000 labels. It is so far the largest dataset with semantic segmentation annotations: it provides 80 categories, more than 330,000 images (200,000 of them annotated), and more than 1.5 million individual instances across the whole dataset. The COCO dataset currently has three annotation types: object instances, object keypoints, and image captions, all stored in JSON files. Training and testing are performed on the training and validation sets, respectively. Since the task here is instance segmentation, the object instance annotations are used.
Evaluation metrics. The invention follows the standard evaluation protocol for instance segmentation, using mAP at different intersection-over-union (IoU) thresholds. The officially provided evaluation code is used for the experiments.
Experimental setup. In the experiments, the losses defined by the invention were added to each framework in turn to test their effectiveness.
The Mask-RCNN and Yolact experiments were run on version 2.3 of mmdetection, open-sourced by SenseTime.
The hyperparameters in the Mask-RCNN experiments were set as follows: α = 1; multi-scale input sizes; a non-maximum suppression (NMS) threshold of 0.7; an initial learning rate of 0.333 with dynamic decay during training; a batch size of 4, trained across four GPUs; and weights saved once per round (epoch).
The hyperparameters on Yolact were: α = 1; an initial input image size of 550 × 550; a non-maximum suppression (NMS) threshold of 0.7; an initial learning rate of 0.333 with dynamic decay during training; and a batch size of 4, trained across four GPUs, with weights saved once per round (epoch).
In all of the settings above, training proceeded in rounds of 3 epochs, with a validation pass after each round before training continued. An illustrative configuration sketch follows.
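For reference, the settings above might translate into an mmdetection-2.x style configuration fragment along the following lines; the SGD momentum, weight decay, and step schedule are assumptions, while the learning rate, batch size, and checkpoint interval come from the description:

```python
# Sketch of an mmdetection-2.x config fragment (keys follow mmdetection conventions).
optimizer = dict(type='SGD', lr=0.333, momentum=0.9, weight_decay=0.0001)  # momentum/decay assumed
lr_config = dict(policy='step', step=[2])          # "dynamic decay"; the step epoch is assumed
data = dict(samples_per_gpu=1, workers_per_gpu=2)  # batch size 4 spread across four GPUs
checkpoint_config = dict(interval=1)               # save weights once per round (epoch)
# The RPN NMS IoU threshold of 0.7 would be set in the model's train/test cfg.
```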
The invention compares the model's performance scores with the baseline performance published by the original authors. Table 1 shows the experimental results on the COCO dataset; an improvement can be seen.
TABLE 1 COCO dataset test results
[Table 1 appears as an image in the original and is not reproduced here.]

Claims (1)

1. An instance segmentation improvement method based on ranking and semantic consistency constraints, characterized by comprising the following steps:
1) Two-stage network based instance segmentation improvement scheme
The first stage extracts the region of interest; for the ranking-based selection of sub-regions, the ranking loss is added on top of the original classification and regression losses; for the segmentation operation of the second stage, the semantic consistency loss is added on top of the original classification, regression, and segmentation losses;
2) Single-stage network based instance segmentation improvement scheme
In the single-stage algorithm, the ranking loss and the semantic consistency loss are added to the regression head and the segmentation head, respectively, on top of the original classification, regression, and segmentation losses;
the ranking loss and the semantic consistency loss are added to the original instance segmentation framework;
the two classes of algorithms are represented by their original basic modules, Mask-RCNN and Yolact, respectively;
Mask-RCNN basic module:
feature extraction network Backbone: ResNeXt101 + FPN serves as the backbone, and the pre-training weights used are a ResNet101 file pre-trained on the ImageNet dataset;
RPN network: generates region proposals; this layer classifies anchors as positive or negative via softmax, then corrects them with bounding-box regression to obtain accurate ROIs (i.e., sub-regions);
fully connected layer FC: a classification operation yields the instance label, and a regression operation yields the instance box;
fully convolutional network FCN: performs the mask segmentation operation to obtain the instance mask;
Yolact basic module:
feature extraction network Backbone: ResNet101 + FPN serves as the backbone, and the pre-training weight file is again a ResNet101 file;
Protonet network: generates k prototype masks for each image, with k = 32; a fully convolutional network follows the FPN output and predicts the set of prototype masks;
Prediction Head: performs the regression and classification operations, outputting sub-regions and their classifications in the same way as the sub-region extraction process of the two-stage method; it simultaneously predicts k linear-combination coefficients used to linearly combine the prototype masks, and a Crop operation on the combination result yields the instance mask;
sub-region ordering penalty
The ranking loss is defined in formula (1):
[Equation (1), the ranking loss SRLoss, appears as an image in the original and is not reproduced here;]
wherein P is the positive sample set, and positive and negative samples are determined by thresholding the intersection-over-union (IoU) computed between the prior (anchor) boxes and the ground-truth boxes: samples above 0.7 are classified as positive, samples below 0.3 as negative, and the remaining samples are excluded from training; sorting the positive and negative samples by IoU gives the rank r; sorting them by classification confidence score gives the rank sort(r); the higher the positive samples rank, the smaller SRLoss;
sub-region semantic consistency loss
Let M_i denote the number of pixels labeled as class i in the mask region; the final loss functions take the form:
[Equations (2)-(4), the semantic consistency loss terms, appear as images in the original and are not reproduced here;]
where c is the total number of classes, with c = 80 on the MS COCO dataset, and α is a hyperparameter;
model training and testing
The pre-trained model used has been trained on the ImageNet dataset;
the tag file annotation using the standard includes [ id, image _ id, category _ id, segmentation, area, bbox, iscrowd]Category _ id, mask tag segmentation, instance box tag bbox; category _ id is a category label, wherein segment is polygon format data, a pair of adjacent data is coordinate value of an example contour edge point, bbox data format is [ x [ ]1,y1,x2,y2];
In the first stage, the loss functions come mainly from training the RPN network and include the RPN foreground/background classification loss l_rpn_cls, where the true label t ∈ {1, 0, -1}, anchors with true label 0 do not participate in building the loss, and label -1 is converted to 0 for the cross-entropy computation; the RPN box regression loss l_rpn_reg; and the proposed ranking loss SRLoss, denoted l_rpn_rank in this example;
In the second stage, the loss functions are mainly the classification loss l_cls, the regression loss l_reg, the segmentation loss l_seg, and the proposed semantic consistency loss SSCLoss, denoted l_sc;
The classification and segmentation losses are usually cross-entropy losses, and the regression loss is the Smooth_L1 loss; their general forms are:
cross-entropy loss:
L_ce = -(1/n) Σ_i [ y log(y_i) + (1 - y) log(1 - y_i) ]
where y is the true value, y_i is the predicted probability, and n is the number of samples;
Smooth_L1 loss:
L_smooth_L1(x) = 0.5 x^2 if |x| < 1, and |x| - 0.5 otherwise
where y* is the predicted value, y is the label, and x = y* - y is their difference;
the overall training is divided into two parts: first the RPN network part is trained, with the loss function L_1 to be optimized given by:
L_1 = l_rpn_cls + l_rpn_reg + l_rpn_rank (5)
after ROI screening, training enters the second stage, with the loss function L_2 to be optimized given by:
L_2 = l_cls + l_reg + l_seg + l_sc (6)
in the test part, the top 100 highest-scoring detection boxes are processed by the mask prediction branch; to speed up inference, the network predicts K mask maps for each ROI but uses only the one belonging to the class with the highest probability; that mask is resized back to the ROI size and binarized at a threshold of 0.5, keeping pixels above the threshold; finally the segmented mask is overlaid on the original image with an image-level add operation to produce the final instance segmentation visualization;
the total loss of the single-stage network is:
L = l_cls + l_reg + l_seg + l_segm + l_rank + l_sc
where l_cls and l_reg are the classification and regression losses and l_seg is the segmentation loss; the ranking loss is denoted l_rank and the semantic consistency loss l_sc; in addition to the regression and classification losses, which take the Smooth_L1 form L_smooth_L1 and the cross-entropy form L_ce above, there is a coarse semantic segmentation loss l_segm, still of cross-entropy form.
CN202110608265.XA 2021-06-01 2021-06-01 Instance segmentation improvement method based on ranking and semantic consistency constraints Pending CN113409327A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110608265.XA CN113409327A (en) 2021-06-01 2021-06-01 Instance segmentation improvement method based on ranking and semantic consistency constraints

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110608265.XA CN113409327A (en) 2021-06-01 2021-06-01 Instance segmentation improvement method based on ranking and semantic consistency constraints

Publications (1)

Publication Number Publication Date
CN113409327A true CN113409327A (en) 2021-09-17

Family

ID=77675695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110608265.XA Pending CN113409327A (en) 2021-06-01 2021-06-01 Instance segmentation improvement method based on ranking and semantic consistency constraints

Country Status (1)

Country Link
CN (1) CN113409327A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116363374A (en) * 2023-06-02 2023-06-30 中国科学技术大学 Image semantic segmentation network continuous learning method, system, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084817A (en) * 2019-03-21 2019-08-02 西安电子科技大学 Digital elevation model production method based on deep learning
CN110246141A (en) * 2019-06-13 2019-09-17 大连海事大学 It is a kind of based on joint angle point pond vehicles in complex traffic scene under vehicle image partition method
CN111862119A (en) * 2020-07-21 2020-10-30 武汉科技大学 Semantic information extraction method based on Mask-RCNN

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084817A (en) * 2019-03-21 2019-08-02 西安电子科技大学 Digital elevation model production method based on deep learning
CN110246141A (en) * 2019-06-13 2019-09-17 大连海事大学 It is a kind of based on joint angle point pond vehicles in complex traffic scene under vehicle image partition method
CN111862119A (en) * 2020-07-21 2020-10-30 武汉科技大学 Semantic information extraction method based on Mask-RCNN

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QIJUAN YANG et al.: "An Instance Segmentation Algorithm Based on Improved Mask R-CNN", IEEE, 8 June 2020 (2020-06-08), pages 1-6 *
HE Li et al.: "Salient instance segmentation fusing multi-scale boundary features", Journal of Frontiers of Computer Science and Technology, 31 March 2021 (2021-03-31), pages 1865-1876 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116363374A (en) * 2023-06-02 2023-06-30 中国科学技术大学 Image semantic segmentation network continuous learning method, system, equipment and storage medium
CN116363374B (en) * 2023-06-02 2023-08-29 中国科学技术大学 Image semantic segmentation network continuous learning method, system, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110334705B (en) Language identification method of scene text image combining global and local information
CN109614979B (en) Data augmentation method and image classification method based on selection and generation
CN111784685A (en) Power transmission line defect image identification method based on cloud edge cooperative detection
CN111898406B (en) Face detection method based on focus loss and multitask cascade
CN110175613A (en) Street view image semantic segmentation method based on Analysis On Multi-scale Features and codec models
CN108537824B (en) Feature map enhanced network structure optimization method based on alternating deconvolution and convolution
CN111767927A (en) Lightweight license plate recognition method and system based on full convolution network
CN110705412A (en) Video target detection method based on motion history image
CN110008899B (en) Method for extracting and classifying candidate targets of visible light remote sensing image
CN112990282B (en) Classification method and device for fine-granularity small sample images
CN110059539A (en) A kind of natural scene text position detection method based on image segmentation
CN112528058B (en) Fine-grained image classification method based on image attribute active learning
CN113111916A (en) Medical image semantic segmentation method and system based on weak supervision
CN112819837B (en) Semantic segmentation method based on multi-source heterogeneous remote sensing image
CN116188402A (en) Insulator defect identification method based on improved SSD algorithm
CN115719475A (en) Three-stage trackside equipment fault automatic detection method based on deep learning
CN112233105A (en) Road crack detection method based on improved FCN
CN115880529A (en) Method and system for classifying fine granularity of birds based on attention and decoupling knowledge distillation
CN113077438B (en) Cell nucleus region extraction method and imaging method for multi-cell nucleus color image
CN110659702A (en) Calligraphy copybook evaluation system and method based on generative confrontation network model
CN114882204A (en) Automatic ship name recognition method
CN112132839B (en) Multi-scale rapid face segmentation method based on deep convolution cascade network
CN113409327A (en) Instance segmentation improvement method based on ranking and semantic consistency constraints
CN117576038A (en) Fabric flaw detection method and system based on YOLOv8 network
CN116977931A (en) High-altitude parabolic identification method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination