CN113159048A - Weak supervision semantic segmentation method based on deep learning - Google Patents
- Publication number
- CN113159048A
- Authority
- CN
- China
- Prior art keywords
- training
- segmentation
- class activation
- semantic segmentation
- class
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a weakly supervised semantic segmentation method based on deep learning. First, a pre-trained Resnet50 is fine-tuned on an existing data set; the trained Resnet50 is then used to obtain the corresponding class activation maps, segmentation pseudo labels are obtained by applying a set threshold, and a fully connected conditional random field (dense conditional random field, Dense CRF) is used to optimize the labels. Finally, the segmentation network is trained with the optimized pseudo labels. The method completes target classification and semantic segmentation using only image-level labels, which greatly reduces the manpower and material resources consumed by manual labeling. Compared with existing weakly supervised methods, the method is more efficient and produces more accurate localization results.
Description
Technical Field
The invention belongs to the field of image processing, relates to semantic segmentation of images, and particularly relates to a deep learning method for performing semantic segmentation on images by using image-level labels.
Background
Image segmentation is one of the basic and key techniques for image understanding. Conventional image segmentation methods mainly include threshold-based, region-based, edge-detection-based, wavelet-analysis and wavelet-transform-based, Markov-random-field-based, genetic-algorithm-based, and clustering-based methods. Owing to their inherent limitations, these methods perform poorly on complex images such as natural images. With the development of deep learning, convolutional neural networks have been increasingly applied to image segmentation, and segmentation accuracy has improved progressively from FCN, UNet, and DilatedNet to DeepLab and PSPNet. Because training a semantic segmentation model requires a label for every pixel in an image, annotating the data is very time-consuming and far more complex than annotating for image classification or object detection; producing the masks usually consumes a large amount of manpower. To solve this problem, researchers have studied training semantic segmentation models with labels that are easier to obtain, such as image-level labels, bounding boxes, scribbles, and points; this is weakly supervised semantic segmentation. Among these, image-level labels are the most widely used because they are the easiest to obtain and the least costly.
Since an image category label contains no position information, an additional method must be used to locate the target object in the image during segmentation. Class activation mapping (CAM) is one of the most common localization methods. The features extracted by the last convolutional layer of a network such as VGGNet or GoogLeNet are globally average-pooled and fed into a fully connected classification layer; CAM then projects the class scores output by the fully connected layer back onto the last feature map of the convolutional neural network, which gives a rough localization of the object to be segmented. Mainstream weakly supervised semantic segmentation methods based on image-level labels use CAM to locate the segmentation target, and SEC is a representative example. SEC proposes three principles (seed, expand, and constrain) and uses CAMs to compute the contribution of local image regions to each class score in the final image classification, giving a rough estimate of the region where each class of object appears in the image. However, this localization often covers only the most salient part of the target region, and sometimes the localization is wrong, so the obtained class activation map needs to be corrected.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: the localization region obtained from a conventional class activation map is often not accurate enough and needs to be corrected, so a segmentation network trained on segmentation labels derived from that activation map is not accurate enough either.
In view of this situation, the invention provides a weakly supervised semantic segmentation method based on deep learning. A unified framework integrating the classification and segmentation networks is established, and a multi-scale inference method for class activation maps is proposed, which improves the accuracy of weakly supervised localization. In addition, a new UNet-based segmentation network, Mixed-UNet, is proposed, which improves semantic segmentation accuracy.
First, a pre-trained Resnet50 is fine-tuned on an existing data set; the trained Resnet50 is then used to obtain the corresponding class activation maps, segmentation pseudo labels are obtained by applying a set threshold, and a fully connected conditional random field (dense conditional random field, Dense CRF) is used to optimize the labels. Finally, the segmentation network is trained with the optimized pseudo labels. The method specifically comprises the following steps:
Step (1), training a classification network;
Resnet50 pre-trained on the ImageNet data set is used as the classification framework, and the convolutions in the fourth and fifth convolution blocks are replaced with dilated (atrous) convolutions with a dilation rate of 2. This gives a larger receptive field and a denser feature response while keeping the spatial resolution of the image and the amount of computation unchanged. Raw data are first collected and labeled by professionals. The labeled data set is divided into a training set, a validation set, and a test set. According to the actual situation, the data set has K categories, and c denotes any one of them. The classification framework is then fine-tuned on the training set with classification labels, which completes the training of the classification network.
Step (2), after the classification network has been trained, each image in the training set is resampled at multiple scales by bilinear interpolation, with sampling rates of 0.5, 1.0, 1.5, and 2.0. The resampled images at the four scales are input into the classification network to obtain class activation maps at four scales. The class activation maps at the four scales are then resampled to the size of the original input image, fused, and averaged to obtain the fused class activation map, namely the corrected class activation map.
Step (3), the corrected class activation maps are normalized, and a segmentation map for each class is obtained by threshold segmentation according to the actual number of classes.
Step (4), the training set is taken as input and the segmentation maps obtained in step (3) are used as labels for semantic segmentation; at the same time, the original input images and the segmentation maps obtained in step (3) are input into a fully connected conditional random field (dense conditional random field, Dense CRF) to optimize the labels. A UNet-based semantic segmentation network (Mixed-UNet) is trained with both the optimized labels and the labels obtained in step (3). The overall framework of the semantic segmentation network is a fusion of two UNets: the two UNets share one feature extractor in the feature extraction stage and split into two branches in the up-sampling stage. A pyramid pooling module (PPM) is added to the first branch, and an attention mechanism is added to the second branch.
Step (5), in the training stage, the training set is taken as input, the loss function of the output is computed, the network parameters are adjusted by the back-propagation algorithm, and the model is evaluated on the validation set. The model that performs best on the validation set is taken as the final model. After training, the trained model is tested on the test set.
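The claim in step (1) that a dilation rate of 2 enlarges the receptive field without changing the resolution or the amount of computation can be checked on a toy signal. The following is a minimal NumPy sketch in one dimension; the function names `dilated_conv1d` and `effective_kernel_size` are hypothetical helpers introduced for illustration and are not part of the patent or of Resnet50.

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation=1):
    """'Same'-padded 1-D cross-correlation whose taps are `dilation` samples apart."""
    k = len(kernel)
    span = dilation * (k - 1)               # distance between first and last tap
    pad_left = span // 2
    padded = np.pad(signal, (pad_left, span - pad_left))
    out = np.zeros(len(signal), dtype=float)
    for i in range(len(signal)):
        for t in range(k):                  # always k multiply-adds per output
            out[i] += kernel[t] * padded[i + t * dilation]
    return out

def effective_kernel_size(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# Impulse input: the non-zero outputs show which positions "see" sample 5.
signal = np.zeros(11)
signal[5] = 1.0
kernel = np.ones(3)

dense = dilated_conv1d(signal, kernel, dilation=1)
dilated = dilated_conv1d(signal, kernel, dilation=2)

print(np.flatnonzero(dense), effective_kernel_size(3, 1))
print(np.flatnonzero(dilated), effective_kernel_size(3, 2))
```

Both outputs have the same length as the input and each output sample still costs three multiply-adds, but with dilation 2 the three taps span five input samples instead of three.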
The invention has the beneficial effects that:
The method completes target classification and semantic segmentation using only image-level labels, which greatly reduces the manpower and material resources consumed by manual labeling. Compared with existing weakly supervised methods, the method is more efficient and produces more accurate localization results.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of Mixed-UNet according to an embodiment of the present invention;
FIG. 3 is a diagram of a PPM module according to an embodiment of the present invention;
FIG. 4 is a diagram of an SE module according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings and examples.
The invention provides a unified framework for image classification and semantic segmentation that completes both image classification and weakly supervised semantic segmentation using only image-level labels. The implementation flow is shown in FIG. 1. The weakly supervised semantic segmentation method based on deep learning comprises the following steps:
Step (1), training a classification network.
Raw data are first collected and labeled by professionals. The labeled data set is divided into a training set, a validation set, and a test set. According to the actual situation, the data set has K categories, and c denotes any one of them. Resnet50 pre-trained on the ImageNet data set is used as the classification framework, and the convolutions in the fourth and fifth convolution blocks of Resnet50 are replaced with dilated (atrous) convolutions with a dilation rate of 2. For the same feature-map size, dilated convolution obtains a larger receptive field and therefore a denser feature response. The fully connected layer of Resnet50 is then removed, global average pooling (GAP) is applied to the final feature map, and the pooled features are used as the input of a fully connected layer that produces the required output (classification or otherwise). With this connection structure, the importance of image regions can be identified by projecting the weights of the output layer back onto the convolutional feature maps. The detailed calculation is as follows:
For any given input image, let $f_k(x, y)$ denote the activation of unit $k$ of the last convolutional layer at spatial position $(x, y)$. For unit $k$, the result of global average pooling is $F_k = \sum_{x,y} f_k(x, y)$. Thus, for a given class $c$, the input to the softmax is $S_c = \sum_k w_k^c F_k$, where $w_k^c$ is the response weight of the $k$-th unit for class $c$; $w_k^c$ reflects the importance of $F_k$ for class $c$. Substituting $F_k$ into $S_c$ gives:
$$S_c = \sum_k w_k^c \sum_{x,y} f_k(x, y) = \sum_{x,y} \sum_k w_k^c f_k(x, y)$$
Finally, the classification network is trained on the training set to fine-tune its parameters so that it performs well on the specific data set.
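The calculation in step (1) can be sketched in NumPy as follows. This is an illustrative sketch, not the patented implementation: the array shapes, the random inputs, and the function names `class_activation_maps` and `class_scores` are assumptions. The sketch treats the per-position term $\sum_k w_k^c f_k(x, y)$ as the class activation map of class c, which is the usual CAM definition, and checks that summing that map over all positions reproduces $S_c$.

```python
import numpy as np

def class_activation_maps(features, fc_weights):
    """Project the output-layer weights back onto the feature maps.

    features:   (num_units, H, W)  activations f_k(x, y) of the last conv layer
    fc_weights: (num_classes, num_units)  weights w_k^c of the output layer
    returns:    (num_classes, H, W)  sum_k w_k^c * f_k(x, y) for every class c
    """
    return np.einsum("ck,khw->chw", fc_weights, features)

def class_scores(features, fc_weights):
    """S_c = sum_k w_k^c * F_k with F_k = sum_{x,y} f_k(x, y)."""
    pooled = features.sum(axis=(1, 2))      # F_k, one value per unit
    return fc_weights @ pooled              # S_c, one value per class

# Toy inputs (assumed sizes): 8 units on a 4x4 grid, 3 classes.
rng = np.random.default_rng(0)
features = rng.random((8, 4, 4))
fc_weights = rng.standard_normal((3, 8))

cams = class_activation_maps(features, fc_weights)
scores = class_scores(features, fc_weights)

# Summing each map over all positions gives back the class score S_c.
print(np.allclose(cams.sum(axis=(1, 2)), scores))
```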
Step (2), after the classification network has been trained, each image in the training set is resampled four times, with sampling rates of 0.5, 1.0, 1.5, and 2.0, and the resulting images at four scales are each input into the classification network Resnet50 to obtain the class activation map corresponding to each input image. The four class activation maps of different scales are then resampled to the size of the original input image, fused, and averaged. The specific process is as follows:
For the $i$-th input image, let $M_{i,c}^{j}$ denote the class activation map at scale $j$ for category $c$ of the $i$-th image, resampled to the original image size, where $j \in \{0.5, 1.0, 1.5, 2.0\}$. The class activation map after fusing the four scales is:
$$M_{i,c} = \frac{1}{4} \sum_{j \in \{0.5,\,1.0,\,1.5,\,2.0\}} M_{i,c}^{j}$$
Multi-scale fusion inference improves localization accuracy and avoids target localization errors on the one hand; on the other hand, fusing the inference results of different scales compensates, to some extent, for the loss of detail in single-scale inference.
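The multi-scale fusion of step (2) can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: `bilinear_resize` is a simple corner-aligned bilinear interpolation written for illustration, `cam_fn` is a hypothetical callable standing in for the trained classification network, and `toy_cam_fn` is a placeholder that has nothing to do with Resnet50.

```python
import numpy as np

def bilinear_resize(arr, out_h, out_w):
    """Bilinearly resize the last two axes of `arr` to (out_h, out_w)."""
    in_h, in_w = arr.shape[-2:]
    ys = np.linspace(0, in_h - 1, out_h)    # sample rows in input coordinates
    xs = np.linspace(0, in_w - 1, out_w)    # sample columns in input coordinates
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]                 # vertical interpolation weights
    wx = (xs - x0)[None, :]                 # horizontal interpolation weights
    rows0, rows1 = arr[..., y0, :], arr[..., y1, :]
    top = rows0[..., :, x0] * (1 - wx) + rows0[..., :, x1] * wx
    bottom = rows1[..., :, x0] * (1 - wx) + rows1[..., :, x1] * wx
    return top * (1 - wy) + bottom * wy

def fuse_multiscale_cams(image, cam_fn, scales=(0.5, 1.0, 1.5, 2.0)):
    """Average the CAMs computed at several scales, at the original image size."""
    h, w = image.shape[-2:]
    resized_cams = []
    for s in scales:
        scaled = bilinear_resize(image, max(1, round(h * s)), max(1, round(w * s)))
        cams = cam_fn(scaled)               # (num_classes, h', w') at this scale
        resized_cams.append(bilinear_resize(cams, h, w))
    return np.mean(resized_cams, axis=0)    # (num_classes, h, w)

def toy_cam_fn(img):
    """Hypothetical stand-in for the classifier: two 'classes' from one image."""
    return np.stack([img, 1.0 - img])

rng = np.random.default_rng(0)
image = rng.random((8, 8))
fused = fuse_multiscale_cams(image, toy_cam_fn)
print(fused.shape)
```

In practice `cam_fn` would run the fine-tuned network on the rescaled image and return one activation map per class; only the resizing and averaging logic is illustrated here.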
Step (3), the fused class activation map is normalized. Let $x_c$ be the value at any position in the class activation map of class $c$; a normalization formula maps each value of the class activation map to a number in $[0, 1]$.
After normalization, threshold segmentation is applied to the class activation map: positions whose value is greater than or equal to the threshold are set as foreground, and positions whose value is smaller than the threshold are set as background. Let $f(x, y)$ be the value of the class activation map at spatial position $(x, y)$; the segmentation result $g(x, y)$ is:
$$g(x, y) = \begin{cases} 1 \ (\text{foreground}), & f(x, y) \ge T \\ 0 \ (\text{background}), & f(x, y) < T \end{cases}$$
where $T$ is the set threshold.
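Step (3) can be sketched as follows. The normalization formula itself is not reproduced in this text, so the sketch assumes per-class min-max normalization, which is one common way to map values to [0, 1]; the function names and the threshold value 0.5 are likewise illustrative assumptions.

```python
import numpy as np

def normalize_cam(cam, eps=1e-8):
    """Min-max normalize each class map to [0, 1] (assumed normalization)."""
    lo = cam.min(axis=(-2, -1), keepdims=True)
    hi = cam.max(axis=(-2, -1), keepdims=True)
    # Guard against a constant map, where hi - lo would be zero.
    return (cam - lo) / np.maximum(hi - lo, eps)

def threshold_cam(cam, T):
    """Foreground (1) where the value is >= T, background (0) otherwise."""
    return (cam >= T).astype(np.uint8)

# One class, 2x2 map with raw activation values.
cam = np.array([[[0.0, 2.0],
                 [4.0, 8.0]]])
norm = normalize_cam(cam)
mask = threshold_cam(norm, T=0.5)

print(norm)
print(mask)
```

The threshold T trades coverage against precision: a lower value marks more of the object as foreground but also admits more background.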
Step (4), the segmentation maps obtained in step (3) are used as labels for training the semantic segmentation network. The semantic segmentation network, Mixed-UNet, is shown in FIG. 2. Its overall framework is a fusion of two UNets: the two UNets share one feature extractor in the feature extraction stage and split into two branches in the up-sampling stage. A pyramid pooling module (PPM) is added to the first branch, and an attention mechanism (SE module) is added to the second branch.
FIG. 3 shows the PPM according to an embodiment of the present invention. The pyramid pooling module reduces the loss of context information between different sub-regions by fusing features at 4 pyramid scales: the first level is the coarsest and produces a single-bin output by global pooling, and the remaining three levels produce pooled features at different scales. To maintain the weight of the global feature, if the pyramid has N levels, a 1 × 1 convolution after each level reduces the number of channels to 1/N of the original. Each pooled feature map is then upsampled by bilinear interpolation to the size of the feature map before pooling, and all of them are concatenated.
A Squeeze-and-Excitation (SE) block, whose basic structure is shown in FIG. 4, is added to the second branch. It aims to improve the representational capability of the network by modeling the dependencies between channels and recalibrating the features channel by channel, so that the network can use global information to selectively enhance informative features and suppress useless ones. The first step is the squeeze operation, which takes the global spatial feature of each channel in the feature map as the representation of that channel, forming a channel descriptor.
The second step is the excitation operation, which learns how strongly the segmentation network depends on each channel and adjusts the weight of each channel in the feature map accordingly; the adjusted feature map is the output of the SE block. The multi-scale information extracted by the first branch and the attention information extracted by the second branch are concatenated, and the final segmentation result is obtained by convolution.
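The squeeze and excitation steps of the second branch can be sketched as follows. This is a minimal NumPy sketch, not the Mixed-UNet implementation: the two fully connected layers with ReLU and sigmoid, the channel-reduction ratio of 2, and the random weights `w1` and `w2` are assumptions based on the commonly used form of the SE block.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(features, w1, w2):
    """Channel-wise recalibration of a (C, H, W) feature map.

    w1: (C // r, C) weights of the reducing layer (assumed form)
    w2: (C, C // r) weights of the expanding layer (assumed form)
    """
    # Squeeze: one global descriptor per channel (global average pooling).
    squeezed = features.mean(axis=(1, 2))               # (C,)
    # Excitation: learn a gate in (0, 1) for every channel.
    hidden = np.maximum(w1 @ squeezed, 0.0)             # ReLU, (C // r,)
    gates = sigmoid(w2 @ hidden)                        # (C,)
    # Scale: reweight each channel of the input feature map.
    return features * gates[:, None, None], gates

rng = np.random.default_rng(0)
features = rng.random((4, 5, 5))          # 4 channels on a 5x5 grid
w1 = rng.standard_normal((2, 4))          # reduction ratio r = 2 (assumed)
w2 = rng.standard_normal((4, 2))

recalibrated, gates = se_block(features, w1, w2)
print(gates)
```

In a trained network `w1` and `w2` are learned, so channels that help segmentation receive gates close to 1 and uninformative channels are attenuated.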
The labels are optimized with a Dense CRF. Specifically, the training images and the labels are input into the Dense CRF together, and the Dense CRF model refines the labels according to the pixel correlations in the training images to obtain finer output. The loss function of the semantic segmentation network has two parts: Loss1, the loss between the segmentation result and the labels obtained by threshold segmentation in step (3), and Loss2, the loss between the segmentation result and the labels optimized by the Dense CRF. The total loss is Loss = Loss1 + Loss2. Each part takes the form of a weighted cross-entropy loss:
$$L(p, \hat{p}) = -\left[\beta\, p \log \hat{p} + (1 - p) \log(1 - \hat{p})\right]$$
where $p$ denotes the true label and $\hat{p}$ denotes the predicted label. $\beta$ is a manually set hyper-parameter; all positive samples are weighted by $\beta$ (between 0 and 1), which effectively mitigates class imbalance.
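The two-part loss can be sketched as follows. This is an illustrative NumPy sketch that assumes the standard weighted binary cross-entropy, in which only the positive term is scaled by β, averaged over pixels; the function names, the clipping constant, and the toy values are assumptions.

```python
import numpy as np

def weighted_bce(p, p_hat, beta, eps=1e-7):
    """Weighted cross-entropy: positive pixels are weighted by beta."""
    p_hat = np.clip(p_hat, eps, 1.0 - eps)   # avoid log(0)
    loss = -(beta * p * np.log(p_hat) + (1.0 - p) * np.log(1.0 - p_hat))
    return loss.mean()

def total_loss(pred, thresh_label, crf_label, beta):
    """Loss = Loss1 (threshold labels) + Loss2 (Dense CRF refined labels)."""
    loss1 = weighted_bce(thresh_label, pred, beta)
    loss2 = weighted_bce(crf_label, pred, beta)
    return loss1 + loss2

# Toy example: three pixels, two label sources that disagree on the last pixel.
pred = np.array([0.9, 0.2, 0.6])
thresh_label = np.array([1.0, 0.0, 1.0])
crf_label = np.array([1.0, 0.0, 0.0])

loss = total_loss(pred, thresh_label, crf_label, beta=0.5)
print(loss)
```

With β = 1 the expression reduces to the ordinary binary cross-entropy, so β only changes how strongly positive (foreground) pixels contribute.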
Step (5), during training, the model is evaluated on the validation set every 100 iterations and the evaluated model is saved. After training, the model that performs best on the validation set is selected as the final model. In the testing stage, the image data of the test set are input to obtain the classification and segmentation results of the images.
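The model-selection procedure of step (5) can be sketched as a small loop. This is a hedged sketch in plain Python: `train_step` and `validate` are hypothetical callables standing in for one optimization step and one validation pass, and the toy functions at the bottom exist only to make the example runnable.

```python
def train_with_validation(train_step, validate, num_iterations, val_every=100):
    """Validate every `val_every` iterations and keep the best-scoring model."""
    best_score, best_state, history = float("-inf"), None, []
    for it in range(1, num_iterations + 1):
        state = train_step(it)               # returns the current model state
        if it % val_every == 0:
            score = validate(state)          # higher is better (assumed)
            history.append((it, score))      # record of every validated model
            if score > best_score:
                best_score, best_state = score, state
    return best_state, best_score, history

# Toy stand-ins: the "model" is just the iteration number, and the
# validation score peaks at iteration 300.
def toy_train_step(it):
    return it

def toy_validate(state):
    return -(state - 300) ** 2

best_state, best_score, history = train_with_validation(
    toy_train_step, toy_validate, num_iterations=500)
print(best_state, best_score, len(history))
```

Selecting on the validation set rather than on the last iteration guards against keeping a model that has started to overfit the pseudo labels.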
Claims (6)
1. A weakly supervised semantic segmentation method based on deep learning, characterized by comprising the following steps:
step (1), training a classification network;
using Resnet50 pre-trained on the ImageNet data set as the classification framework, and replacing the convolutions in the fourth and fifth convolution blocks with dilated convolutions with a dilation rate of 2; first collecting raw data, which are labeled by professionals; dividing the labeled data set into a training set, a validation set, and a test set; according to the actual situation, the data set has K categories, and c denotes any one of them; then fine-tuning the classification framework on the training set with classification labels to complete the training of the classification network;
step (2), after the classification network has been trained, resampling each image in the training set at multiple scales by bilinear interpolation, with sampling rates of 0.5, 1.0, 1.5, and 2.0; inputting the resampled images at the four scales into the classification network to obtain class activation maps at four scales, then resampling the class activation maps at the four scales to the size of the original input image, and fusing and averaging them to obtain a fused class activation map, namely the corrected class activation map;
step (3), normalizing the corrected class activation maps, and obtaining a segmentation map for each class by threshold segmentation according to the actual number of classes;
step (4), taking the training set as input, using the segmentation maps obtained in step (3) as labels for semantic segmentation, and at the same time inputting the original input images and the segmentation maps obtained in step (3) into a fully connected conditional random field (dense conditional random field, Dense CRF) to optimize the labels; training a UNet-based semantic segmentation network (Mixed-UNet) with the optimized labels and the labels obtained in step (3), wherein the overall framework of the semantic segmentation network is a fusion of two UNets, the two UNets share one feature extractor in the feature extraction stage and split into two branches in the up-sampling stage; a pyramid pooling module is added to the first branch, and an attention mechanism is added to the second branch;
step (5), in the training stage, taking the training set as input, computing the loss function of the output, adjusting the network parameters by the back-propagation algorithm, and evaluating the model on the validation set; taking the model that performs best on the validation set as the final model; and after training, testing the trained model on the test set.
2. The weakly supervised semantic segmentation method based on deep learning as claimed in claim 1, wherein the specific method in step (1) is as follows:
first collecting raw data, which are labeled by professionals; dividing the labeled data set into a training set, a validation set, and a test set; according to the actual situation, the data set has K categories, and c denotes any one of them; using Resnet50 pre-trained on the ImageNet data set as the classification framework, and replacing the convolutions in the fourth and fifth convolution blocks of Resnet50 with dilated convolutions with a dilation rate of 2; then removing the fully connected layer of Resnet50, applying global average pooling to the final feature map, and using the pooled features as the input of a fully connected layer that produces the required output; the detailed calculation is as follows:
for any given input image, $f_k(x, y)$ denotes the activation of unit $k$ of the last convolutional layer at spatial position $(x, y)$; for unit $k$, the result of global average pooling is $F_k = \sum_{x,y} f_k(x, y)$; thus, for a given class $c$, the input to the softmax is $S_c = \sum_k w_k^c F_k$, where $w_k^c$ is the response weight of the $k$-th unit for class $c$; $w_k^c$ reflects the importance of $F_k$ for class $c$; substituting $F_k$ into $S_c$ gives:
$$S_c = \sum_k w_k^c \sum_{x,y} f_k(x, y) = \sum_{x,y} \sum_k w_k^c f_k(x, y)$$
finally, the classification network is trained on the training set to fine-tune its parameters so that it performs well on the specific data set.
3. The weakly supervised semantic segmentation method based on deep learning as claimed in claim 2, wherein the specific method in step (2) is as follows:
after the classification network has been trained, resampling each image in the training set four times, with sampling rates of 0.5, 1.0, 1.5, and 2.0, and inputting the resulting images at four scales into the classification network Resnet50 to obtain the class activation map corresponding to each input image; then resampling the four class activation maps of different scales to the size of the original input image, and fusing and averaging them; the specific process is as follows:
for the $i$-th input image, $M_{i,c}^{j}$ denotes the class activation map at scale $j$ for category $c$ of the $i$-th image, resampled to the original image size, where $j \in \{0.5, 1.0, 1.5, 2.0\}$; the class activation map after fusing the four scales is:
$$M_{i,c} = \frac{1}{4} \sum_{j \in \{0.5,\,1.0,\,1.5,\,2.0\}} M_{i,c}^{j}$$
4. The weakly supervised semantic segmentation method based on deep learning as claimed in claim 3, wherein the specific method in step (3) is as follows:
normalizing the fused class activation map; letting $x_c$ be the value at any position in the class activation map of class $c$, a normalization formula maps each value of the class activation map to a number in $[0, 1]$;
after normalization, applying threshold segmentation to the class activation map, wherein positions whose value is greater than or equal to the threshold are set as foreground and positions whose value is smaller than the threshold are set as background; letting $f(x, y)$ be the value of the class activation map at spatial position $(x, y)$, the segmentation result $g(x, y)$ is:
$$g(x, y) = \begin{cases} 1 \ (\text{foreground}), & f(x, y) \ge T \\ 0 \ (\text{background}), & f(x, y) < T \end{cases}$$
where $T$ is the set threshold.
5. The weakly supervised semantic segmentation method based on deep learning as claimed in claim 4, wherein the specific method in step (4) is as follows:
using the segmentation maps obtained in step (3) as labels for training the semantic segmentation network; the overall framework of the semantic segmentation network is a fusion of two UNets, the two UNets share one feature extractor in the feature extraction stage and split into two branches in the up-sampling stage; a pyramid pooling module is added to the first branch, and an attention mechanism is added to the second branch; the pyramid pooling module reduces the loss of context information between different sub-regions by fusing features at 4 pyramid scales, wherein the first level is the coarsest and produces a single-bin output by global pooling, and the remaining three levels produce pooled features at different scales; to maintain the weight of the global feature, if the pyramid has N levels, a 1 × 1 convolution after each level reduces the number of channels to 1/N of the original; each pooled feature map is then upsampled by bilinear interpolation to the size of the feature map before pooling, and all of them are concatenated; a Squeeze-and-Excitation (SE) block is added to the second branch, wherein the first step is the squeeze operation, which takes the global spatial feature of each channel in the feature map as the representation of that channel, forming a channel descriptor; the second step is the excitation operation, which learns how strongly the segmentation network depends on each channel and adjusts the weight of each channel in the feature map accordingly, the adjusted feature map being the output of the SE block; the multi-scale information extracted by the first branch and the attention information extracted by the second branch are concatenated, and the final segmentation result is obtained by convolution;
the labels are optimized with a Dense CRF; specifically, the training images and the labels are input into the Dense CRF together, and the Dense CRF model refines the labels according to the pixel correlations in the training images to obtain finer output; the loss function of the semantic segmentation network has two parts, namely Loss1, the loss between the segmentation result and the labels obtained by threshold segmentation in step (3), and Loss2, the loss between the segmentation result and the labels optimized by the Dense CRF; the total loss is Loss = Loss1 + Loss2; each part takes the form of a weighted cross-entropy loss:
$$L(p, \hat{p}) = -\left[\beta\, p \log \hat{p} + (1 - p) \log(1 - \hat{p})\right]$$
where $p$ denotes the true label, $\hat{p}$ denotes the predicted label, and $\beta$ is a hyper-parameter that weights the positive samples.
6. The weakly supervised semantic segmentation method based on deep learning as claimed in claim 5, wherein in the training process of step (5), the model is evaluated on the validation set every 100 iterations and the evaluated model is saved; after training, the model that performs best on the validation set is selected as the final model; in the testing stage, the image data of the test set are input to obtain the classification and segmentation results of the images.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110441665.6A CN113159048A (en) | 2021-04-23 | 2021-04-23 | Weak supervision semantic segmentation method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113159048A (en) | 2021-07-23 |
Family
ID=76869829
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110441665.6A Pending CN113159048A (en) | 2021-04-23 | 2021-04-23 | Weak supervision semantic segmentation method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113159048A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113838130A (en) * | 2021-08-30 | 2021-12-24 | 厦门大学 | Weak supervision target positioning method based on feature expansibility learning |
CN114022406A (en) * | 2021-09-15 | 2022-02-08 | 济南国科医工科技发展有限公司 | Image segmentation method, system and terminal for semi-supervised learning |
CN114170233A (en) * | 2021-12-09 | 2022-03-11 | 北京字跳网络技术有限公司 | Image segmentation label generation method and device, electronic equipment and storage medium |
CN114494693A (en) * | 2021-12-31 | 2022-05-13 | 清华大学 | Method and device for performing semantic segmentation on image |
CN114529893A (en) * | 2021-12-22 | 2022-05-24 | 电子科技大学成都学院 | Container code identification method and device |
CN114677515A (en) * | 2022-04-25 | 2022-06-28 | 电子科技大学 | Weak supervision semantic segmentation method based on inter-class similarity |
CN114820655A (en) * | 2022-04-26 | 2022-07-29 | 中国地质大学(武汉) | Weak supervision building segmentation method taking reliable area as attention mechanism supervision |
CN114842330A (en) * | 2022-03-29 | 2022-08-02 | 深圳市规划和自然资源数据管理中心 | Multi-scale background perception pooling weak supervised building extraction method |
CN114998595A (en) * | 2022-07-18 | 2022-09-02 | 赛维森(广州)医疗科技服务有限公司 | Weak supervision semantic segmentation method, semantic segmentation method and readable storage medium |
CN115222942A (en) * | 2022-07-26 | 2022-10-21 | 吉林建筑大学 | New coronary pneumonia CT image segmentation method based on weak supervised learning |
CN116935168A (en) * | 2023-09-13 | 2023-10-24 | 苏州魔视智能科技有限公司 | Method, device, computer equipment and storage medium for training target detection model |
CN118503546A (en) * | 2024-07-18 | 2024-08-16 | 广州博今网络技术有限公司 | Form data pushing method and system based on associated objects |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110363201A (en) * | 2019-07-10 | 2019-10-22 | 上海交通大学 | Weakly supervised semantic segmentation method and system based on Cooperative Study |
CN112052877A (en) * | 2020-08-06 | 2020-12-08 | 杭州电子科技大学 | Image fine-grained classification method based on cascade enhanced network |
Non-Patent Citations (2)
Title |
---|
BOLEI ZHOU et al.: "Learning Deep Features for Discriminative Localization", 《ARXIV》, 14 December 2015 (2015-12-14), pages 1-10 * |
YAN KONG et al.: "Automated yeast cells segmentation and counting using a parallel U-Net based two-stage framework", 《OSA CONTINUUM》, 8 April 2020 (2020-04-08), pages 982-993 * |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113838130B (en) * | 2021-08-30 | 2023-07-18 | 厦门大学 | Weak supervision target positioning method based on feature expansibility learning |
CN113838130A (en) * | 2021-08-30 | 2021-12-24 | 厦门大学 | Weak supervision target positioning method based on feature expansibility learning |
CN114022406A (en) * | 2021-09-15 | 2022-02-08 | 济南国科医工科技发展有限公司 | Image segmentation method, system and terminal for semi-supervised learning |
CN114170233A (en) * | 2021-12-09 | 2022-03-11 | 北京字跳网络技术有限公司 | Image segmentation label generation method and device, electronic equipment and storage medium |
CN114170233B (en) * | 2021-12-09 | 2024-02-09 | 北京字跳网络技术有限公司 | Image segmentation label generation method and device, electronic equipment and storage medium |
CN114529893A (en) * | 2021-12-22 | 2022-05-24 | 电子科技大学成都学院 | Container code identification method and device |
CN114494693A (en) * | 2021-12-31 | 2022-05-13 | 清华大学 | Method and device for performing semantic segmentation on image |
CN114842330B (en) * | 2022-03-29 | 2023-08-18 | 深圳市规划和自然资源数据管理中心 | Multi-scale background perception pooling weak supervision building extraction method |
CN114842330A (en) * | 2022-03-29 | 2022-08-02 | 深圳市规划和自然资源数据管理中心 | Multi-scale background perception pooling weak supervised building extraction method |
CN114677515A (en) * | 2022-04-25 | 2022-06-28 | 电子科技大学 | Weak supervision semantic segmentation method based on inter-class similarity |
CN114820655A (en) * | 2022-04-26 | 2022-07-29 | 中国地质大学(武汉) | Weak supervision building segmentation method taking reliable area as attention mechanism supervision |
CN114820655B (en) * | 2022-04-26 | 2024-04-19 | 中国地质大学(武汉) | Weak supervision building segmentation method taking reliable area as attention mechanism supervision |
CN114998595A (en) * | 2022-07-18 | 2022-09-02 | 赛维森(广州)医疗科技服务有限公司 | Weak supervision semantic segmentation method, semantic segmentation method and readable storage medium |
CN114998595B (en) * | 2022-07-18 | 2022-11-08 | 赛维森(广州)医疗科技服务有限公司 | Weak supervision semantic segmentation method, semantic segmentation method and readable storage medium |
CN115222942A (en) * | 2022-07-26 | 2022-10-21 | 吉林建筑大学 | New coronary pneumonia CT image segmentation method based on weak supervised learning |
CN116935168A (en) * | 2023-09-13 | 2023-10-24 | 苏州魔视智能科技有限公司 | Method, device, computer equipment and storage medium for training target detection model |
CN116935168B (en) * | 2023-09-13 | 2024-01-30 | 苏州魔视智能科技有限公司 | Method, device, computer equipment and storage medium for target detection |
CN118503546A (en) * | 2024-07-18 | 2024-08-16 | 广州博今网络技术有限公司 | Form data pushing method and system based on associated objects |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN113159048A (en) | Weak supervision semantic segmentation method based on deep learning | |
CN110070074B (en) | Method for constructing pedestrian detection model | |
CN111680706B (en) | Dual-channel output contour detection method based on coding and decoding structure | |
CN105701502B (en) | Automatic image annotation method based on Monte Carlo data equalization | |
CN112949828B (en) | Graph convolution neural network traffic prediction method and system based on graph learning | |
CN114092832B (en) | High-resolution remote sensing image classification method based on parallel hybrid convolutional network | |
CN112101430B (en) | Anchor frame generation method for image target detection processing and lightweight target detection method | |
CN110781262B (en) | Semantic map construction method based on visual SLAM | |
CN112347970B (en) | Remote sensing image ground object identification method based on graph convolution neural network | |
CN109697469A (en) | A kind of self study small sample Classifying Method in Remote Sensing Image based on consistency constraint | |
CN113591380B (en) | Traffic flow prediction method, medium and equipment based on graph Gaussian process | |
CN113128355A (en) | Unmanned aerial vehicle image real-time target detection method based on channel pruning | |
CN109029363A (en) | A kind of target ranging method based on deep learning | |
CN100370486C (en) | Typhoon center positioning method based on embedded type concealed Markov model and cross entropy | |
CN111882620B (en) | Road drivable area segmentation method based on multi-scale information | |
CN106355151A (en) | Recognition method, based on deep belief network, of three-dimensional SAR images | |
CN106295613A (en) | A kind of unmanned plane target localization method and system | |
CN110738247A (en) | fine-grained image classification method based on selective sparse sampling | |
CN112001422B (en) | Image mark estimation method based on deep Bayesian learning | |
CN114863348B (en) | Video target segmentation method based on self-supervision | |
CN112581483B (en) | Self-learning-based plant leaf vein segmentation method and device | |
CN113591617B (en) | Deep learning-based water surface small target detection and classification method | |
CN111461006A (en) | Optical remote sensing image tower position detection method based on deep migration learning | |
CN109671019A (en) | A kind of remote sensing image sub-pixed mapping drafting method based on multi-objective optimization algorithm and sparse expression | |
CN114973019A (en) | Deep learning-based geospatial information change detection classification method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||