CN115019039B - Instance segmentation method and system combining self-supervision and global information enhancement - Google Patents
- Publication number
- CN115019039B (application CN202210582668.6A)
- Authority
- CN
- China
- Prior art keywords
- instance
- network
- supervision
- global information
- self
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06V10/26 — Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
- G06N3/08 — Learning methods (computing arrangements based on biological models; neural networks)
- G06V10/764 — Image or video recognition or understanding using classification, e.g. of video objects
- G06V10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/806 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of extracted features
- G06V10/82 — Image or video recognition or understanding using neural networks
Abstract
The invention discloses an instance segmentation method and system combining self-supervision and global information enhancement. The method first obtains a feature pyramid and fuses feature maps with a feature extraction network based on a ResNet backbone and an FPN module. A Fastformer-based global information enhancement network then models the interactions among the pixels of the feature maps and extracts global information. Instance segmentation is then performed by a prediction network, in which the category prediction network performs multi-label classification of the instances of interest and the mask prediction network classifies the pixel values in the region where each instance is located to generate the instance mask. In addition, a self-supervised learning network performs contrastive learning among the instances in a picture, strengthening the model's understanding of the picture and improving its generalization. The method addresses the poor detection of occluded and incomplete objects, strengthens the generalization ability of the model, and improves segmentation performance in noisy scenes.
Description
Technical Field
The invention relates to the technical field of artificial intelligence and computer vision, in particular to an instance segmentation method and system combining self-supervision and global information enhancement.
Background
Instance segmentation is a more challenging task than object detection in computer vision, since it involves both object detection and semantic segmentation: the objects of interest in an image are first located and classified, and each instance is then semantically segmented to separate foreground from background. With the rapid development of intelligent driving, medical image segmentation, and related technologies, higher requirements are placed on the performance and real-time speed of instance segmentation algorithms. However, the conventional top-down (detection-based) and bottom-up (semantic-segmentation-based) instance segmentation methods still struggle to meet these requirements in fields such as intelligent driving.
Enhancing the performance of instance segmentation algorithms while shortening forward inference time is therefore of great significance. In recent years, several strong single-stage instance segmentation algorithms have been proposed that alleviate these problems and achieve fairly good results. Nevertheless, these algorithms still have drawbacks: convolution-based feature extraction networks lack global information, so detection of incomplete or occluded objects is poor; in addition, purely supervised training yields models with weak generalization, and performance degrades in scenes with heavy noise.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an instance segmentation method and system combining self-supervision and global information enhancement, to solve the problems that existing instance segmentation methods lack global information in the feature extraction stage, generalize poorly, and segment badly in noisy scenes.
In order to achieve the above object, the present invention provides an example segmentation method and system combining self-supervision and global information enhancement, including:
Step S1: establishing an instance segmentation model;
the example segmentation model comprises a feature extraction network, a global information enhancement network, a self-supervision learning network, a category prediction network and a mask prediction network;
the feature extraction network comprises a ResNet network and an FPN network. ResNet obtains a feature pyramid by stacking convolution layers, ReLU layers, and normalization layers with residual connections. The FPN combines the rich semantic information of the upper-level feature maps with the accurate positional information of the lower-level feature maps in the feature pyramid to perform feature fusion;
The global information enhancement network is composed of Fastformer modules and is used for modeling the interaction among the pixels of the feature map, extracting context information, and enhancing the global information of the feature map;
The self-supervised learning network is used for performing contrastive learning on the instances in a picture, enhancing the model's understanding of the picture and its generalization ability;
The class prediction network is used for performing multi-label classification of the instances of interest, obtaining the class corresponding to each instance;
The mask prediction network is used for performing binary classification of the pixels in the selected instance region, distinguishing foreground from background, and generating the mask of the instance.
Step S2: training an example segmentation model;
The selected training data set, comprising the picture data and the corresponding label files, is input. Feature maps are first extracted and then fused. Global information is then enhanced, and the result is input to the prediction network; the loss function is obtained by comparison with the label files, and its back-propagation guides the direction of model training.
Step S3: instance partitioning
The picture is divided into S x S networks, each of which is responsible for predicting the instance where the center point falls in that location. I.e. centering on the grid, predicts the class and mask of the corresponding instance.
Optionally, the feature extraction network is ResNet-50 and FPN network.
Further, the global information enhancement module is a Fastformer network based on additive attention.
Additive attention applies linear transformations to the input feature sequence E ∈ R^(N×d) (N is the sequence length and d the hidden dimension) to obtain a query matrix, a key matrix, and a value matrix, denoted Q, K, V ∈ R^(N×d).
Additive attention over the query matrix Q produces attention weights, and the weighted sum of the query vectors yields a global query vector. The global query vector is then point-wise multiplied with the key vectors in K, modeling their interrelationships.
Further, the same operation produces a global key vector, which is interactively modeled with the value vectors in V, finally yielding feature vectors containing rich global semantic information.
The self-supervised learning network first obtains the feature representations of all instances using the bounding-box label information. For a randomly selected sample instance A, the remaining instances form a candidate pool, and the similarity scores between A and the candidates are computed.
Optionally, the similarity score calculating process is as follows:
Further, the instances are ranked by similarity score, the top-k are taken as the query set Q, and pseudo-positive instances are then mined from the candidate pool using the query set.
The pseudo-positive mining process comprises the following steps:
(1) The similarity between each instance in Q and each instance in the candidate pool is computed; each instance I in the candidate pool receives N similarity scores (N is the number of instances in the query set Q).
(2) The similarity scores are aggregated and sorted; the top-k instances exceeding a threshold are taken as pseudo-positives and added to the query set Q.
(3) Pseudo-positive mining continues with the updated query set Q until the similarity of newly mined instances falls below the threshold. The query set is then taken as the pseudo-positive set, and the remaining instances in the candidate pool as the negative set.
(4) Obtaining a similarity score of the sample A and each example in the pseudo positive example set by using a softmax function:
where p_i is an instance of the pseudo-positive set, N_n is the number of negative samples, and n_i is an instance of the negative set.
Optionally, taking a negative logarithm of the similarity score to obtain a comparison learning loss function:
Further, the class prediction network uses the Focal loss; the loss is obtained from the predicted probability that each instance belongs to a given class.
The mask prediction network performs binary classification of the pixels in the selected instance region, distinguishing foreground from background, and generates the mask of the instance.
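The Focal loss named above is presumably the standard formulation; the patent does not spell it out, so the following is a minimal sketch under that assumption (function and parameter names are illustrative, not from the patent):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Standard binary Focal loss for one prediction.
    p: predicted probability of the positive class; y: 0/1 label.
    The factor (1 - p_t)**gamma down-weights easy, well-classified
    examples so that training focuses on the hard ones."""
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    a_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -a_t * (1.0 - p_t) ** gamma * np.log(max(p_t, 1e-12))
```

For a confident correct prediction the loss is driven almost to zero, which is what makes the Focal loss suitable for the heavily class-imbalanced grid-cell classification used here.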
Optionally, the mask prediction network loss function is:
where N_pos is the number of positive samples, the class score is that predicted by the grid cell at position (i, j), and ψ is the indicator function.
Optionally, for d_mask, the Dice Loss is used:
L_Dice = 1 − D(p, q), with D(p, q) = 2·Σ_{x,y}(p_{x,y}·q_{x,y}) / (Σ_{x,y} p_{x,y}² + Σ_{x,y} q_{x,y}²)
where p_{x,y} is the predicted pixel value of the cell at (x, y) and q_{x,y} is the ground-truth pixel value of the cell at (x, y).
Compared with the prior art, the technical scheme provided by the invention has the following beneficial effects:
(1) Building on a single-stage instance segmentation algorithm, the invention adds an additive-attention-based Fastformer module to model pixel-level global semantic information in the feature maps, improving the model's segmentation of occluded and incomplete objects.
(2) The invention adds a self-supervised learning module to the prediction network; contrastive learning over all instances in a picture strengthens the model's understanding of the picture and enhances its generalization ability.
Drawings
FIG. 1 is a flow chart of an example segmentation model provided by an embodiment of the present invention;
FIG. 2 is a diagram of an example segmentation model framework provided by an embodiment of the present invention;
FIG. 3 is an image to be measured provided by an embodiment;
FIG. 4 (a) is a segmentation result obtained by the original single-stage instance segmentation method and system;
fig. 4 (b) is an example segmentation result obtained using the method of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the invention provides an example segmentation method and system combining self-supervision and global information enhancement, comprising the following steps:
Step S1: establishing an instance segmentation model;
As shown in fig. 1, the example segmentation model includes a feature extraction network, a global information enhancement network, a self-supervised learning network, a class prediction network, and a mask prediction network;
The feature extraction network comprises a ResNet-50 network and an FPN network. ResNet obtains a four-level feature pyramid of different scales by stacking convolution layers, ReLU layers, and normalization layers with residual connections. The FPN combines the rich semantic information of the upper-level feature maps with the accurate positional information of the lower-level feature maps in the feature pyramid to perform feature fusion;
The global information enhancement network is a Fastformer module, used to model the interaction among the pixels of the feature map, extract context information, and enhance the global information of the feature map.
Linear transformations are applied to the input feature sequence E ∈ R^(N×d) (N is the sequence length and d the hidden dimension) to obtain a query matrix, a key matrix, and a value matrix, denoted Q, K, V ∈ R^(N×d): Q = [q_1, q_2, ..., q_N], K = [k_1, k_2, ..., k_N], V = [v_1, v_2, ..., v_N].
Additive attention over the query matrix Q produces attention weights, whose weighted sum gives the global query vector:
q = Σ_{i=1..N} α_i q_i, with α_i = exp(w_q^T q_i / √d) / Σ_{j=1..N} exp(w_q^T q_j / √d)
where α_i is the attention weight of vector q_i in the query matrix Q, and w_q ∈ R^d is a learnable parameter vector. The global query vector q is then point-wise multiplied with the key vectors in K, modeling their interrelationships.
And generating a global key vector by adopting the same operation, performing interactive modeling with the value vector V, and finally obtaining a feature vector containing rich global semantic information.
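The additive-attention computation described above can be sketched in a few lines of NumPy (an illustrative sketch of a Fastformer-style layer, not the patented implementation; the real module also includes an output transformation and residual connection, and all names here are our own):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention_pool(M, w, d):
    """Pool a sequence matrix M (N x d) into a single global vector
    via additive attention with a learnable parameter vector w (d,)."""
    alpha = softmax(M @ w / np.sqrt(d))  # (N,) attention weights
    return alpha @ M                     # weighted sum -> global vector (d,)

def fastformer_layer(E, Wq, Wk, Wv, wq, wk):
    """One additive-attention layer in the Fastformer style.
    E: (N, d) input sequence; Wq/Wk/Wv: (d, d) linear projections;
    wq/wk: (d,) additive-attention parameter vectors."""
    N, d = E.shape
    Q, K, V = E @ Wq, E @ Wk, E @ Wv
    q = additive_attention_pool(Q, wq, d)  # global query vector
    P = K * q                              # element-wise interaction q with each k_i
    k = additive_attention_pool(P, wk, d)  # global key vector
    return V * k                           # (N, d) globally-enhanced features
```

Because each pooling step costs O(N·d) rather than the O(N²·d) of full self-attention, this is what lets the network model pixel-level global context at feature-map resolution.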
The self-supervised learning network performs contrastive learning on the instances in the picture, enhancing the model's understanding of the picture and its generalization ability.
First, the feature representations of all instances are obtained using the bounding-box label information. For a randomly selected sample instance A, the remaining instances form a candidate pool, and the similarity scores between A and the candidates are computed, with the following calculation formula:
The instances are ranked by similarity score and the top-k are taken as the query set Q; pseudo-positive instances are then mined from the candidate pool using the query set. The mining process comprises the following steps:
(1) The similarity between each instance in Q and each instance in the candidate pool is computed; each instance I in the candidate pool receives N similarity scores (N is the number of instances in the query set Q):
S(I, Q) = (S(I, q_1), S(I, q_2), ..., S(I, q_N))
(2) The similarity scores are aggregated and sorted; the top-k instances exceeding a threshold are taken as pseudo-positives and added to the query set Q.
(3) Pseudo-positive mining continues with the updated query set Q until the similarity of newly mined instances falls below the threshold. The query set is then taken as the pseudo-positive set, and the remaining instances in the candidate pool as the negative set.
(4) Obtaining a similarity score of the sample A and each example in the pseudo positive example set by using a softmax function:
where p_i is an instance of the pseudo-positive set, N_n is the number of negative samples, and n_i is an instance of the negative set.
Taking the negative logarithm of the similarity score to obtain a comparison learning loss function:
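Steps (1)–(4) above can be sketched as follows. This is a simplified illustration: the patent does not give its exact similarity measure or aggregation, so cosine similarity and mean aggregation are assumed here, and the temperature parameter `tau` is our own addition; all names are illustrative.

```python
import numpy as np

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def mine_pseudo_positives(anchor, pool, top_k=2, threshold=0.5):
    """Iteratively grow a pseudo-positive set for `anchor` from `pool`
    (a list of instance feature vectors), following steps (1)-(3)."""
    # Seed query set: the top-k instances most similar to the anchor.
    scored = sorted(pool, key=lambda x: cosine_sim(anchor, x), reverse=True)
    query, candidates = scored[:top_k], scored[top_k:]
    while candidates:
        # Aggregate each candidate's similarity to the whole query set (mean).
        agg = [(np.mean([cosine_sim(c, q) for q in query]), c) for c in candidates]
        agg.sort(key=lambda t: t[0], reverse=True)
        new = [c for s, c in agg[:top_k] if s > threshold]
        if not new:
            break  # nothing above threshold -> mining terminates
        query.extend(new)
        candidates = [c for s, c in agg if all(c is not n for n in new)]
    return query, candidates  # pseudo-positive set, negative set

def contrastive_loss(anchor, positives, negatives, tau=0.1):
    """InfoNCE-style loss: negative log of the softmax mass assigned
    to the pseudo-positive set (a common formulation; the patent's
    exact formula is not reproduced here)."""
    pos = np.exp([cosine_sim(anchor, p) / tau for p in positives]).sum()
    neg = np.exp([cosine_sim(anchor, n) / tau for n in negatives]).sum()
    return -np.log(pos / (pos + neg))
```

Each mining round either adds at least one instance to the query set or stops, so the loop always terminates.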
The class prediction network performs multi-label classification of the instances of interest, obtaining the class corresponding to each instance.
The mask prediction network performs binary classification of the pixels in the selected instance region, distinguishing foreground from background, and generating the mask of the instance. The mask prediction network loss function is:
where N_pos is the number of positive samples, the class score is that predicted by the grid cell at position (i, j), and ψ is the indicator function.
For d_mask, the Dice Loss is used:
L_Dice = 1 − D(p, q)
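A minimal sketch of this loss over soft masks, assuming the standard Dice coefficient D(p, q) = 2·Σ(p·q) / (Σp² + Σq²) (the function name and `eps` smoothing term are our own):

```python
import numpy as np

def dice_loss(p, q, eps=1e-6):
    """Dice loss between a predicted soft mask p and a binary
    ground-truth mask q (both H x W arrays): L_Dice = 1 - D(p, q)."""
    num = 2.0 * (p * q).sum()
    den = (p ** 2).sum() + (q ** 2).sum() + eps
    return 1.0 - num / den
```

Unlike per-pixel cross-entropy, this objective is computed over the whole mask, which keeps it stable when foreground pixels are a small fraction of the instance region.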
Step S2: training an example segmentation model;
The selected training data set, comprising the picture data and the corresponding label files, is input. Feature maps are first extracted and then fused. Global information is then enhanced, and the result is input to the head network for prediction; the loss function is obtained and back-propagated to guide model training.
The invention uses the urban street-view dataset Cityscapes for model training, which collects street-view images from different cities and contains 2,975 training images, 500 validation images, and 1,525 test images with high-quality annotations.
Step S3: instance partitioning
The picture is first divided into S x S networks, each of which is responsible for predicting the instance where the center point falls in that location. I.e. centering on the grid, predicts the class and mask of the corresponding instance.
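The grid assignment can be sketched as follows (an illustration only; the (x1, y1, x2, y2) box format and all names are assumptions, not from the patent):

```python
def center_to_grid_cell(box, img_h, img_w, S):
    """Map an instance bounding box (x1, y1, x2, y2) to the (row, col)
    of the S x S grid cell containing its center; that cell is the one
    responsible for predicting the instance's class and mask."""
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    col = min(int(cx / img_w * S), S - 1)  # clamp so a center on the
    row = min(int(cy / img_h * S), S - 1)  # right/bottom edge stays in range
    return row, col
```

This is the SOLO-style "location category" assignment: prediction responsibility depends only on where the instance center falls, so no anchor boxes are needed.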
Fig. 3 shows the images to be tested provided in the embodiment, and the segmentation results of the original single-stage instance segmentation method are shown in Fig. 4(a): the mask generated for the motorcycle on the right of the first picture fits poorly; in the second picture, the wall is misidentified as a truck owing to poor lighting and heavy noise in the right half; and in the third picture, with an incomplete instance, the motorcycle and its rider are not well separated. The instance segmentation results obtained with the method of the invention are shown in Fig. 4(b), which clearly improves on all of the above cases.
The method alleviates the poor detection of occluded or incomplete objects in the original single-stage instance segmentation algorithm, and substantially improves the model's generalization and its segmentation quality in scenes with insufficient illumination, over-exposure, rain, and the like.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (10)
1. An instance segmentation method combining self-supervision and global information enhancement, comprising:
Step S1: establishing an instance segmentation model;
the example segmentation model comprises a feature extraction network, a global information enhancement network, a self-supervision learning network, a category prediction network and a mask prediction network;
the feature extraction network comprises a ResNet network and an FPN network; ResNet is used for obtaining a feature pyramid by stacking convolution layers, ReLU layers, and normalization layers with residual connections; the FPN is used for combining the rich semantic information of the upper-level feature maps with the accurate positional information of the lower-level feature maps in the feature pyramid to perform feature fusion;
the global information enhancement network is composed of Fastformer modules and is used for modeling the interaction among the pixels of the feature map, extracting context information, and enhancing the global information extraction capability of the feature map;
the self-supervision learning network is used for performing self-supervised contrastive learning on the instances in a picture, enhancing the model's understanding of the picture and its generalization ability;
the class prediction network is used for performing multi-label classification of the instances of interest to obtain the class corresponding to each instance;
the mask prediction network is used for performing binary classification of the pixels in the selected instance region, distinguishing foreground from background, and generating a mask of the instance;
Step S2: training an example segmentation model;
Inputting a selected training data set comprising picture data and corresponding label files; first extracting feature maps and then fusing them; then enhancing global information, inputting the result into the head network for prediction, obtaining a loss function, and optimizing the direction of model training by back-propagating the loss function;
Step S3: instance partitioning
Firstly, dividing a picture into S multiplied by S networks, wherein each grid is responsible for predicting an instance that a center point falls at the position; i.e. centering on the grid, predicts the class and mask of the corresponding instance.
2. An instance segmentation method combining self-supervision and global information enhancement according to claim 1, wherein the feature extraction networks are ResNet-50 and FPN networks.
3. An instance segmentation method combining self-supervision and global information enhancement according to claim 1, wherein the global information enhancement network is an additive-attention-based Fastformer network.
4. An instance segmentation method combining self-supervision and global information enhancement according to claim 3, wherein the additive attention applies linear transformations to the input feature sequence E ∈ R^(B×d), where B is the sequence length and d the hidden dimension, to obtain a query matrix, a key matrix, and a value matrix, denoted Q, K, V ∈ R^(B×d).
5. The instance segmentation method combining self-supervision and global information enhancement according to claim 4, wherein additive attention is applied to the query matrix Q to generate attention weights, and the weighted sum of the vectors in Q yields the global query vector; the global query vector is then point-wise multiplied with K, modeling their interrelationship.
6. The instance segmentation method combining self-supervision and global information enhancement according to claim 5, wherein additive attention is applied to the key matrix K to generate attention weights, the weighted sum of the key vectors yields the global key vector, interactive modeling is performed with V, and feature vectors containing rich global semantic information are finally obtained.
7. The instance segmentation method combining self-supervision and global information enhancement according to claim 1, wherein the self-supervision learning network first obtains the feature representations of all instances using the bounding-box label information and, for a randomly selected sample instance A, takes the remaining instances as a candidate pool and calculates the similarity scores between A and the candidates.
8. The method of claim 7, wherein the similarity score calculation process is as follows:
and sorting the instances by similarity score, taking the top-k as the query set Q, and then mining pseudo-positive instances in the candidate pool using the query set.
9. The method of claim 8, wherein the mining pseudo-positive example process comprises:
(1) Calculating the similarity between each instance in Q and the instance in the candidate pool; n similarity scores are obtained for each instance I of the candidate pool, wherein N is the number of instances in the query set Q;
(2) Performing aggregation operation on the similarity scores, sequencing, taking an instance of top-k exceeding a threshold as a pseudo positive instance, and adding the pseudo positive instance into a query set Q;
(3) Continuing pseudo-positive mining with the updated query set Q until the similarity of newly mined pseudo-positives falls below the threshold, then taking the query set as the pseudo-positive set and the remaining instances in the candidate pool as the negative set;
(4) Obtaining a similarity score of the sample A and each example in the pseudo positive example set by using a softmax function:
wherein p_i is an instance of the pseudo-positive set, N_n is the number of negative samples, and n_i is an instance of the negative set;
(5) Taking the negative logarithm of the similarity score to obtain a comparison learning loss function:
10. The instance segmentation method combining self-supervision and global information enhancement according to claim 1, wherein the class prediction network uses the Focal loss, obtaining the loss function from the predicted probability that each instance belongs to a given class; and the mask prediction network performs binary classification of the pixels in the selected instance region, distinguishing foreground from background, and generating the mask of the instance.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210582668.6A CN115019039B (en) | 2022-05-26 | 2022-05-26 | Instance segmentation method and system combining self-supervision and global information enhancement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115019039A (en) | 2022-09-06
CN115019039B (en) | 2024-04-16
Family
ID=83071360
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210582668.6A Active CN115019039B (en) | 2022-05-26 | 2022-05-26 | Instance segmentation method and system combining self-supervision and global information enhancement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115019039B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024103380A1 (en) * | 2022-11-18 | 2024-05-23 | Robert Bosch Gmbh | Method and apparatus for instance segmentation |
CN116664845B (en) * | 2023-07-28 | 2023-10-13 | 山东建筑大学 | Intelligent engineering image segmentation method and system based on inter-block contrast attention mechanism |
CN117853732A (en) * | 2024-01-22 | 2024-04-09 | 广东工业大学 | Self-supervision re-digitizable terahertz image dangerous object instance segmentation method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10430946B1 (en) * | 2019-03-14 | 2019-10-01 | Inception Institute of Artificial Intelligence, Ltd. | Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques |
CN112927245A (en) * | 2021-04-12 | 2021-06-08 | 华中科技大学 | End-to-end instance segmentation method based on instance query |
CN113392711A (en) * | 2021-05-19 | 2021-09-14 | 中国科学院声学研究所南海研究站 | Smoke semantic segmentation method and system based on high-level semantics and noise suppression |
CN113837205A (en) * | 2021-09-28 | 2021-12-24 | 北京有竹居网络技术有限公司 | Method, apparatus, device and medium for image feature representation generation |
CN114387454A (en) * | 2022-01-07 | 2022-04-22 | 东南大学 | Self-supervision pre-training method based on region screening module and multi-level comparison |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11830253B2 (en) * | 2020-04-14 | 2023-11-28 | Toyota Research Institute, Inc. | Semantically aware keypoint matching |
US11941086B2 (en) * | 2020-11-16 | 2024-03-26 | Salesforce, Inc. | Systems and methods for contrastive attention-supervised tuning |
Non-Patent Citations (2)
Title |
---|
WANG X et al.; "SOLOv2: Dynamic and fast instance segmentation"; Advances in Neural Information Processing Systems; 2020-12-31; Vol. 33; pp. 17721-17732 * |
Assem Sadek et al.; "Self-Supervised Attention Learning for Depth and Ego-motion Estimation"; 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); 2021-01-24; entire document * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Khodabandeh et al. | A robust learning approach to domain adaptive object detection | |
Wang et al. | Weakly supervised adversarial domain adaptation for semantic segmentation in urban scenes | |
CN110322446B (en) | Domain self-adaptive semantic segmentation method based on similarity space alignment | |
CN115019039B (en) | Instance segmentation method and system combining self-supervision and global information enhancement | |
CN109711463B (en) | Attention-based important object detection method | |
CN114005096B (en) | Feature enhancement-based vehicle re-identification method | |
Wan et al. | An efficient small traffic sign detection method based on YOLOv3 | |
Wang et al. | An advanced YOLOv3 method for small-scale road object detection | |
CN113159120A (en) | Contraband detection method based on multi-scale cross-image weak supervision learning | |
Tian et al. | Small object detection via dual inspection mechanism for UAV visual images | |
Shen et al. | Vehicle detection in aerial images based on lightweight deep convolutional network and generative adversarial network | |
Li et al. | Detection-friendly dehazing: Object detection in real-world hazy scenes | |
CN116596966A (en) | Segmentation and tracking method based on attention and feature fusion | |
Yuan | Language bias in visual question answering: A survey and taxonomy | |
Liu et al. | Density saliency for clustered building detection and population capacity estimation | |
Yan et al. | Video scene parsing: An overview of deep learning methods and datasets | |
Wu et al. | Vehicle detection based on adaptive multi-modal feature fusion and cross-modal vehicle index using RGB-T images | |
Huang et al. | Pedestrian detection using RetinaNet with multi-branch structure and double pooling attention mechanism | |
Lv et al. | Contour deformation network for instance segmentation | |
Li et al. | Object extraction from very high-resolution images using a convolutional neural network based on a noisy large-scale dataset | |
Nam et al. | A novel unsupervised domain adaption method for depth-guided semantic segmentation using coarse-to-fine alignment | |
CN115965968A (en) | Small sample target detection and identification method based on knowledge guidance | |
Li et al. | Prediction model of urban street public space art design indicators based on deep convolutional neural network | |
Islam et al. | Faster R-CNN based traffic sign detection and classification | |
CN113298037B (en) | Vehicle weight recognition method based on capsule network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||