CN114612663B - Domain self-adaptive instance segmentation method and device based on weak supervision learning - Google Patents
- Publication number
- CN114612663B, CN202210236149.4A, CN202210236149A
- Authority
- CN
- China
- Prior art keywords
- mask
- instance segmentation
- instance
- domain
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a domain-adaptive instance segmentation method and device based on weakly supervised learning. The method first trains an initial instance segmentation model on a source domain and outputs backbone network features and semantic score tensors; it then constructs a semantic tree by hierarchical agglomerative clustering, samples leaf nodes of the semantic tree so that an annotator can quickly judge whether each instance segmentation mask is accurate, performs mask correction on inaccurately predicted instances according to the annotation information, and fine-tunes the initial instance segmentation model with the target-domain mask correction results, thereby improving the effectiveness of the instance segmentation model. The method uses a limited number of verification signals to quickly identify accurate samples, improves the adaptability of the initial instance segmentation model by propagating the accurate samples, and handles part of the noise in the inaccurate samples. It thus addresses the problems in domain adaptation that, although the segmentation model can be improved by introducing supervision signals from the target-domain dataset, manual labeling is tedious and time-consuming, and self-training introduces too much noise through pseudo labels.
Description
Technical Field
The invention relates to the technical field of instance segmentation, and in particular to a domain-adaptive instance segmentation method and device based on weakly supervised learning.
Background
Instance segmentation has been an active area of research and engineering over the last decade. It is used in many signal processing fields such as image editing, scene understanding, autonomous driving and human-machine interaction. With the rapid development of deep convolutional neural networks (DCNN), current instance segmentation approaches achieve satisfactory accuracy and efficiency. However, in the context of domain adaptation, when segmentation models learned on a source domain are applied to a target domain, they suffer a rapid degradation in performance due to data drift between the source and target domains. Pixel-level labeling of a large number of images on the target domain incurs a significant time cost, while unsupervised approaches, which minimize the task-specific and domain-specific losses of the source domain, are limited by the distributional overlap between the source and target domains. Other approaches employ self-training strategies to fine-tune the segmentation model using target-specific pseudo labels, but yield limited improvement due to noise in the pseudo labels or the introduction of strong assumptions.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention aims to provide a domain-adaptive instance segmentation method and device based on weakly supervised learning.
The aim of the invention is achieved by the following technical solution:
According to a first aspect of the present specification, there is provided a domain adaptive instance segmentation method based on weakly supervised learning, the method comprising the steps of:
(1) Training an initial instance segmentation model on the source domain, and outputting backbone network features and a semantic score tensor, wherein the semantic score tensor comprises, for each pixel, the probabilities of the different instances to which it may belong;
(2) Performing instance segmentation on the target domain with the initial instance segmentation model obtained by source-domain training, and outputting the backbone network features and semantic score tensor corresponding to each image;
(3) Taking the maximum value over the instance dimension of the semantic score tensor obtained in step (2) to obtain the instance segmentation mask of each target-domain image; multiplying the instance segmentation mask of each target-domain image with the target-domain backbone network features and the target-domain semantic score tensor respectively, to obtain the mask features and mask semantic score tensor of each target-domain instance;
(4) Concatenating the mask feature f_t of instance t obtained in step (3) with the mask semantic score tensor s_t to obtain the enhanced mask feature f_t^+ of instance t;
(5) Constructing a semantic tree for each category by hierarchical agglomerative clustering (HAC): the enhanced mask feature of each instance belonging to the category is regarded as a leaf node, and at each agglomeration step the two child nodes whose enhanced mask features have the minimum Euclidean distance are selected and merged into a merged node, wherein child nodes include leaf nodes and intermediate nodes, and the enhanced mask feature and mask semantic score tensor of the merged node are respectively linear combinations of the enhanced mask features and mask semantic score tensors of its child nodes;
(6) For each semantic tree, sampling its leaf nodes at a set sampling rate, quickly judging whether each sampled instance segmentation mask is accurate, and labeling the judgment result;
(7) Comparing a statistic, such as the average, of the labeling results of all sampled instances on the semantic tree corresponding to category k with a set threshold: if the statistic is greater than the threshold, the prediction of category k is considered accurate, and inaccurate sampled instances can be mask-corrected using the accurate sampled instances; if the statistic is less than or equal to the threshold, the prediction of category k is considered inaccurate, the corresponding semantic tree is split into two subtrees, the instances of each subtree are re-sampled, the statistic of their labeling results is computed and compared with the set threshold, and this splitting-and-comparing process is repeated until a subtree can no longer be split or contains no accurate sampled instance;
(8) Fine-tuning the initial instance segmentation model according to the target-domain mask correction result, thereby improving the effectiveness of the instance segmentation model.
Further, the step (5) specifically comprises:
merging the child node corresponding to instance t and the child node corresponding to instance o to obtain a merged node n_j, wherein the enhanced mask feature f_j^+ and the mask semantic score tensor s_j of the merged node n_j are linear combinations of the enhanced mask features and mask semantic score tensors of the child nodes:
f_j^+ = w_t f_t^+ + w_o f_o^+
s_j = w_t s_t + w_o s_o
wherein the weights w_t and w_o are related to the sizes of the child nodes:
w_t = P_t / (P_t + P_o), w_o = P_o / (P_t + P_o)
where P_t and P_o are the numbers of instances contained in the corresponding child nodes, respectively; for leaf nodes, w_t = w_o = 1/2;
the nodes are merged through multiple agglomeration steps, and finally a semantic tree corresponding to each category is constructed; the root node of the semantic tree is denoted n_0, and the remaining intermediate nodes are denoted n_1, …, n_{J_k}, where J_k is the number of intermediate nodes of category k.
Further, in step (7), the statistic Q_k of the labeling results of all sampled instances on the semantic tree corresponding to category k is calculated as:
Q_k = (1/N) Σ_{n=1}^{N} l_{k_n}
where N is the number of instances sampled from the semantic tree of category k, the sampled instances are indexed k_1, …, k_N, and l_t is the judgment result of the instance segmentation mask of instance t from step (6).
Further, the backbone network of the initial instance segmentation model is a Swin Transformer; in the source domain, the training dataset includes an image set {X_source} and a corresponding instance mask image set {Y_source}; in the target domain, the test dataset includes only the image set {X_target}.
Further, the initial instance segmentation model performs data augmentation in the training stage, including horizontal/vertical flipping, translation and scaling; the initial instance segmentation model is trained with the AdamW optimizer, with an initial learning rate of 0.001 following a polynomial decay strategy, a weight decay of 0.0001, and a batch size of 4 in the experiments.
Further, in step (3), the mask feature f_t and the mask semantic score tensor s_t of target-domain instance t are defined as follows:
the mask feature f_t is the target-domain backbone network feature multiplied by the instance segmentation mask of instance t, and the mask semantic score tensor s_t is the target-domain semantic score tensor multiplied by the instance segmentation mask of instance t, with s_t ∈ R^{W×H×K}, where K is the number of instances contained in the image and W×H is the image size.
Further, in step (6), leaf nodes of the semantic tree are sampled, specifically:
for the semantic tree T_k constructed for category k, N instance segmentation masks, corresponding to instances k_1, …, k_N, are randomly selected based on the set sampling rate R; the annotator quickly judges whether each selected instance segmentation mask is accurate: if the predicted instance segmentation mask m_t is accurate, it is labeled 1; otherwise, it is labeled 0.
Further, the invention is implemented on the Pascal VOC 2012 dataset and the COCO dataset. The Pascal VOC 2012 dataset consists of 1464 training images and 1449 validation images with valid segmentation annotations, and has 20 categories; the index for evaluating the quality of the predicted segmentation masks is the mean intersection over union (mIoU). The COCO dataset has 80 classes, and the evaluation criteria are the bounding-box average precision AP_box and the mask average precision AP_mask.
According to a second aspect of the present specification, there is provided a weak supervised learning based domain adaptive instance segmentation apparatus, comprising a memory and one or more processors, the memory having executable code stored therein, which when executed by the processors is operable to implement the weak supervised learning based domain adaptive instance segmentation method as set out in the first aspect.
According to a third aspect of the present specification, there is provided a computer-readable storage medium having stored thereon a program, characterized in that the program, when executed by a processor, implements the domain-adaptive instance segmentation method based on weakly supervised learning as set forth in the first aspect.
Compared with the prior art, the invention has the following beneficial effects:
1. An initial instance segmentation model is trained on the source domain, instance segmentation is performed on the target domain with the initial instance segmentation model obtained by source-domain training, the mask features and mask semantic score tensors of each target-domain instance are output, and a semantic tree is constructed by hierarchical agglomerative clustering, hierarchically exploring the appearance and semantic similarity among the predictions.
2. Leaf nodes of the semantic tree are sampled, whether each instance segmentation mask is accurate is quickly judged, inaccurate samples are mask-corrected using the accurate samples, and the initial instance segmentation model is fine-tuned according to the mask correction results, thereby addressing the problems that, in domain adaptation, although the segmentation model can be improved by introducing supervision signals from the target-domain dataset, manual labeling is tedious and time-consuming and self-training introduces too much noise through pseudo labels.
3. Experimental results on the Pascal VOC 2012 dataset and the COCO dataset show that, compared with other advanced methods, the invention spends only limited human effort on label verification, achieves effectiveness approaching that of supervised learning methods, and is highly competitive.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram illustrating the domain adaptation problem provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of an overall framework of domain-adaptive instance segmentation based on weakly supervised learning provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of semantic tree construction provided by an embodiment of the present invention;
FIG. 4 is a schematic illustration of the segmentation output of an embodiment of the present invention on the Pascal VOC 2012 dataset;
FIG. 5 is a schematic illustration of the segmentation output of an embodiment of the present invention on the COCO dataset;
Fig. 6 is a block diagram of a domain adaptive instance segmentation apparatus based on weak supervised learning according to an embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the invention will be readily understood, a more particular description of the invention will be rendered by reference to the appended drawings.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention; however, the present invention may be practiced in ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Fig. 1 is a schematic diagram of the domain adaptation problem. The invention provides a domain-adaptive instance segmentation method based on weakly supervised learning, which addresses the problems in domain adaptation that, although the segmentation model can be improved by introducing supervision signals from the target-domain dataset, manual labeling is tedious and time-consuming and self-training introduces too much noise through pseudo labels.
The embodiment of the invention provides a domain self-adaptive instance segmentation method based on weak supervised learning, as shown in fig. 2, comprising the following steps:
1. Training the initial segmentation model on the source domain
An initial instance segmentation model is trained on the source domain; the training dataset comprises an image set {X_source} and a corresponding instance mask image set {Y_source}. The model outputs backbone network features and a semantic score tensor, the semantic score tensor comprising, for each pixel, the probabilities of the different instances to which it may belong.
During the training phase, data augmentation includes horizontal/vertical flipping, translation, and scaling.
The backbone network of the initial instance segmentation model is a Swin Transformer. The initial instance segmentation model is trained with the AdamW optimizer, with an initial learning rate of 0.001 following a polynomial decay strategy, a weight decay of 0.0001, and a batch size of 4 in the experiments.
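For illustration, the following PyTorch sketch sets up the optimizer and learning-rate schedule described above (AdamW, initial learning rate 0.001, polynomial decay, weight decay 0.0001). The placeholder module, the schedule length `max_iters` and the polynomial power are assumptions of this sketch; the actual Swin-Transformer-based segmentation model, its losses and the batch-size-4 data loading are not shown.

```python
import torch

# Placeholder module; the actual Swin-Transformer-based instance segmentation model,
# its losses and the augmented source-domain batches are assumed to be defined elsewhere.
model = torch.nn.Conv2d(3, 16, kernel_size=3)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)

max_iters, power = 10_000, 0.9  # assumed schedule length and polynomial power
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda it: (1.0 - it / max_iters) ** power)

for it in range(max_iters):
    # forward pass on a source batch, instance segmentation loss and loss.backward() would go here
    optimizer.step()      # parameter update (a no-op in this skeleton, since no gradients are produced)
    optimizer.zero_grad()
    scheduler.step()      # polynomial decay of the learning rate
```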
2. Application of initial segmentation model to target domain
Instance segmentation is performed on the target domain with the initial instance segmentation model obtained by source-domain training; the test dataset only comprises an image set {X_target}, and the backbone network features and semantic score tensor corresponding to each image are output.
3. Extracting target domain features
The maximum over the instance dimension of the semantic score tensor obtained in step 2 is taken to obtain the instance segmentation mask of each target-domain image; the instance segmentation mask of each target-domain image is then multiplied with the target-domain backbone network features and the target-domain semantic score tensor respectively, to obtain the mask features and the mask semantic score tensor of each target-domain instance.
The mask feature f_t and the mask semantic score tensor s_t of target-domain instance t are defined as follows: the mask feature f_t is the target-domain backbone network feature multiplied by the instance segmentation mask of instance t, and the mask semantic score tensor s_t is the target-domain semantic score tensor multiplied by the instance segmentation mask of instance t, with s_t ∈ R^{W×H×K}, where K is the number of instances contained in the image and W×H is the image size.
4. Concatenating features
The mask feature f_t of instance t obtained in step 3 and the mask semantic score tensor s_t are concatenated to obtain the enhanced mask feature f_t^+ of instance t.
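The following NumPy sketch illustrates one reading of steps 3 and 4 on a single target-domain image: per-pixel instance assignment is taken here as an arg-max over the instance dimension of the semantic score tensor, which is an assumption of this sketch, and the mask feature and mask semantic score tensor of each instance are concatenated into the enhanced mask feature. Shapes and array names are illustrative only.

```python
import numpy as np

W, H, K, C = 8, 8, 3, 16                      # toy sizes: image W x H, K instances, C feature channels
scores = np.random.rand(W, H, K)              # semantic score tensor of one target-domain image
features = np.random.rand(W, H, C)            # backbone network features of the same image

labels = scores.argmax(axis=-1)               # per-pixel instance assignment (assumed arg-max reading)

enhanced = []
for t in range(K):
    m_t = (labels == t).astype(scores.dtype)[..., None]   # instance segmentation mask of instance t
    f_t = features * m_t                                   # mask feature f_t            (step 3)
    s_t = scores * m_t                                     # mask semantic score tensor s_t (step 3)
    enhanced.append(np.concatenate([f_t, s_t], axis=-1))   # enhanced mask feature f_t^+ (step 4)
```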
5. Building semantic trees
As shown in Fig. 3, a semantic tree corresponding to each category is constructed by hierarchical agglomerative clustering: the enhanced mask feature of each instance belonging to the category is regarded as a leaf node, and at each agglomeration step the two child nodes whose enhanced mask features have the minimum Euclidean distance are selected and merged into a merged node, whose enhanced mask feature and mask semantic score tensor are respectively linear combinations of the enhanced mask features and mask semantic score tensors of its child nodes. Specifically:
merging the child node corresponding to instance t and the child node corresponding to instance o to obtain a merged node n_j, wherein the enhanced mask feature f_j^+ and the mask semantic score tensor s_j of the merged node n_j are linear combinations of the enhanced mask features and mask semantic score tensors of the child nodes:
f_j^+ = w_t f_t^+ + w_o f_o^+
s_j = w_t s_t + w_o s_o
wherein the weights w_t and w_o are related to the sizes of the child nodes:
w_t = P_t / (P_t + P_o), w_o = P_o / (P_t + P_o)
where P_t and P_o are the numbers of instances contained in the corresponding child nodes, respectively; for leaf nodes, w_t = w_o = 1/2.
The nodes are merged through multiple agglomeration steps, and finally a semantic tree corresponding to each category is constructed; the root node of the semantic tree is denoted n_0, and the remaining intermediate nodes are denoted n_1, …, n_{J_k}, where J_k is the number of intermediate nodes of category k.
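A minimal pure-NumPy sketch of this weighted agglomerative merging is given below. Each node stores a flattened enhanced mask feature, a mask semantic score vector and the number of instances it covers, and the merge uses the size-proportional weights w_t = P_t/(P_t + P_o), which reduce to 1/2 for two leaf nodes. The exhaustive pairwise distance search and the dictionary-based node layout are implementation choices of this sketch, not prescribed by the patent.

```python
import numpy as np

def build_semantic_tree(feats, scores):
    """feats/scores: equally long lists of 1-D arrays, one per instance of the category (>= 1 leaf)."""
    nodes = [{"f": f, "s": s, "size": 1, "children": None} for f, s in zip(feats, scores)]
    while len(nodes) > 1:
        # find the pair of current nodes whose enhanced mask features are closest in Euclidean distance
        best_d, pair = None, None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                d = np.linalg.norm(nodes[i]["f"] - nodes[j]["f"])
                if best_d is None or d < best_d:
                    best_d, pair = d, (i, j)
        a, b = nodes[pair[0]], nodes[pair[1]]
        w_a = a["size"] / (a["size"] + b["size"])          # size-proportional weight (1/2 for two leaves)
        w_b = 1.0 - w_a
        merged = {"f": w_a * a["f"] + w_b * b["f"],        # f_j^+ = w_t f_t^+ + w_o f_o^+
                  "s": w_a * a["s"] + w_b * b["s"],        # s_j   = w_t s_t   + w_o s_o
                  "size": a["size"] + b["size"],
                  "children": (a, b)}
        nodes = [n for k, n in enumerate(nodes) if k not in pair] + [merged]
    return nodes[0]                                        # root node n_0 of the semantic tree
```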
6. Sampling and quick mask-accuracy judgment
For the semantic tree T_k constructed for category k, N instance segmentation masks, corresponding to instances k_1, …, k_N, are randomly selected based on the set sampling rate R; the annotator quickly judges whether each selected instance segmentation mask is accurate: if the predicted instance segmentation mask m_t is accurate, it is labeled 1; otherwise, it is labeled 0.
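A small sketch of this sampling step is shown below; the `is_accurate` callback stands in for the annotator's quick 0/1 judgment and is an assumption of the sketch.

```python
import random

def sample_and_label(leaves, rate, is_accurate):
    """Randomly pick N = rate * len(leaves) leaf nodes and record a 0/1 accuracy annotation for each."""
    N = max(1, round(rate * len(leaves)))
    sampled = random.sample(leaves, N)
    return [(leaf, 1 if is_accurate(leaf) else 0) for leaf in sampled]
```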
7. Mask correction
The statistic of the labeling results of all sampled instances on the semantic tree corresponding to category k is compared with a set threshold: if the statistic is greater than the threshold, the prediction of category k is considered accurate, and inaccurate sampled instances can be mask-corrected using the accurate sampled instances; if the statistic is less than or equal to the threshold, the prediction of category k is considered inaccurate, the corresponding semantic tree is split into two subtrees, the instances of each subtree are re-sampled, the statistic of their labeling results is computed and compared with the set threshold, and this splitting-and-comparing process is repeated until a subtree can no longer be split or contains no accurate sampled instance.
The statistic Q_k of the labeling results of all sampled instances on the semantic tree corresponding to category k is calculated as:
Q_k = (1/N) Σ_{n=1}^{N} l_{k_n}
where N is the number of instances sampled from the semantic tree of category k, the sampled instances are indexed k_1, …, k_N, and l_t is the judgment result of the instance segmentation mask of instance t from step 6.
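The split-and-compare logic of step 7 can be sketched as follows, reusing the node layout of the tree-building sketch above ("children" is None for leaf nodes). Sampled leaves carry their 0/1 annotations in a dictionary keyed by id(); a subtree whose statistic Q exceeds the threshold is returned as accurately predicted, and splitting stops when a subtree cannot be split further or contains no accurate sampled instance. The bookkeeping details are assumptions of this sketch.

```python
def leaves_of(node):
    """Collect the leaf nodes of a subtree (node layout as in the tree-building sketch)."""
    if node["children"] is None:
        return [node]
    left, right = node["children"]
    return leaves_of(left) + leaves_of(right)

def accurate_subtrees(node, annotations, threshold):
    """annotations: dict mapping id(sampled leaf) -> 0/1 label. Returns subtrees judged accurate."""
    sampled = [annotations[id(l)] for l in leaves_of(node) if id(l) in annotations]
    if not sampled or sum(sampled) == 0:        # no accurate sampled instance left: stop splitting
        return []
    Q = sum(sampled) / len(sampled)             # statistic Q_k: mean of the 0/1 annotation results
    if Q > threshold:
        return [node]                           # accurate subtree: usable for mask correction
    if node["children"] is None:                # a single leaf cannot be split further
        return []
    left, right = node["children"]
    return (accurate_subtrees(left, annotations, threshold)
            + accurate_subtrees(right, annotations, threshold))
```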
8. Fine-tuning the initial instance segmentation model
The initial instance segmentation model is fine-tuned according to the target-domain mask correction results, thereby improving the effectiveness of the instance segmentation model.
The invention is implemented on the Pascal VOC 2012 dataset and the COCO dataset. The Pascal VOC 2012 dataset consists of 1464 training images and 1449 validation images with valid segmentation annotations, and has 20 categories; the index for evaluating the quality of the predicted segmentation masks is the mean intersection over union (mIoU). The COCO dataset has 80 classes, and the evaluation criteria are the bounding-box average precision AP_box and the mask average precision AP_mask.
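For reference, the following is a minimal sketch of the mIoU metric mentioned above, following the standard per-class intersection-over-union definition on integer label maps and ignoring classes absent from both prediction and ground truth; it is a generic definition, not code taken from the patent.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union between two integer label maps of the same shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                      # ignore classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```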
The applicant's experimental results on the Pascal VOC 2012 dataset and the COCO dataset are shown in Fig. 4 and Fig. 5. They show that, compared with other advanced methods, the invention spends only limited human effort on label verification, achieves effectiveness approaching that of supervised learning methods, and is highly competitive.
Corresponding to the embodiment of the domain self-adaptive instance segmentation method based on the weak supervision learning, the invention also provides an embodiment of the domain self-adaptive instance segmentation device based on the weak supervision learning.
Referring to fig. 6, the domain adaptive instance segmentation device based on weak supervised learning provided by the embodiment of the invention includes a memory and one or more processors, where the memory stores executable codes, and the processors are configured to implement the domain adaptive instance segmentation method based on weak supervised learning in the above embodiment when executing the executable codes.
The embodiment of the domain-adaptive instance segmentation apparatus based on weakly supervised learning can be applied to any device with data processing capability, such as a computer. The apparatus embodiments may be implemented by software, or by hardware or a combination of hardware and software. Taking software implementation as an example, the apparatus in a logical sense is formed by the processor of the device with data processing capability reading the corresponding computer program instructions from a non-volatile memory into memory. In terms of hardware, Fig. 6 shows a hardware structure diagram of the device with data processing capability on which the domain-adaptive instance segmentation apparatus based on weakly supervised learning is located; in addition to the processor, memory, network interface and non-volatile memory shown in Fig. 6, the device may further include other hardware according to its actual functions, which is not described here.
The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
For the device embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present invention. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
The embodiment of the invention also provides a computer readable storage medium, on which a program is stored, which when executed by a processor, implements the domain adaptive instance segmentation method based on weak supervised learning in the above embodiment.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any of the data processing enabled devices described in any of the previous embodiments. The computer readable storage medium may also be an external storage device of any device having data processing capabilities, such as a plug-in hard disk, a smart media card (SMC), an SD card, or a flash memory card provided on the device. Further, the computer readable storage medium may include both internal storage units and external storage devices of any data processing device. The computer readable storage medium is used for storing the computer program and other programs and data required by the data processing apparatus, and may also be used for temporarily storing data that has been output or is to be output.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in one or more embodiments of the present description to describe various information, this information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
The foregoing description of the preferred embodiment(s) is (are) merely intended to illustrate the embodiment(s) of the present invention, and it is not intended to limit the embodiment(s) of the present invention to the particular embodiment(s) described.
Claims (10)
1. A domain self-adaptive instance segmentation method based on weak supervised learning, characterized by comprising the following steps:
(1) Training an initial instance segmentation model on a source domain, and outputting backbone network characteristics and a semantic score tensor, wherein the semantic score tensor comprises probabilities of different instances to which each pixel belongs;
(2) Performing instance segmentation on a target domain by using an initial instance segmentation model obtained by source domain training, and outputting backbone network characteristics and semantic score tensors corresponding to each image;
(3) Taking the maximum value on the instance dimension of the semantic score tensor obtained in the step (2) to obtain an instance segmentation mask of each image of the target domain; multiplying the instance segmentation mask of each image of the target domain with the backbone network characteristics of the target domain and the semantic score tensor of the target domain respectively to obtain mask features and mask semantic score tensors of each instance of the target domain;
(4) Concatenating the mask features of the instances obtained in the step (3) with the mask semantic score tensors to obtain enhanced mask features of the instances;
(5) Constructing a semantic tree corresponding to each category by hierarchical agglomerative clustering, regarding the enhanced mask feature of each instance belonging to the category as a leaf node, and at each step merging the two child nodes whose enhanced mask features have the minimum Euclidean distance to obtain a merged node, wherein the enhanced mask feature and the mask semantic score tensor of the merged node are respectively linear combinations of the enhanced mask features and the mask semantic score tensors of the child nodes;
(6) For each semantic tree, sampling leaf nodes of the semantic tree based on a set sampling rate, rapidly judging whether an instance segmentation mask is accurate or not, and labeling a judgment result;
(7) Comparing a statistic of the labeling results of all sampled instances on the semantic tree corresponding to category k with a set threshold: if the statistic is greater than the threshold, the prediction of category k is considered accurate, and inaccurate sampled instances can be mask-corrected using the accurate sampled instances; if the statistic is less than or equal to the threshold, the prediction of category k is considered inaccurate, the corresponding semantic tree is split into two subtrees, the instances of each subtree are re-sampled, the statistic of their labeling results is computed and compared with the set threshold, and the splitting-and-comparing process is repeated until a subtree can no longer be split or contains no accurate sampled instance;
(8) Fine-tuning the initial instance segmentation model according to the target-domain mask correction result.
2. The domain adaptive instance segmentation method based on weak supervised learning as set forth in claim 1, wherein the step (5) specifically includes:
merging the child node corresponding to instance t and the child node corresponding to instance o to obtain a merged node n_j, wherein the enhanced mask feature f_j^+ and the mask semantic score tensor s_j of the merged node n_j are linear combinations of the enhanced mask features and mask semantic score tensors of the child nodes:
f_j^+ = w_t f_t^+ + w_o f_o^+
s_j = w_t s_t + w_o s_o
wherein the weights w_t and w_o are related to the sizes of the child nodes:
w_t = P_t / (P_t + P_o), w_o = P_o / (P_t + P_o)
where P_t and P_o are the numbers of instances contained in the corresponding child nodes, respectively; for leaf nodes, w_t = w_o = 1/2;
and finally constructing a semantic tree corresponding to each category by agglomeratively merging the nodes multiple times.
3. The method for domain adaptive instance segmentation based on weak supervised learning as set forth in claim 1, wherein in the step (7), the statistic Q_k of the labeling results of all sampled instances on the semantic tree corresponding to category k is calculated as:
Q_k = (1/N) Σ_{n=1}^{N} l_{k_n}
where N is the number of instances sampled from the semantic tree of category k, the sampled instances are indexed k_1, …, k_N, and l_t is the judgment result of the instance segmentation mask of instance t in the step (6).
4. The domain adaptive instance segmentation method based on weak supervised learning as set forth in claim 1, wherein the backbone network of the initial instance segmentation model is a Swin Transformer; in the source domain, the training dataset includes an image set {X_source} and a corresponding instance mask image set {Y_source}; in the target domain, the test dataset includes only the image set {X_target}.
5. The domain adaptive instance segmentation method based on weak supervised learning as set forth in claim 1, wherein the initial instance segmentation model performs data augmentation during the training phase, including horizontal/vertical flipping, translation, and scaling; the initial instance segmentation model is trained with the AdamW optimizer, with an initial learning rate of 0.001 following a polynomial decay strategy and a weight decay of 0.0001.
6. The method for domain adaptive instance segmentation based on weakly supervised learning as set forth in claim 1, wherein in step (3), the mask feature f_t and the mask semantic score tensor s_t of target-domain instance t are defined as follows:
the mask feature f_t is the target-domain backbone network feature multiplied by the instance segmentation mask of instance t, and the mask semantic score tensor s_t is the target-domain semantic score tensor multiplied by the instance segmentation mask of instance t, with s_t ∈ R^{W×H×K}, where K is the number of instances contained in the image and W×H is the image size.
7. The method for domain adaptive instance segmentation based on weakly supervised learning as set forth in claim 1, wherein in step (6), leaf nodes of the semantic tree are sampled, specifically:
for the semantic tree T_k constructed for category k, N instance segmentation masks, corresponding to instances k_1, …, k_N, are randomly selected based on the set sampling rate R; the annotator quickly judges whether each selected instance segmentation mask is accurate: if the predicted instance segmentation mask m_t is accurate, it is labeled 1; otherwise, it is labeled 0.
8. The method for domain-adaptive instance segmentation based on weakly supervised learning as recited in claim 1, wherein the Pascal VOC 2012 dataset and the COCO dataset are used as training sets for the instance segmentation model.
9. A weakly supervised learning based domain adaptive instance segmentation apparatus comprising a memory and one or more processors, the memory having executable code stored therein, wherein the processor, when executing the executable code, is configured to implement the weakly supervised learning based domain adaptive instance segmentation method as set forth in any of claims 1-8.
10. A computer-readable storage medium, on which a program is stored, which program, when being executed by a processor, implements a domain-adaptive instance segmentation method based on weakly supervised learning as set forth in any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210236149.4A CN114612663B (en) | 2022-03-11 | 2022-03-11 | Domain self-adaptive instance segmentation method and device based on weak supervision learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210236149.4A CN114612663B (en) | 2022-03-11 | 2022-03-11 | Domain self-adaptive instance segmentation method and device based on weak supervision learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114612663A CN114612663A (en) | 2022-06-10 |
CN114612663B true CN114612663B (en) | 2024-09-13 |
Family
ID=81863866
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210236149.4A Active CN114612663B (en) | 2022-03-11 | 2022-03-11 | Domain self-adaptive instance segmentation method and device based on weak supervision learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114612663B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115578564B (en) * | 2022-10-25 | 2023-05-23 | 北京医准智能科技有限公司 | Training method and device for instance segmentation model, electronic equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112116599A (en) * | 2020-08-12 | 2020-12-22 | 南京理工大学 | Sputum smear tubercle bacillus semantic segmentation method and system based on weak supervised learning |
AU2020103905A4 (en) * | 2020-12-04 | 2021-02-11 | Chongqing Normal University | Unsupervised cross-domain self-adaptive medical image segmentation method based on deep adversarial learning |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109784424B (en) * | 2019-03-26 | 2021-02-09 | 腾讯科技(深圳)有限公司 | Image classification model training method, image processing method and device |
US20210150281A1 (en) * | 2019-11-14 | 2021-05-20 | Nec Laboratories America, Inc. | Domain adaptation for semantic segmentation via exploiting weak labels |
CN112699892A (en) * | 2021-01-08 | 2021-04-23 | 北京工业大学 | Unsupervised field self-adaptive semantic segmentation method |
Also Published As
Publication number | Publication date |
---|---|
CN114612663A (en) | 2022-06-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |