CN117197451A - Remote sensing image semantic segmentation method and device based on domain self-adaption
- Publication number
- CN117197451A (application number CN202310904440.9A)
- Authority
- CN
- China
- Prior art keywords
- domain
- auxiliary
- training
- remote sensing
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a domain-adaptive remote sensing image semantic segmentation method and device. The method comprises: obtaining a remote sensing dataset comprising labeled source domain data and unlabeled target domain data; constructing a mean teacher framework with an auxiliary prototype classifier, the framework comprising a teacher model and a student model; training the student model and optimizing its parameters using the remote sensing dataset, wherein the teacher model updates its parameters by an exponential moving average and the auxiliary prototype classifier updates its weights by an exponential moving average; and inputting the unlabeled target domain data into the trained student model for pixel-wise prediction to obtain the corresponding segmentation result. By constructing a mean teacher framework with an auxiliary prototype classifier, the method addresses the domain gap in cross-domain semantic segmentation of remote sensing images and achieves class-level alignment between the source domain and the target domain, thereby improving segmentation performance.
Description
Technical Field
The invention relates to the technical field of image segmentation, and in particular to a domain-adaptive remote sensing image semantic segmentation method, a computer-readable storage medium, a computer device, and a domain-adaptive remote sensing image semantic segmentation device.
Background
Remote sensing images are widely used in applications such as land cover mapping, urban planning and environmental monitoring. Semantic segmentation is a basic task in remote sensing image analysis: each pixel in an image is assigned a class label representing the object or region to which it belongs. At present, fully supervised deep convolutional neural network methods have achieved remarkable results in remote sensing image semantic segmentation. However, when a model trained on a large-scale labeled dataset (the source domain) is used to segment images collected from different scenes (the target domain), its performance typically degrades because of the domain gap. Factors that cause the domain gap include imaging conditions, geographic location and sensor specifications. Unsupervised domain adaptation (UDA) techniques can alleviate these differences: they aim to learn a domain-invariant feature representation by aligning the feature distributions of the source and target domains, so that regions of interest can be extracted from target domain images without semantic annotation of the target domain data.
Techniques for UDA semantic segmentation mainly include adversarial learning, self-training and multi-stage methods. UDA methods based on adversarial learning can effectively align the global marginal distributions, but they are prone to negative transfer (i.e. classes that are already aligned become incorrectly aligned), especially under class imbalance. To solve this problem, CLAN adopts a co-training approach to implicitly determine the degree of alignment of each class; however, similarity between the features of different classes can still lead to misalignment. Self-training-based UDA methods explicitly perform class-level feature alignment using pseudo-labels of the target domain and class prototypes of the source domain: the pseudo-label of a target sample is estimated from the distance between its features and the source domain class prototypes. However, the source domain class prototypes may not effectively reflect the feature centers of each semantic class in the target domain, which can reduce the reliability of the pseudo-labels of target samples and degrade the performance of the classifier. This problem is particularly evident in unsupervised domain adaptation for remote sensing image semantic segmentation when the class distributions of the source and target domains are inconsistent; for example, buildings and roads account for a higher proportion of a remote sensing image of an urban scene than of a rural scene. Multi-stage UDA methods generally combine adversarial learning, self-training and knowledge distillation. In the first stage, the model is trained with an adversarial UDA method and its weights are used as the initialization parameters for the next stage. In the second stage, the pre-trained model is used to compute pseudo-labels for the target domain and class-specific prototypes, and during training the prototypes are used to correct erroneous pseudo-labels online. In the final stage, knowledge distillation transfers the knowledge learned by the domain-adaptive segmentation model trained in the previous stage into a self-supervised pre-trained model. Although multi-stage training methods may produce better results than end-to-end methods, they rely heavily on the performance of the adversarial initialization method and require complex training strategies.
Disclosure of Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the above-described technology. To this end, a first object of the present invention is to provide a domain-adaptive remote sensing image semantic segmentation method which, by constructing a mean teacher framework with an auxiliary prototype classifier, addresses the domain gap in cross-domain semantic segmentation of remote sensing images, effectively extracts regions of interest from the target domain data, and achieves class-level alignment between the source domain and the target domain, thereby improving segmentation performance.
A second object of the present invention is to propose a computer readable storage medium.
A third object of the invention is to propose a computer device.
A fourth object of the invention is to provide a domain-adaptive remote sensing image semantic segmentation device.
In order to achieve the above objects, an embodiment of the first aspect of the present invention provides a domain-adaptive remote sensing image semantic segmentation method, which includes the following steps: acquiring a remote sensing dataset, wherein the remote sensing dataset comprises labeled source domain data and unlabeled target domain data; constructing a mean teacher framework with an auxiliary prototype classifier, wherein the framework comprises a teacher model and a student model; training the student model and optimizing its parameters using the remote sensing dataset, wherein the teacher model updates its parameters by an exponential moving average and the auxiliary prototype classifier updates its weights by an exponential moving average; and inputting the unlabeled target domain data into the trained student model for pixel-wise prediction to obtain the segmentation result corresponding to the unlabeled target domain data.
According to the domain-adaptive remote sensing image semantic segmentation method of the embodiment of the invention, a remote sensing dataset comprising labeled source domain data and unlabeled target domain data is first acquired; a mean teacher framework with an auxiliary prototype classifier, comprising a teacher model and a student model, is then constructed; next, the student model is trained and its parameters are optimized using the remote sensing dataset, wherein the teacher model updates its parameters by an exponential moving average and the auxiliary prototype classifier updates its weights by an exponential moving average; finally, the unlabeled target domain data is input into the trained student model for pixel-wise prediction to obtain the corresponding segmentation result. In this way, the mean teacher framework with an auxiliary prototype classifier addresses the domain gap in cross-domain semantic segmentation of remote sensing images, effectively extracts regions of interest from the target domain data, and achieves class-level alignment between the source domain and the target domain, thereby improving segmentation performance.
In addition, the domain-adaptive remote sensing image semantic segmentation method provided by the embodiment of the invention may also have the following additional technical features:
optionally, constructing the mean teacher framework with the auxiliary prototype classifier includes: the student model comprises a feature encoder and a parameterized classifier, and uses DeepLabV2 as its network structure with ResNet-101 as its backbone; the teacher model has the same network structure and backbone as the student model; a memory bank is constructed, in the form of a queue, for each class of the source domain data and of the target domain data, so that the features output by the corresponding feature encoder are stored, after embedding filtering, as feature vectors of the different classes in the memory bank of the corresponding class; and the memory banks of the corresponding classes of the source domain and the target domain are concatenated and clustered with the KMeans clustering algorithm to obtain the prototype of each class, which serve as the auxiliary prototype classifier.
Optionally, training the student model and optimizing its parameters using the remote sensing dataset includes: in the first epoch of training, the student model is trained and its parameters are optimized using the labeled source domain data, and the feature vectors of the different classes of the source domain data are stored in the source domain memory bank; after the first epoch, the teacher model initializes its parameters with the parameters of the student model, and the source domain memory bank is clustered with the KMeans algorithm to obtain an initial prototype of each class, which serves as the auxiliary prototype classifier; in the second epoch, the student model with the auxiliary prototype classifier is trained using the labeled source domain data; the teacher model predicts on the target domain data to obtain pseudo-labels of the target domain; the student model with the auxiliary prototype classifier is trained using the pseudo-labels of the target domain to update its parameters; meanwhile, during training, the feature vectors of the different classes of the source domain data are stored in the source domain memory bank and those of the target domain data are stored in the target domain memory bank; after the second epoch, the teacher model updates its parameters by an exponential moving average of the parameters of the student model; in addition, the source domain and target domain memory banks of the same class are concatenated, and the concatenated result is clustered with the KMeans algorithm to obtain a prototype of each class for updating the auxiliary prototype classifier; in each subsequent epoch, the student model and the teacher model are trained and optimized in the same way as in the second epoch, except that the auxiliary prototype classifier updates the prototype of each class by an exponential moving average of the clustering results of the source domain and target domain memory banks, until training is complete.
To achieve the above objective, a second aspect of the present invention provides a computer-readable storage medium having stored thereon a domain-adaptive-based remote sensing image semantic segmentation program, which when executed by a processor, implements the domain-adaptive-based remote sensing image semantic segmentation method as described above.
To achieve the above objective, an embodiment of a third aspect of the present invention provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the domain-adaptive remote sensing image semantic segmentation method as described above when executing the program.
In order to achieve the above object, an embodiment of a fourth aspect of the present invention provides a domain-adaptive remote sensing image semantic segmentation device, including: an acquisition module for acquiring a remote sensing dataset, wherein the remote sensing dataset comprises labeled source domain data and unlabeled target domain data; a model construction module for constructing a mean teacher framework with an auxiliary prototype classifier, wherein the framework comprises a teacher model and a student model; a training module for training the student model and optimizing its parameters using the remote sensing dataset, wherein the teacher model updates its parameters by an exponential moving average and the auxiliary prototype classifier updates its weights by an exponential moving average; and a semantic segmentation module for inputting the unlabeled target domain data into the trained student model for pixel-wise prediction to obtain the segmentation result corresponding to the unlabeled target domain data.
According to the domain-adaptive remote sensing image semantic segmentation device of the embodiment of the invention, constructing the mean teacher framework with the auxiliary prototype classifier addresses the domain gap in cross-domain semantic segmentation of remote sensing images, effectively extracts regions of interest from the target domain data, and achieves class-level alignment between the source domain and the target domain, thereby improving segmentation performance.
In addition, the domain-adaptive remote sensing image semantic segmentation device provided by the embodiment of the invention may also have the following additional technical features:
optionally, constructing the mean teacher framework with the auxiliary prototype classifier includes: the student model comprises a feature encoder and a parameterized classifier, and uses DeepLabV2 as its network structure with ResNet-101 as its backbone; the teacher model has the same network structure and backbone as the student model; a memory bank is constructed, in the form of a queue, for each class of the source domain data and of the target domain data, so that the features output by the corresponding feature encoder are stored, after embedding filtering, as feature vectors of the different classes in the memory bank of the corresponding class; and the memory banks of the corresponding classes of the source domain and the target domain are concatenated and clustered with the KMeans clustering algorithm to obtain the prototype of each class, which serve as the auxiliary prototype classifier.
Optionally, training the student model and optimizing its parameters using the remote sensing dataset includes: in the first epoch of training, the student model is trained and its parameters are optimized using the labeled source domain data, and the feature vectors of the different classes of the source domain data are stored in the source domain memory bank; after the first epoch, the teacher model initializes its parameters with the parameters of the student model, and the source domain memory bank is clustered with the KMeans algorithm to obtain an initial prototype of each class, which serves as the auxiliary prototype classifier; in the second epoch, the student model with the auxiliary prototype classifier is trained using the labeled source domain data; the teacher model predicts on the target domain data to obtain pseudo-labels of the target domain; the student model with the auxiliary prototype classifier is trained using the pseudo-labels of the target domain to update its parameters; meanwhile, during training, the feature vectors of the different classes of the source domain data are stored in the source domain memory bank and those of the target domain data are stored in the target domain memory bank; after the second epoch, the teacher model updates its parameters by an exponential moving average of the parameters of the student model; in addition, the source domain and target domain memory banks of the same class are concatenated, and the concatenated result is clustered with the KMeans algorithm to obtain a prototype of each class for updating the auxiliary prototype classifier; in each subsequent epoch, the student model and the teacher model are trained and optimized in the same way as in the second epoch, except that the auxiliary prototype classifier updates the prototype of each class by an exponential moving average of the clustering results of the source domain and target domain memory banks, until training is complete.
Drawings
FIG. 1 is a flow chart of a domain-adaptive-based remote sensing image semantic segmentation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a model training framework for domain-adaptive-based semantic segmentation of remote sensing images according to one embodiment of the present invention;
FIG. 3 is a block schematic diagram of a domain-adaptive-based remote sensing image semantic segmentation apparatus according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
In order that the above-described aspects may be better understood, exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
In order to better understand the above technical solutions, the following detailed description will refer to the accompanying drawings and specific embodiments.
FIG. 1 is a flow chart of a domain-adaptive remote sensing image semantic segmentation method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
S101, acquiring a remote sensing dataset, wherein the remote sensing dataset comprises labeled source domain data and unlabeled target domain data.
The source domain data is the red, green and blue (RGB) band data of Potsdam, and the target domain data is the near infrared, green and blue (IRGB) band data of Vaihingen.
That is, the remote sensing dataset may be obtained by downloading over the network.
S102, constructing a mean teacher framework with an auxiliary prototype classifier, wherein the framework comprises a teacher model and a student model.
As one embodiment, constructing the mean teacher framework with an auxiliary prototype classifier (Mean Teacher Framework with an Auxiliary prototype classifier, MTA) includes:
the student model comprises a feature encoder and a parameterized classifier, and uses DeepLabV2 as its network structure with ResNet-101 as its backbone; the teacher model has the same network structure and backbone as the student model;
constructing, in the form of queues, a memory bank for each class of the source domain data and of the target domain data, so that the features output by the corresponding feature encoder are stored, after embedding filtering, as feature vectors of the different classes in the memory bank of the corresponding class;
it should be noted that the feature vectors extracted by the student model are filtered according to the following Embedding Filter mechanism before the memory banks are updated and old entries are deleted:
[formula missing in the source text]
wherein <,> denotes the inner product; the remaining symbol definitions, which distinguish source domain samples from target domain samples, are missing in the source text;
and concatenating the memory banks of the corresponding classes of the source domain and the target domain, and clustering them with the KMeans clustering algorithm to obtain the prototype of each class, which serve as the auxiliary prototype classifier.
That is, as shown in FIG. 2, a target image (Target Images) is input into the teacher model (Teacher Model) after weak augmentation (Weak Augmentation), which includes horizontal flip, vertical flip, image sharpening and color jitter, and is input into the student model (Student Model) after strong augmentation (Strong Augmentation), which includes random rotation, shear mapping and translation. The teacher model and the student model each comprise a feature encoder (Feature Encoder) and a parameterized classifier (Parametric Classifier), and adopt DeepLabV2 as the network structure with ResNet-101 as the backbone. The weakly augmented target image is input into the teacher model to obtain pseudo-labels (Pseudo Labels). After the strongly augmented target image is input into the student model, the predictions of the parameterized classifier and of the auxiliary prototype classifier are each compared with the pseudo-labels from the teacher model to compute two cross-entropy losses, $L_t$ and its auxiliary-classifier counterpart, which are used to optimize the parameters of the student model and enhance its ability to classify target domain data. Meanwhile, after the target features output by the feature encoder pass through embedding filtering (Embedding Filter), the feature vectors of the different classes are stored in the target memory bank (Target Memory Bank) of the corresponding class.
The weakly augmented source images (Source Images) are input into the student model, and the predictions of the parameterized classifier and of the auxiliary prototype classifier are each compared with the labels of the source images to compute two cross-entropy losses, $L_s$ and its auxiliary-classifier counterpart. Meanwhile, after the source features output by the feature encoder pass through embedding filtering, the feature vectors of the different classes are stored in the source memory bank (Source Memory Bank) of the corresponding class.
The feature vectors of the corresponding classes in the target memory bank and the source memory bank are concatenated and input into the KMeans clustering algorithm, which outputs a prototype of each class; the prototypes are updated by an exponential moving average (Exponential Moving Average, EMA) and used as the auxiliary prototype classifier.
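The clustering step described above can be illustrated with a minimal pure-Python sketch. The function names (`kmeans`, `class_prototypes`, `predict`), the choice of the first k points as initial centers, the default of one prototype per class, and the use of Euclidean nearest-prototype assignment are all assumptions made for illustration; the text does not state how the auxiliary prototype classifier scores a feature or how many clusters are used per class.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm. Initial centers are the first k points (an assumption)."""
    k = min(k, len(points))
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the previous center when a cluster is empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def class_prototypes(source_bank, target_bank, k=1):
    """Concatenate the per-class banks of both domains, then cluster each class."""
    return {c: kmeans(source_bank[c] + target_bank.get(c, []), k)
            for c in source_bank}

def predict(feature, prototypes):
    """Hypothetical scoring rule: pick the class owning the nearest prototype."""
    return min(prototypes,
               key=lambda c: min(math.dist(feature, e) for e in prototypes[c]))
```

With `k=1` each prototype reduces to the mean of the concatenated source and target features of that class, which is the simplest case of the clustering described in the text.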
The cross-entropy loss is calculated as follows:
$$L_{ce} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} y_{i,k}\,\log p_{i,k}$$
where $p$ denotes the prediction of the parameterized classifier or of the auxiliary prototype classifier on source domain or target domain data, $y$ denotes the ground-truth label of the source domain or the pseudo-label of the target domain, $N$ denotes the number of pixels, and $K$ denotes the number of classes.
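As a worked illustration of this loss, the sketch below averages the negative log-probability of the labeled class over N pixels. It assumes one-hot labels given as class indices, so the inner sum over K collapses to a single term; the function name and the `eps` clamp that guards against log(0) are illustrative additions, not part of the original text.

```python
import math

def cross_entropy(probs, labels, eps=1e-12):
    """probs: N rows of K class probabilities; labels: N class indices."""
    n = len(probs)
    # With one-hot y, sum_k y_k * log(p_k) is just log of the labeled class.
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / n
```

The same function applies to all four losses in the framework: only the source of `probs` (parameterized or prototype classifier) and of `labels` (ground truth or teacher pseudo-labels) changes.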
The optimization objective function of the student model is defined as:
[formula and the definitions of its symbols are missing in the source text]
In addition, the memory bank of each class in the source domain and in the target domain has a size of 16384x256; the parameterized classifier consists of a single convolution layer with a kernel size of 1, a padding of 0 and a stride of 1.
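The queue-based memory bank can be sketched with a bounded double-ended queue, where the 16384 capacity follows the per-class bank size given above (each stored entry would then be a 256-dimensional feature vector). The class name `MemoryBank` and its methods are hypothetical, and the embedding filter is deliberately omitted because its formula is not available in the text.

```python
from collections import deque

class MemoryBank:
    """One FIFO queue per class; the oldest features are dropped when a queue is full."""

    def __init__(self, num_classes, capacity=16384):
        self.queues = [deque(maxlen=capacity) for _ in range(num_classes)]

    def push(self, features, labels):
        """Store each feature vector in the queue of its class."""
        for feature, label in zip(features, labels):
            self.queues[label].append(feature)

    def get(self, class_id):
        """Return the stored features of one class, oldest first."""
        return list(self.queues[class_id])
```

One such bank would be kept for the source domain and one for the target domain; concatenating `get(c)` from both gives the input to the per-class clustering.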
The teacher model parameters are updated by the following exponential moving average:
$$\theta_T^{\iota} = \alpha\,\theta_T^{\iota-1} + (1-\alpha)\,\theta_S^{\iota}$$
where $\theta_T^{\iota}$ denotes the parameters of the teacher model after $\iota$ training epochs, $\theta_S^{\iota}$ denotes the parameters of the student model after $\iota$ training epochs, $\iota \geq 2$, and the smoothing coefficient is $\alpha = 0.99$. When $\iota = 1$, the teacher model is initialized with the parameters of the student model. The auxiliary prototype classifier is the set of class prototypes $\{e_1, \dots, e_K\}$, where $e_K$ denotes the prototype of class $K$.
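The teacher update can be sketched as follows, assuming for illustration that the parameters are held in plain dictionaries mapping a name to a scalar; in a real network each value would be a tensor and the same arithmetic would apply elementwise. The function names are hypothetical.

```python
def init_teacher(student):
    """After the first epoch the teacher starts as a copy of the student."""
    return dict(student)

def ema_update(teacher, student, alpha=0.99):
    """Blend the teacher towards the student with smoothing coefficient alpha."""
    return {name: alpha * teacher[name] + (1 - alpha) * student[name]
            for name in teacher}
```

With alpha = 0.99 the teacher moves only 1% of the way towards the student at each update, which is what makes its pseudo-labels more stable than the student's own predictions.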
The exponential moving average update of the class-$c$ prototype $e_c$ is defined as:
$$e_c^{I} = \alpha\,e_c^{I-1} + (1-\alpha)\,\hat{e}_c^{I}$$
where $e_c^{I}$ denotes the prototype of class $c$ after $I$ training epochs and $\hat{e}_c^{I}$ denotes the class prototype obtained by KMeans clustering of the class memory banks of the source domain and the target domain after $I$ training epochs, with $I \geq 3$. When $I = 2$, $e_c$ is obtained by KMeans clustering of the class memory banks of the source domain and the target domain; when $I = 1$, $e_c$ is obtained by KMeans clustering of the class memory bank of the source domain.
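The three cases of the prototype update can be put side by side in a short sketch. The function name is hypothetical, and reusing the teacher's smoothing value 0.99 as the default is an assumption, since the text does not state a separate coefficient for the prototypes.

```python
def update_prototype(epoch, previous, clustered, alpha=0.99):
    """Return the class prototype to use after the given epoch.

    clustered is the KMeans result for this epoch: computed on the source
    bank alone in epoch 1, and on the concatenated source and target banks
    from epoch 2 onwards.
    """
    if epoch <= 2:
        # Epochs 1 and 2 take the clustering result directly.
        return list(clustered)
    # From epoch 3 on, smooth the new clustering result into the old prototype.
    return [alpha * p + (1 - alpha) * c for p, c in zip(previous, clustered)]
```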
S103, training the student model and optimizing its parameters using the remote sensing dataset, wherein the teacher model updates its parameters by an exponential moving average and the auxiliary prototype classifier updates its weights by an exponential moving average.
As one embodiment, training the student model and optimizing its parameters using the remote sensing dataset includes: in the first epoch of training, the student model is trained and its parameters are optimized using the labeled source domain data, and the feature vectors of the different classes of the source domain data are stored in the source domain memory bank; after the first epoch, the teacher model initializes its parameters with the parameters of the student model, and the source domain memory bank is clustered with the KMeans algorithm to obtain an initial prototype of each class, which serves as the auxiliary prototype classifier; in the second epoch, the student model with the auxiliary prototype classifier is trained using the labeled source domain data; the teacher model predicts on the target domain data to obtain pseudo-labels of the target domain; the student model with the auxiliary prototype classifier is trained using the pseudo-labels of the target domain to update its parameters; meanwhile, during training, the feature vectors of the different classes of the source domain data are stored in the source domain memory bank and those of the target domain data are stored in the target domain memory bank; after the second epoch, the teacher model updates its parameters by an exponential moving average of the parameters of the student model; in addition, the source domain and target domain memory banks of the same class are concatenated, and the concatenated result is clustered with the KMeans algorithm to obtain a prototype of each class for updating the auxiliary prototype classifier; in each subsequent epoch, the student model and the teacher model are trained and optimized in the same way as in the second epoch, except that the auxiliary prototype classifier updates the prototype of each class by an exponential moving average of the clustering results of the source domain and target domain memory banks, until training is complete.
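The epoch-dependent schedule described above can be summarized in a small lookup function. This is only a sketch of the control flow: the function name and the descriptive strings it returns are hypothetical labels, not identifiers from the source.

```python
def epoch_plan(epoch):
    """Return (student supervision, teacher update, prototype update) for an epoch."""
    if epoch < 1:
        raise ValueError("epochs are numbered from 1")
    if epoch == 1:
        # Warm-up: source labels only; teacher and prototypes are initialized afterwards.
        return (["source labels"],
                "copy student parameters",
                "KMeans on source bank")
    supervision = ["source labels", "target pseudo-labels"]
    if epoch == 2:
        return (supervision,
                "EMA of student parameters",
                "KMeans on source+target banks")
    # Epoch 3 onwards: identical training, but prototypes are smoothed by EMA.
    return (supervision,
            "EMA of student parameters",
            "EMA of KMeans on source+target banks")
```

Read this way, the only difference between the second epoch and all later ones is the third element: the prototypes switch from being replaced by the clustering result to being smoothed towards it.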
It should be noted that stochastic gradient descent (SGD) is adopted as the optimizer during training, with the weight decay, momentum and initial learning rate set to 5e-4, 0.9 and 2.5e-4, respectively. The learning rate is decayed using a polynomial decay strategy: the current learning rate equals the initial learning rate multiplied by $\left(1 - \frac{\text{iter}}{\text{max\_iter}}\right)^{\text{power}}$, where power = 0.9.
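The polynomial decay schedule can be sketched in a few lines. The base learning rate 2.5e-4 and power 0.9 come from the text; the function name and the `step` and `max_steps` parameters are illustrative, and the common form in which the decay factor is one minus the fraction of training completed is assumed.

```python
def poly_lr(step, max_steps, base_lr=2.5e-4, power=0.9):
    """Learning rate at a given step under polynomial decay."""
    return base_lr * (1 - step / max_steps) ** power
```

The rate starts at the initial value at step 0 and falls monotonically to zero at the final step.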
S104, inputting the unlabeled target domain data into a trained student model for point-by-point prediction to obtain a segmentation result corresponding to the unlabeled target domain data.
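Point-by-point prediction in step S104 amounts to choosing, for every pixel, the class with the highest score. A minimal sketch, assuming the trained student model has already produced a score map of shape [classes][height][width] held in nested lists; the function name `predict_labels` is hypothetical.

```python
def predict_labels(scores):
    """Return a [height][width] map holding the best-scoring class per pixel."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy score map: 2 classes, 1 row, 2 columns.
scores = [
    [[0.1, 0.9]],  # class 0
    [[0.8, 0.2]],  # class 1
]
segmentation = predict_labels(scores)
print(segmentation)
```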
In summary, by constructing an average teacher framework with an auxiliary prototype classifier, the invention can address the domain discrepancy problem in cross-domain semantic segmentation of remote sensing images and effectively extract regions of interest from the target domain data. Compared with multi-stage UDA methods, it requires neither complex training techniques nor multi-stage training strategies. In addition, it achieves class-level alignment between the source domain and the target domain. Compared with prototype computation methods that use only the source domain or only the target domain, the prototypes obtained by clustering are domain-invariant.
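The prototype computation referred to here, in which per-class memory banks of both domains are joined and clustered, can be sketched as follows. This is an illustration under stated assumptions: the banks are modeled as fixed-length queues with `collections.deque`, a small pure-Python KMeans with deterministic initialization stands in for whatever KMeans implementation is actually used, and the queue length and the number of prototypes per class `k` are assumed values.

```python
from collections import deque


def kmeans(points, k, iters=20):
    """Plain Lloyd iterations; the first k points serve as initial centers."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def class_prototypes(source_bank, target_bank, k=1):
    """Concatenate the source and target banks of each class, then cluster."""
    prototypes = {}
    for cls, source_queue in source_bank.items():
        merged = list(source_queue) + list(target_bank.get(cls, []))
        prototypes[cls] = kmeans(merged, k)
    return prototypes


# Toy banks: class id -> queue of 2-D feature vectors (maxlen is assumed).
source_bank = {0: deque([[0.0, 0.0], [0.2, 0.0]], maxlen=4),
               1: deque([[5.0, 5.0]], maxlen=4)}
target_bank = {0: deque([[0.1, 0.3]], maxlen=4),
               1: deque([[5.2, 4.8]], maxlen=4)}
prototypes = class_prototypes(source_bank, target_bank, k=1)
print(prototypes)
```

Because each prototype is computed from features of both domains, it sits between the source and target feature distributions of its class instead of representing only one of them.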
In order to achieve the above embodiments, an embodiment of the present invention provides a computer readable storage medium, on which a domain-adaptive-based remote sensing image semantic segmentation program is stored, which implements the domain-adaptive-based remote sensing image semantic segmentation method described above when executed by a processor.
According to the computer-readable storage medium of the embodiment of the invention, the domain-adaptive remote sensing image semantic segmentation program is stored so that a processor implements the domain-adaptive remote sensing image semantic segmentation method described above when executing the program. As a result, the domain discrepancy problem in cross-domain semantic segmentation of remote sensing images can be addressed by constructing an average teacher framework with an auxiliary prototype classifier, regions of interest can be effectively extracted from the target domain data, and class-level alignment between the source domain and the target domain is achieved, which effectively improves segmentation performance on the target domain data.
In order to achieve the above embodiments, the embodiments of the present invention provide a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where when the processor executes the program, the method for semantic segmentation of a remote sensing image based on domain adaptation as described above is implemented.
According to the computer device provided by the embodiment of the invention, the domain-adaptive remote sensing image semantic segmentation program is stored in the memory so that the processor implements the domain-adaptive remote sensing image semantic segmentation method described above when executing the program. As a result, the domain discrepancy problem in cross-domain semantic segmentation of remote sensing images can be addressed by constructing an average teacher framework with an auxiliary prototype classifier, regions of interest can be effectively extracted from the target domain data, and class-level alignment between the source domain and the target domain is achieved, which effectively improves segmentation performance on the target domain data.
In order to achieve the above embodiment, the embodiment of the present invention further provides a domain-adaptive remote sensing image semantic segmentation device, as shown in fig. 3, where the domain-adaptive remote sensing image semantic segmentation device includes: an acquisition module 10, a model construction module 20, a training module 30 and a semantic segmentation module 40.
The acquisition module 10 is configured to acquire a remote sensing dataset, where the remote sensing dataset includes labeled source domain data and unlabeled target domain data; the model construction module 20 is configured to construct an average teacher framework with an auxiliary prototype classifier, wherein the framework includes a teacher model and a student model; the training module 30 is configured to train the student model and optimize its parameters using the remote sensing dataset, wherein the teacher model updates its parameters using an exponential moving average and the auxiliary prototype classifier updates its weights using an exponential moving average; the semantic segmentation module 40 is configured to input the unlabeled target domain data into the trained student model for point-by-point prediction, so as to obtain a segmentation result corresponding to the unlabeled target domain data.
It should be noted that the foregoing description of the domain-adaptive-based remote sensing image semantic segmentation method also applies to the domain-adaptive-based remote sensing image semantic segmentation device of the present embodiment, and is not repeated here.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
In the description of the present invention, it should be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," "secured," and the like are to be construed broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be internal communication between two elements or an interaction relationship between two elements. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
In the present invention, unless expressly stated or limited otherwise, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are in indirect contact via an intervening medium. Moreover, a first feature being "above," "over," or "on" a second feature may mean that the first feature is directly above or obliquely above the second feature, or simply that the first feature is at a higher level than the second feature. A first feature being "under," "below," or "beneath" a second feature may mean that the first feature is directly under or obliquely below the second feature, or simply that the first feature is at a lower level than the second feature.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms should not be understood as necessarily being directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of the different embodiments or examples, provided they do not contradict each other.
While embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the invention, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the invention.
Claims (8)
1. The domain self-adaption-based remote sensing image semantic segmentation method is characterized by comprising the following steps of:
acquiring a remote sensing data set, wherein the remote sensing data set comprises labeled source domain data and unlabeled target domain data;
constructing an average teacher framework with an auxiliary prototype classifier, wherein the average teacher framework with the auxiliary prototype classifier comprises a teacher model and a student model;
training and optimizing parameters of the student model by adopting the remote sensing data set, wherein the teacher model uses an exponential moving average to update its parameters, and the auxiliary prototype classifier uses an exponential moving average to update its weights;
and inputting the unlabeled target domain data into a trained student model for point-by-point prediction to obtain a segmentation result corresponding to the unlabeled target domain data.
2. The domain-adaptive-based remote sensing image semantic segmentation method according to claim 1, wherein constructing an average teacher framework with an auxiliary prototype classifier comprises:
the student model comprises a feature encoder and a parameterized classifier, and uses DeepLabV2 as its network structure with ResNet-101 as its backbone; the network structure and backbone of the teacher model are consistent with those of the student model;
respectively constructing, in the form of queues, memory banks of corresponding categories for the source domain data and the target domain data, so that the features output by the corresponding feature encoder, after embedding and filtering, are stored in the memory banks of the corresponding categories as feature vectors of different categories;
and concatenating the memory banks of the corresponding categories of the source domain and the target domain, and clustering them with the KMeans clustering algorithm to obtain the prototypes of the corresponding categories, which serve as the auxiliary prototype classifier.
3. The domain-adaptive-based remote sensing image semantic segmentation method of claim 2, wherein training and optimizing parameters of the student model using the remote sensing dataset comprises:
in a first epoch of training, training and parameter optimization are carried out on the student model by using source domain data with labels, and feature vectors of different types of the source domain data are stored in a source domain memory bank;
after the first epoch of training, the teacher model initializes its parameters using the parameters of the student model, and the source domain memory bank is clustered using the KMeans algorithm to obtain an initial prototype of each class as the auxiliary prototype classifier;
in the second epoch of training, the student model with the auxiliary prototype classifier is trained using the labeled source domain data; the teacher model predicts on the target domain data to obtain pseudo-labels of the target domain; the student model with the auxiliary prototype classifier is trained using the pseudo-labels of the target domain to update its parameters; meanwhile, during training, the feature vectors of different classes of the source domain data are stored in the source domain memory bank, and the feature vectors of different classes of the target domain data are stored in a target domain memory bank;
after the second epoch of training is finished, the teacher model updates its parameters through an exponential moving average of the parameters of the student model; in addition, the source domain memory bank and the target domain memory bank of the same class are concatenated, and the KMeans algorithm is used to cluster the concatenated result so as to obtain the prototypes of each class for updating the auxiliary prototype classifier;
in each subsequent epoch training process, the training and optimization of the student model and the teacher model differs from the second epoch in that the auxiliary prototype classifier updates the prototypes of each class in an exponentially moving average manner by clustering results on the memory banks of the source and target domains until training is completed.
4. A computer-readable storage medium, on which a domain-adaptive remote sensing image semantic segmentation program is stored, which when executed by a processor implements the domain-adaptive remote sensing image semantic segmentation method according to any one of claims 1-3.
5. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the domain-adaptive-based remote sensing image semantic segmentation method according to any one of claims 1-3 when executing the program.
6. A domain-adaptive-based remote sensing image semantic segmentation device, comprising:
the acquisition module is used for acquiring a remote sensing data set, wherein the remote sensing data set comprises labeled source domain data and unlabeled target domain data;
the model construction module is used for constructing an average teacher framework with an auxiliary prototype classifier, wherein the average teacher framework with the auxiliary prototype classifier comprises a teacher model and a student model;
the training module is used for training and optimizing parameters of the student model by adopting the remote sensing data set, wherein the teacher model uses an exponential moving average to update its parameters, and the auxiliary prototype classifier uses an exponential moving average to update its weights;
the semantic segmentation module is used for inputting the unlabeled target domain data into a trained student model for point-by-point prediction so as to obtain a segmentation result corresponding to the unlabeled target domain data.
7. The domain-adaptive-based remote sensing image semantic segmentation device according to claim 6, wherein constructing an average teacher framework with an auxiliary prototype classifier comprises:
the student model comprises a feature encoder and a parameterized classifier, and uses DeepLabV2 as its network structure with ResNet-101 as its backbone; the network structure and backbone of the teacher model are consistent with those of the student model;
respectively constructing, in the form of queues, memory banks of corresponding categories for the source domain data and the target domain data, so that the features output by the corresponding feature encoder, after embedding and filtering, are stored in the memory banks of the corresponding categories as feature vectors of different categories;
and concatenating the memory banks of the corresponding categories of the source domain and the target domain, and clustering them with the KMeans clustering algorithm to obtain the prototypes of the corresponding categories, which serve as the auxiliary prototype classifier.
8. The domain-adaptive-based remote sensing image semantic segmentation device according to claim 7, wherein training and optimizing parameters of the student model using the remote sensing dataset comprises:
in a first epoch of training, training and parameter optimization are carried out on the student model by using source domain data with labels, and feature vectors of different types of the source domain data are stored in a source domain memory bank;
after the first epoch of training, the teacher model initializes its parameters using the parameters of the student model, and the source domain memory bank is clustered using the KMeans algorithm to obtain an initial prototype of each class as the auxiliary prototype classifier;
in the second epoch of training, the student model with the auxiliary prototype classifier is trained using the labeled source domain data; the teacher model predicts on the target domain data to obtain pseudo-labels of the target domain; the student model with the auxiliary prototype classifier is trained using the pseudo-labels of the target domain to update its parameters; meanwhile, during training, the feature vectors of different classes of the source domain data are stored in the source domain memory bank, and the feature vectors of different classes of the target domain data are stored in a target domain memory bank;
after the second epoch of training is finished, the teacher model updates its parameters through an exponential moving average of the parameters of the student model; in addition, the source domain memory bank and the target domain memory bank of the same class are concatenated, and the KMeans algorithm is used to cluster the concatenated result so as to obtain the prototypes of each class for updating the auxiliary prototype classifier;
in each subsequent epoch training process, the training and optimization of the student model and the teacher model differs from the second epoch in that the auxiliary prototype classifier updates the prototype of each class in an exponentially moving average manner through the clustering results of the memory banks of the source domain and the target domain until the training is completed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310904440.9A CN117197451A (en) | 2023-07-21 | 2023-07-21 | Remote sensing image semantic segmentation method and device based on domain self-adaption |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310904440.9A CN117197451A (en) | 2023-07-21 | 2023-07-21 | Remote sensing image semantic segmentation method and device based on domain self-adaption |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117197451A (en) | 2023-12-08 |
Family
ID=88987612
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310904440.9A Pending CN117197451A (en) | 2023-07-21 | 2023-07-21 | Remote sensing image semantic segmentation method and device based on domain self-adaption |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117197451A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117671519A (en) * | 2023-12-14 | 2024-03-08 | 上海勘测设计研究院有限公司 | Method and system for extracting ground object of large-area remote sensing image |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107092870B (en) | A kind of high resolution image Semantic features extraction method | |
CN108230278B (en) | Image raindrop removing method based on generation countermeasure network | |
CN110516095B (en) | Semantic migration-based weak supervision deep hash social image retrieval method and system | |
CN113378906B (en) | Unsupervised domain adaptive remote sensing image semantic segmentation method with feature self-adaptive alignment | |
CN114092832B (en) | High-resolution remote sensing image classification method based on parallel hybrid convolutional network | |
CN108021947B (en) | A kind of layering extreme learning machine target identification method of view-based access control model | |
CN112347970B (en) | Remote sensing image ground object identification method based on graph convolution neural network | |
CN112115967B (en) | Image increment learning method based on data protection | |
CN107832835A (en) | The light weight method and device of a kind of convolutional neural networks | |
CN107527068A (en) | Model recognizing method based on CNN and domain adaptive learning | |
CN106920243A (en) | The ceramic material part method for sequence image segmentation of improved full convolutional neural networks | |
CN110728295B (en) | Semi-supervised landform classification model training and landform graph construction method | |
CN111695640B (en) | Foundation cloud picture identification model training method and foundation cloud picture identification method | |
CN109033107A (en) | Image search method and device, computer equipment and storage medium | |
CN111832484A (en) | Loop detection method based on convolution perception hash algorithm | |
CN110874590B (en) | Training and visible light infrared visual tracking method based on adapter mutual learning model | |
CN110281949B (en) | Unified hierarchical decision-making method for automatic driving | |
CN113269224B (en) | Scene image classification method, system and storage medium | |
CN104298974A (en) | Human body behavior recognition method based on depth video sequence | |
CN113111716A (en) | Remote sensing image semi-automatic labeling method and device based on deep learning | |
CN114842343A (en) | ViT-based aerial image identification method | |
CN117197451A (en) | Remote sensing image semantic segmentation method and device based on domain self-adaption | |
CN118230175B (en) | Real estate mapping data processing method and system based on artificial intelligence | |
CN111126155B (en) | Pedestrian re-identification method for generating countermeasure network based on semantic constraint | |
CN116524189A (en) | High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||