CN115205694A - Image segmentation method, device and computer-readable storage medium
- Publication number: CN115205694A (application CN202110325191.9A)
- Authority: CN (China)
- Legal status: Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Abstract
The disclosure relates to an image segmentation method, an image segmentation apparatus, and a computer-readable storage medium, in the technical field of computers. The method of the present disclosure comprises: inputting a source domain image and a target domain image into a first generative adversarial network, and training the first generative adversarial network based on adversarial learning to obtain a trained first generative adversarial network, wherein the first generator in the first generative adversarial network is an image segmentation model; dividing the target domain images into a first set and a second set, wherein the segmentation results obtained by passing the target domain images in the first set through the image segmentation model of the trained first generative adversarial network are used as annotation information, and the target domain images in the second set carry no annotation information; and inputting the target domain images in the first set and the second set into a second generative adversarial network, and training the second generative adversarial network based on adversarial learning to obtain a trained second generative adversarial network and determine the parameters of the image segmentation model.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an image segmentation method and apparatus, and a computer-readable storage medium.
Background
Remote sensing road image segmentation aims to separate road information from complex high-resolution remote sensing images and is a very challenging task. It is an important research direction in the field of remote sensing, with important applications in daily life such as vehicle navigation, map updating, city planning, and disaster rescue.
One method known to the inventors performs road segmentation of the image using domain adaptation. However, most domain adaptation methods have been studied on traditional natural-scene images and adapt between the domains of synthetic images and real images.
Disclosure of Invention
The inventor finds that: high-resolution remote sensing satellite images have a complex environmental background and an extremely unbalanced ratio between the numbers of background pixels and road pixels. The difference between the source domain images and the target domain images is large, and the differences among remote sensing images within the target domain are also large. If the model is trained directly with an inter-domain adaptation method, the trained model is only a rough preliminary model on the target domain and cannot be used for accurate road segmentation of subsequent remote sensing satellite images.
One technical problem to be solved by the present disclosure is: how to improve the accuracy of image segmentation.
According to some embodiments of the present disclosure, there is provided an image segmentation method comprising: inputting a source domain image and a target domain image into a first generative adversarial network, and training the first generative adversarial network based on adversarial learning to obtain a trained first generative adversarial network, wherein the first generator in the first generative adversarial network is an image segmentation model; dividing the target domain images into a first set and a second set, wherein the segmentation results obtained by passing the target domain images in the first set through the image segmentation model of the trained first generative adversarial network are used as annotation information, and the target domain images in the second set carry no annotation information; and inputting the target domain images in the first set and the second set into a second generative adversarial network, and training the second generative adversarial network based on adversarial learning to obtain a trained second generative adversarial network and determine the parameters of the image segmentation model, wherein the second generator in the second generative adversarial network is the image segmentation model, whose initial parameters are assigned the parameters of the first generator in the trained first generative adversarial network.
In some embodiments, inputting the source domain image and the target domain image into the first generative adversarial network comprises: inputting the source domain image and the target domain image into the feature extraction layer of the image segmentation model to obtain a first feature map of the source domain image and a second feature map of the target domain image; inputting the first feature map and the second feature map into the upsampling layer of the image segmentation model to obtain a third feature map of the source domain image with the same size as the source domain image and a fourth feature map of the target domain image with the same size as the target domain image; and inputting the third feature map and the fourth feature map into the softmax layer of the image segmentation model to obtain the segmentation result of the source domain image and the segmentation result of the target domain image.
In some embodiments, the image segmentation model comprises a plurality of feature extraction layers whose downsampling factors increase successively in the order in which the source domain image or the target domain image passes through them, and the last two feature extraction layers are each connected to an atrous spatial pyramid pooling module; where the downsampling factor of one or more of the feature extraction layers would exceed a threshold, atrous convolution is used in those layers so that their downsampling factor is kept at the threshold.
In some embodiments, inputting the source domain image and the target domain image into the feature extraction layer of the image segmentation model to obtain the first feature map of the source domain image and the second feature map of the target domain image comprises: passing the source domain image and the target domain image through the feature extraction layers in sequence; inputting the features output by the last two feature extraction layers traversed by the source domain image into the atrous spatial pyramid pooling modules to obtain multi-scale global features of the source domain image corresponding to the two feature layers, and fusing them to obtain the first feature map of the source domain image; and inputting the features output by the last two feature extraction layers traversed by the target domain image into the atrous spatial pyramid pooling modules to obtain multi-scale global features of the target domain image corresponding to the two feature layers, and fusing them to obtain the second feature map of the target domain image.
In some embodiments, training the first generative adversarial network based on adversarial learning comprises: in each training period, adjusting the parameters of the image segmentation model according to the difference between the segmentation result of the source domain image after passing through the image segmentation model and the annotation information of the source domain image; readjusting the image segmentation model based on adversarial learning, and adjusting the parameters of the first discriminator in the first generative adversarial network; and repeating this process until training of the first generative adversarial network is completed.
In some embodiments, readjusting the image segmentation model based on adversarial learning comprises: inputting the segmentation result of the target domain image after passing through the image segmentation model into the first discriminator, and discriminating the domain type of the target domain image to obtain a discrimination result of the target domain image; determining a first adversarial loss function according to the discrimination result of the target domain image; and adjusting the image segmentation model again according to the first adversarial loss function.
In some embodiments, adjusting the parameters of the first discriminator in the first generative adversarial network comprises: inputting the segmentation result of the source domain image and the segmentation result of the target domain image after passing through the image segmentation model into the first discriminator, and discriminating the domain types of the source domain image and the target domain image to obtain a discrimination result of the source domain image and a discrimination result of the target domain image; determining a first cross-entropy loss function according to the discrimination result of the source domain image and the discrimination result of the target domain image; and adjusting the parameters of the first discriminator according to the first cross-entropy loss function.
In some embodiments, dividing the target domain images into the first set and the second set comprises: inputting a target domain image into the image segmentation model of the trained first generative adversarial network to obtain a segmentation result of the target domain image, and generating a pseudo label of the target domain image, wherein the pseudo label marks each pixel in the target domain image as belonging to the target or the background; determining a score for the target domain image from its segmentation result and its pseudo label, wherein the higher the probability values, in the segmentation result, of the pixels pseudo-labelled as target, and the fewer the pixels the pseudo label marks as target, the higher the score of the target domain image; and selecting part of the target domain images to form the first set according to the scores of all target domain images, the remaining target domain images forming the second set, wherein the pseudo labels of the target domain images in the first set are used as their annotation information.
In some embodiments, the score of a target domain image is determined using the following formula:

$$s(x_t) = \frac{\sum_{i,j} p_t^{(i,j)} \,\mathbb{1}\big(m_t^{(i,j)} = 1\big)}{\sum_{i,j} \mathbb{1}\big(m_t^{(i,j)} = 1\big)} \tag{4}$$

where $p_t^{(i,j)}$ represents the probability that the pixel at $(i,j)$ in the target domain image belongs to the target, $m_t^{(i,j)}$ represents the pseudo label of the pixel at $(i,j)$, $m_t^{(i,j)} = 1$ indicates that the pixel belongs to the target, and $m_t^{(i,j)} = 0$ indicates that the pixel belongs to the background.
In some embodiments, training the second generative adversarial network based on adversarial learning comprises: in each training period, adjusting the parameters of the image segmentation model according to the difference between the segmentation results of the target domain images in the first set after passing through the image segmentation model and the annotation information of the target domain images in the first set; readjusting the image segmentation model based on adversarial learning, and adjusting the parameters of the second discriminator in the second generative adversarial network; and repeating this process until training of the second generative adversarial network is completed.
In some embodiments, readjusting the image segmentation model based on adversarial learning comprises: inputting the segmentation results of the target domain images in the second set after passing through the image segmentation model into the second discriminator, and discriminating the domain type of the target domain images in the second set to obtain their discrimination results; determining a second adversarial loss function according to the discrimination results of the target domain images in the second set; and adjusting the image segmentation model again according to the second adversarial loss function.
In some embodiments, adjusting the parameters of the second discriminator in the second generative adversarial network comprises: inputting the segmentation results of the target domain images in the first set and in the second set after passing through the image segmentation model into the second discriminator, and discriminating the domain types of the target domain images in the two sets to obtain discrimination results for the target domain images in the first set and in the second set; determining a second cross-entropy loss function according to these discrimination results; and adjusting the parameters of the second discriminator according to the second cross-entropy loss function.
In some embodiments, the source domain images and the target domain images are remote sensing satellite images containing roads; the method further comprises: cutting the training sample images and the label mask images of the training sample images into a plurality of training sample image blocks and label mask image blocks of a preset size, wherein the training sample images comprise remote sensing satellite images, the value of each pixel in a label mask image is 0 or 1, a value of 1 indicating that the pixel at the same position in the training sample image belongs to a road and a value of 0 indicating that it belongs to the background; selecting the label mask image blocks in which the number of pixels with value 1 exceeds a preset number, together with the corresponding training sample image blocks; performing data enhancement on the selected training sample image blocks to increase their number, obtaining preprocessed training sample image blocks; and dividing the preprocessed training sample image blocks into source domain images and target domain images.
In some embodiments, the method further comprises: and inputting the image to be segmented into the determined image segmentation model to obtain a segmentation result of the image to be segmented.
According to other embodiments of the present disclosure, there is provided an image segmentation apparatus comprising: a first training module for inputting a source domain image and a target domain image into a first generative adversarial network and training the first generative adversarial network based on adversarial learning to obtain a trained first generative adversarial network, wherein the first generator in the first generative adversarial network is an image segmentation model; a dividing module for dividing the target domain images into a first set and a second set, wherein the segmentation results obtained by passing the target domain images in the first set through the image segmentation model of the trained first generative adversarial network are used as annotation information, and the target domain images in the second set carry no annotation information; and a second training module for inputting the target domain images in the first set and the second set into a second generative adversarial network and training the second generative adversarial network based on adversarial learning to obtain a trained second generative adversarial network and determine the parameters of the image segmentation model, wherein the second generator in the second generative adversarial network is the image segmentation model, whose initial parameters are assigned the parameters of the first generator in the trained first generative adversarial network.
According to still other embodiments of the present disclosure, there is provided an image segmentation apparatus including: a processor; and a memory coupled to the processor for storing instructions that, when executed by the processor, cause the processor to perform the image segmentation method of any of the preceding embodiments.
According to still further embodiments of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the image segmentation method of any of the foregoing embodiments.
The method trains a first generative adversarial network with source domain images and target domain images, divides the target domain images into a first set and a second set according to the image segmentation model in the trained first generative adversarial network, and trains a second generative adversarial network based on the first and second sets, with the image segmentation model serving as the generator. The disclosed scheme is a two-stage unsupervised model training method that combines inter-domain adaptation between the source domain and the target domain with intra-domain adaptation within the target domain. In the first stage, adversarial learning is used to achieve inter-domain adaptation, reducing the inter-domain difference between the source and target domains as well as the intra-domain differences within the target domain. In the second stage, adversarial learning gradually eliminates the intra-domain differences, progressively improving the robustness and generalization of the model on the target domain and its segmentation performance, thereby improving the accuracy of image segmentation.
Other features of the present disclosure and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present disclosure, and other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 illustrates a flow diagram of an image segmentation method of some embodiments of the present disclosure.
Fig. 2 shows a schematic structural diagram of a model of some embodiments of the present disclosure.
Fig. 3 shows a flow diagram of an image segmentation method of further embodiments of the present disclosure.
Fig. 4 shows a schematic structural diagram of an image segmentation apparatus of some embodiments of the present disclosure.
Fig. 5 shows a schematic structural diagram of an image segmentation apparatus according to further embodiments of the present disclosure.
Fig. 6 is a schematic structural diagram of an image segmentation apparatus according to still other embodiments of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The image segmentation method of the present disclosure is suitable not only for road segmentation of remote sensing satellite images, but also for other scenes with complex images, large differences between source domain and target domain images, and large differences among images within the target domain. The image segmentation method of the present disclosure is described below with reference to figs. 1 to 3.
Fig. 1 is a flow chart of some embodiments of the disclosed image segmentation method. As shown in fig. 1, the method of this embodiment includes: steps S102 to S106.
In step S102, the source domain image and the target domain image are input into a first generative adversarial network, and the first generative adversarial network is trained based on adversarial learning to obtain a trained first generative adversarial network.
In a road segmentation scene for remote sensing satellite images, the source domain images and the target domain images are remote sensing satellite images containing roads. As shown in fig. 2, the first generative adversarial network includes a first generator and a first discriminator, and the first generator is an image segmentation model. In some embodiments, the image segmentation model includes a feature extraction layer, an upsampling layer, and a softmax layer. For example, an existing network model such as ResNet-101 or ResNet-50 may be used as the backbone network for feature extraction, serving as the feature extraction layer; the choice is not limited here. The upsampling layer upsamples the feature map (or feature tensor) output by the feature extraction layer and restores it to the same size as the input image. The upsampled feature map has two channels in the depth direction, and after normalization by the softmax layer, each pixel corresponds to a two-dimensional vector giving the predicted probabilities of background and target (for example, road), which serves as the segmentation result.
In some embodiments, the source domain image and the target domain image are input into the feature extraction layer of the image segmentation model to obtain a first feature map of the source domain image and a second feature map of the target domain image; the first feature map and the second feature map are input into the upsampling layer of the image segmentation model to obtain a third feature map of the source domain image with the same size as the source domain image and a fourth feature map of the target domain image with the same size as the target domain image; and the third feature map and the fourth feature map are input into the softmax layer of the image segmentation model to obtain the segmentation result of the source domain image and the segmentation result of the target domain image. The segmentation results of the source domain image and the target domain image can then be input into the first discriminator for discrimination, as described in subsequent embodiments.
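As an illustration only, the forward pass described above might look as follows in PyTorch; the class name, the backbone interface, and the `backbone_channels` argument are assumptions of this sketch, not details taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegmentationGenerator(nn.Module):
    """Sketch of the first generator: feature extraction -> upsampling -> softmax."""

    def __init__(self, backbone: nn.Module, backbone_channels: int, num_classes: int = 2):
        super().__init__()
        self.backbone = backbone  # feature extraction layers (e.g., a ResNet trunk)
        self.classifier = nn.Conv2d(backbone_channels, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[2:]
        feat = self.backbone(x)             # first/second feature map
        logits = self.classifier(feat)
        # upsampling layer: restore the map to the size of the input image
        logits = F.interpolate(logits, size=(h, w), mode="bilinear", align_corners=False)
        # softmax layer: per-pixel probabilities for (background, target)
        return F.softmax(logits, dim=1)
```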
To address the problem that long, thin roads easily lose information in remote sensing road segmentation scenes, the present disclosure also provides an improved image segmentation model.
In some embodiments, as shown in fig. 2, the image segmentation model comprises a plurality of feature extraction layers whose downsampling factors increase successively in the order in which the source domain image or the target domain image passes through them, and the last two feature extraction layers are each connected to an Atrous Spatial Pyramid Pooling module (ASPP). Where the downsampling factor of one or more of the feature extraction layers would exceed a threshold, atrous convolution is used in those layers so that their downsampling factor is kept at the threshold.
Further, for example, the source domain image and the target domain image are passed through the feature extraction layers in sequence; the features output by the last two feature extraction layers traversed by the source domain image are input into the atrous spatial pyramid pooling modules to obtain multi-scale global features of the source domain image corresponding to the two feature layers, which are fused to obtain the first feature map of the source domain image. Likewise, the features output by the last two feature extraction layers traversed by the target domain image are input into the atrous spatial pyramid pooling modules to obtain multi-scale global features of the target domain image corresponding to the two feature layers, which are fused to obtain the second feature map of the target domain image.
For example, the image segmentation model adopts ResNet-101 as the backbone network for feature extraction, and the image (source domain image or target domain image) passes through the feature extraction layers to obtain hierarchical features $C_1, C_2, C_3, C_4, C_5$. Since the last feature extraction layer uses 32x downsampling, road information is lost, while maintaining large-scale features increases the amount of computation and reduces the receptive field of the model. Therefore, atrous convolution is used in the last feature extraction layer so that its downsampling factor remains at the threshold, which enlarges the receptive field of the model; in the inventors' tests, a threshold of 16 gave the best overall results. Further, the atrous spatial pyramid pooling module is applied at the last two feature levels $C_4$ and $C_5$ to obtain multi-scale global features, which are finally fused, upsampled, and normalized to obtain the segmentation result of the image.
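A minimal sketch of an ASPP block of the kind described is given below; the dilation rates are illustrative assumptions, since the patent does not specify them. One such block would be applied to each of $C_4$ and $C_5$ before fusion.

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Parallel atrous convolutions gathering multi-scale global context."""

    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch,
                      kernel_size=1 if r == 1 else 3,
                      padding=0 if r == 1 else r,
                      dilation=r)
            for r in rates
        )
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # each branch sees the same feature map at a different receptive field
        return self.project(torch.cat([branch(x) for branch in self.branches], dim=1))
```

The two ASPP outputs for $C_4$ and $C_5$ can then be fused, for example by concatenation and projection, to form the feature map passed to the upsampling layer.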
In this way, an atrous spatial pyramid pooling module is designed for the road characteristics of remote sensing road images, improving the robustness and effective representation of road information during feature extraction and upsampling, and thereby improving remote sensing road segmentation performance. The atrous spatial pyramid pooling module balances model performance against model complexity: atrous convolution keeps the deep features at high resolution so that road information is not lost, and multi-level feature fusion then further enhances the road representation explicitly, improving the discriminability of road information and the prediction accuracy of the final segmentation result.
After the source domain image and the target domain image are input into the first generative adversarial network, the image segmentation model outputs the segmentation result of the source domain image and the segmentation result of the target domain image, each including, for example, the probability of every pixel belonging to the target and to the background. As shown in fig. 2, training the first generative adversarial network may include: in each training period (epoch), adjusting the parameters of the image segmentation model according to the difference between the segmentation result of the source domain image after passing through the image segmentation model and the annotation information of the source domain image; readjusting the image segmentation model based on adversarial learning, and adjusting the parameters of the first discriminator in the first generative adversarial network; and repeating this process until training of the first generative adversarial network is completed.
A segmentation loss function of the source domain can be determined from the difference between the segmentation result of the source domain image after passing through the image segmentation model and the annotation information of the source domain image, and the parameters of the image segmentation model are adjusted based on this loss. The segmentation loss of the source domain may use a cross-entropy loss. For example, the annotated set of source domain images is denoted $(X_s, Y_s)$, where $X_s$ represents the source domain image data and $Y_s$ the annotation information, and the set of unlabeled target domain images is denoted $X_t$. The image segmentation model, serving as the first generator, is denoted $G_{inter}$. The segmentation loss of the source domain can be expressed as:
$$L_{seg}(x_s) = -\sum_{h,w}\sum_{c} y_s^{(h,w,c)} \log P_s^{(h,w,c)} \tag{1}$$

In formula (1), $(h, w)$ is the two-dimensional position of each pixel in the source domain image, $c$ indexes the segmentation categories, i.e., the channels, $y_s^{(h,w,c)}$ is the annotation of each pixel in each source domain image (e.g., 1 for target and 0 for background), $x_s$ denotes the data of each source domain image, and $P_s = G_{inter}(x_s)$ is its segmentation result.
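Up to a normalization constant, formula (1) is an ordinary per-pixel cross-entropy. A sketch, assuming one-hot labels and softmax probabilities as inputs:

```python
import torch

def source_segmentation_loss(p_s: torch.Tensor, y_s: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Formula (1): p_s and y_s are (N, C, H, W); y_s is a one-hot annotation."""
    return -(y_s * torch.log(p_s + eps)).sum(dim=1).mean()
```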
In some embodiments, as shown in fig. 2, readjusting the image segmentation model based on adversarial learning comprises: inputting the segmentation result of the target domain image after passing through the image segmentation model into the first discriminator, and discriminating the domain type of the target domain image to obtain its discrimination result; determining a first adversarial loss function according to the discrimination result of the target domain image; and adjusting the image segmentation model again according to the first adversarial loss function. The first adversarial loss function may be determined using the following equation.
$$L_{adv}(x_t) = -\sum_{h,w} \log\big(D_{inter}(P_t)^{(h,w)}\big) \tag{2}$$

In formula (2), $D_{inter}(\cdot)$ denotes the first discriminator, and $(h, w)$ is the two-dimensional position of each pixel in the target domain image.
In some embodiments, as shown in fig. 2, adjusting the parameters of the first discriminator in the first generative adversarial network includes: inputting the segmentation result of the source domain image and the segmentation result of the target domain image after passing through the image segmentation model into the first discriminator, and discriminating the domain types of the source domain image and the target domain image to obtain their discrimination results; determining a first cross-entropy loss function according to the discrimination result of the source domain image and the discrimination result of the target domain image; and adjusting the parameters of the first discriminator according to the first cross-entropy loss function. The first cross-entropy loss function may be determined using the following equation.
$$L_d = -\sum_{h,w} \Big[\log\big(D_{inter}(P_s)^{(h,w)}\big) + \log\big(1 - D_{inter}(P_t)^{(h,w)}\big)\Big] \tag{3}$$

In formula (3), $P_s = G_{inter}(X_s)$ represents the segmentation result of the source domain image, and $P_t = G_{inter}(X_t)$ represents the segmentation result of the target domain image.
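A sketch of the two adversarial losses, assuming a fully convolutional first discriminator whose per-pixel logits use label 1 for "source" and 0 for "target"; the function names are illustrative.

```python
import torch
import torch.nn.functional as F

def generator_adv_loss(d_out_target: torch.Tensor) -> torch.Tensor:
    # formula (2): the generator tries to make the discriminator call
    # target-domain segmentation results "source" (label 1)
    return F.binary_cross_entropy_with_logits(
        d_out_target, torch.ones_like(d_out_target))

def discriminator_loss(d_out_source: torch.Tensor, d_out_target: torch.Tensor) -> torch.Tensor:
    # formula (3): the discriminator labels source results 1 and target results 0
    return (F.binary_cross_entropy_with_logits(d_out_source, torch.ones_like(d_out_source))
            + F.binary_cross_entropy_with_logits(d_out_target, torch.zeros_like(d_out_target)))
```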
Through adversarial learning, this embodiment plays a continuing adversarial game in the feature space, so that the model produces similar feature distributions on the target domain and the source domain, improving the robustness and generalization of the model.
In step S104, the target domain images are divided into a first set and a second set.
In some embodiments, the segmentation results obtained by passing the target domain images in the first set through the image segmentation model of the trained first generative adversarial network are used as their annotation information, while the target domain images in the second set carry no annotation information. To further improve the accuracy of model training, the probability that a target (e.g., a road) is present in a target domain image is considered when forming the first set and the second set.
In some embodiments, a target domain image is input into the image segmentation model of the trained first generative adversarial network to obtain its segmentation result, and a pseudo label of the target domain image is generated, the pseudo label marking each pixel in the target domain image as belonging to the target or the background; a score for the target domain image is determined from its segmentation result and its pseudo label; and part of the target domain images are selected to form the first set according to the scores of all target domain images, the remaining target domain images forming the second set, with the pseudo labels of the first-set target domain images used as their annotation information. The higher the probability values, in the segmentation result, of the pixels pseudo-labelled as target, and the fewer the pixels the pseudo label marks as target, the higher the score of the target domain image. For example, the score of a target domain image is determined using the following formula.
$$s(x_t) = \frac{\sum_{i,j} p_t^{(i,j)} \,\mathbb{1}\big(m_t^{(i,j)} = 1\big)}{\sum_{i,j} \mathbb{1}\big(m_t^{(i,j)} = 1\big)} \tag{4}$$

where $p_t^{(i,j)}$ represents the probability that the pixel at $(i,j)$ in the target domain image belongs to the target, $m_t^{(i,j)}$ represents the pseudo label of the pixel at $(i,j)$, $m_t^{(i,j)} = 1$ indicates that the pixel belongs to the target, and $m_t^{(i,j)} = 0$ indicates that the pixel belongs to the background.
The target domain images can be sorted by score from high to low, the target domain images ranked before a preset position selected as the images in the first set, and the remaining target domain images used as the images in the second set. For example, the top 70% of the ranked images form the first set, and the remaining 30% of the target domain images form the second set.
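A sketch of the scoring and the split, under the same reading of formula (4) as above (the mean predicted target probability over pixels pseudo-labelled as target); the 0.7 ratio follows the 70%/30% example.

```python
import torch

def image_score(target_prob: torch.Tensor, pseudo_label: torch.Tensor) -> float:
    """target_prob and pseudo_label are (H, W); pseudo_label is 1 for target, 0 for background."""
    target = pseudo_label == 1
    return target_prob[target].mean().item() if target.any() else 0.0

def split_target_domain(images, scores, ratio: float = 0.7):
    order = sorted(range(len(images)), key=lambda i: scores[i], reverse=True)
    k = int(len(images) * ratio)
    first = [images[i] for i in order[:k]]   # kept, pseudo labels used as annotations
    second = [images[i] for i in order[k:]]  # no annotation information
    return first, second
```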
In dividing the intra-domain first set and second set as above, the method scores primarily the target (road) information and sorts and splits the target domain according to these scores, fully taking into account the target information of principal concern.
In step S106, the target domain images in the first set and the second set are input into a second generative adversarial network, and the second generative adversarial network is trained based on adversarial learning to obtain a trained second generative adversarial network, thereby determining the parameters of the image segmentation model.
As shown in FIG. 2, the second generator in the second generative adversarial network is an image segmentation model whose initial parameters are assigned the parameters of the first generator in the trained first generative adversarial network. The second generative adversarial network comprises a second generator and a second discriminator; the second generator is an image segmentation model with the same structure as that of the first generative adversarial network, initialized with the parameters of the trained image segmentation model of the first generative adversarial network. The second generative adversarial network can be trained in a similar way to the first.
After the target domain images in the first set and the second set are input into the second generative adversarial network, the image segmentation model outputs their segmentation results. As shown in fig. 2, training the second generative adversarial network may include: in each training period, adjusting the parameters of the image segmentation model according to the difference between the segmentation results of the first-set target domain images after passing through the image segmentation model and the annotation information of the first-set target domain images; readjusting the image segmentation model based on adversarial learning, and adjusting the parameters of the second discriminator in the second generative adversarial network; and repeating this process until training of the second generative adversarial network is completed.
A segmentation loss function of the target domain can be determined from the difference between the segmentation results of the first-set target domain images after passing through the image segmentation model and their annotation information, and the parameters of the image segmentation model are adjusted based on this loss. The segmentation loss of the target domain may use a cross-entropy loss. For example, the first set is denoted $(X_{te}, M_{te})$, where $X_{te}$ represents the target domain image data in the first set and $M_{te}$ the annotation information, and the unannotated second set is denoted $X_{th}$. The image segmentation model, serving as the second generator, is denoted $G_{intra}$. The segmentation loss of the target domain can be expressed as:
$$L_{seg}(x_{te}) = -\sum_{h,w}\sum_{c} m_{te}^{(h,w,c)} \log P_{te}^{(h,w,c)} \tag{5}$$

In formula (5), $(h, w)$ is the two-dimensional position of each pixel in the target domain image, $c$ indexes the segmentation categories, i.e., the channels, $m_{te}^{(h,w,c)}$ is the annotation of each pixel in each first-set target domain image (e.g., 1 for target and 0 for background), $x_{te}$ denotes the data of each first-set target domain image, and $P_{te} = G_{intra}(x_{te})$ is its segmentation result.
In some embodiments, as shown in fig. 2, readjusting the image segmentation model based on adversarial learning includes: inputting the segmentation results of the second-set target domain images after passing through the image segmentation model into the second discriminator, and discriminating their domain type to obtain discrimination results for the second-set target domain images; determining a second adversarial loss function according to these discrimination results; and adjusting the image segmentation model again according to the second adversarial loss function. The second adversarial loss function may be determined using the following equation.
$$L_{adv}(x_{th}) = -\sum_{h,w} \log\big(D_{intra}(P_{th})^{(h,w)}\big) \tag{6}$$

In formula (6), $D_{intra}(\cdot)$ denotes the second discriminator, and $(h, w)$ is the two-dimensional position of each pixel in the target domain image.
In some embodiments, as shown in fig. 2, adjusting the parameters of the second discriminator in the second generative adversarial network comprises: inputting the segmentation results of the first-set and second-set target domain images after passing through the image segmentation model into the second discriminator, and discriminating the domain types of the target domain images in the two sets to obtain their discrimination results; determining a second cross-entropy loss function according to the discrimination results of the first-set and second-set target domain images; and adjusting the parameters of the second discriminator according to the second cross-entropy loss function. The second cross-entropy loss function can be determined using the following formula.
$$L_d = -\sum_{h,w} \Big[\log\big(D_{intra}(P_{te})^{(h,w)}\big) + \log\big(1 - D_{intra}(P_{th})^{(h,w)}\big)\Big] \tag{7}$$

In formula (7), $P_{te} = G_{intra}(X_{te})$ represents the segmentation results of the first-set target domain images, and $P_{th} = G_{intra}(X_{th})$ represents the segmentation results of the second-set target domain images.
Adjusting the parameters of the second generative adversarial network, i.e., performing intra-domain adaptation according to formulas (5)-(7), yields an image segmentation model with better segmentation performance, which in turn can generate more accurate pseudo labels. By repeatedly adjusting the parameters of the second generative adversarial network, the robustness and accuracy of the image segmentation model on the target domain are gradually improved until performance saturates, and the required image segmentation model $G_{intra}$ is finally obtained to perform the segmentation task on target domain data.
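Schematically, the two stages compose as sketched below; `train_epoch`, `rank_and_split`, and `saturated` are hypothetical callables standing in for the loss updates of formulas (1)-(3) and (5)-(7), the scoring and split of formula (4), and the stopping test, respectively.

```python
def train_two_stage(source, target, generator, d_inter, d_intra,
                    epochs, train_epoch, rank_and_split, saturated):
    # stage 1: inter-domain adaptation between the source and target domains
    for _ in range(epochs):
        train_epoch(generator, d_inter, labeled=source, unlabeled=target)
    # stage 2: intra-domain adaptation within the target domain, repeated
    # until segmentation performance saturates
    while not saturated(generator):
        first, second = rank_and_split(generator, target)
        for _ in range(epochs):
            train_epoch(generator, d_intra, labeled=first, unlabeled=second)
    return generator
```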
Optionally, in step S108, the image to be segmented is input into the determined image segmentation model, so as to obtain a segmentation result of the image to be segmented, and further, the target and the background in the image to be segmented can be determined.
In the above embodiment, the source domain images and the target domain images are used to train the first generative adversarial network, the target domain images are divided into a first set and a second set according to the image segmentation model in the trained first generative adversarial network, and the second generative adversarial network is trained based on the first and second sets, with the image segmentation model serving as the generator. The disclosed scheme is a two-stage unsupervised model training method that combines inter-domain adaptation between the source domain and the target domain with intra-domain adaptation within the target domain. In the first stage, adversarial learning is used to achieve inter-domain adaptation, reducing the inter-domain difference between the source and target domains as well as the intra-domain differences within the target domain. In the second stage, adversarial learning gradually eliminates the intra-domain differences, progressively improving the robustness and generalization of the model on the target domain and its segmentation performance, thereby improving the accuracy of image segmentation.
Because remote sensing images are of non-uniform size, with widths and heights between 1000 and 5000 pixels, feeding the high-resolution images directly into a deep network may overload it and exhaust hardware resources. Moreover, the numbers of background and target pixels in high-resolution images differ greatly, i.e., remote sensing satellite images have an extreme imbalance between background and road pixel samples, and direct training may prevent the network from effectively learning road features. To address this, the training sample images may be preprocessed, as described below in conjunction with fig. 3. The preprocessing method can also be applied to other complex images with an unbalanced distribution of targets and background.
FIG. 3 is a flow chart of further embodiments of the image segmentation method of the present disclosure. As shown in fig. 3, the method of this embodiment includes: steps S302 to S308.
In step S302, the training sample images and the label mask images of the training sample images are respectively cut into a plurality of training sample image blocks and label mask image blocks according to a preset size.
The training sample images include remote sensing satellite images, and may also be other high-resolution images with a large difference between the numbers of target and background pixels. The value of each pixel in a label mask image is 0 or 1: a value of 1 indicates that the pixel at the same position in the training sample image belongs to the target (road), and a value of 0 indicates that it belongs to the background.
For example, given a high-resolution remote sensing image, image blocks of size 512×512 are randomly cropped from it, with 20 to 30 blocks cropped depending on the size of the input image; the same cropping is performed synchronously on the label mask image to ensure label accuracy.
In step S304, label mask image blocks in which the number of pixels with value 1 exceeds a preset number are selected, together with the corresponding training sample image blocks.
Since randomly cropped image blocks may contain no road, the cropped image blocks need to be screened. The number of road pixel samples in each block is counted from the cropped mask image block; given a preset number such as 4000, image blocks whose road pixel count exceeds the preset number are kept as valid data, and the rest are discarded.
In step S306, data enhancement is performed on the training sample image blocks corresponding to the selected label mask image blocks, and the number of the training sample image blocks is increased to obtain the preprocessed training sample image blocks.
Data enhancement strategies such as flipping and rotation can be applied to the screened image blocks, increasing the sample size and diversity of the data and improving the generalization and robustness of the model.
In step S308, the preprocessed training sample image blocks are divided into a source domain image and a target domain image.
The source domain images carry annotation information, and the target domain images carry no annotation information.
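A sketch of steps S302-S306 using NumPy, assuming images of at least 512×512 pixels and mask value 1 for road; the crop count and the 4000-pixel threshold follow the examples in the text.

```python
import numpy as np

def crop_screen_augment(image, mask, crop=512, n_crops=25, min_road_pixels=4000):
    """Randomly crop image/mask blocks, keep road-rich blocks, then augment them."""
    blocks = []
    h, w = mask.shape
    for _ in range(n_crops):
        y = np.random.randint(0, h - crop + 1)
        x = np.random.randint(0, w - crop + 1)
        img_b = image[y:y + crop, x:x + crop]
        msk_b = mask[y:y + crop, x:x + crop]
        if (msk_b == 1).sum() > min_road_pixels:               # screening (step S304)
            for k in range(4):                                 # rotations (step S306)
                blocks.append((np.rot90(img_b, k), np.rot90(msk_b, k)))
            blocks.append((np.fliplr(img_b), np.fliplr(msk_b)))  # horizontal flip
    return blocks
```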
Through this preprocessing of the training sample images, enough source domain and target domain images suitable for training can be obtained. The method of this embodiment addresses the poor adaptability of traditional domain adaptation methods on high-resolution remote sensing images: a set of cropping, screening, and data enhancement strategies for remote sensing road images is designed, improving the compatibility of remote sensing images with domain adaptation methods from the traditional computer vision field.
The present disclosure also provides an image segmentation apparatus, which is described below in conjunction with fig. 4.
Fig. 4 is a block diagram of some embodiments of an image segmentation apparatus of the present disclosure. As shown in fig. 4, the apparatus 40 of this embodiment includes: a first training module 410, a partitioning module 420, and a second training module 430.
The first training module 410 is configured to input the source domain image and the target domain image into a first generative adversarial network and to train the first generative adversarial network based on adversarial learning to obtain a trained first generative adversarial network, wherein the first generator in the first generative adversarial network is an image segmentation model.
In some embodiments, the first training module 410 is configured to input the source domain image and the target domain image into the feature extraction layer of the image segmentation model to obtain a first feature map of the source domain image and a second feature map of the target domain image; to input the first feature map and the second feature map into the upsampling layer of the image segmentation model to obtain a third feature map of the source domain image with the same size as the source domain image and a fourth feature map of the target domain image with the same size as the target domain image; and to input the third feature map and the fourth feature map into the softmax layer of the image segmentation model to obtain the segmentation result of the source domain image and the segmentation result of the target domain image.
In some embodiments, the image segmentation model comprises a plurality of feature extraction layers whose downsampling factors increase successively in the order in which the source domain image or the target domain image passes through them, and the last two feature extraction layers are each connected to an atrous spatial pyramid pooling module; where the downsampling factor of one or more of the feature extraction layers would exceed a threshold, atrous convolution is used in those layers so that their downsampling factor is kept at the threshold.
In some embodiments, the first training module 410 is configured to pass the source domain image and the target domain image through the feature extraction layers in sequence; to input the features output by the last two feature extraction layers traversed by the source domain image into the atrous spatial pyramid pooling modules to obtain multi-scale global features of the source domain image corresponding to the two feature layers, and to fuse them to obtain the first feature map of the source domain image; and to input the features output by the last two feature extraction layers traversed by the target domain image into the atrous spatial pyramid pooling modules to obtain multi-scale global features of the target domain image corresponding to the two feature layers, and to fuse them to obtain the second feature map of the target domain image.
In some embodiments, the first training module 410 is configured, in each training period, to adjust the parameters of the image segmentation model according to the difference between the segmentation result of the source domain image after passing through the image segmentation model and the annotation information of the source domain image; to readjust the image segmentation model based on adversarial learning and adjust the parameters of the first discriminator in the first generative adversarial network; and to repeat this process until training of the first generative adversarial network is completed.
In some embodiments, the first training module 410 is configured to input the segmentation result of the target domain image after passing through the image segmentation model into the first discriminator and to discriminate the domain type of the target domain image to obtain its discrimination result; to determine a first adversarial loss function according to the discrimination result of the target domain image; and to adjust the image segmentation model again according to the first adversarial loss function.
In some embodiments, the first training module 410 is configured to input a segmentation result of the source domain image after passing through the image segmentation model and a segmentation result of the target domain image after passing through the image segmentation model into the first discriminator, and perform domain type discrimination on the source domain image and the target domain image respectively to obtain a discrimination result of the source domain image and a discrimination result of the target domain image; determining a first cross entropy loss function according to the discrimination result of the source domain image and the discrimination result of the target domain image; and adjusting the parameters of the first discriminator according to the first cross entropy loss function.
The dividing module 420 is configured to divide the target domain images into a first set and a second set, where the segmentation results obtained by passing the target domain images in the first set through the image segmentation model in the trained first generative adversarial network are used as their annotation information, and no annotation information is set for the target domain images in the second set.
In some embodiments, the dividing module 420 is configured to input the target domain image into the image segmentation model in the trained first generative adversarial network to obtain a segmentation result of the target domain image and generate a pseudo label of the target domain image, where the pseudo label marks each pixel in the target domain image as belonging to the target or to the background; determine the score of the target domain image according to its segmentation result and pseudo label, where the higher the probability values of the target pixels in the segmentation result and the fewer the pixels the pseudo label marks as belonging to the target, the higher the score of the target domain image; and, according to the scores of all the target domain images, select some of them to form a first set and form a second set from the remaining target domain images, where the pseudo labels of the target domain images in the first set serve as their annotation information.
In some embodiments, the score of the target domain image is determined using the following formula (reconstructed from the definitions below, as the original formula image is not reproduced in the text):

$$s = \frac{\sum_{i,j} \hat{y}^{(i,j)}\, p^{(i,j)}}{\sum_{i,j} \hat{y}^{(i,j)}}$$

where $p^{(i,j)}$ represents the probability that the pixel at (i, j) in the target domain image belongs to the target, $\hat{y}^{(i,j)}$ represents the pseudo label of the pixel at (i, j) in the target domain image, $\hat{y}^{(i,j)} = 1$ indicates that the pixel at (i, j) in the target domain image belongs to the target, and $\hat{y}^{(i,j)} = 0$ indicates that the pixel at (i, j) in the target domain image belongs to the background.
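A small NumPy sketch of this scoring and set partitioning; the selection ratio and the handling of images with no pseudo-labeled target pixels are assumptions.

```python
import numpy as np

def score_image(prob_map, pseudo_label):
    # Score = mean predicted target probability over the pixels the
    # pseudo label marks as target: higher confidence and fewer
    # target pixels both raise the score.
    mask = pseudo_label == 1
    if mask.sum() == 0:
        return 0.0  # assumed handling of images with no target pixels
    return float(prob_map[mask].sum() / mask.sum())

def split_sets(prob_maps, pseudo_labels, ratio=0.5):
    # Rank target domain images by score; the top fraction forms the
    # first set (pseudo labels kept as annotations), the rest the
    # unlabeled second set. The ratio is illustrative.
    scores = [score_image(p, y) for p, y in zip(prob_maps, pseudo_labels)]
    order = np.argsort(scores)[::-1]
    k = int(len(order) * ratio)
    return order[:k], order[k:]
```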
The second training module 430 is configured to input the target domain images in the first set and the second set into a second generative adversarial network, respectively, train the second generative adversarial network based on adversarial learning to obtain a trained second generative adversarial network, and determine the parameters of the image segmentation model, where the second generator in the second generative adversarial network is the image segmentation model and its initial parameters are set to the parameters of the first generator in the trained first generative adversarial network.
In some embodiments, the second training module 430 is configured to, in each training period, adjust the parameters of the image segmentation model according to the difference between the segmentation results of the target domain images in the first set after passing through the image segmentation model and the annotation information of those images; readjust the image segmentation model based on adversarial learning and adjust the parameters of the second discriminator in the second generative adversarial network; and repeat the above process until training of the second generative adversarial network is completed.
In some embodiments, the second training module 430 is configured to input the segmentation results of the target domain images in the second set after passing through the image segmentation model into the second discriminator and perform domain type discrimination on those images to obtain discrimination results of the target domain images in the second set; determine a second adversarial loss function according to these discrimination results; and readjust the image segmentation model according to the second adversarial loss function.
In some embodiments, the second training module 430 is configured to input the segmentation results of the target domain images in the first set and those in the second set, each after passing through the image segmentation model, into the second discriminator and perform domain type discrimination on the two groups of images, respectively, to obtain discrimination results for the target domain images in the first set and in the second set; determine a second cross-entropy loss function according to the two groups of discrimination results; and adjust the parameters of the second discriminator according to the second cross-entropy loss function.
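Since the second stage mirrors the first, with the pseudo-labeled first set in the supervised role and the unlabeled second set in the adversarial role, the stage-one loop sketched earlier can plausibly be reused; the loader, optimizer, and discriminator names below are placeholders.

```python
import copy

# Second generator: the image segmentation model, initialized with the
# parameters of the trained first generator.
seg_model_2 = copy.deepcopy(seg_model)

train_period(seg_model_2, discriminator_2,
             first_set_loader,    # target images with pseudo labels
             second_set_loader,   # target images without labels
             opt_seg_2, opt_disc_2)
```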
In some embodiments, the source domain image and the target domain image are remote sensing satellite images containing roads; the apparatus 40 further comprises: a preprocessing module 440 configured to cut each training sample image and its label mask image into a plurality of training sample image blocks and label mask image blocks of a preset size, where the training sample images comprise remote sensing satellite images, the value of each pixel point in a label mask image is 0 or 1, 1 indicates that the pixel point at the same position in the training sample image belongs to a road, and 0 indicates that the pixel point at the same position in the training sample image belongs to the background; select the label mask image blocks in which the number of pixel points with value 1 exceeds a preset number, together with the corresponding training sample image blocks; perform data enhancement on the selected training sample image blocks to increase their number and obtain preprocessed training sample image blocks; and divide the preprocessed training sample image blocks into source domain images and target domain images.
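A hedged sketch of this preprocessing pipeline; the block size, the road-pixel threshold, and flip-based data enhancement are illustrative choices.

```python
import numpy as np

def tile_pairs(image, mask, size=512):
    # Cut a remote sensing image and its label mask into aligned blocks
    # of a preset size (the size and the non-overlapping grid are
    # illustrative choices).
    h, w = mask.shape[:2]
    return [(image[i:i + size, j:j + size], mask[i:i + size, j:j + size])
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

def select_and_augment(pairs, min_road_pixels=500):
    # Keep blocks whose mask has enough road pixels (value 1), then
    # enlarge the set with simple flips as data enhancement.
    kept = [(img, msk) for img, msk in pairs
            if (msk == 1).sum() > min_road_pixels]
    augmented = []
    for img, msk in kept:
        augmented.append((img, msk))
        augmented.append((np.fliplr(img).copy(), np.fliplr(msk).copy()))
        augmented.append((np.flipud(img).copy(), np.flipud(msk).copy()))
    return augmented
```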
In some embodiments, the apparatus 40 further comprises: the image segmentation module 450 is configured to input the image to be segmented into the determined image segmentation model, so as to obtain a segmentation result of the image to be segmented.
The image segmentation apparatuses in the embodiments of the present disclosure may each be implemented by various computing devices or computer systems, which are described below in conjunction with figs. 5 and 6.
Fig. 5 is a block diagram of some embodiments of an image segmentation apparatus of the present disclosure. As shown in fig. 5, the apparatus 50 of this embodiment includes: a memory 510 and a processor 520 coupled to the memory 510, the processor 520 being configured to perform the image segmentation method in any of the embodiments of the present disclosure based on instructions stored in the memory 510.
Fig. 6 is a block diagram of other embodiments of an image segmentation apparatus according to the present disclosure. As shown in fig. 6, the apparatus 60 of this embodiment includes: a memory 610 and a processor 620, similar to the memory 510 and processor 520, respectively. It may also include an input/output interface 630, a network interface 640, a storage interface 650, and so on. These interfaces 630, 640, and 650, as well as the memory 610 and the processor 620, may be connected, for example, via a bus 660. The input/output interface 630 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 640 provides a connection interface for various networked devices, such as a database server or a cloud storage server. The storage interface 650 provides a connection interface for external storage devices such as an SD card or a USB flash drive.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only exemplary of the present disclosure and is not intended to limit the present disclosure, so that any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.
Claims (17)
1. An image segmentation method comprising:
respectively inputting a source domain image and a target domain image into a first generative adversarial network, and training the first generative adversarial network based on adversarial learning to obtain a trained first generative adversarial network, wherein a first generator in the first generative adversarial network is an image segmentation model;
dividing target domain images into a first set and a second set, wherein the segmentation results obtained by passing the target domain images in the first set through the image segmentation model in the trained first generative adversarial network are used as labeling information, and no labeling information is set for the target domain images in the second set;
and respectively inputting the target domain images in the first set and the second set into a second generative adversarial network, training the second generative adversarial network based on adversarial learning to obtain a trained second generative adversarial network, and determining parameters of the image segmentation model, wherein a second generator in the second generative adversarial network is the image segmentation model and its initial parameters are assigned the parameters of the first generator in the trained first generative adversarial network.
2. The image segmentation method according to claim 1, wherein the respectively inputting the source domain image and the target domain image into the first generative adversarial network comprises:
inputting the source domain image and the target domain image into a feature extraction layer of the image segmentation model respectively to obtain a first feature map of the source domain image and a second feature map of the target domain image respectively;
respectively inputting the first feature map and the second feature map into an upsampling layer of the image segmentation model to obtain a third feature map of the source domain image with the same size as the source domain image and a fourth feature map of the target domain image with the same size as the target domain image;
and respectively inputting the third feature map and the fourth feature map into a softmax layer of the image segmentation model to respectively obtain a segmentation result of the source domain image and a segmentation result of the target domain image.
3. The image segmentation method according to claim 2, wherein the image segmentation model comprises a plurality of feature extraction layers, the downsampling factors of the feature extraction layers increase sequentially in the order in which the source domain image or the target domain image passes through them, and the last two feature extraction layers are each connected to an atrous convolution pyramid pooling module;
in the case where the downsampling factor of one or more of the feature extraction layers exceeds a threshold, atrous convolution is used in the one or more feature extraction layers to keep their downsampling factor at the threshold.
4. The image segmentation method according to claim 3, wherein the inputting the source domain image and the target domain image into a feature extraction layer of the image segmentation model respectively to obtain a first feature map of the source domain image and a second feature map of the target domain image respectively comprises:
sequentially inputting the source domain image and the target domain image into each feature extraction layer;
respectively inputting the features output by the last two feature extraction layers traversed by the source domain image into the atrous convolution pyramid pooling modules to obtain multi-scale global features of the source domain image corresponding to each of the two feature layers;
fusing the multi-scale global features of the source domain image corresponding to the two feature layers to obtain the first feature map of the source domain image;
respectively inputting the features output by the last two feature extraction layers traversed by the target domain image into the atrous convolution pyramid pooling modules to obtain multi-scale global features of the target domain image corresponding to each of the two feature layers;
and fusing the multi-scale global features of the target domain image corresponding to the two feature layers to obtain the second feature map of the target domain image.
5. The image segmentation method according to claim 1, wherein the training of the first generative adversarial network based on adversarial learning comprises:
in each training period, adjusting parameters of the image segmentation model according to the difference between the segmentation result of the source domain image after passing through the image segmentation model and the labeling information of the source domain image;
readjusting the image segmentation model based on adversarial learning, and adjusting parameters of a first discriminator in the first generative adversarial network;
and repeating the above process until the training of the first generative adversarial network is completed.
6. The image segmentation method according to claim 5, wherein the readjusting the image segmentation model based on adversarial learning comprises:
inputting a segmentation result of the target domain image after passing through the image segmentation model into the first discriminator, and performing domain type discrimination on the target domain image to obtain a discrimination result of the target domain image;
determining a first adversarial loss function according to the discrimination result of the target domain image;
and readjusting the image segmentation model according to the first adversarial loss function.
7. The image segmentation method according to claim 5, wherein the adjusting parameters of the first discriminator in the first generative adversarial network comprises:
respectively inputting the segmentation result of the source domain image after passing through the image segmentation model and the segmentation result of the target domain image after passing through the image segmentation model into the first discriminator, and respectively discriminating the domain type of the source domain image and the domain type of the target domain image to obtain the discrimination result of the source domain image and the discrimination result of the target domain image;
determining a first cross entropy loss function according to the discrimination result of the source domain image and the discrimination result of the target domain image;
and adjusting the parameters of the first discriminator according to the first cross entropy loss function.
8. The image segmentation method of claim 1, wherein the dividing of the target domain image into the first set and the second set comprises:
inputting the target domain image into the image segmentation model in the trained first generative adversarial network to obtain a segmentation result of the target domain image, and generating a pseudo label of the target domain image, wherein the pseudo label is used for marking each pixel point in the target domain image as belonging to the target or to the background;
determining the score of the target domain image according to the segmentation result of the target domain image and the pseudo label of the target domain image, wherein the higher the probability values of the pixel points belonging to the target in the segmentation result and the fewer the pixel points the pseudo label marks as belonging to the target, the higher the score of the target domain image;
and, according to the scores of all the target domain images, selecting some of the target domain images to generate a first set and generating a second set from the target domain images outside the first set, wherein the pseudo labels of the target domain images in the first set are used as the labeling information of those target domain images.
9. The image segmentation method according to claim 8, wherein the score of the target domain image is determined using the following formula (a form consistent with the definitions below, reconstructed because the original formula image is not reproduced in the text):

$$s = \frac{\sum_{i,j} \hat{y}^{(i,j)}\, p^{(i,j)}}{\sum_{i,j} \hat{y}^{(i,j)}}$$

wherein $p^{(i,j)}$ represents the probability that the pixel at (i, j) in the target domain image belongs to the target, $\hat{y}^{(i,j)}$ represents the pseudo label of the pixel at (i, j) in the target domain image, $\hat{y}^{(i,j)} = 1$ indicates that the pixel at (i, j) in the target domain image belongs to the target, and $\hat{y}^{(i,j)} = 0$ indicates that the pixel at (i, j) in the target domain image belongs to the background.
10. The image segmentation method according to claim 1, wherein the training the second generative adversarial network based on adversarial learning comprises:
in each training period, adjusting parameters of the image segmentation model according to the difference between the segmentation results of the target domain images in the first set after passing through the image segmentation model and the labeling information of the target domain images in the first set;
readjusting the image segmentation model based on adversarial learning, and adjusting parameters of a second discriminator in the second generative adversarial network;
and repeating the above process until the training of the second generative adversarial network is completed.
11. The image segmentation method according to claim 10, wherein the readjusting the image segmentation model based on adversarial learning comprises:
inputting the segmentation result of the target domain images in the second set after passing through the image segmentation model into the second discriminator, and performing domain type discrimination on the target domain images in the second set to obtain the discrimination result of the target domain images in the second set;
determining a second adversarial loss function according to the discrimination results of the target domain images in the second set;
and readjusting the image segmentation model according to the second adversarial loss function.
12. The image segmentation method according to claim 10, wherein the adjusting parameters of the second discriminator in the second generative adversarial network comprises:
respectively inputting the segmentation result of the target domain image in the first set after passing through the image segmentation model and the segmentation result of the target domain image in the second set after passing through the image segmentation model into the second discriminator, and respectively discriminating the domain type of the target domain image in the first set and the domain type of the target domain image in the second set to obtain the discrimination result of the target domain image in the first set and the discrimination result of the target domain image in the second set;
determining a second cross entropy loss function according to the discrimination result of the target domain images in the first set and the discrimination result of the target domain images in the second set;
and adjusting the parameters of the second discriminator according to the second cross entropy loss function.
13. The image segmentation method according to claim 1, wherein the source domain image and the target domain image are remote sensing satellite images including roads;
the method further comprises the following steps:
respectively cutting a training sample image and the label mask image of the training sample image into a plurality of training sample image blocks and label mask image blocks of a preset size, wherein the training sample images comprise remote sensing satellite images, the value of each pixel point in the label mask image is 0 or 1, 1 indicates that the pixel point at the same position in the training sample image belongs to a road, and 0 indicates that the pixel point at the same position in the training sample image belongs to the background;
selecting the label mask image blocks in which the number of pixel points with value 1 exceeds a preset number, and the corresponding training sample image blocks;
performing data enhancement on training sample image blocks corresponding to the selected label mask image blocks, and increasing the number of the training sample image blocks to obtain preprocessed training sample image blocks;
and dividing the preprocessed training sample image blocks into the source domain image and the target domain image.
14. The image segmentation method according to claim 1, further comprising:
and inputting the image to be segmented into the determined image segmentation model to obtain the segmentation result of the image to be segmented.
15. An image segmentation apparatus comprising:
the first training module is used for respectively inputting a source domain image and a target domain image into a first generative adversarial network and training the first generative adversarial network based on adversarial learning to obtain a trained first generative adversarial network, wherein a first generator in the first generative adversarial network is an image segmentation model;
the dividing module is used for dividing target domain images into a first set and a second set, wherein the segmentation results obtained by passing the target domain images in the first set through the image segmentation model in the trained first generative adversarial network are used as annotation information, and no annotation information is set for the target domain images in the second set;
and the second training module is used for respectively inputting the target domain images in the first set and the second set into a second generative adversarial network, training the second generative adversarial network based on adversarial learning to obtain a trained second generative adversarial network, and determining parameters of the image segmentation model, wherein a second generator in the second generative adversarial network is the image segmentation model and its initial parameters are assigned the parameters of the first generator in the trained first generative adversarial network.
16. An image segmentation apparatus comprising:
a processor; and
a memory coupled to the processor for storing instructions that, when executed by the processor, cause the processor to perform the image segmentation method of any of claims 1-14.
17. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the steps of the method of any one of claims 1 to 14.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110325191.9A CN115205694A (en) | 2021-03-26 | 2021-03-26 | Image segmentation method, device and computer readable storage medium |
PCT/CN2022/071371 WO2022199225A1 (en) | 2021-03-26 | 2022-01-11 | Decoding method and apparatus, and computer-readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110325191.9A CN115205694A (en) | 2021-03-26 | 2021-03-26 | Image segmentation method, device and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115205694A true CN115205694A (en) | 2022-10-18 |
Family
ID=83396306
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110325191.9A Pending CN115205694A (en) | 2021-03-26 | 2021-03-26 | Image segmentation method, device and computer readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115205694A (en) |
WO (1) | WO2022199225A1 (en) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108875935B (en) * | 2018-06-11 | 2020-08-11 | 兰州理工大学 | Natural image target material visual characteristic mapping method based on generation countermeasure network |
CN110276811B (en) * | 2019-07-02 | 2022-11-01 | 厦门美图之家科技有限公司 | Image conversion method and device, electronic equipment and readable storage medium |
CN111340819B (en) * | 2020-02-10 | 2023-09-12 | 腾讯科技(深圳)有限公司 | Image segmentation method, device and storage medium |
CN111723780B (en) * | 2020-07-22 | 2023-04-18 | 浙江大学 | Directional migration method and system of cross-domain data based on high-resolution remote sensing image |
2021
- 2021-03-26 CN CN202110325191.9A patent/CN115205694A/en active Pending
2022
- 2022-01-11 WO PCT/CN2022/071371 patent/WO2022199225A1/en active Application Filing
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115661001A (en) * | 2022-12-14 | 2023-01-31 | 临沂大学 | Single-channel coal rock image enhancement method based on generation of countermeasure network |
CN116895003A (en) * | 2023-09-07 | 2023-10-17 | 苏州魔视智能科技有限公司 | Target object segmentation method, device, computer equipment and storage medium |
CN116895003B (en) * | 2023-09-07 | 2024-01-30 | 苏州魔视智能科技有限公司 | Target object segmentation method, device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2022199225A1 (en) | 2022-09-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Serna et al. | Classification of traffic signs: The european dataset | |
CN109753885B (en) | Target detection method and device and pedestrian detection method and system | |
CN108830285B (en) | Target detection method for reinforcement learning based on fast-RCNN | |
Wang et al. | Land cover change detection at subpixel resolution with a Hopfield neural network | |
CN112734775B (en) | Image labeling, image semantic segmentation and model training methods and devices | |
CN108108751B (en) | Scene recognition method based on convolution multi-feature and deep random forest | |
CN110929577A (en) | Improved target identification method based on YOLOv3 lightweight framework | |
CN111612807A (en) | Small target image segmentation method based on scale and edge information | |
CN111882620B (en) | Road drivable area segmentation method based on multi-scale information | |
CN110879960B (en) | Method and computing device for generating image data set for convolutional neural network learning | |
CN112528934A (en) | Improved YOLOv3 traffic sign detection method based on multi-scale feature layer | |
CN106910202B (en) | Image segmentation method and system for ground object of remote sensing image | |
KR20200091318A (en) | Learning method and learning device for attention-driven image segmentation by using at least one adaptive loss weight map to be used for updating hd maps required to satisfy level 4 of autonomous vehicles and testing method and testing device using the same | |
CN113223042B (en) | Intelligent acquisition method and equipment for remote sensing image deep learning sample | |
CN114519819B (en) | Remote sensing image target detection method based on global context awareness | |
CN111860124B (en) | Remote sensing image classification method based on space spectrum capsule generation countermeasure network | |
CN115471467A (en) | High-resolution optical remote sensing image building change detection method | |
CN113034495A (en) | Spine image segmentation method, medium and electronic device | |
CN113256649B (en) | Remote sensing image station selection and line selection semantic segmentation method based on deep learning | |
CN115205694A (en) | Image segmentation method, device and computer readable storage medium | |
Chen et al. | Contrast limited adaptive histogram equalization for recognizing road marking at night based on YOLO models | |
Li et al. | A guided deep learning approach for joint road extraction and intersection detection from RS images and taxi trajectories | |
CN111507359A (en) | Self-adaptive weighting fusion method of image feature pyramid | |
CN115410081A (en) | Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium | |
CN112801109A (en) | Remote sensing image segmentation method and system based on multi-scale feature fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||