CN109636742B - Modality conversion method between SAR images and visible light images based on a generative adversarial network - Google Patents
Modality conversion method between SAR images and visible light images based on a generative adversarial network
- Publication number
- CN109636742B CN109636742B CN201811405188.2A CN201811405188A CN109636742B CN 109636742 B CN109636742 B CN 109636742B CN 201811405188 A CN201811405188 A CN 201811405188A CN 109636742 B CN109636742 B CN 109636742B
- Authority
- CN
- China
- Prior art keywords
- image
- visible light
- sar
- input
- domain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 54
- 238000006243 chemical reaction Methods 0.000 title claims abstract description 9
- 239000013598 vector Substances 0.000 claims abstract description 32
- 238000012549 training Methods 0.000 claims abstract description 17
- 230000008569 process Effects 0.000 claims description 16
- 238000011176 pooling Methods 0.000 claims description 14
- 238000010586 diagram Methods 0.000 claims description 13
- 238000000605 extraction Methods 0.000 claims description 8
- 239000000284 extract Substances 0.000 claims description 7
- 230000009471 action Effects 0.000 claims description 3
- 238000013528 artificial neural network Methods 0.000 claims description 3
- 230000006835 compression Effects 0.000 claims description 3
- 238000007906 compression Methods 0.000 claims description 3
- 238000005070 sampling Methods 0.000 claims description 2
- 238000002474 experimental method Methods 0.000 description 8
- 238000011160 research Methods 0.000 description 4
- 230000003213 activating effect Effects 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 230000010365 information processing Effects 0.000 description 2
- 230000010287 polarization Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 230000009102 absorption Effects 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 238000013178 mathematical model Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 230000005855 radiation Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
- G06T5/73—
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06T5/92—
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
- G06T2207/10044—Radar image
Abstract
The invention relates to a modality conversion method for converting an SAR image into a visible light image based on a generative adversarial network, which comprises the following steps. First, a feature vector of a satellite image at the same position is extracted and used as prior information of the SAR image; the prior information and the SAR image are input into a generator together to generate a visible light image containing the SAR image target. Second, a discriminator in the generative adversarial network is trained, adopting the formula L_GAN(G_AB, D, A, B) = E_{b~B}[log D(b)] + E_{a~A}[log(1 - D(G_AB(a)))] as the discrimination loss. Finally, it is judged whether the trained adversarial network suffers from mode collapse, i.e., whether, for different input SAR images, the generator mostly outputs the same visible light image. Another generator is trained, and a generation loss is adopted to compare the feature similarity of the two images. The generation loss is L_GAN(G_AB, G_BA, A, B) = E_{a~A}[||G_BA(G_AB(a)) - a||_1]. When network training is finished, the curves of the discrimination loss and the generation loss tend to be stable: the discrimination loss no longer increases, and the generation loss no longer decreases.
Description
Technical Field
The invention belongs to the field of image translation in deep learning, and relates to a modality conversion method between SAR images and visible light images based on a generative adversarial network.
Background
Since its emergence in 1978, Synthetic Aperture Radar (SAR) has revolutionized radar technology. Its all-day, all-weather imaging capability is unmatched and opens broad application prospects, attracting great attention from the radar research community. Subsequent SAR-related research has driven a surge of technical innovation, with SAR systems of different wave bands, different polarizations and even different resolutions emerging continuously. This revolution has affected many military and civil fields.
Due to the improvement of resolution, the data volume of SAR grows rapidly, and manual information processing and application research (such as target identification) face many difficulties. First, over large-scale areas, ground-object detection and identification tasks based on SAR images are realized by manual interpretation; the workload greatly exceeds what manual operation can judge quickly, and the resulting subjective and interpretation errors are inevitable. Second, due to the special imaging mechanism of SAR, a target is very sensitive to the azimuth angle, and a large azimuth difference can produce a completely different SAR image; this further increases the visual difference between SAR images and optical images and raises the difficulty of image interpretation. Third, with the continuous improvement of SAR sensor resolution and the diversification of sensor modes, wave bands and polarization modes, the target information in SAR images grows explosively: a target changes from a point target on the original single-channel, single-polarization, low-resolution image into an extended target with rich detail and scattering characteristics. On the one hand this makes more detailed interpretation and identification of ground-object information possible; at the same time, the variety and instability of ground-object characteristics increase greatly. Traditional information processing and application methods therefore cannot meet the requirements of practical application; the related key technologies must be tackled to accelerate data processing and improve the accuracy of information extraction.
On this basis, the SAR image is converted into an image modality with visible-light-image characteristics using a method based on a generative adversarial network, whose main advantages include the following aspects. First, when the CycleGAN network within the generative adversarial framework generates a picture, the SAR image in the source image domain is used as input, semantic features of the satellite image at the same position are extracted as prior information of the SAR image, and this prior information is input into the generator together as a condition, so that the generated image not only has the visible light style but also contains target information that cannot be seen in the SAR image. Second, in order to prevent the generation network from suffering mode collapse (Mode Collapse) during training, two generators are trained: the first generator generates the required visible light image of the target from the SAR image, and the second generator translates the generated visible light image back into an SAR image. During training, the generation loss between the original SAR image and the regenerated SAR image is calculated, and the common "network memory" failure when training a generative adversarial network is avoided by continuously reducing this generation loss.
Disclosure of Invention
Technical problem to be solved
When the SAR image is subjected to image processing, on the one hand, traditional methods are sensitive to the model and place high demands on the image, while the resolution of the SAR image is relatively low and the image quality is blurry, so when the image does not fit the model, a satisfactory result cannot be obtained. On the other hand, it is difficult to establish a semi-empirical formula or mathematical model between the characteristic signals of the SAR image and the target, because characteristics of the target such as reflection, scattering, transmission, absorption and radiation are not sufficiently understood. Therefore, a method based on a generative adversarial network is provided to convert an SAR image into an image modality with visible-light-image characteristics, so that targets can be recognized and detected.
Technical scheme
The basic idea of the invention is as follows: a deep learning method, the Generative Adversarial Network (GAN), is adopted to train an unsupervised learning network, CycleGAN, so as to realize the conversion from a low-resolution SAR image to the visible light image modality. CycleGAN takes the visible light image as the target image domain and the SAR image as the source image domain, and learns the mapping from the source image domain to the target image domain, thereby realizing the conversion from source images to target images. The CycleGAN used here includes two generators and one discriminator D. The generator G1 converts from the source image domain to the target image domain (i.e., converts an SAR image into a visible-light-modality image), the generator G2 converts from the target image domain back to the source image domain (i.e., converts a visible-light-modality image into an SAR image), and the discriminator D judges whether an input picture is a real visible light image.
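The two-generator arrangement described above can be sketched at the domain level; this is an illustrative stub (the class and variable names are ours, not the patent's implementation):

```python
# Illustrative stub of the CycleGAN roles: G1 maps the source (SAR) domain to
# the target (visible light) domain, G2 maps back; a discriminator would
# score realism of the target-domain output.

class DomainMapper:
    """Stands in for a generator network; tracks only domain labels."""
    def __init__(self, source, target):
        self.source, self.target = source, target

    def __call__(self, domain):
        assert domain == self.source, "input must come from the source domain"
        return self.target

g1 = DomainMapper("SAR", "visible")   # generator G1: SAR -> visible light
g2 = DomainMapper("visible", "SAR")   # generator G2: visible light -> SAR

# Cycle consistency at the domain level: going A -> B -> A returns to SAR.
print(g2(g1("SAR")))  # SAR
```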
The method of the invention is characterized by comprising the following steps:
Step 1: acquiring prior information of the SAR image: a neural-network-based method extracts the feature vector of the satellite image at the same position as prior information of the SAR image, so that the target in the visible light image generated by the adversarial network is clearer.
(1) Feature extraction of satellite images: features are extracted by the convolutional layer and compressed by the pooling layer.
(a) Convolution operation extracts features: suppose the input feature map F_in of a convolutional layer has size W_in × H_in × C_in, where W_in is the width of the input feature map, H_in is its height, and C_in is its number of channels. The convolution parameters of the layer are K, S, P and Stride, where K denotes the number of convolution kernels, S denotes the width and height of each convolution kernel, P denotes the zero padding applied to the input feature map (for example, P = 1 denotes padding the input feature map with one circle of 0 around its border), and Stride denotes the sliding step of the convolution kernel on the input feature map. The output feature map F_out of the convolutional layer has size W_out × H_out × C_out, where W_out is the width of the output feature map, H_out is its height, and C_out is its number of channels, calculated as follows:
W_out = (W_in - S + 2P)/Stride + 1,  H_out = (H_in - S + 2P)/Stride + 1,  C_out = K  (1)
the size of the current convolution kernel is generally 3 × 3, P is 1, and Stride is 1, which can ensure that the sizes of the input feature map and the output feature map are consistent.
(b) Pooling layer compression characteristics: the maximum pooling layer is generally adopted, that is, when the feature map is downsampled, the number with the largest value in a 2 × 2 grid is selected and transmitted to the output feature map. In the pooling operation, the number of channels of the input and output feature layers is unchanged, and the size of the output feature map is half of the size of the input feature map.
The feature extraction of the satellite image can be completed through the convolution pooling operation.
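The size bookkeeping above can be checked with a small helper; this is a sketch of the stated formulas (the function names are ours, not the patent's):

```python
# Output-size rules from step 1: a convolution gives
# W_out = (W_in - S + 2P) // Stride + 1 (same for height), C_out = K,
# and a 2x2 max pooling halves width/height while keeping channels.

def conv_output_size(w_in, h_in, k, s, p, stride):
    w_out = (w_in - s + 2 * p) // stride + 1
    h_out = (h_in - s + 2 * p) // stride + 1
    return w_out, h_out, k  # channel count equals the number of kernels K

def pool_output_size(w_in, h_in, c_in):
    return w_in // 2, h_in // 2, c_in

# A 256x256x3 satellite image through a 3x3 convolution (K=64, P=1, Stride=1)
# keeps its spatial size, as the text states; pooling then halves it:
print(conv_output_size(256, 256, k=64, s=3, p=1, stride=1))  # (256, 256, 64)
print(pool_output_size(256, 256, 64))                        # (128, 128, 64)
```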
Step 2: the generator generates a visible light image: the designed generator is provided with two input interfaces, wherein the first interface receives the SAR image, the second interface receives the feature vector of the satellite image extracted in the first step, and then a visible light image is generated under the action of the encoder, the converter and the decoder.
(1) Encoder for encoding a video signal
The SAR image is input into an encoder, which extracts the feature information of the SAR image and expresses it as a feature vector. The encoder consists of three convolutional layers: one 7 × 7 convolutional layer with 32 filters and stride 1, one 3 × 3 convolutional layer with 64 filters and stride 2, and one 3 × 3 convolutional layer with 128 filters and stride 2. An SAR image of size [256, 256, 3] is input into the designed encoder; the convolution kernels of different sizes in the encoder move over the input image and extract features, yielding a feature vector of size [64, 256].
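As a sanity check, the encoder's spatial-size path can be traced; the padding values below are an assumption chosen so that the 7 × 7 layer preserves the size and each stride-2 layer exactly halves it, as common CycleGAN implementations do:

```python
# Spatial sizes through the encoder: a 7x7/stride-1 layer, then two
# 3x3/stride-2 layers shrink a 256x256 input to 64x64.

def conv_out(size, kernel, stride, pad):
    return (size - kernel + 2 * pad) // stride + 1

size, trace = 256, []
for kernel, stride, pad in [(7, 1, 3), (3, 2, 1), (3, 2, 1)]:
    size = conv_out(size, kernel, stride, pad)
    trace.append(size)
print(trace)  # [256, 128, 64]
```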
(2) Converter
The role of the converter is to combine the different neighboring features of the SAR image and then, based on these features, decide how to convert them into the feature vector of an image in the target domain (visible light image). The feature vector of the satellite image obtained in step 1 serves as the prior information of the SAR image, while the encoder provides the feature vector of the SAR image itself. Therefore, the two different feature vectors are first fused and then used as the feature input of the converter. The converter consists of several residual blocks, whose purpose is to ensure that the input information of an earlier network layer is applied directly to a later network layer, so that the deviation of the corresponding output (the feature vector of the visible light image) from the original input is reduced.
(3) Decoder
The feature vector of the image (visible light image) in the target domain obtained by the converter is taken as input, the decoder receives the input and restores low-level features from the feature vector, the decoding process and the encoding process are completely opposite, and the whole decoding process adopts a transposed convolution layer. Finally, the low-level features are converted to obtain an image in the target image domain, i.e., a visible light image.
Step 3: the discriminator discriminates the visible light image: the picture output by the generator is input into the trained discriminator D, which produces a score d. The closer the output is to an image in the target domain (i.e., a visible light image), the closer the value of d is to 1; otherwise, the closer the value of d is to 0. In this way the discriminator D judges whether the generated image is a visible light image. The judgment of D is completed by calculating the discrimination loss.
The discrimination loss is:
L_GAN(G_AB, D, A, B) = E_{b~B}[log D(b)] + E_{a~A}[log(1 - D(G_AB(a)))]    (2)
where A is the source image domain (SAR images), B is the target image domain (visible light images), a is an SAR image in the source image domain, b is a visible light image in the target image domain, G_AB is the generator from the source image domain A to the target image domain B, and D is the discriminator. The training process makes the discrimination loss L_GAN(G_AB, D, A, B) as small as possible.
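For illustration, the discrimination loss of equation (2) can be evaluated on toy scores; the helper below is ours (a sketch with simple per-sample averaging), not the patent's code:

```python
# Toy evaluation of equation (2):
# L_GAN = E_b[log D(b)] + E_a[log(1 - D(G_AB(a)))].
import math

def discrimination_loss(real_scores, fake_scores):
    """real_scores: D(b) on real visible images; fake_scores: D(G_AB(a))."""
    real_term = sum(math.log(s) for s in real_scores) / len(real_scores)
    fake_term = sum(math.log(1 - s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

# A confident, correct discriminator keeps the loss near 0 (its maximum);
# an undecided one (all scores 0.5) sits at 2*log(0.5), about -1.386.
good = discrimination_loss([0.99, 0.98], [0.02, 0.01])
undecided = discrimination_loss([0.5, 0.5], [0.5, 0.5])
print(good > undecided)  # True
```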
Step 4: verifying the feature similarity of the generated pictures: since the discriminator D can only judge whether the picture generated by the generator is in the visible light image style, it cannot discriminate the target features in the picture well. Mode Collapse, in which the generator in effect memorizes and keeps emitting the same output, must be prevented. Another generator is therefore trained at the same time to convert the visible light image generated by the generator of step 2 back into an SAR image; the network architectures of the two generators are exactly the same. The feature similarity of the generated pictures is verified by calculating the generation loss.
The generation loss is:
L_GAN(G_AB, G_BA, A, B) = E_{a~A}[||G_BA(G_AB(a)) - a||_1]    (3)
the generation loss is the Euclidean distance (the Euclidean distance refers to the real distance between two points in m-dimensional space) of two SAR images, wherein A is a source image domain (SAR image), B is a target image domain (visible light image), and G isABFor the generator from the source image domain A to the target image domain B, GBAFrom the target image domain B to the source image domain a. a is the SAR image in the source image domain, GAB(GBA(a) Is the generated SAR image. In the training process, L is requiredGAN(GAB,GBAA, B) are as small as possible.
Advantageous effects
The invention provides a modality conversion method for converting an SAR image into a visible light image based on a generative adversarial network. The method effectively solves the problem that traditional model-based methods cannot effectively detect and identify targets in SAR images because the resolution of SAR images is relatively low and the image quality is blurry. The method not only retains the advantages of SAR images but also makes effective use of existing image processing methods, reduces the limitations caused by SAR image quality problems, has low research cost and considerable research value, and plays an important role in national economic and military fields.
Drawings
FIG. 1: general framework diagram of the inventive method.
Fig. 2 (a): a network structure diagram of a generator in a countermeasure network is generated.
Fig. 2 (b): a network structure diagram of discriminators in the countermeasure network is generated.
Detailed Description
The invention will now be further described with reference to the examples, figure 1, and figures 2(a) and 2(b):
the hardware environment tested here was: GPU: intel to strong series, memory: 8G, hard disk: 500G mechanical hard disk, independent display card: NVIDIA GeForce GTX 1080Ti, 11G; the system environment is Ubuntu 16.0.4; the software environment was python3.6, Tensorflow-GPU. The experiment of actually measured data is performed aiming at a mode conversion method of an SAR image and a visible light image. Firstly, obtaining an SAR image (about 5000 images with the image size of 256 x 256) aerial photographed in the flying process of a military aircraft, then obtaining satellite images (5000 images with the image size of 256 x 256) at the same position by combining with satellites, inputting the satellite images into a network built by people, iteratively training the network for 200000 times, wherein the basic learning rate is 0.0002, the learning rate is changed every 100000 times, an Adam optimizer is adopted in the training process, and the model is saved every 10000 times in the training process. Through practical tests, the generated image not only has the property of a visible light image, but also can well show the target characteristics which are not clearly seen in the original SAR image.
The invention is implemented as follows:
Step 1: acquiring prior information of the SAR image: a neural-network-based method extracts the feature vector of the satellite image at the same position as prior information of the aerial SAR image, so that the target in the visible light image generated by the adversarial network is clearer.
(1) Feature extraction of satellite images: features are extracted by the convolutional layer and compressed by the pooling layer.
(a) Convolution operation extracts features: in our design, the example input feature map (i.e., the satellite image) F_in of the convolutional layer has size 256 × 256 × 3, where 256 is the width of the input feature map, 256 is its height, and 3 is its number of channels. The convolution parameters of the convolutional layer are K, S, P and Stride, where K denotes the number of convolution kernels, S denotes the width and height of each convolution kernel, P denotes the zero padding applied to the input feature map (for example, P = 1 denotes padding the input feature map with one circle of 0 around its border), and Stride denotes the sliding step of the convolution kernel on the input feature map. In the example we use a convolutional layer with convolution parameters K = 64, S = 3, P = 1 and Stride = 1. The output feature map F_out of the convolutional layer then has size 256 × 256 × 64, where 256 is the width of the output feature map, 256 is its height, and 64 is its number of channels, calculated as follows:
W_out = (W_in - S + 2P)/Stride + 1,  H_out = (H_in - S + 2P)/Stride + 1,  C_out = K  (4)
where W_in, H_in and C_in are the parameters of the input feature map, and W_out, H_out and C_out are the parameters of the output feature map obtained after the convolution of each convolutional layer.
(b) Pooling layer compresses features: a max pooling layer is adopted to pool the output feature map obtained after each convolutional layer, i.e., when the feature map is downsampled, the largest value in each 2 × 2 grid is selected and passed to the output feature map. In the pooling operation, the number of channels of the input and output feature layers is unchanged, and the size of the output feature map is half the size of the input feature map. In the experiment we pooled only after the first, third, and fourth convolutional layers.
Through the convolution pooling operation, the feature extraction of the image is completed, and the feature vector of the satellite image at the same position is obtained; the size of this vector is [256, 64].
Step 2: training the first generator to generate a visible light image: the generator we designed has two input interfaces. The first interface receives the SAR image, whose size in the experiment is 256 × 256 × 3; the second interface receives the feature vector of the satellite image extracted in step 1. A visible light image is then generated through the actions of the encoder, converter and decoder.
(1) Encoder for encoding a video signal
The SAR image is input into the encoder, which extracts the feature information of the SAR image and expresses it as a feature vector. The encoder consists of three convolutional layers: one 7 × 7 convolutional layer with 32 filters and stride 1, one 3 × 3 convolutional layer with 64 filters and stride 2, and one 3 × 3 convolutional layer with 128 filters and stride 2. In the experiment, the output scale of the first convolution module was 256 × 64, that of the second convolution module was 256 × 128, and that of the third convolution module was 64 × 256. That is, we input an SAR image of size [256, 256, 3] into the designed encoder; the convolution kernels of different sizes in the encoder move over the input image and extract features, finally yielding a feature vector of size [64, 256].
(2) Converter
The role of the converter is to combine the different neighboring features of the SAR image and then, based on these features, decide how to convert them into the feature vector of an image in the target domain (visible light image). Since the prior information of the SAR image, namely the feature vector of the satellite image, was obtained in step 1, and the encoder provides the feature vector of the SAR image, the two different feature vectors are first fused and then used as the feature input of the converter. The converter consists of several residual blocks, whose purpose is to ensure that the input information of an earlier network layer is applied directly to a later network layer, so that the deviation of the corresponding output (the feature vector of the visible light image) from the original input is reduced. We used 9 residual blocks in the experiment, each consisting of two 3 × 3 convolutional layers with 256 filters and stride 1, and the output scale of the 9th residual block was 64 × 256.
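The residual connection that motivates these blocks can be illustrated in miniature; the `transform` placeholder stands in for the block's two 3 × 3 convolutional layers (a sketch, not the network code):

```python
# Miniature residual connection: the block output is input + transform(input),
# so earlier-layer information flows directly to later layers.

def residual_block(x, transform):
    return [xi + ti for xi, ti in zip(x, transform(x))]

# With a transform that outputs zeros (an untrained, identity-like state),
# the block passes its input through unchanged:
out = residual_block([1.0, 2.0, 3.0], lambda v: [0.0] * len(v))
print(out)  # [1.0, 2.0, 3.0]
```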
(3) Decoder
The feature vector of the image in the target domain (visible light image) obtained by the converter is taken as input; the decoder receives it and restores low-level features from the feature vector. The decoding process is the exact opposite of the encoding process, and the whole decoding process uses transposed convolutional layers. Finally, the low-level features are converted to obtain an image in the target image domain, i.e., a visible light image. In the experiment, three deconvolution modules are defined; each deconvolution module receives the output of the previous module as input, and the first deconvolution module receives the output of the 9th residual block in the converter as input. The first deconvolution module consists of a 3 × 3 convolutional layer with 128 filters and stride 2, with an output scale of 128 × 128; the second deconvolution module consists of a 3 × 3 convolutional layer with 64 filters and stride 2, with an output scale of 256 × 64; the third deconvolution module consists of a 7 × 7 convolutional layer with 3 filters and stride 1, with an output scale of 256 × 3. Finally, activation through the tanh function gives the generated output.
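The decoder's upsampling path can be checked against the standard transposed-convolution size formula, W_out = (W_in − 1)·Stride − 2P + S + output_padding; the padding and output_padding values below are assumptions chosen to reproduce the stated output scales:

```python
# Spatial sizes through the three deconvolution modules: 64 -> 128 -> 256 -> 256.

def deconv_out(size, kernel, stride, pad, out_pad=0):
    return (size - 1) * stride - 2 * pad + kernel + out_pad

s1 = deconv_out(64, kernel=3, stride=2, pad=1, out_pad=1)   # 64 -> 128
s2 = deconv_out(s1, kernel=3, stride=2, pad=1, out_pad=1)   # 128 -> 256
s3 = deconv_out(s2, kernel=7, stride=1, pad=3)              # 256 -> 256
print(s1, s2, s3)  # 128 256 256
```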
Step 3: training the discriminator to discriminate the visible light image: the picture output by the generator is input into the trained discriminator D, which produces a score d. The closer the output is to the target-domain image (i.e., a visible light image), the closer the value of d is to 1; otherwise, the closer the value of d is to 0. In this way the discriminator D judges whether the generated image is a visible light image. The judgment of D is completed by calculating the discrimination loss.
The discrimination loss is:
L_GAN(G_AB, D, A, B) = E_{b~B}[log D(b)] + E_{a~A}[log(1 - D(G_AB(a)))]    (5)
where A is the source image domain (SAR images), B is the target image domain (visible light images), a is an SAR image in the source image domain, b is a visible light image in the target image domain, G_AB is the generator from the source image domain A to the target image domain B, and D is the discriminator. The training process makes the discrimination loss L_GAN(G_AB, D, A, B) as small as possible.
In the experiment, 5 convolution modules are designed for the discriminator, and the last convolution module is followed by a sigmoid layer to constrain the output to the range 0 to 1. The first convolution module consists of a 4 × 4 convolutional layer with 64 filters and stride 2, with an output scale of 128 × 64; the second convolution module consists of a 4 × 4 convolutional layer with 128 filters and stride 2, with an output scale of 64 × 128; the third convolution module consists of a 4 × 4 convolutional layer with 256 filters and stride 2, with an output scale of 32 × 256; the fourth convolution module consists of a 4 × 4 convolutional layer with 512 filters and stride 1, with an output scale of 32 × 512; the fifth convolution module consists of a 4 × 4 convolutional layer with 1 filter and stride 1, with an output scale of 32 × 1. The output of the fifth convolution module is input into the sigmoid layer, and the sigmoid activation gives the final output.
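The stated discriminator output scales (128 → 64 → 32 → 32 → 32) are consistent with a "same"-style padding convention, where the output spatial size is ceil(input/stride); a quick check under that assumption, plus the sigmoid squashing:

```python
# Spatial sizes through the five 4x4 convolution modules under 'same'
# padding, followed by the sigmoid that maps scores into (0, 1).
import math

def conv_out_same(size, stride):
    """'same' padding: output spatial size is ceil(input / stride)."""
    return math.ceil(size / stride)

size, sizes = 256, []
for stride in [2, 2, 2, 1, 1]:  # strides of the five convolution modules
    size = conv_out_same(size, stride)
    sizes.append(size)
print(sizes)  # [128, 64, 32, 32, 32] -> matches the stated output scales

def sigmoid(x):
    """Squashes the final convolution output into the (0, 1) score range."""
    return 1.0 / (1.0 + math.exp(-x))

print(round(sigmoid(4.0), 3), round(sigmoid(-4.0), 3))  # 0.982 0.018
```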
Step 4: training the second generator and verifying the feature similarity of the generated pictures: since the discriminator D can only judge whether the picture generated by the generator is in the visible light style, it cannot discriminate the target features in the picture well. Mode Collapse, in which the generator in effect memorizes its output, must be prevented. Another generator is therefore trained at the same time to convert the visible light image generated by the generator of step 2 back into an SAR image. The feature similarity of the generated pictures is verified by calculating the generation loss.
The generation loss is:
L_GAN(G_AB, G_BA, A, B) = E_{a~A}[||G_BA(G_AB(a)) - a||_1]    (6)
the generation loss is the Euclidean distance (the Euclidean distance refers to the real distance between two points in m-dimensional space) of two SAR images, wherein A is a source image domain (SAR image), B is a target image domain (visible light image), and G isABFor the generator from the source image domain A to the target image domain B, GBAFrom the target image domain B to the source image domain a. a is the original SAR image, GBA(GAB(a) Is the generated SAR image. In the training process, L is requiredGAN(GAB,GBAA, B) are as small as possible.
In the experiment, the network architectures of the two generators we designed are exactly the same (see step 2 for details). During the experiment, we recorded three logs of generation losses. The first is the generation loss of the visible light image generated from the SAR image, represented by L_GAN(G_AB, A, B) = E_{a~A}[||G_AB(a) - a||_1]; the second is the generation loss of the SAR image reconstructed from the visible light image, represented by L_GAN(G_AB, G_BA, A, B) = E_{a~A}[||G_BA(G_AB(a)) - G_AB(a)||_1]; the third is the generation loss of the whole generator, represented by L_GAN(G_AB, G_BA, A, B) = E_{a~A}[||G_BA(G_AB(a)) - a||_1]. Here A is the source image domain (SAR images), B is the target image domain (visible light images), G_AB is the generator from the source image domain A to the target image domain B, and G_BA is the generator from the target image domain B to the source image domain A; a is the original SAR image, G_AB(a) is the generated visible light image, and G_BA(G_AB(a)) is the regenerated SAR image.
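The three logged losses can be sketched on toy flattened images; the `l1` helper and the toy pixel values are ours, not from the patent:

```python
# Toy versions of the three generation-loss logs described above.

def l1(x, y):
    return sum(abs(p - q) for p, q in zip(x, y)) / len(x)

a = [0.2, 0.4, 0.6]            # original SAR image a
g_ab_a = [0.3, 0.5, 0.7]       # G_AB(a): generated visible light image
g_ba_g_ab_a = [0.2, 0.4, 0.6]  # G_BA(G_AB(a)): regenerated SAR image

loss_visible = l1(g_ab_a, a)                 # first log: ||G_AB(a) - a||_1
loss_reconstruct = l1(g_ba_g_ab_a, g_ab_a)   # second log: ||G_BA(G_AB(a)) - G_AB(a)||_1
loss_cycle = l1(g_ba_g_ab_a, a)              # third log: ||G_BA(G_AB(a)) - a||_1
print(loss_cycle)  # 0.0 for a perfect cycle
```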
Claims (10)
1. A modality conversion method between SAR images and visible light images based on a generative adversarial network, characterized by comprising the following steps:
step 1: acquiring prior information of the SAR image:
extracting a feature vector of the satellite image at the same position as prior information of the SAR image based on a neural network method, so that the target in the visible light image generated by the adversarial network is clearer;
step 2: the generator generates a visible light image:
the designed generator is provided with two input interfaces: the first interface receives the SAR image, and the second interface receives the feature vector of the satellite image extracted in step 1; a visible light image is then generated under the action of the encoder, the converter and the decoder;
step 3: the discriminator discriminates the visible light image:
inputting the picture output by the generator into a trained discriminator D, which produces a score d; the closer the output is to an image in the target domain, the closer the value of d is to 1; otherwise, the closer the value of d is to 0; the discriminator D judges whether the generated image is a visible light image; the judgment of the discriminator D is completed by calculating the discrimination loss;
step 4: verifying the feature similarity of the generated pictures: the discriminator D can only judge whether the picture generated by the generator is in the visible light image style, and cannot well discriminate the target features in the picture; in order to prevent mode collapse of the model, i.e., the generator developing a memory, another generator is trained simultaneously to convert the visible light image generated by the generator in step 2 back into a SAR image, the network architectures of the two generators being completely the same; the feature similarity of the generated pictures is verified by calculating the generation loss.
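The four claimed steps can be sketched as one training iteration. This is a hedged outline under our own naming (`extract_prior`, `train_step` and the placeholder callables are assumptions, not the patent's code), showing only the data flow between prior extraction, generation, discrimination, and cycle reconstruction:

```python
import numpy as np

def train_step(a, G_AB, G_BA, D, extract_prior):
    """Sketch of the four claimed steps for one SAR batch `a`;
    every callable is a placeholder for one of the patent's trained networks."""
    prior = extract_prior(a)                  # step 1: prior feature vector from the satellite image
    b_fake = G_AB(a, prior)                   # step 2: generator produces a visible-light image
    d_score = float(D(b_fake))                # step 3: discriminator score in (0, 1)
    disc_loss = -np.log(max(d_score, 1e-8))   # generator is trained to push D(b_fake) toward 1
    a_rec = G_BA(b_fake)                      # step 4: second generator maps back to a SAR image
    gen_loss = float(np.mean(np.abs(a_rec - a)))  # generation (cycle) loss, Eq. (3)
    return disc_loss, gen_loss

# Toy stand-ins, just to exercise the control flow:
rng = np.random.default_rng(1)
a = rng.random((2, 64, 64, 1))
disc_loss, gen_loss = train_step(
    a,
    G_AB=lambda x, p: x,        # identity "generators" for illustration only
    G_BA=lambda x: x,
    D=lambda x: 0.5,            # undecided discriminator
    extract_prior=lambda x: None,
)
```

With identity generators the cycle loss is exactly zero, and an undecided discriminator yields disc_loss = -log(0.5) = log 2.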
2. The method of claim 1, wherein the method comprises the following steps: the feature extraction of the satellite image is to extract features through a convolution layer and compress the features through a pooling layer;
convolution operation for feature extraction: suppose the input feature map F_in of a convolution layer has parameters W_in × H_in × C_in, where W_in is the width of the input feature map, H_in is the height of the input feature map, and C_in is the number of channels of the input feature map; the convolution parameters of the layer are K, S, P and Stride, where K is the number of convolution kernels, S is the width and height of each convolution kernel, P denotes the zero-padding applied to the input feature map (P = 1 means padding one ring of zeros around the input feature map), and Stride is the sliding step of the convolution kernel over the input feature map; the output feature map F_out of the convolution layer has parameters W_out × H_out × C_out, where W_out is the width of the output feature map, H_out is the height of the output feature map, and C_out is the number of channels of the output feature map, calculated as follows: W_out = (W_in − S + 2P)/Stride + 1, H_out = (H_in − S + 2P)/Stride + 1, C_out = K;
pooling layer for feature compression: a maximum pooling layer is adopted, i.e., when the feature map is down-sampled, the maximum value within each 2 × 2 grid is passed to the output feature map.
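The output-shape arithmetic of claim 2 can be written out directly. A small sketch of the standard convolution and 2 × 2 max-pooling size formulas (function names are ours):

```python
def conv_out_size(w_in, h_in, c_in, k, s, p, stride):
    """Output shape of a convolution layer per the claimed parameters:
    W_out = (W_in - S + 2P)/Stride + 1 (likewise for height), C_out = K."""
    w_out = (w_in - s + 2 * p) // stride + 1
    h_out = (h_in - s + 2 * p) // stride + 1
    return w_out, h_out, k

def maxpool_out_size(w_in, h_in, c_in):
    """2 x 2 max pooling halves width and height; channel count is unchanged."""
    return w_in // 2, h_in // 2, c_in

# Claim 3's setting (3 x 3 kernel, P = 1, Stride = 1) preserves spatial size:
same = conv_out_size(256, 256, 3, 32, 3, 1, 1)     # -> (256, 256, 32)
# Claim 4's pooling halves spatial size with channels unchanged:
halved = maxpool_out_size(256, 256, 32)            # -> (128, 128, 32)
```

This confirms the dependent claims: a 3 × 3 / P = 1 / Stride = 1 convolution keeps input and output sizes consistent, and pooling halves the feature map.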
3. The method of claim 2, wherein the method comprises the following steps: the size of the convolution kernel is 3 × 3, P is 1, and Stride is 1, so that the sizes of the input feature map and the output feature map are consistent.
4. The method of claim 2, wherein the method comprises the following steps: in the pooling operation, the number of channels of the input and output feature layers is unchanged, and the size of the output feature map is half of the size of the input feature map.
5. The method of claim 1, wherein the method comprises the following steps: the SAR image is input into an encoder, which extracts the feature information of the SAR image and expresses it as a feature vector.
6. Mode conversion method based on SAR images and visible light images of a countermeasure generation network according to claim 1 or 5, characterized in that: the encoder consists of three convolution layers: one with 32 filters of size 7 × 7 and stride 1, one with 64 filters of size 3 × 3 and stride 2, and one with 128 filters of size 3 × 3 and stride 2; a SAR image of size [256, 256, 3] is input into the designed encoder, and the convolution kernels of different sizes move over the input image and extract features, yielding a feature vector of size [64, 64, 256].
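The spatial sizes through the claimed encoder can be traced with the convolution size formula. The paddings below are our assumption (the claim does not state them; we take the "same"-style values 3 and 1). Under this reading the spatial size comes out to 64 × 64, matching the claimed feature map, with a channel count of 128, the filter count of the last layer:

```python
def conv_out(w, kernel, stride, pad):
    # spatial output size of a square convolution layer
    return (w - kernel + 2 * pad) // stride + 1

# The three claimed encoder layers as (filters, kernel, stride, assumed padding).
layers = [(32, 7, 1, 3), (64, 3, 2, 1), (128, 3, 2, 1)]

w, c = 256, 3   # a [256, 256, 3] SAR image
for filters, kernel, stride, pad in layers:
    w, c = conv_out(w, kernel, stride, pad), filters
# 256 -> 256 -> 128 -> 64 spatially
```

The two stride-2 layers each halve the spatial resolution, which is what compresses the 256 × 256 input down to a 64 × 64 feature map.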
7. The method of claim 1, wherein the method comprises the following steps: the feature vector of the satellite image obtained in step 1 serves as the prior information of the SAR image, and the encoder yields the feature vector of the SAR image; the two different feature vectors are therefore fused first and used as the feature input of the decoder; the decoder consists of several residual blocks, which ensure that the input information of a preceding network layer acts directly on the following network layers, so that the deviation of the corresponding output from the original input is reduced.
8. The method of claim 1, wherein the method comprises the following steps: the feature vector of the image in the target domain obtained by the converter is used as input; the decoder receives this input and restores low-level features from the feature vector; the decoding process is the exact opposite of the encoding process, and the whole decoding process uses transposed convolution layers; finally, the low-level features are converted into an image in the target image domain, i.e. a visible light image.
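Since the decoding process mirrors the encoding process, its spatial sizes follow the transposed-convolution size formula. A sketch under our own assumptions (kernel 3, stride 2, padding 1, output padding 1 per layer; the claim does not fix these values): two stride-2 transposed convolutions restore a 64 × 64 feature map to a 256 × 256 visible-light image:

```python
def deconv_out(w, kernel, stride, pad, out_pad=0):
    """Spatial output size of a transposed-convolution layer
    (the inverse of the usual convolution size formula)."""
    return (w - 1) * stride - 2 * pad + kernel + out_pad

w = 64
w = deconv_out(w, kernel=3, stride=2, pad=1, out_pad=1)   # 64 -> 128
w = deconv_out(w, kernel=3, stride=2, pad=1, out_pad=1)   # 128 -> 256
```

Each transposed layer doubles the spatial resolution, undoing one stride-2 encoder layer.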
9. The method of claim 1, wherein the method comprises the following steps: the discrimination loss is:
L_GAN(G_AB, D, A, B) = −(E_{b~B}[log D(b)] + E_{a~A}[log(1 − D(G_AB(a)))])   (2)
wherein A is the source image domain, B is the target image domain, a is a SAR image in the source image domain, b is a visible light image in the target image domain, G_AB is the generator from the source image domain A to the target image domain B, and D is the discriminator; the training process makes the discrimination loss L_GAN(G_AB, D, A, B) as small as possible.
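Equation (2) can be evaluated directly from batches of discriminator scores. A minimal sketch (the function name and the example score values are ours):

```python
import numpy as np

def discrimination_loss(d_real, d_fake, eps=1e-12):
    """Eq. (2): L = -(E_{b~B}[log D(b)] + E_{a~A}[log(1 - D(G_AB(a)))]).
    `d_real` are scores D(b) on target-domain images, `d_fake` are scores
    D(G_AB(a)) on generated images; `eps` guards against log(0)."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return float(-(np.mean(np.log(d_real + eps))
                   + np.mean(np.log(1.0 - d_fake + eps))))

# A confident, correct discriminator (real -> ~1, fake -> ~0) gives a small loss;
# an undecided one (all scores 0.5) gives -2 * log(0.5) = 2 * log 2.
good = discrimination_loss(d_real=[0.99, 0.98], d_fake=[0.02, 0.01])
undecided = discrimination_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

Minimizing this loss is what trains D to push real scores toward 1 and generated scores toward 0.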
10. The method of claim 1, wherein the method comprises the following steps: the generation loss is:
L_GAN(G_AB, G_BA, A, B) = E_{a~A}[||G_BA(G_AB(a)) − a||_1]   (3)
the generation loss is the Euclidean distance between the two SAR images, wherein A is the source image domain, B is the target image domain, G_AB is the generator from the source image domain A to the target image domain B, and G_BA is the generator from the target image domain B to the source image domain A; a is the SAR image in the source image domain, and G_BA(G_AB(a)) is the generated SAR image; in the training process, L_GAN(G_AB, G_BA, A, B) is required to be as small as possible.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811405188.2A CN109636742B (en) | 2018-11-23 | 2018-11-23 | Mode conversion method of SAR image and visible light image based on countermeasure generation network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109636742A CN109636742A (en) | 2019-04-16 |
CN109636742B true CN109636742B (en) | 2020-09-22 |
Family
ID=66069278
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811405188.2A Active CN109636742B (en) | 2018-11-23 | 2018-11-23 | Mode conversion method of SAR image and visible light image based on countermeasure generation network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109636742B (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110188667B (en) * | 2019-05-28 | 2020-10-30 | 复旦大学 | Face rectification method based on three-party confrontation generation network |
CN110197517B (en) * | 2019-06-11 | 2023-01-31 | 常熟理工学院 | SAR image coloring method based on multi-domain cycle consistency countermeasure generation network |
CN110210574B (en) * | 2019-06-13 | 2022-02-18 | 中国科学院自动化研究所 | Synthetic aperture radar image interpretation method, target identification device and equipment |
CN110472627B (en) * | 2019-07-02 | 2022-11-08 | 五邑大学 | End-to-end SAR image recognition method, device and storage medium |
CN110363163B (en) * | 2019-07-18 | 2021-07-13 | 电子科技大学 | SAR target image generation method with controllable azimuth angle |
WO2021016352A1 (en) * | 2019-07-22 | 2021-01-28 | Raytheon Company | Machine learned registration and multi-modal regression |
GB2595122B (en) * | 2019-08-13 | 2022-08-24 | Univ Of Hertfordshire Higher Education Corporation | Method and apparatus |
CN111047525A (en) * | 2019-11-18 | 2020-04-21 | 宁波大学 | Method for translating SAR remote sensing image into optical remote sensing image |
CN112330562B (en) * | 2020-11-09 | 2022-11-15 | 中国人民解放军海军航空大学 | Heterogeneous remote sensing image transformation method and system |
US11915401B2 (en) | 2020-12-09 | 2024-02-27 | Shenzhen Institutes Of Advanced Technology | Apriori guidance network for multitask medical image synthesis |
CN112668621B (en) * | 2020-12-22 | 2023-04-18 | 南京航空航天大学 | Image quality evaluation method and system based on cross-source image translation |
CN113554671A (en) * | 2021-06-23 | 2021-10-26 | 西安电子科技大学 | Method and device for converting SAR image into visible light image based on contour enhancement |
CN113361508B (en) * | 2021-08-11 | 2021-10-22 | 四川省人工智能研究院(宜宾) | Cross-view-angle geographic positioning method based on unmanned aerial vehicle-satellite |
CN114202679A (en) * | 2021-12-01 | 2022-03-18 | 昆明理工大学 | Automatic labeling method for heterogeneous remote sensing image based on GAN network |
CN115186814B (en) * | 2022-07-25 | 2024-02-13 | 南京慧尔视智能科技有限公司 | Training method, training device, electronic equipment and storage medium of countermeasure generation network |
CN117611644A (en) * | 2024-01-23 | 2024-02-27 | 南京航空航天大学 | Method, device, medium and equipment for converting visible light image into SAR image |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101405435B1 (en) * | 2012-12-14 | 2014-06-11 | 한국항공우주연구원 | Method and apparatus for blending high resolution image |
CN105809194A (en) * | 2016-03-08 | 2016-07-27 | 华中师范大学 | Method for translating SAR image into optical image |
CN108510532A (en) * | 2018-03-30 | 2018-09-07 | 西安电子科技大学 | Optics and SAR image registration method based on depth convolution GAN |
CN108564606A (en) * | 2018-03-30 | 2018-09-21 | 西安电子科技大学 | Heterologous image block matching method based on image conversion |
CN108717698A (en) * | 2018-05-28 | 2018-10-30 | 深圳市唯特视科技有限公司 | A kind of high quality graphic generation method generating confrontation network based on depth convolution |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109636742B (en) | Mode conversion method of SAR image and visible light image based on countermeasure generation network | |
US11402494B2 (en) | Method and apparatus for end-to-end SAR image recognition, and storage medium | |
CN111145131B (en) | Infrared and visible light image fusion method based on multiscale generation type countermeasure network | |
WO2023050746A1 (en) | Method for enhancing sar image data for ship target detection | |
CN112083422A (en) | Single-voyage InSAR system end-to-end classification method based on multistage deep learning network | |
CN115236655B (en) | Landslide identification method, system, equipment and medium based on fully-polarized SAR | |
CN111784560A (en) | SAR and optical image bidirectional translation method for generating countermeasure network based on cascade residual errors | |
CN114241003B (en) | All-weather lightweight high-real-time sea surface ship detection and tracking method | |
CN112766223A (en) | Hyperspectral image target detection method based on sample mining and background reconstruction | |
CN113222824B (en) | Infrared image super-resolution and small target detection method | |
CN113408540B (en) | Synthetic aperture radar image overlap area extraction method and storage medium | |
Long et al. | Dual self-attention Swin transformer for hyperspectral image super-resolution | |
CN113111706B (en) | SAR target feature unwrapping and identifying method for azimuth continuous deletion | |
CN113050083A (en) | Ultra-wideband radar human body posture reconstruction method based on point cloud | |
Drees et al. | Multi-modal deep learning with sentinel-3 observations for the detection of oceanic internal waves | |
CN111126508A (en) | Hopc-based improved heterogeneous image matching method | |
CN116071664A (en) | SAR image ship detection method based on improved CenterNet network | |
CN110263777B (en) | Target detection method and system based on space-spectrum combination local preserving projection algorithm | |
CN113762271A (en) | SAR image semantic segmentation method and system based on irregular convolution kernel neural network model | |
CN114912499A (en) | Deep learning-based associated imaging method and system | |
CN115457120A (en) | Absolute position sensing method and system under GPS rejection condition | |
Zhang et al. | Structural similarity preserving GAN for infrared and visible image fusion | |
Li et al. | Transformer meets GAN: Cloud-free multispectral image reconstruction via multi-sensor data fusion in satellite images | |
CN115909045B (en) | Two-stage landslide map feature intelligent recognition method based on contrast learning | |
Hu et al. | Detection of Tea Leaf Blight in Low-Resolution UAV Remote Sensing Images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||