CN113221923A - Feature decomposition method and system for multi-mode image block matching - Google Patents
- Publication number
- CN113221923A (application number CN202110605524.3A)
- Authority
- CN
- China
- Prior art keywords
- features
- sar
- image block
- image
- private
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/061—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Abstract
The invention discloses a feature decomposition method and system for multi-modal image block matching, in which a data set of heterogeneous image block pairs is constructed; the images are preprocessed; features are extracted with an encoder; the features are decomposed; the image blocks are reconstructed with a decoder; a discriminator is applied; the network is optimized; the matching probability is predicted; and finally the network performance is evaluated. The features of each image block are decomposed into common features and private features, adversarial training is introduced to optimize the encoder, a reconstruction loss ensures that the encoder extracts informative features, and the original images are reconstructed from the common and private features to obtain the final image block matching result. By jointly optimizing four loss functions, the method greatly improves the accuracy of heterogeneous image matching and shortens the training period of the network.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a feature decomposition method and system for multi-modal image block matching, which can be used for target tracking, heterogeneous image registration, image retrieval and the like, and can effectively improve the matching precision of heterogeneous images.
Background
The purpose of image block matching is to establish local correspondences between image blocks. It has important applications in remote sensing image processing, such as remote sensing image registration, change detection, image stitching and image fusion. In recent years, acquiring richer information from the images of different sensors has become a trend. However, multi-modal images such as optical and SAR images differ greatly in appearance and texture, which makes image block matching very difficult.
Traditional image block matching methods rely on hand-crafted descriptors, such as SIFT, and obtain the correspondence between image blocks from feature distances. However, the accuracy and robustness of these methods still leave room for improvement. In recent years, deep learning has achieved remarkable results in the field of image processing. Descriptors learned by deep networks, such as MatchNet, L2Net and HardNet, are more adaptive and robust than hand-crafted ones. These methods achieve good results in homogeneous patch matching tasks, but still have limitations in multi-modal patch matching, especially for optical and SAR patches: because optical and SAR images differ greatly in appearance and texture, accurate matching is difficult to achieve.
One existing method adopts a progressive sampling strategy so that the network obtains a large number of training samples within a few epochs, emphasizes the relative distance between descriptors, additionally supervises the intermediate feature maps, considers the compactness of the descriptors, matches the output descriptors by their L2 distance in Euclidean space, and achieves very significant performance. Another uses a batch-based sampling strategy to mine hard negative samples, alleviating the sample imbalance problem of L2Net by maximizing, within one batch, the margin between the nearest positive sample and the nearest negative sample, and performs even better. Neither method considers the decomposition of features.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method and a system for feature decomposition for multi-modal image block matching, which significantly improve the performance of multi-modal image matching.
The invention adopts the following technical scheme:
a feature decomposition method for multi-modal patch matching, comprising the steps of:
s1, making a data set;
S2, normalizing the pixel values of all image blocks in the data set made in step S1 to [-1, 1];
S3, inputting a pair of matched visible-light optical and SAR image blocks normalized in step S2, and extracting the image block features of the optical and SAR image blocks respectively with an encoder;
S4, performing feature decomposition on the optical image block features and the SAR image block features obtained in step S3 to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the SAR image block;
S5, using a decoder to reconstruct the optical image from the common features O_c and private features O_p of the optical image block obtained in step S4, and to reconstruct the SAR image from the common features S_c and private features S_p of the SAR image block;
S6, sending the two common features obtained in step S4 to a discriminator, which distinguishes whether each comes from an optical image block or a SAR image block;
S7, optimizing the encoder with the triplet loss; calculating the reconstruction losses between the reconstructed optical and SAR images obtained in step S5 and the corresponding original images; introducing an adversarial loss to optimize the encoder and the discriminator of step S6 so that the encoder fools the discriminator; constraining the common and private features with a difference loss; and optimizing the feature decomposition network through the triplet, adversarial, reconstruction and difference losses to obtain trained weights so that the common features can be used in subsequent testing;
and S8, loading the weights trained in the step S7 into the feature decomposition network model, reading all the test set data in sequence, predicting the matching probability of each pair of image blocks in the test set, and obtaining the final image block matching result.
Specifically, in step S3, the encoder includes five convolution layers, the size of the convolution kernel is 3 × 3, and the number of convolution kernels is 32, 32, 64, and 128, respectively.
Specifically, in step S4, the feature decomposition module includes two convolutional layers, the convolutional kernel size of the first convolutional layer is 3 × 3, and the number of channels is 128; the convolution kernel size of the second layer is 8 × 8 and the number of channels is 128.
Specifically, in step S4, the common module F_c is used to extract the common features from the optical image block features and the SAR image block features, and the optical private module F_po and the SAR private module F_ps are used to extract the private features from the optical and SAR image block features respectively, giving the common and private features of the optical and SAR images:

O_{c,i}, O_{p,i} = F_c(x_{opt,i}), F_{po}(x_{opt,i})

S_{c,i}, S_{p,i} = F_c(x_{sar,i}), F_{ps}(x_{sar,i})

where O_{c,i} and O_{p,i} denote the common and private features of the optical image block, and S_{c,i} and S_{p,i} denote the common and private features of the SAR image block.
Specifically, in step S5, the decoder includes two fully-connected layers and four convolutional layers, and the number of neurons in the input layer, the hidden layer, and the output layer is 256, 512, and 1024, respectively; the sizes of the convolution kernels are 3 × 3, 5 × 5, and 7 × 7, and the numbers of the convolution kernels are 32, 64, 128, and 1, respectively.
Specifically, in step S5, the input image blocks are reconstructed from their common and private features as follows:

x̂_{opt,i} = De(O_{c,i}, O_{p,i})

x̂_{sar,i} = De(S_{c,i}, S_{p,i})

where De is the decoder, x̂_{opt,i} is the reconstructed optical image block, and x̂_{sar,i} is the reconstructed SAR image block.
Specifically, in step S6, the discriminator includes five fully-connected layers, with 128, 512 and 2 neurons respectively.
Specifically, in step S7, the final loss function jointly optimizing the four losses is:

L = L_tri + L_adv + λL_diff + L_rec

where L_tri is the triplet loss, L_adv is the adversarial loss between the encoder and the discriminator, L_diff is the difference loss, L_rec is the reconstruction loss, and λ is a weight in the loss function.
Further, the triplet loss L_tri is:

L_tri = (1/n) Σ_{i=1}^{n} max(0, m + d_pos − d_neg)

where m is the margin, and d_pos and d_neg are the Euclidean distances of the positive image block pair and the hard negative block pair, respectively;
the difference loss is as follows:

L_diff = (1/n) Σ_{i=1}^{n} (‖O_{c,i}^T O_{p,i}‖_F² + ‖S_{c,i}^T S_{p,i}‖_F²)

where O_{c,i}^T is the transpose of the common feature of the optical image block, O_{p,i} is the private feature of the optical image block, S_{c,i}^T is the transpose of the common feature of the SAR image block, S_{p,i} is the private feature of the SAR image block, i = 1, …, n, and n is the number of samples;
the reconstruction loss L_rec is calculated from the reconstructed and real images as follows:

L_rec = (1/(nk)) Σ_{i=1}^{n} (‖x̂_{opt,i} − x_{opt,i}‖₂² + ‖x̂_{sar,i} − x_{sar,i}‖₂²)

where k is the number of pixels in each image block, x̂_{opt,i} is the reconstructed optical image block, x_{opt,i} is the input optical image block, x̂_{sar,i} is the reconstructed SAR image block, and x_{sar,i} is the input SAR image block;
the loss function of the discriminator is:

L_D = −E[log D(F_c(x_{opt,i}))] − E[log(1 − D(F_c(x_{sar,i})))]

where D is the discriminator, E denotes the expectation in the cross-entropy, F_c(x_{sar,i}) are the features extracted by the encoder from the input SAR image block, and F_c(x_{opt,i}) are the features extracted by the encoder from the input optical image block;
the adversarial loss of the encoder is as follows:

L_adv = −E[log D(F_c(x_{sar,i}))]

where E denotes the expectation and F_c(x_{sar,i}) are the features extracted by the encoder from the input SAR image block.
Another technical solution of the present invention is a system for feature decomposition for multi-modal patch matching, comprising:
a data module for making a data set;
the preprocessing module, which normalizes the pixel values of all image blocks in the data set made by the data module to [-1, 1];
the feature module, which inputs a pair of matched visible-light optical and SAR image blocks normalized by the preprocessing module and extracts the image block features of the optical and SAR image blocks respectively with an encoder;
the decomposition module, which performs feature decomposition on the optical and SAR image block features obtained by the feature module to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the SAR image block;
the reconstruction module, which uses a decoder to reconstruct the optical image from the common features O_c and private features O_p of the optical image block obtained by the decomposition module, and the SAR image from the common features S_c and private features S_p of the SAR image block;
the distinguishing module, which sends the two common features obtained by the decomposition module to the discriminator, which distinguishes whether each comes from an optical image block or a SAR image block;
the optimization module, which optimizes the encoder with the triplet loss; calculates the reconstruction losses between the reconstructed optical and SAR images obtained by the reconstruction module and the corresponding original images; introduces an adversarial loss to optimize the encoder and the discriminator of the distinguishing module so that the encoder fools the discriminator; constrains the common and private features with a difference loss; and optimizes the feature decomposition network through the triplet, adversarial, reconstruction and difference losses to obtain trained weights so that the common features can be used in subsequent testing;
and the output module loads the trained weight in the optimization module into the feature decomposition network model, sequentially reads all test set data, predicts the matching probability of each pair of image blocks in the test set and obtains the final image block matching result.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention relates to a feature decomposition method for multi-modal image block matching, which decomposes the features of image blocks into public features and private features through a feature decomposition module, uses the public features for image block matching, and eliminates the influence of larger difference among the multi-modal image blocks, thereby obtaining better matching effect.
Further, an encoder is used to extract the image block features, deeply mining the shared low-level characteristics of the heterogeneous images, and the resulting features are used for the subsequent feature decomposition.
Further, in order to eliminate the influence of large differences between the multi-modal image blocks, the feature decomposition module decomposes the features of the image blocks into public features and private features.
Furthermore, compared with matching with all the features of an image, discarding the private features and using only the common features for image block matching eliminates the modal differences and yields a better matching result.
Further, to ensure that the learned features contain valid information, a decoder is used to reconstruct the image.
Furthermore, reconstructing the original image from the common and private features ensures that the encoder can extract informative features.
Further, to ensure that the common features learned from the optical and SAR images are similar, a discriminator is introduced to identify the corresponding modality and perform adversarial learning; the purpose of the discriminator is to distinguish the common features of the optical image from those of the SAR image.
Further, the network is optimized through the final loss, and the weight is continuously adjusted to obtain a high matching result.
Further, the encoder is optimized with the triplet loss so that the distance between matched pairs is as small as possible and the distance between hard mismatched pairs is as large as possible. To extract consistent common features, an adversarial loss is introduced to optimize the encoder and the discriminator, making the common features learned from the optical and SAR images similar. In addition, the reconstruction loss, which reconstructs the original image from the common and private features, ensures that the encoder extracts informative features, and a difference loss constrains the common and private features to be different.
In conclusion, the method utilizes the four loss functions to jointly optimize, so that the accuracy of the heterogeneous image matching is greatly improved, and the training period of the network is shortened.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of a network framework of the present invention;
FIG. 3 is some exemplary diagrams of the heterogeneous source data sets used in simulation experiments in accordance with the present invention;
FIG. 4 is a diagram illustrating image block matching results according to the present invention;
FIG. 5 shows the distribution visualization of extracted descriptors, wherein (a) is HardNet and (b) is FDNet.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Various structural schematics according to the disclosed embodiments of the invention are shown in the drawings. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity of presentation. The shapes of various regions, layers and their relative sizes and positional relationships shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, according to actual needs.
The invention provides a feature decomposition method for multi-modal image block matching, in which a data set of heterogeneous image blocks is constructed; the images are preprocessed; features are extracted with an encoder; the features are decomposed; the image blocks are reconstructed with a decoder; a discriminator is applied; the network is optimized; the matching probability is predicted; and finally the network performance is evaluated. The features of each image block are decomposed into common and private features, and adversarial training is introduced to optimize the encoder and the discriminator. In addition, a reconstruction loss ensures that the encoder extracts informative features by reconstructing the original image from the common and private features. The method achieves good results in multi-modal image matching.
Referring to fig. 1 and fig. 2, a feature decomposition method for multi-modal tile matching according to the present invention includes the following steps:
s1, making a data set
323464 pairs of patches are cropped from 1500 pairs of pixel-level-aligned 512 × 512 visible-light and SAR images; 246121 of the patch pairs are used for training and the rest for testing. Each patch is 32 × 32 pixels;
s2, image preprocessing
Normalizing the pixel values of all image blocks to [-1, 1];
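Purely as an illustrative sketch (not part of the claimed method), the normalization of step S2 can be written in a few lines; the helper name is ours and 8-bit input patches are assumed:

```python
import numpy as np

def normalize_patch(patch):
    """Map 8-bit pixel values from [0, 255] to [-1, 1]."""
    return patch.astype(np.float32) / 127.5 - 1.0
```

Applied to a uint8 patch, 0 maps to -1.0 and 255 maps exactly to 1.0.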
s3, extracting image block characteristics by coder
The encoder is composed of five convolutional layers, the size of the convolutional kernel is 3 × 3, and the number of convolutional kernels is 32, 32, 64, and 128, respectively.
A pair of matched visible-light optical and SAR image blocks P = (x_opt, x_sar) is input, and the features of the optical image block and the SAR image block are extracted respectively by two weight-sharing encoders;
s4, feature decomposition
The feature decomposition module is composed of two convolutional layers, the convolutional kernel size of the first convolutional layer is 3 x 3, and the number of channels is 128. The convolution kernel size of the second layer is 8 × 8 and the number of channels is 128.
The optical image block features and the SAR image block features obtained in step S3 are decomposed to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the SAR image block.

The common module F_c extracts the common features from the optical and SAR image block features obtained in step S3, while the optical private module F_po and the SAR private module F_ps extract the private features from the optical and SAR image block features respectively. In practice, the feature decomposition module is implemented with two convolutional layers. Feature decomposition yields the common and private features of the optical and SAR images:

O_{c,i}, O_{p,i} = F_c(x_{opt,i}), F_{po}(x_{opt,i})

S_{c,i}, S_{p,i} = F_c(x_{sar,i}), F_{ps}(x_{sar,i})

where O_{c,i} and O_{p,i} denote the common and private features of the optical image block, and S_{c,i} and S_{p,i} denote the common and private features of the SAR image block.
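For illustration only, the interface of this decomposition can be sketched with hypothetical linear heads standing in for the shared common module F_c and the private modules F_po / F_ps (the patent implements these with convolutional layers; all names and sizes below are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_feat = 256, 128            # assumed encoder-output and descriptor sizes

# Linear stand-ins: one shared common head, one private head per modality.
W_c  = rng.normal(size=(d_feat, d_in))   # Fc, applied to both modalities
W_po = rng.normal(size=(d_feat, d_in))   # Fpo, optical private head
W_ps = rng.normal(size=(d_feat, d_in))   # Fps, SAR private head

def decompose(x_opt, x_sar):
    """Return (common, private) feature pairs for each modality."""
    O_c, O_p = W_c @ x_opt, W_po @ x_opt
    S_c, S_p = W_c @ x_sar, W_ps @ x_sar
    return (O_c, O_p), (S_c, S_p)
```

Only the common descriptors O_c and S_c enter the matching step; O_p and S_p feed the decoder.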
S5, reconstructing the image block by the decoder
The decoder is composed of two full-connection layers and four convolution layers, and the number of neurons of the input layer, the hidden layer and the output layer is 256, 512 and 1024 respectively. The sizes of the convolution kernels are 3 × 3, 5 × 5, and 7 × 7, and the numbers of the convolution kernels are 32, 64, 128, and 1, respectively.
To ensure that the learned features contain valid information, a decoder reconstructs the optical image from the common features O_c and private features O_p of the optical image block obtained in step S4, and the SAR image from the common features S_c and private features S_p of the SAR image block;

the input image blocks are reconstructed from their common and private features:

x̂_{opt,i} = De(O_{c,i}, O_{p,i})

x̂_{sar,i} = De(S_{c,i}, S_{p,i})

where De is the decoder, x̂_{opt,i} is the reconstructed optical image block, and x̂_{sar,i} is the reconstructed SAR image block. In the matching task, the common features of corresponding optical and SAR image blocks are expected to be as similar as possible.
S6, discriminator
The discriminator consists of five fully-connected layers, with 128, 512 and 2 neurons respectively.

The discriminator distinguishes whether each of the two common features obtained in step S4 comes from an optical image block or a SAR image block; the common features of corresponding optical and SAR image blocks are expected to be as similar as possible;
s7, network optimization
Optimizing the network through a plurality of loss functions, including triple loss, countermeasure loss, reconstruction loss and difference loss;
the triplet losses are used to optimize the encoder so that the distance between matched pairs is as close as possible and the distance between difficult mismatched pairs is as far as possible. In order to extract consistent common features, countermeasures are introduced to optimize the encoder and the discriminator, and the common features learned by the optical image and the sar image are similar. In addition, the reconstruction loss is utilized to ensure that the encoder can extract the informative features in order to reconstruct the original image based on the public and private features. And uses differential losses to constrain the public and private features to be different. 100 epochs are trained; the learning rate of the encoder and the feature decomposition module is 1.0; the learning rate of the discriminator is 1.0; the learning rate of the decoder is 0.0001; the weight λ in the loss function is 0.001 and the Batchsize is 321.
S701, optimizing an encoder by adopting a difficult sample mining strategy and triple loss;
the distance between matched pairs is made as close as possible and the distance between difficult mismatched pairs is made as far as possible. The triad loss is as follows:
dpos=d(Oc,i,Sc,i)
dneg=min(d(Oc,j,Sc,i),d(Oc,i,Sc,j)),i≠j
wherein d isposAnd dnegThe euclidean distances of the pair of positive image blocks and the pair of difficult negative blocks, respectively.
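As an illustrative numpy sketch of this batch-hard triplet loss (the function name is ours and the margin value is an assumption; the patent does not state one):

```python
import numpy as np

def triplet_loss(O_c, S_c, margin=1.0):
    """Batch-hard triplet loss over common descriptors.

    O_c, S_c: (n, d) arrays where row i of O_c matches row i of S_c.
    """
    # pairwise Euclidean distances dist[i, j] = ||O_c[i] - S_c[j]||
    diff = O_c[:, None, :] - S_c[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    n = dist.shape[0]
    d_pos = np.diag(dist)                 # distances of the matched pairs
    off = dist + np.eye(n) * 1e9          # mask out the positives
    # hardest negative: min over d(O_cj, S_ci) and d(O_ci, S_cj), j != i
    d_neg = np.minimum(off.min(axis=0), off.min(axis=1))
    return float(np.maximum(0.0, margin + d_pos - d_neg).mean())
```

With well-separated descriptors the hinge is inactive and the loss is zero; pulling a matched pair apart activates it.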
S702, introducing a discriminator to identify the modality corresponding to the input common features and performing adversarial learning;
the purpose of the discriminator is to distinguish the common features of the optical image from those of the sar image. Thus, the penalty function of the arbiter is:
wherein D is a discriminator, E is an entropy calculation, Fc(xsar,i) Features extracted by the encoder for the input sar image block, Fc(xopt,i) Features extracted by the encoder for the input optical image block.
The encoder aims to fool the discriminator so that it cannot separate the common features of the different modalities.

The adversarial loss of the encoder is as follows:

L_adv = −E[log D(F_c(x_{sar,i}))]

where E denotes the expectation and F_c(x_{sar,i}) are the features extracted by the encoder from the input SAR image block.
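A minimal sketch of the two adversarial objectives in numpy, assuming a binary discriminator that labels optical common features 1 and SAR common features 0 (the label assignment and function names are our assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator_loss(d_logits_opt, d_logits_sar):
    """Binary cross-entropy: push optical logits towards 1, SAR towards 0."""
    p_opt = sigmoid(d_logits_opt)
    p_sar = sigmoid(d_logits_sar)
    return float(-(np.log(p_opt).mean() + np.log(1.0 - p_sar).mean()))

def encoder_adv_loss(d_logits_sar):
    """Encoder tries to make SAR common features look like optical ones."""
    return float(-np.log(sigmoid(d_logits_sar)).mean())
```

During training the two losses are minimized alternately, as in step S703.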
S703, in the training process, alternately optimizing the encoder and the discriminator;
furthermore, public and private features should be different.
The difference loss is as follows:

L_diff = Σ_{i=1}^{n} ( ||O_c,i^T O_p,i||_F^2 + ||S_c,i^T S_p,i||_F^2 )

wherein O_c,i^T is the transpose of the common features of the optical image block, O_p,i are the private features of the optical image block, S_c,i^T is the transpose of the common features of the sar image block, S_p,i are the private features of the sar image block, i = 1, …, n, and n is the number of samples.
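One common reading of this orthogonality constraint (as in domain separation networks) is a squared inner product between each sample's common and private features. The numpy sketch below assumes that interpretation, plus an added per-batch average that the patent does not specify:

```python
import numpy as np

def difference_loss(O_c, O_p, S_c, S_p):
    """Each argument: (n, d) feature matrix, row i = sample i.
    Penalizes overlap between common and private features."""
    opt_term = np.sum((O_c * O_p).sum(axis=1) ** 2)   # (O_c,i^T O_p,i)^2
    sar_term = np.sum((S_c * S_p).sum(axis=1) ** 2)   # (S_c,i^T S_p,i)^2
    return (opt_term + sar_term) / len(O_c)           # batch average (assumed)
```

When common and private features are orthogonal the loss is zero, which is the desired decomposition.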
S704, concatenating the common features and the private features and inputting them into a decoder to obtain a reconstructed image;
the reconstruction loss is calculated from the reconstructed image and the real image as follows:

L_rec = (1/n) Σ_{i=1}^{n} ( ||x̂_opt,i − x_opt,i||_2^2 + ||x̂_sar,i − x_sar,i||_2^2 ) / k

wherein k is the number of pixels in each image block, x̂_opt,i is the reconstructed optical image block, x_opt,i is the input optical image block, x̂_sar,i is the reconstructed sar image block, and x_sar,i is the input sar image block.
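A minimal numpy sketch of this per-pixel reconstruction loss, assuming flattened patches and a mean over the batch:

```python
import numpy as np

def reconstruction_loss(x_opt_rec, x_opt, x_sar_rec, x_sar):
    """Each argument: (n, k) flattened image patches, k pixels per patch."""
    k = x_opt.shape[1]
    # Per-sample squared error, normalized by the k pixels in each block.
    opt_err = ((x_opt_rec - x_opt) ** 2).sum(axis=1) / k
    sar_err = ((x_sar_rec - x_sar) ** 2).sum(axis=1) / k
    return (opt_err + sar_err).mean()
```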
S705, jointly optimizing four losses, wherein the final loss function is as follows:
L = L_tri + L_adv + λL_diff + L_rec
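The joint objective is then a direct weighted sum of the four terms; the default λ below is the 0.001 weight stated in the description:

```python
def total_loss(l_tri, l_adv, l_diff, l_rec, lam=0.001):
    """Joint objective L = L_tri + L_adv + lambda * L_diff + L_rec."""
    return l_tri + l_adv + lam * l_diff + l_rec
```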
S8, predicting the matching probability
Loading the weights trained in the step S7 into the model, reading all the test set data in sequence, and predicting the matching probability of each pair of image blocks in the test set;
S9, evaluating the network performance
The performance of the network on the heterogeneous-source data sets is evaluated by FPR95, the false positive rate at the point where the true positive rate reaches 95%.
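FPR95 can be computed by sweeping a threshold over the predicted matching scores; the descending-score sweep below is an implementation assumption, not taken from the patent:

```python
import numpy as np

def fpr95(scores, labels):
    """scores: matching probabilities (higher = more likely a match);
    labels: 1 for matched pairs, 0 for non-matched pairs.
    Returns the false positive rate when TPR first reaches 95%."""
    order = np.argsort(-scores)                 # descending score
    labels = labels[order]
    tpr = np.cumsum(labels) / labels.sum()
    fpr = np.cumsum(1 - labels) / (1 - labels).sum()
    idx = np.searchsorted(tpr, 0.95)            # first index with TPR >= 0.95
    return fpr[idx]
```

A perfectly separating network yields FPR95 = 0, which is why a smaller FPR95 indicates a better matching effect.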
In another embodiment of the present invention, a feature decomposition system for multi-modal image block matching is provided. The system can be used to implement the above-mentioned feature decomposition method for multi-modal image block matching. Specifically, the feature decomposition system for multi-modal image block matching includes a data module, a preprocessing module, a feature module, a decomposition module, a reconstruction module, a distinguishing module, an optimization module, and an output module.
The data module is used for making a data set;
the preprocessing module normalizes the pixel values of all image blocks in a data set manufactured by the data module to [ -1, 1 ];
the feature module is used for inputting a pair of matched visible light optical and sar image blocks normalized by the preprocessing module, and extracting the image block features of the optical image block and the sar image block respectively with an encoder;
the decomposition module is used for performing feature decomposition on the features of the optical image block and the features of the sar image block obtained by the feature module, to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block;
the reconstruction module uses a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block obtained by the decomposition module, and to reconstruct an sar image from the common features S_c and private features S_p of the sar image block;
the distinguishing module is used for sending the two sets of common features obtained by the decomposition module to the discriminator, which distinguishes whether each set comes from an optical image block or an sar image block;
an optimization module for optimizing the encoder using the triplet loss; calculating the reconstruction losses between the reconstructed optical and sar images obtained by the reconstruction module and the corresponding original images; introducing an adversarial loss to optimize the encoder and the discriminator in the distinguishing module so that the encoder fools the discriminator; constraining common and private features using the difference loss; and optimizing the feature decomposition network through the triplet, adversarial, reconstruction and difference losses to obtain trained weights for subsequent testing with the common features;
and the output module loads the trained weight in the optimization module into the feature decomposition network model, sequentially reads all test set data, predicts the matching probability of each pair of image blocks in the test set and obtains the final image block matching result.
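The preprocessing module's normalization of pixel values to [-1, 1] can be sketched as follows, assuming 8-bit input patches (the input bit depth is an assumption, not stated by the patent):

```python
import numpy as np

def normalize_patch(patch_uint8):
    """Map uint8 pixel values [0, 255] linearly to [-1.0, 1.0]."""
    return patch_uint8.astype(np.float32) / 127.5 - 1.0
```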
In yet another embodiment of the present invention, a terminal device is provided that includes a processor and a memory for storing a computer program comprising program instructions, the processor being configured to execute the program instructions stored in the computer storage medium. The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. It is the computing core and control core of the terminal, adapted to load and execute one or more instructions to implement a corresponding method flow or function. The processor according to the embodiment of the present invention may be used for the feature decomposition operation of multi-modal image block matching, including:
making a data set; normalizing the pixel values of all image blocks in the data set to [-1, 1]; inputting a pair of matched, normalized visible light optical and sar image blocks, and extracting the image block features of the optical image block and the sar image block respectively with an encoder; performing feature decomposition on the features of the optical image block and the sar image block to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block; using a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block, and an sar image from the common features S_c and private features S_p of the sar image block; sending the two sets of common features to a discriminator, which distinguishes whether they come from optical or sar image blocks; optimizing the encoder with the triplet loss; calculating the reconstruction losses between the reconstructed optical and sar images and the corresponding original images; introducing an adversarial loss to optimize the encoder and the discriminator so that the encoder fools the discriminator; constraining common and private features with the difference loss; optimizing the feature decomposition network through the triplet, adversarial, reconstruction and difference losses to obtain trained weights for subsequent testing with the common features; and loading the trained weights into the feature decomposition network model, sequentially reading all test set data, and predicting the matching probability of each pair of image blocks in the test set to obtain the final image block matching result.
In still another embodiment of the present invention, the present invention further provides a storage medium, specifically a computer-readable storage medium (Memory), which is a Memory device in a terminal device and is used for storing programs and data. It is understood that the computer readable storage medium herein may include a built-in storage medium in the terminal device, and may also include an extended storage medium supported by the terminal device. The computer-readable storage medium provides a storage space storing an operating system of the terminal. Also, one or more instructions, which may be one or more computer programs (including program code), are stored in the memory space and are adapted to be loaded and executed by the processor. It should be noted that the computer-readable storage medium may be a high-speed RAM memory, or may be a non-volatile memory (non-volatile memory), such as at least one disk memory.
One or more instructions stored in a computer-readable storage medium may be loaded and executed by a processor to perform the corresponding steps of feature decomposition for multi-modal image block matching in the above embodiments; one or more instructions in the computer-readable storage medium are loaded by the processor and perform the steps of:
making a data set; normalizing the pixel values of all image blocks in the data set to [-1, 1]; inputting a pair of matched, normalized visible light optical and sar image blocks, and extracting the image block features of the optical image block and the sar image block respectively with an encoder; performing feature decomposition on the features of the optical image block and the sar image block to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block; using a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block, and an sar image from the common features S_c and private features S_p of the sar image block; sending the two sets of common features to a discriminator, which distinguishes whether they come from optical or sar image blocks; optimizing the encoder with the triplet loss; calculating the reconstruction losses between the reconstructed optical and sar images and the corresponding original images; introducing an adversarial loss to optimize the encoder and the discriminator so that the encoder fools the discriminator; constraining common and private features with the difference loss; optimizing the feature decomposition network through the triplet, adversarial, reconstruction and difference losses to obtain trained weights for subsequent testing with the common features; and loading the trained weights into the feature decomposition network model, sequentially reading all test set data, and predicting the matching probability of each pair of image blocks in the test set to obtain the final image block matching result.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Simulation experiment conditions are as follows:
the hardware platform of the simulation experiment of the invention:
Intel(R) Core5 processor of a Dell computer, main frequency 3.20 GHz, memory 64 GB; the simulation software platform is: Spyder (Python 3.5).
Simulation experiment content and result analysis:
the simulation experiment of the invention is divided into two simulation experiments.
Referring to FIG. 3, the present invention uses a public data set. 323464 patch pairs are cut from 1500 pairs of pixel-level aligned visible light and sar images of size 512 × 512. Of these patch pairs, 246121 are used for training and the rest for testing. The patch size is 32 × 32. The invention uses the trained network weights to predict the matching probability of each group of data in the test set; the matching results obtained are shown in FIG. 4. The first row in the figure shows optical image blocks and the second row sar image blocks. The first column shows true matching patches, the second column true non-matching patches, the third column false matching patches, and the fourth column false non-matching patches. It can be seen that the mismatched image blocks can share very similar appearance and semantic information, while the appearances of the false non-matching patches are quite dissimilar. It is therefore very difficult to match these confusable image blocks by appearance alone.
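The patch construction described above can be sketched as follows; the grid stride and non-overlapping layout are illustrative assumptions, not the patent's actual sampling scheme:

```python
import numpy as np

def crop_patch_pairs(opt_img, sar_img, size=32, stride=32):
    """Cut aligned (size x size) patch pairs from a pixel-level registered
    optical/sar image pair. opt_img, sar_img: 2-D arrays of the same shape
    (e.g. 512 x 512); returns a list of (optical_patch, sar_patch) tuples."""
    assert opt_img.shape == sar_img.shape
    h, w = opt_img.shape
    pairs = []
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            pairs.append((opt_img[y:y + size, x:x + size],
                          sar_img[y:y + size, x:x + size]))
    return pairs
```

Because the images are registered at the pixel level, a patch and its partner at the same coordinates form a positive (matched) pair; patches at different coordinates serve as negatives.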
Simulation experiment 1
The performance of the present invention is compared with the prior art, as shown in Table 1 below.
TABLE 1
Table 1 shows the image block matching results of the method of the present invention and several existing methods, namely Match-Net, L2-Net and HardNet. The evaluation indexes used are FPR95 and accuracy; a smaller FPR95 indicates a better matching effect. FDNet denotes the results of the present invention, and bold indicates the best results.
It can be seen that the FDNet proposed by the present invention achieves the best matching performance in terms of both FPR95 and matching accuracy. Specifically, FPR95 is reduced by 23.1%, 15.7% and 3.1% compared with Match-Net, L2-Net and HardNet, respectively. Meanwhile, the accuracy of the method is improved by 12.4%, 6.7% and 1.4%, respectively. Furthermore, the performance of HardNet depends on the size of the mini-batch: the larger the batch size, the better the performance, and when the batch size of HardNet is increased, its performance improves remarkably. However, as the batch size increases, so do the memory and computation costs. FDNet achieves the best results even with a small batch size.
Simulation experiment 2
Using the method of the present invention, the distribution visualization results of the descriptors extracted by HardNet and FDNet are shown in FIG. 5. FIG. 5(a) is the HardNet visualization, where opt denotes descriptors of the visible light image and sar denotes descriptors of the sar image. FIG. 5(b) is the FDNet visualization, where opt_com denotes the common features of the visible light image, sar_com the common features of the sar image, opt_pri the private features of the visible light image, and sar_pri the private features of the sar image. Keypoint detection is performed on the same pair of visible-sar images using SIFT, and 32 × 32 image blocks are cropped around the keypoints. The image blocks are input into HardNet and FDNet to extract descriptors, which are then processed and visualized separately.
It can be seen that HardNet constrains the descriptor distributions of the visible light image and the sar image by sharing weights, with a certain effect but with room for improvement. FDNet constrains the descriptor distribution through feature decomposition and adversarial training, separating the private and common features of the visible light image and the sar image while making the common feature distributions of the two modalities closer, and thus obtains a better matching effect.
In conclusion, experimental results show that the feature decomposition method for multi-modal image block matching disclosed by the present invention achieves a good effect in the multi-modal image matching task.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.
Claims (10)
1. A method of feature decomposition for multi-modal patch matching, comprising the steps of:
S1, making a data set;
S2, normalizing the pixel values of all image blocks in the data set produced in step S1 to [-1, 1];
S3, inputting a pair of matched visible light optical and sar image blocks normalized in step S2, and extracting the image block features of the optical image block and the sar image block respectively with an encoder;
S4, performing feature decomposition on the features of the optical image block and the features of the sar image block obtained in step S3, to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block;
S5, using a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block obtained in step S4, and an sar image from the common features S_c and private features S_p of the sar image block;
S6, sending the two sets of common features obtained in step S4 to a discriminator, which distinguishes whether each set comes from an optical image block or an sar image block;
S7, optimizing the encoder using the triplet loss; calculating the reconstruction losses between the reconstructed optical and sar images obtained in step S5 and the corresponding original images; introducing an adversarial loss to optimize the encoder and the discriminator in step S6 so that the encoder fools the discriminator; constraining common and private features using the difference loss; and optimizing the feature decomposition network through the triplet, adversarial, reconstruction and difference losses to obtain trained weights for subsequent testing with the common features;
and S8, loading the weights trained in the step S7 into the feature decomposition network model, reading all the test set data in sequence, predicting the matching probability of each pair of image blocks in the test set, and obtaining the final image block matching result.
2. The method of claim 1, wherein in step S3, the encoder includes five convolutional layers, the size of the convolutional kernel is 3 x 3, and the number of convolutional kernels is 32, 32, 64, 64 and 128, respectively.
3. The method of claim 1, wherein in step S4, the feature decomposition module includes two convolutional layers, the convolutional kernel size of the first convolutional layer is 3 × 3, and the number of channels is 128; the convolution kernel size of the second layer is 8 × 8 and the number of channels is 128.
4. The method according to claim 1, wherein in step S4, the optical common module and the sar common module are used to extract the common features from the optical image block features and the sar image block features respectively, and the optical private module F_po and the sar private module F_ps are used to extract the private features from the optical image block features and the sar image block features respectively, obtaining the common and private features of the optical image and the sar image:

O_c,i, O_p,i = F_c(x_opt,i), F_po(x_opt,i)

S_c,i, S_p,i = F_c(x_sar,i), F_ps(x_sar,i)

wherein O_c,i and O_p,i represent the common and private features of the optical image block, and S_c,i and S_p,i represent the common and private features of the sar image block.
5. The method according to claim 1, wherein in step S5, the decoder comprises two fully-connected layers and four convolutional layers, and the number of neurons in the input layer, the hidden layer and the output layer is 256, 512 and 1024 respectively; the sizes of the convolution kernels are 3 × 3, 5 × 5, and 7 × 7, and the numbers of the convolution kernels are 32, 64, 128, and 1, respectively.
7. The method of claim 1, wherein in step S6, the discriminator comprises five fully-connected layers, and the numbers of neurons in the fully-connected layers are 128, 512 and 2.
8. The method according to claim 1, wherein in step S7, jointly optimizing the four losses results in a final loss function as follows:
L = L_tri + L_adv + λL_diff + L_rec

wherein L_tri is the triplet loss, L_adv is the adversarial loss between the encoder and the discriminator, L_diff is the difference loss, L_rec is the reconstruction loss, and λ is the weight in the loss function.
9. The method of claim 8, wherein the triplet loss L_tri is computed from the distances:

d_pos = d(O_c,i, S_c,i)

d_neg = min(d(O_c,j, S_c,i), d(O_c,i, S_c,j)), i ≠ j

wherein d_pos and d_neg are the Euclidean distances of the positive image block pair and the hard negative block pair respectively;
the difference loss is as follows:

L_diff = Σ_{i=1}^{n} ( ||O_c,i^T O_p,i||_F^2 + ||S_c,i^T S_p,i||_F^2 )

wherein O_c,i^T is the transpose of the common features of the optical image block, O_p,i are the private features of the optical image block, S_c,i^T is the transpose of the common features of the sar image block, S_p,i are the private features of the sar image block, i = 1, …, n, and n is the number of samples;
the reconstruction loss L_rec is calculated from the reconstructed image and the real image as follows:

L_rec = (1/n) Σ_{i=1}^{n} ( ||x̂_opt,i − x_opt,i||_2^2 + ||x̂_sar,i − x_sar,i||_2^2 ) / k

wherein k is the number of pixels in each image block, x̂_opt,i is the reconstructed optical image block, x_opt,i is the input optical image block, x̂_sar,i is the reconstructed sar image block, and x_sar,i is the input sar image block;
the loss function of the discriminator is:

L_D = −E[log D(F_c(x_opt,i))] − E[log(1 − D(F_c(x_sar,i)))]

wherein D is the discriminator, E denotes expectation, F_c(x_sar,i) are the features extracted by the encoder from the input sar image block, and F_c(x_opt,i) are the features extracted by the encoder from the input optical image block;
the adversarial loss function of the encoder is as follows:

L_adv = −E[log D(F_c(x_sar,i))]

wherein E denotes expectation and F_c(x_sar,i) are the features extracted by the encoder from the input sar image block.
10. A feature decomposition system for multi-modal patch matching, comprising:
a data module for making a data set;
the preprocessing module normalizes the pixel values of all image blocks in a data set manufactured by the data module to [ -1, 1 ];
the feature module is used for inputting a pair of matched visible light optical and sar image blocks normalized by the preprocessing module, and extracting the image block features of the optical image block and the sar image block respectively with an encoder;
a decomposition module for performing feature decomposition on the features of the optical image block and the sar image block obtained by the feature module, to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block;
a reconstruction module that uses a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block obtained by the decomposition module, and an sar image from the common features S_c and private features S_p of the sar image block;
a distinguishing module for sending the two sets of common features obtained by the decomposition module to the discriminator, which distinguishes whether each set comes from an optical image block or an sar image block;
an optimization module for optimizing the encoder using the triplet loss; calculating the reconstruction losses between the reconstructed optical and sar images obtained by the reconstruction module and the corresponding original images; introducing an adversarial loss to optimize the encoder and the discriminator in the distinguishing module so that the encoder fools the discriminator; constraining common and private features using the difference loss; and optimizing the feature decomposition network through the triplet, adversarial, reconstruction and difference losses to obtain trained weights for subsequent testing with the common features;
and the output module loads the trained weight in the optimization module into the feature decomposition network model, sequentially reads all test set data, predicts the matching probability of each pair of image blocks in the test set and obtains the final image block matching result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110605524.3A CN113221923B (en) | 2021-05-31 | 2021-05-31 | Feature decomposition method and system for multi-mode image block matching |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110605524.3A CN113221923B (en) | 2021-05-31 | 2021-05-31 | Feature decomposition method and system for multi-mode image block matching |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113221923A true CN113221923A (en) | 2021-08-06 |
CN113221923B CN113221923B (en) | 2023-02-24 |
Family
ID=77081931
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110605524.3A Active CN113221923B (en) | 2021-05-31 | 2021-05-31 | Feature decomposition method and system for multi-mode image block matching |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113221923B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114722407A (en) * | 2022-03-03 | 2022-07-08 | 中国人民解放军战略支援部队信息工程大学 | Image protection method based on endogenous countermeasure sample |
CN115601576A (en) * | 2022-12-12 | 2023-01-13 | 云南览易网络科技有限责任公司(Cn) | Image feature matching method, device, equipment and storage medium |
CN116597177A (en) * | 2023-03-08 | 2023-08-15 | 西北工业大学 | Multi-source image block matching method based on dual-branch parallel depth interaction cooperation |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105678733A (en) * | 2014-11-21 | 2016-06-15 | 中国科学院沈阳自动化研究所 | Infrared and visible-light different-source image matching method based on context of line segments |
CN108510532A (en) * | 2018-03-30 | 2018-09-07 | 西安电子科技大学 | Optics and SAR image registration method based on depth convolution GAN |
CN108564606A (en) * | 2018-03-30 | 2018-09-21 | 西安电子科技大学 | Heterologous image block matching method based on image conversion |
CN110659680A (en) * | 2019-09-16 | 2020-01-07 | 西安电子科技大学 | Image patch matching method based on multi-scale convolution |
US20200130177A1 (en) * | 2018-10-29 | 2020-04-30 | Hrl Laboratories, Llc | Systems and methods for few-shot transfer learning |
WO2021028650A1 (en) * | 2019-08-13 | 2021-02-18 | University Of Hertfordshire Higher Education Corporation | Predicting visible/infrared band images using radar reflectance/backscatter images of a terrestrial region |
- 2021-05-31 CN CN202110605524.3A patent/CN113221923B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105678733A (en) * | 2014-11-21 | 2016-06-15 | 中国科学院沈阳自动化研究所 | Infrared and visible-light different-source image matching method based on context of line segments |
CN108510532A (en) * | 2018-03-30 | 2018-09-07 | 西安电子科技大学 | Optics and SAR image registration method based on depth convolution GAN |
CN108564606A (en) * | 2018-03-30 | 2018-09-21 | 西安电子科技大学 | Heterologous image block matching method based on image conversion |
US20200130177A1 (en) * | 2018-10-29 | 2020-04-30 | Hrl Laboratories, Llc | Systems and methods for few-shot transfer learning |
WO2021028650A1 (en) * | 2019-08-13 | 2021-02-18 | University Of Hertfordshire Higher Education Corporation | Predicting visible/infrared band images using radar reflectance/backscatter images of a terrestrial region |
CN110659680A (en) * | 2019-09-16 | 2020-01-07 | 西安电子科技大学 | Image patch matching method based on multi-scale convolution |
Non-Patent Citations (5)
Title |
---|
ANASTASIYA MISHCHUK ET AL.: "Working hard to know your neighbor's margins: Local descriptor learning loss", 《ARXIV[CS.CV]》 *
DOU DUAN ET AL.: "AFD-Net: Aggregated Feature Difference Learning for Cross-Spectral Image Patch Matching", 《2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》 * |
SHUANG WANG ET AL.: "Better and Faster: Exponential Loss for Image Patch Matching", 《2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》 * |
YURUN TIAN ET AL.: "L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space", 《2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 * |
WANG Ruojing: "Research on heterologous image block matching methods based on feature residual learning and image transformation", 《China Master's Theses Full-text Database, Information Science and Technology》 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114722407A (en) * | 2022-03-03 | 2022-07-08 | 中国人民解放军战略支援部队信息工程大学 | Image protection method based on endogenous countermeasure sample |
CN114722407B (en) * | 2022-03-03 | 2024-05-24 | 中国人民解放军战略支援部队信息工程大学 | Image protection method based on endogenic type countermeasure sample |
CN115601576A (en) * | 2022-12-12 | 2023-01-13 | 云南览易网络科技有限责任公司(Cn) | Image feature matching method, device, equipment and storage medium |
CN116597177A (en) * | 2023-03-08 | 2023-08-15 | 西北工业大学 | Multi-source image block matching method based on dual-branch parallel depth interaction cooperation |
Also Published As
Publication number | Publication date |
---|---|
CN113221923B (en) | 2023-02-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107945204B (en) | Pixel-level image matting method based on generation countermeasure network | |
Liu et al. | CNN-enhanced graph convolutional network with pixel-and superpixel-level feature fusion for hyperspectral image classification | |
CN113221923B (en) | Feature decomposition method and system for multi-mode image block matching | |
CN112434721A (en) | Image classification method, system, storage medium and terminal based on small sample learning | |
CN113822209B (en) | Hyperspectral image recognition method and device, electronic equipment and readable storage medium | |
CN111768457B (en) | Image data compression method, device, electronic equipment and storage medium | |
Chen et al. | DR-TANet: Dynamic receptive temporal attention network for street scene change detection |
CN110222718B (en) | Image processing method and device | |
Shu et al. | Multiple channels local binary pattern for color texture representation and classification | |
Li et al. | SLViT: Shuffle-convolution-based lightweight Vision transformer for effective diagnosis of sugarcane leaf diseases | |
CN113033454B (en) | Method for detecting building change in urban video shooting | |
CN108898269A (en) | Electric power image-context impact evaluation method based on measurement | |
CN110659680B (en) | Image patch matching method based on multi-scale convolution | |
CN111325766A (en) | Three-dimensional edge detection method and device, storage medium and computer equipment | |
CN111639230B (en) | Similar video screening method, device, equipment and storage medium | |
CN117152823A (en) | Multi-task age estimation method based on dynamic cavity convolution pyramid attention | |
CN112434576A (en) | Face recognition method and system based on depth camera | |
CN110969128A (en) | Method for detecting infrared ship under sea surface background based on multi-feature fusion | |
Kaddar et al. | Divnet: efficient convolutional neural network via multilevel hierarchical architecture design | |
CN115098646A (en) | Multilevel relation analysis and mining method for image-text data | |
CN117437557A (en) | Hyperspectral image classification method based on double-channel feature enhancement | |
CN115063359A (en) | Remote sensing image change detection method and system based on adversarial dual autoencoder network |
CN115035377A (en) | Significance detection network system based on double-stream coding and interactive decoding | |
CN114445468A (en) | Heterogeneous remote sensing image registration method and system | |
CN114972155A (en) | Polyp image segmentation method based on context information and reverse attention |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||