CN113221923A - Feature decomposition method and system for multi-mode image block matching - Google Patents


Info

Publication number
CN113221923A
Authority
CN
China
Prior art keywords
features
sar
image block
image
private
Prior art date
Legal status
Granted
Application number
CN202110605524.3A
Other languages
Chinese (zh)
Other versions
CN113221923B (en)
Inventor
王爽
魏慧媛
李毅
段宝瑞
权豆
雷睿琪
杨博武
焦李成
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University
Priority to CN202110605524.3A
Publication of CN113221923A
Application granted
Publication of CN113221923B
Status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 — Arrangements for image or video recognition or understanding
    • G06V 10/40 — Extraction of image or video features
    • G06V 10/50 — Extraction of features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; projection analysis
    • G06V 10/44 — Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/443 — Local feature extraction by matching or filtering
    • G06V 10/46 — Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; salient regional features
    • G06V 10/462 — Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/045 — Combinations of networks
    • G06N 3/061 — Physical realisation using biological neurons, e.g. biological neurons connected to an integrated circuit
    • G06N 3/08 — Learning methods


Abstract

The invention discloses a feature decomposition method and system for multi-modal image block matching. A dataset of heterogeneous image block pairs is constructed; the images are preprocessed; an encoder extracts features; the features are decomposed; a decoder reconstructs the image blocks; a discriminator is applied to the common features; the network is optimized; the matching probability is predicted; and finally the network performance is evaluated. The features of each image block are decomposed into common features and private features, adversarial training is introduced to optimize the encoder, a reconstruction loss ensures that the encoder extracts informative features, and the original image is reconstructed from the common and private features, yielding the final image block matching result. By jointly optimizing four loss functions, the method both greatly improves the accuracy of heterogeneous image matching and shortens the training period of the network.

Description

Feature decomposition method and system for multi-mode image block matching
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a feature decomposition method and system for multi-modal image block matching, which can be used for target tracking, heterogeneous image registration, image retrieval and the like, and can effectively improve the matching accuracy of heterogeneous images.
Background
The purpose of image block matching is to establish local correspondence between image blocks. It has important applications in remote sensing image processing tasks such as image registration, change detection, image stitching, and image fusion. In recent years, acquiring richer information from images taken by different sensors has become a trend. However, multi-modal images such as optical and sar images differ greatly in appearance and texture, which makes image block matching difficult.
Traditional image block matching methods rely on manually constructed descriptors such as SIFT and obtain correspondences between image blocks according to feature distances. However, the accuracy and robustness of these methods still leave room for improvement. In recent years, deep learning has achieved remarkable results in the field of image processing. Descriptors learned by deep networks, such as MatchNet, L2Net, and HardNet, are more adaptive and robust than hand-crafted descriptors. These methods achieve good results on homogeneous patch matching tasks but still have limitations on multi-modal patch matching, especially matching optical and sar patches: because optical and sar images differ greatly in appearance and texture, accurate matching is difficult.
One existing method (L2Net) adopts a progressive sampling strategy so that the network sees a large number of training samples within a few epochs, emphasizes the relative distance between descriptors, additionally supervises the intermediate feature maps, takes the compactness of the descriptors into account, matches descriptors by their L2 distance in Euclidean space, and achieves very significant performance. Another (HardNet) uses a batch-based sampling strategy to mine hard negative samples, alleviating the sample imbalance problem of L2Net by maximizing the distance between the closest positive and closest negative sample within a batch, and performs even better. Neither method considers decomposing the features.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method and a system for feature decomposition for multi-modal image block matching, which significantly improve the performance of multi-modal image matching.
The invention adopts the following technical scheme:
a feature decomposition method for multi-modal patch matching, comprising the steps of:
s1, making a data set;
s2, normalizing the pixel values of all image blocks in the data set manufactured in the step S1 to [ -1, 1 ];
s3, inputting a pair of visible light optical image blocks and sar image blocks which are matched after normalization in the step S2, and respectively extracting the image block characteristics of the optical image blocks and the sar image blocks by using an encoder;
s4, performing feature decomposition on the features of the optical image blocks and the features of the sar image blocks obtained in the step S3 to obtain the common features O of the optical image blockscAnd private characteristics OpCommon features S of sar image blockscAnd private features Sp
S5, using decoder to process the common characteristic O of the optical image block obtained in the step S4cAnd private characteristics OpReconstruction of common features S of optical, sar, image blockscAnd private features SpReconstructing an sar image;
s6, sending the two public features obtained in the step S4 to a discriminator, and distinguishing the two public features from optical image blocks or sar image blocks through the discriminator;
s7, optimizing the encoder by using the triple loss; calculating reconstruction loss of the reconstructed optical image and the sar image obtained in the step S5 and the corresponding original images respectively; introducing countermeasures to optimize the encoder and the discriminator in step S6, so that the encoder spoofs the discriminator; constraining public and private features using differential losses; optimizing the feature decomposition network through triple loss, countermeasure loss, reconstruction loss and difference loss to obtain a training weight for subsequent testing of common features;
and S8, loading the weights trained in the step S7 into the feature decomposition network model, reading all the test set data in sequence, predicting the matching probability of each pair of image blocks in the test set, and obtaining the final image block matching result.
Specifically, in step S3, the encoder includes five convolution layers, the size of the convolution kernel is 3 × 3, and the number of convolution kernels is 32, 32, 64, and 128, respectively.
Specifically, in step S4, the feature decomposition module includes two convolutional layers, the convolutional kernel size of the first convolutional layer is 3 × 3, and the number of channels is 128; the convolution kernel size of the second layer is 8 × 8 and the number of channels is 128.
Specifically, in step S4, the common module F_c is used to extract the common features from the optical image block features and the sar image block features, and the optical private module F_po and the sar private module F_ps are used to extract the private features from the optical and sar image block features, respectively, giving the common and private features of the optical and sar images:

O_c,i, O_p,i = F_c(x_opt,i), F_po(x_opt,i)

S_c,i, S_p,i = F_c(x_sar,i), F_ps(x_sar,i)

where O_c,i and O_p,i denote the common and private features of the optical image block, and S_c,i and S_p,i denote the common and private features of the sar image block.
Specifically, in step S5, the decoder includes two fully-connected layers and four convolutional layers, and the number of neurons in the input layer, the hidden layer, and the output layer is 256, 512, and 1024, respectively; the sizes of the convolution kernels are 3 × 3, 5 × 5, and 7 × 7, and the numbers of the convolution kernels are 32, 64, 128, and 1, respectively.
Specifically, in step S5, reconstructing the input image block based on the public and private features of the input image is as follows:
Figure BDA0003093968570000041
Figure BDA0003093968570000042
where De is the number of bits in the decoder,
Figure BDA0003093968570000043
is a reconstructed block of the optical image,
Figure BDA0003093968570000044
is the reconstructed sar image block.
Specifically, in step S6, the discriminator includes five fully-connected layers; the numbers of neurons are 128, 512, and 2, respectively.
Specifically, in step S7, the final loss function obtained by jointly optimizing the four losses is as follows:
L = L_tri + L_adv + λL_diff + L_rec

where L_tri is the triplet loss, L_adv is the adversarial loss between the encoder and the discriminator, L_diff is the difference loss, L_rec is the reconstruction loss, and λ is a weight in the loss function.
Further, the triplet loss L_tri is:

L_tri = (1/n) Σ_i max(0, 1 + d_pos - d_neg)

where d_pos and d_neg are the Euclidean distances of the positive image block pair and the hard negative pair, respectively;

the difference loss is as follows:

L_diff = Σ_i ( ||O_c,i^T O_p,i||_F^2 + ||S_c,i^T S_p,i||_F^2 )

where O_c,i^T is the transpose of the common feature of the optical image block, O_p,i is the private feature of the optical image block, S_c,i^T is the transpose of the common feature of the sar image block, S_p,i is the private feature of the sar image block, and i = 1, …, n with n the number of samples;

the reconstruction loss L_rec is calculated from the reconstructed and real images as follows:

L_rec = (1/n) Σ_i (1/k) ( ||x̂_opt,i - x_opt,i||_2^2 + ||x̂_sar,i - x_sar,i||_2^2 )

where k is the number of pixels in each image block, x̂_opt,i is the reconstructed optical image block, x_opt,i is the input optical image block, x̂_sar,i is the reconstructed sar image block, and x_sar,i is the input sar image block;

the loss function of the discriminator is:

L_D = -E[log D(F_c(x_opt,i))] - E[log(1 - D(F_c(x_sar,i)))]

where D is the discriminator, E denotes expectation, F_c(x_sar,i) are the features extracted by the encoder from the input sar image block, and F_c(x_opt,i) are the features extracted by the encoder from the input optical image block;

the adversarial loss of the encoder is as follows:

L_adv = -E[log D(F_c(x_sar,i))]

where E denotes expectation and F_c(x_sar,i) are the features extracted by the encoder from the input sar image block.
Another technical solution of the present invention is a system for feature decomposition for multi-modal patch matching, comprising:
a data module for making a data set;
the preprocessing module normalizes the pixel values of all image blocks in a data set manufactured by the data module to [ -1, 1 ];
the feature module inputs a pair of matched visible-light optical and sar image blocks normalized by the preprocessing module, and uses an encoder to extract the image block features of the optical image block and of the sar image block respectively;
the decomposition module performs feature decomposition on the features of the optical image block and of the sar image block obtained by the feature module to obtain the common features O_c and private features O_p of the optical image block, and the common features S_c and private features S_p of the sar image block;
the reconstruction module uses a decoder to reconstruct the optical image from the common features O_c and private features O_p of the optical image block obtained by the decomposition module, and the sar image from the common features S_c and private features S_p of the sar image block;
the distinguishing module sends the two common features obtained by the decomposition module to the discriminator, which determines whether each comes from an optical image block or a sar image block;
the optimization module optimizes the encoder using the triplet loss; calculates reconstruction losses between the reconstructed optical and sar images obtained by the reconstruction module and the corresponding original images; introduces an adversarial loss to optimize the encoder and the discriminator of the distinguishing module so that the encoder fools the discriminator; constrains the common and private features using a difference loss; and optimizes the feature decomposition network through the triplet, adversarial, reconstruction and difference losses to obtain trained weights, the common features being used for subsequent testing;
and the output module loads the weights trained by the optimization module into the feature decomposition network model, reads all test set data in sequence, and predicts the matching probability of each pair of image blocks in the test set to obtain the final image block matching result.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention relates to a feature decomposition method for multi-modal image block matching, which decomposes the features of image blocks into public features and private features through a feature decomposition module, uses the public features for image block matching, and eliminates the influence of larger difference among the multi-modal image blocks, thereby obtaining better matching effect.
Further, an encoder is used for extracting image block features, common characteristics of the bottom layers of the heterogeneous images are deeply mined, and the obtained features are used for subsequent feature decomposition.
Further, in order to eliminate the influence of large differences between the multi-modal image blocks, the feature decomposition module decomposes the features of the image blocks into public features and private features.
Further, compared with matching with all the features of an image, discarding the private features and using only the common features for image block matching eliminates modal differences and yields better matching results.
Further, to ensure that the learned features contain valid information, a decoder is used to reconstruct the image.
Further, the original image is reconstructed from the common and private features, ensuring that the encoder extracts informative features.
Further, in order to ensure that the common features learned by the optical image and the sar image are similar, a discriminator is introduced to identify the corresponding modalities and perform counterstudy, and the purpose of the discriminator is to distinguish the common features of the optical image from the common features of the sar image.
Further, the network is optimized through the final loss, and the weight is continuously adjusted to obtain a high matching result.
Further, the encoder is optimized using the triplet losses to make the distance between matched pairs as close as possible and the distance between difficult mismatched pairs as far as possible. In order to extract consistent common features, countermeasures are introduced to optimize the encoder and the discriminator, and the common features learned by the optical image and the sar image are similar. In addition, the reconstruction loss is utilized to ensure that the encoder can extract the informative features in order to reconstruct the original image based on the public and private features. The public and private features are constrained to be different using a disparity penalty.
In conclusion, the method utilizes the four loss functions to jointly optimize, so that the accuracy of the heterogeneous image matching is greatly improved, and the training period of the network is shortened.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of a network framework of the present invention;
FIG. 3 is some exemplary diagrams of the heterogeneous source data sets used in simulation experiments in accordance with the present invention;
FIG. 4 is a diagram illustrating image block matching results according to the present invention;
FIG. 5 shows the distribution visualization of extracted descriptors, wherein (a) is HardNet and (b) is FDNet.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Various structural schematics according to the disclosed embodiments of the invention are shown in the drawings. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity of presentation. The shapes of various regions, layers and their relative sizes and positional relationships shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, according to actual needs.
The invention provides a feature decomposition method for multi-modal image block matching: a dataset of heterogeneous image block pairs is constructed; the images are preprocessed; an encoder extracts features; the features are decomposed; a decoder reconstructs the image blocks; a discriminator is applied; the network is optimized; the matching probability is predicted; and finally the network performance is evaluated. The features of each image block are decomposed into common and private features, and adversarial training is introduced to optimize the encoder and the discriminator. In addition, a reconstruction loss ensures that the encoder extracts the informative features needed to reconstruct the original image from the common and private features. The method achieves good results in multi-modal image matching.
Referring to fig. 1 and fig. 2, a feature decomposition method for multi-modal tile matching according to the present invention includes the following steps:
s1, making a data set
323464 pairs of patches are cut from 1500 pairs of pixel-level-aligned 512 × 512 visible-light and sar images; 246121 patch pairs are used for training and the rest for testing. Each patch is 32 × 32;
s2, image preprocessing
Normalizing the pixel values of all image blocks to be between [ -1, 1 ];
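The preprocessing step can be sketched in a few lines. This NumPy snippet assumes 8-bit input patches and a linear rescale; the patent states only the target range [-1, 1]:

```python
import numpy as np

def normalize_patch(patch):
    # Linearly map 8-bit pixel values (0..255) onto [-1, 1].
    # The linear form is an assumption; the patent states only the range.
    return patch.astype(np.float32) / 127.5 - 1.0

patch = np.array([[0, 128, 255]], dtype=np.uint8)
out = normalize_patch(patch)
print(out.min(), out.max())  # -1.0 1.0
```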
s3, extracting image block characteristics by coder
The encoder is composed of five convolutional layers, the size of the convolutional kernel is 3 × 3, and the number of convolutional kernels is 32, 32, 64, and 128, respectively.
Inputting a pair of matched visible-light optical and sar image blocks P = (x_opt, x_sar), and extracting the features of the optical and sar image blocks respectively with two weight-sharing encoders;
s4, feature decomposition
The feature decomposition module is composed of two convolutional layers, the convolutional kernel size of the first convolutional layer is 3 x 3, and the number of channels is 128. The convolution kernel size of the second layer is 8 × 8 and the number of channels is 128.
The features of the optical image block and of the sar image block obtained in step S3 are decomposed to obtain the common features O_c and private features O_p of the optical image block, and the common features S_c and private features S_p of the sar image block.
The common module F_c extracts the common features from the optical image block features and the sar image block features obtained in step S3, and the optical private module F_po and the sar private module F_ps extract the private features from the optical and sar image block features, respectively. In practical applications, the feature decomposition module is implemented by two convolutional layers. Feature decomposition yields the common and private features of the optical and sar images:

O_c,i, O_p,i = F_c(x_opt,i), F_po(x_opt,i)

S_c,i, S_p,i = F_c(x_sar,i), F_ps(x_sar,i)

where O_c,i and O_p,i denote the common and private features of the optical image block, and S_c,i and S_p,i denote the common and private features of the sar image block.
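As a concrete illustration, steps S3 and S4 can be sketched in PyTorch. The strides, the full five-layer channel schedule (the text lists only 32, 32, 64, 128 for five layers) and the BatchNorm/ReLU placement are assumptions, chosen so that a 32 × 32 patch reaches the 8 × 8 map the decomposition module expects:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        chans = [1, 32, 32, 64, 64, 128]   # assumed channel schedule
        strides = [1, 2, 1, 2, 1]          # 32 -> 16 -> 8 spatial size
        layers = []
        for cin, cout, s in zip(chans[:-1], chans[1:], strides):
            layers += [nn.Conv2d(cin, cout, 3, stride=s, padding=1),
                       nn.BatchNorm2d(cout), nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)                # (B, 128, 8, 8)

class DecompositionHead(nn.Module):
    """3x3 conv (128 channels) followed by an 8x8 conv that collapses the
    feature map to a single 128-d descriptor, per step S4."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(128, 128, 3, padding=1)
        self.conv2 = nn.Conv2d(128, 128, 8)   # 8x8 kernel -> 1x1 output

    def forward(self, f):
        return self.conv2(torch.relu(self.conv1(f))).flatten(1)  # (B, 128)

encoder = Encoder()                 # shared weights for both modalities
F_c = DecompositionHead()           # common head (shared)
F_po, F_ps = DecompositionHead(), DecompositionHead()  # private heads

x_opt = torch.randn(4, 1, 32, 32)
x_sar = torch.randn(4, 1, 32, 32)
f_opt, f_sar = encoder(x_opt), encoder(x_sar)
O_c, O_p = F_c(f_opt), F_po(f_opt)
S_c, S_p = F_c(f_sar), F_ps(f_sar)
print(O_c.shape, S_p.shape)         # both torch.Size([4, 128])
```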
S5, reconstructing the image block by the decoder
The decoder is composed of two full-connection layers and four convolution layers, and the number of neurons of the input layer, the hidden layer and the output layer is 256, 512 and 1024 respectively. The sizes of the convolution kernels are 3 × 3, 5 × 5, and 7 × 7, and the numbers of the convolution kernels are 32, 64, 128, and 1, respectively.
To ensure that the learned features contain valid information, a decoder is used to reconstruct the optical image from the common features O_c and private features O_p of the optical image block obtained in step S4, and the sar image from the common features S_c and private features S_p of the sar image block;
The input image blocks are reconstructed from the common and private features of the input images:

x̂_opt,i = De(O_c,i, O_p,i)

x̂_sar,i = De(S_c,i, S_p,i)

where De is the decoder, x̂_opt,i is the reconstructed optical image block, and x̂_sar,i is the reconstructed sar image block. In the matching task, the common features of corresponding optical and sar image blocks should be as similar as possible.
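A hedged PyTorch sketch of the step-S5 decoder follows. The patent gives two fully-connected layers (256 → 512 → 1024) and four convolutions with 32, 64, 128 and 1 kernels of sizes 3 × 3, 5 × 5 and 7 × 7; the reshape of the 1024-d vector to a 16-channel 8 × 8 map, the upsampling positions, the last kernel size and the Tanh output are assumptions:

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Two fully-connected layers: 256 -> 512 -> 1024 (step S5).
        self.fc = nn.Sequential(nn.Linear(256, 512), nn.ReLU(inplace=True),
                                nn.Linear(512, 1024), nn.ReLU(inplace=True))
        # Four convolutions with 32, 64, 128 and 1 kernels; the upsampling
        # positions and the final 3x3 kernel are assumptions.
        self.deconv = nn.Sequential(
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(32, 64, 5, padding=2), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(64, 128, 7, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(128, 1, 3, padding=1), nn.Tanh())  # pixels in [-1, 1]

    def forward(self, common, private):
        z = torch.cat([common, private], dim=1)          # (B, 256)
        return self.deconv(self.fc(z).view(-1, 16, 8, 8))

dec = Decoder()
x_hat = dec(torch.randn(4, 128), torch.randn(4, 128))
print(x_hat.shape)  # torch.Size([4, 1, 32, 32])
```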
S6, discriminator
The discriminator is composed of five fully-connected layers; the numbers of neurons are 128, 512, and 2, respectively.
Using the discriminator, it is determined whether each of the two common features obtained in step S4 comes from an optical image block or a sar image block; the common features of corresponding optical and sar image blocks are expected to be as similar as possible;
s7, network optimization
Optimizing the network through a plurality of loss functions, including triple loss, countermeasure loss, reconstruction loss and difference loss;
the triplet losses are used to optimize the encoder so that the distance between matched pairs is as close as possible and the distance between difficult mismatched pairs is as far as possible. In order to extract consistent common features, countermeasures are introduced to optimize the encoder and the discriminator, and the common features learned by the optical image and the sar image are similar. In addition, the reconstruction loss is utilized to ensure that the encoder can extract the informative features in order to reconstruct the original image based on the public and private features. And uses differential losses to constrain the public and private features to be different. 100 epochs are trained; the learning rate of the encoder and the feature decomposition module is 1.0; the learning rate of the discriminator is 1.0; the learning rate of the decoder is 0.0001; the weight λ in the loss function is 0.001 and the Batchsize is 321.
S701, optimizing an encoder by adopting a difficult sample mining strategy and triple loss;
the distance between matched pairs is made as close as possible and the distance between difficult mismatched pairs is made as far as possible. The triad loss is as follows:
Figure BDA0003093968570000111
dpos=d(Oc,i,Sc,i)
dneg=min(d(Oc,j,Sc,i),d(Oc,i,Sc,j)),i≠j
wherein d isposAnd dnegThe euclidean distances of the pair of positive image blocks and the pair of difficult negative blocks, respectively.
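The batch-hard negative mining of step S701 can be illustrated with NumPy; the margin value of 1 is an assumption, as the patent does not state it here:

```python
import numpy as np

def triplet_loss(O_c, S_c, margin=1.0):
    # O_c, S_c: (n, d) common descriptors of n matching optical/sar pairs.
    # Pairwise Euclidean distances between all optical and sar descriptors.
    D = np.linalg.norm(O_c[:, None, :] - S_c[None, :, :], axis=-1)  # (n, n)
    d_pos = np.diag(D)                    # d(O_c,i, S_c,i)
    masked = D + np.eye(len(D)) * 1e9     # exclude the matching pairs
    # Hardest negative per anchor: min of d(O_c,j, S_c,i) and d(O_c,i, S_c,j).
    d_neg = np.minimum(masked.min(axis=0), masked.min(axis=1))
    return np.mean(np.maximum(0.0, margin + d_pos - d_neg))

# Well-separated descriptors: positives at distance 0, negatives far apart.
print(triplet_loss(np.eye(3), np.eye(3)))  # 0.0
```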
S702, a discriminator is introduced to identify the mode corresponding to the input public characteristic and carry out counterstudy;
the purpose of the discriminator is to distinguish the common features of the optical image from those of the sar image. Thus, the penalty function of the arbiter is:
Figure BDA0003093968570000121
wherein D is a discriminator, E is an entropy calculation, Fc(xsar,i) Features extracted by the encoder for the input sar image block, Fc(xopt,i) Features extracted by the encoder for the input optical image block.
The encoder expects to fool the discriminator into being unable to separate common features of different modalities.
The adversarial loss of the encoder is as follows:

L_adv = -E[log D(F_c(x_sar,i))]

where E denotes expectation and F_c(x_sar,i) are the features extracted by the encoder from the input sar image block.
S703, in the training process, alternately optimizing the encoder and the discriminator;
furthermore, public and private features should be different.
The difference loss is as follows:

L_diff = Σ_i ( ||O_c,i^T O_p,i||_F^2 + ||S_c,i^T S_p,i||_F^2 )

where O_c,i^T is the transpose of the common feature of the optical image block, O_p,i is the private feature of the optical image block, S_c,i^T is the transpose of the common feature of the sar image block, S_p,i is the private feature of the sar image block, and i = 1, …, n with n the number of samples;
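Reading each descriptor as a vector, O_c,i^T O_p,i is a scalar inner product, so the difference loss reduces to squared inner products between common and private descriptors. A NumPy sketch under that reading:

```python
import numpy as np

def difference_loss(O_c, O_p, S_c, S_p):
    # Each row is one sample's descriptor; the per-sample Frobenius norm of
    # O_c,i^T O_p,i reduces to the absolute inner product, so the loss sums
    # squared inner products between common and private descriptors.
    return (np.sum(np.einsum('nd,nd->n', O_c, O_p) ** 2)
            + np.sum(np.einsum('nd,nd->n', S_c, S_p) ** 2))

# Orthogonal common/private descriptors incur no penalty.
o_c, o_p = np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])
print(difference_loss(o_c, o_p, o_c, o_p))  # 0.0
```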
s704, cascading the public characteristic and the private characteristic and inputting the public characteristic and the private characteristic into a decoder to obtain a reconstructed image;
the reconstruction loss is calculated from the reconstructed image and the real image as follows:
L_rec = (1/(n·k)) Σ_{i=1}^{n} ( ||x̂_opt,i − x_opt,i||² + ||x̂_sar,i − x_sar,i||² )
where k is the number of pixels in each image block, x̂_opt,i is the reconstructed optical image block, x_opt,i is the input optical image block, x̂_sar,i is the reconstructed sar image block, and x_sar,i is the input sar image block.
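A sketch of the reconstruction loss, assuming a squared error averaged over the n patches and the k pixels of each patch (the exact norm is an assumption):

```python
import numpy as np

def reconstruction_loss(x_opt_rec, x_opt, x_sar_rec, x_sar):
    """Mean reconstruction error between decoded and input patches.

    Arrays have shape (n, k) with k pixels per patch; the squared-error
    penalty and the 1/(n*k) averaging are assumptions about the exact form.
    """
    n, k = x_opt.shape
    return (((x_opt_rec - x_opt) ** 2).sum() +
            ((x_sar_rec - x_sar) ** 2).sum()) / (n * k)
```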
S705, jointly optimizing four losses, wherein the final loss function is as follows:
L=Ltri+Ladv+λLdiff+Lrec
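As a sketch, the joint objective of S705 simply combines the four scalar losses; the weight λ applies only to the difference loss, and its value is not specified here (lam=1.0 below is a placeholder):

```python
def total_loss(l_tri, l_adv, l_diff, l_rec, lam=1.0):
    """Joint objective L = L_tri + L_adv + lam * L_diff + L_rec.

    lam is the weight lambda of the difference loss; its value is an
    assumption (the document gives only the symbolic form).
    """
    return l_tri + l_adv + lam * l_diff + l_rec
```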
S8, predicting matching probability
Loading the weights trained in step S7 into the model, reading all the test set data in sequence, and predicting the matching probability of each pair of image blocks in the test set;
S9, evaluating network performance
The performance of the network on the heterogeneous-source data sets is evaluated by FPR95.
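FPR95 (the false positive rate at 95% true positive rate; lower is better) can be computed from predicted matching scores as in the following sketch; treating higher scores as more likely matches is an assumption about the score convention:

```python
import numpy as np

def fpr95(scores, labels):
    """False positive rate at 95% true positive rate.

    scores: predicted matching scores (higher = more likely a match);
    labels: 1 for true matching pairs, 0 for non-matching pairs.
    """
    pos = np.sort(scores[labels == 1])
    # Threshold that keeps 95% of the positive pairs at or above it.
    thr = pos[int(np.floor(0.05 * len(pos)))]
    neg = scores[labels == 0]
    return float((neg >= thr).mean())
```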
In another embodiment of the present invention, a feature decomposition system for multi-modal image block matching is provided. The system can be used to implement the above-mentioned feature decomposition method for multi-modal image block matching; specifically, the feature decomposition system for multi-modal image block matching includes a data module, a preprocessing module, a feature module, a decomposition module, a reconstruction module, a distinguishing module, an optimization module, and an output module.
The data module is used for making a data set;
the preprocessing module normalizes the pixel values of all image blocks in a data set manufactured by the data module to [ -1, 1 ];
the feature module is used for inputting a pair of matched visible light optical image blocks and sar image blocks normalized by the preprocessing module, and extracting the image block features of the optical image block and the sar image block respectively with an encoder;
the decomposition module is used for performing feature decomposition on the features of the optical image block and the features of the sar image block obtained by the feature module to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block;
the reconstruction module uses a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block obtained by the decomposition module, and to reconstruct a sar image from the common features S_c and private features S_p of the sar image block;
the distinguishing module is used for sending the two public features obtained by the decomposition module to the discriminator and distinguishing the two public features from optical image blocks or sar image blocks through the discriminator;
an optimization module to optimize the encoder using the triplet loss; calculate the reconstruction losses between the reconstructed optical and sar images obtained by the reconstruction module and the corresponding original images; introduce adversarial training to optimize the encoder and the discriminator in the distinguishing module, so that the encoder fools the discriminator; constrain the common and private features using the difference loss; and optimize the feature decomposition network through the triplet loss, adversarial loss, reconstruction loss and difference loss to obtain trained weights whose common features are used in subsequent testing;
and the output module loads the trained weight in the optimization module into the feature decomposition network model, sequentially reads all test set data, predicts the matching probability of each pair of image blocks in the test set and obtains the final image block matching result.
In yet another embodiment of the present invention, a terminal device is provided that includes a processor and a memory for storing a computer program comprising program instructions, the processor being configured to execute the program instructions stored in the computer storage medium. The processor may be a Central Processing Unit (CPU), or another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.; it is the computing and control core of the terminal and is adapted to load and execute one or more instructions to implement a corresponding method flow or function. The processor according to the embodiment of the present invention may be used for the feature decomposition operation of multi-modal image block matching, including:
making a data set; normalizing the pixel values of all image blocks in the data set to [-1, 1]; inputting a pair of matched and normalized visible light optical image blocks and sar image blocks, and extracting the image block features of the optical image block and the sar image block respectively with an encoder; performing feature decomposition on the features of the optical image block and the features of the sar image block to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block; using a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block, and to reconstruct a sar image from the common features S_c and private features S_p of the sar image block; sending the two sets of common features to a discriminator, which distinguishes whether they come from optical image blocks or sar image blocks; optimizing the encoder using the triplet loss; calculating the reconstruction losses between the reconstructed optical and sar images and the corresponding original images; introducing adversarial training to optimize the encoder and the discriminator, so that the encoder fools the discriminator; constraining the common and private features using the difference loss; optimizing the feature decomposition network through the triplet loss, adversarial loss, reconstruction loss and difference loss to obtain trained weights whose common features are used in subsequent testing; and loading the trained weights into the feature decomposition network model, reading all the test set data in sequence, predicting the matching probability of each pair of image blocks in the test set, and obtaining the final image block matching result.
In still another embodiment of the present invention, a storage medium is further provided, specifically a computer-readable storage medium (Memory), which is a memory device in a terminal device used for storing programs and data. It is understood that the computer-readable storage medium herein may include a built-in storage medium in the terminal device, and may also include an extended storage medium supported by the terminal device. The computer-readable storage medium provides a storage space storing an operating system of the terminal. Also, one or more instructions, which may be one or more computer programs (including program code), are stored in the storage space and are adapted to be loaded and executed by the processor. It should be noted that the computer-readable storage medium may be a high-speed RAM memory, or a non-volatile memory such as at least one disk memory.
One or more instructions stored in the computer-readable storage medium may be loaded and executed by the processor to perform the corresponding steps of feature decomposition for multi-modal image block matching in the above embodiments; one or more instructions in the computer-readable storage medium are loaded by the processor and perform the following steps:
making a data set; normalizing the pixel values of all image blocks in the data set to [-1, 1]; inputting a pair of matched and normalized visible light optical image blocks and sar image blocks, and extracting the image block features of the optical image block and the sar image block respectively with an encoder; performing feature decomposition on the features of the optical image block and the features of the sar image block to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block; using a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block, and to reconstruct a sar image from the common features S_c and private features S_p of the sar image block; sending the two sets of common features to a discriminator, which distinguishes whether they come from optical image blocks or sar image blocks; optimizing the encoder using the triplet loss; calculating the reconstruction losses between the reconstructed optical and sar images and the corresponding original images; introducing adversarial training to optimize the encoder and the discriminator, so that the encoder fools the discriminator; constraining the common and private features using the difference loss; optimizing the feature decomposition network through the triplet loss, adversarial loss, reconstruction loss and difference loss to obtain trained weights whose common features are used in subsequent testing; and loading the trained weights into the feature decomposition network model, reading all the test set data in sequence, predicting the matching probability of each pair of image blocks in the test set, and obtaining the final image block matching result.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Simulation experiment conditions are as follows:
the hardware platform of the simulation experiment of the present invention:
a Dell computer with an Intel(R) Core i5 processor, a main frequency of 3.20 GHz, and 64 GB of memory. The simulation software platform: Spyder (Python 3.5).
Simulation experiment content and result analysis:
the simulation experiment of the invention is divided into two simulation experiments.
Referring to FIG. 3, the present invention uses a public data set. 323464 pairs of patch blocks are cropped from 1500 pairs of pixel-level aligned visible light and sar images of size 512 × 512. Of these, 246121 patch pairs are used for training and the rest for testing. The patch size is 32 × 32. The invention uses the trained network weights to predict the matching probability of each group of data in the test set, and the matching results obtained are shown in FIG. 4. The first row in the figure shows optical image blocks and the second row sar image blocks. The first column shows true matching patches, the second column true non-matching patches, the third column false matching patches, and the fourth column false non-matching patches. It can be seen that the non-matching image blocks of the false matching pairs share very similar appearance and semantic information, while the appearances of the false non-matching patches are quite dissimilar. It is therefore very difficult to match these confusable image blocks by appearance alone.
Simulation experiment 1
The performance of the present invention is compared with the prior art, as shown in Table 1 below.
TABLE 1
The image block matching results of the method of the present invention and several existing methods, namely Match-Net, L2-Net and HardNet, are shown. The evaluation indexes used are FPR95 and accuracy; a smaller FPR95 represents a better matching effect. FDNet denotes the results of the present invention, and bold indicates the best results.
It can be seen that the FDNet proposed by the present invention achieves the best matching performance in both FPR95 and matching accuracy. Specifically, FPR95 is reduced by 23.1%, 15.7% and 3.1% compared with Match-Net, L2-Net and HardNet, respectively, while accuracy improves by 12.4%, 6.7% and 1.4%. Furthermore, the performance of HardNet depends on the mini-batch size: the larger the batch size, the better the performance, and increasing HardNet's batch size improves it markedly. However, memory and computation also grow with the batch size. FDNet achieves the best results even with a small batch size.
Simulation experiment 2
Using the method of the present invention, the distribution visualization of the descriptors extracted by HardNet and FDNet is shown in FIG. 5. FIG. 5(a) is the HardNet visualization, where opt represents descriptors of the visible light image and sar represents descriptors of the sar image. FIG. 5(b) shows the FDNet visualization, where opt_com represents the common features of the visible light image, sar_com the common features of the sar image, opt_pri the private features of the visible light image, and sar_pri the private features of the sar image. Keypoint detection is performed on the same pair of visible-sar images using SIFT, and 32 × 32 image blocks are cropped around the keypoints. The image blocks are input into HardNet and FDNet respectively to extract descriptors, which are then processed and visualized separately.
It can be seen that HardNet constrains the descriptor distributions of the visible light and sar images through weight sharing, with some effect, but there is room for improvement. FDNet constrains the descriptor distribution through feature decomposition and adversarial training, separating the private features from the common features of the optical and sar images while making the common feature distributions of the visible light and sar images closer, thus obtaining a better matching effect.
In conclusion, the experimental results show that the feature decomposition method for multi-modal image block matching disclosed by the present invention achieves a good effect on the multi-modal image matching task.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (10)

1. A method of feature decomposition for multi-modal patch matching, comprising the steps of:
s1, making a data set;
S2, normalizing the pixel values of all image blocks in the data set produced in step S1 to [-1, 1];
S3, inputting a pair of matched visible light optical image blocks and sar image blocks normalized in step S2, and extracting the image block features of the optical image block and the sar image block respectively with an encoder;
S4, performing feature decomposition on the features of the optical image block and the features of the sar image block obtained in step S3 to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block;
S5, using a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block obtained in step S4, and to reconstruct a sar image from the common features S_c and private features S_p of the sar image block;
S6, sending the two sets of common features obtained in step S4 to a discriminator, which distinguishes whether they come from optical image blocks or sar image blocks;
S7, optimizing the encoder using the triplet loss; calculating the reconstruction losses between the reconstructed optical and sar images obtained in step S5 and the corresponding original images; introducing adversarial training to optimize the encoder and the discriminator of step S6, so that the encoder fools the discriminator; constraining the common and private features using the difference loss; optimizing the feature decomposition network through the triplet loss, adversarial loss, reconstruction loss and difference loss to obtain trained weights whose common features are used in subsequent testing;
S8, loading the weights trained in step S7 into the feature decomposition network model, reading all the test set data in sequence, predicting the matching probability of each pair of image blocks in the test set, and obtaining the final image block matching result.
2. The method of claim 1, wherein in step S3, the encoder includes five convolutional layers, the size of the convolutional kernel is 3 x 3, and the number of convolutional kernels is 32, 32, 64, 64 and 128, respectively.
3. The method of claim 1, wherein in step S4, the feature decomposition module includes two convolutional layers, the convolutional kernel size of the first convolutional layer is 3 × 3, and the number of channels is 128; the convolution kernel size of the second layer is 8 × 8 and the number of channels is 128.
4. The method according to claim 1, wherein in step S4, the optical common module and the sar common module are used respectively to extract the common features from the optical image block features and the sar image block features, and the optical-specific module F_po and the sar-specific module F_ps are used respectively to extract the private features from the optical image block features and the sar image block features, obtaining the common and private features of the optical image and the sar image:
O_c,i, O_p,i = F_c(x_opt,i), F_po(x_opt,i)
S_c,i, S_p,i = F_c(x_sar,i), F_ps(x_sar,i)
where O_c,i and O_p,i represent the common and private features of the optical image block, and S_c,i and S_p,i represent the common and private features of the sar image block.
5. The method according to claim 1, wherein in step S5, the decoder comprises two fully-connected layers and four convolutional layers, and the number of neurons in the input layer, the hidden layer and the output layer is 256, 512 and 1024 respectively; the sizes of the convolution kernels are 3 × 3, 5 × 5, and 7 × 7, and the numbers of the convolution kernels are 32, 64, 128, and 1, respectively.
6. The method according to claim 1, wherein in step S5, the input image blocks are reconstructed from the common and private features of the input images as follows:
x̂_opt,i = De(O_c,i, O_p,i)
x̂_sar,i = De(S_c,i, S_p,i)
where De is the decoder, x̂_opt,i is the reconstructed optical image block, and x̂_sar,i is the reconstructed sar image block.
7. The method of claim 1, wherein in step S6, the arbiter comprises five full-connection layers, and the number of each full-connection layer neuron is 128, 512 and 2.
8. The method according to claim 1, wherein in step S7, jointly optimizing the four losses results in a final loss function as follows:
L=Ltri+Ladv+λLdiff+Lrec
where L_tri is the triplet loss, L_adv is the adversarial loss of the encoder and discriminator, L_diff is the difference loss, L_rec is the reconstruction loss, and λ is a weight in the loss function.
9. The method of claim 8, wherein the triplet loss L_tri is:
L_tri = (1/n) Σ_{i=1}^{n} max(0, 1 + d_pos − d_neg)
where d_pos and d_neg are the Euclidean distances of the positive image block pair and the hard negative block pair, respectively;
the difference loss is as follows:
L_diff = Σ_{i=1}^{n} ( ||O_c,i^T O_p,i||_F² + ||S_c,i^T S_p,i||_F² )
where O_c,i^T is the transpose of the common features of the optical image block, O_p,i are the private features of the optical image block, S_c,i^T is the transpose of the common features of the sar image block, S_p,i are the private features of the sar image block, i = 1, …, n, and n is the number of samples;
the reconstruction loss L_rec is calculated from the reconstructed image and the real image as follows:
L_rec = (1/(n·k)) Σ_{i=1}^{n} ( ||x̂_opt,i − x_opt,i||² + ||x̂_sar,i − x_sar,i||² )
where k is the number of pixels in each image block, x̂_opt,i is the reconstructed optical image block, x_opt,i is the input optical image block, x̂_sar,i is the reconstructed sar image block, and x_sar,i is the input sar image block;
the loss function of the discriminator is:
L_D = −E[log D(F_c(x_opt,i))] − E[log(1 − D(F_c(x_sar,i)))]
where D is the discriminator, E denotes expectation, F_c(x_sar,i) are the features extracted by the encoder from the input sar image block, and F_c(x_opt,i) are the features extracted by the encoder from the input optical image block;
the loss function of the encoder is as follows:
L_adv = −E[log D(F_c(x_sar,i))]
where E denotes expectation and F_c(x_sar,i) are the features extracted by the encoder from the input sar image block.
10. A feature decomposition system for multi-modal patch matching, comprising:
a data module for making a data set;
the preprocessing module normalizes the pixel values of all image blocks in a data set manufactured by the data module to [ -1, 1 ];
the feature module is used for inputting a pair of matched visible light optical image blocks and sar image blocks normalized by the preprocessing module, and extracting the image block features of the optical image block and the sar image block respectively with an encoder;
the decomposition module is used for performing feature decomposition on the features of the optical image block and the sar image block obtained by the feature module to obtain the common features O_c and private features O_p of the optical image block and the common features S_c and private features S_p of the sar image block;
the reconstruction module uses a decoder to reconstruct an optical image from the common features O_c and private features O_p of the optical image block obtained by the decomposition module, and to reconstruct a sar image from the common features S_c and private features S_p of the sar image block;
the distinguishing module is used for sending the two public features obtained by the decomposition module to the discriminator and distinguishing the two public features from optical image blocks or sar image blocks through the discriminator;
an optimization module to optimize the encoder using the triplet loss; calculate the reconstruction losses between the reconstructed optical and sar images obtained by the reconstruction module and the corresponding original images; introduce adversarial training to optimize the encoder and the discriminator in the distinguishing module, so that the encoder fools the discriminator; constrain the common and private features using the difference loss; and optimize the feature decomposition network through the triplet loss, adversarial loss, reconstruction loss and difference loss to obtain trained weights whose common features are used in subsequent testing;
and the output module loads the trained weight in the optimization module into the feature decomposition network model, sequentially reads all test set data, predicts the matching probability of each pair of image blocks in the test set and obtains the final image block matching result.
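Claims 2 and 3 fix the encoder's kernel sizes (3 × 3) and channel counts (32, 32, 64, 64, 128) but not its strides or padding. As a sanity check, the following sketch propagates a 32 × 32 patch through the five convolutions under assumed strides (1, 1, 2, 1, 2) and padding 1, chosen so the final 128-channel map is 8 × 8 — the size the 8 × 8 kernel of the decomposition module in claim 3 would collapse to a 1 × 1 × 128 descriptor. The strides and padding are assumptions, not taken from the claims.

```python
def conv_out(size, kernel=3, stride=1, pad=1):
    # Standard convolution output-size formula.
    return (size + 2 * pad - kernel) // stride + 1

def encoder_shapes(size=32):
    """Propagate a 32x32 patch through the five 3x3 conv layers of claim 2.

    Channel counts come from the claim; strides and padding are assumed so
    that the final map matches the 8x8 kernel of claim 3's second layer.
    """
    channels = [32, 32, 64, 64, 128]
    strides = [1, 1, 2, 1, 2]  # assumption
    shapes = []
    for c, s in zip(channels, strides):
        size = conv_out(size, stride=s)
        shapes.append((c, size, size))
    return shapes
```

Under these assumptions the feature maps are 32 × 32, 32 × 32, 16 × 16, 16 × 16 and 8 × 8.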
CN202110605524.3A 2021-05-31 2021-05-31 Feature decomposition method and system for multi-mode image block matching Active CN113221923B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110605524.3A CN113221923B (en) 2021-05-31 2021-05-31 Feature decomposition method and system for multi-mode image block matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110605524.3A CN113221923B (en) 2021-05-31 2021-05-31 Feature decomposition method and system for multi-mode image block matching

Publications (2)

Publication Number Publication Date
CN113221923A true CN113221923A (en) 2021-08-06
CN113221923B CN113221923B (en) 2023-02-24

Family

ID=77081931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110605524.3A Active CN113221923B (en) 2021-05-31 2021-05-31 Feature decomposition method and system for multi-mode image block matching

Country Status (1)

Country Link
CN (1) CN113221923B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114722407A (en) * 2022-03-03 2022-07-08 中国人民解放军战略支援部队信息工程大学 Image protection method based on endogenous countermeasure sample
CN115601576A (en) * 2022-12-12 2023-01-13 Yunnan Lanyi Network Technology Co., Ltd. (CN) Image feature matching method, device, equipment and storage medium
CN116597177A (en) * 2023-03-08 2023-08-15 西北工业大学 Multi-source image block matching method based on dual-branch parallel depth interaction cooperation

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105678733A (en) * 2014-11-21 2016-06-15 中国科学院沈阳自动化研究所 Infrared and visible-light different-source image matching method based on context of line segments
CN108510532A (en) * 2018-03-30 2018-09-07 西安电子科技大学 Optics and SAR image registration method based on depth convolution GAN
CN108564606A (en) * 2018-03-30 2018-09-21 西安电子科技大学 Heterologous image block matching method based on image conversion
CN110659680A (en) * 2019-09-16 2020-01-07 西安电子科技大学 Image patch matching method based on multi-scale convolution
US20200130177A1 (en) * 2018-10-29 2020-04-30 Hrl Laboratories, Llc Systems and methods for few-shot transfer learning
WO2021028650A1 (en) * 2019-08-13 2021-02-18 University Of Hertfordshire Higher Education Corporation Predicting visible/infrared band images using radar reflectance/backscatter images of a terrestrial region


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ANASTASIYA MISHCHUK ET AL.: "Working hard to know your neighbor's margins: Local descriptor learning loss", arXiv [cs.CV]
DOU DUAN ET AL.: "AFD-Net: Aggregated Feature Difference Learning for Cross-Spectral Image Patch Matching", 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
SHUANG WANG ET AL.: "Better and Faster: Exponential Loss for Image Patch Matching", 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
YURUN TIAN ET AL.: "L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space", 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
WANG RUOJING: "Research on heterologous image patch matching methods based on feature residual learning and image translation", China Master's Theses Full-text Database, Information Science and Technology


Also Published As

Publication number Publication date
CN113221923B (en) 2023-02-24

Similar Documents

Publication Publication Date Title
CN107945204B (en) Pixel-level image matting method based on generation countermeasure network
Liu et al. CNN-enhanced graph convolutional network with pixel-and superpixel-level feature fusion for hyperspectral image classification
CN113221923B (en) Feature decomposition method and system for multi-mode image block matching
CN112434721A (en) Image classification method, system, storage medium and terminal based on small sample learning
CN113822209B (en) Hyperspectral image recognition method and device, electronic equipment and readable storage medium
CN111768457B (en) Image data compression method, device, electronic equipment and storage medium
Chen et al. DR-TANet: Dynamic receptive temporal attention network for street scene change detection
CN110222718B (en) Image processing method and device
Shu et al. Multiple channels local binary pattern for color texture representation and classification
Li et al. SLViT: Shuffle-convolution-based lightweight Vision transformer for effective diagnosis of sugarcane leaf diseases
CN113033454B (en) Method for detecting building change in urban video shooting
CN108898269A (en) Electric power image-context impact evaluation method based on measurement
CN110659680B (en) Image patch matching method based on multi-scale convolution
CN111325766A (en) Three-dimensional edge detection method and device, storage medium and computer equipment
CN111639230B (en) Similar video screening method, device, equipment and storage medium
CN117152823A (en) Multi-task age estimation method based on dynamic cavity convolution pyramid attention
CN112434576A (en) Face recognition method and system based on depth camera
CN110969128A (en) Method for detecting infrared ship under sea surface background based on multi-feature fusion
Kaddar et al. Divnet: efficient convolutional neural network via multilevel hierarchical architecture design
CN115098646A (en) Multilevel relation analysis and mining method for image-text data
CN117437557A (en) Hyperspectral image classification method based on double-channel feature enhancement
CN115063359A (en) Remote sensing image change detection method and system based on adversarial dual-autoencoder network
CN115035377A (en) Significance detection network system based on double-stream coding and interactive decoding
CN114445468A (en) Heterogeneous remote sensing image registration method and system
CN114972155A (en) Polyp image segmentation method based on context information and reverse attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant