CN114491135A - Cross-view angle geographic image retrieval method based on variation information bottleneck - Google Patents

Cross-view angle geographic image retrieval method based on variation information bottleneck

Info

Publication number
CN114491135A
Authority
CN
China
Prior art keywords
image
cross
view
information bottleneck
variation information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210352920.4A
Other languages
Chinese (zh)
Inventor
徐行
胡谦
李宛思
沈复民
申恒涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Koala Youran Technology Co ltd
Original Assignee
Chengdu Koala Youran Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Koala Youran Technology Co ltd
Priority to CN202210352920.4A
Publication of CN114491135A
Legal status: Pending (Current)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/587 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using geographical or spatial information, e.g. location
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Abstract

The invention discloses a cross-view geographic image retrieval method based on a variational information bottleneck, relating to the technical field of cross-view geographic image retrieval in computer vision. In a conventional retrieval model the classifier converges quickly during training, so the gradients it produces carry too little information to train the feature extraction module effectively; the retrieval model therefore tends to overfit and performs poorly on the test data set. The disclosed method uses a variational information bottleneck module to add Gaussian noise to the classifier during training, forcing the feature extraction module to extract view-invariant and discriminative image representations so as to improve the generalization ability and robustness of the retrieval model, and uses the features compressed by the variational information bottleneck module as the retrieval features, thereby improving the accuracy of the retrieval results.

Description

Cross-view angle geographic image retrieval method based on variation information bottleneck
Technical Field
The invention relates to the technical field of cross-view geographic image retrieval in computer vision, and in particular to a cross-view geographic image retrieval method based on a variational information bottleneck.
Background
Cross-view geographic image retrieval matches images of the same geographic target captured from different views, such as a ground view and a satellite view; for example, given a ground-view query image, the satellite image of the same geographic target is searched for among the satellite-view candidate images. The task has wide applications, for instance autonomous driving, where precise geographic target localization must be achieved, and therefore has great application value and economic benefit.
Cross-view geographic image retrieval is a challenging task because extreme viewpoint changes cause large changes in visual appearance; in recent years, however, the task has advanced considerably.
Traditional methods focus on mining the feature representation of the geographic target at the image center but ignore the contextual information of its neighboring regions. The present method therefore uses the regions adjacent to the central geographic target as auxiliary information to enrich the discriminative cues, which clearly improves the retrieval performance. The method is built on a variational information bottleneck module: Gaussian noise is added to the output of the feature extraction module so that the classifier becomes robust to noise, the feature extraction module is forced to extract view-invariant and discriminative image representations, and the generalization ability of the cross-view geographic image retrieval model based on the variational information bottleneck is improved.
Disclosure of Invention
The invention aims to provide a cross-view geographic image retrieval method based on a variational information bottleneck that improves the generalization ability and robustness of the retrieval model, uses the features compressed by the variational information bottleneck module as retrieval features, and obtains view-invariant and discriminative image representations as the retrieval features, thereby improving the accuracy of the retrieval results.
The invention specifically adopts the following technical scheme for realizing the purpose:
a cross-view angle geographic image retrieval method based on variation information bottleneck comprises the following steps:
Step S1: selecting a commonly used cross-view geographic image data set, which comprises a train split and a val split and contains images of two views, namely ground-view images and satellite-view images;
step S2: training a cross-view angle geographic image retrieval model based on variation information bottleneck;
Step S3: testing the cross-view geographic image retrieval model based on the variational information bottleneck; selecting any ground-view image, inputting it into the cross-view geographic image retrieval model based on the variational information bottleneck obtained in step S2 to obtain the mean U_i^j of the output feature Z_i^j, concatenating the U_i^j row-wise into a single feature that serves as the retrieval feature, and thereby retrieving the satellite-view image containing the same target as the ground-view image.
As a preferred technical scheme, the cross-view angle geographic image retrieval model based on the variation information bottleneck comprises a feature extraction module, a variation information bottleneck module and a classifier module;
a feature extraction module: is a ResNet-50 model pre-trained on an ImageNet dataset to extract features of an input image;
a variational information bottleneck module: consists of an encoder whose input is V_i^j; the encoder has two linear layers, of dimension 512, as output layers, and the two output feature vectors serve respectively as the mean and the variance learned by the variational information bottleneck module;
the classifier module sequentially comprises a full connection layer, a batch processing normalization layer, a Dropout layer and a linear classification layer, wherein the dimension of the linear classification layer is the number of classes of a classification target.
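For illustration only, a minimal PyTorch-style sketch of these three modules is given below; it is one reading of the description, not the patent's own code. The class and argument names (VIBEncoder, Classifier, CrossViewVIBModel, num_parts, hidden_dim) are introduced here, and the hidden dimension and dropout probability of the classifier are assumptions, since only the 512-dimensional bottleneck and the class-count output dimension are stated.

```python
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

class VIBEncoder(nn.Module):
    """Variational information bottleneck encoder: two linear heads of dimension 512
    produce the mean and the variance learned by the module."""
    def __init__(self, in_dim=2048, out_dim=512):
        super().__init__()
        self.fc_mu = nn.Linear(in_dim, out_dim)    # mean head
        self.fc_var = nn.Linear(in_dim, out_dim)   # variance head (treated as a log-variance in the later sketches)

    def forward(self, v):
        return self.fc_mu(v), self.fc_var(v)

class Classifier(nn.Module):
    """Fully connected layer -> batch normalization -> Dropout -> linear classification layer."""
    def __init__(self, in_dim, hidden_dim, num_classes, p=0.5):
        super().__init__()
        self.block = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.Dropout(p),
            nn.Linear(hidden_dim, num_classes),    # output dimension = number of classification targets
        )

    def forward(self, z):
        return self.block(z)

class CrossViewVIBModel(nn.Module):
    """Feature extractor (ImageNet-pretrained ResNet-50) + VIB encoder + classifier."""
    def __init__(self, num_classes, num_parts=8):
        super().__init__()
        backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])  # keep the spatial feature map
        self.num_parts = num_parts
        self.vib = VIBEncoder(2048, 512)
        self.classifier = Classifier(512, 512, num_classes)
```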
As a preferable technical scheme, the feature extraction module adopts a square-ring feature partition strategy to extract image features and assigns attention according to the distance from the peripheral regions of the image to the image center, thereby enriching the discriminative cues of the image features.
As a preferred technical solution, the feature extraction module is specifically operative to:
the input image x^j is resized to 256 × 256 and fed into the feature extraction module to obtain the image feature map R^j, where x^j ∈ {x^d, x^s}, with x^d and x^s denoting the two different views: x^d represents the ground view and x^s represents the satellite view;
the feature map is then divided into i square-ring parts using the square-ring feature partition design, denoted R_i^j = P_slice(R^j, i); each part is average-pooled to obtain a 2048-dimensional feature, denoted V_i^j = Avgpool(R_i^j), where P_slice is the square-ring feature partition operation and Avgpool is the average pooling operation.
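A possible implementation of the square-ring partition and average pooling is sketched below, assuming equal-width concentric square rings around the spatial center of the feature map; the exact ring boundaries are not specified in the patent, and the function name square_ring_partition is introduced here.

```python
import torch

def square_ring_partition(feat_map: torch.Tensor, num_parts: int = 8):
    """Split a feature map of shape (B, C, H, W) into `num_parts` concentric square rings
    around the spatial center (P_slice) and average-pool each ring into a (B, C) vector (Avgpool)."""
    b, c, h, w = feat_map.shape
    ys = torch.arange(h, dtype=torch.float32, device=feat_map.device) - (h - 1) / 2
    xs = torch.arange(w, dtype=torch.float32, device=feat_map.device) - (w - 1) / 2
    # Chebyshev (square) distance of every spatial cell to the center, normalized to [0, 1)
    dist = torch.max(ys.abs().unsqueeze(1) / (h / 2), xs.abs().unsqueeze(0) / (w / 2))
    ring_idx = torch.clamp((dist * num_parts).long(), max=num_parts - 1)   # ring index per cell
    parts = []
    for i in range(num_parts):
        mask = (ring_idx == i).to(feat_map.dtype)
        area = mask.sum().clamp(min=1.0)
        parts.append((feat_map * mask).sum(dim=(2, 3)) / area)   # V_i^j, shape (B, C)
    return parts
```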
As a preferred technical solution, the step S2 specifically includes:
Step S2.1: extracting the image features of the train split with the feature extraction module, whose input is a pair of images of different views, denoted the ground-view image x^d and the satellite-view image x^s;
Step S2.2: feeding the ground-view image x^d into the feature extraction module to obtain the image feature map R^d; applying the square-ring feature partition strategy and average pooling to obtain the part features V_i^d;
Step S2.3: processing the satellite-view image x^s with the same operations as the ground-view image x^d to obtain the satellite-view part features V_i^s;
Step S2.4: feeding the features V_i^d and V_i^s of the two views obtained in steps S2.2 and S2.3 into the variational information bottleneck module to obtain their respective means and variances, then applying the reparameterization to obtain the output features Z_i^d and Z_i^s;
Step S2.5: the reparameterization samples an ε from the standard normal distribution N(0, I) and combines it with the mean and variance learned by the variational information bottleneck module according to the following formula, where the dimension of I is the same as that of the output features Z_i^d and Z_i^s (an illustrative sketch of this step and of the loss computation is given after step S2.9);
Z = μ + σ * ε
where μ denotes the mean, σ the variance, and ε a value sampled at random from the normal distribution N(0, I) serving as the added Gaussian noise, which injects a perturbation into the training of the classifier;
Step S2.6: feeding the two resampled image features Z_i^d and Z_i^s obtained in step S2.5 into the classifier module and computing the classification loss;
Step S2.7: in order to enhance the generalization ability and robustness of the cross-view geographic image retrieval model based on the variational information bottleneck and to prevent the variance output by the variational information bottleneck module from collapsing to zero during training, the KL distance between Z_i^d and the standard normal distribution is computed from the mean and variance of Z_i^d, and the same computation is performed for the output feature Z_i^s; the weight of the resulting KL-distance loss in the total loss function is controlled by the parameter β;
specifically: the KL distances of the output features Z_i^d and Z_i^s from the standard normal distribution are computed, and the total loss function L_VIB of the whole cross-view geographic image retrieval model based on the variational information bottleneck is:
L_VIB = L_cls + β * D_KL[p(Z|x) || r(Z)]
where D_KL denotes the KL distance, r(Z) denotes the prior distribution, here the standard normal distribution, p(Z|x) denotes the predicted distribution of the feature Z of the input image x, whose parameters are the mean and variance learned by the model, β is a weight hyperparameter whose value is set in the specific embodiments, and L_cls is the cross-entropy classification loss function;
Step S2.8: optimizing the total loss function L_VIB of the cross-view geographic image retrieval model based on the variational information bottleneck by stochastic gradient descent and recording the computed total loss value;
Step S2.9: repeating steps S2.1 to S2.8 to train the cross-view geographic image retrieval model based on the variational information bottleneck on the train split of the cross-view geographic image data set; training stops once the total loss value no longer decreases, which indicates that the model has converged, and the model is saved as the final cross-view geographic image retrieval model based on the variational information bottleneck for testing.
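A minimal sketch of the reparameterization in step S2.5 and of the loss computation in steps S2.6 and S2.7 is given below. It assumes the variance head is trained as a log-variance and that the KL term against the standard normal prior is evaluated in closed form; both are common conventions rather than statements of the patent, and the function names reparameterize, kl_to_standard_normal and vib_loss are introduced here.

```python
import torch
import torch.nn.functional as F

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Z = mu + sigma * eps, with eps drawn from N(0, I) of the same shape as Z."""
    sigma = torch.exp(0.5 * logvar)
    eps = torch.randn_like(sigma)        # Gaussian noise that perturbs the classifier input
    return mu + sigma * eps

def kl_to_standard_normal(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Closed-form D_KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch."""
    return 0.5 * torch.mean(torch.sum(mu.pow(2) + logvar.exp() - logvar - 1.0, dim=-1))

def vib_loss(logits, labels, mu, logvar, beta):
    """Total loss L_VIB = L_cls + beta * D_KL[p(Z|x) || r(Z)] for one view and one part."""
    return F.cross_entropy(logits, labels) + beta * kl_to_standard_normal(mu, logvar)
```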
As a preferred technical solution, in step S2.6 the classification loss function is the cross-entropy function; the cross-entropy classification loss L_cls is as follows:
L_cls = -Σ_j Σ_i log( exp(g_i^j(y)) / Σ_c exp(g_i^j(c)) )
where the superscript j denotes the view, taking d for the ground view and s for the satellite view, i denotes the i-th divided part, c ranges over the classification targets, g_i^j(y) is the predicted score for the true class label y, and g_i^j(c) are the predicted scores for the other classification targets.
The invention has the following beneficial effects:
1. By adding the variational information bottleneck module, the invention improves the robustness and generalization ability of the cross-view geographic image retrieval model based on the variational information bottleneck, obtains view-invariant and discriminative feature representations as the retrieval features, and improves the accuracy of cross-view geographic image retrieval.
Drawings
FIG. 1 is a flow chart of a retrieval method of the present invention;
FIG. 2 is a network framework diagram of the search model of the present invention;
FIG. 3 is a diagram of retrieval results on CVACT_val according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings in the embodiments of the present invention; it is apparent that the described embodiments are some, but not all, embodiments of the present invention. The components of the embodiments of the present invention, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, as presented in FIGS. 1-3, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
A cross-view angle geographic image retrieval method based on variation information bottleneck comprises the following steps:
Step S1: selecting a commonly used cross-view geographic image data set, which comprises a train split and a val split and contains images of two views, namely ground-view images and satellite-view images;
step S2: training a cross-view angle geographic image retrieval model based on variation information bottleneck;
Step S3: testing the cross-view geographic image retrieval model based on the variational information bottleneck; selecting any ground-view image, inputting it into the cross-view geographic image retrieval model based on the variational information bottleneck obtained in step S2 to obtain the mean U_i^j of the output feature Z_i^j, concatenating the U_i^j row-wise into a single feature that serves as the retrieval feature, and thereby retrieving the satellite-view image containing the same target as the ground-view image.
In specific operation: the generalization ability and robustness of the retrieval model are improved, the features compressed by the variational information bottleneck module are used as retrieval features, and view-invariant and discriminative image representations are obtained, thereby improving the accuracy of the retrieval results.
Example 2
As shown in fig. 1, the invention relates to a cross-view geographic image retrieval method based on variation information bottleneck, comprising the following steps:
Step A1: selecting the commonly used public cross-view geographic image data set CVACT; CVACT is a large-scale public cross-view data set that provides 35,532 pairs of ground-view and satellite-view images for training and an additional 8,884 image pairs as a test set, namely CVACT_val; in the CVACT_val test set each query image has exactly one correct match, i.e. a given ground-view query image is matched by exactly one correct satellite-view image.
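For illustration, a paired loader for such a data set could look as follows; the directory layout (parallel "ground" and "satellite" folders with matching file names) and the use of the location index as the class label are assumptions introduced here, not details of the CVACT release.

```python
import os
from PIL import Image
from torch.utils.data import Dataset

class CVACTPairs(Dataset):
    """Yields (ground-view image, satellite-view image, location label) triples."""
    def __init__(self, root, transform=None):
        self.ground_dir = os.path.join(root, "ground")        # assumed layout
        self.satellite_dir = os.path.join(root, "satellite")  # assumed layout
        self.names = sorted(os.listdir(self.ground_dir))
        self.transform = transform

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        ground = Image.open(os.path.join(self.ground_dir, name)).convert("RGB")
        satellite = Image.open(os.path.join(self.satellite_dir, name)).convert("RGB")
        if self.transform is not None:
            ground, satellite = self.transform(ground), self.transform(satellite)
        return ground, satellite, idx   # location index used as the class label
```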
Step A2 data preprocessing
Data preprocessing adjusts the images of the input train split to a fixed size of 256 × 256 and then randomly flips them to increase the diversity of the training samples.
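A torchvision sketch of this preprocessing is shown below; the flip axis and probability are assumptions, since the patent only states that the images are randomly flipped.

```python
from torchvision import transforms

# Resize to a fixed 256 x 256, then randomly flip to diversify the training samples.
train_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomHorizontalFlip(p=0.5),   # flip axis/probability assumed here
    transforms.ToTensor(),
])
```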
Step A3: cross-view angle geographic image retrieval model based on variation information bottleneck in training
In this embodiment, the network framework of the cross-view geographic image retrieval model based on the variational information bottleneck is shown in FIG. 2.
Step A3.1: the images preprocessed in step A2 are fed into the feature extraction module to extract image features, and the square-ring feature partition strategy is then applied to obtain the input features V_i^s (satellite-view image representation) and V_i^d (ground-view image representation), as detailed below:
for the two views of the CVACT data set there are two processing branches, a satellite-view branch and a ground-view branch; because the ground-view images of the CVACT data set have a wide field of view of the ground, the CVACT image features adopt a square-ring partition design with 8 blocks, i.e. the extracted image features are divided into 8 square rings according to the distance from the neighboring regions to the image center, which then yields the input features V_i^s and V_i^d.
Step A3.2: the input features V_i^d and V_i^s obtained in step A3.1 are fed into the variational information bottleneck module to obtain their respective means and variances, after which the reparameterization yields the output features Z_i^d and Z_i^s.
The reparameterization samples an ε from the standard normal distribution N(0, I) and combines it with the mean and variance learned by the variational information bottleneck module according to the following formula, where the dimension of I is the same as that of the output features Z_i^d and Z_i^s;
Z = μ + σ * ε
where μ denotes the mean, σ the variance, and ε a value sampled at random from the normal distribution N(0, I) serving as the added Gaussian noise, which injects a perturbation into the training of the classifier;
Step A3.3: the output features Z_i^d and Z_i^s obtained by the resampling in step A3.2 are fed into the classifier module and the classification loss is computed, with the cross-entropy function adopted as the classification loss function:
L_cls = -Σ_j Σ_i log( exp(g_i^j(y)) / Σ_c exp(g_i^j(c)) )
where the superscript j denotes the view, taking d for the ground view and s for the satellite view, i denotes the i-th divided part, c ranges over the classification targets, g_i^j(y) is the predicted score for the true class label y, and g_i^j(c) are the predicted scores for the other classification targets.
Step A3.4: aiming at enhancing generalization capability and robustness of cross-perspective geographic image retrieval model based on variation information bottleneckThe variance output by the variational information bottleneck module training is prevented from being zero, and the final output characteristic Z is enabled to bei dAnd Zi sApproximate standard distribution, calculating output characteristic Zi dAnd Zi sKL distance (Kullback-Leibler Divergence) from standard normal distribution, and finally, a specific formula L of a loss function of the whole cross-view angle geographic image retrieval model based on variation information bottleneckVIBThe following were used:
LVIB =L cls +β*DKL[[p(Z|x), r(z)]]
wherein DKLRepresenting calculation of KL distance (Kullback-Leibler Divergence), r (Z) representing prior distribution, here normal distribution, p (Z | x) representing prediction distribution of characteristic Z of input image x, specific values including mean and variance of cross-view geographic image retrieval model learning based on variation information bottleneck, beta being weight hyper-parameter, and being initialized to 10-6And increases as the number of iterations of the training increases, by multiplying the initialization value by the number of iterations,L cls classifying a loss function for cross entropy;
Step A3.5: the total loss function L_VIB of the cross-view geographic image retrieval model based on the variational information bottleneck is optimized by stochastic gradient descent, and the computed total loss value is recorded;
Step A3.6: steps A2 to A3.5 are repeated to train the cross-view geographic image retrieval model based on the variational information bottleneck on the train split of the CVACT data set; training stops once the total loss value no longer decreases, which indicates that the model has converged, and the model is saved as the cross-view geographic image retrieval model based on the variational information bottleneck for the final test;
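A condensed sketch of the training loop in steps A3.1 to A3.6 is given below, reusing the components from the earlier sketches (CrossViewVIBModel, square_ring_partition, reparameterize, vib_loss). The per-part losses are summed over the 8 square-ring parts and both views, and β is increased by multiplying its initial value 10^-6 by the iteration count as described above; the epoch count, learning rate and momentum are assumptions.

```python
from torch.optim import SGD

def train(model, train_loader, num_epochs=120, beta_init=1e-6, lr=0.01):
    """Optimize L_VIB = L_cls + beta * KL with stochastic gradient descent."""
    optimizer = SGD(model.parameters(), lr=lr, momentum=0.9)
    iteration = 0
    for _ in range(num_epochs):
        for ground_img, satellite_img, labels in train_loader:
            iteration += 1
            beta = beta_init * iteration                      # beta grows with the iteration count
            loss = 0.0
            for img in (ground_img, satellite_img):           # ground-view and satellite-view branches
                feat_map = model.backbone(img)
                for v in square_ring_partition(feat_map, model.num_parts):   # part features V_i^j
                    mu, logvar = model.vib(v)
                    z = reparameterize(mu, logvar)            # Z = mu + sigma * eps
                    logits = model.classifier(z)
                    loss = loss + vib_loss(logits, labels, mu, logvar, beta)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```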
step A4: cross-view angle geographic image retrieval model test based on variation information bottleneck
Selecting any ground-view image, inputting it into the cross-view geographic image retrieval model based on the variational information bottleneck obtained in step A3.6 to obtain the mean U_i^j of the output feature Z_i^j, concatenating the U_i^j row-wise into a single feature that serves as the retrieval feature, and thereby retrieving the satellite-view image containing the same target as the ground-view image.
The test results on the CVACT test set are shown in Table 1; the evaluation metric Recall@K (R@K, K = 1, 5, 10) is reported to assess the retrieval performance of the model. (R@K is the proportion of queries whose correctly matched image appears in the top K of the ranking list; the higher the R@K value, the better the model. In Table 1, ground -> satellite means that a given ground-view image is used to retrieve a satellite-view image.)
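The test-time feature extraction of step A4 and the Recall@K metric can be sketched as follows, reusing the earlier components; L2 normalization and Euclidean distance are assumptions, since the patent does not name the similarity measure, and the function names are introduced here.

```python
import torch
import torch.nn.functional as F

def extract_retrieval_feature(model, img):
    """Concatenate the per-part means U_i^j row-wise into a single retrieval feature."""
    feat_map = model.backbone(img)
    means = [model.vib(v)[0] for v in square_ring_partition(feat_map, model.num_parts)]
    return F.normalize(torch.cat(means, dim=1), dim=1)        # shape (B, num_parts * 512)

def recall_at_k(query_feats, gallery_feats, gt_indices, ks=(1, 5, 10)):
    """R@K: fraction of queries whose single correct gallery image ranks in the top K."""
    dist = torch.cdist(query_feats, gallery_feats)            # pairwise Euclidean distances
    ranking = dist.argsort(dim=1)                             # closest gallery images first
    hits = ranking == gt_indices.unsqueeze(1)                 # (num_queries, num_gallery)
    return {k: hits[:, :k].any(dim=1).float().mean().item() for k in ks}
```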
Table 1: Comparison of model performance on CVACT_val
On CVACT_val, the present invention is compared with other state-of-the-art methods. The results are shown in Table 1, where bold numbers indicate that the present invention improves the retrieval metric relative to the other methods; it can be observed that the invention achieves 81.04% Recall@1 on ground -> satellite retrieval, clearly outperforming the other methods.
This demonstrates the effectiveness of the cross-view geographic image retrieval method based on the variational information bottleneck.
As shown in FIG. 3, the retrieval results on CVACT_val are visualized, ranked from the highest to the lowest similarity. FIG. 3 shows the top-3 ground -> satellite retrieval results on CVACT_val, where √ marks a correctly retrieved image; it can be seen from the figure that the invention accurately retrieves the most relevant and correct image, which further and intuitively illustrates the effectiveness of the invention in the practical cross-view geographic image retrieval task.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (6)

1. A cross-view angle geographic image retrieval method based on variation information bottleneck is characterized by comprising the following steps:
Step S1: selecting a commonly used cross-view geographic image data set, which comprises a train split and a val split and contains images of two views, namely ground-view images and satellite-view images;
step S2: training a cross-view angle geographic image retrieval model based on variation information bottleneck;
Step S3: testing the cross-view geographic image retrieval model based on the variational information bottleneck; selecting any ground-view image, inputting it into the cross-view geographic image retrieval model based on the variational information bottleneck obtained in step S2 to obtain the mean U_i^j of the output feature Z_i^j, concatenating the U_i^j row-wise into a single feature that serves as the retrieval feature, and thereby retrieving the satellite-view image containing the same target as the ground-view image.
2. The cross-perspective geographic image retrieval method based on the variation information bottleneck, according to claim 1, is characterized in that the cross-perspective geographic image retrieval model based on the variation information bottleneck comprises a feature extraction module, a variation information bottleneck module and a classifier module;
a feature extraction module: is a ResNet-50 model pre-trained on an ImageNet dataset to extract features of an input image;
a variational information bottleneck module: consists of an encoder whose input is V_i^j; the encoder has two linear layers, of dimension 512, as output layers, and the two output feature vectors serve respectively as the mean and the variance learned by the variational information bottleneck module;
the classifier module sequentially comprises a full connection layer, a batch processing normalization layer, a Dropout layer and a linear classification layer, wherein the dimension of the linear classification layer is the number of classes of a classification target.
3. The cross-view geographic image retrieval method based on the variational information bottleneck as claimed in claim 2, wherein the feature extraction module adopts a square ring feature partitioning strategy to extract image features, and provides attention according to the distance from the peripheral region of the image to the center of the image, thereby enriching the distinguishing clues of the image features.
4. The cross-perspective geographic image retrieval method based on variational information bottlenecks of claim 3, wherein the feature extraction module is specifically operative to:
the input image x^j is resized to 256 × 256 and fed into the feature extraction module to obtain the image feature map R^j, where x^j ∈ {x^d, x^s}, with x^d and x^s denoting the two different views: x^d represents the ground view and x^s represents the satellite view;
the feature map is then divided into i square-ring parts using the square-ring feature partition design, denoted R_i^j = P_slice(R^j, i); each part is average-pooled to obtain a 2048-dimensional feature, denoted V_i^j = Avgpool(R_i^j), where P_slice is the square-ring feature partition operation and Avgpool is the average pooling operation.
5. The method for retrieving the cross-perspective geographic image based on the variation information bottleneck as claimed in claim 1, wherein the step S2 specifically comprises:
Step S2.1: extracting the image features of the train split with the feature extraction module, whose input is a pair of images of different views, denoted the ground-view image x^d and the satellite-view image x^s;
Step S2.2: feeding the ground-view image x^d into the feature extraction module to obtain the image feature map R^d; applying the square-ring feature partition strategy and average pooling to obtain the part features V_i^d;
Step S2.3: processing the satellite-view image x^s with the same operations as the ground-view image x^d to obtain the satellite-view part features V_i^s;
Step S2.4: feeding the features V_i^d and V_i^s of the two views obtained in steps S2.2 and S2.3 into the variational information bottleneck module to obtain their respective means and variances, then applying the reparameterization to obtain the output features Z_i^d and Z_i^s;
Step S2.5: the reparameterization samples an ε from the standard normal distribution N(0, I) and combines it with the mean and variance learned by the variational information bottleneck module according to the following formula, where the dimension of I is the same as that of the output features Z_i^d and Z_i^s;
Z = μ + σ * ε
where μ denotes the mean, σ the variance, and ε a value sampled at random from the normal distribution N(0, I) serving as the added Gaussian noise, which injects a perturbation into the training of the classifier;
Step S2.6: feeding the two resampled image features Z_i^d and Z_i^s obtained in step S2.5 into the classifier module and computing the classification loss;
Step S2.7: computing the KL distances of the output features Z_i^d and Z_i^s from the standard normal distribution, the total loss function L_VIB of the whole cross-view geographic image retrieval model based on the variational information bottleneck being:
L_VIB = L_cls + β * D_KL[p(Z|x) || r(Z)]
where D_KL denotes the KL distance, r(Z) denotes the prior distribution, here the standard normal distribution, p(Z|x) denotes the predicted distribution of the feature Z of the input image x, whose parameters are the mean and variance learned by the model, β is a weight hyperparameter whose value is set in the specific embodiments, and L_cls is the cross-entropy classification loss;
Step S2.8: optimizing the total loss function L_VIB of the cross-view geographic image retrieval model based on the variational information bottleneck by stochastic gradient descent and recording the computed total loss value;
Step S2.9: repeating steps S2.1 to S2.8 to train the cross-view geographic image retrieval model based on the variational information bottleneck on the train split of the cross-view geographic image data set; training stops once the total loss value no longer decreases, which indicates that the model has converged, and the model is saved as the final cross-view geographic image retrieval model based on the variational information bottleneck for testing.
6. The cross-view geographic image retrieval method based on a variational information bottleneck as claimed in claim 5, wherein in step S2.6 the classification loss function is the cross-entropy function, and the cross-entropy classification loss L_cls is as follows:
L_cls = -Σ_j Σ_i log( exp(g_i^j(y)) / Σ_c exp(g_i^j(c)) )
where the superscript j denotes the view, taking d for the ground view and s for the satellite view, i denotes the i-th divided part, c ranges over the classification targets, g_i^j(y) is the predicted score for the true class label y, and g_i^j(c) are the predicted scores for the other classification targets.
CN202210352920.4A 2022-04-06 2022-04-06 Cross-view angle geographic image retrieval method based on variation information bottleneck Pending CN114491135A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210352920.4A CN114491135A (en) 2022-04-06 2022-04-06 Cross-view angle geographic image retrieval method based on variation information bottleneck

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210352920.4A CN114491135A (en) 2022-04-06 2022-04-06 Cross-view angle geographic image retrieval method based on variation information bottleneck

Publications (1)

Publication Number Publication Date
CN114491135A true CN114491135A (en) 2022-05-13

Family

ID=81489174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210352920.4A Pending CN114491135A (en) 2022-04-06 2022-04-06 Cross-view angle geographic image retrieval method based on variation information bottleneck

Country Status (1)

Country Link
CN (1) CN114491135A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784249A (en) * 2019-01-04 2019-05-21 华南理工大学 A kind of scramble face identification method based on variation cascaded message bottleneck
US20200402223A1 (en) * 2019-06-24 2020-12-24 Insurance Services Office, Inc. Machine Learning Systems and Methods for Improved Localization of Image Forgery
CN113361508A (en) * 2021-08-11 2021-09-07 四川省人工智能研究院(宜宾) Cross-view-angle geographic positioning method based on unmanned aerial vehicle-satellite
WO2021250850A1 (en) * 2020-06-11 2021-12-16 Nec Corporation Training apparatus, control method, and non-transitory computer-readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784249A (en) * 2019-01-04 2019-05-21 华南理工大学 A kind of scramble face identification method based on variation cascaded message bottleneck
US20200402223A1 (en) * 2019-06-24 2020-12-24 Insurance Services Office, Inc. Machine Learning Systems and Methods for Improved Localization of Image Forgery
WO2021250850A1 (en) * 2020-06-11 2021-12-16 Nec Corporation Training apparatus, control method, and non-transitory computer-readable storage medium
CN113361508A (en) * 2021-08-11 2021-09-07 四川省人工智能研究院(宜宾) Cross-view-angle geographic positioning method based on unmanned aerial vehicle-satellite

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
M. LIU et al.: "Iterative Local-Global Collaboration Learning Towards One-Shot Video Person Re-Identification", IEEE Transactions on Image Processing *
T. WANG et al.: "Each Part Matters: Local Patterns Facilitate Cross-View Geo-Localization", IEEE Transactions on Circuits and Systems for Video Technology *
YOUNGSIK EOM et al.: "Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck", https://arxiv.org/abs/2204.01387 *
周金坤 et al.: "UAV image localization method based on multi-view and multi-supervision network", Journal of Computer Applications *

Similar Documents

Publication Publication Date Title
US11816888B2 (en) Accurate tag relevance prediction for image search
Nech et al. Level playing field for million scale face recognition
CN110851645B (en) Image retrieval method based on similarity maintenance under deep metric learning
US10235623B2 (en) Accurate tag relevance prediction for image search
CN102521366B (en) Image retrieval method integrating classification with hash partitioning and image retrieval system utilizing same
Wang et al. A three-layered graph-based learning approach for remote sensing image retrieval
Bai et al. VHR object detection based on structural feature extraction and query expansion
CN102105901B (en) Annotating images
CN112633382B (en) Method and system for classifying few sample images based on mutual neighbor
CN106951551B (en) Multi-index image retrieval method combining GIST characteristics
CN110866134B (en) Image retrieval-oriented distribution consistency keeping metric learning method
CN110458175B (en) Unmanned aerial vehicle image matching pair selection method and system based on vocabulary tree retrieval
CN109710804B (en) Teaching video image knowledge point dimension reduction analysis method
Li et al. On the integration of topic modeling and dictionary learning
CN103440508A (en) Remote sensing image target recognition method based on visual word bag model
Sadique et al. Content-based image retrieval using color layout descriptor, gray-level co-occurrence matrix and k-nearest neighbors
Oneata et al. Axes at trecvid 2012: Kis, ins, and med
CN114691911B (en) Cross-view angle geographic image retrieval method based on information bottleneck variational distillation
CN113032613A (en) Three-dimensional model retrieval method based on interactive attention convolution neural network
CN114491135A (en) Cross-view angle geographic image retrieval method based on variation information bottleneck
CN113409351B (en) Unsupervised field self-adaptive remote sensing image segmentation method based on optimal transmission
Shekar et al. Video clip retrieval based on LBP variance
CN110941994B (en) Pedestrian re-identification integration method based on meta-class-based learner
CN114610941A (en) Cultural relic image retrieval system based on comparison learning
Böttcher et al. BTU DBIS'Plant Identification Runs at ImageCLEF 2012.

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20220513)