CN114691911B - Cross-view angle geographic image retrieval method based on information bottleneck variational distillation - Google Patents


Info

Publication number
CN114691911B
Authority
CN
China
Prior art keywords
image
geographic
cross
view
distillation
Prior art date
Legal status
Active
Application number
CN202210285790.7A
Other languages
Chinese (zh)
Other versions
CN114691911A (en)
Inventor
徐行
胡谦
李宛思
沈复民
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202210285790.7A priority Critical patent/CN114691911B/en
Publication of CN114691911A publication Critical patent/CN114691911A/en
Application granted granted Critical
Publication of CN114691911B publication Critical patent/CN114691911B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a cross-view geographic image retrieval method based on information-bottleneck variational distillation, which performs cross-view geographic image retrieval using discriminative representations from which redundant information has been removed. Using the variational distillation technique, the features extracted by the feature extraction module are compressed by the information bottleneck module into low-dimensional image representations; these representations are constrained by a variational distillation loss and a cross-entropy classification loss so that they retain as much predictive information as possible, thereby achieving the goal of removing redundant information. The resulting discriminative low-dimensional image representations serve as retrieval features, improving the accuracy of the retrieval results while accelerating retrieval.

Description

Cross-view angle geographic image retrieval method based on information bottleneck variational distillation
Technical Field
The invention belongs to the technical field of cross-view image retrieval in computer vision, and particularly relates to a cross-view geographic image retrieval method based on information bottleneck variational distillation.
Background
Cross-view geographic image retrieval matches the same geographic target across images taken from different viewpoints, such as a satellite view or an unmanned aerial vehicle (UAV) view; for example, given a UAV-view query image, the task is to find the image of the same geographic target among satellite-view candidate images. It has wide applications, such as precise UAV parcel delivery, UAV reconnaissance and UAV navigation, all of which require the UAV to locate geographic targets accurately, and which carry great practical and economic value.
Cross-view geographic image retrieval is a challenging task because extreme viewpoint changes cause enormous variation in visual appearance. With the development of deep learning, the task has advanced considerably; the main methods can be divided into the following two categories:
(1) Learning discriminative features with deep neural networks via metric learning: the network learns a feature space that pulls matched image pairs closer together and pushes unmatched pairs apart; attention mechanisms have also found widespread use in the network designs of such methods.
(2) Enriching discriminative cues with information from the regions neighbouring the image centre: this is inspired by the human visual system, which improves judgment accuracy through hierarchical processing. The human visual system first checks whether scenes from different viewpoints contain the same geographic target, and then examines the context information around that target to verify the correctness of the match. Such methods use the regions adjacent to the central geographic target as auxiliary information, exploiting the geographic image's context to enrich the discriminative cues.
Traditional methods generally focus on mining fine-grained features of the central geographic target while underestimating the importance of the context information in neighbouring regions. Recently proposed methods use the regions adjacent to the central target as auxiliary information, enriching the discriminative cues and clearly improving results. However, attending to the image's context inevitably introduces redundant information, which reduces retrieval accuracy to some extent and inflates the retrieval feature dimension, slowing down retrieval.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a cross-view geographic image retrieval method based on information-bottleneck variational distillation, which performs retrieval using discriminative representations from which redundant information has been removed. Using the variational distillation technique, the features extracted by the feature extraction module are compressed by the information bottleneck module to remove redundant information, yielding a more discriminative, lower-dimensional image representation, thereby improving the accuracy of the retrieval results and accelerating retrieval.
The invention is realized with a cross-view geographic image retrieval model based on information-bottleneck variational distillation, which comprises a feature extraction module, an information bottleneck module, and two classifiers: classifier 1, corresponding to the feature extraction module, and classifier 2, corresponding to the information bottleneck module. The modules are explained in detail below.
The feature extraction module extracts the global features of the input image with a residual neural network, ResNet-50, whose weights are pre-trained on ImageNet. ResNet-50 comprises five blocks named conv1 through conv5, an average pooling layer and a fully connected layer; the invention removes the average pooling and fully connected layers, so that an input image yields a global feature map for subsequent processing.
To fully exploit the context information of the image, a square-ring feature partition strategy is applied to the extracted global features: neighbouring regions serve as auxiliary information, attended to according to their distance from the image centre, enriching the discriminative cues of the geographic image. Concretely, the global feature map extracted by the feature extraction module is divided into several concentric square-ring parts, and each part is average-pooled into a feature of dimension 2048. The process can be expressed as:

f_j = F_resnet50(x_j)
{v_j^i}_{i=1..n} = F_slice(f_j)
g_j^i = AvgPool(v_j^i)

where the subscript j denotes the viewing angle, x_j is the input image, f_j is the extracted global feature map, v_j^i is the i-th square-ring part divided from f_j, and g_j^i is the average-pooled feature of the i-th part; F_slice denotes the square-ring partition operation and AvgPool the average pooling operation. The resulting initial features g_j^i are the input to classifier 1 and to the information bottleneck module.
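As an illustration, the square-ring partition F_slice and average pooling described above can be sketched in NumPy. The function name `square_ring_partition` and the border-distance bucketing rule are assumptions of this sketch, not the patent's exact implementation, which operates on 2048-channel ResNet-50 feature maps:

```python
import numpy as np

def square_ring_partition(f, n_rings=4):
    """Split a global feature map f of shape (C, H, W) into n_rings concentric
    square rings (F_slice) and average-pool each ring into a C-dim vector
    (AvgPool).  Assumes H, W >= 2 * n_rings so every ring is non-empty."""
    c, h, w = f.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # distance of each spatial cell from the nearest image border
    border = np.minimum.reduce([ys, xs, h - 1 - ys, w - 1 - xs])
    # bucket border distances into n_rings rings; ring n_rings - 1 is the centre
    ring = np.minimum(border * 2 * n_rings // min(h, w), n_rings - 1)
    return np.stack([f[:, ring == r].mean(axis=1) for r in range(n_rings)])
```

On an 8 x 8 map with 4 rings, the outermost cells form ring 0 and the central 2 x 2 block forms ring 3, mirroring the "distance to the image centre" design described above.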
Classifier 1 consists of a fully connected layer, a batch normalization layer, a Dropout layer and a classification layer; the classification layer is itself a fully connected layer whose output vector dimension equals the number of geographic target categories.
The information bottleneck module is realized as an encoder that compresses and reduces the dimensionality of the initial features, outputting features of dimension 400, smaller than the commonly used feature dimension of 512. After the cross-view geographic image retrieval model based on information-bottleneck variational distillation is trained, the information bottleneck module yields low-dimensional, more discriminative image representations to serve as retrieval features, which both accelerates retrieval and improves retrieval performance.
The input of classifier 2 is the output of the information bottleneck module, so its input feature dimension is 400; its output vector dimension is the number of geographic target categories, and, like classifier 1, it contains a batch normalization layer and a Dropout layer in between.
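A minimal NumPy sketch of the information bottleneck encoder and the classifier 2 head, under the dimensions stated above (2048-d initial feature, 400-d representation, C categories, with C = 701 matching the University-1652 training set described later). The single ReLU linear layer is an assumed stand-in for the learned encoder, and the batch normalization and Dropout layers are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
C = 701  # geographic target categories (University-1652 training set)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# information bottleneck encoder: one ReLU linear layer compressing 2048 -> 400
# (a stand-in; the real encoder is learned end-to-end during training)
W_enc, b_enc = rng.standard_normal((2048, 400)) * 0.01, np.zeros(400)
# classifier 2 head: 400-d representation -> C-way prediction distribution
W_cls, b_cls = rng.standard_normal((400, C)) * 0.01, np.zeros(C)

g = rng.standard_normal((1, 2048))        # initial feature g_j^i of one ring
e = np.maximum(g @ W_enc + b_enc, 0.0)    # low-dimensional representation e_j^i
q = softmax(e @ W_cls + b_cls)            # prediction distribution q_j^i
```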
The cross-view geographic image retrieval method based on information-bottleneck variational distillation of the invention comprises the following steps:
Step S1: select a commonly used cross-view geographic image training dataset;
Step S2: train the cross-view geographic image retrieval model based on information-bottleneck variational distillation;
Step S2.1: extract image features of the training dataset with the feature extraction module; its input is a pair of images from different viewpoints, recorded as the view-1 image and the view-2 image;
Step S2.2: the view-1 image x_1 is fed to the feature extraction module to obtain the global feature map f_1; the square-ring feature partition strategy is applied to obtain the per-part features v_1^i, which after average pooling give the initial features g_1^i of the view-1 image;
Step S2.3: the view-2 image x_2 undergoes the same operations as the view-1 image x_1, giving the initial features g_2^i of the view-2 image;
Step S2.4: the initial features g_1^i and g_2^i of the two viewpoints obtained in steps S2.2 and S2.3 are input into classifier 1, and the cross-entropy classification loss is calculated. The classification loss function L_cls1 is:

z_j^i = F_classifier1(g_j^i)
p_j^i(c) = exp(z_j^i(c)) / Σ_{c'=1..C} exp(z_j^i(c'))
L_cls1 = -Σ_{j∈{1,2}} Σ_i log p_j^i(y)

where j ∈ {1,2} denotes the viewpoint, i denotes the i-th divided part, and F_classifier1(·) denotes the operations performed by classifier 1; z_j^i is the vector output by classifier 1, whose dimension is the number of classification targets, and z_j^i(c) is its value at position c; p_j^i(c) is the predicted probability of geographic target c, C is the number of geographic target categories, and p_j^i(y) is the predicted probability of the true geographic target label y, read off directly at position y.
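The classification loss above, summed over viewpoints j and parts i, can be sketched as follows; the array shapes `(n_views, n_parts, C)` are an assumed layout for this illustration:

```python
import numpy as np

def cross_entropy_loss(logits, y):
    """Sum over viewpoints j and parts i of -log p_j^i(y).
    logits: classifier outputs of shape (n_views, n_parts, C); y: true class index."""
    # numerically stable log-softmax over the class axis
    z = logits - logits.max(axis=-1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -log_p[..., y].sum()
```

With uniform logits over C = 3 classes and 2 views x 4 parts, each term contributes log 3, so the loss is 8 log 3.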
Step S2.5: the initial features g_1^i and g_2^i of the two viewpoints obtained in steps S2.2 and S2.3 are input into the information bottleneck module, which compresses the features and removes redundant information, giving the low-dimensional image representations, recorded as e_1^i and e_2^i respectively;
Step S2.6: the low-dimensional image representations e_1^i and e_2^i of the two viewpoints obtained in step S2.5 are input into classifier 2, and the cross-entropy classification loss is calculated. The classification loss function L_cls2 is:

z'_j^i = F_classifier2(e_j^i)
q_j^i(c) = exp(z'_j^i(c)) / Σ_{c'=1..C} exp(z'_j^i(c'))
L_cls2 = -Σ_{j∈{1,2}} Σ_i log q_j^i(y)

where j ∈ {1,2} denotes the viewpoint, i denotes the i-th divided part, and F_classifier2(·) denotes the operations performed by classifier 2; z'_j^i is the vector output by classifier 2, whose dimension is the number of classification targets, and z'_j^i(c) is its value at position c; q_j^i(c) is the predicted probability of geographic target c, C is the number of geographic target categories, and q_j^i(y) is the predicted probability of the true label y, read off directly at position y.
Step S2.7: for the low-dimensional image representations e_1^i and e_2^i, constraining the retained predictive information by the cross-entropy classification loss alone is not sufficient. The invention therefore uses the prediction distributions p_j^i and q_j^i obtained by classifier 1 and classifier 2 to calculate a variational distillation loss, which forces the low-dimensional representations e_1^i and e_2^i to discard redundant information while retaining more predictive information, yielding more discriminative image representations. The variational distillation loss function is:

L_d = Σ_{j∈{1,2}} Σ_i D_KL(p_j^i ‖ q_j^i)

where D_KL computes the KL distance (Kullback-Leibler divergence), and p_j^i and q_j^i are the prediction distributions of the label y obtained from classifier 1 and classifier 2 respectively. Minimizing the KL distance between p_j^i and q_j^i ensures that the low-dimensional representations e_1^i and e_2^i remain sufficient for the label y while, compared with the initial features g_1^i and g_2^i, task-irrelevant redundant information is discarded as the feature dimension is compressed, making them more discriminative;
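The variational distillation loss L_d can be sketched directly from its definition; `eps` is an assumed numerical guard added for this sketch:

```python
import numpy as np

def variational_distillation_loss(p, q, eps=1e-12):
    """L_d = sum over views j and parts i of D_KL(p_j^i || q_j^i), where p comes
    from classifier 1 (initial features) and q from classifier 2 (compressed
    representations).  p, q: distributions of shape (n_views, n_parts, C)."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)))
```

The loss is zero exactly when the two prediction distributions agree, and positive otherwise, which is what pushes the compressed representation to keep classifier 1's predictive information.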
Step S2.8: the total loss function L of the cross-view geographic image retrieval model is as follows, where λ is a weight hyperparameter:

L = L_cls1 + L_cls2 + λ · L_d
Step S2.9: optimize the total loss function L with stochastic gradient descent and record the optimized total loss value;
Step S2.10: repeat steps S2.1 to S2.9 over the cross-view geographic image training dataset; stop training when the total loss value no longer decreases, indicating that the cross-view geographic image retrieval model based on information-bottleneck variational distillation is trained, and save it as the final cross-view geographic image retrieval model based on information-bottleneck variational distillation;
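The stopping rule of steps S2.9 and S2.10 (repeat epochs until the total loss stops decreasing) can be sketched as follows; the names `run_epoch`, `patience` and `max_epochs` are illustrative, and the patience/epoch-cap safeguards are assumptions rather than part of the patent's procedure:

```python
def train_until_converged(run_epoch, patience=1, max_epochs=100):
    """Repeat run_epoch() (one pass of steps S2.1-S2.9 over the training set,
    returning the total loss L) until L stops decreasing, then return the best
    loss seen."""
    best, stalled = float("inf"), 0
    for _ in range(max_epochs):
        loss = run_epoch()
        if loss < best:
            best, stalled = loss, 0       # loss still improving: keep going
        else:
            stalled += 1                   # loss no longer decreasing
            if stalled >= patience:
                break
    return best
```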
Step S3: retrieve the cross-view geographic image;
select a view-1 or view-2 image of any geographic target and input it into the final trained cross-view geographic image retrieval model based on information-bottleneck variational distillation obtained in step S2.10, obtaining the low-dimensional image representations e^i with redundant information removed; concatenate the e^i into a feature z, which serves as the retrieval feature, and thereby retrieve the most relevant image of the same geographic target from the other viewpoint.
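The retrieval step, concatenating the part representations into z and ranking the gallery by similarity, can be sketched as follows; cosine similarity is an assumed matching score, as the patent does not spell out the ranking metric here:

```python
import numpy as np

def retrieval_feature(parts):
    """Concatenate the low-dimensional part representations e^i into z."""
    return np.concatenate(parts)

def rank_gallery(z, gallery):
    """Indices of gallery rows sorted by cosine similarity to query feature z
    (most similar first)."""
    q = z / np.linalg.norm(z)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(g @ q))
```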
On the basis of the square-ring feature partition strategy, the context information of neighbouring regions is fully mined and the feature information enriched; the information bottleneck network then removes task-irrelevant redundant information to obtain more discriminative features. In addition, the invention uses the variational distillation technique: constrained by the variational self-distillation loss, the knowledge learned by the complex model ResNet-50 is taught to the information bottleneck network, yielding redundancy-free, more discriminative features as image retrieval features.
Drawings
FIG. 1 is a flow chart of a cross-view geographic image retrieval method based on information bottleneck variational distillation according to the invention;
FIG. 2 is a network framework diagram of a cross-perspective geographic image retrieval model based on information bottleneck variational distillation according to the present invention;
FIG. 3 is a visual display diagram of the search results of the cross-perspective geographic image search model in the University-1652 test set based on information bottleneck variational distillation.
Detailed Description
In order to make the objects, technical solutions and advantages of the invention more apparent, embodiments of the invention are described below in conjunction with the accompanying drawings so that those skilled in the art can better understand the invention. It should be particularly noted that the described embodiments are only some, not all, embodiments of the invention and are not intended to limit the claimed scope; all other embodiments obtained by a person skilled in the art without inventive effort fall within the protection scope of the invention.
Examples
FIG. 1 is a flow chart of a cross-view geographic image retrieval method based on information bottleneck variational distillation.
In this embodiment, as shown in fig. 1, the cross-view geographic image retrieval method based on information bottleneck variational distillation of the present invention includes the following steps:
step S1: selecting a common published cross-perspective geographic image dataset University-1652:
the invention will describe in detail the operation and flow on the University-1652 data set, the University-1652 is a multi-view data set, including satellite view data, unmanned aerial vehicle view data and ground view data, it includes 1652 buildings of 72 universities in the world, the training set includes 701 teaching buildings of 33 colleges, the test set is 951 teaching buildings of the other 39 colleges, the training set and the test set have no overlapping part. The data set is used for researching two tasks, wherein the first task is unmanned aerial vehicle visual angle target positioning (unmanned aerial vehicle- > satellite), namely the unmanned aerial vehicle visual angle image of a given geographic target queries the most similar satellite visual angle image of the same geographic target; the second task drone navigates (satellite- > drone) and vice versa. In the unmanned aerial vehicle visual angle target positioning task, 37855 unmanned aerial vehicle visual angle images are in a query set, and 951 satellite visual angle images to be matched are in a gallery. In the unmanned aerial vehicle navigation task, 701 satellite view images are collected in the query set, and 51355 unmanned aerial vehicle view images to be matched are contained in the gallery.
Step S2: data pre-processing
Data preprocessing consists of resizing the input training images to a fixed size of 256 × 256 and then flipping them randomly, which helps the cross-view geographic image retrieval model based on information-bottleneck variational distillation learn viewpoint-invariant characteristics and improves its generalization ability.
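The preprocessing step can be sketched as below; nearest-neighbour resizing keeps the sketch dependency-free and is an assumption, as a real pipeline would typically use bilinear interpolation (e.g. via torchvision transforms):

```python
import numpy as np

def preprocess(img, size=256, rng=None):
    """Resize an (H, W, 3) image to size x size (nearest neighbour) and flip it
    horizontally with probability 0.5, per the training preprocessing above."""
    h, w, _ = img.shape
    out = img[np.arange(size) * h // size][:, np.arange(size) * w // size]
    rng = rng or np.random.default_rng()
    return out[:, ::-1] if rng.random() < 0.5 else out
```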
And step S3: cross-visual angle geographic image retrieval model based on information bottleneck variational distillation training
In this example, the network framework of the cross-view geographic image retrieval model based on information-bottleneck variational distillation is shown in FIG. 2.
step S3.1: inputting the data preprocessed in the step S2 into a feature extraction module to extract global features of the image, dividing the extracted global features of the image into a plurality of square ring parts by adopting a square ring feature partition strategy, and obtaining features with the dimension of 2048 by each part through average pooling to obtain initial features
Figure BDA0003558172310000061
(initial satellite View feature) and>
Figure BDA0003558172310000062
(view angle initial feature of drone), this step is detailed as follows;
for two visual angles of the University-1652 data set, each of the two visual angles is provided with a processing branch which is a satellite view branch and an unmanned aerial vehicle view branch, and because input images of the two branches are from aerial visual angles, the feature extraction modules share weight; adopting square ring partition design with the block number of 4 for the global features of the University-1652 image, and dividing the extracted global features of the image into 4 square rings according to the distance from the adjacent region to the center of the image;
Step S3.2: the initial features g_1^i and g_2^i obtained in step S3.1 are input into classifier 1, and the cross-entropy classification loss is calculated, with the classification loss function:

L_cls1 = -Σ_{j∈{1,2}} Σ_i log p_j^i(y)

where j ∈ {1,2} denotes the viewpoint, i denotes the i-th divided part, p_j^i(y) is the predicted probability of the true geographic target label y, p_j^i(c) is the predicted probability of each geographic target c, and C is the number of geographic target categories;
Step S3.3: the initial features g_1^i and g_2^i from step S3.1 are input into the information bottleneck module to compress the features, giving the low-dimensional image representations, recorded as e_1^i and e_2^i respectively;
Step S3.4: the low-dimensional image representations e_1^i and e_2^i of the two viewpoints obtained in step S3.3 are input into classifier 2, and the cross-entropy classification loss is calculated, with the classification loss function:

L_cls2 = -Σ_{j∈{1,2}} Σ_i log q_j^i(y)

where j ∈ {1,2} denotes the viewpoint, i denotes the i-th divided part, q_j^i(y) is the predicted probability of the true geographic target label y, q_j^i(c) is the predicted probability of each geographic target c, and C is the number of geographic target categories;
Step S3.5: using the prediction distributions p_j^i and q_j^i of classifier 1 and classifier 2, the variational distillation loss is calculated to ensure that the low-dimensional image representations e_1^i and e_2^i remain sufficient for the label y while, compared with the initial features g_1^i and g_2^i, task-irrelevant redundant information is discarded. The variational distillation loss function is:

L_d = Σ_{j∈{1,2}} Σ_i D_KL(p_j^i ‖ q_j^i)

where D_KL computes the KL distance (Kullback-Leibler divergence), and p_j^i and q_j^i are the prediction distributions of the label y obtained from classifier 1 and classifier 2 respectively;
Step S3.6: the total loss function L of the cross-view geographic image retrieval model is as follows; in this example λ is set to 10, L is optimized with stochastic gradient descent, and the optimized total loss value is recorded:

L = L_cls1 + L_cls2 + λ · L_d
Step S3.7: repeat steps S2 to S3.6 over the University-1652 training dataset; stop training when the total loss value no longer decreases, indicating that the cross-view geographic image retrieval model based on information-bottleneck variational distillation is trained, and save it as the final cross-view geographic image retrieval model based on information-bottleneck variational distillation;
Step S4: cross-view geographic image retrieval
Input the UAV-view and satellite-view images of the University-1652 test set into the final trained cross-view geographic image retrieval model based on information-bottleneck variational distillation obtained in step S3.7, obtaining the low-dimensional image representations e^i with redundant information removed; concatenate the e^i into a feature z as the retrieval feature, and retrieve the most relevant image of the same geographic target from the other viewpoint.
The test results on the University-1652 test set are shown in Table 1; the evaluation indices Recall@K (K = 1) and average precision (AP) are reported to evaluate the retrieval performance of the model. R@K is the proportion of queries for which a correctly matched image appears in the top k of the ranking list; the higher the R@K, the better the performance of the cross-view geographic image retrieval model based on information-bottleneck variational distillation. AP reflects the overall accuracy of the retrieval; the higher, the better. In Table 1, drone -> satellite means a UAV-view image is given to retrieve the satellite-view image, and satellite -> drone the reverse.
TABLE 1 comparison of model Performance on the University-1652 test set
The invention was compared with other state-of-the-art methods on the University-1652 test set. The results are shown in Table 1, where bold numbers indicate improvements over the retrieval indices of the latest methods. The invention achieves 77.42% Recall@1 accuracy and 80.43% AP on drone -> satellite, and 86.88% Recall@1 accuracy and 76.61% AP on satellite -> drone; its accuracy indices clearly outperform existing methods and reach the current advanced level, while its compressed feature dimension of 400 is smaller than the commonly used 512, making retrieval faster.
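The evaluation indices used above, Recall@K and average precision, can be sketched per query as follows (a generic implementation consistent with the definitions given with Table 1, not code from the patent):

```python
def recall_at_k(ranked_labels, true_label, k=1):
    """1 if a correct match appears among the top-k retrieved images, else 0."""
    return int(true_label in ranked_labels[:k])

def average_precision(ranked_labels, true_label):
    """AP of one query over its ranking list (several correct matches allowed):
    mean of precision-at-hit over all correct matches."""
    hits, ap = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == true_label:
            hits += 1
            ap += hits / rank
    return ap / max(hits, 1)
```

Dataset-level R@K and AP are then averages of these per-query values over the query set.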
This demonstrates the effectiveness of the cross-view geographic image retrieval method based on information-bottleneck variational distillation provided by the invention: the context information of the image enriches the discriminative cues of the geographic image, the information bottleneck module compresses the features, and an image representation of lower dimension than the commonly used retrieval feature dimension is obtained as the retrieval feature, eliminating redundant information in the extracted image features, improving cross-view geographic image retrieval accuracy and accelerating retrieval.
As shown in fig. 3, the retrieval results on the University-1652 test set are visualized, sorted by similarity from high to low. The top-five drone -> satellite and satellite -> drone retrieval results on the University-1652 test set are shown in fig. 3, with a square marking a correctly retrieved image and an x marking an incorrectly retrieved one. As can be seen from fig. 3, the invention accurately retrieves the most relevant, correct images; this example further and intuitively illustrates the effectiveness of the invention in the practical cross-view geographic image retrieval task.
Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments. To those skilled in the art, various changes are permissible as long as they fall within the spirit and scope of the invention as defined by the appended claims, and all inventive matter utilizing the inventive concept is protected.

Claims (3)

1. A cross-view geographic image retrieval method based on information bottleneck variational distillation, characterized in that it is realized by a cross-view geographic image retrieval model based on information bottleneck variational distillation, the model comprising a feature extraction module, an information bottleneck module, a classifier 1 corresponding to the feature extraction module, and a classifier 2 corresponding to the information bottleneck module; the cross-view geographic image retrieval method based on information bottleneck variational distillation specifically comprises the following steps:
step S1), selecting a public cross-view geographic image training data set;
step S2) preprocessing the cross-view geographic image training data set to obtain a preprocessed training data set, wherein the preprocessing comprises resizing the images in the input cross-view geographic image training data set to a fixed size of 256 × 256 and then randomly flipping them;
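The preprocessing of step S2) — resizing to a fixed 256 × 256 and random flipping — can be sketched in plain Python on nested-list images. This is an illustrative sketch only: in practice an image library would be used, and the nearest-neighbour resize and the flip probability p = 0.5 are assumptions, not details fixed by the claim.

```python
import random

def resize_nearest(img, size=256):
    # nearest-neighbour resize of an H x W nested-list image to size x size
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def random_flip(img, p=0.5, rng=random):
    # horizontally flip the image with probability p (assumed value)
    if rng.random() < p:
        return [row[::-1] for row in img]
    return img

img = [[0, 1], [2, 3]]            # tiny 2 x 2 stand-in image
fixed = resize_nearest(img, 256)  # 256 x 256 after preprocessing
```

A real pipeline would apply the flip before batching so each epoch sees different augmentations of the same training pair.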
step S3) training a cross-view angle geographic image retrieval model based on information bottleneck variational distillation by adopting the preprocessed training data set, and specifically comprising the following steps:
step S31) extracting image features of the cross-view geographic image training data set using the feature extraction module, whose input is two images of different views, denoted the view 1 image and the view 2 image;
step S32) inputting the view 1 image $x_1$ into the feature extraction module to obtain the global image feature $f_1$; applying the square-ring feature partition strategy, so that through the square-ring partition design the feature $f_1^i$ of each divided part is obtained, and obtaining the initial features $\bar{f}_1^i$ of the view 1 image through average pooling;
step S33) performing the same operations on the view 2 image $x_2$ as on the view 1 image $x_1$ to obtain the initial features $\bar{f}_2^i$ of the view 2 image;
Step S34) and initial characteristics of the two perspective images obtained in the step S32) and the step S33)
Figure FDA0003558172300000013
And &>
Figure FDA0003558172300000014
Inputting into a classifier 1, calculating cross entropy classification loss and a classification loss function L cls1 As follows:
Figure FDA0003558172300000015
Figure FDA0003558172300000016
j ∈ {1,2} represents a different viewing angle, j =1 represents viewing angle 1, j =2 represents viewing angle 2; f classifier1 () represents the operations performed by classifier 1; i denotes the i-th part of the division,
Figure FDA0003558172300000017
a predicted probability, representing a geotarget real tag y, in +>
Figure FDA0003558172300000018
Denotes the cThe prediction probability of each geographic target, C is the category number of the geographic target;
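Assuming classifier 1 outputs a C-dimensional logit vector for each view j and each part i, the cross-entropy classification loss of step S34) can be sketched in pure Python. This is a minimal illustration of the loss shape, not the patented implementation; the logit inputs are hypothetical.

```python
import math

def softmax(logits):
    # numerically stable softmax over a C-dimensional logit vector
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classification_loss(logits, y):
    # logits[j][i] is the C-dim classifier output for view j, part i;
    # sum the cross-entropy -log p(y) over both views and all parts
    return sum(-math.log(softmax(part)[y])
               for view in logits for part in view)

# two views, one part each, C = 4 classes, true label y = 2
logits = [[[0.0, 0.0, 0.0, 0.0]], [[0.0, 0.0, 0.0, 0.0]]]
loss = classification_loss(logits, y=2)  # uniform logits -> 2 * log(4)
```

Because the loss sums over both views and every square-ring part, each partition contributes its own supervision signal.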
step S35) inputting the initial features $\bar{f}_1^i$ and $\bar{f}_2^i$ obtained in steps S32) and S33) into the information bottleneck module to compress the features, obtaining low-dimensional image representations denoted $z_1^i$ and $z_2^i$ respectively;
Step S36) represents the two low-dimensional images with the view angles obtained in the step S35)
Figure FDA00035581723000000113
And &>
Figure FDA00035581723000000114
Inputting the data into a classifier 2, calculating cross entropy classification loss and a classification loss function L cls2 As follows:
Figure FDA0003558172300000021
Figure FDA0003558172300000022
Figure FDA0003558172300000023
represents the predicted probability of the geographical target real tag y at that time, and>
Figure FDA0003558172300000024
representing the predicted probability of the c-th geographic object at the moment; f classifier2 Represents the operations performed by the classifier 2;
step S37) using the predicted distributions of the label $y$ obtained from classifier 1 and classifier 2, namely $\hat{p}_1^i$, $\hat{p}_2^i$ and $\tilde{p}_1^i$, $\tilde{p}_2^i$, calculating the variational distillation loss as follows:

$L_d = \sum_{j=1}^{2}\sum_{i} D_{KL}\left(\hat{p}_j^i \,\|\, \tilde{p}_j^i\right)$

where $D_{KL}$ denotes the KL distance; the above formula computes the KL distance between the predicted distributions $\hat{p}_j^i$ and $\tilde{p}_j^i$, ensuring that the low-dimensional image representations $z_1^i$ and $z_2^i$ remain sufficient for the label $y$, while, compared with the initial features $\bar{f}_1^i$ and $\bar{f}_2^i$, task-irrelevant redundant information is discarded during the compression of the feature dimension, making them more discriminative;
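The variational distillation loss of step S37) is a sum of KL distances between the predicted distributions of classifier 1 and classifier 2, taken over both views and all parts. A minimal pure-Python sketch follows; the epsilon smoothing is an added assumption for numerical safety and is not part of the claim.

```python
import math

def kl(p, q, eps=1e-12):
    # KL distance D_KL(p || q) between two discrete distributions
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(p_cls1, p_cls2):
    # p_cls1[j][i], p_cls2[j][i]: predicted distributions for view j, part i
    return sum(kl(p, q)
               for view1, view2 in zip(p_cls1, p_cls2)
               for p, q in zip(view1, view2))

same = [[[0.25, 0.25, 0.25, 0.25]]]   # one view, one part
shifted = [[[0.7, 0.1, 0.1, 0.1]]]
zero = distillation_loss(same, same)  # identical distributions -> 0
gap = distillation_loss(shifted, same)  # mismatch -> positive
```

The loss is zero exactly when the compressed representation predicts the label the same way the full feature does, which is the sufficiency condition the step describes.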
step S38) the total loss function $L$ of the cross-view geographic image retrieval model is as follows:

$L = L_{cls1} + L_{cls2} + \lambda L_d$

where $\lambda$ is a weighting hyperparameter;
step S39) optimizing the total loss function $L$ by stochastic gradient descent, and recording the optimized total loss function value;
step S310) repeating steps S31) to S39) on the preprocessed training data set until the total loss function value no longer decreases, then stopping training; the cross-view geographic image retrieval model based on information bottleneck variational distillation is then trained, and the trained model is saved as the final cross-view geographic image retrieval model based on information bottleneck variational distillation;
step S4) cross-view geographic image retrieval
selecting a view 1 or view 2 image of any geographic target to be retrieved, inputting it into the final cross-view geographic image retrieval model based on information bottleneck variational distillation obtained in step S310), and obtaining the low-dimensional image representations $z^i$ with redundant information removed; concatenating the $z^i$ to obtain the feature $z'$ as the retrieval feature, thereby retrieving the image of the other view most relevant to the same geographic target as the input view image.
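Step S4) ranks the gallery of the other view by similarity to the concatenated retrieval feature z'. A sketch using cosine similarity follows; the similarity measure is an assumption, since the claim only specifies concatenating the compressed part representations.

```python
import math

def concat(parts):
    # splice the per-part low-dimensional representations z^i into z'
    return [v for part in parts for v in part]

def cosine(a, b):
    # cosine similarity between two feature vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_parts, gallery):
    # gallery: {image_id: z' feature}; return ids ranked most-relevant first
    zq = concat(query_parts)
    return sorted(gallery, key=lambda k: cosine(zq, gallery[k]), reverse=True)

gallery = {"target": [1.0, 0.0, 1.0, 0.0], "other": [0.0, 1.0, 0.0, 1.0]}
ranking = retrieve([[1.0, 0.0], [1.0, 0.0]], gallery)
```

With 400-dimensional compressed features instead of the common 512, each similarity computation touches fewer values, which is the source of the faster retrieval claimed in the description.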
2. The cross-view geographic image retrieval method based on information bottleneck variational distillation according to claim 1, characterized in that the specific structure of the cross-view geographic image retrieval model based on information bottleneck variational distillation is as follows:
the feature extraction module extracts the global feature of the input image using a residual neural network ResNet-50 with weights pre-trained on ImageNet; ResNet-50 comprises five blocks named conv1, conv2, conv3, conv4 and conv5, an average pooling layer, and a fully connected layer; the average pooling layer and the fully connected layer of ResNet-50 are removed, and the input image yields the global image feature for subsequent processing;
in order to make full use of the contextual information of the image, a square-ring feature partition strategy is applied to the extracted global image feature: neighboring regions serve as auxiliary information, weighted by the attention given by their distance to the image center, enriching the discriminative cues of the geographic image; specifically, using the square-ring partition design, the global image feature extracted by the feature extraction module is divided into several square-ring parts, and each part is then average-pooled to obtain a feature of dimension 2048; this process is expressed as follows:
$f_j = F_{resnet\text{-}50}(x_j)$

$f_j^i = F_{slice}(f_j)$

$\bar{f}_j^i = \mathrm{Avgpool}(f_j^i)$

where the subscript $j$ indexes the different views, $x_j$ denotes the input image, $f_j$ denotes the extracted global image feature, $f_j^i$ denotes the feature of the $i$-th part divided from the global image feature $f_j$, $\bar{f}_j^i$ denotes the feature of the $i$-th divided part after average pooling, $F_{slice}$ denotes the square-ring feature partition operation, and Avgpool denotes the average pooling operation; the resulting initial features $\bar{f}_j^i$ are the input to classifier 1 and the information bottleneck module;
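The square-ring partition F_slice can be illustrated on a single-channel feature map: cells are grouped by their distance to the border into concentric square rings, and each ring is average-pooled. The ring-assignment rule below is an illustrative assumption; the patent does not fix the exact indexing.

```python
def square_ring_partition(fmap, n_rings):
    # fmap: H x H nested list (one channel); group cells into n_rings
    # concentric square rings by distance to the border, outermost first
    h = len(fmap)
    rings = [[] for _ in range(n_rings)]
    for r in range(h):
        for c in range(h):
            d = min(r, c, h - 1 - r, h - 1 - c)  # distance to nearest border
            idx = min(d * 2 * n_rings // h, n_rings - 1)
            rings[idx].append(fmap[r][c])
    # average pooling within each ring, as in Avgpool(f_j^i)
    return [sum(v) / len(v) for v in rings]

# 4 x 4 map: border cells are 0, inner 2 x 2 cells are 1 -> two clean rings
fmap = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
pooled = square_ring_partition(fmap, n_rings=2)
```

On a real 2048-channel ResNet-50 map, the same grouping would be applied per channel, giving one 2048-dimensional pooled feature per ring.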
classifier 1 consists of a fully connected layer, a batch normalization layer, a Dropout layer and a classification layer; the classification layer is a fully connected layer whose output vector dimension equals the number of geographic-target categories;
the information bottleneck module is implemented by an encoder that compresses and reduces the dimension of the obtained initial features $\bar{f}_j^i$, outputting features of dimension 400, smaller than the commonly used feature dimension 512;
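The encoder of the information bottleneck module maps each 2048-dimensional initial feature to 400 dimensions. Below is a stand-in sketch using a single random linear projection; the real encoder is learned end-to-end during training, so the weight initialization and the single-layer form here are purely illustrative assumptions.

```python
import random

def make_encoder(in_dim=2048, out_dim=400, seed=0):
    # linear projection W (out_dim x in_dim) as a stand-in for the IB encoder
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)]
         for _ in range(out_dim)]
    def encode(x):
        # compress an in_dim feature to an out_dim representation z
        return [sum(w * xi for w, xi in zip(row, x)) for row in W]
    return encode

encode = make_encoder(in_dim=8, out_dim=4)  # small dims for illustration
z = encode([1.0] * 8)                       # compressed 4-d representation
```

The output dimension (400 in the claim) directly sets the size of the retrieval feature, trading representational capacity for retrieval speed.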
the input of classifier 2 is the output of the information bottleneck module; its input feature dimension is 400 and the dimension of its output vector is the number of geographic-target categories, with a batch normalization layer and a Dropout layer in between as well.
3. The cross-view geographic image retrieval method based on information bottleneck variational distillation according to claim 2, characterized in that view 1 is the satellite view and view 2 is the unmanned aerial vehicle view.
CN202210285790.7A 2022-03-22 2022-03-22 Cross-view angle geographic image retrieval method based on information bottleneck variational distillation Active CN114691911B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210285790.7A CN114691911B (en) 2022-03-22 2022-03-22 Cross-view angle geographic image retrieval method based on information bottleneck variational distillation


Publications (2)

Publication Number Publication Date
CN114691911A CN114691911A (en) 2022-07-01
CN114691911B true CN114691911B (en) 2023-04-07

Family

ID=82139786

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210285790.7A Active CN114691911B (en) 2022-03-22 2022-03-22 Cross-view angle geographic image retrieval method based on information bottleneck variational distillation

Country Status (1)

Country Link
CN (1) CN114691911B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117036910B (en) * 2023-09-28 2024-01-12 合肥千手医疗科技有限责任公司 Medical image training method based on multi-view and information bottleneck

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101271465A (en) * 2007-12-12 2008-09-24 北京航空航天大学 Lens clustering method based on information bottleneck theory
CN109740013A (en) * 2018-12-29 2019-05-10 深圳英飞拓科技股份有限公司 Image processing method and image search method
CA3060914A1 (en) * 2018-11-05 2020-05-05 Royal Bank Of Canada Opponent modeling with asynchronous methods in deep rl
US10664722B1 (en) * 2016-10-05 2020-05-26 Digimarc Corporation Image processing arrangements
CN113836330A (en) * 2021-09-13 2021-12-24 清华大学深圳国际研究生院 Image retrieval method and device based on generation antagonism automatic enhanced network
CN114022727A (en) * 2021-10-20 2022-02-08 之江实验室 Deep convolution neural network self-distillation method based on image knowledge review

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7035467B2 (en) * 2002-01-09 2006-04-25 Eastman Kodak Company Method and system for processing images for themed imaging services
US8929877B2 (en) * 2008-09-12 2015-01-06 Digimarc Corporation Methods and systems for content processing
US11074495B2 (en) * 2013-02-28 2021-07-27 Z Advanced Computing, Inc. (Zac) System and method for extremely efficient image and pattern recognition and artificial intelligence platform


Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Heterogeneous relational complement for vehicle re-identification; Jiajian Zhao et al.; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV); 205-214 *
Learning discriminative representations via variational self-distillation for cross-view geo-localization; Qian Hu et al.; Computers and Electrical Engineering; 1-11 *
UAV-Satellite View Synthesis for Cross-View Geo-Localization; Xiaoyang Tian; IEEE Transactions on Circuits and Systems for Video Technology; Vol. 32, No. 7; 4804-4815 *
Multilingual text clustering algorithm based on parallel information bottleneck; Yan Xiaoqiang et al.; Pattern Recognition and Artificial Intelligence; No. 06; 81-90 *
A survey of image object detection algorithms based on deep learning; Zhang Tingting et al.; Telecommunications Science; No. 07; 96-110 *
Research on drivable-area segmentation algorithms for vehicles based on knowledge distillation; Zhou Su et al.; Automobile Technology; No. 01; 5-9 *


Similar Documents

Publication Publication Date Title
CN111177446B (en) Method for searching footprint image
CN110728263A (en) Pedestrian re-identification method based on strong discrimination feature learning of distance selection
CN103324677B (en) Hierarchical fast image global positioning system (GPS) position estimation method
CN112633382B (en) Method and system for classifying few sample images based on mutual neighbor
CN103020265B (en) The method and system of image retrieval
CN111652293B (en) Vehicle weight recognition method for multi-task joint discrimination learning
CN114005096A (en) Vehicle weight recognition method based on feature enhancement
CN103810299A (en) Image retrieval method on basis of multi-feature fusion
CN110598543A (en) Model training method based on attribute mining and reasoning and pedestrian re-identification method
CN107958067A (en) It is a kind of based on without mark Automatic Feature Extraction extensive electric business picture retrieval system
CN109308324A (en) A kind of image search method and system based on hand drawing style recommendation
CN115171165A (en) Pedestrian re-identification method and device with global features and step-type local features fused
CN114691911B (en) Cross-view angle geographic image retrieval method based on information bottleneck variational distillation
CN116775922A (en) Remote sensing image cross-modal retrieval method based on fusion of language and visual detail characteristics
CN114972506A (en) Image positioning method based on deep learning and street view image
Aly et al. Axes at trecvid 2013
CN112232885A (en) Multi-mode information fusion-based warehouse rental price prediction method
CN114860974A (en) Remote sensing image retrieval positioning method
CN117011883A (en) Pedestrian re-recognition method based on pyramid convolution and transducer double branches
CN114067356B (en) Pedestrian re-recognition method based on combined local guidance and attribute clustering
CN115719455A (en) Ground-to-air geographic positioning method
CN115410102A (en) SAR image airplane target detection method based on combined attention mechanism
CN114610941A (en) Cultural relic image retrieval system based on comparison learning
CN110941994A (en) Pedestrian re-identification integration method based on meta-class-based learner
CN114491135A (en) Cross-view angle geographic image retrieval method based on variation information bottleneck

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant