CN112632315B - Method and device for retrieving remote sensing image - Google Patents
- Publication number
- CN112632315B CN112632315B CN202011629213.2A CN202011629213A CN112632315B CN 112632315 B CN112632315 B CN 112632315B CN 202011629213 A CN202011629213 A CN 202011629213A CN 112632315 B CN112632315 B CN 112632315B
- Authority
- CN
- China
- Prior art keywords
- remote sensing
- sensing image
- image data
- network model
- target network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/14—Transformations for image registration, e.g. adjusting or mapping for alignment of images
- G06T3/147—Transformations for image registration, e.g. adjusting or mapping for alignment of images using affine transformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
Abstract
The embodiment of the invention provides a method and a device for remote sensing image retrieval, wherein the method comprises the following steps: acquiring remote sensing image data; dividing the remote sensing image data into a training set, a verification set and a test set; building a target network model for retrieving remote sensing image data; performing model training on the target network model using the training set and the verification set to obtain the model weight file with the highest accuracy on the verification set; and adjusting the target network model according to the model weight file, and retrieving the test set using the adjusted target network model to obtain a retrieval result. Through the embodiments of the invention, remote sensing images are retrieved and retrieval accuracy is improved.
Description
Technical Field
The invention relates to the field of image recognition, in particular to a method and a device for remote sensing image retrieval.
Background
With the rapid development of remote sensing technology, remote sensing image databases have grown explosively. To manage these databases efficiently, content-based image retrieval (CBIR) systems have become a research hot spot at home and abroad.
Early CBIR systems extracted visual features based on image texture, color, shape, and so on, for example with the Scale-Invariant Feature Transform (SIFT) and Histogram of Oriented Gradients (HOG) algorithms, but such low-level global features are susceptible to changes in viewing angle, illumination, occlusion, and the like.
Disclosure of Invention
In view of the above, it is proposed to provide a method and apparatus for remote sensing image retrieval that overcomes or at least partially solves the above mentioned problems, comprising:
a method of remote sensing image retrieval, the method comprising:
acquiring remote sensing image data;
dividing the remote sensing image data into a training set, a verification set and a test set;
building a target network model for retrieving remote sensing image data;
performing model training on the target network model by adopting the training set and the verification set to obtain a model weight file with the highest accuracy on the verification set;
and adjusting the target network model according to the model weight file, and retrieving the test set by adopting the adjusted target network model to obtain a retrieval result.
Optionally, the building a target network model for remote sensing image data retrieval includes:
constructing a space attention mechanism module;
and carrying out affine transformation on the space attention mechanism module to build a target network model for retrieving the remote sensing image data.
Optionally, the performing affine transformation on the spatial attention mechanism module includes:
determining affine transformation parameters;
and carrying out affine transformation on the space attention mechanism module according to the affine transformation parameters.
Optionally, before the dividing the remote sensing image data into a training set, a verification set, and a test set, the method further includes:
and preprocessing and data augmentation are carried out on the remote sensing image data.
Optionally, the data augmentation includes but is not limited to:
turning the remote sensing image data, translating the remote sensing image data, zooming the remote sensing image data, adjusting the RGB channel weight of the remote sensing image data, and rotating the remote sensing image data.
An apparatus for remote sensing image retrieval, the apparatus comprising:
the remote sensing image data acquisition module is used for acquiring remote sensing image data;
the remote sensing image data dividing module is used for dividing the remote sensing image data into a training set, a verification set and a test set;
the target network model building module is used for building a target network model for remote sensing image data retrieval;
a model weight file obtaining module, configured to perform model training on the target network model by using the training set and the verification set, so as to obtain a model weight file with the highest accuracy in the verification set;
and the retrieval result obtaining module is used for adjusting the target network model according to the model weight file and retrieving the test set by adopting the adjusted target network model to obtain a retrieval result.
Optionally, the target network model building module includes:
the space attention mechanism module construction submodule is used for constructing a space attention mechanism module;
and the affine transformation submodule is used for carrying out affine transformation on the space attention mechanism module so as to build a target network model for searching remote sensing image data.
Optionally, the affine transformation submodule includes:
an affine transformation parameter determining unit for determining affine transformation parameters;
and the spatial attention mechanism module transformation unit is used for carrying out affine transformation on the spatial attention mechanism module according to the affine transformation parameters.
Optionally, the method further comprises:
and the image adjusting module is used for preprocessing the remote sensing image data and augmenting the data.
Optionally, the data augmentation includes but is not limited to:
turning the remote sensing image data, translating the remote sensing image data, zooming the remote sensing image data, adjusting the RGB channel weight of the remote sensing image data, and rotating the remote sensing image data.
The embodiment of the invention has the following advantages:
in the embodiment of the invention, remote sensing image data is obtained; dividing the remote sensing image data into a training set, a verification set and a test set; building a target network model for retrieving remote sensing image data; performing model training on the target network model by adopting the training set and the verification set to obtain a model weight file with the highest accuracy on the verification set; and adjusting the target network model according to the model weight file, and retrieving the test set by adopting the adjusted target network model to obtain a retrieval result. By the embodiment of the invention, the remote sensing image is retrieved, and the retrieval accuracy is improved.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings needed to be used in the description of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1a is a schematic diagram of an improved Spatial Attention Transform module according to an embodiment of the present invention;
FIG. 1b is a schematic diagram of an improved Spatial Attention Transform module according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating steps of a method for remote sensing image retrieval according to an embodiment of the present invention;
FIG. 3 is a diagram of a SAT-InceptionV4 network according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a training process of an image retrieval model according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating loading of an optimal model for image retrieval according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an apparatus for remote sensing image retrieval according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
With the advent of the Convolutional Neural Network (CNN), CNNs have been applied more and more widely in the field of computer vision, for example in image classification, object detection, and image retrieval. Compared with traditional manual feature extraction, the deep features extracted by a CNN contain richer image information and improve retrieval performance.
Research on image retrieval mainly covers two aspects. One is improvement of the triplet loss function, such as hard triplet loss, multi-similarity loss, circle loss, proxy anchor loss, and the like, chiefly by adjusting the weights of positive and negative sample pairs.
The other aspect generally concerns improvements to the underlying backbone network, such as adding regional attention networks or visual attention mechanisms. More and more researchers have introduced these attention mechanisms into the field of image analysis to achieve efficient and accurate image retrieval.
Traditional processing methods design some features manually and in a targeted way, and interleave rules into the processing to correct cases that the algorithm handles poorly. However, traditional methods perform poorly when the image background is complex, interference is heavy, images are similar, or the shooting angle of the object to be retrieved differs greatly. Network models based on regional or visual attention mechanisms mainly extract information from local regions. The present method combines affine transformation with a spatial attention mechanism module, using deep learning to reduce manual intervention and to improve the accuracy on remote sensing images and the stability of the algorithm.
Based on the above problems, the invention provides a remote sensing image retrieval method with an improved spatial attention mechanism. This model can also be seen as an improvement to the spatial attention mechanism. The specific improvements are as follows:
1. A Feature Affine Transformation (FAT) is used in combination with InceptionV4 as the backbone network, improving the feature extraction capability of the model.
2. A Spatial Attention mechanism is introduced to improve the decoding effect of the feature vectors, which helps solve the problem of important local-region information being masked.
Specifically, the method can comprise the following steps:
1. Prepare the remote sensing images to be retrieved, place images of the same class under the same folder, and use the class name as the folder name.
2. Preprocess and augment the images from step 1.
3. Divide all the images finished in step 2 into a training set, a verification set, and a test set.
4. With an InceptionV4 model as the core, combine affine transformation with the spatial attention mechanism module (SAT) to build an end-to-end remote sensing image retrieval network model. The results of the experiments are shown in the following table:
An end-to-end remote sensing image retrieval network model is constructed, as shown in Fig. 1a and Fig. 1b; in general terms, affine transformation is added and the spatial attention mechanism module is improved. The specific operations include the following steps:
4.1. Construct a Spatial Attention (SA) module; as shown in Figs. 1a and 1b, the overall operation can be summarized in the following three parts.
The Squeeze operation is to compress the channel dimension in a channel summation manner, and the image data is also changed into H × W × 1 form from the original H × W × C spatial structure.
The Reshape operation transforms the two-dimensional feature vector graph H × W × 1 into a (H × W) -dimensional vector, and transforms the one-dimensional vector to the vector graph H × W × 1 using the softmax activation function and the Reshape function.
And (4) Channel-multi operation, namely multiplying the H multiplied by W size feature diagram with the C channels of the original feature.
4.2. Perform the affine transformation improvement on the SA module constructed in step 4.1. As shown in Fig. 1b, affine transformation is added before the Channel-multi operation of the SA module, in the following two main steps:
Solving the affine transformation parameters A_θ: the H × W × 1 map is reduced to a 2 × 3 parameter matrix using a convolution kernel. A_θ is initialized to the identity transformation matrix and is continuously corrected by the loss function, finally yielding the expected affine transformation matrix.
Parametric grid sampling: the aim of this step is to obtain, for each coordinate point of the output feature map, the position of the corresponding coordinate point of the input feature map. The calculation method is as follows:
(x_s, y_s)^T = A_θ · (x_t, y_t, 1)^T
where s represents an input feature map coordinate point, t represents an output feature map coordinate point, and A_θ is the output of the localization network. The pixel value of each target coordinate point is extracted, using bilinear interpolation for non-integer coordinates, thereby obtaining the output feature map V.
4.3. Set the input image size to 299 × 299, perform multilayer convolution with the SAT network formed in step 4.2 to extract the corrected features in the image, and finally output a 1 × 512 feature vector.
5. Send the training set and verification set arranged in step 3 into the network model built in step 4 for training, and through stepwise iteration select the model weight file with the highest accuracy on the verification set.
6. Use the inference network to load the optimal weight file selected in step 5 and retrieve the remote sensing images.
The difference lies in that the existing attention mechanism combines spatial attention and channel attention. Through many comparative experiments, this patent concludes that the spatial attention module brings a certain metric improvement to image retrieval, while channel attention brings none; fusing the two and training likewise yields no metric improvement beyond using spatial attention alone. Combining this conclusion, this patent fuses the affine transformation into the spatial attention module to obtain the SAT (Spatial Attention Transform) network.
This patent fuses the affine transformation into the convolutional neural network. Compared with using affine transformation alone (without fusion), the specific difference is this: the common practice is to apply affine transformation to images during data enhancement, with parameters set manually by the trainer, which cannot cover all possible affine transformations. This method instead fuses the affine transformation into the convolutional neural network, so its parameters are not set artificially but are learned during model training. Fusing the affine transformation into the model therefore has a generalization effect, which the separate (non-fused) use does not achieve.
The approach of fusing the affine transformation into the spatial attention module is innovative, because the conventional operation is to apply affine transformation during data enhancement to expand the data. Since the shooting angle of a remote sensing image is not fixed and the imaging angle of the retrieval target is changeable, conventional affine transformation cannot adapt to the remote sensing retrieval scene; the affine transformation and the spatial attention module are therefore fused.
Adding the affine transformation to the spatial attention mechanism module solves the problem of retrieval failure caused by an excessive rotation angle of the image content; and by continually experimenting with combining the improved spatial attention mechanism module with InceptionV4, the feature extraction capability of the model is improved, thereby improving the accuracy of image retrieval.
Referring to fig. 2, a flowchart illustrating steps of a method for retrieving a remote sensing image according to an embodiment of the present invention is shown, which may specifically include the following steps:
In practical application, the remote sensing image data to be retrieved can be acquired, images of the same class placed under the same folder, and the class name used as the folder name.
Specifically, the data are divided into a training set, a verification set, and a test set in the proportion 8:1:1.
Optionally, when more data are allocated for training, the division ratio of the data set is adjusted to 8.5:0.5:1.
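For illustration (the patent gives no code), the 8:1:1 division might be sketched as follows; the deterministic shuffle and the file names are assumptions:

```python
import random

def split_dataset(paths, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the image paths and split them into train/verification/test sets."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)  # deterministic shuffle for reproducibility
    n_train = int(len(paths) * ratios[0])
    n_val = int(len(paths) * ratios[1])
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])

# hypothetical file names; real data would come from the class folders
train, val, test = split_dataset([f"img_{i}.png" for i in range(100)])
```

With 100 images this yields an 80/10/10 split; the optional 8.5:0.5:1 split is obtained by passing `ratios=(0.85, 0.05, 0.1)`.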
in an embodiment of the present invention, before step 202, the method further includes:
and preprocessing and data augmentation are carried out on the remote sensing image data.
As an example, ways of data augmentation may include, but are not limited to:
turning the remote sensing image data, translating the remote sensing image data, zooming the remote sensing image data, adjusting the RGB channel weight of the remote sensing image data, and rotating the remote sensing image data.
Specifically, the data augmentation method includes, but is not limited to, flipping the image, translating, scaling, adjusting the RGB channel weights of the image, and rotating the image.
Optionally, the color richness of the image is adjusted, the illumination intensity of the image is changed, the contrast of the image is adjusted (Contrast), and the image is sharpened (Sharpness).
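A few of the listed augmentations (flipping, translation, RGB channel weighting, rotation) can be sketched with NumPy array operations; zooming is omitted for brevity, and the channel weights and 90° rotation steps here are illustrative assumptions, not values from the patent:

```python
import numpy as np

def flip(img, horizontal=True):
    """Mirror the image horizontally (default) or vertically."""
    return img[:, ::-1] if horizontal else img[::-1]

def translate(img, dx, dy):
    """Shift the image by (dx, dy) pixels, zero-filling the vacated border."""
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        img[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def weight_channels(img, weights=(1.0, 0.9, 1.1)):
    """Re-weight the R, G, B channels (weights are illustrative)."""
    return np.clip(img * np.asarray(weights), 0.0, 1.0)

def rotate90(img, k=1):
    """Rotate by k * 90 degrees."""
    return np.rot90(img, k)

img = np.random.rand(8, 8, 3)
aug = rotate90(weight_channels(flip(img)), k=1)
```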
in an embodiment of the present invention, step 203 may include the following sub-steps:
constructing a space attention mechanism module; and carrying out affine transformation on the space attention mechanism module to build a target network model for retrieving the remote sensing image data.
In an embodiment of the present invention, performing affine transformation on the spatial attention mechanism module may include the following sub-steps:
determining affine transformation parameters; and carrying out affine transformation on the space attention mechanism module according to the affine transformation parameters.
Specifically, an Inception V4 model is taken as a core, and an affine transformation and a space attention mechanism module are combined to build an end-to-end remote sensing image retrieval network model.
An end-to-end remote sensing image retrieval network model is constructed, as shown in Fig. 1a and Fig. 1b; in general terms, affine transformation is added and the spatial attention mechanism module is improved. The specific operations include the following steps:
4.1. Construct a Spatial Attention (SA) module; as shown in Fig. 1a, the overall operation can be summarized in the following three parts.
The Squeeze operation is to compress the channel dimension in a channel summation manner, and the image data is also changed into H × W × 1 form from the original H × W × C spatial structure.
Specifically, the feature maps of the C channels are accumulated in sequence; for example, channel 1, channel 2, ..., and channel C at the same position are summed, finally obtaining the H × W × 1 feature map.
The Reshape operation transforms the two-dimensional feature vector graph H × W × 1 into a (H × W) -dimensional vector, and transforms the one-dimensional vector to the vector graph H × W × 1 using the softmax activation function and the Reshape function.
In particular, the softmax activation function is used in multi-classification processes; it maps the outputs of multiple neurons into the (0, 1) interval, which can be understood as probabilities. Assuming an array V in which V_i denotes the i-th element, the softmax value of this element is:
S_i = e^{V_i} / Σ_j e^{V_j}
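A minimal sketch of this softmax mapping:

```python
import numpy as np

def softmax(v):
    """Map a vector into the (0, 1) interval so its entries sum to 1."""
    e = np.exp(v - np.max(v))  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([1.0, 2.0, 3.0])
probs = softmax(scores)
```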
and (4) Channel-multi operation, namely multiplying the H multiplied by W size characteristic diagram with each Channel of the original feature.
Specifically, channel multiplication is performed between the activated feature map and the original input feature map H × W × C; for example, the value of the activated map at position (x1, y1) is multiplied with each of the C channels at position (x1, y1) of the original input, finally obtaining an output feature map of size H × W × C.
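The three operations (Squeeze by channel summation, Reshape with softmax, and Channel-multi) can be put together in a short NumPy sketch; this is an illustrative reading of the description, not the patent's implementation:

```python
import numpy as np

def spatial_attention(x):
    """Spatial attention over a feature map x of shape (H, W, C).

    Squeeze: sum over channels -> (H, W).
    Reshape: flatten to (H*W,), apply softmax, reshape back to (H, W, 1).
    Channel-multi: multiply every channel of x by the attention map.
    """
    h, w, c = x.shape
    squeezed = x.sum(axis=2)               # (H, W): channel summation
    flat = squeezed.reshape(-1)            # (H*W,)
    e = np.exp(flat - flat.max())          # numerically stable softmax
    attn = (e / e.sum()).reshape(h, w, 1)  # (H, W, 1) attention map
    return x * attn                        # broadcast over the C channels

x = np.random.rand(35, 35, 384).astype(np.float32)
y = spatial_attention(x)
```

Every channel is re-weighted by the same H × W attention map, which is why the output shape equals the input shape.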
4.2. Perform the affine transformation improvement on the SA module constructed in step 4.1. As shown in Fig. 1b, affine transformation is added before the Channel-multi operation of the SA module, in the following two main steps:
Solving the affine transformation parameters A_θ: the H × W × 1 map is reduced to a 2 × 3 parameter matrix using a convolution kernel. A_θ is initialized to the identity transformation matrix and is continuously corrected by the loss function, finally yielding the expected affine transformation matrix.
Parametric grid sampling: the aim of this step is to obtain, for each coordinate point of the output feature map, the position of the corresponding coordinate point of the input feature map. The calculation method is as follows:
(x_s, y_s)^T = A_θ · (x_t, y_t, 1)^T
where s represents an input feature map coordinate point, t represents an output feature map coordinate point, and A_θ is the output of the localization network. The pixel value of each target coordinate point is extracted, using bilinear interpolation for non-integer coordinates, thereby obtaining the output feature map V.
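As an illustrative single-channel sketch (under the assumption of the normalized [-1, 1] coordinate convention used by spatial transformer networks), the parametric grid sampling with bilinear interpolation can be written as:

```python
import numpy as np

def affine_grid_sample(u, a_theta):
    """Sample input feature map u (H, W) through the 2x3 affine matrix a_theta.

    For each output point (x_t, y_t) in normalized [-1, 1] coordinates, the
    source point is (x_s, y_s)^T = a_theta @ (x_t, y_t, 1)^T, read out with
    bilinear interpolation.
    """
    h, w = u.shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w),
                         indexing="ij")
    tgt = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # (3, H*W)
    src = a_theta @ tgt                                       # (2, H*W)
    # map normalized source coordinates back to pixel indices
    px = (src[0] + 1) * (w - 1) / 2
    py = (src[1] + 1) * (h - 1) / 2
    x0 = np.clip(np.floor(px).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(py).astype(int), 0, h - 2)
    dx, dy = np.clip(px - x0, 0, 1), np.clip(py - y0, 0, 1)
    v = (u[y0, x0] * (1 - dx) * (1 - dy) + u[y0, x0 + 1] * dx * (1 - dy)
         + u[y0 + 1, x0] * (1 - dx) * dy + u[y0 + 1, x0 + 1] * dx * dy)
    return v.reshape(h, w)

identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
u = np.arange(16, dtype=np.float64).reshape(4, 4)
v = affine_grid_sample(u, identity)
```

With A_θ set to the identity transformation, the output feature map reproduces the input, matching the initialization of A_θ described above.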
The invention uses the Inception network as the base network. The first improved SAT module is inserted after the stem block; the input of the SAT module is 35 × 35 × 384. Inside the SAT module, the channel accumulation calculation is performed first, turning the feature map into 35 × 35 × 1, and after the reshape, softmax, and reshape operations in sequence it is restored to 35 × 35 × 1. The affine transformation module is then added to handle affine transformations of the image content, and the result is finally fused with the original 35 × 35 × 384 input to complete the SAT module, whose output is 35 × 35 × 384. After the SAT block is completed, a 512-dimensional feature is output using the average pooling layer and the fully connected layer. Similarly, the invention performs the same SAT module addition after both the Reduction-A and Reduction-B blocks, as shown in detail in FIG. 3. Finally, the whole network outputs four 1 × 512 feature maps, and the improved triplet loss function, multi-similarity loss (Equation 2), is used as the loss function of the network model.
where α, β, and λ are hyper-parameters; P_i is a positive sample in the positive sample set and N_i is a negative sample in the negative sample set.
Let x_i be the anchor. When a negative sample is selected, forming a negative pair (x_i, x_j), the corresponding S_{i,j} satisfies:
S_{i,j} > min_{k ∈ P_i} S_{i,k} - ε
Similarly, when a positive sample is selected, forming a positive pair (x_i, x_j), the corresponding S_{i,j} satisfies:
S_{i,j} < max_{k ∈ N_i} S_{i,k} + ε
where ε is a given margin.
step 204, performing model training on the target network model by adopting the training set and the verification set to obtain a model weight file with the highest accuracy on the verification set;
and sending the sorted training set and verification set into the built network model, training, and selecting the model weight file with the highest accuracy on the verification set through gradual iteration.
Specifically, as shown in fig. 4, the image retrieval model is trained as follows: the image is loaded, edge filling (Padding) is performed, and then size scaling is applied; for images whose valid area ratio is too small, cropping and size enlargement are performed. The sorted training set and verification set are then sent into the built image retrieval model for iterative training.
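The loading pipeline (edge filling, then size scaling) might be sketched as follows; zero-valued padding placed at the top-left origin and nearest-neighbor scaling are simplifying assumptions:

```python
import numpy as np

def pad_to_square(img):
    """Edge-fill the shorter side with zeros so that H == W."""
    h, w = img.shape[:2]
    side = max(h, w)
    out = np.zeros((side, side) + img.shape[2:], dtype=img.dtype)
    out[:h, :w] = img  # original content kept at the origin
    return out

def resize_nearest(img, size):
    """Nearest-neighbor resize to (size, size)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

img = np.random.rand(200, 150, 3)
square = pad_to_square(img)
scaled = resize_nearest(square, 299)
```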
Step 205, adjusting the target network model according to the model weight file, and retrieving the test set by using the adjusted target network model to obtain a retrieval result.
Specifically, as shown in FIG. 5, the optimal model is loaded to perform the retrieval process. A test-set image is input and preprocessed; note that preprocessing here does not augment the image: it only resizes the image to 299 × 299 and normalizes it, scaling pixel values to (-1, 1). The retrieval network is then initialized, the dictionary file and optimal-model file paths are set, the model is loaded, image retrieval is performed, and finally the remote sensing image retrieval result is output.
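The test-time preprocessing just described can be sketched as follows. The nearest-neighbor resize and the divide-by-127.5 normalization are illustrative assumptions consistent with the stated 299 × 299 size and (-1, 1) pixel range.

```python
import numpy as np

def preprocess_for_retrieval(img):
    """Test-time preprocessing: no augmentation, only a resize to 299 x 299
    (naive nearest-neighbor for simplicity) and normalization of pixel
    values from [0, 255] into the (-1, 1) range."""
    h, w = img.shape[:2]
    rows = np.arange(299) * h // 299
    cols = np.arange(299) * w // 299
    resized = img[rows][:, cols]            # nearest-neighbor resize
    return resized.astype(np.float32) / 127.5 - 1.0

x = preprocess_for_retrieval(np.full((600, 400, 3), 255, dtype=np.uint8))
print(x.shape)    # (299, 299, 3)
```

A production pipeline would use a proper interpolating resize; the normalization, however, is the standard scheme for Inception-style networks.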
In the embodiments of the invention, remote sensing image data is acquired; the remote sensing image data is divided into a training set, a verification set and a test set; a target network model for retrieving remote sensing image data is built; model training is performed on the target network model using the training set and the verification set to obtain the model weight file with the highest accuracy on the verification set; and the target network model is adjusted according to the model weight file, and the test set is retrieved using the adjusted target network model to obtain the retrieval result. Through the embodiments of the invention, remote sensing images are retrieved with improved retrieval accuracy.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 6, a schematic structural diagram of an apparatus for remote sensing image retrieval according to an embodiment of the present invention is shown, and the apparatus may specifically include the following modules:
a remote sensing image data acquisition module 601, configured to acquire remote sensing image data;
a remote sensing image data dividing module 602, configured to divide the remote sensing image data into a training set, a verification set, and a test set;
a target network model building module 603, configured to build a target network model for retrieving remote sensing image data;
a model weight file obtaining module 604, configured to perform model training on the target network model by using the training set and the verification set, so as to obtain a model weight file with the highest accuracy in the verification set;
a retrieval result obtaining module 605, configured to adjust the target network model according to the model weight file, and retrieve the test set by using the adjusted target network model to obtain a retrieval result.
In an embodiment of the present invention, the target network model building module 603 includes:
the space attention mechanism module construction submodule is used for constructing a space attention mechanism module;
and the affine transformation submodule is used for carrying out affine transformation on the spatial attention mechanism module so as to build a target network model for retrieving remote sensing image data.
In an embodiment of the present invention, the affine transformation submodule includes:
an affine transformation parameter determining unit for determining affine transformation parameters;
and the spatial attention mechanism module transformation unit is used for carrying out affine transformation on the spatial attention mechanism module according to the affine transformation parameters.
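Applying affine transformation parameters to an attention map can be sketched as below, in the style of a spatial transformer (inverse coordinate mapping with nearest-neighbor sampling). The patent does not specify the parameterization, so the 2 × 3 matrix form and the function name are assumptions.

```python
import numpy as np

def affine_warp(fmap, theta):
    """Warp a 2-D map by a 2x3 affine-parameter matrix `theta`, using inverse
    mapping and nearest-neighbor sampling (spatial-transformer-style sketch)."""
    h, w = fmap.shape
    out = np.zeros_like(fmap)
    for y in range(h):
        for x in range(w):
            sx = theta[0, 0] * x + theta[0, 1] * y + theta[0, 2]
            sy = theta[1, 0] * x + theta[1, 1] * y + theta[1, 2]
            sx, sy = int(round(sx)), int(round(sy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y, x] = fmap[sy, sx]
    return out

identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
m = np.random.rand(35, 35)
print(np.allclose(affine_warp(m, identity), m))   # True: identity leaves the map unchanged
```

Learning `theta` from the features themselves (rather than fixing it) is what would let the attention module compensate for affine changes in image content.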
In an embodiment of the present invention, the method further includes:
and the image adjusting module is used for preprocessing the remote sensing image data and augmenting the data.
In an embodiment of the present invention, the data augmentation method includes, but is not limited to:
flipping the remote sensing image data, translating the remote sensing image data, zooming the remote sensing image data, adjusting the RGB channel weights of the remote sensing image data, and rotating the remote sensing image data.
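The listed augmentations can be sketched as follows. Parameter ranges and probabilities are illustrative assumptions, and zooming is omitted for brevity.

```python
import numpy as np

def augment(img, rng):
    """Sketch of the listed augmentations: flipping, translation, RGB
    channel-weight adjustment, and 90-degree rotation."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                      # horizontal flip
    shift = rng.integers(-10, 11)
    img = np.roll(img, shift, axis=1)           # crude translation
    weights = rng.uniform(0.9, 1.1, size=3)     # per-channel RGB weight
    img = np.clip(img * weights, 0, 255).astype(np.uint8)
    k = rng.integers(0, 4)
    return np.rot90(img, k)                     # rotate by k * 90 degrees

rng = np.random.default_rng(0)
out = augment(np.full((64, 64, 3), 128, dtype=np.uint8), rng)
print(out.shape)   # (64, 64, 3)
```

Flips and right-angle rotations are especially appropriate for remote sensing imagery, where scene orientation is arbitrary.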
An embodiment of the present invention further provides an electronic device, which may include a processor, a memory, and a computer program stored on the memory and capable of running on the processor, and when executed by the processor, the computer program implements the method for remote sensing image retrieval as above.
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method for remote sensing image retrieval as above is implemented.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The method and the device for remote sensing image retrieval have been introduced in detail above, and a specific example is used herein to explain the principle and implementation of the invention; the description of the embodiments is only intended to help in understanding the method and its core idea. Meanwhile, for those skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.
Claims (4)
1. A method for remote sensing image retrieval, the method comprising:
acquiring remote sensing image data;
dividing the remote sensing image data into a training set, a verification set and a test set;
building a target network model for retrieving remote sensing image data;
performing model training on the target network model by adopting the training set and the verification set to obtain a model weight file with the highest accuracy on the verification set;
adjusting the target network model according to the model weight file, and retrieving the test set by adopting the adjusted target network model to obtain a retrieval result;
the model training of the target network model by using the training set and the verification set to obtain a model weight file with the highest accuracy on the verification set includes:
sending the training set and the verification set into the established network model, training, and selecting a model weight file with the highest accuracy on the verification set through gradual iteration;
wherein, the establishing of the target network model for remote sensing image data retrieval comprises the following steps:
constructing a space attention mechanism module;
carrying out affine transformation on the space attention mechanism module to build a target network model for retrieving remote sensing image data;
wherein the performing affine transformation on the spatial attention mechanism module comprises:
determining affine transformation parameters;
carrying out affine transformation on the space attention mechanism module according to the affine transformation parameters;
before dividing the remote sensing image data into a training set, a verification set and a test set, the method further comprises the following steps:
and preprocessing and data augmentation are carried out on the remote sensing image data.
2. The method of claim 1, wherein the data augmentation mode includes but is not limited to:
flipping the remote sensing image data, translating the remote sensing image data, zooming the remote sensing image data, adjusting the RGB channel weights of the remote sensing image data, and rotating the remote sensing image data.
3. An apparatus for remote sensing image retrieval, the apparatus comprising:
the remote sensing image data acquisition module is used for acquiring remote sensing image data;
the remote sensing image data dividing module is used for dividing the remote sensing image data into a training set, a verification set and a test set;
the target network model building module is used for building a target network model for remote sensing image data retrieval;
a model weight file obtaining module, configured to perform model training on the target network model by using the training set and the verification set, so as to obtain a model weight file with the highest accuracy in the verification set;
a retrieval result obtaining module, configured to adjust the target network model according to the model weight file, and retrieve the test set by using the adjusted target network model to obtain a retrieval result;
the model training of the target network model by using the training set and the verification set to obtain a model weight file with the highest accuracy on the verification set includes:
sending the training set and the verification set into the established network model, training, and selecting a model weight file with the highest accuracy on the verification set through gradual iteration;
wherein, the target network model building module comprises:
the space attention mechanism module construction submodule is used for constructing a space attention mechanism module;
the affine transformation submodule is used for carrying out affine transformation on the space attention mechanism module so as to build a target network model for searching remote sensing image data; wherein the affine transformation submodule comprises:
an affine transformation parameter determining unit for determining affine transformation parameters;
the spatial attention mechanism module transformation unit is used for carrying out affine transformation on the spatial attention mechanism module according to the affine transformation parameters;
further comprising:
and the image adjusting module is used for preprocessing the remote sensing image data and augmenting the data.
4. The apparatus of claim 3, wherein the data augmentation modes include but are not limited to:
flipping the remote sensing image data, translating the remote sensing image data, zooming the remote sensing image data, adjusting the RGB channel weights of the remote sensing image data, and rotating the remote sensing image data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011629213.2A CN112632315B (en) | 2020-12-30 | 2020-12-30 | Method and device for retrieving remote sensing image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112632315A CN112632315A (en) | 2021-04-09 |
CN112632315B true CN112632315B (en) | 2022-03-29 |
Family
ID=75290621
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011629213.2A Active CN112632315B (en) | 2020-12-30 | 2020-12-30 | Method and device for retrieving remote sensing image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112632315B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114170206B (en) * | 2021-12-13 | 2023-01-24 | 上海派影医疗科技有限公司 | Breast pathology image canceration property interpretation method and device considering spatial information correlation |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107845066A (en) * | 2017-10-09 | 2018-03-27 | 中国电子科技集团公司第二十八研究所 | Urban remote sensing image split-joint method and device based on piecewise affine transformations model |
CN110084794A (en) * | 2019-04-22 | 2019-08-02 | 华南理工大学 | A kind of cutaneum carcinoma image identification method based on attention convolutional neural networks |
CN110335261A (en) * | 2019-06-28 | 2019-10-15 | 山东科技大学 | It is a kind of based on when idle loop attention mechanism CT lymph node detection system |
CN111462196A (en) * | 2020-03-03 | 2020-07-28 | 中国电子科技集团公司第二十八研究所 | Remote sensing image matching method based on cuckoo search and Krawtchouk moment invariant |
CN111582225A (en) * | 2020-05-19 | 2020-08-25 | 长沙理工大学 | Remote sensing image scene classification method and device |
CN111612066A (en) * | 2020-05-21 | 2020-09-01 | 成都理工大学 | Remote sensing image classification method based on depth fusion convolutional neural network |
CN111680176A (en) * | 2020-04-20 | 2020-09-18 | 武汉大学 | Remote sensing image retrieval method and system based on attention and bidirectional feature fusion |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9014614B2 (en) * | 2011-10-20 | 2015-04-21 | Cogcubed Corporation | Cognitive assessment and treatment platform utilizing a distributed tangible-graphical user interface device |
EP3378222A4 (en) * | 2015-11-16 | 2019-07-03 | Orbital Insight, Inc. | Moving vehicle detection and analysis using low resolution remote sensing imagery |
US10503775B1 (en) * | 2016-12-28 | 2019-12-10 | Shutterstock, Inc. | Composition aware image querying |
CN107273502B (en) * | 2017-06-19 | 2020-05-12 | 重庆邮电大学 | Image geographic labeling method based on spatial cognitive learning |
CN110457516A (en) * | 2019-08-12 | 2019-11-15 | 桂林电子科技大学 | A kind of cross-module state picture and text search method |
CN111401375B (en) * | 2020-03-09 | 2022-12-30 | 苏宁云计算有限公司 | Text recognition model training method, text recognition device and text recognition equipment |
CN111476292B (en) * | 2020-04-03 | 2021-02-19 | 北京全景德康医学影像诊断中心有限公司 | Small sample element learning training method for medical image classification processing artificial intelligence |
Non-Patent Citations (3)
Title |
---|
Spatial attention deep net with partial pso for hierarchical hybrid hand pose estimation;Ye Q 等;《European conference on computer vision》;20160917;第346-361页 * |
STAR-Net: a spatial attention residue network for scene text recognition;Liu W 等;《Proceedings of British Machine Vision Conference (BMVC)》;20161231;第1-13页 * |
Research on person re-identification based on attention model and feature-level affine alignment model; Ma Li; China Master's Theses Full-text Database, Information Science and Technology; 20190715 (No. 7); pp. I138-1323 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110866140B (en) | Image feature extraction model training method, image searching method and computer equipment | |
CN108509978B (en) | Multi-class target detection method and model based on CNN (CNN) multi-level feature fusion | |
CN106909924B (en) | Remote sensing image rapid retrieval method based on depth significance | |
CN109960742B (en) | Local information searching method and device | |
US20220222918A1 (en) | Image retrieval method and apparatus, storage medium, and device | |
CN108846404B (en) | Image significance detection method and device based on related constraint graph sorting | |
CN111612008A (en) | Image segmentation method based on convolution network | |
CN110866938B (en) | Full-automatic video moving object segmentation method | |
CN116721301B (en) | Training method, classifying method, device and storage medium for target scene classifying model | |
CN112149526B (en) | Lane line detection method and system based on long-distance information fusion | |
US20110225172A1 (en) | System, method, and computer-readable medium for seeking representative images in image set | |
CN111461196B (en) | Rapid robust image identification tracking method and device based on structural features | |
CN114821052A (en) | Three-dimensional brain tumor nuclear magnetic resonance image segmentation method based on self-adjustment strategy | |
Obeso et al. | Saliency-based selection of visual content for deep convolutional neural networks: application to architectural style classification | |
CN112632315B (en) | Method and device for retrieving remote sensing image | |
CN110135435B (en) | Saliency detection method and device based on breadth learning system | |
Vora et al. | Iterative spectral clustering for unsupervised object localization | |
CN111177447A (en) | Pedestrian image identification method based on depth network model | |
WO2024027347A1 (en) | Content recognition method and apparatus, device, storage medium, and computer program product | |
CN117636298A (en) | Vehicle re-identification method, system and storage medium based on multi-scale feature learning | |
CN117315090A (en) | Cross-modal style learning-based image generation method and device | |
CN113849679A (en) | Image retrieval method, image retrieval device, electronic equipment and storage medium | |
CN115984671A (en) | Model online updating method and device, electronic equipment and readable storage medium | |
JP6778625B2 (en) | Image search system, image search method and image search program | |
CN114913402A (en) | Fusion method and device of deep learning model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||