CN105243154A - Remote sensing image retrieval method and system based on salient point features and sparse autoencoding - Google Patents

Remote sensing image retrieval method and system based on salient point features and sparse autoencoding

Info

Publication number
CN105243154A
CN105243154A (application CN201510708598.4A; granted publication CN105243154B)
Authority
CN
China
Prior art keywords
image
matrix
salient
feature
sparse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510708598.4A
Other languages
Chinese (zh)
Other versions
CN105243154B (en)
Inventor
邵振峰
周维勋
李从敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201510708598.4A
Publication of CN105243154A
Application granted
Publication of CN105243154B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval using metadata automatically derived from the content
    • G06F16/5838 Retrieval using metadata automatically derived from the content, using colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A remote sensing image retrieval method and system based on salient point features and sparse autoencoding are disclosed. The method comprises the steps of: extracting the feature points of each image in an image library to obtain a feature point matrix, and computing a saliency map of each image with a visual attention model; binarizing each saliency map with an adaptive thresholding method and applying it as a mask to the feature point matrix to obtain the filtered salient feature points; selecting a number of salient feature points from each training image to construct training samples; training a sparse autoencoder network on the whitened training sample set to obtain a feature extractor; extracting features with the feature extractor and sparsifying the extracted image features with a threshold function to obtain the final feature vector for retrieval; and performing image retrieval with the extracted feature vectors according to a preset similarity measure. The trained sparse autoencoder network extracts image features automatically, and the extracted features are highly discriminative, which ensures retrieval precision.

Description

Remote sensing image retrieval method and system based on salient point features and sparse autoencoding
Technical field
The invention belongs to the technical field of image processing, and relates to a remote sensing image retrieval method and system based on salient point features and sparse autoencoding.
Background technology
With the improvement of remote sensing Earth-observation capability, the available remotely sensed data have become massive and diverse. While massive remote sensing data provide a rich data source for major applications, current ground processing and analysis capability is insufficient, and the "massive data, drowning in information" problem of remote sensing big data has become increasingly prominent. How to use emerging computing technologies and means to quickly locate and intelligently retrieve targets or regions of interest in remote sensing images is a challenge facing remote sensing big data processing and analysis, and an urgent scientific problem in the field of remote sensing image processing. Remote sensing image retrieval is an effective way to address this bottleneck, so the study of efficient image retrieval techniques is of great significance.
Current remote sensing image retrieval techniques mainly measure the similarity of low-level image features and then return similar images. Compared with traditional keyword-based retrieval, content-based retrieval is more efficient and accurate, but designing a feature descriptor that can effectively describe the wide variety of complex remote sensing scenes is very difficult. In recent years, deep learning has gradually become a research hotspot in image recognition owing to its strong feature-learning ability. Compared with hand-crafted features, deep learning methods obtain a feature extractor through sample training and thus extract image features automatically, which suits the retrieval of remote sensing images containing complex scenes. Because its network design and training are relatively simple, the sparse autoencoder has become a commonly used deep learning method and is widely applied in image processing.
When constructing training samples for sparse autoencoder training, existing methods normally select a number of fixed-size image patches at random from the training images; this construction has the following defects. First, from the viewpoint of human visual theory, what people attend to are specific targets in a remote sensing image, and randomly selected patches may not contain the targets of interest. Second, because the size of the training images is fixed, random patch selection may yield too few training samples. Third, because the training samples are image patches, the trained network extracts features of patches rather than of the whole image, so the features cannot be used directly for retrieval. To obtain whole-image features, convolution is usually required, which is not only computationally inefficient but also introduces extra parameters. As for the activation function, existing methods usually adopt the sigmoid function for the hidden-layer neurons, but the sigmoid suffers from severe gradient vanishing during backpropagation, which hinders network training. For feature extraction with a sparse autoencoder, existing methods normally take the hidden-layer activations directly as the features without sparsification, yet experiments show that sparsified features perform better.
Summary of the invention
To address the deficiencies of the prior art, the invention provides a remote sensing image retrieval scheme based on salient point features and sparse autoencoding. The invention extracts the salient point features of remote sensing images as the input for training a sparse autoencoder network, and finally uses the trained feature extractor to extract image features for retrieval.
The technical solution adopted by the invention is a remote sensing image retrieval method based on salient point features and sparse autoencoding, comprising the following steps:
Step 1, extract the feature points of each image in the image library to obtain a feature point matrix, and compute the saliency map of each image with a visual attention model;
Step 2, for the saliency map of each image in the image library, binarize the saliency map with an adaptive thresholding method, and apply a mask operation with the feature point matrix of the image to obtain the filtered salient feature points; the implementation is as follows,
When binarizing a saliency map with the adaptive thresholding method, the binarization threshold T of the saliency map is determined from the saliency of the pixels as
T = \frac{2}{w \times h} \sum_{x=1}^{w} \sum_{y=1}^{h} I(x,y)
where w and h denote the width and height of the saliency map, and I(x,y) denotes the saliency value of pixel (x,y);
Binarizing the saliency map with threshold T yields the binarized saliency map, with corresponding matrix I_{binary}; let P denote the feature point matrix of the image and P_I the filtered salient feature point matrix; the salient feature point matrix is computed as
P_I = P \otimes I_{binary}
Step 3, take a number of images from the image library as training images, select a number of salient feature points from each training image to construct training samples, obtaining a training sample set X; train the sparse autoencoder network on the whitened training sample set X' to obtain a feature extractor;
The sparse autoencoder network comprises an input layer, a hidden layer and an output layer, where the hidden-layer neurons adopt the ReLU function as the activation function and the output-layer neurons adopt the softplus function as the activation function; the cost function of the sparse autoencoder network is defined as
J(W,b) = \frac{1}{2} \lVert X' - H_{W,b} \rVert^2 + \frac{\lambda}{2} \lVert W \rVert^2
where the first term is the squared-error term and the second term is the regularization term; H_{W,b} denotes the network output for the training sample set X'; W = [W_1, W_2] and b = [b_1, b_2] are the matrices formed by the weights W_1 and biases b_1 between the input and hidden layers and the weights W_2 and biases b_2 between the hidden and output layers; and \lambda denotes the regularization coefficient;
Step 4, for all images in the image library, extract features with the feature extractor trained in step 3, and sparsify the extracted image features with a threshold function to obtain the final feature vectors for retrieval; the implementation is as follows,
The extracted image feature Y is expressed as
Y = f_1(W_1 P_I' + b_1)
where the salient feature point matrix P_I' is the result of whitening the filtered salient feature point matrix P_I obtained in step 2;
For the extracted image feature Y, the following sparsification yields the sparse feature matrix Z,
Z = [Z_+, Z_-] = [\max(0, Y - \alpha), \max(0, \alpha - Y)]
where \alpha denotes the threshold of the threshold function, Z_+ = \max(0, Y - \alpha) and Z_- = \max(0, \alpha - Y);
Let n be the number of SIFT points detected in an image; the sparse feature matrix Z is further processed to obtain the feature vector F as
F = \frac{1}{n} \sum_{i=1}^{n} [Z_+^i, Z_-^i]
where Z_+^i and Z_-^i denote the i-th columns of Z_+ and Z_-, respectively.
Step 5, based on the feature vectors extracted in step 4, perform image retrieval according to a preset similarity measure.
Further, in step 1, the feature points of each image in the image library are extracted with the SIFT operator to obtain the feature point matrix.
Further, in step 5, the preset similarity measure adopts the city-block distance.
The invention correspondingly provides a remote sensing image retrieval system based on salient point features and sparse autoencoding, comprising the following modules:
a feature point extraction module, for extracting the feature points of each image in the image library to obtain a feature point matrix, and computing the saliency map of each image with a visual attention model;
a salient feature point extraction module, for the saliency map of each image in the image library, binarizing the saliency map with an adaptive thresholding method, and applying a mask operation with the feature point matrix of the image to obtain the filtered salient feature points; the implementation is as follows,
When binarizing a saliency map with the adaptive thresholding method, the binarization threshold T of the saliency map is determined from the saliency of the pixels as
T = \frac{2}{w \times h} \sum_{x=1}^{w} \sum_{y=1}^{h} I(x,y)
where w and h denote the width and height of the saliency map, and I(x,y) denotes the saliency value of pixel (x,y);
Binarizing the saliency map with threshold T yields the binarized saliency map, with corresponding matrix I_{binary}; let P denote the feature point matrix of the image and P_I the filtered salient feature point matrix; the salient feature point matrix is computed as
P_I = P \otimes I_{binary}
a training module, for taking a number of images from the image library as training images, selecting a number of salient feature points from each training image to construct training samples, obtaining a training sample set X, and training the sparse autoencoder network on the whitened training sample set X' to obtain a feature extractor;
The sparse autoencoder network comprises an input layer, a hidden layer and an output layer, where the hidden-layer neurons adopt the ReLU function as the activation function and the output-layer neurons adopt the softplus function as the activation function; the cost function of the sparse autoencoder network is defined as
J(W,b) = \frac{1}{2} \lVert X' - H_{W,b} \rVert^2 + \frac{\lambda}{2} \lVert W \rVert^2
where the first term is the squared-error term and the second term is the regularization term; H_{W,b} denotes the network output for the training sample set X'; W = [W_1, W_2] and b = [b_1, b_2] are the matrices formed by the weights W_1 and biases b_1 between the input and hidden layers and the weights W_2 and biases b_2 between the hidden and output layers; and \lambda denotes the regularization coefficient;
a feature extraction module, for all images in the image library, extracting features with the feature extractor trained by the training module, and sparsifying the extracted image features with a threshold function to obtain the final feature vectors for retrieval; the implementation is as follows,
The extracted image feature Y is expressed as
Y = f_1(W_1 P_I' + b_1)
where the salient feature point matrix P_I' is the result of whitening the filtered salient feature point matrix P_I obtained by the salient feature point extraction module;
For the extracted image feature Y, the following sparsification yields the sparse feature matrix Z,
Z = [Z_+, Z_-] = [\max(0, Y - \alpha), \max(0, \alpha - Y)]
where \alpha denotes the threshold of the threshold function, Z_+ = \max(0, Y - \alpha) and Z_- = \max(0, \alpha - Y);
Let n be the number of SIFT points detected in an image; the sparse feature matrix Z is further processed to obtain the feature vector F as
F = \frac{1}{n} \sum_{i=1}^{n} [Z_+^i, Z_-^i]
where Z_+^i and Z_-^i denote the i-th columns of Z_+ and Z_-, respectively.
a retrieval module, for performing image retrieval according to a preset similarity measure based on the feature vectors extracted by the feature extraction module.
Further, in the feature point extraction module, the feature points of each image in the image library are extracted with the SIFT operator to obtain the feature point matrix.
Further, in the retrieval module, the preset similarity measure adopts the city-block distance.
Compared with the prior art, the invention has the following features and beneficial effects:
1. The saliency map of each image is computed with a visual attention model, binarized, and used to filter the SIFT feature points into the salient feature points of the image, which both conforms to the visual attention characteristics of the human eye and better reflects the user's retrieval intent.
2. Training samples are constructed from the salient feature points of the images, remedying the defects of the traditional construction by randomly sampling image patches from the training images.
3. The feature extractor obtained by training a sparse autoencoder network extracts image features automatically, eliminating the feature-design process for complex remote sensing images.
4. Good extensibility: the training samples include but are not limited to salient feature points.
Brief description of the drawings
Fig. 1 is the flow chart of an embodiment of the invention.
Detailed description of the embodiments
In the remote sensing image retrieval method based on salient point features and sparse autoencoding proposed by the invention, the feature points of each image are first extracted to obtain a feature point matrix and the saliency map of the image is computed; the saliency map is then binarized with an adaptive threshold, and a "mask" operation with the feature point matrix yields the salient feature points; next, a number of salient feature points are selected to construct training samples and train a sparse autoencoder network, and the trained feature extractor automatically extracts image features to obtain the feature vectors for retrieval; finally, image retrieval is performed according to a preset similarity measure and similar images are returned.
To describe the technical solution in detail, with reference to Fig. 1, the embodiment flow is described as follows:
Step 1, extract the feature points of each image in the image library to obtain a feature point matrix, and compute the saliency map of each image with a visual attention model.
In practice, an existing image library may be adopted, or one built by those skilled in the art, for example by choosing a high-resolution remote sensing image covering multiple land-cover classes and cutting it into tiles to build a retrieval image library containing multiple classes. For each image in the library, the embodiment first extracts the image's feature points (key points) with the SIFT (Scale-Invariant Feature Transform) operator to obtain the feature point matrix, and then computes the saliency map of the image with the GBVS (Graph-Based Visual Saliency) model. The tiling method, the SIFT operator and the GBVS model are prior art and are not detailed here.
Step 2, for the saliency map of each image in the image library, binarize the saliency map with the adaptive thresholding method, and apply a "mask" operation with the feature point matrix of the image to obtain the filtered salient feature points.
In the embodiment, the binarization threshold of the saliency map is determined from the saliency of the pixels, and the binarized saliency map is combined with the feature point matrix by a "mask" operation to obtain the salient feature points, as follows:
The binarization threshold T of the saliency map is determined from the saliency of the pixels by formula (1),
T = \frac{2}{w \times h} \sum_{x=1}^{w} \sum_{y=1}^{h} I(x,y)    (1)
where w and h denote the width and height of the saliency map, and I(x,y) denotes the saliency value of the pixel at (x,y) in the saliency map.
Binarizing the saliency map with threshold T yields the binarized saliency map, with corresponding matrix I_{binary}. The binarized saliency map is used to filter the feature point matrix of the image to obtain the salient feature points. Let P denote the feature point matrix of the image and P_I the filtered salient feature point matrix; then the salient feature point matrix is computed by formula (2),
P_I = P \otimes I_{binary}    (2)
where each element of the matrix P is the feature vector corresponding to a SIFT key point; SIFT feature vectors are generally 128-dimensional, and 128 dimensions are used in this embodiment. P_{128}(x,y) denotes the feature vector corresponding to the feature point at (x,y); if there is no feature point at pixel (x,y), then P_{128}(x,y) = 0. Each element of I_{binary} is 0 or 1, and I_{binary}(x,y) denotes the value of the binarized saliency map at (x,y). The symbol \otimes denotes the element-wise ("mask") multiplication.
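The adaptive threshold of formula (1) and the mask operation of formula (2) can be sketched in NumPy as follows; the point-array layout (one row per SIFT key point, with its pixel coordinates in the first two columns) is an assumption made for this sketch, not a data structure specified in the text.

```python
import numpy as np

def adaptive_threshold(sal):
    # Formula (1): T = 2/(w*h) * sum of saliency values, i.e. twice the mean saliency.
    h, w = sal.shape
    return 2.0 * sal.sum() / (w * h)

def filter_salient_points(points, sal):
    """Keep only the key points that fall on salient pixels.

    points: (n, d) array; columns 0-1 hold the (x, y) pixel coordinates and
            the remaining columns the descriptor (layout assumed for the sketch).
    sal:    (h, w) saliency map, e.g. from the GBVS model.
    """
    T = adaptive_threshold(sal)
    binary = sal >= T                      # I_binary
    xs = points[:, 0].astype(int)
    ys = points[:, 1].astype(int)
    keep = binary[ys, xs]                  # the "mask" operation P (*) I_binary
    return points[keep]
```

With this layout, points whose pixel is below the adaptive threshold are simply dropped, which is the filtering effect of formula (2).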
Step 3, choose a number of images from the image library as training images, select a number of salient feature points from each training image to construct training samples, and train the sparse autoencoder network to obtain a feature extractor.
In the embodiment, step 3 constructs the training samples from a number of salient feature points of the training images rather than from conventional image patches, and during training the ReLU (Rectified Linear Unit) function rather than the conventional sigmoid function is selected as the activation function of the hidden-layer neurons of the sparse autoencoder network. For example, each salient feature point is a feature vector of 4 × 4 × 8 = 128 dimensions, and one feature point forms one training sample. In practice, the number of training images and the number of salient feature points per training image can be specified by those skilled in the art.
The implementation is as follows:
First, select the salient feature points of the images and construct the training sample set.
In the embodiment, a number of images are first randomly selected from the image library as training images, and then a number of salient feature points of the training images are randomly selected to construct the training sample set. The training sample set can be represented by formula (3),
X = [x_{i,j}]_{128 \times m}    (3)
where m denotes the number of training samples, and each column of X is one salient feature point, i.e. one training sample. For example, [x_{1,1}, x_{2,1}, \ldots, x_{128,1}]^T is the 1st training sample and [x_{1,2}, x_{2,2}, \ldots, x_{128,2}]^T is the 2nd training sample.
Then, train the sparse autoencoder network to obtain the feature extractor.
Because the salient feature points extracted from the same training image are somewhat correlated, the training sample set X cannot be fed directly into the sparse autoencoder network for training. Before training, ZCA (Zero Component Analysis) whitening is applied to the training samples to obtain the whitened training sample set X', and the ZCA whitening parameters are saved; the ZCA whitening implementation is prior art and is not detailed here.
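A minimal ZCA whitening sketch is given below. The eps regularizer and the SVD-based route are standard implementation choices, not details specified in the text; the saved mean and transform are what the text calls the "whitening parameters", reused later when whitening P_I at feature-extraction time.

```python
import numpy as np

def zca_fit(X, eps=1e-5):
    """Compute ZCA whitening parameters for X (d x m, one sample per column)."""
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean
    cov = Xc @ Xc.T / X.shape[1]
    U, S, _ = np.linalg.svd(cov)
    W_zca = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T   # symmetric whitening transform
    return mean, W_zca

def zca_apply(X, mean, W_zca):
    """Apply saved whitening parameters (also used on P_I in step 4)."""
    return W_zca @ (X - mean)
```

After fitting, the whitened samples have approximately identity covariance, which removes the correlation between points drawn from the same image.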
The embodiment defines a 3-layer sparse autoencoder network comprising an input layer, a hidden layer and an output layer, where the hidden-layer neurons adopt the ReLU function f_1 = \max(0, x) as the activation function and the output-layer neurons adopt the softplus function f_2 = \ln(1 + e^x) as the activation function. Compared with the traditional sigmoid function, the ReLU function alleviates the gradient-vanishing problem to some extent and is more conducive to network training. Given the training sample set X', the cost function of the sparse autoencoder network is defined by formula (4),
J(W,b) = \frac{1}{2} \lVert X' - H_{W,b} \rVert^2 + \frac{\lambda}{2} \lVert W \rVert^2    (4)
where the first term is the squared-error term and the second term is the regularization term; H_{W,b} denotes the network output for the training sample set X'; W = [W_1, W_2] and b = [b_1, b_2] are the matrices formed by the weights W_1 and biases b_1 between the input and hidden layers and the weights W_2 and biases b_2 between the hidden and output layers; and \lambda denotes the regularization coefficient. In practice, the cost function of formula (4) can be minimized during training by gradient descent or similar methods to obtain the weight and bias parameters W and b.
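The network's forward pass and cost (4) can be sketched as follows, with ReLU hidden units and softplus output units as described; the training loop itself (gradient descent on J) is omitted, so this is only the objective being minimized, not a full trainer.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)          # hidden activation f1 = max(0, x)

def softplus(x):
    return np.log1p(np.exp(x))         # output activation f2 = ln(1 + e^x)

def forward(X, W1, b1, W2, b2):
    """3-layer sparse autoencoder: the hidden activations A1 are the features."""
    A1 = relu(W1 @ X + b1)
    H = softplus(W2 @ A1 + b2)         # reconstruction H_{W,b}
    return A1, H

def cost(X, W1, b1, W2, b2, lam):
    """Formula (4): J = 1/2 ||X' - H||^2 + lam/2 ||W||^2."""
    _, H = forward(X, W1, b1, W2, b2)
    sq_err = 0.5 * np.sum((X - H) ** 2)
    reg = 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    return sq_err + reg
```

After training, only `W1`, `b1` and the ReLU (i.e. `A1`) are needed for feature extraction, matching formula (5) below.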
Step 4, for all images in the image library, extract features with the feature extractor trained in step 3, and sparsify the extracted features with a threshold function to obtain the final feature vectors for retrieval.
In step 4 of the embodiment, the salient feature points of an image are fed into the feature extractor to obtain the corresponding image features, and the extracted features are then sparsified with a threshold function to obtain the final feature vector for retrieval.
The extracted image feature Y can be expressed by formula (5),
Y = f_1(W_1 P_I' + b_1)    (5)
where W_1 P_I' + b_1 is substituted as the variable x into the ReLU function f_1 = \max(0, x), and the salient feature point matrix P_I' used here is the result of preprocessing the filtered salient feature point matrix obtained in step 2 with the same ZCA whitening parameters used for the training sample set X'. For the extracted image feature Y, sparsification by formula (6) yields the sparse feature matrix Z,
Z = [Z_+, Z_-] = [\max(0, Y - \alpha), \max(0, \alpha - Y)]    (6)
where \alpha denotes the threshold of the threshold functions f = \max(0, x - \alpha) and f = \max(0, \alpha - x), Z_+ = \max(0, Y - \alpha), and Z_- = \max(0, \alpha - Y).
To obtain the final feature vector F for retrieval, let n be the number of SIFT points detected in an image; the sparse feature matrix Z is further processed by formula (7),
F = \frac{1}{n} \sum_{i=1}^{n} [Z_+^i, Z_-^i]    (7)
where Z_+^i and Z_-^i denote the i-th columns of Z_+ and Z_-, respectively.
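Formulas (6) and (7) can be sketched as follows, assuming one column of Y per salient SIFT point: the split-sign thresholding doubles the dimensionality, and averaging the per-point columns pools them into one fixed-length retrieval vector.

```python
import numpy as np

def sparsify(Y, alpha):
    """Formula (6): split-sign thresholding Z = [Z+, Z-]."""
    return np.maximum(0.0, Y - alpha), np.maximum(0.0, alpha - Y)

def pooled_feature(Y, alpha):
    """Formula (7): F = (1/n) * sum_i [Z+_i, Z-_i] over the n point columns."""
    Zp, Zm = sparsify(Y, alpha)
    n = Y.shape[1]                               # n = number of SIFT points
    return np.concatenate([Zp, Zm], axis=0).sum(axis=1) / n
```

For a hidden layer of size k, the pooled vector F has 2k entries regardless of how many points the image contains, which is what makes it directly comparable across images.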
Step 5, based on the feature vectors extracted in step 4, perform image retrieval according to a preset similarity measure. In practice, the similarity measure can be preset by those skilled in the art. The embodiment adopts the city-block distance (L1 norm) to compute the similarity between the query image and the other images, and returns related images ranked by similarity. Any image in the library can serve as the query image; for images outside the library, feature vectors can be extracted in the same way and used to retrieve from the library.
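Retrieval with the city-block (L1) distance then reduces to ranking the library's pooled vectors by their L1 distance to the query vector; the array shapes used here are an assumption of the sketch.

```python
import numpy as np

def rank_by_city_block(query, library):
    """Return library indices ordered from most to least similar.

    query:   (d,) pooled feature vector of the query image.
    library: (k, d) matrix with one pooled feature vector per library image.
    """
    dist = np.abs(library - query).sum(axis=1)   # L1 norm per library image
    return np.argsort(dist)                      # smaller distance = more similar
```

If the query image itself belongs to the library, its own index naturally ranks first (distance zero).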
In practice, the above flow can be implemented automatically in software, and the corresponding system can also be provided in modular form. The invention correspondingly provides a remote sensing image retrieval system based on salient point features and sparse autoencoding, comprising the following modules:
a feature point extraction module, for extracting the feature points of each image in the image library to obtain a feature point matrix, and computing the saliency map of each image with a visual attention model;
a salient feature point extraction module, for the saliency map of each image in the image library, binarizing the saliency map with an adaptive thresholding method, and applying a mask operation with the feature point matrix of the image to obtain the filtered salient feature points; the implementation is as follows,
When binarizing a saliency map with the adaptive thresholding method, the binarization threshold T of the saliency map is determined from the saliency of the pixels as
T = \frac{2}{w \times h} \sum_{x=1}^{w} \sum_{y=1}^{h} I(x,y)
where w and h denote the width and height of the saliency map, and I(x,y) denotes the saliency value of pixel (x,y);
Binarizing the saliency map with threshold T yields the binarized saliency map, with corresponding matrix I_{binary}; let P denote the feature point matrix of the image and P_I the filtered salient feature point matrix; the salient feature point matrix is computed as
P_I = P \otimes I_{binary}
a training module, for taking a number of images from the image library as training images, selecting a number of salient feature points from each training image to construct training samples, obtaining a training sample set X, and training the sparse autoencoder network on the whitened training sample set X' to obtain a feature extractor;
The sparse autoencoder network comprises an input layer, a hidden layer and an output layer, where the hidden-layer neurons adopt the ReLU function as the activation function and the output-layer neurons adopt the softplus function as the activation function; the cost function of the sparse autoencoder network is defined as
J(W,b) = \frac{1}{2} \lVert X' - H_{W,b} \rVert^2 + \frac{\lambda}{2} \lVert W \rVert^2
where the first term is the squared-error term and the second term is the regularization term; H_{W,b} denotes the network output for the training sample set X'; W = [W_1, W_2] and b = [b_1, b_2] are the matrices formed by the weights W_1 and biases b_1 between the input and hidden layers and the weights W_2 and biases b_2 between the hidden and output layers; and \lambda denotes the regularization coefficient;
Query feature extraction module, for performing feature extraction on the query image with the feature extractor trained in step 3, and sparsifying the extracted image features with a threshold function, obtaining the final feature vector for retrieval; the implementation is as follows:
The extracted image feature Y is expressed as follows,
Y = f_1(W_1 P_I' + b_1)
where f_1 is the hidden-layer activation function (ReLU), and the salient feature point matrix P_I' is the result of whitening the filtered salient feature point matrix P_I obtained in step 2;
For the extracted image feature Y, the following sparsification yields the sparse feature matrix Z,
Z = [Z_+, Z_-] = [max(0, Y - α), max(0, α - Y)]
where α denotes the threshold of the threshold function, the matrix Z_+ = max(0, Y - α), and Z_- = max(0, α - Y);
Suppose the number of SIFT points detected in an image is n; the sparse feature matrix Z is processed further to obtain the feature vector F as follows,
F = (1/n) Σ_{i=1..n} [Z_+^i, Z_-^i]
where Z_+^i and Z_-^i denote the i-th column vectors of the matrices Z_+ and Z_-, respectively.
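The sparsification and averaging steps above can be sketched as follows; the function name and shapes are illustrative assumptions:

```python
import numpy as np

def sparsify_and_pool(Y, alpha):
    """Split features with a threshold function, then average over SIFT points.

    Y:     (d, n) feature matrix, one column per salient SIFT point
    alpha: threshold of the threshold function
    returns F, a 2d-dimensional vector: F = 1/n * sum_i [Z+_i, Z-_i]
    """
    Z_pos = np.maximum(0.0, Y - alpha)   # Z+ = max(0, Y - alpha)
    Z_neg = np.maximum(0.0, alpha - Y)   # Z- = max(0, alpha - Y)
    n = Y.shape[1]
    # Stack the positive and negative parts, then average over all n columns
    return np.concatenate([Z_pos, Z_neg], axis=0).sum(axis=1) / n
```

The result is a fixed-length vector regardless of how many SIFT points n the image contains, which is what makes the features directly comparable across images.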
Retrieval module, for performing image retrieval according to a preset similarity measurement criterion based on the feature vector extracted by the query feature extraction module.
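Claims 3 and 6 name the city-block (Manhattan) distance as the preset similarity measurement criterion; a minimal retrieval sketch under that assumption (function name and shapes illustrative):

```python
import numpy as np

def retrieve(query_vec, db_vecs, top_k=5):
    """Rank database images by city-block (L1) distance to the query.

    query_vec: (d,) feature vector of the query image
    db_vecs:   (m, d) matrix of database feature vectors, one row per image
    returns:   indices of the top_k most similar images, nearest first
    """
    # City-block distance: sum of absolute coordinate differences
    dists = np.abs(db_vecs - query_vec).sum(axis=1)
    return np.argsort(dists)[:top_k]
```

Smaller distances rank higher, so the first index returned is the database image whose feature vector is closest to the query.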
The specific embodiments described herein are merely illustrative of the present invention. Those skilled in the art can make various modifications or supplements to the described embodiments, or substitute them in a similar manner, without departing from the spirit of the present invention or exceeding the scope defined by the appended claims.

Claims (6)

1. A remote sensing image retrieval method based on salient point features and sparse autoencoding, characterized by comprising the following steps:
Step 1, extracting the feature points of each image in an image library to obtain a feature point matrix, and computing the saliency map of each image with a visual attention model;
Step 2, for the saliency map of each image in the image library, binarizing the saliency map with an adaptive threshold method, and performing a mask operation with the feature point matrix of the image to obtain the filtered salient feature points; the implementation is as follows:
When the saliency map is binarized with the adaptive threshold method, the binarization threshold T of the saliency map is determined from the saliency of each pixel as follows,
T = (2 / (w × h)) Σ_{x=1..w} Σ_{y=1..h} I(x, y)
where w and h denote the width and the height of the saliency map, respectively, and I(x, y) denotes the saliency value of pixel (x, y);
Suppose the saliency map is binarized according to the binarization threshold T to obtain a binarized saliency map with corresponding matrix I_binary; let P denote the feature point matrix of the image and P_I denote the filtered salient feature point matrix; the salient feature point matrix is then computed as follows,
P_I = P ⊗ I_binary
where ⊗ denotes the element-wise product;
Step 3, taking a number of images from the image library as training images, selecting a number of salient feature points from each training image to construct training samples, obtaining a training sample set X, training a sparse autoencoder network on the whitened training sample set X', and obtaining a feature extractor;
The sparse autoencoder network comprises an input layer, a hidden layer and an output layer, wherein the hidden-layer neurons adopt the ReLU function as the activation function and the output-layer neurons adopt the softplus function as the activation function; the cost function of the sparse autoencoder network is defined as follows,
J(W, b) = (1/2)||X' - H_{W,b}||^2 + (λ/2)||W||^2
where the first term is the squared-error term and the second term is the regularization term; H_{W,b} denotes the network output for the training sample set X'; W = [W_1, W_2] and b = [b_1, b_2] collect, respectively, the weights W_1 and bias b_1 between the input layer and the hidden layer and the weights W_2 and bias b_2 between the hidden layer and the output layer; λ denotes the regularization coefficient;
Step 4, performing feature extraction on all images in the image library with the feature extractor trained in step 3, and sparsifying the extracted image features with a threshold function, obtaining the final feature vectors for retrieval; the implementation is as follows:
The extracted image feature Y is expressed as follows,
Y = f_1(W_1 P_I' + b_1)
where f_1 is the hidden-layer activation function (ReLU), and the salient feature point matrix P_I' is the result of whitening the filtered salient feature point matrix P_I obtained in step 2;
For the extracted image feature Y, the following sparsification yields the sparse feature matrix Z,
Z = [Z_+, Z_-] = [max(0, Y - α), max(0, α - Y)]
where α denotes the threshold of the threshold function, the matrix Z_+ = max(0, Y - α), and Z_- = max(0, α - Y);
Suppose the number of SIFT points detected in an image is n; the sparse feature matrix Z is processed further to obtain the feature vector F as follows,
F = (1/n) Σ_{i=1..n} [Z_+^i, Z_-^i]
where Z_+^i and Z_-^i denote the i-th column vectors of the matrices Z_+ and Z_-, respectively;
Step 5, performing image retrieval according to a preset similarity measurement criterion based on the feature vectors extracted in step 4.
2. The remote sensing image retrieval method based on salient point features and sparse autoencoding according to claim 1, characterized in that: in step 1, extracting the feature points of each image in the image library to obtain the feature point matrix is implemented with the SIFT operator.
3. The remote sensing image retrieval method based on salient point features and sparse autoencoding according to claim 1 or 2, characterized in that: in step 5, the preset similarity measurement criterion adopts the city-block distance.
4. A remote sensing image retrieval system based on salient point features and sparse autoencoding, characterized by comprising the following modules:
A feature point extraction module, for extracting the feature points of each image in an image library to obtain a feature point matrix, and computing the saliency map of each image with a visual attention model;
A salient feature extraction module, for binarizing the saliency map of each image in the image library with an adaptive threshold method, and performing a mask operation with the feature point matrix of the image to obtain the filtered salient feature points; the implementation is as follows:
When the saliency map is binarized with the adaptive threshold method, the binarization threshold T of the saliency map is determined from the saliency of each pixel as follows,
T = (2 / (w × h)) Σ_{x=1..w} Σ_{y=1..h} I(x, y)
where w and h denote the width and the height of the saliency map, respectively, and I(x, y) denotes the saliency value of pixel (x, y);
Suppose the saliency map is binarized according to the binarization threshold T to obtain a binarized saliency map with corresponding matrix I_binary; let P denote the feature point matrix of the image and P_I denote the filtered salient feature point matrix; the salient feature point matrix is then computed as follows,
P_I = P ⊗ I_binary
where ⊗ denotes the element-wise product;
A training module, for taking a number of images from the image library as training images, selecting a number of salient feature points from each training image to construct training samples, obtaining a training sample set X, training a sparse autoencoder network on the whitened training sample set X', and obtaining a feature extractor;
The sparse autoencoder network comprises an input layer, a hidden layer and an output layer, wherein the hidden-layer neurons adopt the ReLU function as the activation function and the output-layer neurons adopt the softplus function as the activation function; the cost function of the sparse autoencoder network is defined as follows,
J(W, b) = (1/2)||X' - H_{W,b}||^2 + (λ/2)||W||^2
where the first term is the squared-error term and the second term is the regularization term; H_{W,b} denotes the network output for the training sample set X'; W = [W_1, W_2] and b = [b_1, b_2] collect, respectively, the weights W_1 and bias b_1 between the input layer and the hidden layer and the weights W_2 and bias b_2 between the hidden layer and the output layer; λ denotes the regularization coefficient;
A feature extraction module, for performing feature extraction on all images in the image library with the feature extractor obtained by the training module, and sparsifying the extracted image features with a threshold function, obtaining the final feature vectors for retrieval; the implementation is as follows:
The extracted image feature Y is expressed as follows,
Y = f_1(W_1 P_I' + b_1)
where f_1 is the hidden-layer activation function (ReLU), and the salient feature point matrix P_I' is the result of whitening the filtered salient feature point matrix P_I obtained by the salient feature extraction module;
For the extracted image feature Y, the following sparsification yields the sparse feature matrix Z,
Z = [Z_+, Z_-] = [max(0, Y - α), max(0, α - Y)]
where α denotes the threshold of the threshold function, the matrix Z_+ = max(0, Y - α), and Z_- = max(0, α - Y);
Suppose the number of SIFT points detected in an image is n; the sparse feature matrix Z is processed further to obtain the feature vector F as follows,
F = (1/n) Σ_{i=1..n} [Z_+^i, Z_-^i]
where Z_+^i and Z_-^i denote the i-th column vectors of the matrices Z_+ and Z_-, respectively;
A retrieval module, for performing image retrieval according to a preset similarity measurement criterion based on the feature vectors extracted by the feature extraction module.
5. The remote sensing image retrieval system based on salient point features and sparse autoencoding according to claim 4, characterized in that: in the feature point extraction module, extracting the feature points of each image in the image library to obtain the feature point matrix is implemented with the SIFT operator.
6. The remote sensing image retrieval system based on salient point features and sparse autoencoding according to claim 4 or 5, characterized in that: in the retrieval module, the preset similarity measurement criterion adopts the city-block distance.
CN201510708598.4A 2015-10-27 2015-10-27 Remote sensing image retrieval method and system based on salient point features and sparse autoencoding Active CN105243154B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510708598.4A CN105243154B (en) 2015-10-27 2015-10-27 Remote sensing image retrieval method and system based on salient point features and sparse autoencoding


Publications (2)

Publication Number Publication Date
CN105243154A true CN105243154A (en) 2016-01-13
CN105243154B CN105243154B (en) 2018-08-21

Family

ID=55040802


Country Status (1)

Country Link
CN (1) CN105243154B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073748A (en) * 2011-03-08 2011-05-25 武汉大学 Visual keyword based remote sensing image semantic searching method
CN102867196A (en) * 2012-09-13 2013-01-09 武汉大学 Method for detecting complex sea-surface remote sensing image ships based on Gist characteristic study
CN103309982A (en) * 2013-06-17 2013-09-18 武汉大学 Remote sensing image retrieval method based on vision saliency point characteristics
CN104462494A (en) * 2014-12-22 2015-03-25 武汉大学 Remote sensing image retrieval method and system based on non-supervision characteristic learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHOU WEIXUN et al.: "Remote sensing image retrieval method using visual attention model and local features", Geomatics and Information Science of Wuhan University *
WANG XING et al.: "Remote sensing image retrieval method based on visual salient point features", Science of Surveying and Mapping *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718531A (en) * 2016-01-14 2016-06-29 广州市万联信息科技有限公司 Image database building method and image recognition method
CN105718531B (en) * 2016-01-14 2019-12-17 广州市万联信息科技有限公司 Image database establishing method and image identification method
CN106228130B (en) * 2016-07-19 2019-09-10 武汉大学 Remote sensing image cloud detection method of optic based on fuzzy autoencoder network
CN106228130A (en) * 2016-07-19 2016-12-14 武汉大学 Remote sensing image cloud detection method of optic based on fuzzy autoencoder network
CN106295613A (en) * 2016-08-23 2017-01-04 哈尔滨理工大学 A kind of unmanned plane target localization method and system
CN106909924A (en) * 2017-02-18 2017-06-30 北京工业大学 A kind of remote sensing image method for quickly retrieving based on depth conspicuousness
CN106909924B (en) * 2017-02-18 2020-08-28 北京工业大学 Remote sensing image rapid retrieval method based on depth significance
CN107122809A (en) * 2017-04-24 2017-09-01 北京工业大学 Neural network characteristics learning method based on image own coding
CN107122809B (en) * 2017-04-24 2020-04-28 北京工业大学 Neural network feature learning method based on image self-coding
CN107515895B (en) * 2017-07-14 2020-06-05 中国科学院计算技术研究所 Visual target retrieval method and system based on target detection
CN107515895A (en) * 2017-07-14 2017-12-26 中国科学院计算技术研究所 A kind of sensation target search method and system based on target detection
CN108830172A (en) * 2018-05-24 2018-11-16 天津大学 Aircraft remote sensing images detection method based on depth residual error network and SV coding
CN109259733A (en) * 2018-10-25 2019-01-25 深圳和而泰智能控制股份有限公司 Apnea detection method, apparatus and detection device in a kind of sleep
CN111144483A (en) * 2019-12-26 2020-05-12 歌尔股份有限公司 Image feature point filtering method and terminal
CN111144483B (en) * 2019-12-26 2023-10-17 歌尔股份有限公司 Image feature point filtering method and terminal
CN112731410A (en) * 2020-12-25 2021-04-30 上海大学 Underwater target sonar detection method based on CNN
CN112731410B (en) * 2020-12-25 2021-11-05 上海大学 Underwater target sonar detection method based on CNN

Also Published As

Publication number Publication date
CN105243154B (en) 2018-08-21


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant