CN101853304B - Remote sensing image retrieval method based on feature selection and semi-supervised learning - Google Patents

Remote sensing image retrieval method based on feature selection and semi-supervised learning

Publication number
CN101853304B
Authority
CN
China
Prior art keywords
sigma
feature
textural characteristics
color characteristic
cluster centre
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2010101951398A
Other languages
Chinese (zh)
Other versions
CN101853304A (en
Inventor
李士进
朱佳丽
朱跃龙
万定生
冯钧
余宇峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU
Priority to CN2010101951398A
Publication of CN101853304A
Application granted
Publication of CN101853304B
Expired - Fee Related
Anticipated expiration

Abstract

The invention discloses a remote sensing image retrieval method based on feature selection and semi-supervised learning. In the method, an optimal color feature and an optimal texture feature are each selected by clustering, according to the minimum description length criterion and an improved Davies-Bouldin index; a suitable semi-supervised learning method is then chosen, according to the binarized weights of the optimal color feature and the optimal texture feature, to carry out remote sensing image retrieval. Compared with traditional remote sensing image retrieval methods, the invention can not only greatly improve retrieval quality but also reduce the amount of computation during retrieval and increase retrieval speed.

Description

Remote sensing image retrieval method based on feature selection and semi-supervised learning
Technical field
The present invention relates to image retrieval methods, and in particular to a remote sensing image retrieval method.
Background art
With the continuous development of remote sensing technology, the number of remote sensing images acquired every day is increasing sharply, and automatic querying and retrieval of remote sensing images has become a problem in urgent need of research. Researchers in China and abroad have proposed many methods for content-based image retrieval (CBIR) of remote sensing images, such as texture features based on the Gabor transform, combinations of color and texture features, fusion of texture features with spatial information, histogram feature similarity measures, and GIS-based spatial semantic methods. Zhu Bin et al. proposed using Gabor texture features to retrieve aerial images [Bin Zhu, Marshall R, Hsinchun C. Creating a large-scale content-based airphoto image digital library. IEEE Trans on image processing, 2000, vol.9, no.1: 163-167.]. Lu Lizhen et al. proposed fusing Gabor texture features with color features for remote sensing image retrieval, measuring similarity by a linear weighting of the Euclidean distances of the texture and color features [Lu Lizhen, Liu Renyi, Liu Nan. A remote sensing image retrieval method fusing color and texture features, Journal of Image and Graphics (A), 2004, 9(3): 328-332.]. Ceng Zhiming et al. used improved co-occurrence matrix texture features for large-scale remote sensing image retrieval [Ceng Zhiming, Li Feng, Fu Kun, et al. A texture feature extraction algorithm for content-based retrieval of large-scale remote sensing images, Journal of Wuhan University (Information Science Edition), 2005, 30(12): 1080-1083.]. For single-band remote sensing image retrieval, Bao Qian and Guo Ping studied similarity measures based on feature vectors and similarity measures based on probability. They found that the χ² statistical distance and the cosine similarity measure are more effective for the first kind, and that a computation based on the K-nearest-neighbor rule is effective for the second kind [Bao Qian, Guo Ping. Comparison of histogram-based similarity retrieval methods for remote sensing images, Journal of Remote Sensing, 2006, 10(6): 893-900.]. Ferecatu and Boujemaa proposed interactive remote sensing image retrieval using active relevance feedback [Marin Ferecatu, Nozha Boujemaa. Interactive remote sensing image retrieval using active relevance feedback. IEEE Transactions on geoscience and remote sensing, 2007, vol.45, no.4: 818-826.].
CBIR relies on feature extraction and high-dimensional indexing techniques. The system automatically extracts low-level visual features (such as color, texture, and shape) from each image, stores them in a database as high-dimensional vectors, and obtains retrieval results by comparing the similarity of these features. In the prior art described above, research on content-based remote sensing image retrieval has concentrated mainly on feature extraction and fusion, but has overlooked one fact: different types of retrieval targets call for different features. Even for the same image, different features differ in how effectively they describe its content, so retrieval performance should improve if features that represent the content of the retrieval target can be extracted.
Relevance feedback is the most commonly used learning strategy in CBIR. It relies on human-computer interaction in which the user repeatedly provides feedback; its performance improves as the set of feedback samples grows, but so does the burden on the user. To relieve the user of labeling large numbers of samples over repeated feedback rounds, some researchers have proposed image retrieval based on semi-supervised learning. The main idea of this strategy is to use a large number of unlabeled examples to assist learning from a small number of labeled examples; the whole learning process requires no manual intervention, and the learning algorithm exploits the unlabeled examples on its own. For example, Yao et al. proposed a medical image retrieval method based on semi-supervised semantic error-correcting output codes (SEMI-SECC) [Jian Yao, Zhongfei Zhang, Antani S, et al. Automatic Medical Image Annotation and Retrieval using SEMI-SECC [C]. Proceedings of IEEE International Conference on Multimedia and Expo, Piscataway, NJ, United States: IEEE Press, 2006: 2005-2008.]. In content-based remote sensing image retrieval there are usually only a few example samples (sometimes only a single example of the target), and obtaining more labeled examples is difficult, so semi-supervised learning is a reasonable choice for remote sensing image retrieval.
Summary of the invention
Different types of retrieval targets have different features, and different features differ in how effectively they describe the content of the same retrieval target. If the features that represent the content of the retrieval target can be found and used for image retrieval, retrieval performance can be greatly improved.
On this basis, the present invention aims to provide a remote sensing image retrieval method that incorporates feature selection: for the image to be retrieved, features that represent the content of the retrieval target are selected and used for retrieval.
The present invention performs feature selection by cluster analysis.
As is well known, clustering is a typical unsupervised learning method that groups images into meaningful sets according to their content. The number of clusters usually has to be specified manually in advance, which both burdens the user and may introduce human bias into the clustering result. In addition, the purpose of image clustering is to divide an image set into clusters according to some criterion so that images within the same cluster are as similar as possible and images in different clusters are as dissimilar as possible. To evaluate the clustering result correctly, and thus to select features objectively, it is therefore important to choose a suitable cluster validity index.
The present invention uses the minimum description length (MDL) criterion to determine the number of clusters and the Davies-Bouldin index (hereinafter the DB index) to evaluate cluster validity, and thereby finds the image features that represent the content of the retrieval target. The minimum description length criterion is prior art; for details see [Horst B, Ales L, Alexander S. MDL principle for robust vector quantisation. Pattern Analysis & Applications, 1999, 2: 59-72, Springer-Verlag London Limited.]. The DB index is a commonly used measure of clustering quality, expressed as the ratio of within-cluster scatter to between-cluster scatter; the smaller the ratio, the better the clustering [Davies D.L., Bouldin D.W. A cluster separation measure. IEEE Trans. Pattern Anal. Machine Intell., 1979, 1(4): 224-227]. Remote sensing image retrieval is not entirely unsupervised: the example initially given by the user can serve as weak heuristic information, and the image features should help distinguish this image sub-block from other image blocks. We therefore modify the existing DB index to make it better suited to feature selection, as follows. Only the within-cluster scatter of the target subclass containing the user's example sub-block is computed, excluding the within-cluster scatter of non-target subclasses. The between-cluster scatter includes only the scatter between each non-target subclass and the target subclass, excluding the scatter among non-target subclasses. This not only emphasizes the target subclass and its difference from the non-target subclasses, but also reduces the amount of computation.
Once the image features that represent the content of the retrieval target have been selected in this way, any of the existing methods can be used to construct a corresponding classifier for image retrieval.
Based on the above analysis, the present invention performs remote sensing image retrieval as follows:
A remote sensing image retrieval method based on feature selection and semi-supervised learning, in which features of the image to be retrieved are first selected and a corresponding classifier is then constructed from the selected features for retrieval, characterized in that: selecting the features of the image to be retrieved means selecting the optimal color feature and the optimal texture feature of the image to be retrieved by cluster analysis, according to the MDL criterion and the improved DB index; this is realized by the following steps:
Step 1) Partition the image to be retrieved into blocks;
Step 2) Extract each color feature and each texture feature of the image to be retrieved;
Step 3) Determine the number of clusters k according to the minimum description length criterion, by the following steps:
Step 31) Initialize m cluster centers according to the maximum distance criterion;
Step 32) Select an arbitrary cluster center $C_j$ and compute $\Delta l_{C_j}$, the total change in code length that would result from removing $C_j$, according to the following formula:
$$\Delta l_{C_j} = -L_0 - n_j \log_2 p_j + \sum_{q=1,\, q \neq j}^{m} n_{jq} \log_2\!\left(\frac{n_q + n_{jq}}{|I|}\right) + \sum_{x \in C_j} \sum_{i=1}^{d} \frac{(x_i - c_{iq})^2 - (x_i - c_{ij})^2}{2 (\ln 2)\, \sigma^2}$$
where $L_0$ denotes the code length of the cluster centers:
$$L_0 = 9 \times \sigma \times \sum_{j=1}^{m} \left( -n_j \log_2 p_j + \sum_{q=1,\, q \neq j}^{m} n_{jq} \log_2\!\left(\frac{n_q + n_{jq}}{|I|}\right) + \sum_{x \in C_j} \sum_{i=1}^{d} \frac{(x_i - c_{iq})^2 - (x_i - c_{ij})^2}{2 (\ln 2)\, \sigma^2} \right) \Big/ m ;$$
$n_q$ denotes the number of samples in the q-th cluster; $n_{jq}$ denotes the number of samples whose nearest reference point is the j-th cluster center and whose second-nearest reference point is the q-th cluster center; $d$ denotes the feature dimension; $x$ is a sample in cluster $C_j$, and $x_i$ is the value of its i-th feature; $c_{iq}$ is the value of the i-th dimension of the q-th cluster center, and $c_{ij}$ is the value of the i-th dimension of the j-th cluster center; $|I|$ denotes the total number of samples; $p_j$ denotes the proportion of all samples that belong to cluster $C_j$; $\sigma$ is the variance of the sample data, with values in the range [0.1, 0.2];
Step 33) Determine whether the $\Delta l_{C_j}$ obtained in step 32 is less than 0; if so, remove cluster center $C_j$; if not, keep cluster center $C_j$;
Step 34) Repeat steps 32 and 33 until no redundant cluster center remains; the number of cluster centers remaining at that point is the required number of clusters k;
Step 4) Using the number of clusters k determined in step 3, cluster each feature extracted in step 2 with the K-means clustering method;
Step 5) Compute the improved DB index of each feature clustered in step 4 according to the following formulas, and select the color feature with the smallest improved DB index among the color features and the texture feature with the smallest improved DB index among the texture features as the optimal color feature and the optimal texture feature:
$$S_t = \frac{1}{|C_t|} \sum_{x \in C_t} D(x, p_t)$$
$$DB_c = \frac{1}{k-1} \sum_{i=1,\, i \neq t}^{k} \frac{1 / S_t}{1 / D(p_i, p_t)} = \frac{1}{k-1} \sum_{i=1,\, i \neq t}^{k} \frac{D(p_i, p_t)}{S_t}$$
$$DB_t = \frac{1}{k-1} \sum_{i=1,\, i \neq t}^{k} \frac{S_t}{D(p_i, p_t)}$$
where $D(\cdot)$ is a distance operator: for color features, $D(\cdot)$ denotes the histogram intersection distance, and for texture features, $D(\cdot)$ denotes the Euclidean distance; $t$ is the cluster index of the target subclass; $S_t$ is the mean distance from all samples in target subclass $t$ to its cluster center; $|C_t|$ is the number of samples in target subclass $t$; $p_t$ is the cluster center of target subclass $t$; $k$ denotes the total number of clusters; $p_i$ denotes the cluster center of a non-target subclass; $DB_c$ denotes the improved DB index of a color feature; $DB_t$ denotes the improved DB index of a texture feature.
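The improved DB index of step 5 can be illustrated with a minimal Python sketch. It assumes NumPy, samples stored as rows of an array, and cluster labels already produced by K-means; the function names `histogram_intersection`, `euclidean`, and `improved_db` are hypothetical and do not come from the patent. For color features the sketch takes the histogram intersection of normalized histograms as the operator D, which is one possible reading of "histogram intersection distance".

```python
import numpy as np

def histogram_intersection(a, b):
    """Histogram intersection of two histograms (assumed normalized)."""
    return float(np.minimum(a, b).sum())

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))

def improved_db(samples, labels, centers, t, dist, color=False):
    """Improved DB index computed for the target cluster t only.

    samples: (n, d) array, labels: (n,) cluster ids, centers: (k, d) array.
    color=True uses the DB_c form D(p_i, p_t) / S_t,
    color=False uses the DB_t form S_t / D(p_i, p_t).
    """
    samples = np.asarray(samples, float)
    labels = np.asarray(labels)
    centers = np.asarray(centers, float)
    k = len(centers)
    members = samples[labels == t]
    # S_t: mean distance of the target cluster's samples to its center.
    s_t = float(np.mean([dist(x, centers[t]) for x in members]))
    others = [i for i in range(k) if i != t]
    if color:
        terms = [dist(centers[i], centers[t]) / s_t for i in others]
    else:
        terms = [s_t / dist(centers[i], centers[t]) for i in others]
    return sum(terms) / (k - 1)
```

Under these assumptions, the feature with the smallest returned value within each family (color or texture) would be chosen as the optimal feature for that family.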
Because remote sensing images reflect the physical characteristics of ground cover, they contain both color information and texture information, so the invention selects one optimal color feature and one optimal texture feature. In some special cases, such as lakes, a single feature is sufficient; the invention therefore also considers the weights of the different features, and sets the weight of a low-weight feature to 0 by binarization.
When the commonly used relevance feedback learning method is used for retrieval, the user has to label positive and negative samples in every feedback round, which greatly increases the user's burden. Moreover, in content-based remote sensing image retrieval there are usually only a few training samples (sometimes only one), and obtaining a large number of labeled training samples is difficult. The invention therefore prefers semi-supervised learning for image retrieval, in particular the co-training method and the self-training method. The essentials of these two semi-supervised learning methods are briefly introduced below:
Co-training rests on the assumption that the feature space can be naturally split into two sub-feature-spaces, with one classifier trained in each. During co-training, each classifier enlarges its own training set by adding samples labeled with high confidence by the other classifier, iterating until no unlabeled samples remain;
In self-training, an initial classification model is first built from the labeled samples and then used to estimate the labels of the unlabeled data; a suitable selection criterion picks out correctly labeled data, which are added to the training set, and the process iterates until a termination condition is met. Self-training requires a threshold Th as the termination condition of the iteration. This threshold should separate the target subclass from the non-target subclass nearest to it, and is set as follows:
$$Th = \frac{D_1}{D_1 + D_2} \times D_{12}$$
where $D_1$ and $D_2$ are the radii of the target cluster and of the nearest non-target cluster, respectively, and $D_{12}$ is the distance between the center of the target cluster and the center of the nearest non-target cluster. The radius of a cluster is usually taken as the largest distance from any sample in the cluster to its center. Because a cluster may contain a small number of noise samples, however, a pivot analysis approach can be used instead: the radius is taken as the largest distance to the center among the K% of samples nearest to the cluster center, where K is below 100 and can be chosen as needed.
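As an illustration of the threshold above, the following Python sketch computes a trimmed cluster radius and then Th. It is only a sketch under stated assumptions: NumPy arrays with one sample per row, Euclidean distance, and the hypothetical helper names `trimmed_radius` and `self_training_threshold`. The default of 95 for K follows the value used in the embodiment; how the K% cut is rounded is an assumption of this sketch.

```python
import numpy as np

def trimmed_radius(samples, center, k_percent=95.0):
    """Largest distance to the center among the k_percent nearest samples."""
    diffs = np.asarray(samples, float) - np.asarray(center, float)
    d = np.sort(np.linalg.norm(diffs, axis=1))
    # Assumption: round the K% cut down, but always keep at least one sample.
    keep = max(1, int(np.floor(len(d) * k_percent / 100.0)))
    return float(d[keep - 1])

def self_training_threshold(target, target_center,
                            neighbor, neighbor_center, k_percent=95.0):
    """Th = D1 / (D1 + D2) * D12 for the target and nearest non-target cluster."""
    d1 = trimmed_radius(target, target_center, k_percent)
    d2 = trimmed_radius(neighbor, neighbor_center, k_percent)
    d12 = float(np.linalg.norm(np.asarray(target_center, float)
                               - np.asarray(neighbor_center, float)))
    return d1 / (d1 + d2) * d12
```

Trimming the farthest samples keeps a single outlier from inflating a radius, which would otherwise shift the threshold toward the wrong cluster.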
Because the method selects an optimal color feature and an optimal texture feature, and the weights of these two features may affect the retrieval performance of the semi-supervised learning method, the invention uses the improved DB indices obtained during feature selection to determine the weights of the optimal color feature and the optimal texture feature. For color and texture features, the same DB index value represents different degrees of feature difference in different value ranges, so the feature weights must be determined by non-uniform quantization; here binarization is used. For the color feature, when the reciprocal of the improved DB index of the selected optimal feature is less than a threshold $T_1$, the target subclass and the non-target subclasses do not differ clearly in color space, and the weight of the color feature is set to 0; otherwise it is set to 1. For the texture feature, when the reciprocal of the improved DB index of the selected optimal feature is less than a threshold $T_2$, the target subclass and the non-target subclasses do not differ clearly in texture feature space, and the weight of the texture feature is set to 0; otherwise it is set to 1.
From the above analysis, the preferred scheme of the invention is as follows:
First, select the optimal color feature and the optimal texture feature of the image to be retrieved by cluster analysis, according to the MDL criterion and the improved DB index, that is, carry out steps 1 to 5 above;
Then, choose a suitable semi-supervised learning method according to the weights of the optimal color feature and the optimal texture feature, and use the chosen method for image retrieval; this is realized by the following steps:
Step 6) Compute the binarized weights of the optimal color feature and the optimal texture feature from their improved DB indices, as follows:
For the color feature, when the reciprocal of the improved DB index of the selected optimal feature is less than a preset threshold $T_1$, the target subclass and the non-target subclasses do not differ clearly in color space, and the weight of the color feature is set to 0; otherwise it is set to 1. For the texture feature, when the reciprocal of the improved DB index of the selected optimal feature is less than a preset threshold $T_2$, the target subclass and the non-target subclasses do not differ clearly in texture feature space, and the weight of the texture feature is set to 0; otherwise it is set to 1;
Step 7) Choose a suitable semi-supervised learning method for retrieval: when the binarized weights of the optimal color feature and the optimal texture feature are both 1, retrieve with the co-training method; when the weight of one of the two features is 0, retrieve with the self-training method using only the feature whose weight is 1.
The invention first selects the optimal color feature and the optimal texture feature by clustering, according to the minimum description length criterion and the improved Davies-Bouldin index, and then chooses a suitable semi-supervised learning method according to the binarized weights of the optimal color and texture features to retrieve remote sensing images. Compared with the prior art, the invention not only greatly improves retrieval quality but also effectively reduces the amount of computation during retrieval and increases retrieval speed.
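Steps 6 and 7 amount to a small decision rule, sketched below in Python. The function names `binarized_weight` and `choose_method` are hypothetical. The default thresholds 2 and 3 are the values the embodiment uses for T1 and T2. The case in which both weights are 0 is not addressed in the text, so the sketch simply reports it as unspecified rather than guessing.

```python
def binarized_weight(improved_db_index, threshold):
    """Weight 0 if the reciprocal of the improved DB index is below the threshold, else 1."""
    return 0 if (1.0 / improved_db_index) < threshold else 1

def choose_method(db_color, db_texture, t1=2.0, t2=3.0):
    """Return (method, features) following the selection rule of step 7."""
    w_color = binarized_weight(db_color, t1)
    w_texture = binarized_weight(db_texture, t2)
    if w_color == 1 and w_texture == 1:
        # Both features are discriminative: one classifier per feature space.
        return "co-training", ("color", "texture")
    if w_color == 1:
        return "self-training", ("color",)
    if w_texture == 1:
        return "self-training", ("texture",)
    # Assumption: the source does not say what to do when both weights are 0.
    return "unspecified", ()
```

For example, `choose_method(0.2, 0.5)` yields reciprocals 5 and 2; the color weight is 1 and the texture weight is 0, so self-training on the color feature alone would be selected.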
Description of drawings
Fig. 1 is a flow chart of an embodiment of the invention.
Embodiment
The technical scheme of the invention is described in detail below with reference to the accompanying drawing:
Retrieval experiments were carried out with the method of the invention on different types of land cover, including retrieval of soil erosion areas as well as retrieval of general targets such as residential areas, forest land, and lake enclosure aquaculture. Retrieval proceeds by the following steps:
Step 1) Partition the image to be retrieved into blocks;
In this embodiment, to avoid splitting the same target across different small blocks, an overlapping partition strategy is adopted: each block has length = min(128, length of the sample image) and width = min(128, width of the sample image), and adjacent blocks overlap by 1/2 length × 1/2 width pixels;
Step 2) Extract each color feature and each texture feature of the image to be retrieved;
In this embodiment, the HSI color feature, the Lab color feature, the GLCM texture feature, and the Gabor texture feature are extracted;
Step 3) Determine the number of clusters k according to the minimum description length criterion, by the following steps:
Step 31) Initialize m cluster centers according to the maximum distance criterion;
Step 32) Select an arbitrary cluster center $C_j$ and compute $\Delta l_{C_j}$, the total change in code length that would result from removing $C_j$, according to the following formula:
$$\Delta l_{C_j} = -L_0 - n_j \log_2 p_j + \sum_{q=1,\, q \neq j}^{m} n_{jq} \log_2\!\left(\frac{n_q + n_{jq}}{|I|}\right) + \sum_{x \in C_j} \sum_{i=1}^{d} \frac{(x_i - c_{iq})^2 - (x_i - c_{ij})^2}{2 (\ln 2)\, \sigma^2}$$
where $L_0$ denotes the code length of the cluster centers:
$$L_0 = 9 \times \sigma \times \sum_{j=1}^{m} \left( -n_j \log_2 p_j + \sum_{q=1,\, q \neq j}^{m} n_{jq} \log_2\!\left(\frac{n_q + n_{jq}}{|I|}\right) + \sum_{x \in C_j} \sum_{i=1}^{d} \frac{(x_i - c_{iq})^2 - (x_i - c_{ij})^2}{2 (\ln 2)\, \sigma^2} \right) \Big/ m ;$$
$n_q$ denotes the number of samples in the q-th cluster; $n_{jq}$ denotes the number of samples whose nearest reference point is the j-th cluster center and whose second-nearest reference point is the q-th cluster center; $d$ denotes the feature dimension; $x$ is a sample in cluster $C_j$, and $x_i$ is the value of its i-th feature; $c_{iq}$ is the value of the i-th dimension of the q-th cluster center, and $c_{ij}$ is the value of the i-th dimension of the j-th cluster center; $|I|$ denotes the total number of samples; $p_j$ denotes the proportion of all samples that belong to cluster $C_j$; $\sigma$ is the variance of the sample data, with values in the range [0.1, 0.2]; in this embodiment, $\sigma$ is 0.12;
Step 33) Determine whether the $\Delta l_{C_j}$ obtained in step 32 is less than 0; if so, remove cluster center $C_j$; if not, keep cluster center $C_j$;
Step 34) Repeat steps 32 and 33 until no redundant cluster center remains; the number of cluster centers remaining at that point is the required number of clusters k;
Step 4) Using the number of clusters k determined in step 3, cluster each feature extracted in step 2 with the K-means clustering method;
Step 5) Compute the improved DB index of each feature clustered in step 4 according to the following formulas, and select the color feature with the smallest improved DB index among the color features and the texture feature with the smallest improved DB index among the texture features as the optimal color feature and the optimal texture feature:
$$S_t = \frac{1}{|C_t|} \sum_{x \in C_t} D(x, p_t)$$
$$DB_c = \frac{1}{k-1} \sum_{i=1,\, i \neq t}^{k} \frac{1 / S_t}{1 / D(p_i, p_t)} = \frac{1}{k-1} \sum_{i=1,\, i \neq t}^{k} \frac{D(p_i, p_t)}{S_t}$$
$$DB_t = \frac{1}{k-1} \sum_{i=1,\, i \neq t}^{k} \frac{S_t}{D(p_i, p_t)}$$
where $D(\cdot)$ is a distance operator: for color features, $D(\cdot)$ denotes the histogram intersection distance, and for texture features, $D(\cdot)$ denotes the Euclidean distance; $t$ is the cluster index of the target subclass; $S_t$ is the mean distance from all samples in target subclass $t$ to its cluster center; $|C_t|$ is the number of samples in target subclass $t$; $p_t$ is the cluster center of target subclass $t$; $k$ denotes the total number of clusters; $p_i$ denotes the cluster center of a non-target subclass; $DB_c$ denotes the improved DB index of a color feature; $DB_t$ denotes the improved DB index of a texture feature;
Step 6) Compute the binarized weights of the optimal color feature and the optimal texture feature from their improved DB indices, as follows:
For the color feature, when the reciprocal of the improved DB index of the selected optimal feature is less than a preset threshold $T_1$, the target subclass and the non-target subclasses do not differ clearly in color space, and the weight of the color feature is set to 0; otherwise it is set to 1. For the texture feature, when the reciprocal of the improved DB index of the selected optimal feature is less than a preset threshold $T_2$, the target subclass and the non-target subclasses do not differ clearly in texture feature space, and the weight of the texture feature is set to 0; otherwise it is set to 1;
In this embodiment, the thresholds $T_1$ and $T_2$ are set to 2 and 3, respectively;
Step 7) Choose a suitable semi-supervised learning method for retrieval: when the binarized weights of the optimal color feature and the optimal texture feature are both 1, retrieve with the co-training method; when the weight of one of the two features is 0, retrieve with the self-training method using only the feature whose weight is 1;
If the self-training method is selected in this step, the threshold Th used in the clustering process as the iteration termination condition is determined according to the following formula:
$$Th = \frac{D_1}{D_1 + D_2} \times D_{12}$$
where $D_1$ and $D_2$ are, for the target cluster and the nearest non-target cluster respectively, the distance between the cluster center and the farthest sample among the K% of samples nearest to that center, with K below 100; $D_{12}$ is the distance between the center of the target cluster and the center of the nearest non-target cluster. In this embodiment, K is 95.
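The overlapping partition described for step 1 of this embodiment could be sketched in Python as follows. This is a hedged illustration, not the patented implementation: it assumes a NumPy image array, a stride of half a block in each direction (the stated 1/2 overlap), and that blocks which would run past the right or bottom edge are simply skipped, a detail the text does not specify. The function name `overlapping_blocks` and the `sample_shape` parameter are hypothetical.

```python
import numpy as np

def overlapping_blocks(image, sample_shape, max_size=128):
    """Split image into half-overlapping blocks.

    Block size is min(max_size, sample length) by min(max_size, sample width),
    where sample_shape = (length, width) of the user's example image.
    Returns a list of ((top, left), block) pairs.
    """
    h, w = image.shape[:2]
    bh = min(max_size, sample_shape[0], h)
    bw = min(max_size, sample_shape[1], w)
    # Half-block stride gives the 1/2 overlap between neighboring blocks.
    sh, sw = max(1, bh // 2), max(1, bw // 2)
    blocks = []
    for top in range(0, h - bh + 1, sh):
        for left in range(0, w - bw + 1, sw):
            blocks.append(((top, left), image[top:top + bh, left:left + bw]))
    return blocks
```

With a half-block stride, a target that straddles the boundary of one block tends to fall nearer the middle of a neighboring block, which is the stated motivation for the overlap.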
The method of the invention can be fully combined with existing CBIR systems to achieve automated remote sensing image retrieval.
Comparative retrieval tests of the method of the invention against an existing relevance feedback method show that the two are comparable in recall and precision, but that the method of the invention takes far less retrieval time than the relevance feedback method. In addition, unlike relevance feedback based on human-computer interaction, the method of the invention does not require repeated interaction, which reduces the user's burden.

Claims (5)

1. A remote sensing image retrieval method based on feature selection and semi-supervised learning, in which features of the image to be retrieved are first selected and a corresponding classifier is then constructed from the selected features for retrieval, characterized in that: selecting the features of the image to be retrieved means selecting the optimal color feature and the optimal texture feature of the image to be retrieved by cluster analysis, according to the minimum description length criterion and the improved Davies-Bouldin index; this is realized by the following steps:
Step 1) Partition the image to be retrieved into blocks;
Step 2) Extract each color feature and each texture feature of the image to be retrieved;
Step 3) Determine the number of clusters k according to the minimum description length criterion, by the following steps:
Step 31) Initialize m cluster centers according to the maximum distance criterion;
Step 32) Select an arbitrary cluster center $C_j$ and compute $\Delta l_{C_j}$, the total change in code length that would result from removing $C_j$, according to the following formula:
$$\Delta l_{C_j} = -L_0 - n_j \log_2 p_j + \sum_{q=1,\, q \neq j}^{m} n_{jq} \log_2\!\left(\frac{n_q + n_{jq}}{|I|}\right) + \sum_{x \in C_j} \sum_{i=1}^{d} \frac{(x_i - c_{iq})^2 - (x_i - c_{ij})^2}{2 (\ln 2)\, \sigma^2}$$
where $L_0$ denotes the code length of the cluster centers:
$$L_0 = 9 \times \sigma \times \sum_{j=1}^{m} \left( -n_j \log_2 p_j + \sum_{q=1,\, q \neq j}^{m} n_{jq} \log_2\!\left(\frac{n_q + n_{jq}}{|I|}\right) + \sum_{x \in C_j} \sum_{i=1}^{d} \frac{(x_i - c_{iq})^2 - (x_i - c_{ij})^2}{2 (\ln 2)\, \sigma^2} \right) \Big/ m ;$$
$n_q$ denotes the number of samples in the q-th cluster; $n_{jq}$ denotes the number of samples whose nearest reference point is the j-th cluster center and whose second-nearest reference point is the q-th cluster center; $d$ denotes the feature dimension; $x$ is a sample in cluster $C_j$, and $x_i$ is the value of its i-th feature; $c_{iq}$ is the value of the i-th dimension of the q-th cluster center, and $c_{ij}$ is the value of the i-th dimension of the j-th cluster center; $|I|$ denotes the total number of samples; $p_j$ denotes the proportion of all samples that belong to cluster $C_j$; $\sigma$ is the variance of the sample data, with values in the range [0.1, 0.2];
Step 33) determining whether the $\Delta l_{C_j}$ obtained in step 32 is less than 0; if so, removing cluster centre $C_j$; if not, keeping cluster centre $C_j$;
Step 34) repeating steps 32 and 33 until no redundant cluster centre remains; the number of cluster centres remaining at that point is the number of clusters k to be determined;
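The pruning loop of steps 31 to 34 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, `L0` is passed in as a precomputed constant rather than derived from its formula, the first sample seeds the maximum-distance initialization, and an empty cluster is treated as saving only its own code length.

```python
import numpy as np

def farthest_first_centres(X, m):
    """Maximum-distance initialisation (step 31): each new centre is the
    sample farthest from all centres chosen so far."""
    centres = [X[0]]  # assumption: the first sample seeds the procedure
    for _ in range(m - 1):
        diffs = X[:, None, :] - np.asarray(centres)[None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        centres.append(X[np.argmax(nearest)])
    return np.asarray(centres, dtype=float)

def code_length_change(X, centres, j, L0, sigma):
    """Delta l_{C_j} of step 32 for the hypothetical removal of centre j."""
    dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)
    nearest, second = order[:, 0], order[:, 1]
    members = nearest == j
    n_j = int(members.sum())
    if n_j == 0:
        return -L0  # assumption: an empty cluster only saves its own code
    total = len(X)
    delta = -L0 - n_j * np.log2(n_j / total)
    for q in np.unique(second[members]):
        n_jq = int((members & (second == q)).sum())
        n_q = int((nearest == q).sum())
        delta += n_jq * np.log2((n_q + n_jq) / total)
    # Residual term: squared distance to the second-nearest centre minus
    # squared distance to the current centre, summed over members of C_j.
    rows = np.flatnonzero(members)
    extra = dist[rows, second[rows]] ** 2 - dist[rows, j] ** 2
    return delta + extra.sum() / (2.0 * np.log(2.0) * sigma ** 2)

def prune_centres(X, centres, L0, sigma=0.15):
    """Steps 33 and 34: drop centres whose removal shortens the code,
    repeating until no redundant centre is left."""
    centres = np.asarray(centres, dtype=float)
    removed = True
    while removed and len(centres) > 1:
        removed = False
        for j in range(len(centres)):
            if code_length_change(X, centres, j, L0, sigma) < 0:
                centres = np.delete(centres, j, axis=0)
                removed = True
                break
    return centres
```

The number of rows returned by `prune_centres` plays the role of k. Restarting the scan after every removal is a design choice of this sketch; the claim only requires iterating until no redundant centre remains.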
Step 4) using the number of clusters k determined in step 3, clustering each feature extracted in step 2 separately with the K-means clustering method;
Step 5) computing, according to the following formulas, the improved Davies-Bouldin index of each feature clustered in step 4, and selecting the color feature with the smallest improved Davies-Bouldin index among the color features and the texture feature with the smallest improved Davies-Bouldin index among the texture features as the optimal color feature and the optimal texture feature, respectively:
$$S_t = \frac{1}{|C_t|} \sum_{x \in C_t} D(x, p_t)$$
$$DB_c = \frac{1}{k-1} \sum_{i=1,\, i \neq t}^{k} \frac{1/S_t}{1/D(p_i, p_t)} = \frac{1}{k-1} \sum_{i=1,\, i \neq t}^{k} \frac{D(p_i, p_t)}{S_t}$$
$$DB_t = \frac{1}{k-1} \sum_{i=1,\, i \neq t}^{k} \frac{S_t}{D(p_i, p_t)}$$
where $D(\cdot)$ is a distance operator: for color features, $D(\cdot)$ denotes the histogram intersection distance, and for texture features, $D(\cdot)$ denotes the Euclidean distance; $t$ is the cluster index of the target subclass; $S_t$ is the mean distance from all samples in target subclass $t$ to its cluster centre; $|C_t|$ is the number of samples in target subclass $t$; $p_t$ is the cluster centre of target subclass $t$; $k$ denotes the total number of clusters; $p_i$ denotes the cluster centre of a non-target subclass; $DB_c$ denotes the improved Davies-Bouldin index of a color feature; $DB_t$ denotes the improved Davies-Bouldin index of a texture feature.
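The two indices of step 5 can be sketched in Python as below. This is an illustrative sketch under assumptions: the function names are hypothetical, histogram intersection is taken as the sum of bin-wise minima of normalized histograms, and the target cluster index `t` is supplied by the caller, since the claim does not state how it is identified.

```python
import numpy as np

def hist_intersection(a, b):
    """Histogram intersection (assumed form: sum of bin-wise minima)."""
    return float(np.minimum(a, b).sum())

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def improved_db(X, labels, centres, t, dist, colour):
    """Improved Davies-Bouldin index of the target cluster t.

    colour=True  -> DB_c = mean over i != t of D(p_i, p_t) / S_t
    colour=False -> DB_t = mean over i != t of S_t / D(p_i, p_t)
    """
    members = X[labels == t]
    # S_t: mean distance from the members of cluster t to its centre.
    s_t = np.mean([dist(x, centres[t]) for x in members])
    others = [i for i in range(len(centres)) if i != t]
    if colour:
        terms = [dist(centres[i], centres[t]) / s_t for i in others]
    else:
        terms = [s_t / dist(centres[i], centres[t]) for i in others]
    return sum(terms) / (len(centres) - 1)

# Texture-style example: two well separated clusters on a line.
X_tex = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]])
lab = np.array([0, 0, 1, 1])
cen_tex = np.array([[1.0, 0.0], [11.0, 0.0]])
db_texture = improved_db(X_tex, lab, cen_tex, 0, euclidean, colour=False)

# Colour-style example: normalised two-bin histograms.
X_col = np.array([[0.5, 0.5], [0.5, 0.5], [1.0, 0.0], [1.0, 0.0]])
cen_col = np.array([[0.5, 0.5], [1.0, 0.0]])
db_colour = improved_db(X_col, lab, cen_col, 0, hist_intersection, colour=True)
```

Given one index per candidate feature, the optimal feature in each family is simply the candidate with the smallest value, for example `min(candidates, key=candidates.get)` over a hypothetical dictionary mapping feature names to indices.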
2. The remote sensing image retrieval method based on feature selection and semi-supervised learning according to claim 1, characterized in that: constructing a corresponding classifier from the selected features to perform retrieval means selecting a suitable semi-supervised learning method according to the weights of the optimal color feature and the optimal texture feature, and performing image retrieval with the selected semi-supervised learning method; this is specifically realized by the following steps:
Step 6) computing the binary weights of the optimal color feature and the optimal texture feature from the improved Davies-Bouldin index, as follows:
For the color feature, when the reciprocal of the improved Davies-Bouldin index of the selected optimal feature is less than a preset threshold $T_1$, the difference between the target subclass and the non-target subclasses in color space is not pronounced, and the weight of the color feature is set to 0; otherwise it is set to 1. For the texture feature, when the reciprocal of the improved Davies-Bouldin index of the selected optimal feature is less than a preset threshold $T_2$, the difference between the target subclass and the non-target subclasses in texture feature space is not pronounced, and the weight of the texture feature is set to 0; otherwise it is set to 1;
Step 7) selecting a suitable semi-supervised learning method for retrieval, specifically: when the binary weights of the optimal color feature and the optimal texture feature are both 1, the co-training method is selected for retrieval; when the weight of one of the two features is 0, the self-training method is selected and retrieval relies solely on the feature whose weight is 1.
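The decision logic of steps 6 and 7 is small enough to sketch directly. The function names and return format below are hypothetical; the default thresholds 2 and 3 are the values given in claim 3, and the case in which both weights are 0 is not covered by the claim, so the sketch returns `None` for it.

```python
def binary_weight(db_index, threshold):
    """Step 6: weight is 0 when the reciprocal of the improved
    Davies-Bouldin index falls below the threshold, otherwise 1."""
    return 0 if 1.0 / db_index < threshold else 1

def choose_method(db_colour, db_texture, t1=2.0, t2=3.0):
    """Step 7: pick the semi-supervised method from the binary weights."""
    w_colour = binary_weight(db_colour, t1)
    w_texture = binary_weight(db_texture, t2)
    if w_colour == 1 and w_texture == 1:
        return "co-training", ["colour", "texture"]
    if w_colour == 1:
        return "self-training", ["colour"]
    if w_texture == 1:
        return "self-training", ["texture"]
    # Both weights are 0: the claim does not say what to do here.
    return None, []
```

For example, `choose_method(0.1, 0.1)` selects co-training on both features, while `choose_method(1.0, 0.1)` falls back to self-training on the texture feature alone.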
3. The remote sensing image retrieval method based on feature selection and semi-supervised learning according to claim 2, characterized in that: the preset thresholds $T_1$ and $T_2$ in step 6 take the values 2 and 3, respectively.
4. The remote sensing image retrieval method based on feature selection and semi-supervised learning according to claim 2, characterized in that: when the self-training method is selected for retrieval in step 7, the threshold Th used in the clustering process as the iteration termination condition is determined by the following formula:
$$Th = \frac{D_1}{D_1 + D_2} \times D_{12}$$
where $D_1$ and $D_2$ are computed for the target cluster and for its nearest neighbouring non-target cluster, respectively: each is the distance between the cluster centre and the farthest sample among the K% of that cluster's samples closest to its centre, with K < 100; $D_{12}$ is the distance between the centre of the target cluster and the centre of the nearest neighbouring non-target cluster.
5. The remote sensing image retrieval method based on feature selection and semi-supervised learning according to claim 4, characterized in that: the value of K is 95.
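The stopping threshold Th of claims 4 and 5 can be sketched as follows. The function names are hypothetical, Euclidean distance is assumed, and rounding the K% sample count down (with at least one sample kept) is an assumption of this sketch that the claims do not specify.

```python
import numpy as np

def trimmed_radius(samples, centre, k_percent=95):
    """Distance from the centre to the farthest sample among the
    k_percent of samples closest to the centre."""
    d = np.sort(np.linalg.norm(samples - centre, axis=1))
    n_keep = max(1, int(np.floor(len(d) * k_percent / 100.0)))
    return float(d[n_keep - 1])

def stopping_threshold(target, target_centre, other, other_centre,
                       k_percent=95):
    """Th = D1 / (D1 + D2) * D12 for the target cluster and its nearest
    neighbouring non-target cluster."""
    d1 = trimmed_radius(target, target_centre, k_percent)
    d2 = trimmed_radius(other, other_centre, k_percent)
    d12 = float(np.linalg.norm(target_centre - other_centre))
    return d1 / (d1 + d2) * d12

# Small worked example with K = 50 so the trimming is visible.
target = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
other = np.array([[10.0, 1.0], [10.0, 2.0], [10.0, 3.0], [10.0, 4.0]])
th = stopping_threshold(target, np.array([0.0, 0.0]),
                        other, np.array([10.0, 0.0]), k_percent=50)
```

Trimming to the nearest K% keeps a few outlying samples from inflating either radius, so Th splits the centre-to-centre distance in proportion to the compact extent of each cluster.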
CN2010101951398A 2010-06-08 2010-06-08 Remote sensing image retrieval method based on feature selection and semi-supervised learning Expired - Fee Related CN101853304B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101951398A CN101853304B (en) 2010-06-08 2010-06-08 Remote sensing image retrieval method based on feature selection and semi-supervised learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010101951398A CN101853304B (en) 2010-06-08 2010-06-08 Remote sensing image retrieval method based on feature selection and semi-supervised learning

Publications (2)

Publication Number Publication Date
CN101853304A CN101853304A (en) 2010-10-06
CN101853304B true CN101853304B (en) 2011-10-05

Family

ID=42804795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101951398A Expired - Fee Related CN101853304B (en) 2010-06-08 2010-06-08 Remote sensing image retrieval method based on feature selection and semi-supervised learning

Country Status (1)

Country Link
CN (1) CN101853304B (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102467564B (en) * 2010-11-12 2013-06-05 中国科学院烟台海岸带研究所 Remote sensing image retrieval method based on improved support vector machine relevance feedback
CN102033933B (en) * 2010-12-17 2012-02-01 南方医科大学 Distance metric optimization method for maximizing mean average precision (MAP)
CN102096825B (en) * 2011-03-23 2014-04-16 西安电子科技大学 Graph-based semi-supervised high-spectral remote sensing image classification method
CN102254303B (en) * 2011-06-13 2013-01-02 河海大学 Methods for segmenting and searching remote sensing image
CN102542050B (en) * 2011-12-28 2016-01-20 辽宁师范大学 Based on the image feedback method and system of support vector machine
CN102622607B (en) * 2012-02-24 2013-09-25 河海大学 Remote sensing image classification method based on multi-feature fusion
CN102646200B (en) * 2012-03-08 2014-06-04 武汉大学 Image classifying method and system for self-adaption weight fusion of multiple classifiers
CN102999542B (en) * 2012-06-21 2015-12-16 杜小勇 Multi-medium data high dimensional indexing and kNN search method
CN104252625A (en) * 2013-06-28 2014-12-31 河海大学 Sample adaptive multi-feature weighted remote sensing image method
CN103714349B (en) * 2014-01-09 2017-01-25 成都淞幸科技有限责任公司 Image recognition method based on color and texture features
CN104182538B (en) * 2014-09-01 2017-06-13 西安电子科技大学 Image search method based on semi-supervised Hash
CN104239551B (en) * 2014-09-24 2017-04-19 河海大学 Multi-feature VP-tree index-based remote sensing image retrieval method and multi-feature VP-tree index-based remote sensing image retrieval device
CN104573650B (en) * 2014-12-31 2017-07-14 国家电网公司 A kind of electric wire detection sorting technique based on filter response
CN104794496A (en) * 2015-05-05 2015-07-22 中国科学院遥感与数字地球研究所 Remote sensing character optimization algorithm for improving mRMR (min-redundancy max-relevance) algorithm
CN104834944B (en) * 2015-05-26 2018-03-27 杭州尚青科技有限公司 A kind of urban area air quality method of estimation based on coorinated training
CN106295478A (en) * 2015-06-04 2017-01-04 深圳市中兴微电子技术有限公司 A kind of image characteristic extracting method and device
CN107292339B (en) * 2017-06-16 2020-07-21 重庆大学 Unmanned aerial vehicle low-altitude remote sensing image high-resolution landform classification method based on feature fusion
CN108898096B (en) * 2018-06-27 2022-04-08 重庆交通大学 High-resolution image-oriented information rapid and accurate extraction method
CN109213886B (en) * 2018-08-09 2021-01-08 山东师范大学 Image retrieval method and system based on image segmentation and fuzzy pattern recognition
CN109657083B (en) * 2018-12-27 2020-07-14 广州华迅网络科技有限公司 Method and device for establishing textile picture feature library
CN109816034B (en) * 2019-01-31 2021-08-27 清华大学 Signal characteristic combination selection method and device, computer equipment and storage medium
CN109933619B (en) * 2019-03-13 2022-02-08 西南交通大学 Semi-supervised classification prediction method
CN111898710B (en) * 2020-07-15 2023-09-29 中国人民解放军火箭军工程大学 Feature selection method and system of graph
CN113780308A (en) * 2021-08-27 2021-12-10 吉林省电力科学研究院有限公司 GIS partial discharge mode identification method and system based on kernel principal component analysis and neural network

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101329736A (en) * 2008-06-20 2008-12-24 西安电子科技大学 Method of image segmentation based on character selection and hidden Markov model

Also Published As

Publication number Publication date
CN101853304A (en) 2010-10-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20111005

Termination date: 20210608
