CN104794219A

CN104794219A - Scene retrieval method based on geographical position information

Info

Publication number: CN104794219A
Application number: CN201510208102.7A
Authority: CN
Inventors: 姚金良; 吴铤; 黄芬; 王小华; 杨冰; 黄孝喜; 王荣波; 谌志群; 陈浩; 窦文生
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2015-04-28
Filing date: 2015-04-28
Publication date: 2015-07-22

Abstract

The invention discloses a scene retrieval method based on geographical position information. The method comprises the following steps: first, in the indexing process, the geographic information and the global descriptor of each scene image are indexed, and a spatial quadtree index structure is built; second, in the query process, a set of similar images is determined by hierarchical verification, and the information of the scene image being queried is obtained by voting over the information of the similar images. The global descriptor is constructed as follows: first, local feature points of the scene image are extracted, and the scene image is represented as a set of local features; second, the descriptors of the local feature points are quantized against a visual vocabulary dictionary to obtain the corresponding visual words; third, the set of extracted visual words is projected onto a random projection matrix to obtain the global descriptor of the scene image. The method efficiently filters out a large number of irrelevant images and improves both the efficiency of spatial verification of the visual words and the accuracy of image matching.

Description

A scene retrieval method based on geographical position information

Technical field

The present invention relates to a real-scene retrieval method based on geographical position information, and belongs to the field of image retrieval and recognition in digital image processing.

Background technology

Scene retrieval and recognition are important research areas in digital image processing, and researchers have proposed many methods for scene recognition. The current mainstream approach is content-based scene recognition, which extracts features from an image, represents the image as a feature vector, and selects a suitable classifier to recognize it. Existing content-based scene recognition methods fall into two classes according to the features extracted. The first class is based on low-level features: these methods represent an image by low-level features such as color histograms. The second class is based on semantic features and can be further divided into methods based on local features and methods based on parts. Methods based on local features currently give the best results. They represent an image as a set of local features (SIFT, SURF, etc.), match the local features, and then classify with a classifier. All of these methods need to build a scene recognition classifier; that is, the scene categories must be defined in advance, and classifiers for the different categories are then obtained by training. They are therefore difficult to apply to general scene recognition needs such as assisted navigation and scene information retrieval. Unlike these earlier methods, the present invention provides a retrieval-based scene recognition method for general outdoor scenes. The method retrieves images similar to the query image using image retrieval techniques and then determines the scene information by voting.

In addition, with the spread of smartphones and the wide use of GPS information, it has become easy to obtain geographic information at the time an image is taken. Smartphone location information is already widely used in many fields, most commonly in navigation systems. In recent years, researchers have also proposed using the geographical position recorded when a scene is photographed to improve the accuracy and efficiency of scene image retrieval. Scene retrieval and recognition based on smartphone location information can be applied to outdoor target retrieval, assisted navigation, and similar applications. Outdoor target retrieval has significant practical value for querying specific targets at tourist attractions. For example, there are many stone inscriptions along the path up Mount Tai; a visitor who does not know their history can photograph an inscription with a smartphone camera and query information about the target of interest through scene image retrieval, which gives the user a better travel experience and improves the service quality of the attraction. Furthermore, current maps require the user to determine the direction and to confirm manually whether the current target is the destination. This confirmation is difficult for users unfamiliar with their surroundings. Automatically confirming, through scene image retrieval and recognition, whether the scene target matches the user's destination would be a significant aid to current map systems. The scene image retrieval and recognition technique of the present method meets this kind of application need well.

Unlike earlier content-based image retrieval and recognition methods, the present method is a scene image retrieval technique based on geographic information. In recent years some research has been carried out on image retrieval based on geographic information, and several methods have been proposed, for example the retrieval method of Gabriel Takacs (Gabriel Takacs, et al., "Outdoors augmented reality on mobile phone using loxel-based visual feature organization," In Proc. of the 1st ACM International Conference on Multimedia Information Retrieval, 2008, Vancouver, British Columbia, Canada). That method builds sets of SURF features of images organized by geographic information, and at query time retrieves images by feature point matching and geometric verification. Such methods based on image feature point matching represent an image by feature point descriptors such as SIFT or SURF descriptors. They are robust to illumination, rotation, and the like, and give good matching results. However, each feature point is represented by a descriptor vector, and every image has hundreds or even thousands of feature points, so for a data set containing a large number of images both the storage requirement and the computation speed are considerable challenges. Researchers have therefore quantized the local feature descriptors and proposed methods based on visual words. Instead of describing an image with many feature point descriptors, each feature point descriptor is quantized to a visual word, which is analogous to a word in text. The image is then represented as a set of visual words, which both reduces the storage requirement and speeds up processing. This approach has an obvious drawback, however: quantization weakens the discriminative power of the visual words, and spatial information is lost, so further spatial verification is needed. Building on this prior work, the present method is a new fast scene image retrieval method that fuses geographic information with visual words to address these existing problems.

Summary of the invention

The object of the present invention is to address the application requirements of outdoor scene retrieval and the efficiency problems of existing methods by proposing a scene retrieval method based on geographic information. The method achieves fast and accurate scene retrieval and recognition.

A scene retrieval method based on geographical position information specifically comprises the following steps:

Step 1: in the indexing process, the geographic information and the global descriptor of each scene image are indexed, and a spatial quadtree index structure is built;

Step 2: in the query process, a set of similar images is determined by hierarchical verification, and the information of the scene image being queried is obtained by voting over the information of the similar images.

The global descriptor described in step 1 is constructed as follows:

1-1. Extract the local feature points of the scene image, and represent the scene image as a set of local features;

1-2. Quantize the descriptors of the local feature points against the visual vocabulary dictionary to obtain the corresponding visual words;

1-3. Project the set of extracted visual words onto a random projection matrix to obtain the global descriptor of the scene image.

When the spatial quadtree index structure described in step 1 is built, each region is split at the troughs of the horizontal and vertical projection histograms of the targets in the region. The splitting proceeds as follows:

First, the region of interest on the map is divided into identical cells;

Then, the value of each cell is set to the number of images in the cell; the number of images at each horizontal and vertical position is accumulated, forming a horizontal histogram and a vertical histogram on the left side and along the bottom, respectively;

Finally, the histograms are smoothed with a moving average whose window size is 3 and whose weights are [0.2, 0.6, 0.2]. A position i is a peak if H(i) > H(i-1) and H(i) > H(i+1). Smoothing is iterated until the histogram has a two-peak shape, and the position of the minimum between the two peaks is taken as the split position. If iterative smoothing cannot produce a two-peak shape, the region is simply divided evenly into four regions.
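The splitting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the iteration cap `max_iter`, and the handling of the two border bins (the border value is reused) are not given in the text and are chosen here for the sketch.

```python
from typing import List, Optional

WEIGHTS = (0.2, 0.6, 0.2)  # smoothing weights stated in the text, window size 3


def smooth(hist: List[float]) -> List[float]:
    """One pass of the weighted moving average.

    Assumption: at the two ends the border value is reused as the missing neighbor.
    """
    n = len(hist)
    out = []
    for i in range(n):
        left = hist[max(i - 1, 0)]
        right = hist[min(i + 1, n - 1)]
        out.append(WEIGHTS[0] * left + WEIGHTS[1] * hist[i] + WEIGHTS[2] * right)
    return out


def peaks(hist: List[float]) -> List[int]:
    """Indices i with H(i) > H(i-1) and H(i) > H(i+1)."""
    return [i for i in range(1, len(hist) - 1)
            if hist[i] > hist[i - 1] and hist[i] > hist[i + 1]]


def split_position(hist: List[float], max_iter: int = 50) -> Optional[int]:
    """Return the index of the trough between two peaks, or None.

    None means no two-peak shape was reached, in which case the caller
    falls back to an even four-way split of the region.
    """
    h = [float(v) for v in hist]
    for _ in range(max_iter):  # max_iter is a hypothetical safety cap
        p = peaks(h)
        if len(p) == 2:
            lo, hi = p
            return min(range(lo, hi + 1), key=lambda i: h[i])
        if len(p) < 2:
            return None  # smoothing cannot create new peaks here; give up
        h = smooth(h)
    return None
```

The function would be called once on the horizontal histogram and once on the vertical histogram; the two returned positions give the longitude and latitude thresholds of the new intermediate node.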

The hierarchical verification described in step 2 uses three layers of features, carried out in order: geographic information constraint, global descriptor verification, and visual word spatial verification.

The global descriptor is obtained by projecting the set of visual words onto a randomly generated projection matrix composed of -1 and 1 and binarizing the projection result, which gives a global descriptor represented as a sequence of 0s and 1s. Specifically:

A random projection matrix P composed of 1 and -1, with K rows and M columns, is generated at random, where K is the length of the global descriptor and M is the number of visual words in the dictionary. All sets of visual words are projected onto this same projection matrix. The set of visual words is then expressed as an M-dimensional vector V_Img according to whether each visual word in the dictionary occurs in the image: the corresponding position is 1 if the word occurs and 0 otherwise. V_Img is then projected onto P according to V_Result = V_Img × P, giving a feature vector V_Result of length K. Finally, V_Result is quantized by the sign of its elements: an element greater than or equal to 0 is assigned 1, and an element less than 0 is assigned 0, forming a sequence of 0s and 1s. This sequence is the global descriptor of the image's set of visual words.
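A minimal NumPy sketch of this construction follows. The function names and the fixed seed are assumptions made for illustration. The matrix is stored here with shape M × K so that the product V_Img × P is defined for a row vector of length M; with the K-row, M-column layout described in the text, the same result is obtained by multiplying with the transpose.

```python
import numpy as np


def make_projection(num_words: int, length: int, seed: int = 0) -> np.ndarray:
    """Random matrix of -1/+1 entries, shape (M, K). The seed is a hypothetical
    choice so that indexing and querying share the same matrix."""
    rng = np.random.default_rng(seed)
    return rng.choice(np.array([-1, 1], dtype=np.int8), size=(num_words, length))


def global_descriptor(word_ids, projection: np.ndarray) -> np.ndarray:
    """Binary descriptor of a set of visual word indices (0-based)."""
    # V_Img: 1 where the dictionary word occurs in the image, 0 elsewhere
    v_img = np.zeros(projection.shape[0], dtype=np.int32)
    v_img[list(set(word_ids))] = 1
    # V_Result = V_Img x P, a vector of length K
    v_result = v_img @ projection
    # Sign quantization: >= 0 becomes 1, < 0 becomes 0
    return (v_result >= 0).astype(np.uint8)
```

Because every image must be projected onto the same matrix, the matrix (or the seed that generates it) has to be kept alongside the index.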

In the geographic information constraint step, the longitude and latitude of the query image are expanded as follows:

Let the longitude and latitude of the query image be (Lon_1, Lat_1). The point (Lon_1, Lat_1) is first expanded into 4 points: (Lon_1-Dist, Lat_1-Dist), (Lon_1-Dist, Lat_1+Dist), (Lon_1+Dist, Lat_1-Dist), (Lon_1+Dist, Lat_1+Dist). These 4 points are the corners of a square region centered on (Lon_1, Lat_1) with side length 2 × Dist. A nearest-neighbor query is run in the spatial quadtree for each of the 4 expanded points, the candidate image sets returned by the queries are merged, and the several images nearest in Euclidean distance are then selected as candidate images.

The global descriptor verification uses the Hamming distance to check the consistency of global descriptors.

The local feature points of the scene image are extracted with SIFT, SURF or MSER.

The visual word spatial verification uses the random sample consensus (RANSAC) algorithm.

The beneficial effects of the present invention are as follows:

Starting from the geographic information of the images, the present invention uses geographic information and a spatial index structure to filter out a large number of irrelevant images efficiently. The global descriptor verification is based on the Hamming distance, so its computational complexity is also very low. Quantization into visual words helps improve the efficiency of the visual word spatial verification, and the spatial verification of visual words improves the accuracy of image matching.

Brief description of the drawings

Fig. 1 is a flow chart of the scene retrieval method based on geographical position information.

Fig. 2 shows the quadtree spatial index structure.

Fig. 3 is an example of region splitting based on the horizontal and vertical projection histograms of the targets in a region.

Fig. 4(a) shows the matches before spatial verification.

Fig. 4(b) shows the matches after spatial verification.

Fig. 5(a) is a test image.

Fig. 5(b) is the candidate image set obtained with the geographic information constraint.

Fig. 5(c) is the result of global descriptor verification on top of the geographic information constraint.

Fig. 5(d) is the result of global descriptor verification and visual word spatial verification on top of the geographic information constraint.

Embodiment

Embodiments of the invention are described in detail below with reference to the accompanying drawings.

A scene retrieval method based on geographical position information is implemented as follows:

Fig. 1 shows the flow chart of the scene retrieval method based on geographical position information. In the present invention, the image supplied by the user may come from any device that can capture an image together with GPS information, for example a mobile phone with GPS positioning and a camera. In the indexing process, the image information supplied by the user may be text or structured object description information. The invention retrieves quickly using three kinds of image features: longitude and latitude, the global descriptor, and the visual words with their positions. The query image is matched against the index by these three kinds of features to obtain the corresponding similar images, and the information of the query image is then obtained from those similar images.

The embodiment comprises two processes: the indexing process and the query process.

In the indexing process:

First, the local feature points of the image are extracted, and the image is represented as a set of local features.

Second, the local feature descriptors are quantized against the visual vocabulary dictionary to obtain the corresponding visual words.

Third, the set of extracted visual words is projected onto a random projection matrix to obtain the global descriptor of the image.

Finally, so that similar images can be found quickly, a spatial quadtree index structure is built from the longitude and latitude to store the image features and the corresponding information.

The local feature points of an image can be extracted and described by several methods, for example SIFT, SURF, or MSER. This embodiment uses the SIFT descriptor, extracted with the following parameters: the blur parameter sigma is set to 1.6, and the descriptor window is divided into 4 × 4 = 16 sub-windows, each of which accumulates a histogram over 8 gradient directions, so each feature point descriptor is a 128-dimensional vector. A local feature point can therefore be written as [F_i, θ_i, σ_i, Px_i, Py_i], where F_i is the feature point descriptor, θ_i is the dominant orientation, σ_i is the feature point scale, and Px_i and Py_i give the position of the feature point in the image.

To quantize a high-dimensional local feature vector to a visual word, a visual vocabulary dictionary must be built; the visual word corresponding to a local feature descriptor is then obtained by a nearest-neighbor search in the dictionary. In this embodiment the dictionary is obtained by k-means clustering. First, local feature descriptors are extracted from a training image library. Then k initial cluster centers are chosen at random, the Euclidean distance from each feature point to the k cluster centers is computed, and each feature point is assigned to the cluster of its nearest center. Once all feature points have been assigned, the mean of the local feature descriptor vectors in each cluster is computed to form k new centers, and the steps above are repeated until the centers no longer change. In this embodiment the library used to train the visual vocabulary contains 5000 images and k is set to 10000; the k cluster centers finally generated serve as the visual dictionary. So that the nearest of the 10000 centers can be found quickly during quantization, this embodiment stores the cluster centers in a k-d tree and searches for the nearest neighbor by Euclidean distance. A local feature descriptor is represented by the index of its nearest cluster center. The feature vector of every local feature point of an image is thus quantized to the nearest word of the visual dictionary, and the image is represented as the set of visual words {[w_i, θ_i, σ_i, Px_i, Py_i], i = 1, ..., N}, where N is the number of local feature points in the image. Quantization reduces the storage requirement and improves matching efficiency.
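The quantization step can be sketched as follows, assuming the dictionary of cluster centers has already been trained. For brevity the sketch uses a brute-force nearest-center search in NumPy as a stand-in for the k-d tree lookup named in the embodiment; the `VisualWord` record and the `quantize` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

import numpy as np


@dataclass
class VisualWord:
    """One quantized feature: [w_i, theta_i, sigma_i, Px_i, Py_i]."""
    word: int      # index of the nearest cluster center
    theta: float   # dominant orientation
    sigma: float   # scale
    px: float      # x position in the image
    py: float      # y position in the image


def quantize(descriptors: np.ndarray,
             keypoints: Sequence[Tuple[float, float, float, float]],
             centres: np.ndarray) -> List[VisualWord]:
    """Map each descriptor to the index of its nearest center (Euclidean).

    descriptors: (n, d) array; centres: (k, d) array;
    keypoints: n tuples (theta, sigma, px, py).
    Brute force is an assumption made for brevity, not the k-d tree of the text.
    """
    diff = descriptors[:, None, :] - centres[None, :, :]   # (n, k, d)
    ids = np.argmin((diff ** 2).sum(axis=2), axis=1)       # nearest center per row
    return [VisualWord(int(w), *kp) for w, kp in zip(ids, keypoints)]
```

With 10000 centers of 128 dimensions the brute-force distance table grows quickly, which is the practical reason for the tree-based lookup described above.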

On top of the visual word representation of an image, the method introduces an image global descriptor so that the similarity of two sets of visual words can be checked quickly. The global descriptor is obtained by projecting the set of visual words onto a random projection matrix and quantizing the projection result. First, a random projection matrix P composed of 1 and -1, with K rows and M columns, is generated at random, where K is the length of the global descriptor and M is the number of visual words in the dictionary. All sets of visual words are projected onto this same matrix. The set of visual words is then expressed as an M-dimensional vector V_Img according to whether each visual word in the dictionary occurs in the image: the corresponding position is 1 if the word occurs and 0 otherwise. V_Img is projected onto P according to V_Result = V_Img × P, giving a feature vector V_Result of length K. Finally, V_Result is quantized by the sign of its elements: an element greater than or equal to 0 is assigned 1, and an element less than 0 is assigned 0, forming a sequence of 0s and 1s. This sequence is the global descriptor of the image's set of visual words. A global descriptor that is too long increases the amount of computation, while one that is too short reduces query accuracy. To reduce dimensionality while preserving accuracy, the method sets the length of the global descriptor to 400, so the global descriptor of each image is a sequence of 400 bits, 50 bytes in total. The generation of the global descriptor is illustrated by the following example. Suppose the visual vocabulary dictionary has 5 words {w₁, w₂, w₃, w₄, w₅} and the set of visual words of the image is {w₃, w₅}; the vector V_Img corresponding to the image's set of visual words is then [0, 0, 1, 0, 1]. Suppose the randomly generated projection matrix P is as follows:

Then: V_Result = V_Img × P = [2, 0, -2]

Quantizing V_Result gives [1, 1, 0], which is stored as the byte value 6 (1 × 2² + 1 × 2¹ + 0 × 2⁰ = 6).
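The arithmetic of this example can be checked with a short script. The matrix below is hypothetical: it is one 5 × 3 matrix of ±1 entries chosen so that the rows for w₃ and w₅ sum to [2, 0, -2], and it is not taken from the source. The last lines illustrate how a 400-bit descriptor packs into 50 bytes.

```python
import numpy as np

# Hypothetical projection matrix (5 words x 3 bits); only rows 3 and 5 matter here.
P = np.array([
    [ 1, -1,  1],   # w1
    [-1,  1,  1],   # w2
    [ 1,  1, -1],   # w3
    [-1, -1,  1],   # w4
    [ 1, -1, -1],   # w5
])

v_img = np.array([0, 0, 1, 0, 1])           # image contains w3 and w5
v_result = v_img @ P                        # sum of the rows for w3 and w5
bits = (v_result >= 0).astype(np.uint8)     # sign quantization
value = int("".join(str(b) for b in bits.tolist()), 2)  # bits read as a binary number

print(v_result.tolist(), bits.tolist(), value)

# A 400-bit descriptor packs into 400 / 8 = 50 bytes.
packed = np.packbits(np.zeros(400, dtype=np.uint8))
print(packed.nbytes)
```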

Once the three kinds of features of the images have been extracted, the method builds the image index from the geographical position (longitude and latitude) of each image so that queries can be answered in real time. Because the objects being queried cluster locally, the method indexes the longitude and latitude of the images with a spatial quadtree, whose structure is shown in Fig. 2. Using the longitude and latitude thresholds (Lon_Th_X, Lat_Th_X) stored at the intermediate nodes, the quadtree places images taken at different geographical positions under different leaf nodes, which allows fast neighbor lookup at query time. Each pair of thresholds subdivides the geographic region of its intermediate node into four sub-regions. As intermediate layers are added to the index tree, a hierarchical index structure is formed. During indexing, if the longitude of the image to be indexed is less than Lon_Th_X and its latitude is less than Lat_Th_X, the image is placed in the first leaf node of that intermediate node. Each leaf node therefore stores the list of image records for its region. Three items are stored for each indexed image: the image ID, the longitude and latitude, and the global descriptor. The file that stores the image's set of visual words can be located from the image ID; each visual word in it keeps its position in the image for use in spatial verification.
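A compact sketch of such an index is given below. The class and function names are hypothetical. The text fixes only that the child with longitude below Lon_Th_X and latitude below Lat_Th_X is the first one; the ordering of the other three children is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ImageRecord:
    image_id: int
    lon: float
    lat: float
    descriptor: bytes = b""   # packed global descriptor


@dataclass
class Node:
    lon_th: Optional[float] = None            # thresholds exist only on intermediate nodes
    lat_th: Optional[float] = None
    children: Optional[List["Node"]] = None   # None marks a leaf
    records: List[ImageRecord] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return self.children is None


def child_index(node: Node, lon: float, lat: float) -> int:
    """0 = (lon < th, lat < th), the 'first' child; the rest is an assumed order."""
    return (0 if lon < node.lon_th else 1) + (0 if lat < node.lat_th else 2)


def find_leaf(root: Node, lon: float, lat: float) -> Node:
    """Descend through intermediate nodes by comparing against their thresholds."""
    node = root
    while not node.is_leaf():
        node = node.children[child_index(node, lon, lat)]
    return node


def insert(root: Node, rec: ImageRecord) -> None:
    find_leaf(root, rec.lon, rec.lat).records.append(rec)


def split(leaf: Node, lon_th: float, lat_th: float) -> None:
    """Turn a leaf into an intermediate node and redistribute its records."""
    leaf.lon_th, leaf.lat_th = lon_th, lat_th
    leaf.children = [Node() for _ in range(4)]
    old, leaf.records = leaf.records, []
    for rec in old:
        leaf.children[child_index(leaf, rec.lon, rec.lat)].records.append(rec)
```

In this sketch the thresholds passed to `split` are supplied by the caller, so any rule for choosing the split position can be plugged in.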

Unlike a map index, the method does not need to index the whole map, only certain objects in the scene, so the image targets are dense in some parts of the map and sparse in others. The method therefore splits each spatial region at the troughs of the horizontal and vertical projection histograms of the targets in the region. As shown in Fig. 3, the region of interest on the map is first divided into identical cells, and the value of each cell is set to the number of images in it. The number of targets (images) at each horizontal and vertical position is then accumulated; the horizontal and vertical histograms on the left side and along the bottom of Fig. 3 are the results of the vertical and horizontal projections, respectively. The histograms are then smoothed with a moving average whose window size is 3 and whose weights are [0.2, 0.6, 0.2]. A position i is a peak if H(i) > H(i-1) and H(i) > H(i+1). Smoothing is iterated until the histogram has a two-peak shape, and the position of the minimum between the two peaks is taken as the split position; if iterative smoothing cannot produce a two-peak shape, the region is simply divided evenly into four regions. As image data accumulates, a leaf node can be split again. In this embodiment the capacity of each leaf node is set to 200 images, and a leaf node holding more than 200 images must be split again.

In query script:

The invention determines the images similar to the query image under the spatial geographic constraint in three steps (geographic information constraint, global descriptor verification, and visual word spatial verification) and then characterizes the query image using the information of the similar images. When the candidate similar image set is queried by longitude and latitude, the method expands the longitude and latitude of the query image to allow for the partition error introduced during indexing. Let the longitude and latitude of the query image be (Lon_1, Lat_1), shown as target point a in the red box in Fig. 3. The method first expands the point (Lon_1, Lat_1) into 4 points: (Lon_1-Dist, Lat_1-Dist), (Lon_1-Dist, Lat_1+Dist), (Lon_1+Dist, Lat_1-Dist), (Lon_1+Dist, Lat_1+Dist). These 4 points are the corners of a square region (the red box) centered on (Lon_1, Lat_1) with side length 2 × Dist. A nearest-neighbor query is run in the spatial quadtree for each of the 4 expanded points, the candidate image sets returned are merged, and the C images nearest in Euclidean distance are selected as candidate images. For example, in Fig. 3 the candidate images obtained for query image a are all the images in leaf nodes b and d. In this embodiment Dist is set to 0.0005 and C to 60; the best performance is obtained when there is one image of the target object for every 3 to 5 meters. Dist controls how sensitive the method is to the geographic information constraint: the larger its value, the weaker the constraint.
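The expansion and merge step can be sketched as below. The record layout `(image_id, lon, lat)` and the `leaf_lookup` callback, which stands for whatever function returns the records of the quadtree leaf containing a point, are assumptions made for this illustration; the defaults for `dist` and `c` are the values of the embodiment.

```python
import math
from typing import Callable, Iterable, List, Tuple

Record = Tuple[int, float, float]   # (image_id, lon, lat), a hypothetical layout


def expand(lon: float, lat: float, dist: float = 0.0005) -> List[Tuple[float, float]]:
    """Four corners of the square of side 2 * dist centered on the query point."""
    return [(lon - dist, lat - dist), (lon - dist, lat + dist),
            (lon + dist, lat - dist), (lon + dist, lat + dist)]


def candidates(lon: float, lat: float,
               leaf_lookup: Callable[[float, float], Iterable[Record]],
               dist: float = 0.0005, c: int = 60) -> List[Record]:
    """Merge the leaves hit by the four corners and keep the c nearest images."""
    merged = {}
    for qlon, qlat in expand(lon, lat, dist):
        for rec in leaf_lookup(qlon, qlat):
            merged[rec[0]] = rec            # de-duplicate by image ID
    # Euclidean distance in longitude/latitude units, as described in the text
    ranked = sorted(merged.values(),
                    key=lambda r: math.hypot(r[1] - lon, r[2] - lat))
    return ranked[:c]
```

When all four corners fall in the same leaf, the merge reduces to a single leaf lookup; the expansion only matters for query points close to a partition boundary.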

After the candidate image set has been determined from the geographic information recorded when the images were taken, the method further checks image similarity by matching the image global descriptors and removes dissimilar images. Given the way the global descriptor is constructed, the method uses the Hamming distance to measure the similarity of global descriptors. The Hamming distance is the number of positions at which the corresponding characters differ. Tests showed that a Hamming distance threshold of 10 gives good results; images whose global descriptor is at a Hamming distance greater than 10 from that of the query image are filtered out, and the remaining images form the new candidate image set.
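For descriptors stored as packed bytes, the Hamming filter can be sketched with NumPy as follows. The function names and the `(image_id, descriptor)` pair layout are assumptions; the default threshold of 10 is the value reported for the embodiment.

```python
from typing import Iterable, List, Tuple

import numpy as np


def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two packed uint8 descriptors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


def filter_by_descriptor(query: np.ndarray,
                         candidates: Iterable[Tuple[int, np.ndarray]],
                         threshold: int = 10) -> List[int]:
    """Keep the IDs whose descriptor is within the threshold of the query.

    Images at a distance greater than the threshold are dropped.
    """
    return [img_id for img_id, d in candidates if hamming(query, d) <= threshold]
```

XOR followed by a bit count touches only 50 bytes per image for a 400-bit descriptor, which is why this layer is cheap compared with spatial verification.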

Finally, the method performs visual word spatial verification on the images in the candidate set. The set of visual words of each candidate image is first obtained from its image ID; each word carries its position in the image. The method confirms whether the candidate image and the query image depict the same target by checking the consistency of the position transformation between matched visual words. Many spatial verification methods can be used for this step. This embodiment uses the random sample consensus (RANSAC) algorithm. The input to RANSAC is a set of observed data and a parameterized model that can explain or be fitted to the observations. RANSAC reaches its goal by repeatedly selecting a random subset of the data. The selected subset is assumed to consist of inliers and is verified as follows: 1) a model is fitted to the hypothesized inliers, that is, all unknown parameters are computed from them; 2) all other data are tested against the model obtained in 1), and any point that fits the estimated model is also considered an inlier; 3) if enough points are classified as hypothesized inliers, the estimated model is considered reasonable; 4) the model is then re-estimated from all hypothesized inliers, because it was originally estimated only from the initial ones; 5) finally, the model is evaluated by estimating the error of the inliers with respect to the model. This process is repeated a fixed number of times; each model produced is either rejected because it has too few inliers or selected because it is better than the existing model. Finally, the images are filtered by the number of feature points that pass spatial verification: images for which this number exceeds a threshold (set to 6 in this embodiment) are retained and form the final candidate image set. The spatial verification results are shown in Fig. 4, where Fig. 4(a) shows the matches before spatial verification and Fig. 4(b) the matches after it. The figures show that spatial verification filters out the matched feature points that do not satisfy the spatial relationship, which improves retrieval accuracy.
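A minimal RANSAC sketch for this verification step is given below. The source does not state which geometric model is fitted, so the 2D affine transform used here is an assumption, as are the iteration count, the pixel tolerance `tol`, the seed, and the function names. Only the inlier threshold of 6 comes from the embodiment.

```python
import numpy as np


def fit_affine(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares 2D affine map: dst ~ [src, 1] @ A, with A of shape (3, 2)."""
    X = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return A


def ransac_inliers(src: np.ndarray, dst: np.ndarray,
                   iters: int = 200, tol: float = 3.0, seed: int = 0) -> np.ndarray:
    """Boolean mask of matches consistent with the best affine model found.

    src, dst: (n, 2) float arrays of matched visual word positions.
    """
    rng = np.random.default_rng(seed)
    n = len(src)
    best = np.zeros(n, dtype=bool)
    if n < 3:
        return best                      # an affine model needs 3 matches
    X = np.hstack([src, np.ones((n, 1))])
    for _ in range(iters):
        sample = rng.choice(n, size=3, replace=False)   # hypothesized inliers
        A = fit_affine(src[sample], dst[sample])        # 1) fit model to the sample
        err = np.linalg.norm(X @ A - dst, axis=1)       # 2) test all other data
        inliers = err < tol
        if inliers.sum() > best.sum():                  # keep the better model
            best = inliers
    if best.sum() >= 3:
        A = fit_affine(src[best], dst[best])            # 4) re-estimate on all inliers
        best = np.linalg.norm(X @ A - dst, axis=1) < tol
    return best


def passes_spatial_check(src: np.ndarray, dst: np.ndarray, min_inliers: int = 6) -> bool:
    """Retain the image only if more than min_inliers matches survive."""
    return int(ransac_inliers(src, dst).sum()) > min_inliers
```

A similarity transform or a homography could be substituted for the affine model by replacing `fit_affine` and the minimum sample size.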

After the geographic information constraint, global descriptor verification, and visual word spatial verification, the final candidate image set is taken as the set of images similar to the query image. To obtain the object information of the query image, the method uses voting. The object information corresponding to each image (for example place name, address, company, or institution) is looked up in a database by the image IDs of the similar image set, the number of occurrences of each object is counted, and the object information that occurs most often is taken as the object information of the query image. This information is returned to the user as the information of the query image, which realizes the scene recognition function. The image index numbers, image names, and image descriptions in the database are annotated manually in advance; they could also be supplied by users uploading, annotating, and rating images. In this embodiment the image information in the database is filled in by manual annotation.
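The voting step amounts to a frequency count over the labels of the surviving images. In the sketch below a plain dictionary stands in for the annotation database, and the function name and the handling of an empty result are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def vote(image_ids: Iterable[int], labels: Dict[int, str]) -> Optional[str]:
    """Return the object label that occurs most often among the similar images.

    labels maps image ID to its annotated object information (a stand-in for
    the database lookup). Returns None when no candidate survived verification.
    """
    counts = Counter(labels[i] for i in image_ids if i in labels)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

For example, if three similar images are labeled "Library", "Library" and "Gym", the query image is reported as "Library".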

In this embodiment the system is divided into a client and a server. The index images are annotated with descriptions in advance and placed on the server; the server extracts feature points and visual words from the database images by the method described above, builds the global descriptors, and then builds the spatial quadtree index structure from the longitude and latitude information. This process is completed offline. When the client uploads a query image containing GPS geographic information, the server extracts feature points from the query image online, quantizes them into a set of visual words, and projects the set to generate the global descriptor. The candidate image set is then obtained through the three steps of geographic information constraint, global descriptor verification, and visual word spatial verification, and the description of the best-matching object is returned to the client to complete the interaction.

Fig. 5 shows the query results for one sample query image. The query image is framed with a blue border, and wrongly identified images are circled with red dashed lines. Fig. 5(a) shows the sample query image, Fig. 5(b) shows the query result with only the geographic information constraint, Fig. 5(c) shows the candidate similar images obtained by combining the geographic information constraint with global descriptor verification, and Fig. 5(d) shows the candidate similar image set after all three steps. The images show that with the geographic information constraint alone, many mismatched images are retrieved. After global descriptor verification the number of mismatched images decreases, and after visual word spatial verification the retrieval result improves greatly.

Building on image geographic information, the present invention takes full advantage of the high efficiency, speed, and low storage cost of geographic information retrieval, and adds global descriptor verification on top of it to improve query efficiency. The hierarchical verification greatly improves query speed and accuracy. In particular, geographical position information filters out 80% to 90% of the irrelevant images, which plays a key role in improving speed. In addition, the method can be applied not only to large-scale navigation, for example in a city, but also to smaller areas such as a campus, and is convenient to use. In tests, the accuracy of the invention exceeded 90%.

Claims

1. A scene retrieval method based on geographical location information, characterized by comprising the following steps:

Step 2: in the query process, determine a set of similar images by hierarchical verification, and obtain the information of the scene image to be queried from the information of the similar images by voting;

2. The scene retrieval method based on geographical location information as claimed in claim 1, characterized in that the spatial quadtree index structure built in step 1 is partitioned at the valleys of the horizontal and vertical projection histograms of the objects in a region, specifically as follows:

First, the region of interest on the map is divided into identical cells;

Then the value of each cell is set to the number of images in that cell; from the cell values, the number of images at each horizontal and vertical position is counted, yielding a horizontal histogram and a vertical histogram of image counts;

Finally, each histogram is smoothed with a moving average of window size 3 and weights [0.2, 0.6, 0.2]; a position i is a peak if H(i) > H(i-1) and H(i) > H(i+1); smoothing is iterated until the histogram is bimodal, and the position of the minimum between the two peaks is taken as the split position; if iterative smoothing cannot produce a bimodal shape, the region is simply divided uniformly into four regions.
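
The split-position search of claim 2 can be sketched for a single histogram as follows. This is a minimal illustration, not the patented implementation: the function names, the border handling in `smooth` (repeating the edge value), and the iteration cap `max_iter` are assumptions that the claim does not specify.

```python
def smooth(hist):
    """One pass of the 3-tap weighted moving average [0.2, 0.6, 0.2].

    Assumption: at the borders the edge value is repeated.
    """
    n = len(hist)
    out = []
    for i in range(n):
        left = hist[i - 1] if i > 0 else hist[i]
        right = hist[i + 1] if i < n - 1 else hist[i]
        out.append(0.2 * left + 0.6 * hist[i] + 0.2 * right)
    return out


def find_peaks(hist):
    """Interior indices i with H(i) > H(i-1) and H(i) > H(i+1)."""
    return [i for i in range(1, len(hist) - 1)
            if hist[i] > hist[i - 1] and hist[i] > hist[i + 1]]


def split_position(hist, max_iter=50):
    """Return the index of the valley between two peaks, or None.

    None means no bimodal shape was reached, in which case the caller
    would fall back to dividing the region uniformly.
    """
    h = list(hist)
    for _ in range(max_iter + 1):
        peaks = find_peaks(h)
        if len(peaks) == 2:
            first, second = peaks
            # Minimum between the two peaks is the split position.
            return min(range(first, second + 1), key=lambda i: h[i])
        if len(peaks) < 2:
            # Assumption: further smoothing is not attempted here.
            return None
        h = smooth(h)  # more than two peaks: smooth and try again
    return None
```

Running the same function on the horizontal and the vertical histogram gives the two cut lines that divide a region into four child regions.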

3. The scene retrieval method based on geographical location information as claimed in claim 1, characterized in that the hierarchical verification method of step 2 uses three levels of feature verification, performing in order three verification processes: the geographic information constraint, global descriptor verification, and visual-word spatial verification.

4. The scene retrieval method based on geographical location information as claimed in claim 1, characterized in that the global descriptor is obtained by projecting the visual word set onto a randomly generated projection matrix composed of -1 and 1 and binarizing the projection result, giving a global descriptor represented as a 0/1 sequence, specifically as follows:

A random projection matrix P with K rows and M columns, whose entries are 1 and -1, is generated at random, where K is the length of the global descriptor and M is the number of visual words in the dictionary; all visual word sets are projected onto this same matrix; a visual word set is then represented as an M-dimensional vector V_Img according to whether each visual word in the dictionary appears in the image: the corresponding position is 1 if the word appears and 0 otherwise; V_Img is then projected onto P according to V_Result = P · V_Img, yielding a feature vector V_Result of length K; finally, V_Result is quantized by the sign of its elements: an element greater than or equal to 0 is assigned 1, and an element less than 0 is assigned 0, forming a 0/1 sequence; this 0/1 sequence is the global descriptor of the image's visual word set.
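
The descriptor construction of claim 4 can be sketched as below. The function names and the fixed `seed` are illustrative assumptions; visual words are assumed to be integer indices in the range 0 to M-1.

```python
import random


def make_projection_matrix(k, m, seed=0):
    """K x M matrix with entries drawn at random from {-1, 1}.

    Assumption: a fixed seed keeps the matrix identical for the
    database images and the query image.
    """
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(m)] for _ in range(k)]


def global_descriptor(word_ids, P):
    """Binary global descriptor of a visual word set.

    word_ids: iterable of visual word indices found in the image.
    P: projection matrix shared by all images.
    """
    m = len(P[0])
    present = set(word_ids)
    # V_Img: 1 where the dictionary word occurs in the image, else 0.
    v_img = [1 if j in present else 0 for j in range(m)]
    # V_Result = P . V_Img, a vector of length K.
    v_result = [sum(p * v for p, v in zip(row, v_img)) for row in P]
    # Sign quantization: >= 0 maps to 1, < 0 maps to 0.
    return [1 if x >= 0 else 0 for x in v_result]
```

For example, with `P = [[1, -1, 1], [-1, -1, 1]]` and words `[0, 1]`, V_Img is `[1, 1, 0]`, V_Result is `[0, -2]`, and the descriptor is `[1, 0]`.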

5. The scene retrieval method based on geographical location information as claimed in claim 3, characterized in that the longitude and latitude of the query image are expanded in the geographic information constraint verification process, specifically as follows:

Let the longitude and latitude of the query image be (Lon_1, Lat_1). The point (Lon_1, Lat_1) is first expanded into 4 points: (Lon_1-Dist, Lat_1-Dist), (Lon_1-Dist, Lat_1+Dist), (Lon_1+Dist, Lat_1-Dist), (Lon_1+Dist, Lat_1+Dist); these 4 points are the corners of a square region centered at (Lon_1, Lat_1) with side length 2·Dist; a nearest-neighbor query is performed in the spatial quadtree for each of the 4 expanded points, the candidate image sets returned by these queries are merged, and the several images nearest by Euclidean distance are then selected as candidate images.
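
The point expansion and candidate merging of claim 5 might look like the sketch below. The quadtree lookup itself is not shown: `candidate_sets` stands in for the four result lists it would return, and the `(image_id, lon, lat)` tuple layout, the function names, and the `top_n` parameter are assumptions made for illustration.

```python
import math


def expand_query_point(lon, lat, dist):
    """Four corners of the square centered at (lon, lat), side 2 * dist."""
    return [(lon - dist, lat - dist), (lon - dist, lat + dist),
            (lon + dist, lat - dist), (lon + dist, lat + dist)]


def nearest_candidates(query, candidate_sets, top_n):
    """Merge the per-corner candidate lists and keep the closest images.

    query: (lon, lat) of the query image.
    candidate_sets: one list of (image_id, lon, lat) per expanded point.
    top_n: number of candidates to keep (hypothetical parameter).
    """
    merged = {}
    for candidates in candidate_sets:
        for image_id, lon, lat in candidates:
            merged[image_id] = (lon, lat)  # duplicates collapse here
    q_lon, q_lat = query
    ranked = sorted(
        merged,
        key=lambda i: math.hypot(merged[i][0] - q_lon, merged[i][1] - q_lat),
    )
    return ranked[:top_n]
```

Euclidean distance on raw longitude and latitude follows the wording of the claim; over large areas a geodesic distance would differ, which the source does not address.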

6. The scene retrieval method based on geographical location information as claimed in claim 3, characterized in that the global descriptor verification uses the Hamming distance to verify the consistency of the global descriptors.
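
The Hamming-distance check of claim 6 can be illustrated as follows. The threshold `max_dist` and the `(image_id, descriptor)` pair layout are assumptions; the source does not state how the acceptance threshold is chosen.

```python
def hamming_distance(a, b):
    """Number of positions at which two 0/1 sequences differ."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def filter_by_descriptor(query_desc, candidates, max_dist):
    """Keep candidates whose descriptor is within max_dist of the query.

    candidates: list of (image_id, descriptor) pairs.
    max_dist: hypothetical acceptance threshold.
    """
    return [image_id for image_id, desc in candidates
            if hamming_distance(query_desc, desc) <= max_dist]
```

Because the descriptors are short binary sequences, this check is cheap compared with the visual-word spatial verification that follows it in the hierarchy.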

7. The scene retrieval method based on geographical location information as claimed in claim 1, characterized in that the local feature points of the scene image are extracted using SIFT, SURF, or MSER.

8. The scene retrieval method based on geographical location information as claimed in claim 3, characterized in that the visual-word spatial verification uses the random sample consensus (RANSAC) algorithm.