CN106203449A

CN106203449A - Approximate space clustering system for a mobile cloud environment

Info

Publication number: CN106203449A
Application number: CN201610539136.9A
Authority: CN
Inventors: 季长清; 陶帅; 王宝凤; 汪祖民; 许玉杰; 宋佳齐
Original assignee: Dalian University
Current assignee: Dalian University
Priority date: 2016-07-08
Filing date: 2016-07-08
Publication date: 2016-12-07

Abstract

An approximate space clustering system for a mobile cloud environment belongs to the field of image processing and addresses the problem of image clustering. The system consists mainly of a cloud center service system. The cloud center service system establishes and executes a SIFT feature extraction and matching algorithm, which extracts the relevant feature data of an image and matches it. It also establishes and executes an over-sampling correction algorithm to perform image correction, and feeds the clustered approximate-image information back to the client. Effect: on the user side, the image clustering system obtains the relevant information of a picture and uploads it to the cloud server for storage; the cloud server then processes it, obtains the optimal image clustering scheme, and feeds it back to the user.

Description

Approximate space clustering system for a mobile cloud environment

Technical field

The present invention belongs to the field of mobile communication and is an approximate space clustering system for a mobile cloud environment. The system involves large-scale data analysis, massive data processing in a cloud computing environment, and intelligent data processing and application development.

Background technology

With the rapid development of Internet technology and digitization, and with the spread of digital electronic products, people can obtain more and more digital image data. As an intuitive, content-rich form of multimedia information, images are increasingly widely used across industries, for example in digital multimedia libraries, medical image management, satellite remote sensing and geographic information systems, identity authentication systems, e-commerce, and the supervision of trademarks and copyright. However, the explosive growth of image data has already far exceeded what people can handle. For massive image data, how to manage and retrieve it quickly and effectively, and then extract potentially valuable information from it, has become a problem of concern.

Retrieval time is one of the most critical issues in image retrieval. In the traditional method, the user supplies a query sample image, and the system traverses all images in the database according to a specific similarity measure and returns the several most similar images as the query result. Because image resources in the real world are extremely abundant and the stored volume of image data is huge, such sequential retrieval requires a very large amount of computation, so retrieval efficiency is very low. If the images in the image library are clustered before retrieval and an image index is built for each class, retrieval can then be carried out within one specific class. This greatly narrows the matching range and makes accurate, fast image retrieval possible.

The algorithms currently used for image clustering are mainly the K-Means clustering algorithm, the scalable k-means++ clustering algorithm, and the like. However, the sequential nature of these algorithms limits their scalability: they need a large number of iterations when selecting points, so their efficiency is relatively low when processing massive data in a parallel processing environment. It is therefore necessary to develop a better image clustering algorithm.

In recent years, mobile cloud computing has gradually developed into an important branch of cloud computing. Any intelligent terminal, such as a smartphone or tablet computer, can obtain services on demand from a wireless network environment without being restricted by limited hardware resources, computing power, bandwidth, and so on. Clearly, efficiently analyzing and processing massive spatio-temporal data in cloud computing, combined with image clustering applications, is an emerging practical technology. In a cloud computing environment, effective spatial database indexing is crucial to improving spatial database query efficiency and the application user experience. With this as the starting point, we designed and implemented the present invention.

Summary of the invention

In view of the defects and deficiencies of the background art described above, the present invention uses several image processing algorithms, including the SIFT feature description method and the over-sampling correction algorithm, to design a new approximate image clustering software that better solves the image clustering problem.

To achieve these goals, the technical solution adopted in the present invention is:

An approximate space clustering system for a mobile cloud environment consists mainly of a cloud center service system. The cloud center service system establishes and executes a SIFT feature extraction and matching algorithm, extracting the relevant feature data of an image and matching it. The cloud center service system also establishes and executes an over-sampling correction algorithm to perform image correction, and feeds the clustered approximate-image information back to the client.

Further, the client sends the images to be clustered to the cloud center service system as required, and receives the request of the cloud center service system.

Further, the method of establishing and executing the SIFT feature extraction and matching algorithm includes:

S1. Scale-space extremum detection;

S2. Keypoint localization;

S3. Keypoint orientation assignment;

S4. Feature descriptor generation;

S5. Feature matching.

Further, the method for step S1. detection yardstick spatial extrema is: each sampled point will be with the 8 of present image Individual consecutive points and 9 × 2 points corresponding to neighbouring scalogram picture compare, if this sampled point is both greater than or all Less than other 26 consecutive points, this point is then key point.

Further, the method for step S2. key point location is: key point matching three-dimensional quadratic function is to position key point Yardstick and position.

Further, the method for step S3. key point direction distribution is: adopt in the neighborhood window centered by key point Sample, and with the gradient direction of gradient orientation histogram statistics neighborhood territory pixel, it is whole that histogrammic peak value represents at this key point The principal direction of neighborhood gradient, this principal direction is as the direction of this key point.

Further, when the gradient orientation histogram contains another peak with at least 80% of the energy of the main peak, that direction is taken as an auxiliary direction of the keypoint, so that the principal and auxiliary directions are combined.
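The orientation assignment and the 80% rule can be sketched as follows. The sketch assumes 36 bins of 10 degrees each, as given in the detailed description, and assumes that gradient magnitudes are used as histogram weights; the function name, parameter names, and the use of bin centers as direction values are illustrative choices only.

```python
import numpy as np

def keypoint_orientations(magnitudes, angles_deg, num_bins=36, peak_ratio=0.8):
    """Return (principal direction, list of auxiliary directions) in degrees."""
    angles = np.asarray(angles_deg, dtype=float) % 360.0
    hist, _ = np.histogram(angles, bins=num_bins, range=(0.0, 360.0),
                           weights=np.asarray(magnitudes, dtype=float))
    bin_width = 360.0 / num_bins
    main_bin = int(np.argmax(hist))          # histogram peak = principal direction
    auxiliary = []
    for i in range(num_bins):
        if i == main_bin:
            continue
        # Circular neighbours: hist[i - 1] wraps around for i == 0.
        is_local_peak = hist[i] > hist[i - 1] and hist[i] > hist[(i + 1) % num_bins]
        if is_local_peak and hist[i] >= peak_ratio * hist[main_bin]:
            auxiliary.append((i + 0.5) * bin_width)
    return (main_bin + 0.5) * bin_width, auxiliary
```

For example, if the bin around 5 degrees accumulates a weight of 3.0 and the bin around 95 degrees accumulates 2.5, the second bin exceeds 80% of the main peak and is reported as an auxiliary direction.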

Further, the method of step S4, feature descriptor generation, is:

S4.1. After the principal direction of the keypoint is obtained, take a 16 × 16 window centered on the keypoint and rotate the image coordinate axes to the principal direction of the keypoint;

S4.2. Calculate the gradient direction and magnitude of each pixel in the 16 × 16 pixel window centered on the keypoint;

S4.3. Apply Gaussian weighting;

S4.4. Divide the window into 4 × 4 small windows, calculate a gradient orientation histogram with 8 directions in each small window, and accumulate the value of each gradient direction to form one seed point;

wherein a keypoint consists of 4 × 4 = 16 seed points, each seed point carries vector information for 8 directions, and one feature point thus forms a 128-dimensional SIFT feature vector.
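Steps S4.2 to S4.4 can be sketched as below, assuming the 16 × 16 gradient magnitudes and directions have already been computed relative to the principal direction (step S4.1). The Gaussian width `sigma=8.0` is an assumption, since the patent text does not give a value, and the function name is hypothetical.

```python
import numpy as np

def build_descriptor(magnitudes, angles_deg, sigma=8.0):
    """Build a 4 x 4 x 8 = 128-dimensional vector from a 16 x 16 gradient window."""
    mags = np.asarray(magnitudes, dtype=float)
    angs = np.asarray(angles_deg, dtype=float) % 360.0
    if mags.shape != (16, 16) or angs.shape != (16, 16):
        raise ValueError("expected 16 x 16 magnitude and angle arrays")

    # S4.3: Gaussian weighting centred on the keypoint (sigma is an assumed value).
    offsets = np.arange(16) - 7.5
    yy, xx = np.meshgrid(offsets, offsets, indexing="ij")
    mags = mags * np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))

    # S4.4: 4 x 4 small windows, each with an 8-direction histogram (a seed point).
    seeds = []
    for row in range(0, 16, 4):
        for col in range(0, 16, 4):
            hist, _ = np.histogram(angs[row:row + 4, col:col + 4], bins=8,
                                   range=(0.0, 360.0),
                                   weights=mags[row:row + 4, col:col + 4])
            seeds.append(hist)
    return np.concatenate(seeds)   # 16 seed points x 8 directions = 128 values
```

The sketch omits descriptor normalization and interpolation between bins, neither of which is described in the text above.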

Further, the method for step S5. characteristic matching is: take certain feature in piece image in two images Point, find out its with another piece image in European closest the first two characteristic point, in the two characteristic point, if nearest away from The ratio of the distance close to homogeneous is less than certain threshold value, then accept this pair match point, otherwise abandon.

Further, the iterative process of the over-sampling correction algorithm is as follows:

Step 1. Calculate the global clustering error ψ;

Step 2. In each Map processing procedure, the task uses the scalable k-means++ initialization algorithm to select points; the probability that each point x ∈ X is selected is p_x = l · d²(x, U₀) / φ_X(c₁);

where U₀ is the initial set of all center points, c₁ is the first center point, selected uniformly at random, and d²(x, U₀) is the squared distance between each point x ∈ X_i and U₀;

Step 3. A Reduce task merges all center points from the Map tasks and outputs the center point set U₁;

Step 4. During iteration, OnR selects points using the global error corresponding to the center points obtained in iteration j-2; OnR uses another oversampling factor o to further enlarge the probability that each point is chosen as a center point; and OnR uses a correction process in the Reduce task to remove the surplus center points.

Beneficial effect: on the user side, the image clustering system obtains the relevant information of a picture and uploads it to the cloud server for storage; the cloud server then processes it, obtains the optimal image clustering scheme, and feeds it back to the user.

Brief description of the drawings

Fig. 1 is a functional block diagram of the present invention;

Fig. 2 is a schematic diagram of the large-scale approximate image clustering system framework of the present invention;

Fig. 3 shows the feature extraction process of the present invention;

Fig. 4 shows the feature vector generation process of the present invention;

Fig. 5 shows the code of the over-sampling correction algorithm of the present invention;

Fig. 6 shows the flow of the over-sampling correction algorithm of the present invention;

Fig. 7 is a structural block diagram of the system of the present invention.

Detailed description of the invention

Embodiment 1: With reference to Fig. 1, an approximate space clustering system for a mobile cloud environment consists of a cloud center service system and a smartphone mobile client software system. The cloud service system is responsible for establishing the SIFT feature extraction algorithm and executing the over-sampling correction algorithm, and feeds the clustering result back to the user side. The mobile terminal sends the images that need to be clustered to the cloud center service system as required, and receives requests from the cloud.

With reference to Fig. 2, as one embodiment, the execution flow of this approximate image clustering system is as follows: after a clustering user sends an image clustering request, the cloud system obtains the best clustering scheme according to the SIFT feature extraction algorithm and the over-sampling correction algorithm, returns the final result to the user, and carries out service confirmation through the mobile interaction platform.

This large-scale image clustering system uses a data processing method based on cloud computing. When a user sends an image clustering request, the data center can quickly extract the relevant feature data of the image through the established SIFT feature extraction algorithm. The processing steps of the SIFT feature extraction algorithm are as follows.

With reference to Figs. 3 and 4, the detailed steps are as follows. To detect the local maxima and minima of the scale space, each sample point is compared with its 8 neighbors in the current image and with the 9 × 2 corresponding points in the adjacent scale images above and below, 26 points in total. If the sample point is greater than, or less than, all 26 of these neighbors, it is a keypoint. A three-dimensional quadratic function is then fitted to each keypoint to locate its scale and position accurately.

Samples are taken in a neighborhood window centered on the keypoint, and a gradient orientation histogram accumulates the gradient directions of the neighborhood pixels. The histogram covers 0 to 360 degrees with one bin per 10 degrees, 36 bins in total. The peak of the histogram represents the principal direction of the neighborhood gradients at the keypoint, and this direction is taken as the direction of the keypoint. In the gradient orientation histogram, one keypoint may be assigned several directions: one principal direction and one or more auxiliary directions. When another peak with 80% of the energy of the main peak exists, its direction is regarded as an auxiliary direction of the keypoint. Combining principal and auxiliary directions strengthens the robustness of matching.

After the principal direction of the keypoint is obtained, a 16 × 16 window centered on the keypoint is taken and the image coordinate axes are rotated to the principal direction of the keypoint, to ensure rotation invariance. The gradient direction and magnitude of each pixel in this 16 × 16 pixel window are then calculated and Gaussian weighting is applied. Finally the window is divided into 4 × 4 small windows; in each small window a gradient orientation histogram with 8 directions is calculated, and the accumulated value of each gradient direction forms one seed point. A keypoint consists of 4 × 4 = 16 seed points, each seed point carries vector information for 8 directions, and one feature point thus forms a 128-dimensional SIFT feature vector, as shown in Figs. 2-4. Combining neighborhood direction information in this way not only improves the noise resistance of the SIFT algorithm, but also gives reasonably good fault tolerance for feature matching in the presence of localization errors.

Once the SIFT feature vectors of two images have been generated, the next stage is SIFT feature matching. SIFT feature vector matching uses the Euclidean distance between feature vectors as the similarity measure for feature points in the two images. Take a feature point in one of the two images and find the two feature points in the other image with the smallest Euclidean distance to it. If the ratio of the nearest distance to the second-nearest distance is less than a certain threshold, the pair of matching points is accepted; otherwise it is discarded. Lowering this threshold reduces the number of SIFT matching points, but the matches become more accurate and stable.

With reference to Fig. 5, as another embodiment, the over-sampling correction algorithm is defined as follows: in each iteration, over-sampling correction (Oversampling and Refining, abbreviated OnR) uses one MapReduce job to select l center points and calculate the global error. The OnR method is inspired by the scalable k-means++ method. In addition to the oversampling factor l, it uses another oversampling factor o to further increase the number of center points selected in the Map stage.

With reference to Fig. 6, Job1 (from P₁ to P₄) is responsible for calculating ψ, the global clustering error. In Job2 (from P₅ to P₈), each Map task (Map is a processing procedure, which we name Map) uses the scalable k-means++ initialization algorithm to select points. The probability that each point x ∈ X is selected is p_x = l · d²(x, U₀) / φ_X(c₁), where U₀ is the initial set of all center points, c₁ is the first center point, selected uniformly at random, and d²(x, U₀) is the squared distance between each point x ∈ X_i and U₀ (stage P₅). A Reduce task then merges all center points from the Map tasks (stage P₇) and outputs the center point set U₁ (stage P₈).

The OnR iteration runs from P₉ to P₁₁. The process of iteration j is as follows, where j is the iteration number, 1 ≤ j ≤ n, and n is the maximum number of iterations, determined by the number of data points. Because each Map task cannot obtain the global error corresponding to the center points produced by iteration j-1, that is, it cannot obtain φ_X(U_{j-1}), our method OnR selects points using the global error corresponding to the center points produced by iteration j-2, that is, φ_X(U_{j-2}). However, the number of center points obtained after iteration j-2 is smaller than the number obtained after iteration j-1, so the global error φ_X(U_{j-2}) is larger than the global error of iteration j-1. As a result, the probability that each point x is chosen as a center point, p_x = l · d²(x, U_{j-1}) / φ_X(U_{j-2}), becomes smaller, and the number of center points selected in iteration j decreases.

To solve this problem, OnR uses another oversampling factor o to further enlarge the probability that each point is chosen as a center point, so that x is chosen with probability p_x = o · l · d²(x, U_{j-1}) / φ_X(U_{j-2}). In this case, the expected number of center points selected by the Map tasks in each iteration can exceed l (l is the number of center points that the scalable k-means++ initialization algorithm expects to select per iteration). OnR therefore uses a correction process in the Reduce task to remove the surplus center points.
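The Map-side selection with the stale global error and the extra factor o can be sketched in plain NumPy, without any MapReduce framework. This is an illustration under stated assumptions: the function names, the clipping of probabilities to 1, and the tuple returned to the Reduce side are hypothetical choices, not details taken from the patent.

```python
import numpy as np

def squared_distance_to_centers(points, centers):
    """d^2(x, U): squared distance from each point to its nearest center."""
    diffs = points[:, None, :] - centers[None, :, :]
    return np.min(np.sum(diffs ** 2, axis=2), axis=1)

def map_select(points, centers_prev, phi_stale, l, o, rng):
    """One Map task of iteration j (sketch).

    centers_prev plays the role of U_{j-1}; phi_stale plays the role of
    phi_X(U_{j-2}), the only global error available to the Map task.
    """
    d2 = squared_distance_to_centers(points, centers_prev)
    # p_x = o * l * d^2(x, U_{j-1}) / phi_X(U_{j-2}), clipped to 1 (an assumption).
    p = np.minimum(1.0, o * l * d2 / phi_stale)
    r = rng.random(len(points))            # one random draw per point
    chosen = np.flatnonzero(r < p)         # oversampled candidate centers
    local_error = float(d2.sum())          # this task's share of phi_X(U_{j-1})
    # Sent to Reduce: candidates, their draws, l * d^2 values, and the local error.
    return chosen, r[chosen], l * d2[chosen], local_error
```

Passing the random draws along lets the Reduce side re-test each candidate against the true probability without sampling again.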

The detailed iterative process of the OnR algorithm is as follows. In stage P₉, each Map task uses the probability p_x = o · l · d²(x, U_{j-1}) / φ_X(U_{j-2}) of each point x to select new center points and calculates the local error corresponding to the center point set U_{j-1}. The selected center points, all local error values, the random probability value corresponding to each selected point x (a self-set parameter that is compared with p'_x = l · d²(x, U_{j-1}) / φ_X(U_{j-1}) to select center points), and l · d²(x, U_{j-1}) are transferred to Reduce (stage P₁₀). The correction operation is performed in a single Reduce task (stage P₁₁): it merges the center points selected in the Map stage and sums all local errors to obtain the global error φ_X(U_{j-1}) corresponding to the center points after iteration j-1. The true probability corresponding to each selected center point x, namely p'_x = l · d²(x, U_{j-1}) / φ_X(U_{j-1}), can then also be obtained. If [condition missing in the source text], point x is a correct center point; otherwise x is removed from C. The output of this iteration is U_j and φ_X(U_{j-1}), which become the input data of the next iteration.
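The Reduce-side correction can be sketched as below. The acceptance condition is not legible in the source text, so the sketch assumes that a candidate is kept when its random draw is below the true probability p'_x; that inequality, the function name, and the tuple layout are all assumptions made for illustration.

```python
def reduce_refine(candidates, local_errors):
    """Correction step in a single Reduce task (sketch).

    candidates: iterable of (point_id, random_draw, l_times_d2) from all Map tasks.
    local_errors: local error values reported by the Map tasks.
    Returns the kept point ids and the global error phi_X(U_{j-1}).
    """
    phi = sum(local_errors)                 # global error after iteration j-1
    kept = []
    for point_id, draw, l_times_d2 in candidates:
        true_p = l_times_d2 / phi           # p'_x = l * d^2(x, U_{j-1}) / phi_X(U_{j-1})
        if draw < true_p:                   # assumed acceptance rule
            kept.append(point_id)
    return kept, phi
```

Under this assumption, a candidate selected only because of the inflated Map-side probability is dropped, which is how the surplus center points would be removed.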

From the above analysis, the data transmitted in each OnR iteration consists of 4 parts: the set of center points selected in this iteration, all local errors, the random numbers corresponding to the selected center points, and l · d²(x, U_{j-1}). Compared with the transmission volume of the MRSKMI algorithm, the main advantage of OnR is that it greatly reduces I/O overhead without much increase in network overhead.

Embodiment 2: An approximate space clustering system for a mobile cloud environment consists of a cloud center service system and a smartphone mobile client software system. The cloud service system is responsible for establishing the SIFT feature extraction algorithm and executing the over-sampling correction algorithm, and feeds the clustering result back to the user side. The mobile terminal sends the images that need to be clustered to the cloud center service system as required, and receives requests from the cloud.

As a supplement to the technical solution, the execution method of this image clustering system is: after a clustering user sends an image clustering request, the cloud system obtains the best clustering scheme according to the SIFT feature extraction algorithm and the over-sampling correction algorithm, returns the final result to the user, and carries out service confirmation through the mobile interaction platform.

As a further supplement to the technical solution, the processing steps of the SIFT feature extraction algorithm are: 1. scale-space extremum detection; 2. keypoint localization; 3. keypoint orientation assignment; 4. feature descriptor generation; 5. feature matching.

The detailed steps are as follows. To detect the local maxima and minima of the scale space, each sample point is compared with its 8 neighbors in the current image and with the 9 × 2 corresponding points in the adjacent scale images above and below, 26 points in total. If the sample point is greater than, or less than, all 26 of these neighbors, it is a keypoint. A three-dimensional quadratic function is then fitted to each keypoint to locate its scale and position accurately.

Samples are taken in a neighborhood window centered on the keypoint, and a gradient orientation histogram accumulates the gradient directions of the neighborhood pixels. The histogram covers 0 to 360 degrees with one bin per 10 degrees, 36 bins in total. The peak of the histogram represents the principal direction of the neighborhood gradients at the keypoint, and this direction is taken as the direction of the keypoint. In the gradient orientation histogram, one keypoint may be assigned several directions: one principal direction and one or more auxiliary directions. When another peak with 80% of the energy of the main peak exists, its direction is regarded as an auxiliary direction of the keypoint. Combining principal and auxiliary directions strengthens the robustness of matching.

After the principal direction of the keypoint is obtained, a 16 × 16 window centered on the keypoint is taken and the image coordinate axes are rotated to the principal direction of the keypoint, to ensure rotation invariance. The gradient direction and magnitude of each pixel in this 16 × 16 pixel window are then calculated and Gaussian weighting is applied. Finally the window is divided into 4 × 4 small windows; in each small window a gradient orientation histogram with 8 directions is calculated, and the accumulated value of each gradient direction forms one seed point. A keypoint consists of 4 × 4 = 16 seed points, each seed point carries vector information for 8 directions, and one feature point thus forms a 128-dimensional SIFT feature vector, as shown in Figs. 2-4. Combining neighborhood direction information in this way not only improves the noise resistance of the SIFT algorithm, but also gives reasonably good fault tolerance for feature matching in the presence of localization errors. Figs. 2-4 show a window of 4 × 4 pixels, but classical SIFT uses a window of 16 × 16 pixels in actual computation.

Once the SIFT feature vectors of two images have been generated, the next stage is SIFT feature matching. SIFT feature vector matching uses the Euclidean distance between feature vectors as the similarity measure for feature points in the two images. Take a feature point in one of the two images and find the two feature points in the other image with the smallest Euclidean distance to it. If the ratio of the nearest distance to the second-nearest distance is less than a certain threshold, the pair of matching points is accepted; otherwise it is discarded. Lowering this threshold reduces the number of SIFT matching points, but the matches become more accurate and stable.

As a supplement to the technical solution, the over-sampling correction algorithm is defined as follows: in each iteration, OnR uses one MapReduce job to select l center points and calculate the global error. The OnR method is inspired by the scalable k-means++ method. In addition to the oversampling factor l, it uses another oversampling factor o to further increase the number of center points selected in the Map stage.

As a further supplement to the technical solution, the method of approximate image clustering based on over-sampling correction is as follows. Job1 (from P₁ to P₄) is responsible for calculating ψ. In Job2 (from P₅ to P₈), each Map task uses the scalable k-means++ initialization algorithm to select points; the probability that each point x ∈ X is selected is p_x = l · d²(x, U₀) / φ_X(c₁) (stage P₅). A Reduce task then merges all center points from the Map tasks (stage P₇) and outputs the center point set U₁ (stage P₈).

The OnR iteration runs from P₉ to P₁₁, and the process of iteration j is as follows. Because each Map task cannot obtain the global error corresponding to the center points produced by iteration j-1, that is, it cannot obtain φ_X(U_{j-1}), our method OnR selects points using the global error corresponding to the center points produced by iteration j-2, that is, φ_X(U_{j-2}). However, the number of center points obtained after iteration j-2 is smaller than the number obtained after iteration j-1, so the global error φ_X(U_{j-2}) is larger than the global error of iteration j-1. As a result, the probability that each point x is chosen as a center point, p_x = l · d²(x, U_{j-1}) / φ_X(U_{j-2}), becomes smaller, and the number of center points selected in iteration j decreases.

To solve this problem, OnR uses another oversampling factor o to further enlarge the probability that each point is chosen as a center point, so that x is chosen with probability p_x = o · l · d²(x, U_{j-1}) / φ_X(U_{j-2}). In this case, the expected number of center points selected by the Map tasks in each iteration can exceed l (l is the number of center points that the scalable k-means++ initialization algorithm expects to select per iteration). OnR therefore uses a correction process in the Reduce task to remove the surplus center points.

The detailed iterative process of the OnR algorithm is as follows. In stage P₉, each Map task uses the probability p_x = o · l · d²(x, U_{j-1}) / φ_X(U_{j-2}) of each point x to select new center points and calculates the local error corresponding to the center point set U_{j-1}. The selected center points, all local error values, the random probability value corresponding to each selected point x, and l · d²(x, U_{j-1}) are transferred to Reduce (stage P₁₀). The correction operation is performed in a single Reduce task (stage P₁₁): it merges the center points selected in the Map stage and sums all local errors to obtain the global error φ_X(U_{j-1}) corresponding to the center points after iteration j-1. The true probability corresponding to each selected center point x, namely p'_x = l · d²(x, U_{j-1}) / φ_X(U_{j-1}), can then also be obtained. If [condition missing in the source text], point x is a correct center point; otherwise x is removed from C. The output of this iteration is U_j and φ_X(U_{j-1}), which become the input data of the next iteration.

Embodiment 3: In each of the above embodiments, during the clustering phase the image clustering system obtains the relevant information of a picture on the user side and uploads it to the cloud server for storage. The cloud server then processes it, obtains the optimal image clustering scheme, and feeds it back to the user.

The image clustering system described in this embodiment has the following structure and benefits:

(1) A single-terminal design is used. The user side is software installed on an Android smartphone, which the user runs when clustering images.

Through the phone's built-in system and the base stations of the mobile operator, the user obtains the images to be clustered over a 2G/3G network, Wi-Fi, or the like, and sends them to the cloud.

(2) Cloud computing is an Internet-based mode of computing in which shared software and hardware resources and information are supplied on demand to computers and other devices. The cloud server used by the image clustering system designed in this invention is a network server made up of multiple cloud data centers or virtual hosts. Using the parallel computation of cloud computing to process large-scale data and serve online image clustering users ensures addressing stability under heavy access, speeds up the response when users search, and improves scalability.

(3) When a user sends a clustering request, the data center performs large-scale image clustering with the SIFT feature extraction method and the over-sampling correction algorithm, and provides the image clustering scheme best suited to the user.

The above is only a preferred embodiment of the invention, but the scope of protection of the invention is not limited to it. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the invention.

Claims

1. An approximate space clustering system for a mobile cloud environment, characterized in that it consists mainly of a cloud center service system; the cloud center service system establishes and executes a SIFT feature extraction and matching algorithm, extracting the relevant feature data of an image and matching it; and the cloud center service system also establishes and executes an over-sampling correction algorithm to perform image correction, and feeds the clustered approximate-image information back to the client.

2. The approximate space clustering system for a mobile cloud environment as claimed in claim 1, characterized in that the client sends the images to be clustered to the cloud center service system as required, and receives the request of the cloud center service system.

3. The approximate space clustering system for a mobile cloud environment as claimed in claim 1, characterized in that the method of establishing and executing the SIFT feature extraction and matching algorithm includes:

S1. Scale-space extremum detection;

S2. Keypoint localization;

S3. Keypoint orientation assignment;

S4. Feature descriptor generation;

S5. Feature matching.

4. The approximate space clustering system for a mobile cloud environment as claimed in claim 3, characterized in that the method of step S1, scale-space extremum detection, is: each sample point is compared with its 8 neighbors in the current image and with the 9 × 2 corresponding points in the adjacent scale images above and below; if the sample point is greater than all, or less than all, of these 26 neighbors, the point is a keypoint.

5. The approximate space clustering system for a mobile cloud environment as claimed in claim 3, characterized in that the method of step S2, keypoint localization, is: a three-dimensional quadratic function is fitted to each keypoint to locate its scale and position.

6. The approximate space clustering system for a mobile cloud environment as claimed in claim 3, characterized in that the method of step S3, keypoint orientation assignment, is: samples are taken in a neighborhood window centered on the keypoint, and a gradient orientation histogram accumulates the gradient directions of the neighborhood pixels; the peak of the histogram represents the principal direction of the neighborhood gradients at the keypoint, and this principal direction is taken as the direction of the keypoint.

7. The approximate space clustering system for a mobile cloud environment as claimed in claim 6, characterized in that, when the gradient orientation histogram contains another peak with at least 80% of the energy of the main peak, that direction is taken as an auxiliary direction of the keypoint, so that the principal and auxiliary directions are combined.

8. The approximate space clustering system for a mobile cloud environment as claimed in claim 3, characterized in that the method of step S4, feature descriptor generation, is:

S4.1. After the principal direction of the keypoint is obtained, take a 16 × 16 window centered on the keypoint and rotate the image coordinate axes to the principal direction of the keypoint;

S4.2. Calculate the gradient direction and magnitude of each pixel in the 16 × 16 pixel window centered on the keypoint;

S4.3. Apply Gaussian weighting;

S4.4. Divide the window into 4 × 4 small windows, calculate a gradient orientation histogram with 8 directions in each small window, and accumulate the value of each gradient direction to form one seed point;

wherein a keypoint consists of 4 × 4 = 16 seed points, each seed point carries vector information for 8 directions, and one feature point forms a 128-dimensional SIFT feature vector.

9. The approximate space clustering system for a mobile cloud environment as claimed in claim 3, characterized in that the method of step S5, feature matching, is: take a feature point in one of the two images and find the two feature points in the other image with the smallest Euclidean distance to it; if the ratio of the nearest distance to the second-nearest distance is less than a certain threshold, the pair of matching points is accepted; otherwise it is discarded.

10. The approximate space clustering system for a mobile cloud environment as claimed in claim 1, characterized in that the iterative process of the over-sampling correction algorithm is as follows:

Step 1. Calculate the global clustering error ψ;

Step 2. In each Map processing procedure, the task uses the scalable k-means++ initialization algorithm to select points; the probability that each point x ∈ X is selected is p_x = l · d²(x, U₀) / φ_X(c₁);

where U₀ is the initial set of all center points, c₁ is the first center point, selected uniformly at random, and d²(x, U₀) is the squared distance between each point x ∈ X_i and U₀;

Step 4. During iteration, OnR selects points using the global error corresponding to the center points obtained in iteration j-2; OnR uses another oversampling factor o to further enlarge the probability that each point is chosen as a center point; and OnR uses a correction process in the Reduce task to remove the surplus center points.