CN104298713B - A kind of picture retrieval method based on fuzzy clustering - Google Patents
- Publication number
- CN104298713B CN201410472785.2A CN201410472785A
- Authority
- CN
- China
- Prior art keywords
- picture
- pictures
- point
- similarity
- library
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5838—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
Landscapes
- Engineering & Computer Science (AREA)
- Library & Information Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a picture retrieval method based on fuzzy clustering, comprising the following steps: S11, build a feature value library for the pictures in a picture library and number each picture; S12, select from the picture library N pictures whose mutual distances are all greater than a distance threshold A1, and classify the remaining pictures a first time to form N picture classes; S13, perform step S12 on every class among the N picture classes that contains more pictures than a quantity threshold, until all classes are smaller than the quantity threshold, obtaining M representative points; S14, assign every picture in the picture library, according to its similarity to the M representative points, to the picture set represented by the most similar representative point; S15, for an input picture to be retrieved, extract its feature values, calculate its similarity to every representative point, and retrieve within the picture sets of the several closest representative points. While maintaining retrieval precision, the invention narrows the retrieval range and reduces the retrieval workload.
Description
Technical field
The present invention relates to a picture retrieval method, in particular to a picture retrieval method based on fuzzy clustering, and belongs to the technical field of information retrieval.
Background
Pictures are one of the important forms of multimedia information. Through rich visual features such as color, texture and shape, they present abstract data to the public intuitively and vividly. As information spreads ever more conveniently over the Internet and the functions of mobile terminals keep improving, image information has become, after text, another main information carrier, and it is widely used in major computing fields such as information retrieval, data mining and human-computer interaction. However, pictures carry complex information, depend strongly on context, have high-level semantics that are hard to abstract, are computationally expensive to process, and lack mature organizational structures and search modes for massive picture collections. As a result, research on picture processing, retrieval, analysis, organization and management, especially for the massive number of pictures on the Internet, has become a difficult research topic in the computer field.
The basic model of existing picture retrieval compares the query picture with the pictures in the library one by one for similarity, sorts the results, and returns the closest pictures. This model traverses the whole picture library on every retrieval. When there are too many retrieval requests, later visitors face a long wait, and the waiting time grows further as the number of visitors increases. Retrieval results come from the collected picture library. To meet the needs of different visitors, or to handle query pictures of different types, the picture library must be large enough to guarantee retrieval accuracy; but an oversized picture library multiplies the retrieval load and the response time, so the requirement of real-time retrieval cannot be met.
To address the above problems, the Chinese invention patent with application No. 201010195710.6 discloses an image retrieval method comprising a training part and a retrieval part. The training part comprises the following steps: extracting feature points; supplementing feature points and determining matching relations; generating similar point sets; clustering the feature point sets; and generating a feature vector for each image in the image database. The retrieval part comprises the following steps: extracting the feature points of the picture to be retrieved and generating a feature point set; calculating the distance from each feature point descriptor vector to each cluster center and assigning the feature point to the cluster with the minimum distance; calculating the frequency of each cluster to which the feature points of the picture to be retrieved belong; generating a feature vector from these cluster frequencies and the log-probabilities of the clusters, and normalizing it to unit length; calculating the Euclidean distance from the feature vector of the picture to be retrieved to the feature vector of each image in the library, and outputting the image with the smallest distance as the retrieval result.
Summary of the invention
The technical problem to be solved by the present invention is to provide a picture retrieval method based on fuzzy clustering.
To achieve the above object, the present invention adopts the following technical solutions:
A picture retrieval method based on fuzzy clustering, comprising the following steps:
S11, build a feature value library for the pictures in the picture library, and number each picture;
S12, operating on the numbers, select from the picture library N pictures whose mutual distances are all greater than a distance threshold A1, and classify the remaining pictures a first time to form N picture classes; wherein the process of selecting the N pictures from the picture library comprises the following steps: S121, arbitrarily select a picture P in the picture library and use it as the query to search the picture library, finding the picture Q1 with the largest similarity distance; S122, with picture Q1 as the query, extract from the picture set the part SH1 whose similarity distance to Q1 is greater than the distance threshold A1, and obtain the picture Q2 with the largest similarity distance; S123, repeat step S122, where each query picture is the least similar picture QN obtained in the previous iteration and the picture set searched is the SHN obtained in the previous iteration, until SHN is empty; the resulting N pictures Q1 ... QN are the N representative points to be selected;
S13, perform step S12 on every class among the N picture classes that contains more pictures than a quantity threshold, this time selecting pictures whose mutual distances are all greater than a distance threshold A2, so that each class forms a varying number of subclasses; continue to perform step S12 on every subclass that contains more pictures than the quantity threshold, until all classes are smaller than the quantity threshold, obtaining M representative points;
S14, assign every picture in the picture library, according to its similarity to the M representative points, to the picture set represented by the most similar representative point, completing the division of the whole picture library into classes;
S15, for an input picture to be retrieved, extract its feature values, calculate the similarity between the picture and every representative point, sort the results by size, choose the several representative points closest in similarity, retrieve within the picture sets represented by the chosen representative points, merge the retrieval results, and return them to the user.
A picture retrieval method based on fuzzy clustering, comprising the following steps:
S21, number the pictures in the picture library, map each picture to a feature value code, assign the codes to nodes using a byte hash, and then store them in a distributed file system;
S22, randomly read one feature value code from the distributed file system as the initial point and assign one map function to each node; each map function finds the point with the largest similarity distance to the initial point, and the results are sent to a reduce function to be merged, selecting the point Q1 in the whole picture library that is farthest from the initial point in similarity distance;
S23, with point Q1 as the new initial point, calculate in each node the point with the largest similarity distance to Q1 and merge the results at the reduce function, taking the maximum, to obtain the picture set SH1 whose similarity distance to Q1 is greater than the distance threshold A1 and the least similar picture Q2; reassign the feature value codes of the pictures in SH1 to the nodes and assign one map function to each node; continue as above to find the point Q3 with the farthest similarity distance; each initial point is the least similar picture QN obtained in the previous iteration, and the picture set searched is the SHN obtained in the previous iteration; iterate until SHN is empty, obtaining N representative points;
S24, assign one map function to each representative point; each map function divides the remaining pictures in the picture library into classes according to their similarity distance to the known representative points; pictures of the same class are mapped to one reduce function, and whether the class can be processed on a single node is judged from the number of pictures in it;
S25, for each class that cannot be processed on a single node, continue to find representative points using step S23, taking the picture set SHN whose similarity distance to QN is greater than the distance threshold A2 as the picture set to be searched, until every class can be processed on a single node, obtaining M representative points;
S26, collect all representative points and assign one map function to each; each map function calculates the similarity distance between the remaining pictures in the picture library and its representative point to produce the final classification; after pictures of the same class are merged by the reduce function, each class is saved as a file;
S27, for an input picture to be retrieved, extract its feature values, calculate the similarity between the picture and every representative point, sort the results by size, choose the several representative points closest in similarity, search the files represented by the chosen representative points for the final results, and return them.
Preferably, retrieving within the picture sets represented by the chosen representative points comprises the following steps:
S151, assign one map function to each picture class, and assign the feature value codes of the pictures in each class to nodes using a byte hash;
S152, each map function calculates the similarity distance between the query picture and the pictures of the picture set on the same node, sorts them by distance, and sends the sorted result to the reduce function;
S153, the reduce function receives the sorted results sent by the map functions, merges and sorts them, and obtains the final picture retrieval result.
Preferably, when a picture is processed, only its number is operated on and the picture itself is not fetched; only after the retrieval results are merged are the pictures fetched from the picture library according to the correspondence between pictures and numbers and returned to the user.
Preferably, when the similarity distance between pictures is calculated, each picture is represented by a combination of two feature values, and the geometric mean is used as the formula for combining the two feature values.
Preferably, the distance threshold A2 is any number smaller than the distance threshold A1.
The picture retrieval method based on fuzzy clustering provided by the present invention selects representative points and classifies the pictures in the picture library according to them. At retrieval time, only the similarity distance between the input picture and the representative points needs to be calculated; the classes of the several representative points with the smallest similarity distance are then searched further. While maintaining retrieval precision, this narrows the retrieval range, reduces the retrieval workload, and effectively meets users' demand for real-time retrieval.
Brief description of the drawings
Fig. 1 is a flow chart of the picture retrieval method based on fuzzy clustering provided by the present invention;
Fig. 2 is a flow chart of selecting N pictures from the picture library in an embodiment provided by the present invention.
Detailed description of the embodiments
The technical content of the present invention is described in further detail below with reference to the drawings and specific embodiments.
A picture retrieval method based on fuzzy clustering proceeds as follows. First, an appropriate number of representative points are chosen according to the similarity calculation model the pictures rely on and the density of the picture distribution in the high-dimensional feature space. The representative points can themselves be pictures. Regions where pictures are more densely clustered get more representative points, and regions where pictures are less densely clustered get fewer; the representative points are spaced as far apart as the density allows, so that the other pictures show a clear tendency toward one representative point when they are classified. After the representative points are selected, the remaining pictures are divided into different regions according to their distance from the representative points, forming individual high-dimensional subspaces, that is, the picture classes. Finally, at retrieval time, the input picture is assigned to several high-dimensional subspaces, retrieval is carried out within these subspaces, and the retrieval results are merged and returned to the user.
This process is described in detail below with reference to Fig. 1.
S11, build a feature value library for the pictures in the picture library, and number each picture.
When the feature value library is built, each picture is represented by a combination of two feature values, so that the information covered is sufficient to represent the picture content. In the embodiment provided by the present invention, the two feature values are CEDD and the edge histogram. The combination of CEDD and edge histogram covers three attributes of a picture (color, texture and contour) and distinguishes the main object of a picture well; in addition, each feature value occupies little memory and is easy to store. The feature value library is built on this combination, and every picture is numbered. In the embodiment provided by the present invention, processing a picture means operating only on its number; the picture itself is not fetched. Only after the final retrieval results are merged are the pictures fetched from the picture library, according to the correspondence between pictures and numbers, and returned to the user. For example, when the similarity distance between pictures is calculated, only the feature values corresponding to the picture numbers are fetched for the calculation, not the pictures, which reduces operational complexity and improves retrieval efficiency.
S12, operating on the numbers, select from the picture library N pictures whose mutual distances are all greater than a distance threshold A1, and classify the remaining pictures a first time to form N picture classes.
Using the feature values of the pictures stored in the feature value library, the mutual distances between pictures are calculated with the geometric mean as the formula for combining the two feature values. The advantage of the geometric mean is that it avoids normalizing the feature values and, compared with plain multiplication, keeps the value range of the combination close to that of a single feature value, which makes distance values easier to compare. N pictures whose mutual distances are all greater than the distance threshold A1 are selected from the picture library as N representative points. Taking these N pictures as the reference, the remaining pictures are classified a first time according to their similarity distance to the N pictures, forming N picture classes. While the N classes are being formed, each picture is represented by its number and is not fetched from the picture library, which reduces operational complexity and improves processing efficiency.
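The geometric-mean combination described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not say which metric is applied to each individual feature value, so the L1 metric, the dictionary keys `cedd` and `edge`, and both function names are hypothetical.

```python
import math


def l1_distance(a, b):
    # Assumed per-feature metric; the source does not name one.
    return sum(abs(x - y) for x, y in zip(a, b))


def combined_distance(feat_a, feat_b):
    """Combine two per-feature distances by their geometric mean.

    feat_a, feat_b: dicts with hypothetical keys "cedd" and "edge",
    each holding one feature vector looked up by picture number.
    """
    d_cedd = l1_distance(feat_a["cedd"], feat_b["cedd"])
    d_edge = l1_distance(feat_a["edge"], feat_b["edge"])
    # Square root of the product: stays on the scale of a single distance.
    return math.sqrt(d_cedd * d_edge)
```

Because the result is the square root of a product, it stays on a scale comparable to either distance alone; note that it is also zero whenever either per-feature distance is zero.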
As shown in Fig. 2 the process that N pictures are chosen from picture library comprises the following steps:
S121, a pictures P is arbitrarily chosen in picture library, is retrieved by input of this pictures in picture library,
Find the picture Q of least similar (similarity distance is maximum)1。
Finding the picture Q least similar to picture P1When, it is several according to the characteristic value of the picture stored in value indicative storehouse, use
What combinatorial formula of the average as two kinds of characteristic values, the mutual distance between picture is calculated, find out the figure maximum with picture P distances
Piece, as picture Q1。
S122, with picture Q1For retrieval input and pictures are divided into and Q1Similarity distance be more than apart from threshold A1
Part SH1, and obtain least similar picture Q2。
S123, circulation perform step S1 22, the least similar pictures that each retrieving image obtains for last circulation
QN, the pictures being retrieved are the SH that last circulation obtainsN, until SHNUntill for sky, resulting Q1……QNN pictures
As need to select N number of represents a little.
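Steps S121 to S123 can be read as the loop sketched below. This is one interpretation rather than the patent's reference implementation: the function name, the `start` parameter, and the `dist` callable working on picture numbers are assumptions introduced for illustration.

```python
def select_representatives(ids, dist, a1, start=None):
    """Pick pictures whose mutual distances all exceed a1.

    ids   : iterable of picture numbers (hypothetical integer ids)
    dist  : callable(id_a, id_b) -> similarity distance (assumed symmetric)
    a1    : distance threshold A1
    start : the arbitrary picture P; defaults to the first id
    """
    ids = list(ids)
    if not ids:
        return []
    p = ids[0] if start is None else start
    # S121: Q1 is the picture least similar to P.
    q = max(ids, key=lambda x: dist(p, x))
    reps = [q]
    remaining = ids
    while True:
        # S122: keep only pictures farther than A1 from the current query.
        remaining = [x for x in remaining if dist(q, x) > a1]
        if not remaining:  # S123: stop once SH_N is empty
            break
        # The least similar survivor becomes the next query picture.
        q = max(remaining, key=lambda x: dist(q, x))
        reps.append(q)
    return reps
```

Since every surviving set is filtered against each earlier query picture in turn, any two selected pictures end up more than A1 apart.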
S13, perform step S12 on every class among the N picture classes that contains more pictures than a quantity threshold H, this time selecting pictures whose mutual distances are all greater than a distance threshold A2, so that each class forms a varying number of subclasses. Continue to perform step S12 on every class that contains more pictures than H, until all classes are smaller than H, forming M picture classes, that is, M representative points. The distance threshold A2 is any number smaller than the distance threshold A1, and A1 and A2 are set according to the distribution of the picture library and the system's differing requirements on accuracy and response time in retrieval. Setting A1 and A2 appropriately adjusts the size and relative density of the picture classes and makes retrieval more flexible.
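The repeated splitting in S13 can be sketched as a work-list loop. The `split` callable stands in for "perform step S12 with threshold A2" and is a hypothetical parameter, as are the function name and the dictionary layout; the guard against a split that produces no subdivision is an addition of this sketch, not something the source specifies.

```python
def refine_classes(classes, split, h):
    """Split oversized classes until none holds more than h pictures.

    classes : dict mapping representative id -> list of picture ids
    split   : callable(ids) -> dict rep id -> list of ids; assumed to
              re-run the S12 selection with the smaller threshold A2
    h       : quantity threshold H
    """
    final = {}
    pending = list(classes.items())
    while pending:
        rep, members = pending.pop()
        if len(members) <= h:
            final[rep] = members
            continue
        parts = split(members)
        if len(parts) <= 1:
            # Sketch-only guard: A2 did not separate the class, so keep it
            # as is instead of looping forever.
            final[rep] = members
            continue
        pending.extend(parts.items())
    return final
```

The number of keys in the returned dictionary corresponds to M, the final count of representative points.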
S14, assign every picture in the picture library, according to its similarity to the M representative points, to the picture set represented by the most similar representative point, completing the division into classes.
For the M selected representative points, the similarity distance between each remaining picture in the picture library and each of the M representative points is calculated, and each picture is placed in a picture set according to these distances, completing the final division of the whole picture library into classes.
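A minimal sketch of the assignment in S14, assuming a `dist` callable over picture numbers; the function name and the dictionary layout are illustrative only.

```python
def assign_to_representatives(ids, reps, dist):
    """Place each picture in the class of its nearest representative.

    ids  : iterable of picture numbers
    reps : list of representative picture numbers (the M points)
    dist : callable(id_a, id_b) -> similarity distance
    """
    classes = {r: [] for r in reps}
    for pid in ids:
        # Smallest distance means highest similarity.
        nearest = min(reps, key=lambda r: dist(r, pid))
        classes[nearest].append(pid)
    return classes
```

This step costs one distance evaluation per picture per representative, which matches the "number of pictures times number of classes" workload the description mentions for the final division.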
S15, for an input picture to be retrieved, extract its feature values, calculate the similarity between the picture and every representative point, sort the results by size, choose the several representative points closest in similarity, search the picture sets represented by these representative points for the final results, and return them.
After the user inputs the picture to be retrieved, the picture is represented by the combination of two feature values; the similarity distance between the picture and each representative point is then calculated with the geometric mean as the combination formula, and the representative points are sorted by this value. The several nearest representative points are chosen as required, and the picture is searched for within the picture set represented by each of them. In the embodiment provided by the present invention, when the picture to be retrieved is searched for within these picture sets, the pictures in the picture library are not fetched; only the feature values corresponding to the picture numbers are fetched to calculate the similarity distances, which are sorted by size and merged. The pictures are then fetched from the picture library according to the correspondence between pictures and numbers and returned to the user.
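The retrieval step can be sketched in Python as below. The parameters `k` (how many representative points to keep) and `top_n` (how many results to return) are hypothetical names; the source only says "several" and leaves both choices to the deployment. Here `dist` compares feature values directly, and `features` maps picture numbers to feature values.

```python
import heapq


def retrieve(query, reps, classes, features, dist, k=2, top_n=5):
    """Search only the classes of the k representatives nearest the query.

    query    : feature value of the input picture
    reps     : list of representative picture numbers
    classes  : dict rep number -> list of member picture numbers
    features : dict picture number -> feature value
    dist     : callable(feature_a, feature_b) -> similarity distance
    """
    # Rank representative points by distance to the query, keep the k nearest.
    nearest = heapq.nsmallest(k, reps, key=lambda r: dist(query, features[r]))
    # Only members of the chosen classes are compared with the query.
    candidates = [pid for r in nearest for pid in classes[r]]
    # Merge and rank; picture numbers are returned, not picture data.
    return heapq.nsmallest(
        top_n, candidates, key=lambda pid: dist(query, features[pid])
    )
```

The function returns picture numbers only, in line with the description's point that pictures are fetched from the library after the results are merged.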
In the embodiment provided by the present invention, retrieval within the different picture classes is handled by a distributed cluster. The classes are to some degree independent of one another, and allocating the storage nodes of the classes sensibly within the cluster ensures that a retrieval request is distributed to only a few nodes, which strengthens the scalability of the system. Moreover, the representative points of the classes also differ in how far apart they are; classes whose representative points are close together are likely to be searched at the same time and can be placed on the same node for processing.
MapReduce is one of distributed computing platform of current main-stream, and calculating is decomposed into mapping (Map) and abbreviation
(Reduce) two kinds of processing stages, it can greatly facilitate user when not knowing about Distributed Calculation principle and implementation method by journey
Sequence is deployed in distributed type assemblies and calculated.The basic procedure of MapReduce model is first to the individual element of data
Being operated, this step is referred to as to map (Map), i.e., pending initial data is converted into the data of preliminary treatment, by
Dependence is not present between data in the operation of this step, it is possible to assign data to different nodal parallels and calculate,
Map output data is the form tissue according to key-value pair in Hadoop, then carries out Hash behaviour to the key values in key-value pair
Corresponding node is assigned it to after work up, will enter abbreviation (Reduce) stage by integrating sorting data.The abbreviation stage pair
The data of same key assignments merge or other processing obtain single data result, and then complete whole operation.This processing stream
Journey can ensure processing each stage be not present must through processing node and cause Calculation bottleneck.
The MapReduce model ensures reliable computation through feedback on each task: every node reports its running state at a fixed interval, and when the system loses contact with a node, it reassigns that node's tasks to other nodes. Following the principle of data locality, the system generally sends the processing program to the node that stores the corresponding data wherever possible, which avoids overloading the network and improves efficiency.
In the embodiment provided by the present invention, the process of retrieving a picture within the different picture classes is converted into a MapReduce processing method. MapReduce is a divide-and-conquer computation model composed of map and reduce, and the independence of retrieval within different picture classes suits it: the retrieval can be converted into several MapReduce tasks according to the selected picture classes. After conversion, retrieving a picture within each picture class comprises the following steps:
S151, assign one map function to each picture class, and assign the feature value codes of the pictures in each class to nodes using a byte hash.
When map functions are assigned to the picture classes, one map function may be assigned to each class; when the representative points of some classes are close together, one map function may also be assigned to several classes. In the embodiment provided by the present invention, one map function is assigned to each picture class.
S152, each map function calculates the similarity distance between the query picture and the pictures of the picture set on the same node, sorts them by distance, and sends the sorted result to the reduce function.
S153, the reduce function receives the sorted results sent by the map functions, merges and sorts them, and obtains the final picture retrieval result.
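The map and reduce roles in S151 to S153 can be simulated in plain Python as below. This sketch does not use the Hadoop API; `map_class`, `reduce_merge` and the `top_n` cut-off are hypothetical names, and the functions run in a single process purely to show the data flow.

```python
import heapq
from itertools import islice


def map_class(query, class_ids, features, dist):
    """S152: one map task scores and sorts the pictures of one class.

    Returns a list of (distance, picture number) pairs in ascending order.
    """
    return sorted((dist(query, features[pid]), pid) for pid in class_ids)


def reduce_merge(sorted_lists, top_n):
    """S153: merge the already sorted map outputs and keep the best top_n."""
    merged = heapq.merge(*sorted_lists)  # lazy k-way merge of sorted inputs
    return [pid for _, pid in islice(merged, top_n)]
```

Because each map output is already sorted, the reduce side only needs a k-way merge rather than a full re-sort, which is one reasonable way to realise the "merge and sort" of S153.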
In the picture retrieval method based on fuzzy clustering provided by the present invention, representative point selection is completed inside each class and is entirely independent of the computation in other classes, which suits distributed computing. In the whole retrieval process, apart from the final step, where assigning each picture to a specific class by similarity calculation requires a number of computations equal to the product of the number of pictures and the number of classes, the remaining computation is small. Time complexity therefore does not grow exponentially as the picture library grows, and the method can be used for retrieval on large picture libraries, effectively meeting the needs of different visitors and handling query pictures of different types.
In addition, the picture retrieval method based on fuzzy clustering provided by the present invention does not use class center points as the clustering criterion; instead, several reference pictures that differ greatly in the space determine which class each remaining picture tends toward. The number of iterations needed to choose the reference pictures depends on the chosen distance threshold relative to the sparsity of the picture library, not on the size of the picture library, and no class division is performed in the individual iterations. The final class division of the pictures is determined only after all reference pictures have been chosen. The reference pictures are closely related to the size and spatial sparsity of the classes they represent: regions where pictures are relatively dense have relatively more reference pictures, which keeps class sizes relatively uniform and divided according to sparsity. After the first clustering, the remaining clustering is carried out within classes, which meets the basic requirement of divide-and-conquer algorithms in distributed computing.
In another embodiment provided by the present invention, by the figure retrieving method based on fuzzy clustering in inhomogeneity figure
Piece concentrates the process of Selecting Representative Points from A to be converted into the processing method of MapReduce model, and MapReduce model is one kind by mapping
With the computation model based on thought of dividing and ruling of abbreviation composition, the independence in prototype selection is applied to the model, Ke Yizhuan
Several MapReduce tasks are turned to, are specifically comprised the following steps:
S21, the picture in picture library is numbered, and is mapped as characteristic value code, using byte Hash by its point
It is fitted on node, is then stored into distributed file system.
S22, one characteristic value code of random read take is each node distribution one as initial point from distributed file system
Individual map functions, the point maximum with its similarity distance is found in each map functions, is re-send at reduce functions to it
Merge, pick out whole picture library with its similarity apart from farthest point Q1。
S23, with point Q1For new initial point, calculate in each node with point Q1The maximum point of similarity distance, is merged into
Maximum is taken at reduce functions, is obtained and Q1Similarity distance be more than apart from threshold A1 pictures SH1It is and least similar
Picture Q2, in SH1In characteristic value code corresponding to picture is assigned on node again, and be each one map letter of node distribution
Number, continue to find similarity apart from farthest point Q according to above-mentioned steps3, each initial point circulates for the last time to be obtained most
Dissimilar picture QN, the pictures being retrieved are the SH that last circulation obtainsN, repeatedly circulation is until SHNUntill for sky, obtain
It is N number of to represent a little.
S24, point one map function of distribution is represented to be each, each map functions according to remaining picture in picture library with it is known
The similarity distance division classification of point is represented, same category is mapped at a reduce function, according to picture number in classification
Size judge whether to perform with single node.
In the embodiment provided by the present invention, judging whether a category can be run on a single node according to the number of pictures in the category means judging whether that number exceeds the set quantity threshold. When the number of pictures in a category exceeds the set quantity threshold, the category cannot be executed on a single node and processing turns to step S25. When the number of pictures in a category does not exceed the set quantity threshold, the category can be executed on a single node and needs no further division.
S25, for each category that cannot be executed on a single node, step S23 is applied again to find representative points, with the picture set SHN whose similarity distance from QN is greater than the distance threshold A2 chosen as the picture set being searched. This continues until all categories can be executed on a single node, yielding M representative points.
S26, all representative points are collected and one map function is assigned to each representative point. Each map function calculates the similarity distance between the remaining pictures in the picture library and its representative point. The pictures are then classified, and the reduce function merges pictures of the same class and saves them as a file.
S27, for an input picture to be retrieved, its feature values are extracted, the similarities between the picture and all representative points are calculated and sorted by size, the several representative points with the closest similarity are chosen, and the final results are searched for in the files represented by these representative points and returned.
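The selection loop of steps S22 and S23 can be sketched as follows. This is a minimal single-process illustration, not the patented implementation: the map and reduce steps are emulated with ordinary functions, each node's data is a plain list called a shard, and the Euclidean metric and the names `map_farthest`, `reduce_farthest` and `select_representatives` are assumptions introduced here.

```python
import math
import random

def distance(a, b):
    """Similarity distance between two feature vectors (Euclidean, assumed)."""
    return math.dist(a, b)

def map_farthest(shard, query):
    """Map step: the farthest point from `query` within one node's shard."""
    return max(shard, key=lambda p: distance(p, query), default=None)

def reduce_farthest(candidates, query):
    """Reduce step: the global farthest point among the per-node candidates."""
    found = [c for c in candidates if c is not None]
    return max(found, key=lambda p: distance(p, query), default=None)

def select_representatives(shards, a1, seed=0):
    """Return representative points whose pairwise distances all exceed a1."""
    rng = random.Random(seed)
    start = rng.choice([p for shard in shards for p in shard])
    # S22: the point farthest from a random initial point becomes Q1.
    q = reduce_farthest([map_farthest(s, start) for s in shards], start)
    reps = [q]
    while True:
        # S23: keep only the points farther than a1 from the current point.
        shards = [[p for p in s if distance(p, q) > a1] for s in shards]
        nxt = reduce_farthest([map_farthest(s, q) for s in shards], q)
        if nxt is None:  # the searched picture set is empty, so stop
            return reps
        reps.append(nxt)
        q = nxt
```

Because each round discards every point within `a1` of the newest representative, every picture ends up within `a1` of at least one representative, while the representatives themselves stay more than `a1` apart.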
In the embodiment provided by the present invention, the retrieval process within the picture sets represented by the chosen representative points is the same as steps S151-S153 above and is not repeated here.
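For illustration, the retrieval inside one picture set (steps S151 to S153) might look like the sketch below. It runs in a single process and only emulates the map and reduce roles. The CRC32-based `assign_node` stands in for the byte hash, whose exact form the text does not specify, and the Euclidean metric and all function names are assumptions.

```python
import heapq
import math
import zlib
from itertools import islice

def assign_node(number, n_nodes):
    """S151: choose a node from a hash of the picture number's bytes (assumed CRC32)."""
    return zlib.crc32(str(number).encode("utf-8")) % n_nodes

def map_rank(shard, query):
    """S152: one node sorts its (number, vector) pairs by distance to the query."""
    return sorted((math.dist(vec, query), number) for number, vec in shard)

def reduce_merge(ranked_lists, top_k):
    """S153: merge the per-node sorted lists and keep the top_k closest numbers."""
    return [number for _, number in islice(heapq.merge(*ranked_lists), top_k)]

def retrieve(picture_set, query, n_nodes=2, top_k=3):
    """Distribute a picture set over nodes, rank per node, then merge."""
    shards = [[] for _ in range(n_nodes)]
    for number, vec in picture_set.items():
        shards[assign_node(number, n_nodes)].append((number, vec))
    return reduce_merge([map_rank(s, query) for s in shards], top_k)
```

Since every map output is already sorted, the reduce step only needs a k-way merge rather than a full re-sort, and the result does not depend on which node each picture landed on.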
In summary, the picture retrieval method based on fuzzy clustering provided by the present invention chooses an appropriate number of representative points according to the similarity calculation on which the pictures rely and the density of the picture distribution in the high-dimensional feature space. The method not only covers three attributes of a picture (color, texture and contour) and distinguishes the main object of a picture well, but also occupies little memory per feature value, which makes storage easy. After the representative points are selected, the remaining pictures are divided into different regions according to their distance from these representative points, forming individual high-dimensional subspaces, i.e. picture sets of different categories. Finally, at retrieval time the input picture is assigned to several high-dimensional subspaces, retrieval is performed within these subspaces, and the retrieval results are merged and returned to the user. The retrieval within the high-dimensional subspaces is handled by a distributed cluster, which can effectively improve retrieval efficiency and meet the user's requirement for real-time retrieval.
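The partition-then-probe idea summarized above can be sketched as follows. This is a simplified single-machine illustration under stated assumptions: feature vectors are tuples keyed by picture number, the metric is Euclidean, and `build_index`, `search`, `n_probe` and `top_k` are names introduced here rather than terms from the patent.

```python
import math
from collections import defaultdict

def build_index(features, reps):
    """Assign each picture number to the picture set of its nearest representative."""
    index = defaultdict(list)
    for number, vec in features.items():
        nearest = min(range(len(reps)), key=lambda i: math.dist(vec, reps[i]))
        index[nearest].append(number)
    return index

def search(query, features, reps, index, n_probe=2, top_k=3):
    """Search only the picture sets of the n_probe closest representatives."""
    order = sorted(range(len(reps)), key=lambda i: math.dist(query, reps[i]))
    candidates = [n for i in order[:n_probe] for n in index.get(i, [])]
    candidates.sort(key=lambda n: math.dist(query, features[n]))
    return candidates[:top_k]
```

Probing more than one representative trades some extra work for a lower chance of missing a close picture that fell just across a subspace boundary.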
The picture retrieval method based on fuzzy clustering provided by the present invention has been described in detail above. For those of ordinary skill in the art, any obvious change made to it without departing from the true spirit of the present invention will constitute an infringement of the patent right of the present invention and will incur the corresponding legal liability.
Claims (6)
1. A picture retrieval method based on fuzzy clustering, characterized by comprising the following steps:
S11, constructing a feature value library for the pictures in the picture library, and numbering each picture;
S12, taking the numbers as the objects of operation, choosing from the picture library N pictures whose mutual distances are all greater than the distance threshold A1, and performing a first classification of the remaining pictures to form N classes of pictures; wherein the process of choosing the N pictures from the picture library comprises the following steps: S121, arbitrarily choosing a picture P in the picture library, retrieving in the picture library with this picture as input, and finding the picture Q1 with the largest similarity distance; S122, with picture Q1 as the retrieval input, dividing out of the picture set the part SH1 whose similarity distance from Q1 is greater than the distance threshold A1, and obtaining the picture Q2 with the largest similarity distance; S123, executing step S122 in a loop, where the retrieval picture in each iteration is the least similar picture QN obtained in the previous iteration and the picture set being searched is the SHN obtained in the previous iteration, until SHN is empty; the resulting N pictures Q1, ..., QN are the N representative points to be selected;
S13, performing step S12 on each class among the N classes of pictures whose number of pictures exceeds the quantity threshold, with the mutual distances between the chosen pictures all greater than the distance threshold A2, so that each class forms a varying number of subclasses; continuing to perform step S12 on every subclass whose number of pictures exceeds the quantity threshold until all classes are smaller than the quantity threshold, obtaining M representative points;
S14, dividing every picture in the picture library, according to its degree of similarity to the M representative points, into the picture set represented by the representative point with the highest degree of similarity, thereby completing the division of the whole picture library into classes;
S15, for an input picture to be retrieved, extracting its feature values, calculating the similarities between the picture and all representative points and sorting them by size, choosing the several representative points with the closest similarity, retrieving within the picture sets represented by the chosen representative points, and returning the merged retrieval results to the user.
2. A picture retrieval method based on fuzzy clustering, characterized by comprising the following steps:
S21, numbering the pictures in the picture library, mapping the pictures to feature value codes, distributing the codes to the nodes using byte hashing, and then storing them in the distributed file system;
S22, randomly reading one feature value code from the distributed file system as the initial point and assigning one map function to each node; finding, in each map function, the point with the largest similarity distance from the initial point and sending it to the reduce function, which merges the results and picks out the point Q1 that has the farthest similarity distance from the initial point in the whole picture library;
S23, with point Q1 as the new initial point, calculating on each node the point with the largest similarity distance from Q1, merging the results at the reduce function and taking the maximum, thereby obtaining the picture set SH1 whose similarity distance from Q1 is greater than the distance threshold A1 and the least similar picture Q2; redistributing the feature value codes corresponding to the pictures in SH1 to the nodes, assigning one map function to each node, and repeating the above steps to find the point Q3 with the farthest similarity distance; in each iteration the initial point is the least similar picture QN obtained in the previous iteration and the picture set being searched is the SHN obtained in the previous iteration; repeating the loop until SHN is empty, obtaining N representative points;
S24, assigning one map function to each representative point; each map function divides the remaining pictures in the picture library into categories according to their similarity distance to the known representative points, and pictures of the same category are mapped to one reduce function; judging from the number of pictures in a category whether the category can be executed on a single node;
S25, for each category that cannot be executed on a single node, applying step S23 again to find representative points, choosing the picture set SHN whose similarity distance from QN is greater than the distance threshold A2 as the picture set being searched, until all categories can be executed on a single node, obtaining M representative points;
S26, collecting all representative points and assigning one map function to each representative point; each map function calculates the similarity distance between the remaining pictures in the picture library and its representative point; the pictures are then classified, and the reduce function merges pictures of the same class and saves them as a file;
S27, for an input picture to be retrieved, extracting its feature values, calculating the similarities between the picture and all representative points and sorting them by size, choosing the several representative points with the closest similarity, and searching for the final results in the files represented by the chosen representative points and returning them.
3. The picture retrieval method based on fuzzy clustering as claimed in claim 1 or 2, characterized in that the retrieval process within the picture sets represented by the chosen representative points comprises the following steps:
S151, assigning one map function to each class of picture set, and distributing the feature value codes corresponding to the pictures contained in each class of picture set to the nodes using byte hashing;
S152, the map function calculates the similarity distance between the retrieval picture and the pictures of the picture set on the same node, sorts the pictures by distance, and sends the sorted result to the reduce function;
S153, the reduce function receives the sorted results sent by each map function, then merges and sorts them to obtain the final picture retrieval result.
4. The picture retrieval method based on fuzzy clustering as claimed in claim 1 or 2, characterized in that:
when pictures are processed, only the numbers corresponding to them are operated on and the pictures themselves are not extracted; only after the retrieval results are merged are the pictures extracted from the picture library according to the correspondence between pictures and numbers and returned to the user.
5. The picture retrieval method based on fuzzy clustering as claimed in claim 1 or 2, characterized in that:
when the similarity distance between pictures is calculated, each picture is represented by a combination of two kinds of feature values, and the geometric mean is used as the formula for combining the two kinds of feature values to calculate the similarity distance between pictures.
6. The picture retrieval method based on fuzzy clustering as claimed in claim 1 or 2, characterized in that:
the distance threshold A2 is any number less than the distance threshold A1.
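As an illustration of the geometric-mean combination recited in claim 5, the sketch below combines two per-feature distances into one similarity distance. The claim does not name the two feature kinds, so the `"color"` and `"texture"` keys, the Euclidean per-feature metric and the function name `combined_distance` are all assumptions made for this example.

```python
import math

def combined_distance(pic_a, pic_b):
    """Geometric mean of two per-feature distances (feature kinds are assumed)."""
    d_color = math.dist(pic_a["color"], pic_b["color"])
    d_texture = math.dist(pic_a["texture"], pic_b["texture"])
    return math.sqrt(d_color * d_texture)
```

One property of this combination worth noting is that the result is zero whenever either feature distance is zero, so two pictures that match exactly on one feature are treated as identical regardless of the other.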
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410472785.2A CN104298713B (en) | 2014-09-16 | 2014-09-16 | A kind of picture retrieval method based on fuzzy clustering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104298713A (en) | 2015-01-21 |
CN104298713B (en) | 2017-12-08 |
Family
ID=52318438
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410472785.2A Active CN104298713B (en) | 2014-09-16 | 2014-09-16 | A kind of picture retrieval method based on fuzzy clustering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104298713B (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106557523B (en) * | 2015-09-30 | 2020-05-12 | 佳能株式会社 | Representative image selection method and apparatus, and object image retrieval method and apparatus |
CN107122785B (en) * | 2016-02-25 | 2022-09-27 | 中兴通讯股份有限公司 | Text recognition model establishing method and device |
CN107423309A (en) * | 2016-06-01 | 2017-12-01 | 国家计算机网络与信息安全管理中心 | Magnanimity internet similar pictures detecting system and method based on fuzzy hash algorithm |
CN106528629B (en) * | 2016-10-09 | 2018-04-03 | 深圳云天励飞技术有限公司 | A kind of vector based on geometric space division searches for method and system generally |
CN110502953A (en) * | 2018-05-16 | 2019-11-26 | 杭州海康威视数字技术股份有限公司 | A kind of iconic model comparison method and device |
CN108830217B (en) * | 2018-06-15 | 2021-10-26 | 辽宁工程技术大学 | Automatic signature distinguishing method based on fuzzy mean hash learning |
CN109783678B (en) * | 2018-12-29 | 2021-07-20 | 深圳云天励飞技术有限公司 | Image searching method and device |
CN109766470A (en) * | 2019-01-15 | 2019-05-17 | 北京旷视科技有限公司 | Image search method, device and processing equipment |
CN110083732B (en) * | 2019-03-12 | 2021-08-31 | 浙江大华技术股份有限公司 | Picture retrieval method and device and computer storage medium |
CN109948734B (en) * | 2019-04-02 | 2022-03-29 | 北京旷视科技有限公司 | Image clustering method and device and electronic equipment |
CN110069645A (en) * | 2019-04-22 | 2019-07-30 | 北京迈格威科技有限公司 | Image recommendation method, apparatus, electronic equipment and computer readable storage medium |
CN110377781A (en) * | 2019-06-06 | 2019-10-25 | 福建讯网网络科技股份有限公司 | A kind of matched innovatory algorithm of application sole search |
CN110942046B (en) * | 2019-12-05 | 2023-04-07 | 腾讯云计算(北京)有限责任公司 | Image retrieval method, device, equipment and storage medium |
CN112328819B (en) * | 2020-11-07 | 2023-08-18 | 嘉兴智设信息科技有限公司 | Method for recommending similar pictures based on picture set |
CN113360698A (en) * | 2021-06-30 | 2021-09-07 | 北京海纳数聚科技有限公司 | Picture retrieval method based on image-text semantic transfer technology |
CN115129921B (en) * | 2022-06-30 | 2023-05-26 | 重庆紫光华山智安科技有限公司 | Picture retrieval method, apparatus, electronic device, and computer-readable storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004111931A2 (en) * | 2003-06-10 | 2004-12-23 | California Institute Of Technology | A system and method for attentional selection |
CN101211355A (en) * | 2006-12-30 | 2008-07-02 | 中国科学院计算技术研究所 | Image inquiry method based on clustering |
CN101859326A (en) * | 2010-06-09 | 2010-10-13 | 南京大学 | Image searching method |
CN103617217A (en) * | 2013-11-20 | 2014-03-05 | 中国科学院信息工程研究所 | Hierarchical index based image retrieval method and system |
Also Published As
Publication number | Publication date |
---|---|
CN104298713A (en) | 2015-01-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104298713B (en) | A kind of picture retrieval method based on fuzzy clustering | |
Kumar et al. | An efficient k-means clustering filtering algorithm using density based initial cluster centers | |
Kapoor et al. | Active learning with gaussian processes for object categorization | |
US9454580B2 (en) | Recommendation system with metric transformation | |
Hore et al. | A scalable framework for cluster ensembles | |
Chen et al. | Parallel spectral clustering in distributed systems | |
Bozas et al. | Large scale sketch based image retrieval using patch hashing | |
US20220058222A1 (en) | Method and apparatus of processing information, method and apparatus of recommending information, electronic device, and storage medium | |
CN109242002A (en) | High dimensional data classification method, device and terminal device | |
Celebi et al. | Linear, deterministic, and order-invariant initialization methods for the k-means clustering algorithm | |
WO2019120023A1 (en) | Gender prediction method and apparatus, storage medium and electronic device | |
CN110147455A (en) | A kind of face matching retrieval device and method | |
CN110119477A (en) | A kind of information-pushing method, device and storage medium | |
Yu et al. | A content-based goods image recommendation system | |
WO2019120007A1 (en) | Method and apparatus for predicting user gender, and electronic device | |
WO2015001416A1 (en) | Multi-dimensional data clustering | |
Huang et al. | Melody-join: Efficient earth mover's distance similarity joins using MapReduce | |
Alam et al. | A hybrid approach for web document clustering using K-means and artificial bee colony algorithm | |
Yang et al. | An effective detection of satellite image via K-means clustering on Hadoop system | |
Gabryel | A bag-of-features algorithm for applications using a NoSQL database | |
Ła̧giewka et al. | Distributed image retrieval with colour and keypoint features | |
An et al. | A K-means-based multi-prototype high-speed learning system with FPGA-implemented coprocessor for 1-NN searching | |
CN110209895B (en) | Vector retrieval method, device and equipment | |
CN111709473A (en) | Object feature clustering method and device | |
CN108268478A (en) | A kind of unbalanced dataset feature selection approach and device based on ur-CAIM algorithms |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |