CN109697451A - Similar image clustering method and device, storage medium, electronic equipment - Google Patents
Similar image clustering method and device, storage medium, electronic equipment
- Publication number: CN109697451A
- Application number: CN201710994492.4A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The present disclosure provides a similar image clustering method, a similar image clustering apparatus, a computer-readable storage medium, and an electronic device, and relates to the technical field of image processing. The method comprises: extracting image features of a plurality of images through a convolutional neural network model and performing hash mapping on the image features; clustering the image features by a strongly connected component clustering method to determine cluster categories; and providing a category identifier for each image feature and obtaining similar images from a database according to the category identifier. The disclosure can improve the efficiency of similar image clustering.
Description
Technical field
The present disclosure relates to the technical field of image processing, and in particular to a similar image clustering method, a similar image clustering apparatus, a computer-readable storage medium, and an electronic device.
Background
When images are stored in fields such as image retrieval, image copyright protection, and intelligent video analysis, identical or similar images are often stored repeatedly. To avoid this, similar images can be clustered so that they can be handled according to the clustering result.
In the related art, similar images are mostly clustered with the K-means clustering method, the density-based DBSCAN (Density-Based Spatial Clustering of Applications with Noise) method, or a hierarchical clustering method. K-means selects initial class centers at random from the samples, assigns the current sample to the class whose center has the smallest Euclidean distance to it, and then updates the center of that class to the mean of the feature vectors in the corresponding set. DBSCAN estimates the density at each point from the number of sample points in a small neighborhood around it, and uses that density to determine the class of the surrounding core points. Hierarchical clustering determines the number of classes either bottom-up, by treating every sample as its own class and merging classes pairwise, or top-down, by grouping all samples together at the top level and iteratively splitting each class.
These clustering methods may have the following problems. First, the number of clusters must be specified manually before the images are clustered, so accuracy is poor and efficiency is low. Second, when the number of samples is large or the feature vectors are high-dimensional and complex, a large amount of memory is occupied, which can exhaust the available memory.
It should be noted that the information disclosed in the Background section above is only intended to aid understanding of the background of the disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
An object of the disclosure is to provide a similar image clustering method, a similar image clustering apparatus, a computer-readable storage medium, and an electronic device, thereby overcoming, at least to some extent, one or more problems caused by the limitations and defects of the related art.
Other features and advantages of the disclosure will become apparent from the following detailed description, or may be learned in part through practice of the disclosure.
According to one aspect of the disclosure, a similar image clustering method is provided, comprising:
extracting image features of a plurality of images through a convolutional neural network model and performing hash mapping on the image features;
clustering the image features by a strongly connected component clustering method to determine cluster categories; and
providing a category identifier for each image feature and obtaining the similar images from a database according to the category identifier.
In an exemplary embodiment of the disclosure, performing hash mapping on the image features comprises:
converting the image features of each image into a binary code by a hash quantization encoding method.
In an exemplary embodiment of the disclosure, clustering the image features by the strongly connected component clustering method comprises:
calculating the similarity between the images from the binary codes and sorting the binary codes;
constructing a directed graph with the binary code of each image as a vertex; and
finding all strongly connected components in the directed graph to determine the number of cluster categories.
In an exemplary embodiment of the disclosure, calculating the similarity between the images from the binary codes and sorting the binary codes comprises:
calculating the Hamming distance between the binary code of each image and a query binary code to obtain the similarity; and
sorting the binary codes according to the Hamming distance.
In an exemplary embodiment of the disclosure, constructing a directed graph with the binary code of each image as a vertex comprises:
when the Hamming distance between a binary code and the query binary code satisfies a preset condition, establishing a directed edge from the query binary code to that binary code.
In an exemplary embodiment of the disclosure, finding all strongly connected components in the directed graph comprises:
when a preset data structure is used to determine that vertices belong to the same connected component, finding the strongly connected components in the directed graph with the Tarjan algorithm.
In an exemplary embodiment of the disclosure, the Hamming distance is negatively correlated with the similarity.
According to one aspect of the disclosure, a similar image clustering apparatus is provided, comprising:
a feature extraction module, configured to extract image features of a plurality of images through a convolutional neural network model and perform hash mapping on the image features;
a feature clustering module, configured to cluster the image features by a strongly connected component clustering method to determine cluster categories; and
an image acquisition module, configured to provide a category identifier for each image feature and obtain the similar images from a database according to the category identifier.
According to one aspect of the disclosure, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the similar image clustering method described in any one of the above.
According to one aspect of the disclosure, an electronic device is provided, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform, by executing the executable instructions, the similar image clustering method described in any one of the above.
In the similar image clustering method, similar image clustering apparatus, computer-readable storage medium, and electronic device provided in the exemplary embodiments of the disclosure, image features of a plurality of images are extracted through a convolutional neural network model and hash-mapped; the image features are clustered by a strongly connected component clustering method; and a category identifier is provided for each image feature so that similar images can be obtained from a database according to the category identifier. On the one hand, clustering the image features by the strongly connected component clustering method determines the number of cluster categories automatically, which avoids the manual choice of the number of clusters required in the related art and improves the accuracy and efficiency of clustering. On the other hand, extracting image features through a convolutional neural network model and hash-mapping them reduces the size of the image features and therefore the memory they occupy.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the disclosure.
Brief description of the drawings
The accompanying drawings are incorporated into and form part of this specification. They show embodiments consistent with the disclosure and, together with the specification, serve to explain its principles. The drawings described below are only some embodiments of the disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 schematically shows a similar image clustering method in an exemplary embodiment of the disclosure;
Fig. 2 schematically shows a detailed flowchart of a similar image clustering method in an exemplary embodiment of the disclosure;
Fig. 3 schematically shows a block diagram of a similar image clustering apparatus in an exemplary embodiment of the disclosure;
Fig. 4 schematically shows a block diagram of an electronic device in an exemplary embodiment of the disclosure;
Fig. 5 schematically shows a program product in an exemplary embodiment of the disclosure.
Detailed description
Example embodiments are now described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be understood as limited to the examples set forth here; rather, these embodiments are provided so that the disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are given to provide a full understanding of the embodiments of the disclosure. Those skilled in the art will appreciate, however, that the technical solution of the disclosure may be practiced with one or more of the specific details omitted, or with other methods, components, devices, steps, and so on. In other cases, well-known solutions are not shown or described in detail, so that they do not distract from the main points and obscure aspects of the disclosure.
In addition, the drawings are only schematic illustrations of the disclosure and are not necessarily drawn to scale. Identical reference numerals in the figures denote identical or similar parts, so their repeated description is omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
This example embodiment first provides a similar image clustering method, which can be applied in fields such as image retrieval, image copyright protection on e-commerce platforms, and intelligent video analysis. As shown in Fig. 1, the similar image clustering method may include the following steps:
Step S110. Extract image features of a plurality of images through a convolutional neural network model and perform hash mapping on the image features;
Step S120. Cluster the image features by a strongly connected component clustering method to determine cluster categories;
Step S130. Provide a category identifier for each image feature and obtain the similar images from a database according to the category identifier.
In the similar image clustering method provided in this example embodiment, on the one hand, clustering the image features by the strongly connected component clustering method determines the number of cluster categories automatically, which avoids the manual choice of the number of clusters required in the related art and improves the accuracy and efficiency of clustering. On the other hand, extracting image features through a convolutional neural network model and hash-mapping them reduces the size of the image features and therefore the memory they occupy.
Each step of the similar image clustering method in this example embodiment is explained in detail below.
First, in step S100 shown in Fig. 2, data may be input; the data may be, for example, an image sequence or a sequence of video frames.
Next, in step S110, image features of the images are extracted through a convolutional neural network model and hash mapping is performed on the image features.
When images are clustered or otherwise processed, the storage they require may be so large that all the image data cannot be held in memory at once. To solve this problem, semantic features, visual features, and the like can be extracted in step S111. Specifically, the image features of the images can be extracted through a convolutional neural network model so that each image is represented by its features, which reduces the memory occupied per image.
The basic structure of a convolutional neural network (CNN) model includes feature extraction layers and feature mapping layers. Because the feature detection layers of a CNN are learned from training data, using a CNN avoids explicit feature extraction; the features are learned implicitly from the training data. A convolutional neural network model can therefore reduce the complexity of data reconstruction during feature extraction and classification.
The image features here may include local features and global features. So that the image features express the global and local characteristics of an image more clearly and accurately, a sequence-based convolutional neural network model needs to be trained in advance. For example, a sample (X, Yp) can be taken from the sample set and X fed into the network to compute the corresponding actual output. The difference between the actual output and the corresponding ideal output Yp is then calculated, and the convolutional neural network model is trained by back-propagating the error and adjusting the weight matrices so as to minimize it.
The image features of all images are then extracted through the convolutional neural network model. The extraction process may include: inputting a three-channel color image with red (R), green (G), and blue (B) channels, passing it through the convolutional layers, bias layers, down-sampling layers, activation layers, and so on of the convolutional neural network model, and finally outputting a 1024-dimensional image feature. The image feature here is a floating-point feature.
After the floating-point feature of an image is obtained, a hash mapping operation can be performed on it to map the floating-point feature to a point in Hamming space. Specifically, performing hash mapping on the image features in step S110 may include step S112:
converting the image features of each image into a binary code by a hash quantization encoding method.
Hash quantization encoding represents each d-dimensional floating-point feature as a binary string of 0/1 bits, so that the distance between features can be approximated by the distance between binary strings.
Hash mapping first requires learning a hash mapping model, which maps the 1024-dimensional floating-point feature to a point in Hamming space. The point has 1024 dimensions, that is, it is a string of 1024 bits, each 0 or 1, and is called a binary code. Assuming there are N images, after feature extraction by the convolutional neural network model and the hash mapping operation, the N images are represented in memory by N binary codes.
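The conversion from a floating-point feature to a binary code can be sketched as follows. The text does not describe the learned hash mapping model, so this sketch substitutes random hyperplanes with sign thresholding as a hypothetical stand-in; `make_projection` and `to_binary_code` are illustrative names, and the dimensions are kept small for speed (the text uses 1024 input dimensions and 1024 output bits).

```python
import random

def make_projection(dim_in, n_bits, seed=0):
    """Hypothetical stand-in for a learned hash model: random Gaussian hyperplanes."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(n_bits)]

def to_binary_code(feature, projection):
    """Map a float feature vector to an integer whose bits form the binary code."""
    code = 0
    for row in projection:
        dot = sum(w * x for w, x in zip(row, feature))
        # One bit per hyperplane: 1 if the feature lies on its positive side.
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code

# Small sizes for illustration only; the text uses 1024 -> 1024 bits.
N_BITS = 32
projection = make_projection(dim_in=16, n_bits=N_BITS)
rng = random.Random(1)
feature = [rng.uniform(-1.0, 1.0) for _ in range(16)]
code = to_binary_code(feature, projection)
print(format(code, f"0{N_BITS}b"))
```

Storing each code as a single integer keeps later XOR and bit-count operations simple; a trained hash model would replace `make_projection` without changing the rest.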
For example, in step S110, the floating-point feature of an 800*600 three-channel color image can be obtained through the convolutional neural network model. The floating-point feature has 1024 dimensions and occupies 4096 bytes of memory; after hash mapping to a 1024-dimensional binary code, it occupies 1024 bits, that is, 128 bytes. In the conventional method, loading the data for 100 million images into memory requires at least 380 GB, whereas the binary-code features are stated to occupy one-eighth of the memory originally required (the per-image figures above, 128 bytes against 4096 bytes, correspond to a ratio of 1/32). The memory consumed by the image features is therefore reduced.
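The memory figures above can be checked with a short calculation. This is only a worked estimate and assumes 32-bit floats (4 bytes per dimension), which the text implies with 4096 bytes for 1024 dimensions but does not state outright.

```python
FLOAT_BYTES = 4            # assumption: each dimension is a 32-bit float
DIMS = 1024                # feature dimensions and binary-code bits, from the text
N_IMAGES = 100_000_000     # 100 million images, from the text

float_feature_bytes = DIMS * FLOAT_BYTES   # bytes per floating-point feature
binary_code_bytes = DIMS // 8              # bytes per 1024-bit binary code

total_float_gib = N_IMAGES * float_feature_bytes / 2 ** 30
total_binary_gib = N_IMAGES * binary_code_bytes / 2 ** 30

print(f"float features: {total_float_gib:.1f} GiB")
print(f"binary codes:   {total_binary_gib:.1f} GiB")
print(f"ratio:          {float_feature_bytes // binary_code_bytes}x")
```

Under these assumptions the floating-point features for 100 million images come to roughly the 380 GB quoted in the text, which suggests that figure refers to the floating-point features.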
In step S120, the image features are clustered by a strongly connected component clustering method to determine cluster categories.
If there is a reachable path between any two vertices of a directed graph, the graph is called a strongly connected graph. For example, in a directed graph G, if between two vertices vi and vj (vi > vj) there is a directed path from vi to vj and also a directed path from vj to vi, the two vertices are said to be strongly connected. If every two vertices of G are strongly connected, G is a strongly connected graph. A maximal strongly connected subgraph of a directed graph, that is, one that cannot be enlarged while remaining strongly connected, is called a strongly connected component.
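The definition of strong connectivity can be illustrated with a small reachability check. This is a minimal sketch, not part of the described method: `reachable` and `strongly_connected` are illustrative names, and the graph is a made-up example stored as an adjacency dictionary.

```python
def reachable(graph, start):
    """Return the set of vertices reachable from start by following directed edges."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in graph.get(v, []):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def strongly_connected(graph, u, v):
    """Two vertices are strongly connected if each can reach the other."""
    return v in reachable(graph, u) and u in reachable(graph, v)

# a -> b -> c -> a forms a cycle; d can be reached but has no path back.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
print(strongly_connected(graph, "a", "c"))
print(strongly_connected(graph, "a", "d"))
```

Here `a`, `b`, and `c` form one strongly connected component, while `d` forms a component on its own because no edge leads back from it.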
The binary-code features obtained from the hash mapping operation can be clustered by the strongly connected component clustering method. The principle of binary-code clustering based on strongly connected components is that the semantic similarity between images can be expressed by the connectivity between points in graph theory: images of the same category belong to the same strongly connected component.
Specifically, clustering the image features by the strongly connected component clustering method may include steps S121 to S123:
Step S121: calculate the similarity between the images from the binary codes and sort the binary codes;
Step S122: construct a directed graph with the binary code of each image as a vertex;
Step S123: find all strongly connected components in the directed graph to determine the number of cluster categories.
First, the similarity between the images is calculated from the binary codes in order to perform similarity retrieval, and the binary codes are sorted. Next, a directed graph is constructed with the binary code of each image as a vertex. Finally, all strongly connected components are found in the directed graph to determine the number of cluster categories.
Each step of the clustering method is described in detail below. Calculating the similarity between the images from the binary codes and sorting the binary codes may include:
calculating the Hamming distance between the binary code of each image and a query binary code to obtain the similarity; and
sorting the binary codes according to the Hamming distance.
In this example, the Hamming distance is the number of positions at which two strings of equal length differ. It can be obtained by XORing the two strings and counting the 1s in the result; the number of 1s is the Hamming distance. For example, the Hamming distance between 1011101 and 1001001 is 2, between 2143896 and 2233796 is 3, and between "toned" and "roses" is 3. Note that the Hamming distance between 1024-bit binary codes has a minimum of 0 and a maximum of 1024.
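The three examples above can be reproduced with a few lines of code. This is a minimal sketch for strings of any symbols; `hamming` is an illustrative name, not something defined in the text.

```python
def hamming(s, t):
    """Count the positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs equal-length inputs")
    return sum(a != b for a, b in zip(s, t))

for left, right in [("1011101", "1001001"), ("2143896", "2233796"), ("toned", "roses")]:
    print(left, right, hamming(left, right))
```

For binary codes stored as integers, the same count is obtained by XORing the two values and counting the set bits, which is the form used for the 1024-bit codes.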
The query binary code can be set according to user requirements; for example, it can be any one of the binary codes. The Hamming distance between the binary code of each image and the chosen query binary code is calculated, and this distance expresses the similarity between each binary code and the query binary code. Similarity and Hamming distance are negatively correlated. Taking 1024-bit binary codes as an example, a Hamming distance of 0 between the binary codes of two images means the similarity between them is at its maximum, and a Hamming distance of 1024 means it is at its minimum.
After the similarities have been calculated, all binary codes can be sorted by Hamming distance to improve the efficiency of the clustering operation. The Hamming distances can be sorted with a heap-based algorithm or another algorithm, and the process can be implemented in program code.
For example, for N images, the Hamming distance between two of their binary codes is the number of differing bits among the 1024 bits; for instance, the Hamming distance between 0101 and 1011 is 3. Computing the distances between all N images has time complexity O(N²), but because the binary codes use 0/1 encoding, each individual distance is cheap to compute, so efficiency is improved in this example.
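The all-pairs computation can be sketched as follows, assuming the binary codes are held as Python integers. `hamming_bits` and `pairwise_distances` are illustrative names; the three 4-bit codes are made-up inputs, the first two taken from the example in the text.

```python
from itertools import combinations

def hamming_bits(a, b):
    """Hamming distance between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")

def pairwise_distances(codes):
    """Compute the distance for every unordered pair: N*(N-1)/2 XOR-and-count steps."""
    return {
        (i, j): hamming_bits(codes[i], codes[j])
        for i, j in combinations(range(len(codes)), 2)
    }

codes = [0b0101, 0b1011, 0b0100]
dist = pairwise_distances(codes)
print(dist[(0, 1)])
```

The number of pairs still grows quadratically with N; the saving comes from each comparison being a bitwise operation rather than a floating-point distance over 1024 dimensions.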
Specifically, the Hamming distance can be calculated with the hardware instruction __popcnt available on CPUs and GPUs. Assuming a and b are binary codes of length 1024, the Hamming distance d_h(a, b) between them can be calculated by the formula d_h(a, b) = __popcnt(a ⊕ b), where ⊕ denotes bitwise XOR.
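A hardware population-count instruction works on fixed-width machine words, so a 1024-bit code would typically be processed as sixteen 64-bit words. The sketch below mirrors that word-by-word structure in plain Python; `popcount64` stands in for the hardware instruction, and the 64-bit word size is an assumption rather than something stated in the text.

```python
import random

WORD_BITS = 64                      # assumption: 64-bit machine words
MASK = (1 << WORD_BITS) - 1

def popcount64(x):
    """Software stand-in for a hardware popcount on one 64-bit word."""
    return bin(x & MASK).count("1")

def hamming_1024(a, b, n_bits=1024):
    """d_h(a, b) = popcount(a XOR b), accumulated one word at a time."""
    x = a ^ b
    total = 0
    for _ in range(n_bits // WORD_BITS):
        total += popcount64(x)
        x >>= WORD_BITS
    return total

rng = random.Random(0)
a = rng.getrandbits(1024)
b = rng.getrandbits(1024)
print(hamming_1024(a, b))
```

In compiled code, each loop iteration would reduce to one XOR and one popcount instruction, which is why comparing binary codes is far cheaper than comparing floating-point vectors.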
After the Hamming distances between the N images have been calculated, the distances can be sorted in ascending order. Specifically, with a as the query binary code, calculating the Hamming distance between a and the other N binary codes in memory yields the K binary codes most similar to a, as well as all binary codes whose distance is less than H. The parameters K and H are controllable when the directed graph is constructed and can be set according to actual needs. Adjusting these two parameters controls the total number of cluster categories finally obtained. In general, increasing K and H reduces the total number of categories, and decreasing K and H increases it.
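The two selections described above, the K nearest codes and all codes closer than H, can be sketched with a heap-based selection from the standard library. `nearest_codes` is an illustrative name and the 4-bit codes are made-up inputs.

```python
import heapq

def nearest_codes(query_idx, codes, k, h):
    """Return (top-K nearest, all within distance < H) as (distance, index) pairs."""
    query = codes[query_idx]
    scored = [
        (bin(query ^ code).count("1"), i)
        for i, code in enumerate(codes)
        if i != query_idx
    ]
    top_k = heapq.nsmallest(k, scored)          # heap-based partial sort
    within_h = sorted(pair for pair in scored if pair[0] < h)
    return top_k, within_h

codes = [0b0000, 0b0001, 0b0011, 0b1111, 0b1110]
top_k, within_h = nearest_codes(0, codes, k=2, h=3)
print(top_k)
print(within_h)
```

Using `heapq.nsmallest` avoids fully sorting all N distances when only the K smallest are needed, which matters when N is large and K is small.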
In addition, constructing a directed graph with the binary code of each image as a vertex may include:
when the Hamming distance between a binary code and the query binary code satisfies a preset condition, establishing a directed edge from the query binary code to that binary code.
When the directed graph is constructed, it is first determined whether the Hamming distance between each binary code and the query binary code satisfies the preset condition. The preset condition may be that the binary code ranks within the top K of the sorted order.
If the Hamming distance between a binary code and the query binary code ranks within the top K, a directed edge can be established, pointing from the query binary code to that binary code. All binary codes in memory can be traversed in a loop using the above steps to build the directed graph.
For example, for the query binary code a, if the distance between binary code b and a ranks within the top K, a directed edge from vertex a to vertex b is established. All binary codes in memory are traversed, and a directed edge is added for each binary code that ranks within the top K. Repeating this for every query binary code builds the directed graph.
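Graph construction under the top-K condition can be sketched as follows, treating each binary code in turn as the query. `build_knn_digraph` is an illustrative name, the graph is stored as an adjacency dictionary keyed by image index, and ties in distance are broken by index, which is a choice made for this sketch.

```python
def build_knn_digraph(codes, k):
    """Add an edge a -> b whenever b is among the K codes nearest to query a."""
    graph = {i: [] for i in range(len(codes))}
    for a in range(len(codes)):
        ranked = sorted(
            (bin(codes[a] ^ codes[b]).count("1"), b)
            for b in range(len(codes))
            if b != a
        )
        for _, b in ranked[:k]:
            graph[a].append(b)
    return graph

codes = [0b0000, 0b0001, 0b1111, 0b1110]
graph = build_knn_digraph(codes, k=1)
print(graph)
```

An edge from a to b does not imply an edge from b to a, which is why the graph is directed: two images end up in the same cluster only if paths exist in both directions.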
In addition, finding all strongly connected components in the directed graph may include:
when a preset data structure is used to determine that vertices belong to the same connected component, finding the strongly connected components in the directed graph with the Tarjan algorithm.
In this example, the preset data structure may be a union-find (disjoint-set) data structure. When the strongly connected components of the directed graph are searched for, the union-find structure can be used to determine whether two vertices belong to the same connected component, and the Tarjan algorithm can then be used to find all strongly connected components. The number of strongly connected components is the total number of cluster categories. The Tarjan algorithm is based on depth-first search of the graph, and each strongly connected component is a subtree of the search tree. During the search, unprocessed nodes of the current search tree are pushed onto a stack; on backtracking, it can be determined whether the nodes from the top of the stack down to a given node form a strongly connected component. The process of finding the strongly connected components of the directed graph with the Tarjan algorithm can be implemented in pseudocode, Pascal, C++, or a program written in another language.
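The Tarjan search described above can be sketched as follows. This is a textbook recursive formulation rather than the patent's own code; `tarjan_scc` is an illustrative name, and for very large graphs an iterative version would be needed to stay within Python's recursion limit.

```python
def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph.

    graph maps each vertex to a list of its successor vertices.
    """
    index = {}        # discovery order of each visited vertex
    low = {}          # smallest discovery index reachable from the vertex
    stack = []        # vertices of the current search not yet assigned
    on_stack = set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop the stack down to v.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(sorted(component))

    for v in graph:
        if v not in index:
            visit(v)
    return components

graph = {0: [1], 1: [0], 2: [3], 3: [2], 4: [0]}
components = tarjan_scc(graph)
print(len(components), components)
```

The number of returned components is the number of cluster categories. Vertex 4 points at vertex 0 but nothing points back, so it forms a category of its own.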
Because the number of strongly connected components is determined automatically by the parameter K in the above steps, and that number is the total number of cluster categories, the manual determination of the number of categories required in the related art is avoided. This improves clustering efficiency and also avoids errors introduced by manual operation, which improves accuracy.
Next, in step S130, a category identifier is provided for each image feature and the similar images are obtained from a database according to the category identifier.
After the image features have been clustered by strongly connected components in step S120 to determine the cluster categories, a category identifier can be provided for the cluster category of each image feature to label the image. The category identifier is a globally unique identifier and may be, for example, a number or another kind of identifier; a number is used as an example here. Specifically, after all strongly connected components in the directed graph have been found, the images corresponding to all vertices in the same component are given the same number, which may have, for example, 11 digits.
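Numbering the components can be sketched as follows. The text only says the number may have 11 digits; the zero-padded sequential scheme and the name `assign_category_ids` are assumptions made for this illustration.

```python
def assign_category_ids(components, width=11, start=1):
    """Give every vertex in a component the same zero-padded numeric identifier."""
    ids = {}
    for offset, component in enumerate(components):
        category_id = str(start + offset).zfill(width)
        for vertex in component:
            ids[vertex] = category_id
    return ids

components = [[0, 1], [2, 3], [4]]
category_ids = assign_category_ids(components)
print(category_ids)
```

Any scheme works as long as the identifier is unique per component and stable once written to the database.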
In this example, the strongly connected component an image belongs to can be located in the directed graph by its category identifier, so similar images can be found quickly among all the images in the database according to that identifier. The similar images are then used to perform the target task in step S140, which may be, for example, copyright protection or key-frame extraction. Looking up similar images by category identifier improves the efficiency of obtaining similar images.
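Lookup by category identifier can be sketched with an in-memory SQLite database. The text does not specify the database or its schema, so the table name `images`, its columns, and the helper `similar_images` are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per image with its assigned category identifier.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (path TEXT PRIMARY KEY, category_id TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_images_category ON images (category_id)")
conn.executemany(
    "INSERT INTO images (path, category_id) VALUES (?, ?)",
    [
        ("a.jpg", "00000000001"),
        ("b.jpg", "00000000001"),
        ("c.jpg", "00000000002"),
    ],
)

def similar_images(conn, path):
    """Return the other images that share the category identifier of `path`."""
    row = conn.execute(
        "SELECT category_id FROM images WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        return []
    rows = conn.execute(
        "SELECT path FROM images WHERE category_id = ? AND path != ? ORDER BY path",
        (row[0], path),
    )
    return [r[0] for r in rows]

print(similar_images(conn, "a.jpg"))
```

With an index on the identifier column, finding all images of a category is a single indexed query, so no distances need to be recomputed at retrieval time.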
The disclosure also provides a similar image clustering apparatus. As shown in Fig. 3, the similar image clustering apparatus 300 may include a feature extraction module 301, a feature clustering module 302, and an image acquisition module 303, where:
the feature extraction module 301 can be used to extract image features of a plurality of images through a convolutional neural network model and perform hash mapping on the image features;
the feature clustering module 302 can be used to cluster the image features by a strongly connected component clustering method to determine cluster categories;
the image acquisition module 303 can be used to provide a category identifier for each image feature and obtain the similar images from a database according to the category identifier.
The detail of each module is in corresponding similar image clustering method in above-mentioned similar image clustering apparatus
It is described in detail, therefore details are not described herein again.
It should be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the disclosure, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied in multiple modules or units.
In addition, although the steps of the method of the disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that order, or that all of the steps shown must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Through the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Accordingly, the technical solution according to the embodiments of the disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) or on a network, and which includes a number of instructions that cause a computing device (such as a personal computer, server, mobile terminal, or network device) to perform the method according to the embodiments of the disclosure.
In an exemplary embodiment of the disclosure, an electronic device capable of implementing the above method is also provided.
Those of ordinary skill in the art will understand that aspects of the invention may be implemented as a system, a method, or a program product. Accordingly, aspects of the invention may take the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be referred to collectively herein as a "circuit", "module", or "system".
The electronic device 600 according to this embodiment of the invention is described below with reference to Fig. 4. The electronic device 600 shown in Fig. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the invention.
As shown in Fig. 4, the electronic device 600 takes the form of a general-purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, and a bus 630 connecting the different system components (including the storage unit 620 and the processing unit 610).
The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps of the various exemplary embodiments of the invention described in the "Exemplary methods" section of this specification. For example, the processing unit 610 may perform the steps shown in Fig. 1: step S110, extracting image features of a plurality of images through a convolutional neural network model and performing hash mapping on the image features; step S120, clustering the image features by a strongly connected component clustering method to determine cluster categories; step S130, providing a category identifier for each image feature and obtaining the similar images from a database according to the category identifier.
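The hash mapping of step S110 and the distance comparison that step S120 relies on can be illustrated with a short sketch. It assumes a simple threshold quantization (one bit per feature dimension), which is only one possible reading of "hash quantization coding"; the disclosure does not fix the scheme here. The function names and the 4-dimensional sample vectors, which stand in for real convolutional neural network features, are hypothetical.

```python
def to_binary_code(feature, threshold=0.0):
    """Quantize a real-valued feature vector into an integer holding one bit per dimension."""
    code = 0
    for value in feature:
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming_distance(code_a, code_b):
    """Count the bit positions in which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


def rank_by_distance(query_code, codes):
    """Sort image names by Hamming distance to the query code, nearest first."""
    return sorted(codes, key=lambda name: hamming_distance(query_code, codes[name]))


# Hypothetical 4-dimensional features standing in for CNN outputs.
features = {
    "a.jpg": [0.9, -0.2, 0.4, -0.7],
    "b.jpg": [0.8, -0.1, 0.3, 0.5],
    "c.jpg": [-0.6, 0.7, -0.3, 0.2],
}
codes = {name: to_binary_code(vec) for name, vec in features.items()}
```

A smaller Hamming distance corresponds to a higher similarity, so ranking by distance in ascending order places the most similar images first.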
The storage unit 620 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 6201 and/or a cache storage unit 6202, and may further include a read-only memory (ROM) unit 6203.
The storage unit 620 may also include a program/utility 6204 having a set of (at least one) program modules 6205. Such program modules 6205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 630 may represent one or more of several types of bus structure, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 600 may also communicate with one or more external devices 700 (such as a keyboard, pointing device, or Bluetooth device), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (such as a router or modem) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 650. The electronic device 600 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 660. As shown, the network adapter 660 communicates with the other modules of the electronic device 600 through the bus 630. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
Through the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Accordingly, the technical solution according to the embodiments of the disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) or on a network, and which includes a number of instructions that cause a computing device (such as a personal computer, server, terminal device, or network device) to perform the method according to the embodiments of the disclosure.
In an exemplary embodiment of the disclosure, a computer-readable storage medium is also provided, on which is stored a program product capable of implementing the method described above in this specification. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps of the various exemplary embodiments of the invention described in the "Exemplary methods" section of this specification.
Referring to Fig. 5, a program product 800 for implementing the above method according to an embodiment of the invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. The program product of the invention is not limited to this, however. In this document, a readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in connection with an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
The program code for performing the operations of the invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In addition, the above drawings are only schematic illustrations of the processing included in the method according to the exemplary embodiments of the invention and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be performed, for example, synchronously or asynchronously in multiple modules.
Other embodiments of the disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed in the disclosure. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the disclosure indicated by the claims.
Claims (10)
1. A similar image clustering method, characterized by comprising:
extracting image features of a plurality of images through a convolutional neural network model and performing hash mapping on the image features;
clustering the image features by a strongly connected component clustering method to determine cluster categories;
providing a category identifier for each of the image features and obtaining the similar images from a database according to the category identifier.
2. The similar image clustering method according to claim 1, characterized in that performing hash mapping on the image features comprises:
converting the image features of each of the images into a binary code by a hash quantization coding method.
3. The similar image clustering method according to claim 2, characterized in that clustering the image features by the strongly connected component clustering method comprises:
calculating the similarity between the plurality of images from the binary codes and sorting the binary codes;
constructing a directed graph with the binary code of each of the images as a vertex;
searching for all strongly connected components in the directed graph to determine the number of cluster categories.
4. The similar image clustering method according to claim 3, characterized in that calculating the similarity between the plurality of images from the binary codes and sorting the binary codes comprises:
calculating the Hamming distance between the binary code corresponding to each of the images and a query binary code to obtain the similarity;
sorting the binary codes according to the Hamming distance.
5. The similar image clustering method according to claim 4, characterized in that constructing a directed graph with the binary code of each of the images as a vertex comprises:
when the Hamming distance between the binary code and the query binary code satisfies a preset condition, establishing a directed edge pointing from the query binary code to the binary code.
6. The similar image clustering method according to claim 3, characterized in that searching for all strongly connected components in the directed graph comprises:
when it is determined by means of a preset data structure that the vertices belong to the same connected component, using the Tarjan algorithm to find the strongly connected components in the directed graph.
7. The similar image clustering method according to claim 4, characterized in that the Hamming distance is negatively correlated with the similarity.
8. A similar image clustering device, characterized by comprising:
a feature extraction module for extracting image features of a plurality of images through a convolutional neural network model and performing hash mapping on the image features;
a feature clustering module for clustering the image features by a strongly connected component clustering method to determine cluster categories;
an image acquisition module for providing a category identifier for each of the image features and obtaining the similar images from a database according to the category identifier.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the similar image clustering method according to any one of claims 1 to 7.
10. An electronic device, characterized by comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the similar image clustering method according to any one of claims 1 to 7 by executing the executable instructions.
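The graph construction and component search recited in claims 3 to 6 can be sketched as follows. This is a hedged illustration, not the claimed implementation: the "preset condition" is assumed here to mean "among the k nearest codes and within max_distance bits of the query", each code is assumed to take a turn as the query, and the function names, parameter values, and sample 8-bit codes are all hypothetical. The component search is a standard recursive form of Tarjan's algorithm.

```python
def hamming_distance(code_a, code_b):
    """Count the bit positions in which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


def build_digraph(codes, k=1, max_distance=2):
    """Use each code in turn as the query and link it to its k nearest codes.

    Assumed preset condition: the target is among the k nearest
    neighbours of the query and within max_distance bits of it.
    """
    graph = {name: [] for name in codes}
    for query, query_code in codes.items():
        others = sorted(
            (hamming_distance(query_code, code), name)
            for name, code in codes.items()
            if name != query
        )
        for distance, name in others[:k]:
            if distance <= max_distance:
                graph[query].append(name)  # directed edge: query -> name
    return graph


def strongly_connected_components(graph):
    """Tarjan's algorithm: return the strongly connected components of graph."""
    index_of, lowlink, on_stack = {}, {}, set()
    stack, components = [], []

    def visit(vertex):
        index_of[vertex] = lowlink[vertex] = len(index_of)
        stack.append(vertex)
        on_stack.add(vertex)
        for successor in graph[vertex]:
            if successor not in index_of:
                visit(successor)
                lowlink[vertex] = min(lowlink[vertex], lowlink[successor])
            elif successor in on_stack:
                lowlink[vertex] = min(lowlink[vertex], index_of[successor])
        if lowlink[vertex] == index_of[vertex]:
            # vertex is the root of a component: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == vertex:
                    break
            components.append(sorted(component))

    for vertex in graph:
        if vertex not in index_of:
            visit(vertex)
    return components


# Hypothetical 8-bit binary codes for five images.
codes = {
    "a.jpg": 0b10100000,
    "b.jpg": 0b10100001,
    "c.jpg": 0b01011111,
    "d.jpg": 0b01011110,
    "e.jpg": 0b11110000,
}
graph = build_digraph(codes)
components = strongly_connected_components(graph)
```

In this sample, e.jpg has an edge to a.jpg but receives none in return, so it stays in a component of its own. Under the assumptions above, two images end up in the same cluster only when each is reachable from the other along directed edges.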
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710994492.4A CN109697451B (en) | 2017-10-23 | 2017-10-23 | Similar image clustering method and device, storage medium and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109697451A true CN109697451A (en) | 2019-04-30 |
CN109697451B CN109697451B (en) | 2022-01-07 |
Family
ID=66226804
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710994492.4A Active CN109697451B (en) | 2017-10-23 | 2017-10-23 | Similar image clustering method and device, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109697451B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110531335A (en) * | 2019-09-18 | 2019-12-03 | 哈尔滨工程大学 | A kind of low complex degree similitude clustering signal sorting method based on Union-find Sets |
CN110705341A (en) * | 2019-08-13 | 2020-01-17 | 平安科技(深圳)有限公司 | Verification method, device and storage medium based on finger vein image |
CN110991514A (en) * | 2019-11-27 | 2020-04-10 | 深圳市商汤科技有限公司 | Image clustering method and device, electronic equipment and storage medium |
CN111046929A (en) * | 2019-11-28 | 2020-04-21 | 北京金山云网络技术有限公司 | Method and device for analyzing model error cases and electronic equipment |
CN111062431A (en) * | 2019-12-12 | 2020-04-24 | Oppo广东移动通信有限公司 | Image clustering method, image clustering device, electronic device, and storage medium |
CN111460234A (en) * | 2020-03-26 | 2020-07-28 | 平安科技(深圳)有限公司 | Graph query method and device, electronic equipment and computer readable storage medium |
CN111597373A (en) * | 2020-05-19 | 2020-08-28 | 清华大学 | Image classification method based on convolutional neural network and connected graph and related equipment |
CN111860575A (en) * | 2020-06-05 | 2020-10-30 | 百度在线网络技术(北京)有限公司 | Method and device for processing article attribute information, electronic equipment and storage medium |
CN112559974A (en) * | 2020-11-13 | 2021-03-26 | 山东浪潮质量链科技有限公司 | Picture copyright protection method, equipment and medium based on block chain |
CN112580676A (en) * | 2019-09-29 | 2021-03-30 | 北京京东振世信息技术有限公司 | Clustering method, clustering device, computer readable medium and electronic device |
CN116018615A (en) * | 2020-09-08 | 2023-04-25 | 科磊股份有限公司 | Unsupervised pattern equivalence detection using image hashing |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7853930B2 (en) * | 2005-01-04 | 2010-12-14 | International Business Machines Corporation | Annotating graphs to allow quick loading and analysis of very large graphs |
CN101976348A (en) * | 2010-10-21 | 2011-02-16 | 中国科学院深圳先进技术研究院 | Image clustering method and system |
CN104063428A (en) * | 2014-06-09 | 2014-09-24 | 国家计算机网络与信息安全管理中心 | Method for detecting unexpected hot topics in Chinese microblogs |
AU2013263846A1 (en) * | 2013-11-29 | 2015-06-18 | Canon Kabushiki Kaisha | Hierarchical determination of metrics for component-based parameterized SoCos |
US20150213239A1 (en) * | 2007-02-23 | 2015-07-30 | Irdeto Canada Corporation | System and method of interlocking to protect software-mediated program and device behaviours |
CN105631210A (en) * | 2015-12-28 | 2016-06-01 | 南京邮电大学 | Directed digraph strongly-connected component analysis method based on MapReduce |
CN106202167A (en) * | 2016-06-21 | 2016-12-07 | 南开大学 | A kind of oriented label figure adaptive index construction method based on structural outline model |
CN106815362A (en) * | 2017-01-22 | 2017-06-09 | 福州大学 | One kind is based on KPCA multilist thumbnail Hash search methods |
CN106886599A (en) * | 2017-02-28 | 2017-06-23 | 北京京东尚科信息技术有限公司 | Image search method and device |
CN107169106A (en) * | 2017-05-18 | 2017-09-15 | 珠海习悦信息技术有限公司 | Video retrieval method, device, storage medium and processor |
CN107193942A (en) * | 2017-05-19 | 2017-09-22 | 西安邮电大学 | The rapid generation of all connected subgraphs in a kind of digraph |
Non-Patent Citations (4)
Title |
---|
JAVAD B. EBRAHIMI等: "Linear index coding via graph homomorphism", 《2014 INTERNATIONAL CONFERENCE ON CONTROL, DECISION AND INFORMATION TECHNOLOGIES (CODIT)》 * |
SCHWENK K等: "Connected Component Labeling algorithm for very complex and high-resolution images on an FPGA platform", 《SPIE REMOTE SENSING 2015》 * |
吴国榕: "基于神经影像的多尺度动态有向连接理论与算法研究", 《中国博士学位论文全文数据库 (医药卫生科技辑)》 * |
王振: "提升近邻检索性能的二值编码算法", 《中国优秀博硕士学位论文全文数据库(博士)信息科技辑》 * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110705341A (en) * | 2019-08-13 | 2020-01-17 | 平安科技(深圳)有限公司 | Verification method, device and storage medium based on finger vein image |
CN110531335A (en) * | 2019-09-18 | 2019-12-03 | 哈尔滨工程大学 | A kind of low complex degree similitude clustering signal sorting method based on Union-find Sets |
CN112580676B (en) * | 2019-09-29 | 2024-08-20 | 北京京东振世信息技术有限公司 | Clustering method, clustering device, computer readable medium and electronic equipment |
CN112580676A (en) * | 2019-09-29 | 2021-03-30 | 北京京东振世信息技术有限公司 | Clustering method, clustering device, computer readable medium and electronic device |
CN110991514A (en) * | 2019-11-27 | 2020-04-10 | 深圳市商汤科技有限公司 | Image clustering method and device, electronic equipment and storage medium |
CN110991514B (en) * | 2019-11-27 | 2024-05-17 | 深圳市商汤科技有限公司 | Image clustering method and device, electronic equipment and storage medium |
CN111046929A (en) * | 2019-11-28 | 2020-04-21 | 北京金山云网络技术有限公司 | Method and device for analyzing model error cases and electronic equipment |
CN111046929B (en) * | 2019-11-28 | 2023-09-26 | 北京金山云网络技术有限公司 | Analysis method and device for model error cases and electronic equipment |
CN111062431A (en) * | 2019-12-12 | 2020-04-24 | Oppo广东移动通信有限公司 | Image clustering method, image clustering device, electronic device, and storage medium |
CN111460234B (en) * | 2020-03-26 | 2023-06-09 | 平安科技(深圳)有限公司 | Graph query method, device, electronic equipment and computer readable storage medium |
CN111460234A (en) * | 2020-03-26 | 2020-07-28 | 平安科技(深圳)有限公司 | Graph query method and device, electronic equipment and computer readable storage medium |
CN111597373B (en) * | 2020-05-19 | 2023-06-20 | 清华大学 | Picture classifying method and related equipment based on convolutional neural network and connected graph |
CN111597373A (en) * | 2020-05-19 | 2020-08-28 | 清华大学 | Image classification method based on convolutional neural network and connected graph and related equipment |
CN111860575A (en) * | 2020-06-05 | 2020-10-30 | 百度在线网络技术(北京)有限公司 | Method and device for processing article attribute information, electronic equipment and storage medium |
CN116018615A (en) * | 2020-09-08 | 2023-04-25 | 科磊股份有限公司 | Unsupervised pattern equivalence detection using image hashing |
CN112559974A (en) * | 2020-11-13 | 2021-03-26 | 山东浪潮质量链科技有限公司 | Picture copyright protection method, equipment and medium based on block chain |
Also Published As
Publication number | Publication date |
---|---|
CN109697451B (en) | 2022-01-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109697451A (en) | Similar image clustering method and device, storage medium, electronic equipment | |
CN111291190B (en) | Training method of encoder, information detection method and related device | |
US7903883B2 (en) | Local bi-gram model for object recognition | |
Rouhani et al. | Semantic segmentation of 3D textured meshes for urban scene analysis | |
US20220301173A1 (en) | Method and system for graph-based panoptic segmentation | |
CN111931067A (en) | Interest point recommendation method, device, equipment and medium | |
CN116664719B (en) | Image redrawing model training method, image redrawing method and device | |
CN111339443A (en) | User label determination method and device, computer equipment and storage medium | |
Pei et al. | Unsupervised multimodal feature learning for semantic image segmentation | |
Song et al. | Boundary‐enhanced supervoxel segmentation for sparse outdoor LiDAR data | |
CN112000763A (en) | Method, device, equipment and medium for determining competition relationship of interest points | |
CN118451423A (en) | Optimal knowledge distillation scheme | |
CN115115914A (en) | Information identification method, device and computer readable storage medium | |
CN115905838A (en) | Audio-visual auxiliary fine-grained tactile signal reconstruction method | |
Wang et al. | Hierarchical space tiling for scene modeling | |
CN117592595A (en) | Method and device for building and predicting load prediction model of power distribution network | |
Borna et al. | An intelligent geospatial processing unit for image classification based on geographic vector agents (GVAs) | |
CN117132804A (en) | Hyperspectral image classification method based on causal cross-domain small sample learning | |
CN115168609A (en) | Text matching method and device, computer equipment and storage medium | |
Norelyaqine et al. | Architecture of Deep Convolutional Encoder‐Decoder Networks for Building Footprint Semantic Segmentation | |
Su et al. | Deep supervised hashing with hard example pairs optimization for image retrieval | |
CN113822291A (en) | Image processing method, device, equipment and storage medium | |
CN115600053A (en) | Navigation method and related equipment | |
Hu et al. | [Retracted] Footprint Extraction and Sports Dance Action Recognition Method Based on Artificial Intelligence Distributed Edge Computing | |
CN112685603A (en) | Efficient retrieval of top-level similarity representations |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||