CN107451200A - Retrieval method using a randomized quantization vocabulary tree, and image retrieval method based on it
- Publication number
- CN107451200A CN201710545225.9A
- Authority
- CN
- China
- Prior art keywords
- image
- cluster
- characteristic vector
- characteristic
- search method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5838—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Library & Information Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a retrieval method using a randomized quantization vocabulary tree and an image retrieval method based on it. The method comprises the following steps: (1) generate a nearest-neighbor search tree whose first level is a root node containing all feature vectors of the whole database, and branch downward; (2) at the second level, randomly select k points from the whole database as cluster centers, assign each feature vector to its nearest cluster center according to the selected similarity measure so that the whole database is divided into k subsets, and continue branching downward; (3) at the third level, for each of the k clusters obtained at the second level, randomly select k feature points from that cluster's feature vector pool as the cluster centers of the next level; (4) repeat. The image retrieval method of the invention overcomes the prior-art problem that building a vocabulary tree takes a large amount of time: the vocabulary tree can be built in a very short time, which meets real-time requirements.
Description
Technical field
The present invention relates to the field of image retrieval, and in particular to a retrieval method using a randomized quantization vocabulary tree and an image retrieval method based on it.
Background technology
In recent years, with the development and spread of digital technology, especially network technology, and with advances in the Internet of Things and in the hardware and software used for computerized information gathering, more and more data are being collected and stored. Data are collected far faster than traditional methods can process them, and this trend is becoming more pronounced. Facebook is the world's leading photo-sharing website: as of November 2013, about 350 million photos were uploaded every day, and the photos stored on Facebook alone had reached 250 PB. For digital video, YouTube's 2013 statistics show that more than 72 hours of video were uploaded every minute and that there were 4 billion video playback requests per day, and these figures are still growing substantially. Given such huge data resources and equally massive access demands, how to organize, manage, and retrieve large-scale databases effectively has become an urgent problem.
Traditional text-based image retrieval annotates images with keywords, turning image retrieval into keyword lookup. Its obvious drawback is that neither computer vision nor artificial intelligence techniques can annotate images with text automatically, so manual annotation is required. As the volume of data keeps growing, manual annotation falls far behind the growth of image data. Manual annotation is also subjective and imprecise: different people understand the same image differently, so there is no unified standard for annotating images. To overcome the limitations of text-based image retrieval, content-based image retrieval (CBIR) emerged in the 1990s. Unlike traditional retrieval methods, it incorporates image understanding techniques and provides a way to retrieve images effectively from large-capacity image databases according to a stated requirement.
The basic idea of a CBIR system is to analyze the visual features of an image and to retrieve in combination with context. It is implemented by using an image database to store and manage image data, and then embedding content-based retrieval technology into the image database as its engine, which provides the content-based retrieval function. Existing CBIR systems generally use low-level image information, including the color, texture, and shape of an image and the spatial relationships between them, to compute the similarity between the query image and a target image. Retrieval is then performed according to the size of the similarity, that is, the degree of matching between image features. Feature extraction is therefore used first to convert every image in the image library into a point in the image feature space, that is, a corresponding feature vector. Images are then retrieved according to their feature vectors, so that content-based image retrieval becomes retrieval of feature points in the image feature space.
When the image database is small, the most common image feature retrieval method is sequential scanning. However, as the means of obtaining information keep developing and the demand for information keeps growing, image databases are becoming larger and larger, and traditional sequential scanning can no longer meet users' requirements on retrieval time. Organizing the data effectively so that the search range shrinks quickly and retrieval speeds up, and thereby building an efficient indexing mechanism, is therefore the key to content-based retrieval.
In previous work, researchers have proposed many data indexing methods for specific application fields. When handling high-dimensional data, however, these methods all suffer from the "curse of dimensionality": as the data dimension increases, their retrieval performance degrades to that of a sequential scan, or even becomes worse. In CBIR research, the feature vectors extracted from original images are generally high-dimensional, so indexes on image feature data are inevitably affected by the curse of dimensionality. Nister and Stewenius proposed a retrieval method based on the vocabulary tree that retrieves well in high-dimensional spaces, but building the tree for a high-dimensional space takes a very long time, which makes it hard to meet the timeliness that modern databases require of retrieval. Building an efficient high-dimensional indexing mechanism suited to image feature data is therefore a major challenge for current image retrieval research.
Summary of the invention
The invention provides a retrieval method using a randomized quantization vocabulary tree and an image retrieval method based on it, with the aim of remedying the above drawbacks of the prior art.
To achieve this technical purpose, the invention adopts the following technical scheme.
The retrieval method using a randomized quantization vocabulary tree comprises the following steps:
(1) generate a nearest-neighbor search tree whose first level is a root node containing all feature vectors of the whole database, and branch downward;
(2) at the second level, randomly select k points from the whole database as cluster centers, then assign each feature vector to its nearest cluster center according to the selected similarity measure, so that the whole database is divided into k subsets, and continue branching downward;
(3) at the third level, for each of the k clusters obtained at the second level, randomly select k feature points from that cluster's feature vector pool as the cluster centers of the next level, then assign each feature vector to its nearest cluster center using the similarity measure, so that k² clusters are formed at the third level;
(4) repeat steps (2) and (3) until the feature vectors contained in every leaf node belong to the same class of object, or the number of feature vectors contained in a leaf node is below a certain limit; each feature vector has a class label associated with it.
In step (2), if a feature vector is equally distant from two or more cluster centers, one of those clusters is chosen at random.
In step (3), a new feature vector selected from the feature vector pool is assigned to its nearest cluster center. When it reaches a leaf node, if all feature vectors in that leaf node have the same class label, that class label is assigned to the new feature vector and the computation stops. Otherwise, a further search is made within the assigned cluster: the feature vector in the cluster with the shortest distance to the new feature vector is selected, its class label is assigned to the new feature vector, and the computation stops.
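The construction steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: squared Euclidean distance stands in for the "selected similarity measure", the node layout (a dict with `leaf`, `items`, `centers`, `children`) and the names `build_tree`, `k`, and `leaf_limit` are hypothetical, and the guard against non-shrinking splits is an addition for safe termination.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance; an assumed stand-in for the selected similarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_tree(items, k=3, leaf_limit=50, rng=None):
    """Recursively build a randomized quantization tree.

    items: list of (feature_vector, class_label) pairs.
    Leaves store their items; inner nodes store k centers and k children.
    """
    rng = rng or random.Random(0)
    labels = {label for _, label in items}
    # Stop when the node is pure or holds fewer vectors than the limit.
    if len(labels) <= 1 or len(items) < leaf_limit or len(items) <= k:
        return {"leaf": True, "items": items}
    # Pick k random data points as cluster centers (no iterative refinement).
    centers = [vec for vec, _ in rng.sample(items, k)]
    buckets = [[] for _ in range(k)]
    for vec, label in items:
        dists = [sq_dist(vec, c) for c in centers]
        best = min(dists)
        # A vector equally distant from several centers picks one of them at random.
        choice = rng.choice([i for i, d in enumerate(dists) if d == best])
        buckets[choice].append((vec, label))
    # Assumed guard (not in the source): stop if the split did not shrink the node.
    if max(len(b) for b in buckets) == len(items):
        return {"leaf": True, "items": items}
    children = [build_tree(b, k, leaf_limit, rng) for b in buckets]
    return {"leaf": False, "centers": centers, "children": children}
```

Because the centers are sampled rather than optimized, each node costs a single assignment pass over its vectors, which is the property the text credits for the short build time.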
An image retrieval method based on the retrieval method using a randomized quantization vocabulary tree comprises the following steps:
(1) first divide the image into a number of overlapping sub-regions by the overlapping partition method;
(2) combine the feature information blocks of the image with its semantic features. Since each extracted feature vector corresponds to a point in the feature space, unsupervised learning (that is, clustering in data mining) is applied to the feature points in the feature space, dividing all feature vectors in the feature vector database into multiple classes of patterns, so that patterns in the same class are more similar and patterns in different classes differ more. The feature points are marked with class labels, each of which carries specific semantic information; the image feature information blocks are thus combined with semantic features and the different regions of the image are interpreted, which establishes the image knowledge base;
(3) generate a nearest-neighbor search tree whose first level is a root node containing all image feature vectors in the image knowledge base, and branch downward;
(4) at the second level, randomly select k points from the image knowledge base as cluster centers, then assign each image feature vector to its nearest cluster center according to the selected similarity measure, so that the whole database is divided into k subsets, and continue branching downward;
(5) at the third level, for each of the k clusters obtained at the second level, randomly select k feature points from that cluster's feature vector pool as the cluster centers of the next level, then assign each image feature vector to its nearest cluster center using the similarity measure, so that k² clusters are formed at the third level;
(6) repeat steps (4) and (5) until the image feature vectors contained in every leaf node belong to the same class of object, or the number of image feature vectors contained in a leaf node is below a certain limit; each image feature vector has a class label associated with it.
In step (1), the overlapping partition method divides an image of size height × width with an N × N window that is shifted by Nhop pixels in the row and column directions, producing a number of overlapping sub-regions. So that sufficiently small objects in the image can be detected, the square window is made smaller and the number of patches is increased. Through the overlapping sub-regions, the color histogram is combined with the spatial distribution of color.
In step (2), when the image knowledge base is established, the Chameleon clustering algorithm and an MST-based clustering algorithm are used to cluster the color histogram feature vector database of the images, class labels are assigned to the clustering results, and a knowledge base based on color histogram features is established.
The above technical scheme has the following advantages:
(1) the image retrieval method of the invention overcomes the prior-art problem that building a vocabulary tree takes a large amount of time; the vocabulary tree can be built in a very short time, which meets real-time requirements;
(2) by using the overlapping partition method, the invention refines an image into multiple blocks and extracts the color histogram of each block to form the feature vector database, which effectively combines the color histogram of the image with color spatial information and overcomes the prior-art problem that the spatial characteristics of color are ignored during image feature extraction;
(3) the invention extracts image features more quickly, meeting real-time requirements, and applies unsupervised learning to the feature database, marking the different regions of a scene image with region labels to form a knowledge base.
Brief description of the drawings
Fig. 1 is a schematic diagram of the present invention;
Fig. 2 is a schematic diagram of the overlapping partition method of the present invention;
Fig. 3 shows the clustering result for the 10648-dimensional RGB histograms of the 21-14 group;
Fig. 4 shows the clustering result for the 5000-dimensional HSV histograms of the 24-16 group;
Fig. 5 shows the clustering result for the 5832-dimensional Opponent histograms of the 24-16 group;
Fig. 6 shows the clustering result for the 10648-dimensional Transformed histograms of the 21-14 group;
Fig. 7 compares RGB histogram accuracy for the 21-14 group;
Fig. 8 compares RGB histogram accuracy for the 24-16 group;
Fig. 9 compares RGB histogram accuracy for the 27-18 group.
Detailed description of the embodiments
The scheme is further described below with reference to the accompanying drawings and embodiments.
As shown in Fig. 1, the retrieval method using a randomized quantization vocabulary tree comprises the following steps:
(1) generate a nearest-neighbor search tree whose first level is a root node containing all feature vectors of the whole database, and branch downward;
(2) at the second level, randomly select k points from the whole database as cluster centers, then assign each feature vector to its nearest cluster center according to the selected similarity measure, so that the whole database is divided into k subsets, and continue branching downward;
(3) at the third level, for each of the k clusters obtained at the second level, randomly select k feature points from that cluster's feature vector pool as the cluster centers of the next level, then assign each feature vector to its nearest cluster center using the similarity measure, so that k² clusters are formed at the third level;
(4) repeat steps (2) and (3) until the feature vectors contained in every leaf node belong to the same class of object, or the number of feature vectors contained in a leaf node is below a certain limit; each feature vector has a class label associated with it.
In step (2), if a feature vector is equally distant from two or more cluster centers, one of those clusters is chosen at random.
In step (3), a new feature vector selected from the feature vector pool is assigned to its nearest cluster center. When it reaches a leaf node, if all feature vectors in that leaf node have the same class label, that class label is assigned to the new feature vector and the computation stops. Otherwise, a further search is made within the assigned cluster: the feature vector in the cluster with the shortest distance to the new feature vector is selected, its class label is assigned to the new feature vector, and the computation stops.
For large databases, the randomized quantization tree retrieval method follows the idea of the vocabulary tree but makes one important improvement to it: it generates a nearest-neighbor search tree. As shown in Fig. 1, for a given database the first level has only one node, which contains all feature vectors and is the root node. At the second level, K points are randomly selected from the whole database as cluster centers, and each feature vector is then assigned to its closest cluster center according to the selected similarity measure, dividing the whole database into K subsets. At the third level, for each of the K clusters obtained at the second level, K feature points are randomly selected from that cluster's feature vector pool as the cluster centers of the next level, and each feature vector is assigned to its nearest cluster center using the similarity measure, forming K² clusters at the third level. This process continues until the feature vectors contained in every leaf node belong to the same class of object (that is, the node is pure) or the number of feature vectors contained in a leaf node is below a certain limit (for example, 50). Each feature vector has a class label associated with it.
As the tree branches, the nearest-neighbor search tree allows the distance from each data item to the other data items to be updated to a smaller value. This strategy makes it more likely that the data points closest to each other in space are assigned to the same partition. Note, however, that every data point in a partition is closer to its own cluster center than to the centers of the other partitions, although that center is not necessarily its nearest neighbor. If a data point is equally distant from the centers of two or more clusters, it chooses one of those clusters at random.
A given new feature vector is looked up through the randomized quantization tree. The new feature vector follows one particular path down the tree: at each level, its distance to the K cluster centers is computed, and it moves to the cluster whose center is closest among the K. When a leaf node is reached, if the leaf node is pure (that is, all feature vectors in the leaf node have the same class label), that class label is assigned to the new feature vector and the computation stops. Otherwise, a nearest-neighbor search is performed among the vectors of that cluster; the result is the feature vector at the shortest distance under the selected similarity measure, and the class label associated with that feature vector is assigned to the new feature vector, after which the computation stops.
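The lookup just described can be sketched as follows, assuming squared Euclidean distance as the similarity measure and a hypothetical dict-based node layout (`leaf`, `items`, `centers`, `children`); the small tree and its labels are made-up illustration data.

```python
def sq_dist(a, b):
    """Squared Euclidean distance; an assumed stand-in for the selected similarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(node, query):
    """Descend to a leaf, then return a class label for the query vector."""
    while not node["leaf"]:
        # Follow the child whose cluster center is closest to the query.
        dists = [sq_dist(query, c) for c in node["centers"]]
        node = node["children"][dists.index(min(dists))]
    if not node["items"]:
        return None  # empty leaf: nothing to compare against
    labels = {label for _, label in node["items"]}
    if len(labels) == 1:
        # Pure leaf: every stored vector shares one label.
        return next(iter(labels))
    # Impure leaf: exact nearest-neighbor search restricted to this leaf.
    _, label = min(node["items"], key=lambda item: sq_dist(query, item[0]))
    return label

# Hand-built two-leaf tree (hypothetical data): one pure leaf, one impure leaf.
tree = {
    "leaf": False,
    "centers": [[0.0, 0.0], [10.0, 10.0]],
    "children": [
        {"leaf": True, "items": [([0.0, 0.0], "sky"), ([1.0, 0.0], "sky")]},
        {"leaf": True, "items": [([10.0, 10.0], "road"), ([12.0, 12.0], "grass")]},
    ],
}
print(classify(tree, [0.5, 0.2]))    # sky   (pure leaf)
print(classify(tree, [11.8, 11.9]))  # grass (nearest stored vector in an impure leaf)
```

Only one path is followed per query, so the exact distance comparison is limited to the vectors stored in a single leaf.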
An image retrieval method based on the retrieval method using a randomized quantization vocabulary tree comprises the following steps:
(1) first divide the image into a number of overlapping sub-regions by the overlapping partition method;
(2) combine the feature information blocks of the image with its semantic features. Since each extracted feature vector corresponds to a point in the feature space, unsupervised learning (that is, clustering in data mining) is applied to the feature points in the feature space, dividing all feature vectors in the feature vector database into multiple classes of patterns, so that patterns in the same class are more similar and patterns in different classes differ more. The feature points are marked with class labels, each of which carries specific semantic information; the image feature information blocks are thus combined with semantic features and the different regions of the image are interpreted, which establishes the image knowledge base;
(3) generate a nearest-neighbor search tree whose first level is a root node containing all image feature vectors in the image knowledge base, and branch downward;
(4) at the second level, randomly select k points from the image knowledge base as cluster centers, then assign each image feature vector to its nearest cluster center according to the selected similarity measure, so that the whole database is divided into k subsets, and continue branching downward;
(5) at the third level, for each of the k clusters obtained at the second level, randomly select k feature points from that cluster's feature vector pool as the cluster centers of the next level, then assign each image feature vector to its nearest cluster center using the similarity measure, so that k² clusters are formed at the third level;
(6) repeat steps (4) and (5) until the image feature vectors contained in every leaf node belong to the same class of object, or the number of image feature vectors contained in a leaf node is below a certain limit; each image feature vector has a class label associated with it.
As shown in Fig. 2, in step (1) the overlapping partition method divides an image of size height × width with an N × N window that is shifted by Nhop pixels in the row and column directions, producing a number of overlapping sub-regions. So that sufficiently small objects in the image can be detected, the square window is made smaller and the number of patches is increased. Through the overlapping sub-regions, the color histogram is combined with the spatial distribution of color. The overlapping sub-regions divide an image of this size into multiple square patches:
Number of rows: blockrows = (height - N) / Nhop + 1
Number of columns: blockcols = (width - N) / Nhop + 1
Number of square patches: NumofSamples = blockrows × blockcols
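As a quick check of these formulas, the sketch below evaluates them for the 720 × 1280 images and the three window/shift settings used in the experiments. The function name `patch_grid` is hypothetical, and integer (floor) division is an assumption; it is the reading that reproduces the block counts of 4500, 3476, and 2730 reported in the experiments.

```python
def patch_grid(height, width, n, nhop):
    """Rows, columns, and total count of n x n windows shifted by nhop pixels."""
    blockrows = (height - n) // nhop + 1  # floor division assumed
    blockcols = (width - n) // nhop + 1
    return blockrows, blockcols, blockrows * blockcols

for n, nhop in [(21, 14), (24, 16), (27, 18)]:
    print(f"{n}-{nhop}:", patch_grid(720, 1280, n, nhop))
# 21-14: (50, 90, 4500)
# 24-16: (44, 79, 3476)
# 27-18: (39, 70, 2730)
```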
The image size is height × width pixels, and the size of the processed image is determined by the size of the patch window. As the patch window size changes, the number of square patches produced also changes: when the patch window shrinks, the number of square patches produced from the image increases, and vice versa. Each pixel of the processed image is represented by the color histogram of one square patch, so the size of the square window directly determines the resolution of the processed image. This limits how large the patch window can be, so a larger number of patches is needed to represent the whole image, which means that the number of patches per image increases.
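The following sketch shows one way each overlapping patch could be turned into a color histogram. It is an illustration under assumptions: the image is a plain nested list of 8-bit RGB tuples, the histogram is a joint histogram with `bins` levels per channel (so `bins ** 3` dimensions), and the name `patch_histograms` is hypothetical. The joint-histogram reading is an inference from the dimensions listed in the experiments (64, 125, 216, 512, 1000, 2744, 5832, 10648), which are all perfect cubes; the source does not state the binning scheme.

```python
def patch_histograms(image, n, nhop, bins=4):
    """Return a grid of joint RGB histograms, one per overlapping n x n patch.

    image: list of rows, each a list of (r, g, b) tuples with values 0..255.
    Each histogram has bins**3 counts.
    """
    height, width = len(image), len(image[0])
    rows = (height - n) // nhop + 1
    cols = (width - n) // nhop + 1
    grid = []
    for br in range(rows):
        row_hists = []
        for bc in range(cols):
            hist = [0] * bins ** 3
            # Visit every pixel inside the window whose top-left corner is (br*nhop, bc*nhop).
            for y in range(br * nhop, br * nhop + n):
                for x in range(bc * nhop, bc * nhop + n):
                    r, g, b = image[y][x]
                    # Quantize each channel to 0..bins-1 and combine into one index.
                    idx = ((r * bins // 256) * bins + g * bins // 256) * bins + b * bins // 256
                    hist[idx] += 1
            row_hists.append(hist)
        grid.append(row_hists)
    return grid
```

The returned grid has one histogram per window position, which matches the description of the processed image as having one "pixel" per square patch.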
Simply partitioning an image into blocks and extracting their color histograms merely divides the image into blocks that carry no semantic information. Combining the feature information blocks of an image with its semantic features is the purpose of establishing the image knowledge base. Since each extracted feature vector corresponds to a point in the feature space, unsupervised learning (that is, clustering in data mining) is applied to the feature points in the feature space, dividing all feature vectors in the feature vector database into multiple classes of patterns, so that patterns in the same class are more similar and patterns in different classes differ more. The feature points are marked with class labels, each of which carries specific semantic information, so that the image feature information blocks are combined with semantic features and the different regions of the image are interpreted.
In step (2), when the image knowledge base is established, the Chameleon clustering algorithm and an MST-based clustering algorithm are used to cluster the color histogram feature vector database of the images, class labels are assigned to the clustering results, and a knowledge base based on color histogram features is established.
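The source does not say which MST-based clustering variant is used. As one common possibility, the sketch below runs Kruskal's algorithm on the complete distance graph and stops when `k` components remain, which is equivalent to removing the k - 1 longest edges of the minimum spanning tree. The name `mst_clusters`, the choice of `k` as the stopping rule, and squared Euclidean distance are all assumptions; Chameleon is not shown.

```python
def mst_clusters(points, k):
    """Cluster points into k groups by merging along the shortest edges (Kruskal)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first (quadratic; fine for a small sketch).
    edges = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    components = n
    for _, i, j in edges:
        if components <= k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [50, 50]]
print(mst_clusters(points, 3))  # [0, 0, 0, 1, 1, 2]
```

The integer cluster ids returned here play the role of the class labels that the text attaches to clustering results.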
The advantages of the invention are further demonstrated below by experiment.
Clustering of image feature data
The feature vector database is clustered with the Chameleon clustering algorithm and with the MST-based clustering algorithm. The objects in the images are divided into multiple clusters, and the clusters are labeled to form the knowledge base. The clustered database is displayed as color images. Comparing the original image set, the MST-based clustering result images, and the Chameleon clustering result images verifies the feasibility of the image feature extraction method described here.
Because the image data set contains many images, only the clustering results for some of the images in the database are shown. Because there are several groups of feature databases, one group of clustering results is shown for each of the RGB, HSV, Opponent, and Transformed spaces.
Table 1: Main scenes and their semantic relations in the RGB knowledge base built from RGB color histograms
Fig. 3 shows the clustering result for the 10648-dimensional RGB histograms of the 21-14 group.
Table 2: Main scenes and their semantic relations in the HSV knowledge base built from HSV color histograms
Fig. 4 shows the clustering result for the 5000-dimensional HSV histograms of the 24-16 group.
Table 3: Main scenes and their semantic relations in the Opponent knowledge base built from Opponent color histograms
Fig. 5 shows the clustering result for the 5832-dimensional Opponent histograms of the 24-16 group.
Table 4: Main scenes and their semantic relations in the Transformed knowledge base built from Transformed color histograms
Fig. 6 shows the clustering result for the 10648-dimensional Transformed histograms of the 21-14 group.
Feature extraction speed
The images are 720 × 1280 pixels. With a 21 × 21 patch window and a shift of 14 pixels (denoted 21-14), the processed image is 50 × 90 pixels. With a 24 × 24 patch window and a shift of 16 pixels (denoted 24-16), the processed image is 44 × 79 pixels. With a 27 × 27 patch window and a shift of 18 pixels (denoted 27-18), the processed image is 39 × 70 pixels.
With this finer overlapping partition, the image is divided into 4500, 3476, and 2730 blocks respectively, and the color histogram of each block is extracted. Conventional image blocking mainly extracts the grayscale histogram or the HSV color histogram of an image. Here, the RGB, HSV, Opponent, and Transformed color histograms of the image are extracted to build the image feature vector database.
In the experiment, for 35 images of 720 × 1280 pixels, the three overlapping partition settings 21-14, 24-16, and 27-18 are each used to extract the 10000-dimensional RGB color histogram, the 10648-dimensional HSV color histogram, the 10000-dimensional Opponent color histogram, the 10000-dimensional Transformed color histogram, and the Gabor texture features of the image, and the SIFT feature points of the same images are also extracted.
The extraction time of each kind of feature is measured over the 35 images; the average time per image is shown in the table below.
Table 5
Table 5 shows that, with the same image partition method, extracting even a high-dimensional color histogram is clearly faster than extracting Gabor texture features. Extracting SIFT features from the whole image also takes much longer than extracting the color histograms after partitioning. The feature extraction approach used here can therefore extract image features quickly.
Accuracy
Nearest-neighbor retrieval assigns the query object to the same class as its nearest neighbor, and its retrieval accuracy is the highest. The nearest-neighbor results are therefore used as the reference result set. The retrieval results of the vocabulary tree and of the randomized quantization tree are compared with the nearest-neighbor retrieval results to analyze the retrieval accuracy of the two trees.
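One plausible reading of this comparison is sketched below: an exact nearest-neighbor label is computed by sequential scan for each query, and accuracy is the fraction of queries for which a tree returns the same label. The names `nearest_label` and `agreement_rate`, and this precise definition of accuracy, are assumptions; the source does not give the formula.

```python
def nearest_label(database, query):
    """Exact nearest neighbor by sequential scan; database holds (vector, label) pairs."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(database, key=lambda item: sq_dist(query, item[0]))[1]

def agreement_rate(reference_labels, tree_labels):
    """Fraction of queries where the tree result matches the nearest-neighbor reference."""
    matches = sum(r == t for r, t in zip(reference_labels, tree_labels))
    return matches / len(reference_labels)

# Hypothetical example: four queries, the tree disagrees with the reference once.
print(agreement_rate(["a", "b", "a", "b"], ["a", "b", "b", "b"]))  # 0.75
```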
RGB histogram accuracy comparison
First, retrieval is run in the RGB color space for the different groups (21-14, 24-16, 27-18) and multiple dimensions (64, 125, 216, 512, 1000, 2744, 5832, and 10648), 24 training sets in total. The comparison results are shown in Table 2, Table 3, and Table 4, where KQtree is the randomized quantization vocabulary tree and VTree is the traditional vocabulary tree.
RGB histogram accuracy comparison results: Fig. 7, Fig. 8, and Fig. 9 show that the accuracy of the randomized quantization tree is clearly higher than that of the vocabulary tree.
In Fig. 8, the accuracy of the randomized quantization tree is highest at 1000 dimensions, reaching 83.03%, and high-dimensional retrieval results are better than low-dimensional ones. In Fig. 9, the accuracy of the randomized quantization tree is highest at 5832 dimensions, reaching 86.73%, which again shows that high-dimensional retrieval results are better than low-dimensional ones. In Fig. 9, the accuracy of the randomized quantization tree is highest at 512 dimensions, reaching 85.49%; high-dimensional retrieval is better than low-dimensional retrieval, but not markedly so, and the middle dimensions retrieve best.
Comparing Fig. 7, Fig. 8, and Fig. 9 together, we find that RGB color histograms generally retrieve better in
high-dimensional data spaces than in low-dimensional ones. As for the square window size, a smaller window yields more
square patches and richer image information, but more patches do not necessarily give higher retrieval accuracy:
different dimensionalities behave differently, and no uniform rule emerges.
Vocabulary tree construction time
The running times of the random quantization tree and the vocabulary tree on RGB histograms, for the three window-shift
groups (21-14, 24-16, 27-18) and eight dimensionalities (64, 125, 216, 512, 1000, 2744, 5832, and 10648 dimensions), are
shown in Table 6, Table 7, and Table 8. The unit is seconds; KQtree is the random quantization vocabulary tree and VTree
is the traditional vocabulary tree.
Table 6. Window 21 × 21, shift 14: running time of the random quantization vocabulary tree and the vocabulary tree for different RGB histogram dimensions
Table 7. Window 24 × 24, shift 16: running time of the random quantization vocabulary tree and the vocabulary tree for different RGB histogram dimensions
Table 8. Window 27 × 27, shift 18: running time of the random quantization vocabulary tree and the vocabulary tree for different RGB histogram dimensions
On RGB histograms, the random quantization tree runs clearly faster than the vocabulary tree. As the RGB histogram
dimension increases, the running time of the vocabulary tree grows to geometric multiples of that of the random
quantization tree.
The running times of the random quantization vocabulary tree and the vocabulary tree on Opponent histograms, for the
three window-shift groups (21-14, 24-16, 27-18) and eight dimensionalities (64, 125, 216, 512, 1000, 2744, 5832, and
10648 dimensions), are shown in Table 9, Table 10, and Table 11. The unit is seconds.
Table 9. Window 21 × 21, shift 14: running time of the random quantization vocabulary tree and the vocabulary tree for different Opponent histogram dimensions
Table 10. Window 24 × 24, shift 16: running time of the random quantization vocabulary tree and the vocabulary tree for different Opponent histogram dimensions
Table 11. Window 27 × 27, shift 18: running time of the random quantization vocabulary tree and the vocabulary tree for different Opponent histogram dimensions
On Opponent histograms, the random quantization tree runs clearly faster than the vocabulary tree. As the Opponent
histogram dimension increases, the running time of the vocabulary tree grows to geometric multiples of that of the
random quantization tree.
In summary, the retrieval method using the random quantization vocabulary tree is clearly superior to the vocabulary
tree in time efficiency.
The above is only the preferred embodiment of the present invention. It should be noted that those of ordinary skill in
the art may make improvements and additions without departing from the principles of the invention, and such improvements
and additions shall also be regarded as falling within the protection scope of the present invention.
Claims (6)
1. A retrieval method using a random quantization vocabulary tree, characterized by comprising the following steps:
(1) generating a nearest-neighbor search tree, taking all feature vectors of the whole database as the root node at the first level, and splitting downward;
(2) at the second level, randomly selecting k points from the whole database as cluster centers, then assigning each feature vector to its nearest cluster center according to the selected similarity measure, so that the whole database is divided into k subsets, and continuing to split downward;
(3) at the third level, for each of the k clusters obtained at the second level, randomly selecting k feature points from its feature vector pool as the cluster centers of the next level, then assigning each feature vector to its nearest cluster center using the similarity measure, thereby forming k² clusters at the third level;
(4) repeating steps (2) and (3) until the feature vectors contained in every leaf node belong to the same class of object or the number of feature vectors contained in a leaf node is below a certain limit; wherein each feature vector has a class label associated with it.
2. The retrieval method using a random quantization vocabulary tree as claimed in claim 1, characterized in that: in step (2), if a feature vector is equidistant from two or more cluster centers, one of those clusters is selected at random.
3. The retrieval method using a random quantization vocabulary tree as claimed in claim 1, characterized in that: in step (3), a new feature vector selected from the feature vector pool is assigned to its nearest cluster center; when it reaches the leaf node of the assigned cluster center, if all feature vectors in that leaf node have the same class label, the associated class label is assigned to the new feature vector and the operation stops; otherwise, a further search is performed within the assigned cluster, the feature vector in the cluster closest to the new feature vector is selected, the class label associated with that feature vector is assigned to the new feature vector, and the operation stops.
4. An image retrieval method based on the retrieval method using a random quantization vocabulary tree, characterized by comprising the following steps:
(1) first dividing the image into several mutually overlapping sub-regions by an overlapping partition method;
(2) combining the feature information blocks of the image with their semantic features; since each extracted feature vector corresponds to a point in the feature space, unsupervised learning (i.e., clustering in data mining) is performed on the feature points in the feature space to divide all feature vectors in the feature vector library into multiple classes of patterns, such that patterns in the same class are highly similar and patterns in different classes differ substantially; the feature points are marked with class labels, each class label carrying specific semantic information; the image feature information blocks are combined with the semantic features and the different regions of the image are annotated, so as to build an image knowledge base;
(3) generating a nearest-neighbor search tree, taking all image feature vectors in the image knowledge base as the root node at the first level, and splitting downward;
(4) at the second level, randomly selecting k points from the image knowledge base as cluster centers, then assigning each image feature vector to its nearest cluster center according to the selected similarity measure, so that the whole database is divided into k subsets, and continuing to split downward;
(5) at the third level, for each of the k clusters obtained at the second level, randomly selecting k feature points from its feature vector pool as the cluster centers of the next level, then assigning each image feature vector to its nearest cluster center using the similarity measure, thereby forming k² clusters at the third level;
(6) repeating steps (2) and (3) until the image feature vectors contained in every leaf node belong to the same class of object or the number of image feature vectors contained in a leaf node is below a certain limit; wherein each image feature vector has a class label associated with it.
5. The image retrieval method based on the retrieval method using a random quantization vocabulary tree as claimed in claim 4, characterized in that: in step (1), the overlapping partition method divides an image of size height × width with an N × N window, shifted by Nhop pixels in the row and column directions, into several mutually overlapping sub-regions; so that sufficiently small objects contained in the image can be detected, the square window size is reduced and the number of patches is increased; through the overlapping sub-regions, the color histogram is combined with the spatial distribution of color.
6. The image retrieval method based on the retrieval method using a random quantization vocabulary tree as claimed in claim 4, characterized in that: in step (2), in building the image knowledge base, the Chameleon clustering algorithm and an MST-based clustering algorithm are used to cluster the color histogram feature vector library of the images, class labels are assigned to the clustering results, and a knowledge base based on color histogram features is built.
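The tree construction and lookup recited in claims 1 to 3 can be sketched as follows. This is an illustrative reading under stated assumptions, not the patented implementation: Euclidean distance stands in for the "selected similarity measure", `min_size` stands in for the unspecified "certain limit", the names `Node`, `build`, and `classify` are hypothetical, ties at query time are broken by taking the first center, and the input is assumed to be non-empty.

```python
import math
import random

class Node:
    def __init__(self, vectors, labels):
        self.vectors = vectors   # feature vectors that reached this node
        self.labels = labels     # class label associated with each vector
        self.centers = []        # randomly chosen cluster centers
        self.children = []       # one child node per kept center

def build(vectors, labels, k, min_size, rng):
    """Recursively split the vectors around k randomly chosen centers."""
    node = Node(vectors, labels)
    # Stop when the node is pure or too small to split (assumed limit).
    if len(set(labels)) <= 1 or len(vectors) < max(min_size, k):
        return node
    centers = rng.sample(vectors, k)  # random centers, no k-means refinement
    groups = [([], []) for _ in range(k)]
    for vec, lab in zip(vectors, labels):
        dists = [math.dist(vec, c) for c in centers]
        best = min(dists)
        ties = [i for i, d in enumerate(dists) if d == best]
        i = rng.choice(ties)  # claim 2: pick one of the equidistant clusters at random
        groups[i][0].append(vec)
        groups[i][1].append(lab)
    # Assumed safeguard: if the split made no progress, keep this node as a leaf.
    if any(len(g[0]) == len(vectors) for g in groups):
        return node
    for center, (g_vecs, g_labs) in zip(centers, groups):
        if g_vecs:  # skip centers that attracted no vectors
            node.centers.append(center)
            node.children.append(build(g_vecs, g_labs, k, min_size, rng))
    return node

def classify(root, query):
    """Claim 3: descend to a leaf, then use its label or its nearest vector."""
    node = root
    while node.children:
        i = min(range(len(node.centers)),
                key=lambda j: math.dist(query, node.centers[j]))
        node = node.children[i]
    if len(set(node.labels)) == 1:
        return node.labels[0]  # pure leaf: its single class label
    j = min(range(len(node.vectors)),
            key=lambda m: math.dist(query, node.vectors[m]))
    return node.labels[j]      # mixed leaf: label of the nearest stored vector

# Toy data, made up for illustration only.
vectors = [(0.0, 0.1), (1.3, 0.2), (0.4, 1.7), (10.2, 9.7), (11.5, 10.1), (9.6, 11.3)]
labels = ["a", "a", "a", "b", "b", "b"]
root = build(vectors, labels, k=2, min_size=2, rng=random.Random(0))
print(classify(root, (0.3, 0.2)))
```

Because the centers are drawn at random rather than refined iteratively, each node costs a single assignment pass over its vectors, which is consistent with the shorter construction times reported in the description.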
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710545225.9A CN107451200B (en) | 2017-07-06 | 2017-07-06 | Retrieval method using random quantization vocabulary tree and image retrieval method based on same |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107451200A (en) | 2017-12-08 |
CN107451200B (en) | 2020-07-28 |
Family
ID=60488400
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710545225.9A Active CN107451200B (en) | 2017-07-06 | 2017-07-06 | Retrieval method using random quantization vocabulary tree and image retrieval method based on same |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107451200B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||