CN103064941A - Image retrieval method and device - Google Patents


Info

Publication number
CN103064941A
Authority
CN
China
Prior art keywords
image
image collection
node
denoising
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012105723718A
Other languages
Chinese (zh)
Other versions
CN103064941B (en)
Inventor
陈世峰
曹琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Application filed by Shenzhen Institute of Advanced Technology of CAS
Priority to CN201210572371.8A
Publication of CN103064941A
Application granted
Publication of CN103064941B
Legal status: Active

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an image retrieval method and device. The method includes: acquiring a search keyword and screening a database according to the search keyword to obtain an image set; establishing a first spectral graph model of the image set according to image features to obtain the pairwise similarity relation between the images in the image set; establishing a semi-supervised learning model according to the similarity relation; denoising the image set according to the semi-supervised learning model to obtain a denoised image set; and returning the denoised image set as the retrieval result corresponding to the search keyword. Because the retrieved image set is denoised as a whole before it is returned, the accuracy of image retrieval is improved.

Description

Image retrieval method and device
Technical field
The present invention relates to image retrieval technology, and in particular to an image retrieval method and device.
Background art
Keyword-based image retrieval is the current mainstream image retrieval technology. However, because of erroneous tags and the ambiguity of search keywords, the images retrieved by keyword-based image retrieval are often not accurate enough.
Summary of the invention
In view of the insufficient accuracy of the retrieval results of existing image retrieval technology, it is necessary to provide an image retrieval method and device that can improve retrieval precision.
An image retrieval method comprises the following steps:
Obtaining a search keyword, and screening a database according to the search keyword to obtain an image set;
Establishing a first spectral graph model of the image set according to image features, to obtain the pairwise similarity relation between the images in the image set;
Establishing a semi-supervised learning model according to the similarity relation;
Denoising the image set according to the semi-supervised learning model to obtain a denoised image set;
Returning the denoised image set as the retrieval result corresponding to the search keyword.
An image retrieval device comprises:
An acquisition module, configured to obtain a search keyword and to screen a database according to the search keyword to obtain an image set;
A modeling module, configured to establish a first spectral graph model of the image set according to image features, to obtain the pairwise similarity relation between the images in the image set;
A learning module, configured to establish a semi-supervised learning model according to the similarity relation;
A denoising module, configured to denoise the image set according to the semi-supervised learning model to obtain a denoised image set;
A sending module, configured to return the denoised image set as the retrieval result corresponding to the search keyword.
The above image retrieval method and device obtain a search keyword, screen a database according to the search keyword to obtain an image set, establish a first spectral graph model of the image set according to image features to obtain the pairwise similarity relation between the images, establish a semi-supervised learning model according to the similarity relation, denoise the image set according to the semi-supervised learning model, and return the denoised image set as the retrieval result corresponding to the search keyword. Because the retrieved image set is denoised as a whole, the accuracy of image retrieval is improved.
Description of drawings
Fig. 1 is a schematic flowchart of the image retrieval method in one embodiment;
Fig. 2 is a schematic structural diagram of the image retrieval device in one embodiment;
Fig. 3 is a schematic flowchart of the image retrieval method in another embodiment.
Embodiments
The technical solution of the image retrieval method and device is described in detail below with reference to specific embodiments and the accompanying drawings, so that it becomes clearer.
As shown in Figure 1, in one embodiment, an image retrieval method comprises the following steps:
S110: obtain a search keyword, and screen a database according to the search keyword to obtain an image set.
In the present embodiment, the words that the user enters in a search engine to retrieve images are taken as the keyword, and images are screened according to this keyword from an image database or from another database that contains images. The screening may use text-based retrieval over the description of the image, the image name, the content of the web page on which the image appears, and similar text.
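The keyword screening of step S110 can be illustrated with a minimal sketch. The record fields (`name`, `description`, `page_text`) and the substring match are assumptions chosen for illustration; the text does not specify the database layout or the text retrieval technique.

```python
def screen_images(records, keyword):
    """Keep records whose name, description or page text mentions the keyword."""
    needle = keyword.lower()
    text_fields = ("name", "description", "page_text")  # hypothetical field names
    return [
        record for record in records
        if any(needle in record.get(field, "").lower() for field in text_fields)
    ]

# Hypothetical toy database of image records.
database = [
    {"name": "apple_pie.jpg", "description": "A baked dessert", "page_text": ""},
    {"name": "img_001.jpg", "description": "Red apple on a table", "page_text": ""},
    {"name": "car.jpg", "description": "A sports car", "page_text": "Fast cars"},
]
image_set = screen_images(database, "apple")
```

The first record shows why later denoising is useful: it matches the keyword by name even though its content may not be what the user wants.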
S130: establish a first spectral graph model of the image set according to image features, to obtain the pairwise similarity relation between the images in the image set.
In the present embodiment, the image feature values of the images in the image set are obtained first, for example RGB values (the intensity levels of the red, green and blue channels), brightness, hue, saturation, or the number of layers of the image. A feature vector is built for each image from these feature values; the feature vector is multi-dimensional, and each dimension holds one image feature value. According to the feature vectors, the image set is represented by a first node set $\chi = \{x_1, \ldots, x_n\}$, where each $x_i$ is a multi-dimensional vector that represents one image and each dimension of $x_i$ represents one feature value. Representing each image as a vector in the node set simplifies the subsequent calculations.
A relational matrix $W$ is built from the first node set: $w_{ij} = \exp(-\|x_i - x_j\|^2 / \sigma^2)$ when $i \neq j$, and $w_{ij} = 0$ when $i = j$. The relational matrix $W$ represents the similarity relation between the images. Further, the similarity matrix $W$ is normalized so that the similarity between images is expressed by values between 0 and 1, which gives the first normalized edge matrix $S = D^{-1/2} W D^{-1/2}$, where $D$ is the diagonal matrix with diagonal elements $d_{ii} = \sum_{j} w_{ij}$, that is, the sum of all elements of the column of $W$ in which $w_{ii}$ lies (equal to the row sum, since $W$ is symmetric). The normalized edge matrix is built on the pairwise relations between nodes and can be used to mine the internal structure of the node set.
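The construction of the relational matrix and its normalization can be sketched as follows. This is a minimal pure-Python illustration, assuming a Gaussian kernel with a negative exponent (so that similarities fall between 0 and 1); the feature values and `sigma` are hypothetical.

```python
import math

def affinity_matrix(features, sigma):
    """W[i][j] = exp(-||x_i - x_j||^2 / sigma^2) for i != j, and 0 on the diagonal."""
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            W[i][j] = W[j][i] = math.exp(-sq_dist / sigma ** 2)
    return W

def normalize_affinity(W):
    """S = D^{-1/2} W D^{-1/2}, where d_ii is the sum of row i of W."""
    degrees = [sum(row) for row in W]
    n = len(W)
    return [
        [W[i][j] / math.sqrt(degrees[i] * degrees[j]) if W[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]

# Toy feature vectors (hypothetical values): two similar images and one outlier.
features = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]]
W = affinity_matrix(features, sigma=1.0)
S = normalize_affinity(W)
```

In this toy case the entry of `S` linking the two similar images is close to 1, while the entries linking either of them to the outlier are close to 0.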
S150: establish a semi-supervised learning model according to the similarity relation.
In the present embodiment, the first p nodes of the first node set are obtained and labeled as positive samples. For example, for a set of n nodes $\chi = \{x_1, \ldots, x_p, x_{p+1}, \ldots, x_n\}$, the first p nodes are labeled as positive samples; p may be a preset value, or may be computed by spectral clustering of the nodes. Define a query vector $y$, a multi-dimensional vector with $y_i = 1$ ($i \le p$) for labeled nodes and $y_u = 0$ ($p+1 \le u \le n$) for unlabeled nodes, where $y_i$ or $y_u$ is the value of one dimension of $y$. Define a predicted label vector $f$, also multi-dimensional, where $f_i$ ($1 \le i \le n$) is the predicted label of node $x_i$.
Further, from the defined $y$ and $f$, an energy function of the predicted label vector $f$ is established:
$$E(f) = \sum_{i,j=1}^{n} w_{ij}\left(\frac{f_i}{\sqrt{d_{ii}}} - \frac{f_j}{\sqrt{d_{jj}}}\right)^2 + \mu \sum_{i=1}^{n} (f_i - y_i)^2,$$
where $\mu$ is a balance factor that may be a preset value. The first term, $\sum_{i,j=1}^{n} w_{ij}\left(\frac{f_i}{\sqrt{d_{ii}}} - \frac{f_j}{\sqrt{d_{jj}}}\right)^2$, is the smoothness risk term: if $x_i$ and $x_j$ have a large $w_{ij}$, then $f_i$ and $f_j$ are kept close. The second term, $\sum_{i=1}^{n} (f_i - y_i)^2$, is the empirical risk term, which keeps $f$ from changing much relative to the original labeling $y$.
Finally, differentiating $E(f)$ with respect to $f$ gives the globally optimal solution of $E(f)$, $f = (1-\alpha)(I - \alpha S)^{-1} y$, which is the semi-supervised learning model, where $\alpha = 1/(1+\mu)$, $I$ is the identity matrix, $S = D^{-1/2} W D^{-1/2}$, and $D$ is the diagonal matrix with diagonal elements $d_{ii} = \sum_{j} w_{ij}$.
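The step from the energy function to the closed form can be written out briefly. The sketch below assumes that $W$ is symmetric, that the smoothness term is normalized by $\sqrt{d_{ii}}$, and that it carries a factor of $\tfrac{1}{2}$ as in the standard local-and-global-consistency formulation; the factor is an assumption made here so that $\alpha = 1/(1+\mu)$ comes out exactly.

```latex
\begin{aligned}
\tfrac{1}{2}\sum_{i,j=1}^{n} w_{ij}\Bigl(\tfrac{f_i}{\sqrt{d_{ii}}}-\tfrac{f_j}{\sqrt{d_{jj}}}\Bigr)^{2}
  &= f^{T}(I-S)\,f, \\
E(f) &= f^{T}(I-S)\,f+\mu\,\lVert f-y\rVert^{2}, \\
\nabla E(f) = 2(I-S)f+2\mu(f-y) = 0
  &\;\Longrightarrow\; (1+\mu)f-Sf=\mu y, \\
f &= \tfrac{\mu}{1+\mu}\Bigl(I-\tfrac{1}{1+\mu}S\Bigr)^{-1}y
   = (1-\alpha)(I-\alpha S)^{-1}y,
   \qquad \alpha=\tfrac{1}{1+\mu}.
\end{aligned}
```

Because $I - S$ is positive semidefinite and $\mu > 0$, the energy is strictly convex, so this stationary point is the global minimum.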
The images ranked in the top p positions of the retrieved image set are taken as positive samples and their labels are set to 1, while the labels of the other images in the image set are set to 0. The labeled image set forms a binary vector $y$, from which the semi-supervised learning model computes the re-ranking score $f$. Sorting the values of the dimensions of $f$ in descending order gives a node ordering by which the image set can be re-ranked; nodes that rank higher are closer to the nodes in the positive samples.
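The re-ranking described above can be sketched by solving the linear system $(I - \alpha S) f' = y$ and scaling by $(1-\alpha)$. The matrix `S`, the query vector `y` and `mu` below are hypothetical toy values, and the small Gaussian-elimination helper stands in for whatever linear solver an implementation would use.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def propagate_labels(S, y, mu):
    """Return f = (1 - alpha) (I - alpha S)^{-1} y with alpha = 1 / (1 + mu)."""
    alpha = 1.0 / (1.0 + mu)
    n = len(S)
    A = [[(1.0 if i == j else 0.0) - alpha * S[i][j] for j in range(n)]
         for i in range(n)]
    return [(1.0 - alpha) * v for v in solve_linear(A, y)]

# Hypothetical normalized edge matrix: images 0 and 1 are similar, image 2 is not.
S = [[0.0, 0.9, 0.1],
     [0.9, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
y = [1.0, 0.0, 0.0]          # only image 0 is labeled as a positive sample
f = propagate_labels(S, y, mu=1.0)
ranking = sorted(range(len(f)), key=lambda i: f[i], reverse=True)
```

Here the unlabeled image 1 receives a higher score than image 2 because it is strongly connected to the positive sample, so it is ranked ahead of it.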
S170: denoise the image set according to the semi-supervised learning model to obtain a denoised image set.
In one embodiment, step S170 specifically comprises the following steps:
Obtaining the labels predicted from single nodes according to the semi-supervised learning model, which gives the label matrix $F^* = (I - \alpha S)^{-1}[y_1, \ldots, y_i, \ldots, y_n] = (I - \alpha S)^{-1}$, where $F^*_{ij}$ is the predicted label of $x_i$ based on the query vector $y_j$ in which a single node is labeled;
Performing spectral clustering analysis according to the label matrix to obtain a plurality of clusters;
Defining the dominance score of each node according to the label matrix and the clusters as [Figure BDA00002650623300042];
Identifying the noise clusters according to the inequality [Figure BDA00002650623300043], where [Figure BDA00002650623300044] denotes averaging over the data in cluster c, [Figure BDA00002650623300045] denotes averaging over the k clusters, and β is a preset value;
Removing the noise image set corresponding to the noise clusters to obtain the denoised image set.
In the present embodiment, the nodes in the semi-supervised model usually form main clusters, whereas the nodes corresponding to noise images dilute the density of the clusters. Nodes that lie in the same geometric structure can be treated as one cluster, while noise consists of scattered outliers. Specifically, a mapping $g(\cdot)$ can be learned that warps the original space $\chi$ into a new space $\chi^*$ in which all outliers form a new cluster and all clusters are separated from one another, which makes noise removal easier.
The labels predicted from single nodes are obtained according to the semi-supervised learning model, which gives the label matrix $F^* = (I - \alpha S)^{-1}[y_1, \ldots, y_i, \ldots, y_n] = (I - \alpha S)^{-1}$, where $F^*_{ij}$ is the predicted label of $x_i$ based on the query vector $y_j$ in which a single node is labeled. If $x_i$ and $x_j$ belong to the same cluster, the value of $F^*_{ij}$ should be large and the values of $F^*_{ik}$ and $F^*_{jk}$ should be close in every dimension $k = 1, \ldots, n$, whereas an abnormal node should have small values in nearly all dimensions.
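The label matrix can be illustrated by inverting $I - \alpha S$ directly. The sketch below uses a small Gauss-Jordan routine and a hypothetical 3-node edge matrix in which node 2 plays the role of an outlier; neither the values nor the helper names come from the text.

```python
def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[i])) + [1.0 if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def label_matrix(S, alpha):
    """F* = (I - alpha S)^{-1}; column j is the prediction when only node j is labeled."""
    n = len(S)
    return invert([[(1.0 if i == j else 0.0) - alpha * S[i][j] for j in range(n)]
                   for i in range(n)])

# Hypothetical normalized edge matrix: nodes 0 and 1 form a cluster, node 2 is an outlier.
S = [[0.0, 0.9, 0.1],
     [0.9, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
F = label_matrix(S, alpha=0.5)
row_sums = [sum(row) for row in F]
```

In this toy case the entry linking nodes 0 and 1 is larger than the entries linking either of them to node 2, and the row of the outlier has the smallest sum.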
Define the mapping $g: \chi \to \mathbb{R}^n$, [Figure BDA000026506233000412]. A spectral graph is then built on $\chi^* = g(\chi)$, which gives the normalized edge matrix $S^*$ and the normalized graph Laplacian $L^* = I - S^*$. Let $\{(\lambda_i, v_i)\}_{i=1}^{n}$ be the eigenvalue and eigenvector pairs of $L^*$, with $\lambda_1 \le \ldots \le \lambda_n$. $L^*$ is a block diagonal matrix in which the elements between nodes of the same cluster have larger absolute values. The eigenvectors corresponding to the smaller eigenvalues of $L^*$ preserve the same block structure and are assembled into $U_k = [v_1, v_2, \ldots, v_k]$, where k is the number of clusters in $\chi^*$ and can be determined from the position at which the largest gap between the k-th and (k+1)-th eigenvalues of $L^*$, sorted in ascending order, occurs. The nodes are then grouped into k classes by the K-means method, one of which is the class formed by the scattered noise nodes. If each row of $F^*$ is summed, the rows corresponding to noise have smaller sums.
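The rule for choosing the number of clusters k from the eigenvalues can be sketched on its own. The function below takes eigenvalues that are assumed to have been computed already (the values shown are hypothetical) and returns the k at which the gap between the k-th and (k+1)-th smallest eigenvalue is largest.

```python
def choose_num_clusters(eigenvalues):
    """Return k such that the gap between the k-th and (k+1)-th smallest
    eigenvalue is the largest gap in the sorted spectrum."""
    values = sorted(eigenvalues)
    if len(values) < 2:
        return len(values)
    gaps = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    # gaps[i] separates the (i+1)-th and (i+2)-th smallest eigenvalues.
    return max(range(len(gaps)), key=lambda i: gaps[i]) + 1

# Hypothetical Laplacian spectrum: three near-zero eigenvalues, then a jump.
eigenvalues = [0.0, 0.02, 0.05, 0.9, 1.0, 1.1]
k = choose_num_clusters(eigenvalues)
```

With this spectrum the largest gap lies between the third and fourth eigenvalues, so three clusters are selected and the first three eigenvectors would form $U_k$.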
According to the label matrix and the clusters, the dominance score of $x_i$ is defined as [Figure BDA00002650623300052], and $c \in \{1, \ldots, k\}$ denotes the label of a cluster. The noise clusters can then be identified according to the inequality [Figure BDA00002650623300053], where [Figure BDA00002650623300054] denotes averaging over the data in cluster c, [Figure BDA00002650623300055] denotes averaging over the k clusters, and β is a threshold factor that may be a preset value. Removing the noise image set corresponding to the noise clusters gives the denoised image set.
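The exact dominance score and inequality are given only as formula images, so the following is a sketch of one plausible reading rather than the method itself. It assumes that each node's score is a per-node value such as its row sum in $F^*$, and that a cluster is treated as noise when its mean score is below β times the mean of the cluster means; the scores, labels and β are hypothetical.

```python
def find_noise_clusters(scores, labels, beta):
    """Flag clusters whose mean score is below beta times the mean over clusters.

    ASSUMPTION: this threshold rule is an illustrative guess at the inequality,
    which the text does not reproduce.
    """
    clusters = {}
    for score, label in zip(scores, labels):
        clusters.setdefault(label, []).append(score)
    means = {label: sum(vals) / len(vals) for label, vals in clusters.items()}
    overall = sum(means.values()) / len(means)
    return {label for label, mean in means.items() if mean < beta * overall}

# Hypothetical per-node scores and cluster labels from spectral clustering.
scores = [1.9, 1.9, 2.0, 1.8, 0.3, 0.4]
labels = [0, 0, 1, 1, 2, 2]
noise = find_noise_clusters(scores, labels, beta=0.5)
kept = [i for i, label in enumerate(labels) if label not in noise]
```

The indices in `kept` identify the images that remain in the denoised image set under this assumed rule.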
S190: return the denoised image set as the retrieval result corresponding to the search keyword.
In the present embodiment, the denoised image set is returned to the search engine as the retrieval result corresponding to the search keyword, which completes the image retrieval.
The above image retrieval method obtains a search keyword, screens a database according to the search keyword to obtain an image set, establishes a first spectral graph model of the image set according to image features to obtain the pairwise similarity relation between the images, establishes a semi-supervised learning model according to the similarity relation, denoises the image set according to the semi-supervised learning model, and returns the denoised image set as the retrieval result corresponding to the search keyword. Because the retrieved image set is denoised as a whole, the accuracy of image retrieval is improved.
In one embodiment, step S190 specifically comprises the following steps:
Establishing a second spectral graph model for the denoised image set, to obtain the second node set $\chi'$ corresponding to the denoised image set and the second normalized edge matrix $S'$ based on $\chi'$;
Establishing a maximization function according to the spectral graph model,
$$y_p^* = \arg\max\left(\frac{y_p^T M_{p\times p}\, y_p}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2} - \gamma\,\frac{1}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2}\right),$$
where $M = (I - \alpha S')^{-1}$ with $m_{ii} = 0$, γ is a preset value, and $M_{p\times p}$ is the first p rows and first p columns of $M$;
Solving the maximization function by an iterative method to obtain the positive samples;
Training the semi-supervised learning model with the positive samples to re-rank the denoised image set, obtaining a re-ranked image set;
Returning the re-ranked image set as the retrieval result for the search keyword.
In the present embodiment, as shown in Figure 2, a spectral graph model is built on the denoised image set and spectral clustering analysis is carried out to find the dominant cluster, from which the positive samples are obtained. These are used to train the semi-supervised learning model, which then ranks the denoised image set to give the final retrieval result. Because the proportion of images whose content is relevant to the keyword is usually higher among the top-ranked images than in the whole image set, the spectral graph is built on the top-ranked image set and the dominant cluster is selected from it. The dominant cluster is the cluster in the spectral clustering that has many nodes and a high density.
Specifically, let $\chi'$ be the denoised image set and $S'$ the normalized edge matrix based on $\chi'$.
Build the matrix $M = (I - \alpha S')^{-1}$ and set $m_{ii} = 0$. Because positive samples are more likely to make up a large proportion of the top-ranked images and thus form the dominant cluster in the image set, only the first p dimensions of $M$ are considered here; p may be a preset value.
Define a $p \times 1$ query vector $y_p$ that represents the labeling information of the top p images. To make the labeling information in $y_p$ accurate, the maximization function
$$y_p^* = \arg\max\left(\frac{y_p^T M_{p\times p}\, y_p}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2} - \gamma\,\frac{1}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2}\right)$$
is established, where γ is a balance factor and $M_{p\times p}$ is the first p rows and first p columns of $M$. The term $\frac{y_p^T M_{p\times p}\, y_p}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2}$ is the density term, which measures the density of the block structure formed by the labeled data in $y_p$. The term $\frac{1}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2}$ is the scale term, which ensures that the dominant cluster has a large size. $y_p^*$ is the purified labeled query vector that is sought.
The maximization function is solved by an iterative method. First, $y_p$ is initialized to 1 in all dimensions. In each iteration, the value of one dimension is changed from 1 to 0, choosing the dimension for which the value of the maximization function increases the most. When the value can no longer be increased in this way, the iteration stops. The images corresponding to the remaining entries equal to 1 are the positive samples. Once the positive samples have been labeled, the semi-supervised learning model is applied to the denoised image set with these labeled positive samples to re-rank the images of the denoised image set. Finally, the re-ranked result is returned as the retrieval result for the search keyword, which gives a retrieval result with improved accuracy.
As shown in Figure 3, in one embodiment, an image retrieval device comprises an acquisition module 110, a modeling module 130, a learning module 150, a denoising module 170 and a sending module 190.
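The way the five modules hand data to one another can be sketched as a simple pipeline. The function name, the callable signatures and the stub modules below are hypothetical and serve only to show the order of the data flow; they are not the device's actual interfaces.

```python
from typing import Callable, List

def run_retrieval_device(
    keyword: str,
    acquire: Callable[[str], List[str]],
    build_graph: Callable[[List[str]], List[List[float]]],
    learn: Callable[[List[List[float]]], List[List[float]]],
    denoise: Callable[[List[str], List[List[float]]], List[str]],
    send: Callable[[List[str]], List[str]],
) -> List[str]:
    images = acquire(keyword)          # acquisition module 110
    similarity = build_graph(images)   # modeling module 130
    model = learn(similarity)          # learning module 150
    denoised = denoise(images, model)  # denoising module 170
    return send(denoised)              # sending module 190

# Stub modules, purely for illustration.
result = run_retrieval_device(
    "apple",
    acquire=lambda kw: [kw + "_1.jpg", kw + "_2.jpg", "noise.jpg"],
    build_graph=lambda imgs: [[0.0] * len(imgs) for _ in imgs],
    learn=lambda sim: sim,
    denoise=lambda imgs, model: [name for name in imgs if not name.startswith("noise")],
    send=lambda imgs: list(imgs),
)
```

Passing the modules as callables keeps each stage replaceable, which mirrors the way the embodiments below refine individual modules.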
The acquisition module 110 is configured to obtain a search keyword, and to screen a database according to the search keyword to obtain an image set.
In the present embodiment, the acquisition module 110 takes the words that the user enters in a search engine to retrieve images as the keyword, and screens images according to this keyword from an image database or from another database that contains images. The screening may use text-based retrieval over the description of the image, the image name, the content of the web page on which the image appears, and similar text.
The modeling module 130 is configured to establish a first spectral graph model of the image set according to image features, to obtain the pairwise similarity relation between the images in the image set.
In the present embodiment, the image feature values of the images in the image set are obtained first, for example RGB values (the intensity levels of the red, green and blue channels), brightness, hue, saturation, or the number of layers of the image. A feature vector is built for each image from these feature values; the feature vector is multi-dimensional, and each dimension holds one image feature value. According to the feature vectors, the image set is represented by a first node set $\chi = \{x_1, \ldots, x_n\}$, where each $x_i$ is a multi-dimensional vector that represents one image and each dimension of $x_i$ represents one feature value. Representing each image as a vector in the node set simplifies the subsequent calculations.
A relational matrix $W$ is built from the first node set: $w_{ij} = \exp(-\|x_i - x_j\|^2 / \sigma^2)$ when $i \neq j$, and $w_{ij} = 0$ when $i = j$. The relational matrix $W$ represents the similarity relation between the images. Further, the similarity matrix $W$ is normalized so that the similarity between images is expressed by values between 0 and 1, which gives the first normalized edge matrix $S = D^{-1/2} W D^{-1/2}$, where $D$ is the diagonal matrix with diagonal elements $d_{ii} = \sum_{j} w_{ij}$, that is, the sum of all elements of the column of $W$ in which $w_{ii}$ lies (equal to the row sum, since $W$ is symmetric). The normalized edge matrix is built on the pairwise relations between nodes and can be used to mine the internal structure of the node set.
The learning module 150 is configured to establish a semi-supervised learning model according to the similarity relation.
In the present embodiment, the first p nodes of the first node set are obtained and labeled as positive samples. For example, for a set of n nodes $\chi = \{x_1, \ldots, x_p, x_{p+1}, \ldots, x_n\}$, the first p nodes are labeled as positive samples; p may be a preset value, or may be computed by spectral clustering of the nodes. Define a query vector $y$, a multi-dimensional vector with $y_i = 1$ ($i \le p$) for labeled nodes and $y_u = 0$ ($p+1 \le u \le n$) for unlabeled nodes, where $y_i$ or $y_u$ is the value of one dimension of $y$. Define a predicted label vector $f$, also multi-dimensional, where $f_i$ ($1 \le i \le n$) is the predicted label of node $x_i$.
Further, from the defined $y$ and $f$, an energy function of the predicted label vector $f$ is established:
$$E(f) = \sum_{i,j=1}^{n} w_{ij}\left(\frac{f_i}{\sqrt{d_{ii}}} - \frac{f_j}{\sqrt{d_{jj}}}\right)^2 + \mu \sum_{i=1}^{n} (f_i - y_i)^2,$$
where $\mu$ is a balance factor that may be a preset value. The first term, $\sum_{i,j=1}^{n} w_{ij}\left(\frac{f_i}{\sqrt{d_{ii}}} - \frac{f_j}{\sqrt{d_{jj}}}\right)^2$, is the smoothness risk term: if $x_i$ and $x_j$ have a large $w_{ij}$, then $f_i$ and $f_j$ are kept close. The second term, $\sum_{i=1}^{n} (f_i - y_i)^2$, is the empirical risk term, which keeps $f$ from changing much relative to the original labeling $y$.
Finally, differentiating $E(f)$ with respect to $f$ gives the globally optimal solution of $E(f)$, $f = (1-\alpha)(I - \alpha S)^{-1} y$, which is the semi-supervised learning model, where $\alpha = 1/(1+\mu)$, $I$ is the identity matrix, $S = D^{-1/2} W D^{-1/2}$, and $D$ is the diagonal matrix with diagonal elements $d_{ii} = \sum_{j} w_{ij}$.
The images ranked in the top p positions of the retrieved image set are taken as positive samples and their labels are set to 1, while the labels of the other images in the image set are set to 0. The labeled image set forms a binary vector $y$, from which the semi-supervised learning model computes the re-ranking score $f$. Sorting the values of the dimensions of $f$ in descending order gives a node ordering by which the image set can be re-ranked; nodes that rank higher are closer to the nodes in the positive samples.
The denoising module 170 is configured to denoise the image set according to the semi-supervised learning model to obtain a denoised image set.
In one embodiment, the denoising module 170 is further configured to: obtain the labels predicted from single nodes according to the semi-supervised learning model, which gives the label matrix $F^* = (I - \alpha S)^{-1}[y_1, \ldots, y_i, \ldots, y_n] = (I - \alpha S)^{-1}$, where $F^*_{ij}$ is the predicted label of $x_i$ based on the query vector $y_j$ in which a single node is labeled; perform spectral clustering analysis according to the label matrix to obtain a plurality of clusters; define the dominance score of each node according to the label matrix and the clusters as [Figure BDA00002650623300087]; identify the noise clusters according to an inequality [formula image], where [Figure BDA00002650623300092] denotes averaging over the data in cluster c, [formula image] denotes averaging over the k clusters, and β is a preset value; and remove the noise image set corresponding to the noise clusters to obtain the denoised image set.
In the present embodiment, the nodes in the semi-supervised model usually form main clusters, whereas the nodes corresponding to noise images dilute the density of the clusters. Nodes that lie in the same geometric structure can be treated as one cluster, while noise consists of scattered outliers. Specifically, a mapping $g(\cdot)$ can be learned that warps the original space $\chi$ into a new space $\chi^*$ in which all outliers form a new cluster and all clusters are separated from one another, which makes noise removal easier.
The labels predicted from single nodes are obtained according to the semi-supervised learning model, which gives the label matrix $F^* = (I - \alpha S)^{-1}[y_1, \ldots, y_i, \ldots, y_n] = (I - \alpha S)^{-1}$, where $F^*_{ij}$ is the predicted label of $x_i$ based on the query vector $y_j$ in which a single node is labeled. If $x_i$ and $x_j$ belong to the same cluster, the value of $F^*_{ij}$ should be large and the values of $F^*_{ik}$ and $F^*_{jk}$ should be close in every dimension $k = 1, \ldots, n$, whereas an abnormal node should have small values in nearly all dimensions.
Define the mapping $g: \chi \to \mathbb{R}^n$, [Figure BDA000026506233000910]. A spectral graph is then built on $\chi^* = g(\chi)$, which gives the normalized edge matrix $S^*$ and the normalized graph Laplacian $L^* = I - S^*$. Let $\{(\lambda_i, v_i)\}_{i=1}^{n}$ be the eigenvalue and eigenvector pairs of $L^*$, with $\lambda_1 \le \ldots \le \lambda_n$. $L^*$ is a block diagonal matrix in which the elements between nodes of the same cluster have larger absolute values. The eigenvectors corresponding to the smaller eigenvalues of $L^*$ preserve the same block structure and are assembled into $U_k = [v_1, v_2, \ldots, v_k]$, where k is the number of clusters in $\chi^*$ and can be determined from the position at which the largest gap between the k-th and (k+1)-th eigenvalues of $L^*$, sorted in ascending order, occurs. The K-means method is then used to group [Figure BDA000026506233000912] into k classes, one of which is the class formed by the scattered noise nodes. If each row of $F^*$ is summed, the rows corresponding to noise have smaller sums.
According to the label matrix and the clusters, the dominance score of $x_i$ is defined as [Figure BDA000026506233000913], and $c \in \{1, \ldots, k\}$ denotes the label of a cluster. The noise clusters can then be identified according to the inequality [Figure BDA000026506233000914], where [Figure BDA000026506233000915] denotes averaging over the data in cluster c, [Figure BDA000026506233000916] denotes averaging over the k clusters, and β is a threshold factor that may be a preset value. Removing the noise image set corresponding to the noise clusters gives the denoised image set.
The sending module 190 is configured to return the denoised image set as the retrieval result corresponding to the search keyword.
In the present embodiment, the denoised image set is returned to the search engine as the retrieval result corresponding to the search keyword, which completes the image retrieval.
The above image retrieval device obtains a search keyword, screens a database according to the search keyword to obtain an image set, establishes a first spectral graph model of the image set according to image features to obtain the pairwise similarity relation between the images, establishes a semi-supervised learning model according to the similarity relation, denoises the image set according to the semi-supervised learning model, and returns the denoised image set as the retrieval result corresponding to the search keyword. Because the retrieved image set is denoised as a whole, the accuracy of image retrieval is improved.
In one embodiment, the sending module 190 is further configured to: establish a second spectral graph model for the denoised image set, to obtain the second node set $\chi'$ corresponding to the denoised image set and the second normalized edge matrix $S'$ based on $\chi'$; establish a maximization function according to the spectral graph model,
$$y_p^* = \arg\max\left(\frac{y_p^T M_{p\times p}\, y_p}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2} - \gamma\,\frac{1}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2}\right),$$
where $M = (I - \alpha S')^{-1}$ with $m_{ii} = 0$, γ is a preset value, and $M_{p\times p}$ is the first p rows and first p columns of $M$; solve the maximization function by an iterative method to obtain the positive samples; train the semi-supervised learning model with the positive samples to re-rank the denoised image set, obtaining a re-ranked image set; and return the re-ranked image set as the retrieval result for the search keyword.
In the present embodiment, a spectral graph model is built on the denoised image set and spectral clustering analysis is carried out to find the dominant cluster, from which the positive samples are obtained. These are used to train the semi-supervised learning model, which then ranks the denoised image set to give the final retrieval result. Because the proportion of images whose content is relevant to the keyword is usually higher among the top-ranked images than in the whole image set, the spectral graph is built on the top-ranked image set and the dominant cluster is selected from it. The dominant cluster is the cluster in the spectral clustering that has many nodes and a high density.
Specifically, let $\chi'$ be the denoised image set and $S'$ the normalized edge matrix based on $\chi'$.
Build the matrix $M = (I - \alpha S')^{-1}$ and set $m_{ii} = 0$. Because positive samples are more likely to make up a large proportion of the top-ranked images and thus form the dominant cluster in the image set, only the first p dimensions of $M$ are considered here; p may be a preset value.
Define a $p \times 1$ query vector $y_p$ that represents the labeling information of the top p images. To make the labeling information in $y_p$ accurate, the maximization function
$$y_p^* = \arg\max\left(\frac{y_p^T M_{p\times p}\, y_p}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2} - \gamma\,\frac{1}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2}\right)$$
is established, where γ is a balance factor and $M_{p\times p}$ is the first p rows and first p columns of $M$. The term $\frac{y_p^T M_{p\times p}\, y_p}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2}$ is the density term, which measures the density of the block structure formed by the labeled data in $y_p$. The term $\frac{1}{\left(\sum_{i=1}^{n}(y_p)_i\right)^2}$ is the scale term, which ensures that the dominant cluster has a large size. $y_p^*$ is the purified labeled query vector that is sought.
The maximization function is solved by an iterative method. First, $y_p$ is initialized to 1 in all dimensions. In each iteration, the value of one dimension is changed from 1 to 0, choosing the dimension for which the value of the maximization function increases the most. When the value can no longer be increased in this way, the iteration stops. The images corresponding to the remaining entries equal to 1 are the positive samples. Once the positive samples have been labeled, the semi-supervised learning model is applied to the denoised image set with these labeled positive samples to re-rank the images of the denoised image set. Finally, the re-ranked result is returned as the retrieval result for the search keyword, which gives a retrieval result with improved accuracy.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. The storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The embodiments above express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be understood as limiting the scope of the claims. It should be noted that a person of ordinary skill in the art can make a number of variations and improvements without departing from the inventive concept, and these all fall within the scope of protection of the present invention. The scope of protection of this patent is therefore defined by the appended claims.

Claims (10)

1. An image retrieval method, comprising the following steps:
Acquiring a search keyword, and screening a database according to the search keyword to obtain an image set;
Building a first spectrogram model of the image set according to image features, to obtain the pairwise similarity relations between the images in the image set;
Building a semi-supervised learning model according to the similarity relations;
Denoising the image set according to the semi-supervised learning model to obtain a denoised image set;
Returning the denoised image set as the retrieval result corresponding to the search keyword.
2. The image retrieval method according to claim 1, characterized in that the step of building the first spectrogram model of the image set according to image features, to obtain the pairwise similarity relations between the images in the image set, comprises:
Acquiring image feature values and building a feature vector for each image;
Representing the image set, according to the feature vectors, by a first node set $\chi = \{x_1, \ldots, x_n\}$, where $x_n$ is a multi-dimensional vector and each dimension of $x_n$ represents one feature value;
Building a relation matrix $W$ from the first node set, where $w_{ij} = \exp(-\|x_i - x_j\|^2 / \sigma^2)$ when $i \neq j$, and $w_{ij} = 0$ when $i = j$;
Normalizing the relation matrix $W$ to obtain the first normalized edge matrix $S = D^{-1/2} W D^{-1/2}$, where $D$ is the diagonal matrix whose diagonal elements are $d_{ii} = \sum_{j=1}^{n} w_{ij}$.
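The construction of $W$ and $S$ in claim 2 can be illustrated with a short sketch. It assumes a Gaussian affinity with a negative exponent and row sums of $W$ as the diagonal of $D$; the function names `affinity_matrix` and `normalize`, the toy feature vectors, and the value of `sigma` are hypothetical and chosen only for illustration.

```python
import math

def affinity_matrix(features, sigma):
    """W with w_ij = exp(-||x_i - x_j||^2 / sigma^2) for i != j and w_ii = 0."""
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq_dist = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                W[i][j] = math.exp(-sq_dist / sigma ** 2)
    return W

def normalize(W):
    """S = D^{-1/2} W D^{-1/2}, where d_ii is the i-th row sum of W."""
    d = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]

# Toy feature vectors (hypothetical): two nearby images and one distant image.
features = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]]
W = affinity_matrix(features, sigma=1.0)
S = normalize(W)
```

Nearby feature vectors receive an affinity close to 1 and distant ones an affinity close to 0, and the normalization keeps $S$ symmetric.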
3. The image retrieval method according to claim 2, characterized in that the step of building the semi-supervised learning model according to the similarity relations comprises:
Acquiring the first $p$ nodes of the first node set and labeling these $p$ nodes as positive samples;
Defining a query vector $y$, where $y_i = 1$ ($i \le p$) for labeled nodes and $y_u = 0$ ($p+1 \le u \le n$) for unlabeled nodes;
Defining a predicted label vector $f$, where $f_i$ ($1 \le i \le n$) is the predicted label of node $x_i$;
Building the energy function of the predicted label vector $f$: $E(f) = \sum_{i,j=1}^{n} w_{ij}\left(\frac{f_i}{\sqrt{d_{ii}}} - \frac{f_j}{\sqrt{d_{jj}}}\right)^2 + \mu \sum_{i=1}^{n} (f_i - y_i)^2$;
Differentiating the energy function with respect to $f$ to obtain the semi-supervised learning model $f = (1-\alpha)(I - \alpha S)^{-1} y$, where $\alpha = 1/(1+\mu)$ and $I$ is the identity matrix.
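The closed form $f = (1-\alpha)(I - \alpha S)^{-1} y$ in claim 3 can be evaluated without an explicit matrix inverse. The sketch below uses the fixed-point iteration $f \leftarrow \alpha S f + (1-\alpha) y$, whose limit satisfies the same equation when $0 < \alpha < 1$; this iteration is a substituted technique for illustration, not something the claim prescribes. The function names, the toy affinity matrix `W`, and the choice `alpha=0.5` are hypothetical.

```python
import math

def normalize(W):
    """S = D^{-1/2} W D^{-1/2}, where d_ii is the i-th row sum of W."""
    d = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]

def propagate(S, y, alpha, tol=1e-12, max_iter=10000):
    """Iterate f <- alpha*S*f + (1-alpha)*y until the update is below tol."""
    n = len(y)
    f = list(y)
    for _ in range(max_iter):
        new_f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
                 for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_f, f)) < tol:
            return new_f
        f = new_f
    return f

# Hypothetical affinities: images 0-2 are mutually similar, image 3 is an outlier.
W = [[0.0, 0.9, 0.8, 0.1],
     [0.9, 0.0, 0.7, 0.1],
     [0.8, 0.7, 0.0, 0.1],
     [0.1, 0.1, 0.1, 0.0]]
S = normalize(W)
y = [1.0, 0.0, 0.0, 0.0]  # only the first image (p = 1) is labeled positive
f = propagate(S, y, alpha=0.5)
ranking = sorted(range(len(f)), key=lambda i: f[i], reverse=True)
```

Sorting the images by their entries in `f` gives the ranking; the constant factor $(1-\alpha)$ rescales all scores equally and so does not affect the order.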
4. The image retrieval method according to claim 3, characterized in that the step of denoising the image set according to the semi-supervised learning model to obtain the denoised image set comprises:
Obtaining the labels predicted from single labeled nodes according to the semi-supervised learning model, giving the label matrix $F^* = (I - \alpha S)^{-1}[y_1, \ldots, y_i, \ldots, y_n] = (I - \alpha S)^{-1}$, where $F^*_{ij}$ is the predicted label of $x_i$ based on the query vector $y_j$ in which a single node is labeled;
Performing spectral clustering analysis on the label matrix to obtain a plurality of clusters;
Defining, from the label matrix and the clusters, the dominance score of each node as [formula image FDA00002650623200022];
Identifying noise clusters according to the inequality [formula image FDA00002650623200023], where [formula image FDA00002650623200024] denotes averaging over the data in cluster $c$, [formula image not reproduced] denotes averaging over the $k$ clusters, and $\beta$ is a preset value;
Removing the noise image sets corresponding to the noise clusters to obtain the denoised image set.
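The label matrix of claim 4 can be computed directly, because stacking the $n$ single-node query vectors side by side gives the identity matrix. The sketch below covers only that step; the dominance score and the noise inequality are not sketched because their formulas are not reproduced in the text. The helper names, the toy matrix `W`, and `alpha=0.5` are hypothetical, and the small Gauss-Jordan routine merely stands in for any linear algebra library.

```python
import math

def normalize(W):
    """S = D^{-1/2} W D^{-1/2}, where d_ii is the i-th row sum of W."""
    d = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]

def invert(A):
    """Gauss-Jordan inverse with partial pivoting; A must be non-singular."""
    n = len(A)
    aug = [[float(v) for v in A[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def label_matrix(S, alpha):
    """F* = (I - alpha*S)^{-1}; column j is the response to labeling node j alone."""
    n = len(S)
    A = [[(1.0 if i == j else 0.0) - alpha * S[i][j] for j in range(n)]
         for i in range(n)]
    return invert(A)

# Hypothetical affinities: images 0-2 are mutually similar, image 3 is an outlier.
W = [[0.0, 0.9, 0.8, 0.1],
     [0.9, 0.0, 0.7, 0.1],
     [0.8, 0.7, 0.0, 0.1],
     [0.1, 0.1, 0.1, 0.0]]
S = normalize(W)
F = label_matrix(S, alpha=0.5)
```

In this toy example the entries of `F` between the mutually similar images are larger than those involving the outlier, which is the kind of structure the subsequent spectral clustering step operates on.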
5. The image retrieval method according to claim 4, characterized in that the step of returning the denoised image set as the retrieval result corresponding to the search keyword comprises:
Building a second spectrogram model on the denoised image set, to obtain the second node set $\chi'$ corresponding to the denoised image set and the second normalized edge matrix $S'$ based on $\chi'$;
Establishing the maximization function according to the second spectrogram model: $y_p^* = \arg\max_{y_p} \left( \frac{y_p^T M_{p \times p}\, y_p}{\left(\sum_{i=1}^{n} (y_p)_i\right)^2} - \gamma \frac{1}{\left(\sum_{i=1}^{n} (y_p)_i\right)^2} \right)$, where $M = (I - \alpha S')^{-1}$, $m_{ii} = 0$, $\gamma$ is a preset value, and $M_{p \times p}$ consists of the first $p$ rows and $p$ columns of $M$;
Solving the maximization function iteratively to obtain the positive samples;
Training the semi-supervised learning model with the positive samples so as to re-rank the denoised image set, giving a re-ranked image set;
Returning the re-ranked image set as the retrieval result for the search keyword.
6. An image retrieval device, comprising:
An acquisition module, configured to acquire a search keyword and to screen a database according to the search keyword to obtain an image set;
A modeling module, configured to build a first spectrogram model of the image set according to image features, to obtain the pairwise similarity relations between the images in the image set;
A learning module, configured to build a semi-supervised learning model according to the similarity relations;
A denoising module, configured to denoise the image set according to the semi-supervised learning model to obtain a denoised image set;
A sending module, configured to return the denoised image set as the retrieval result corresponding to the search keyword.
7. The image retrieval device according to claim 6, characterized in that the modeling module is further configured to: acquire image feature values and build a feature vector for each image; represent the image set, according to the feature vectors, by a first node set $\chi = \{x_1, \ldots, x_n\}$, where $x_n$ is a multi-dimensional vector and each dimension of $x_n$ represents one feature value; build a relation matrix $W$ from the first node set, where $w_{ij} = \exp(-\|x_i - x_j\|^2 / \sigma^2)$ when $i \neq j$, and $w_{ij} = 0$ when $i = j$; and normalize the relation matrix $W$ to obtain the first normalized edge matrix $S = D^{-1/2} W D^{-1/2}$, where $D$ is the diagonal matrix whose diagonal elements are $d_{ii} = \sum_{j=1}^{n} w_{ij}$.
8. The image retrieval device according to claim 7, characterized in that the learning module is further configured to: acquire the first $p$ nodes of the first node set and label these $p$ nodes as positive samples; define a query vector $y$, where $y_i = 1$ ($i \le p$) for labeled nodes and $y_u = 0$ ($p+1 \le u \le n$) for unlabeled nodes; define a predicted label vector $f$, where $f_i$ ($1 \le i \le n$) is the predicted label of node $x_i$; build the energy function of the predicted label vector $f$, $E(f) = \sum_{i,j=1}^{n} w_{ij}\left(\frac{f_i}{\sqrt{d_{ii}}} - \frac{f_j}{\sqrt{d_{jj}}}\right)^2 + \mu \sum_{i=1}^{n} (f_i - y_i)^2$; and differentiate the energy function with respect to $f$ to obtain the semi-supervised learning model $f = (1-\alpha)(I - \alpha S)^{-1} y$, where $\alpha = 1/(1+\mu)$ and $I$ is the identity matrix.
9. The image retrieval device according to claim 8, characterized in that the denoising module is further configured to: obtain the labels predicted from single labeled nodes according to the semi-supervised learning model, giving the label matrix $F^* = (I - \alpha S)^{-1}[y_1, \ldots, y_i, \ldots, y_n] = (I - \alpha S)^{-1}$, where $F^*_{ij}$ is the predicted label of $x_i$ based on the query vector $y_j$ in which a single node is labeled; perform spectral clustering analysis on the label matrix to obtain a plurality of clusters; define, from the label matrix and the clusters, the dominance score of each node as [formula image FDA00002650623200041]; identify noise clusters according to the inequality [formula image FDA00002650623200042], where [formula image FDA00002650623200043] denotes averaging over the data in cluster $c$, [formula image FDA00002650623200044] denotes averaging over the $k$ clusters, and $\beta$ is a preset value; and remove the noise image sets corresponding to the noise clusters to obtain the denoised image set.
10. The image retrieval device according to claim 9, characterized in that the sending module is further configured to: build a second spectrogram model on the denoised image set, to obtain the second node set $\chi'$ corresponding to the denoised image set and the second normalized edge matrix $S'$ based on $\chi'$; establish the maximization function according to the second spectrogram model, $y_p^* = \arg\max_{y_p} \left( \frac{y_p^T M_{p \times p}\, y_p}{\left(\sum_{i=1}^{n} (y_p)_i\right)^2} - \gamma \frac{1}{\left(\sum_{i=1}^{n} (y_p)_i\right)^2} \right)$, where $M = (I - \alpha S')^{-1}$, $m_{ii} = 0$, $\gamma$ is a preset value, and $M_{p \times p}$ consists of the first $p$ rows and $p$ columns of $M$; solve the maximization function iteratively to obtain the positive samples; train the semi-supervised learning model with the positive samples so as to re-rank the denoised image set, giving a re-ranked image set; and return the re-ranked image set as the retrieval result for the search keyword.
CN201210572371.8A 2012-12-25 2012-12-25 Image search method and device Active CN103064941B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210572371.8A CN103064941B (en) 2012-12-25 2012-12-25 Image search method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210572371.8A CN103064941B (en) 2012-12-25 2012-12-25 Image search method and device

Publications (2)

Publication Number Publication Date
CN103064941A true CN103064941A (en) 2013-04-24
CN103064941B CN103064941B (en) 2016-12-28

Family

ID=48107571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210572371.8A Active CN103064941B (en) 2012-12-25 2012-12-25 Image search method and device

Country Status (1)

Country Link
CN (1) CN103064941B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070098254A1 (en) * 2005-10-28 2007-05-03 Ming-Hsuan Yang Detecting humans via their pose
US7562060B2 (en) * 2006-03-31 2009-07-14 Yahoo! Inc. Large scale semi-supervised linear support vector machines
CN102096825A (en) * 2011-03-23 2011-06-15 西安电子科技大学 Graph-based semi-supervised high-spectral remote sensing image classification method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈蓉,孙剑,徐宗本: "彩色图像分割中基于图上半监督学习算法研究", 《西安交通大学学报》 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103440332A (en) * 2013-09-05 2013-12-11 南京大学 Image searching method based on relation matrix regularization enhancement representation
CN103440332B (en) * 2013-09-05 2016-08-17 南京大学 A kind of image search method strengthening expression based on relational matrix regularization
CN103699612A (en) * 2013-12-13 2014-04-02 中国科学院深圳先进技术研究院 Image retrieval ranking method and device
CN103699612B (en) * 2013-12-13 2017-10-13 中国科学院深圳先进技术研究院 A kind of method and device of image retrieval sequence
CN104484438A (en) * 2014-12-23 2015-04-01 小米科技有限责任公司 Image processing method and device
CN109002442B (en) * 2017-06-06 2023-04-25 株式会社日立制作所 Device and method for searching diagnosis cases based on doctor related attributes
CN109002442A (en) * 2017-06-06 2018-12-14 株式会社日立制作所 A kind of device and method based on doctor's association attributes retrieval diagnosed case
CN107273547B (en) * 2017-07-18 2021-01-29 三六零科技集团有限公司 Three-dimensional presentation method and device for search result page of mobile terminal
CN107273547A (en) * 2017-07-18 2017-10-20 北京奇虎科技有限公司 Mobile terminal to search result page three-dimensional rendering method and device
CN109146825A (en) * 2018-10-12 2019-01-04 深圳美图创新科技有限公司 Photography style conversion method, device and readable storage medium storing program for executing
CN109740062A (en) * 2019-01-04 2019-05-10 东北大学 A kind of search mission clustering method based on study output
CN109766470A (en) * 2019-01-15 2019-05-17 北京旷视科技有限公司 Image search method, device and processing equipment
CN110413848A (en) * 2019-07-19 2019-11-05 上海赜睿信息科技有限公司 A kind of data retrieval method, electronic equipment and computer readable storage medium
CN111507407A (en) * 2020-04-17 2020-08-07 腾讯科技(深圳)有限公司 Training method and device of image classification model
CN111507407B (en) * 2020-04-17 2024-01-12 腾讯科技(深圳)有限公司 Training method and device for image classification model
CN113127663A (en) * 2021-04-01 2021-07-16 深圳力维智联技术有限公司 Target image searching method, device, equipment and computer readable storage medium
CN113127663B (en) * 2021-04-01 2024-02-27 深圳力维智联技术有限公司 Target image searching method, device, equipment and computer readable storage medium
CN113360698A (en) * 2021-06-30 2021-09-07 北京海纳数聚科技有限公司 Picture retrieval method based on image-text semantic transfer technology

Also Published As

Publication number Publication date
CN103064941B (en) 2016-12-28

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant