CN102902821B - Image high-level semantic annotation and retrieval method and device based on network hot topics

Info

Publication number: CN102902821B (application CN201210431912.5A)
Authority: CN (China)
Other versions: CN102902821A
Inventors: 王晓茹, 余志洪, 杜军平, 维旭光, 孙朝阳, 林晨
Assignee (original and current): Beijing University of Posts and Telecommunications
Application filed by Beijing University of Posts and Telecommunications; application granted
Legal status: Expired - Fee Related
Prior art keywords: image, class, annotated, word, text

Abstract

The invention discloses an image high-level semantic annotation method, retrieval method and device based on network hot topics. The annotation method comprises: using the entity semantic words of an image to be annotated as query words and, with a text-keyword-based search engine, retrieving images semantically similar to the image to be annotated together with their accompanying texts. Topics are then extracted from the accompanying texts, and the association relations between topics, between images, and between images and topics are established. On this basis, images that share similar topics and similar visual features are clustered into one class, and the similar topics corresponding to images with similar visual features are clustered into one class. The image class most similar to the visual features of the image to be annotated is selected, and its corresponding topic class is taken as the network hot topic. Through the above process, the present invention achieves high-level semantic annotation of images, and through denoising enables the obtained high-level semantics to describe the image to be annotated accurately.

Description

Image high-level semantic annotation and retrieval method and device based on network hot topics
Technical field
The present invention relates to the field of image annotation and retrieval, and in particular to an image high-level semantic annotation method, retrieval method and device based on network hot topics.
Background technology
An image is a complex kind of multimedia data containing rich semantic content. Image semantics can be divided into three levels. The first level is the low-level semantic layer, namely low-level visual features such as color and texture extracted from the original image data. The second level is the entity semantic layer: using the extracted low-level visual features, certain logical reasoning is performed to identify the object categories contained in the image, and entity semantics are extracted around the objects in the image. The third level is the abstract semantic layer, i.e. high-level semantics, which contains higher-level semantics such as scene, behavior and emotion and is a higher-level inference over the entity semantics.
With the development of digital image processing and Internet technology, users can easily obtain a large number of images. To allow users to retrieve the images they need from such a large collection, image annotation technology has emerged. Image annotation refers to the technique of adding keywords that describe the semantics of an image, so that users can retrieve relevant images from the network simply by searching keywords through text retrieval. As the technology has developed, image annotation has evolved from manual annotation to automatic image annotation, which discovers the association between semantics and low-level visual features, builds a relation model from it, and thereby annotates images of unknown semantics.
At present, automatic image annotation mainly refers to annotating images with low-level semantics and entity semantics; on this basis alone, users cannot retrieve images by entering high-level semantic content. With the development of the Internet, however, users often need to retrieve images related to high-level semantic content. For example, a user often wants to retrieve images related to a network hot topic. Here, a network hot topic refers to a (burst) event or topic widely discussed on the network within a certain period of time, generally manifested as a sharp rise in the click rate of web pages or an increase in image queries, uploads and downloads.
Therefore, a method for annotating images with high-level semantics, especially with image-related network hot topics, is urgently needed.
Summary of the invention
In view of this, the object of the present invention is to provide an image high-level semantic annotation method, retrieval method and corresponding devices based on network hot topics, so as to provide the conditions for users to retrieve images by high-level semantics.
An embodiment of the present invention provides an image high-level semantic annotation method based on network hot topics, the method comprising:
using at least one entity semantic word of the image to be annotated as a query word, retrieving from the network, with a text-keyword-based search engine, images semantically similar to the image to be annotated and the accompanying texts of the semantically similar images;
extracting the topics in the accompanying texts, and establishing the correspondence between the semantically similar images and the topics based on the correspondence between the accompanying texts and the topics;
clustering the semantically similar images that have similar visual features and share similar topics into one class, forming a set of image classes; clustering the similar topics corresponding to the semantically similar images with similar visual features into one class, forming a set of topic classes;
establishing the correspondence between the set of image classes and the set of topic classes;
according to the visual features of the image to be annotated, searching the set of image classes for the image class similar to the visual features of the image to be annotated, and extracting the topic class corresponding to the similar image class as the network hot topic of the image to be annotated;
semantically annotating the image to be annotated according to the network hot topic.
Preferably, the method further comprises the step of performing entity semantic annotation on the image to be annotated in advance, which specifically comprises:
extracting the visual features of the image to be annotated;
according to the visual features, searching a limited training set for candidate images similar to the image to be annotated;
extracting the entity semantic words of the candidate images, and performing entity semantic annotation on the image to be annotated using the entity semantic words.
Preferably, after the extracting of the entity semantic words of the candidate images and before the entity semantic annotation of the image to be annotated using the entity semantic words, the method further comprises:
clustering the candidate images with similar entity semantics into one class according to the entity semantic words, forming a set of candidate image classes;
searching the set of candidate image classes for the candidate image class most similar to the visual features of the image to be annotated as the neighbor image class;
and the entity semantic annotation of the image to be annotated using the entity semantic words comprises:
performing entity semantic annotation on the image to be annotated using the entity semantic words of the neighbor image class.
Preferably, the clustering of the candidate images with similar entity semantics into one class according to the entity semantic words to form candidate image classes comprises:
establishing a hypergraph model G(Vs, Ts) and obtaining a similarity matrix H of the hypergraph model based on it, wherein the hypergraph model takes the set Vs of candidate images similar to the image to be annotated as the vertex set and the set Ts of entity semantic words of the candidate images as the hyperedge set; the element Hij of the matrix H represents the association between each image Vi and its corresponding entity semantic word Tj and the co-occurrence of each entity semantic word with multiple candidate images;
according to the similarity matrix H, clustering the hypergraph model with spectral clustering so that candidate images sharing a number of hyperedges are clustered into one class, forming the candidate image classes.
Preferably, the method further comprises:
calculating the degree of correlation between each entity semantic word in the neighbor image class and the image to be annotated using the formula relevance(ti, iq) = Σ_{ii∈S} p(ii|iq)·δ(ti, ii), wherein ii is a neighbor image in the neighbor image class S, iq is the image to be annotated, p(ii|iq) equals the visual feature similarity between ii and iq, and δ(ti, ii) is 1 when ii is annotated with the word ti and 0 otherwise;
and the entity semantic annotation of the image to be annotated using the entity semantic words of the neighbor image class comprises:
selecting, in descending order of the degree of correlation, a predetermined number of entity semantic words from the neighbor image class to perform entity semantic annotation on the image to be annotated.
Preferably, the extracting of the topics in the accompanying texts and the establishing of the correspondence between the semantically similar images and the topics based on the correspondence between the accompanying texts and the topics comprise:
establishing an LDA model with the accompanying texts, extracting the topics based on the LDA model, and establishing an image-topic correlation matrix Rvt;
and the clustering of the semantically similar images with similar visual features and similar topics into one class to form the set of image classes, and the clustering of the corresponding similar topics into one class to form the set of topic classes, comprise:
establishing the topic correlation matrix Rt of the accompanying texts;
calculating the visual similarity matrix Rv of the semantically similar images using the visual similarity between images;
establishing a complex graph model G(Rv, Rt, Rvt) using Rt, Rvt and Rv;
clustering the complex graph G(Rv, Rt, Rvt) to form the set of image classes and the set of topic classes.
Preferably, the semantic annotation of the image to be annotated according to the network hot topic comprises:
using the chi-square test (χ²) to extract the top K words most correlated with the network hot topic and semantically annotating the image to be annotated with them.
The present invention also provides an image high-level semantic annotation device based on network hot topics, the device comprising:
a text retrieval unit, configured to use at least one entity semantic word of the image to be annotated as a query word and retrieve from the network, with a text-keyword-based search engine, images semantically similar to the image to be annotated and the accompanying texts of the semantically similar images;
a topic extraction unit, configured to extract the topics in the accompanying texts;
a first association unit, configured to establish the correspondence between the semantically similar images and the topics based on the correspondence between the accompanying texts and the topics;
a clustering unit, configured to cluster the semantically similar images that have similar visual features and share similar topics into one class to form a set of image classes, and to cluster the similar topics corresponding to the semantically similar images with similar visual features into one class to form a set of topic classes;
a second association unit, configured to establish the correspondence between the set of image classes and the set of topic classes;
a first content retrieval unit, configured to search, according to the visual features of the image to be annotated, the set of image classes for the image class similar to the visual features of the image to be annotated;
a hot topic extraction unit, configured to extract the topic class corresponding to the similar image class as the network hot topic of the image to be annotated;
a hot topic annotation unit, configured to semantically annotate the image to be annotated according to the network hot topic.
Preferably, the device further comprises an entity semantic annotation unit, configured to perform entity semantic annotation on the image to be annotated; the entity semantic annotation unit specifically comprises:
a visual feature extraction unit, configured to extract the visual features of the image to be annotated;
a second content retrieval unit, configured to search a limited training set for candidate images similar to the image to be annotated according to the visual features;
an entity semantic word extraction unit, configured to extract the entity semantic words of the candidate images;
an entity semantic annotation subunit, configured to perform entity semantic annotation on the image to be annotated using the entity semantic words.
Preferably, the device further comprises a denoising unit, configured to denoise the candidate images; specifically, the denoising unit comprises:
a candidate image clustering unit, configured to cluster the candidate images with similar entity semantics into one class according to the entity semantic words, forming a set of candidate image classes;
a third content retrieval unit, configured to search the set of candidate image classes for the candidate image class most similar to the visual features of the image to be annotated as the neighbor image class;
and the entity semantic annotation subunit is specifically configured to perform entity semantic annotation on the image to be annotated using the entity semantic words of the neighbor image class.
Preferably, the candidate image clustering unit comprises:
a hypergraph model unit, configured to establish a hypergraph model G(Vs, Ts) and obtain a similarity matrix H of the hypergraph model based on it, wherein the hypergraph model takes the set Vs of candidate images similar to the image to be annotated as the vertex set and the set Ts of entity semantic words of the candidate images as the hyperedge set, and the element Hij of the matrix H represents the association between each image Vi and its corresponding entity semantic word Tj and the co-occurrence of each entity semantic word with multiple candidate images;
a spectral clustering unit, configured to cluster the hypergraph model with spectral clustering according to the similarity matrix H, so that candidate images sharing a number of hyperedges are clustered into one class, forming the candidate image classes.
Preferably, the device further comprises:
a correlation calculation unit, configured to calculate the degree of correlation between each entity semantic word in the neighbor image class and the image to be annotated using the formula relevance(ti, iq) = Σ_{ii∈S} p(ii|iq)·δ(ti, ii), wherein ii is a neighbor image in the neighbor image class S, iq is the image to be annotated, and p(ii|iq) equals the visual feature similarity between ii and iq;
and the entity semantic annotation subunit is specifically configured to select, in descending order of the degree of correlation, a predetermined number of entity semantic words from the neighbor image class to perform entity semantic annotation on the image to be annotated.
Preferably, the topic extraction unit is specifically configured to establish an LDA model with the accompanying texts and extract the topics based on the LDA model;
the first association unit is specifically configured to establish an image-topic correlation matrix Rvt based on the LDA model;
and the clustering unit comprises:
a topic correlation matrix unit, configured to establish the topic correlation matrix Rt of the accompanying texts;
a visual similarity matrix unit, configured to calculate the visual similarity matrix Rv of the semantically similar images using the visual similarity between images;
a complex graph model unit, configured to establish a complex graph model G(Rv, Rt, Rvt) using Rt, Rvt and Rv;
a complex graph clustering unit, configured to cluster the complex graph G(Rv, Rt, Rvt) to form the set of image classes and the set of topic classes.
Preferably, the hot topic annotation unit is specifically configured to use the chi-square test (χ²) to extract the top K words most correlated with the network hot topic and to semantically annotate the image to be annotated with them.
The present invention also provides an image retrieval method based on network hot topics, wherein the network hot topics are obtained with the above image annotation method; the retrieval method comprises:
receiving an image retrieval text input by a user, the image retrieval text comprising at least one network hot topic word;
crawling images on the Internet and the annotation information of the images;
judging whether the annotation information of an image matches the image retrieval text;
and, if they match, outputting the matched image and its annotation information.
Preferably, when the image retrieval text contains both a network hot topic and entity semantics, the judging of whether the annotation information of the image matches the image retrieval text comprises:
judging whether the annotation information of the image contains the network hot topic word.
Preferably, the outputting of the matched image and its annotation information when they match comprises:
if they match, outputting the matched image and the content of its annotation information that is relevant to the image retrieval text.
The present invention also provides an image retrieval device based on network hot topics, wherein the network hot topics are obtained with the above image annotation method; the device comprises:
a text receiving unit, configured to receive an image retrieval text input by a user, the image retrieval text comprising at least one network hot topic word;
a crawling unit, configured to crawl images on the Internet and the annotation information of the images;
a judging unit, configured to judge whether the annotation information of an image matches the image retrieval text;
an output unit, configured to output the matched image and its annotation information when the annotation information of the image matches the image retrieval text.
Preferably, the judging unit is specifically configured to judge, when the image retrieval text contains both a network hot topic and entity semantics, whether the annotation information of the image contains the network hot topic word.
Preferably, the output unit is specifically configured to output, when the annotation information of the image matches the image retrieval text, the matched image and the content of its annotation information that is relevant to the image retrieval text.
Compared with the prior art, the present invention has the following beneficial effects:
The invention provides an image high-level semantic annotation method based on network hot topics. First, using the entity semantic words of the image to be annotated and a text-keyword-based search engine, images semantically similar to the image to be annotated and their accompanying texts are retrieved from the mass data of the Internet. Because the training set is based on the Internet, the semantics obtained are comprehensive and updated in real time. The invention then extracts topics from the accompanying texts and establishes the association relations between topics, between images, and between images and topics; on this basis, images with similar topics and similar visual features are clustered into one class, and the similar topics corresponding to images with similar visual features are clustered into one class. The image class most similar to the visual features of the image to be annotated is selected, and its corresponding topic is taken as the hot topic. Through this process, the topics in the accompanying texts that are irrelevant to the corresponding images, and the images with low visual similarity to the image to be annotated, are removed as noise, so that the obtained high-level semantics describe the image to be annotated accurately.
Brief description of the drawings
Fig. 1 is a flow chart of the text-based image retrieval method of Embodiment 1 of the present invention;
Fig. 2A-2C show an image to be annotated and semantically similar images retrieved by text in an embodiment of the present invention;
Fig. 3 is a flow chart of the denoising of the semantically similar images in an embodiment of the present invention;
Fig. 4 is a flow chart of the high-level semantic annotation method based on network hot topics in Embodiment 2 of the present invention;
Fig. 5 is a flow chart of performing entity semantic annotation on the image to be annotated in advance in Embodiment 3 of the present invention;
Fig. 6 is a schematic diagram of the hypergraph model in an embodiment of the present invention;
Fig. 7 is a structural diagram of the image high-level semantic annotation device based on hot topics of Embodiment 5 of the present invention;
Fig. 8 is a flow chart of the image retrieval method based on network hot topics of Embodiment 6 of the present invention;
Fig. 9 is a structural diagram of the image retrieval device based on network hot topics of Embodiment 7 of the present invention.
Embodiment
In order to enable those skilled in the art to better understand the solutions of the embodiments of the present invention, the embodiments are described in further detail below with reference to the drawings.
To annotate an image with a network hot topic, the network hot topic corresponding to the image must be determined. As we know, most images on the network are embedded in text information. The text information that appears together with an image is called its accompanying text, for example the textual content of the web page. An image is to a large extent related to its accompanying text, so the hot topics reflected in the accompanying text can preliminarily be regarded as the hot topics corresponding to the image.
As we also know, the semantics of an image can be obtained from its similar images. Based on this, the present invention retrieves images similar to the image to be annotated together with the accompanying texts of these similar images, and then extracts the network hot topic from these accompanying texts; this network hot topic is the hot topic corresponding to the image to be annotated.
The retrieval of similar images and their accompanying texts can be carried out in several ways. Embodiment 1 of the present invention provides a text-based image retrieval (TBIR, Text Based Image Retrieval) method; referring to Fig. 1, the method comprises:
S11: using at least one entity semantic word of the image to be annotated as a query word, retrieving from the network, with a text-keyword-based search engine, images semantically similar to the image to be annotated and the accompanying texts of the semantically similar images.
The at least one entity semantic word may be entered manually by the user according to the image to be annotated, or may be extracted automatically by the system from the entity semantic annotation of the input image to be annotated. In practice, an image to be annotated may have several entity semantic words; to search as widely as possible for images semantically similar to the image to be annotated, the user or the system may select only one of them, or only some of the entity semantic words, for retrieval. The images retrieved in this way, however, may include many images with low similarity to the image to be annotated. To improve the similarity between the retrieved images and the image to be annotated, in a preferred embodiment of the invention all the entity semantic words of the image to be annotated may be used as query words.
In the present invention, the search system may be Baidu or Google, or of course any other search system.
Based on the retrieved accompanying texts, hot topics can be formed by extracting the topics in the accompanying texts. These hot topics, however, correspond to the texts; if they are directly associated with the images on the basis of the association between images and accompanying texts, a lot of noise is introduced, mainly in two respects:
First, the accompanying text of an image is usually taken from the web page in which the image appears, and the image is only a part of the web page, so some topics of the web page text are irrelevant to the semantics of the image.
For example, a web page describing the configuration of a mobile phone may contain an image of the phone's loudspeaker, while most of the page describes the structure of other parts of the phone and has nothing to do with the loudspeaker.
Secondly, all the accompanying texts are obtained with the same query word. Because of the visual polysemy of words, the images corresponding to these accompanying texts may differ greatly in appearance from the image to be annotated.
For example, suppose Fig. 2A is the image to be annotated and its entity semantic word is "apple". Searching with "apple" as the query word may return the image of Fig. 2B, which is similar to Fig. 2A, but may also return images of an "iPhone" that are completely different from Fig. 2A.
For these reasons, when determining the network hot topic corresponding to the image to be annotated, the visual similarity of images must be used to denoise the semantically similar images, picking out the image set most similar to the image to be annotated and its accompanying text set, thereby improving the correctness of the chosen hot topic. The denoising process, shown in Fig. 3, comprises:
S21: extracting the topics in the accompanying texts, establishing the correspondence between the accompanying texts and their topics, and establishing the correspondence between images and topics based on the correspondence between the accompanying texts and the topics.
S22: establishing the co-occurrence relation between topics and the visual similarity relation between images.
S23: clustering the similar topics corresponding to the images with similar visual features into one class to form a set of topic classes; clustering the images that share similar topics and have similar visual features into one class to form a set of image classes; and establishing the correspondence between the image classes and the topic classes on this basis.
S24: searching the set of image classes for the image class visually similar to the image to be annotated; the topic class corresponding to this visually similar image class is the network hot topic of the image to be annotated.
In the present invention, a number of image classes visually similar to the image to be annotated may be found in the set of image classes. In a preferred embodiment of the invention, the image class most visually similar to the image to be annotated is searched for.
As can be seen from the above process, three kinds of association relations need to be considered in denoising: the co-occurrence relation between topics, the similarity relation between images, and the correspondence between images and topics. Since an image has a certain semantic association with its accompanying text, the correspondence between an image and a topic can be approximated by the correspondence between its accompanying text and the topic.
In the present invention, the above association relations can be obtained in various ways.
For example, the correspondence between accompanying texts and topics and the co-occurrence relation between topics can be established during topic extraction. Topics can be extracted from the accompanying texts in various ways, for example with existing topic models such as the vector space model or latent semantic analysis (LSA); in the present invention, LDA (Latent Dirichlet Allocation) is preferred for topic extraction.
The LDA model is a probabilistic topic model for discrete text, a three-level text-topic-word Bayesian model. It represents a text as a probabilistic mixture of topics, has a descriptive power closer to the semantics of real text data, and can process large-scale corpora efficiently. By modeling topics, the LDA model changes the feature space of a text from the dimension of words to the dimension of topics and maps synonymous and near-synonymous words to the same topic, realizing modeling at the semantic level.
The model has two sets of parameters to estimate: the "text-topic" distribution and the "topic-word" distributions. From these two parameters we can know which topics a text author is interested in and the topic proportions contained in each text. The present invention needs to estimate the "text-topic" association, i.e. the "text-topic" distribution. Existing parameter estimation methods mainly include the variational EM (expectation maximization) algorithm and Gibbs sampling; the present invention prefers Gibbs sampling. Using GibbsLDA, the association between a text and a topic can be obtained as p(zj|dj) = (n_j^dj + α) / (n_dj + T·α), where α is the parameter of the Dirichlet distribution over topics, T is the number of distinct topics, n_j^dj is the number of times topic j occurs in text dj, and n_dj is the total number of topic occurrences in text dj.
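For illustration only, the following Python sketch (not part of the patent; the function and variable names are ours) computes this smoothed text-topic distribution from Gibbs-sampling topic-assignment counts:

```python
import numpy as np

def text_topic_distribution(topic_counts, alpha):
    """topic_counts: array of shape (num_texts, T); entry (d, j) is n_j^d,
    the number of words in text d assigned to topic j by Gibbs sampling.
    Returns the smoothed distribution (n_j^d + alpha) / (n_d + T * alpha)."""
    topic_counts = np.asarray(topic_counts, dtype=float)
    T = topic_counts.shape[1]
    n_d = topic_counts.sum(axis=1, keepdims=True)       # total assignments per text
    return (topic_counts + alpha) / (n_d + T * alpha)   # each row sums to 1

# Example: 2 texts, 3 topics, alpha = 0.1
theta = text_topic_distribution([[5, 3, 2], [0, 8, 2]], alpha=0.1)
```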
The co-occurrence relation between topics can be represented by a matrix Rt of co-occurrences between topics, whose elements Topic(zi, zj) are defined as follows:
Topic(zi, zj) = p(zi|zj) = [C(zi∩zj) / (C(zi∩zj) + C(z̄i∩z̄j))] · [C(zi∩zj) / C(zj)] + [C(z̄i∩z̄j) / (C(zi∩zj) + C(z̄i∩z̄j))] · [C(z̄i∩z̄j) / C(z̄j)]
where C(zi∩zj) and C(z̄i∩z̄j) denote, respectively, the number of times topics zi and zj occur together in the accompanying text set and the number of times they are both absent, characterizing the association between the topics.
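A small sketch of how the matrix Rt could be computed from binary text-topic occurrence data, under the assumption that a topic counts as "occurring" in a text when the LDA model assigns it to that text; the names are illustrative only:

```python
import numpy as np

def topic_cooccurrence_matrix(doc_topics):
    """doc_topics: boolean matrix, entry (d, k) is True when topic k occurs in text d."""
    doc_topics = np.asarray(doc_topics, dtype=bool)
    n_docs, n_topics = doc_topics.shape
    Rt = np.zeros((n_topics, n_topics))
    for i in range(n_topics):
        for j in range(n_topics):
            both = np.sum(doc_topics[:, i] & doc_topics[:, j])        # C(zi ∩ zj)
            neither = np.sum(~doc_topics[:, i] & ~doc_topics[:, j])   # C(z̄i ∩ z̄j)
            cj = np.sum(doc_topics[:, j])                             # C(zj)
            not_cj = n_docs - cj                                      # C(z̄j)
            denom = both + neither
            if denom == 0:
                continue
            term1 = (both / denom) * (both / cj) if cj else 0.0
            term2 = (neither / denom) * (neither / not_cj) if not_cj else 0.0
            Rt[i, j] = term1 + term2
    return Rt
```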
The visual similarity relation between images can be obtained by computing the similarity between images. Using the prior art, the visual similarity of two images can be computed from their feature vectors. Specifically, an image-image similarity matrix Hv can be established, whose elements Sim_v(Ii, Ij) are defined as follows:
Sim_v(Ii, Ij) = (Σ_{d=1..n} min(id, jd) / max(id, jd)) / n
where the n-dimensional feature vector extracted from image Ii is denoted Ii = [i1, i2, i3, ..., in], the n-dimensional feature vector extracted from image Ij is denoted Ij = [j1, j2, j3, ..., jn], and id and jd denote the number of times the d-th feature occurs in the corresponding image.
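An illustrative sketch of this similarity measure, assuming each image is represented by an n-dimensional vector of non-negative feature counts (the handling of dimensions where both counts are zero is our assumption, since the formula leaves that case undefined):

```python
import numpy as np

def visual_similarity(fi, fj):
    """Sim_v(Ii, Ij): average over dimensions of min(id, jd) / max(id, jd)."""
    fi, fj = np.asarray(fi, dtype=float), np.asarray(fj, dtype=float)
    lo, hi = np.minimum(fi, fj), np.maximum(fi, fj)
    ratio = np.where(hi > 0, lo / np.maximum(hi, 1e-12), 1.0)  # 0/0 treated as a match (assumption)
    return ratio.mean()

def visual_similarity_matrix(features):
    """features: list of n-dimensional feature vectors of the semantically similar images."""
    m = len(features)
    Rv = np.ones((m, m))
    for a in range(m):
        for b in range(a + 1, m):
            Rv[a, b] = Rv[b, a] = visual_similarity(features[a], features[b])
    return Rv
```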
Each image has a certain topical semantic association with its accompanying text, so the association between an image and a topic can be approximated by the association between the image's accompanying text and the topic. Specifically, the correspondence between images and topics can be represented by the probability p(zj|Ii) that image Ii is annotated with topic zj, which is computed from p(zj|dj) obtained from LDA training (with p(zj|Ii) ≈ p(zj|dj)) and from p(Ij|Ii) ≈ Sim_v(Ii, Ij).
Based on the above three association relations, the present invention clusters the images and topics, which can be done in various ways. Since the three association relations involve two heterogeneous kinds of vertices, images and topics, a complex graph model is preferred for clustering in the embodiments of the present invention.
Define a complex graph G = {Rv, Rt, Rvt}, in which Rv and Rt denote the topic vertex set and the image vertex set respectively, and the edge set Rvt comprises two subsets, denoted S and A, where S is the edge weight matrix of the homogeneous connections within Rv and A is the edge weight matrix of the heterogeneous connections between Rv and Rt; N1 and N2 denote the numbers of vertices in Rv and Rt respectively.
Based on this definition, clustering the complex graph yields separate clusterings of the topic vertex set and the image vertex set; moreover, using the homogeneous and heterogeneous connections between vertices, the correspondence between the classes of the two vertex sets can be established, i.e. the correspondence between the i-th class in the image vertex set Rt and the j-th class in the topic vertex set Rv.
Based on the above analysis, Embodiment 2 of the present invention provides a high-level semantic annotation method based on network hot topics; referring to Fig. 4, the specific flow is as follows:
S31: inputting the initial query words, and obtaining a set of semantically similar images and a set of accompanying texts with a text search engine.
S32: computing the visual similarity matrix Rv between the semantically similar images using the visual similarity of images.
S33: establishing an LDA model, the topic correlation matrix Rt and the image-topic correlation matrix Rvt with the accompanying text set.
S34: inputting Rv, Rt and Rvt, and establishing the complex graph model G(Rv, Rt, Rvt).
S35: clustering the complex graph G(Rv, Rt, Rvt), finding the expanded neighbor image set visually similar to the query image, and extracting its corresponding hot topic.
Based on prior art such as the complex graph clustering algorithm proposed by Wu Fei, Han Yahong, Zhuan Yueting and Shao Jian in "Web image clustering method based on image-text relevance mining", the clustering of the complex graph G can be transformed into solving the following optimization problem, defined as finding the optimal solution of the objective function L:
arg min_{C(1),C(2)} L,  L = min_{C(1),C(2)} ||S - C(1) D (C(1))^T||^2 + ||A - C(1) B (C(2))^T||^2,  s.t. C(1) ∈ {0,1}^(N1×K1), C(2) ∈ {0,1}^(N2×K2)   (Formula 1)
where matrix C(1) represents the clustering of the vertices in Rv, i.e. C(1)(i, j) indicates whether the i-th vertex in Rv belongs to the j-th class, and matrix C(2) represents the clustering of the vertices in Rt. Matrix B represents the association between the classes of Rv and the classes of Rt: B(i, j) is the association strength between the i-th class in Rv and the j-th class in Rt, expressed as a probability. Matrix D represents the association among the classes within Rv: D(i, j) is the connection strength between the i-th class and the j-th class in Rv, also expressed as a probability.
The solution formulas for the matrices D and B are given below. If D and B are the optimal solutions of Formula (1), then:
D = ((C(1))^T C(1))^(-1) (C(1))^T S C(1) ((C(1))^T C(1))^(-1),  B = ((C(1))^T C(1))^(-1) (C(1))^T A C(2) ((C(2))^T C(2))^(-1),  s.t. C(1) ∈ {0,1}^(N1×K1), C(2) ∈ {0,1}^(N2×K2), D ∈ R+^(K1×K1), B ∈ R+^(K1×K2)   (Formula 2)
Based on the above analysis, the complex graph clustering method used in the present invention is as follows:
Input: the complex graph G = (Rv, Rt, Rvt); the number of clusters K1 for the set Rv and the number of clusters K2 for the set Rt.
Output: the topic clustering result C(1); the image clustering result C(2); a matrix P of size K1×K2 whose elements represent the association between each class of the clustered topic vertex set and each class of the clustered image set.
The specific calculation flow is:
1. Given initial values of C(1) and C(2), compute the initial values of D, B and L in turn according to Formulas (1) and (2), and set L_min = L_init;
2. Fixing D, B and C(2), update the position of the 1 in each row of C(1) so that each update yields the minimal L, and update L_min;
3. Fixing D, B and C(1), update the position of the 1 in each row of C(2) so that each update yields the minimal L, and update L_min;
4. Compute D and B in turn according to Formula (2);
5. Repeat steps 2-4 until convergence;
6. Compute the association matrix of image classes and topic classes according to P(I|T) = P(I|T')·P(T'|T), where P(I|T') is the association strength matrix of images and topics, i.e. the matrix B, and P(T'|T) is the association strength matrix of topics and topics, i.e. the matrix D. The incidence matrix P(I|T) is obtained from the D and B matrices corresponding to the final L of step 5.
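The following Python sketch is one possible reading of steps 1-6; it is not the patent's reference implementation and all names are ours. S is the homogeneous weight matrix, A the heterogeneous weight matrix, the cluster indicators are updated row by row with D and B recomputed from Formula (2), and a fixed iteration count stands in for the convergence test of step 5:

```python
import numpy as np

def solve_D_B(S, A, C1, C2):
    """Formula (2): closed-form D and B for fixed cluster indicators."""
    P1 = np.linalg.pinv(C1.T @ C1)
    P2 = np.linalg.pinv(C2.T @ C2)
    return P1 @ C1.T @ S @ C1 @ P1, P1 @ C1.T @ A @ C2 @ P2

def objective(S, A, C1, C2, D, B):
    """Formula (1)."""
    return (np.linalg.norm(S - C1 @ D @ C1.T) ** 2
            + np.linalg.norm(A - C1 @ B @ C2.T) ** 2)

def complex_graph_cluster(S, A, K1, K2, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    N1, N2 = A.shape
    C1 = np.eye(K1)[rng.integers(0, K1, N1)]   # random initial 0/1 indicators
    C2 = np.eye(K2)[rng.integers(0, K2, N2)]
    for _ in range(iters):                     # stand-in for "repeat until convergence"
        D, B = solve_D_B(S, A, C1, C2)
        for i in range(N1):                    # step 2: update C1 row by row
            scores = []
            for k in range(K1):
                C1[i] = 0; C1[i, k] = 1
                scores.append(objective(S, A, C1, C2, D, B))
            best = int(np.argmin(scores)); C1[i] = 0; C1[i, best] = 1
        for j in range(N2):                    # step 3: update C2 row by row
            scores = []
            for k in range(K2):
                C2[j] = 0; C2[j, k] = 1
                scores.append(objective(S, A, C1, C2, D, B))
            best = int(np.argmin(scores)); C2[j] = 0; C2[j, best] = 1
    D, B = solve_D_B(S, A, C1, C2)             # step 4 with the final indicators
    P = D @ B                                  # step 6, reading P(I|T) = P(I|T')·P(T'|T) as D·B (our interpretation)
    return C1, C2, D, B, P
```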
The complex graph clustering algorithm is used to cluster the image vertex set and the topic vertex set separately and to establish a one-to-one mapping between the classes of the topic vertex set and the classes of the image vertex set. In the clustering process the three association relations interact: for image clustering, images with similar visual content that share similar topics are aggregated into one class; for topic clustering, the related topics of similar images are aggregated into one class, forming a hot topic. For example, when an image of Zuckerberg is used as the initial query image, the initial query word "Zuckerberg" is obtained; TBIR retrieval then returns many images and accompanying texts in which Zuckerberg appears, and after complex graph modeling and clustering a set of images similar to the query image is formed, together with multiple annotation words related to "Zuckerberg", such as "facebook", "founder" and "social network film".
The hot topic extracted by the above method is the topic corresponding to the image class most similar to the image to be annotated. To further improve the accuracy of the hot topic, in a preferred embodiment of the invention the chi-square test (χ²) may be used to extract the top K words most correlated with the topic as the annotation words of the topic, with which the image to be annotated is annotated. K is greater than 0 and no greater than the number of words in the network hot topic.
χ² is sometimes written CHI, and its formula is: CHI(t, ci) = N·(AD − BC)² / [(A + C)·(B + D)·(A + B)·(C + D)]
where t is a word in the word list, ci is a hot topic, and CHI(t, ci) is the degree of association between t and ci; N is the total number of documents, A is the number of documents that contain the word t and belong to topic ci, B is the number of documents that contain t but do not belong to ci, C is the number of documents that do not contain t but belong to ci, and D is the number of documents that neither contain t nor belong to ci.
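A minimal sketch of this CHI ranking, assuming the four document counts A, B, C and D have already been tallied for each candidate word of a hot topic:

```python
def chi_square(N, A, B, C, D):
    """CHI(t, ci) = N * (AD - BC)^2 / ((A + C)(B + D)(A + B)(C + D))."""
    denom = (A + C) * (B + D) * (A + B) * (C + D)
    return 0.0 if denom == 0 else N * (A * D - B * C) ** 2 / denom

def top_k_words(word_counts, k):
    """word_counts: {word: (A, B, C, D)} for one hot topic; the four counts sum
    to the total number of documents N. Returns the K highest-scoring words."""
    scored = [(chi_square(sum(counts), *counts), w) for w, counts in word_counts.items()]
    return [w for _, w in sorted(scored, reverse=True)[:k]]
```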
It should be noted that the hot topic obtained by the above process may well contain descriptive words of the entity semantics of the image. It can be seen that, based on the above process, the accompanying texts of similar images on the Internet are used at the same time to expand the entity semantic annotation of the image to be annotated. Moreover, compared with the existing approach of a limited training set, the present invention uses the mass data of the Internet as the training set, so the annotated semantics are comprehensive and updated in real time.
In Embodiment 3 of the present invention, before the entity semantic words of the image to be annotated are obtained, the method of the invention further comprises performing entity semantic annotation on the image to be annotated in advance. As shown in Fig. 5, this process comprises the following steps:
S41: extracting the visual features of the image to be annotated.
S42: according to the visual features, searching a limited training set for candidate images similar to the image to be annotated using content-based image retrieval (CBIR, Content Based Image Retrieval).
S43: extracting the entity semantic words of the candidate images, and performing entity semantic annotation on the image to be annotated using the entity semantic words.
It should be noted that, because of the semantic gap, the candidate images may be visually similar to the image to be annotated and yet differ greatly in semantics. For example, if the image of an apple is input, CBIR retrieval may output spherical objects similar to an apple, people's heads and the like as candidate images, and entity semantic words obtained from these candidate images, such as "ball" and "head", are obviously not the true semantics of the image to be annotated. Therefore, in a preferred embodiment of the invention, the candidate images need to be denoised so as to select the candidate images semantically closer to the image to be annotated.
As we know, each image in the candidate image set has been annotated with a varying number of entity semantic words; these entity semantic words can be regarded as the accompanying text of the corresponding image, and an accompanying text and its corresponding image are semantically related to some extent. Therefore, to examine the differences in semantic content between these visually similar candidate images, the semantic content of each image can be approximated by the semantic content of its accompanying text, so that the semantic difference between candidate images can be obtained indirectly from the semantic differences between their accompanying texts. Accompanying texts whose semantic content is similar share a larger number of similar entity semantic words.
Based on this, Embodiment 4 of the present invention provides a denoising method for candidate images, which specifically comprises:
clustering the images whose accompanying texts are semantically similar, according to the similarity between entity semantic words, to form a set of candidate image classes;
then selecting the entity semantic words of the image class most visually similar to the image to be annotated to annotate the image to be annotated.
In the present invention, a hypergraph is used to map the entity semantic words of the images and their association with the accompanying texts.
A hypergraph is a graph model whose edges, called hyperedges, can connect two or more vertices; that is, each edge is a subset of the vertices. In the present invention, the vertices of the hypergraph represent images and the hyperedges are entity semantic words. Obviously, an image may be associated with several hyperedges.
For example, suppose there are images A, B, C, D and E, where the entity semantic words of the accompanying text of image A are "apple", "fruit" and "price", those of image B are "apple" and "fruit", those of image C are "apple", "fruit" and "illumination", those of image D are "Jobs" and "price", and that of image E is "football". Then the accompanying texts of images A, B and C share the entity semantic words "apple" and "fruit", the accompanying texts of images A and D share the entity semantic word "price", and the accompanying text of image E shares no entity semantic word with that of any other image. Representing these relations with a hypergraph, as shown in Fig. 6, images A, B and C share two hyperedges, images A and D share one hyperedge, and image E is an isolated vertex.
In the candidate image set, noise images usually differ from the query image in semantic content, noise images also differ from one another semantically, and candidate images similar to the query image in semantic content are also semantically similar to each other. From the clustering point of view, scattered isolated points therefore often correspond to noise images, while the images semantically similar to the query image form larger classes. Cluster analysis can thus be performed on the hypergraph so that the candidate images are separated according to text similarity. Through clustering, the vertices (images) within a class share as many hyperedges (entity semantic words) as possible, while different classes differ greatly.
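As one possible realization of this clustering (sklearn's spectral clustering is our choice here, not something the patent prescribes), the incidence matrix H can be built from the entity semantic words of each candidate image and the number of shared hyperedges used as the affinity:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_candidates(image_words, num_classes):
    """image_words: list of sets of entity semantic words, one set per candidate image."""
    vocab = sorted(set().union(*image_words))
    H = np.array([[1.0 if w in words else 0.0 for w in vocab] for words in image_words])
    similarity = H @ H.T + 1e-6     # shared hyperedges per image pair; small constant keeps the graph connected
    model = SpectralClustering(n_clusters=num_classes, affinity="precomputed", random_state=0)
    return model.fit_predict(similarity)   # class label for each candidate image

# With the images A-E of the example above, A, B and C tend to fall into one class
labels = cluster_candidates(
    [{"apple", "fruit", "price"}, {"apple", "fruit"}, {"apple", "fruit", "illumination"},
     {"Jobs", "price"}, {"football"}], num_classes=3)
```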
After the above hypergraph modeling and clustering, the retrieved images are aggregated into several classes; using the visual similarity of images, the image class most similar to the query image can be found, and extracting the entity semantic words in this image class yields the initial entity semantics.
Each image class corresponds to several entity semantic words, and each entity semantic word has a different degree of correlation with the image to be annotated. To further improve the accuracy of the entity semantic annotation of the image to be annotated, in a preferred embodiment of the invention the degree of correlation between each entity semantic word in the most similar image class and the image to be annotated is first calculated, and the K entity semantic words with the highest degrees of correlation are then selected to annotate the image to be annotated.
The degree of correlation between each entity semantic word in the most similar image class and the image to be annotated can be calculated with the formula relevance(ti, iq) = Σ_{ii∈S} p(ii|iq)·δ(ti, ii), where S is the most similar image class, ti is an entity semantic word in the most similar image class, ii is an image in the most similar image class, and iq is the image to be annotated; p(ii|iq) ≈ similarity(ii, iq), where similarity(ii, iq) denotes the visual feature similarity of ii and iq, which can be calculated with methods in the prior art and is not elaborated here; δ(ti, ii) takes the value 1 when ii contains ti and 0 otherwise.
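A short sketch of this scoring, assuming each image in the most similar class S is given with its set of entity semantic words and its visual similarity to the image to be annotated:

```python
def word_relevance(word, class_images, query_similarity):
    """class_images: list of (image_id, set_of_words); query_similarity: dict
    image_id -> visual similarity to the image to be annotated."""
    return sum(query_similarity[img] for img, words in class_images if word in words)

def top_k_entity_words(class_images, query_similarity, k):
    vocab = set().union(*(words for _, words in class_images))
    ranked = sorted(vocab, reverse=True,
                    key=lambda w: word_relevance(w, class_images, query_similarity))
    return ranked[:k]
```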
In summary, the invention provides an image high-level semantic annotation method based on hot topics. First, using the entity semantic words of the image to be annotated and text-based image retrieval, images semantically similar to the image to be annotated and their accompanying texts are retrieved from the mass data of the Internet. Because the training set is based on the Internet, the semantics obtained are comprehensive and updated in real time. The invention then extracts topics from the accompanying texts and establishes the association relations between topics, between images, and between images and topics; on this basis, images with similar topics and similar visual features are clustered into one class, and the similar topics corresponding to images with similar visual features are clustered into one class. The image most similar to the visual features of the image to be annotated is then selected, and its corresponding topic is taken as the hot topic. Through this process, the topics in the accompanying texts that are irrelevant to the corresponding images, and the images with low visual similarity to the image to be annotated, are removed as noise, so that the obtained high-level semantics describe the image to be annotated accurately.
Corresponding to the above method, Embodiment 5 of the present invention further provides an image high-level semantic annotation device based on hot topics; referring to Fig. 7, the device comprises:
A receiving unit 11, configured to receive at least one entity semantic word of the image to be annotated input by the user.
A text retrieval unit 12, configured to use the at least one entity semantic word as a query word and retrieve from the network, with text-based image search, images semantically similar to the image to be annotated and the accompanying texts of the semantically similar images.
In practice, an image to be annotated may have several entity semantic words; to search as widely as possible for images semantically similar to the image to be annotated, the user may select only one of them for retrieval. The images retrieved in this way, however, may include many images with low similarity to the image to be annotated. To improve the similarity between the retrieved images and the image to be annotated, in a preferred embodiment of the invention all the entity semantic words of the image to be annotated may be used as query words.
In the present invention, the retrieval device may be Baidu or Google, or of course any other retrieval device.
A topic extraction unit 13, configured to extract the topics in the accompanying texts.
According to the analysis in the method embodiments above, when determining the network hot topic corresponding to the image to be annotated, the visual similarity of images must be used for denoising, picking out the image set most similar to the image to be annotated and its accompanying text set, thereby improving the correctness of the chosen hot topic. Based on this, the device further comprises:
A first association unit 14, configured to establish the correspondence between the semantically similar images and the topics based on the correspondence between the accompanying texts and the topics.
A clustering unit 15, configured to cluster the semantically similar images that have similar visual features and share similar topics into one class to form a set of image classes, and to cluster the similar topics corresponding to the semantically similar images with similar visual features into one class to form a set of topic classes.
A second association unit 16, configured to establish the correspondence between the set of image classes and the set of topic classes.
A first content retrieval unit 17, configured to search, according to the visual features of the image to be annotated, the set of image classes for the image class similar to the visual features of the image to be annotated.
The above process completes the denoising of the semantically similar images obtained from the network and their accompanying texts. The network hot topic is then extracted on this basis:
A hot topic extraction unit 18, configured to extract the topic class corresponding to the similar image class as the network hot topic of the image to be annotated.
A hot topic annotation unit 19, configured to semantically annotate the image to be annotated according to the network hot topic.
In the present invention, topic extraction and the clustering of images and topics can be realized by various devices; for example, topics can be extracted with an existing device based on the vector space topic model or a device based on latent semantic analysis (LSA).
An embodiment of the present invention provides one device for extracting network hot topics and clustering:
A device based on LDA (Latent Dirichlet Allocation) performs the topic extraction, i.e. the topic extraction unit is specifically configured to establish an LDA model with the accompanying texts and extract the topics based on the LDA model.
The first association unit is specifically configured to establish an image-topic correlation matrix Rvt based on the LDA model.
The clustering unit comprises:
a topic correlation matrix unit, configured to establish the topic correlation matrix Rt of the accompanying texts;
a visual similarity matrix unit, configured to calculate the visual similarity matrix Rv of the semantically similar images using the visual similarity between images;
a complex graph model unit, configured to establish a complex graph model G(Rv, Rt, Rvt) using Rt, Rvt and Rv;
a complex graph clustering unit, configured to cluster the complex graph G(Rv, Rt, Rvt) to form the set of image classes and the set of topic classes.
The hot topic extracted by the above process is the topic corresponding to the image class most similar to the image to be annotated. To further improve the accuracy of the hot topic, in a preferred embodiment of the invention the hot topic annotation unit is specifically configured to use the chi-square test (χ²) to extract the top K words most correlated with the network hot topic as annotation words and to annotate the image to be annotated with them, where K is greater than 0 and no greater than the number of words in the network hot topic. The specific formula of the chi-square test is given in the method embodiments.
It should be noted that the hot topic obtained by the above process may well contain descriptive words of the entity semantics of the image. It can be seen that, based on the above process, the accompanying texts of similar images on the Internet are used at the same time to expand the entity semantic annotation of the image to be annotated. Moreover, compared with the existing approach of a limited training set, the present invention uses the mass data of the Internet as the training set, so the annotated semantics are comprehensive and updated in real time.
The entity semantic words of the image to be annotated are obtained by entity semantic annotation; in a specific embodiment of the present invention, the device therefore further comprises an entity semantic annotation unit for performing entity semantic annotation on the image to be annotated in advance. The entity semantic annotation unit specifically comprises:
A visual feature extraction unit, configured to extract the visual features of the image to be annotated;
A second content retrieval unit, configured to search a limited training set for candidate images similar to the image to be annotated according to the visual features;
An entity semantic word extraction unit, configured to extract the entity semantic words of the candidate images;
An entity semantic annotation subunit, configured to perform entity semantic annotation on the image to be annotated using the entity semantic words.
It should be noted that, because of the semantic gap, the candidate images may be visually similar to the image to be annotated and yet differ greatly in semantics. For example, if the image of an apple is input, CBIR retrieval may output spherical objects similar to an apple, people's heads and the like as candidate images, and annotation words obtained from these candidate images, such as "ball" and "head", are obviously not the true semantics of the image to be annotated. Therefore, in a preferred embodiment of the invention, the candidate images need to be denoised so as to select the candidate images semantically closer to the image to be annotated.
As we know, each image in the candidate image set has been annotated with a varying number of annotation words; these annotation words can be regarded as the accompanying text of the corresponding image, and an accompanying text and its corresponding image are semantically related to some extent. Therefore, to examine the differences in semantic content between these visually similar candidate images, the semantic content of each image can be approximated by the semantic content of its accompanying text, so that the semantic difference between candidate images can be obtained indirectly from the semantic differences between their accompanying texts. Accompanying texts whose semantic content is similar share a larger number of similar annotation words.
Based on this, device of the present invention also comprises denoising unit, for carrying out denoising to described candidate image; Concrete, described denoising unit comprises:
a candidate image clustering unit, configured to gather candidate images with similar entity semantics into one class according to the entity semantic words, forming a set of candidate image classes;
a third content retrieval unit, configured to search the set of candidate image classes for the candidate image class most similar to the visual features of the image to be annotated, as the neighbor image class;
the entity semantic annotation subunit, specifically configured to annotate the entity semantics of the image to be annotated with the entity semantic words of the neighbor image class.
Candidate image clustering can be implemented in several ways; an embodiment of the invention provides one of them, in which the candidate image clustering unit comprises:
a hypergraph model unit, configured to build a hypergraph model G(Vs, Ts) and obtain its similarity matrix H, where the set Vs of candidate images similar to the image to be annotated forms the vertices and the set Ts of entity semantic words of the candidate images forms the hyperedges; element Hij of the matrix H represents the association between image Vi and its entity semantic word Tj, i.e., the co-occurrence of each entity semantic word with multiple candidate images;
a spectral clustering unit, configured to cluster the hypergraph model by spectral clustering according to the similarity matrix H, gathering candidate images that share certain hyperedges into one class to form the candidate image classes (a sketch of this step follows below).
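As an illustration of this hypergraph step (not the patented implementation), the sketch below builds the image-by-word incidence matrix H, derives an image-image affinity from shared hyperedges, and groups the candidates with off-the-shelf spectral clustering; the function cluster_candidates and its inputs are assumed names.

# Illustrative sketch: images are vertices, entity semantic words are hyperedges.
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_candidates(candidate_tags, n_clusters=5):
    """candidate_tags: list of sets of entity semantic words, one set per image."""
    words = sorted(set().union(*candidate_tags))
    H = np.array([[1.0 if w in tags else 0.0 for w in words]
                  for tags in candidate_tags])        # |Vs| x |Ts| incidence matrix
    affinity = H @ H.T                                # images sharing hyperedges
    np.fill_diagonal(affinity, 0.0)
    labels = SpectralClustering(n_clusters=n_clusters,
                                affinity="precomputed").fit_predict(affinity)
    return labels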
Each image class corresponds to multiple entity semantic words, and each entity semantic word has a different degree of relevance to the image to be annotated. To further improve the accuracy of the entity semantic annotation, in a preferred embodiment of the invention the device further comprises:
a relevance calculation unit, configured to calculate, by a formula, the relevance between the entity semantic words of the neighbor image class and the image to be annotated, where ii is a neighbor image in the neighbor image class S, iq is the image to be annotated, and p(ii|iq) equals the visual feature similarity between ii and iq;
the entity semantic annotation subunit, specifically configured to select the K entity semantic words with the highest relevance and annotate the entity semantics of the image to be annotated with them (see the sketch below).
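The relevance formula itself appears only as a figure in the original and is not reproduced in this text, so the sketch below merely assumes that a word's relevance accumulates p(ii|iq), the visual similarity of each neighbor image ii that carries the word, over the neighbor image class; all names are illustrative.

# Illustrative sketch under the stated assumption; not the patented formula.
def rank_entity_words(neighbor_tags, visual_sim, k):
    """neighbor_tags: {image_id: set of entity semantic words}
    visual_sim:   {image_id: p(ii|iq), similarity to the image to be annotated}
    Returns the K words with the highest accumulated relevance."""
    score = {}
    for img, tags in neighbor_tags.items():
        for w in tags:
            score[w] = score.get(w, 0.0) + visual_sim.get(img, 0.0)
    return sorted(score, key=score.get, reverse=True)[:k]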
Corresponding to the above annotation method and device, embodiment 6 of the invention further provides an image retrieval method based on network hot topics, in which the network hot topic of an image is annotated by the method of the above embodiments. Referring to Fig. 8, the retrieval method comprises:
S51: receiving image retrieval text input by a user, the image retrieval text comprising at least one network hot topic word.
During retrieval, the user may input only text related to the network hot topic, or may input both entity semantic words and text related to the network hot topic; the invention does not limit this.
S52: crawling images on the Internet and their annotation information.
Existing web crawling methods can be used in the invention, for example traversing network resources with a spider to obtain the images on the Internet and their annotation information.
S53: judging whether the annotation information of an image matches the image retrieval text.
Specifically, it may be judged whether the annotation information contains all of the input image retrieval text, or whether the annotation information contains more than a preset threshold number of the image retrieval text words.
For example, when the user inputs the words "Beijing", "apple" and "soaring price", the system may be configured so that an image is considered to match the retrieval text when its annotation information contains two of these terms.
In one particular embodiment of the invention, when the retrieval text contains both a network hot topic word and an entity semantic word, the system may be configured so that an image is considered to match the retrieval text as long as its annotation information contains the network hot topic word.
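A minimal sketch of the matching rules of S53, assuming the annotation information and the retrieval text are handled as plain word sets and that the threshold of two shared terms from the example above is used; the function name matches and its argument layout are illustrative.

# Illustrative sketch: either all query words, or at least `threshold` of them,
# must appear in the annotation; when the query contains a hot-topic word,
# that word alone decides the match.
def matches(annotation_words, query_words, hot_topic_words, threshold=2):
    ann = set(annotation_words)
    query = set(query_words)
    hot = query & set(hot_topic_words)
    if hot:                       # query contains a network hot topic word
        return bool(hot & ann)
    if query <= ann:              # all query words are annotated
        return True
    return len(query & ann) >= threshold

# e.g. matches(["Beijing", "apple"], ["Beijing", "apple", "soaring price"], []) -> True (2 shared terms)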
S54: if they match, outputting the matched image and its annotation information.
Most users only care about content related to the retrieval text they input; therefore, in a preferred embodiment of the invention, only the matched image and the part of its annotation information relevant to the retrieval text are output.
Corresponding to the above retrieval method, embodiment 7 of the invention further provides an image retrieval device based on network hot topics, in which the network hot topic is obtained with the above image annotation method. Referring to Fig. 9, the device comprises:
a text receiving unit 21, configured to receive image retrieval text input by a user, the image retrieval text comprising at least one network hot topic word.
During retrieval, the user may input only text related to the network hot topic, or may input both entity semantic words and text related to the network hot topic; the invention does not limit this.
a crawling unit 22, configured to crawl images on the Internet and their annotation information.
Existing web crawling methods can be used in the invention, for example traversing network resources with a spider to obtain the images on the Internet and their annotation information.
a judging unit 23, configured to judge whether the annotation information of an image matches the image retrieval text.
In one particular embodiment of the invention, when the retrieval text contains both a network hot topic word and an entity semantic word, the judging unit may be configured to judge whether the annotation information of the image contains the network hot topic word.
an output unit 24, configured to output the matched image and its annotation information when the annotation information of the image matches the image retrieval text.
Most users only care about content related to the retrieval text they input; therefore, in a preferred embodiment of the invention, the output unit is specifically configured to output, when the annotation information of an image matches the image retrieval text, the matched image and the part of its annotation information relevant to the image retrieval text.
It should be noted that the above device embodiments correspond to the method embodiments; the device parts are therefore not described in detail again, and reference may be made to the corresponding method embodiments.
The embodiments of the invention have been described in detail above, and specific embodiments have been used herein to elaborate the invention; the above description of the embodiments is only intended to help understand the method and device of the invention. Meanwhile, those of ordinary skill in the art may, according to the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, this description should not be construed as limiting the invention.

Claims (18)

1. A high-level image semantic annotation method based on network hot topics, characterized in that the method comprises:
using at least one entity semantic word of an image to be annotated as a query word and retrieving from the network, with a text-keyword-based search engine, images semantically similar to the image to be annotated and the accompanying text of the semantically similar images;
extracting topics from the accompanying text, and establishing the correspondence between the semantically similar images and the topics based on the correspondence between the accompanying text and the topics;
clustering the semantically similar images that have similar visual features and similar topics into one class to form a set of image classes, and clustering the similar topics corresponding to the visually similar semantically similar images into one class to form a set of topic classes;
establishing the correspondence between the set of image classes and the set of topic classes;
searching the set of image classes, according to the visual features of the image to be annotated, for an image class similar to the visual features of the image to be annotated, and extracting the topic class corresponding to the similar image class as the network hot topic of the image to be annotated;
semantically annotating the image to be annotated according to the network hot topic.
2. The method according to claim 1, characterized in that the method further comprises a step of performing entity semantic annotation on the image to be annotated in advance, which specifically comprises:
extracting the visual features of the image to be annotated;
searching a limited training set, according to the visual features, for candidate images similar to the image to be annotated;
extracting the entity semantic words of the candidate images, and annotating the entity semantics of the image to be annotated with the entity semantic words.
3. The method according to claim 2, characterized in that, after extracting the entity semantic words of the candidate images and before annotating the entity semantics of the image to be annotated with the entity semantic words, the method further comprises:
gathering candidate images with similar entity semantics into one class according to the entity semantic words, forming a set of candidate image classes;
searching the set of candidate image classes for the candidate image class most similar to the visual features of the image to be annotated, as a neighbor image class;
and annotating the entity semantics of the image to be annotated with the entity semantic words comprises:
annotating the entity semantics of the image to be annotated with the entity semantic words of the neighbor image class.
4. The method according to claim 3, characterized in that gathering candidate images with similar entity semantics into one class according to the entity semantic words to form the candidate image classes comprises:
building a hypergraph model G(Vs, Ts) and obtaining its similarity matrix H, where the set Vs of candidate images similar to the image to be annotated is the vertex set and the set Ts of entity semantic words of the candidate images is the hyperedge set, and element Hij of the matrix H represents the association between image Vi and its entity semantic word Tj, i.e., the co-occurrence of each entity semantic word with multiple candidate images;
clustering the hypergraph model by spectral clustering according to the similarity matrix H, gathering candidate images that share certain hyperedges into one class to form the candidate image classes.
5. The method according to claim 3 or 4, characterized in that the method further comprises:
calculating, by a formula, the relevance between the entity semantic words of the neighbor image class and the image to be annotated, where ii is a neighbor image in the neighbor image class S, iq is the image to be annotated, and p(ii|iq) equals the visual feature similarity between ii and iq;
and annotating the entity semantics of the image to be annotated with the entity semantic words of the neighbor image class comprises:
selecting a predetermined number of entity semantic words from the neighbor image class in descending order of the relevance, and annotating the entity semantics of the image to be annotated with them.
6. The method according to claim 1, characterized in that extracting the topics from the accompanying text and establishing the correspondence between the semantically similar images and the topics based on the correspondence between the accompanying text and the topics comprises:
building an LDA model from the accompanying text, extracting the topics with the LDA model, and building an image-topic correlation matrix Rvt based on the LDA model;
and clustering the semantically similar images that have similar visual features and similar topics into one class to form the set of image classes, and clustering the similar topics corresponding to the visually similar semantically similar images into one class to form the set of topic classes, comprises:
building the topic correlation matrix Rt of the accompanying text;
computing the visual similarity matrix Rv of the semantically similar images from their visual similarity;
building the complex graph model G(Rv, Rt, Rvt) from Rt, Rvt and Rv;
clustering the complex graph G(Rv, Rt, Rvt) to form the set of image classes and the set of topic classes.
7. The method according to claim 1, characterized in that semantically annotating the image to be annotated according to the network hot topic comprises:
extracting, with the chi-square (χ²) test, the top K words most strongly correlated with the network hot topic, and semantically annotating the image to be annotated with them.
8. A high-level image semantic annotation device based on network hot topics, characterized in that the device comprises:
a text retrieval unit, configured to use at least one entity semantic word of an image to be annotated as a query word and retrieve from the network, with a text-keyword-based search engine, images semantically similar to the image to be annotated and the accompanying text of the semantically similar images;
a topic extraction unit, configured to extract topics from the accompanying text;
a first association unit, configured to establish the correspondence between the semantically similar images and the topics based on the correspondence between the accompanying text and the topics;
a clustering unit, configured to cluster the semantically similar images that have similar visual features and similar topics into one class to form a set of image classes, and to cluster the similar topics corresponding to the visually similar semantically similar images into one class to form a set of topic classes;
a second association unit, configured to establish the correspondence between the set of image classes and the set of topic classes;
a first content retrieval unit, configured to search the set of image classes, according to the visual features of the image to be annotated, for an image class similar to the visual features of the image to be annotated;
a hot topic extraction unit, configured to extract the topic class corresponding to the similar image class as the network hot topic of the image to be annotated;
a hot topic annotation unit, configured to semantically annotate the image to be annotated according to the network hot topic.
9. The device according to claim 8, characterized in that the device further comprises an entity semantic annotation unit, configured to perform entity semantic annotation on the image to be annotated; the entity semantic annotation unit specifically comprises:
a visual feature extraction unit, configured to extract the visual features of the image to be annotated;
a second content retrieval unit, configured to search a limited training set, according to the visual features, for candidate images similar to the image to be annotated;
an entity semantic word extraction unit, configured to extract the entity semantic words of the candidate images;
an entity semantic annotation subunit, configured to annotate the entity semantics of the image to be annotated with the entity semantic words.
10. The device according to claim 9, characterized in that the device further comprises a denoising unit, configured to denoise the candidate images; specifically, the denoising unit comprises:
a candidate image clustering unit, configured to gather candidate images with similar entity semantics into one class according to the entity semantic words, forming a set of candidate image classes;
a third content retrieval unit, configured to search the set of candidate image classes for the candidate image class most similar to the visual features of the image to be annotated, as a neighbor image class;
and the entity semantic annotation subunit is specifically configured to annotate the entity semantics of the image to be annotated with the entity semantic words of the neighbor image class.
11. The device according to claim 10, characterized in that the candidate image clustering unit comprises:
a hypergraph model unit, configured to build a hypergraph model G(Vs, Ts) and obtain its similarity matrix H, where the set Vs of candidate images similar to the image to be annotated is the vertex set and the set Ts of entity semantic words of the candidate images is the hyperedge set, and element Hij of the matrix H represents the association between image Vi and its entity semantic word Tj, i.e., the co-occurrence of each entity semantic word with multiple candidate images;
a spectral clustering unit, configured to cluster the hypergraph model by spectral clustering according to the similarity matrix H, gathering candidate images that share certain hyperedges into one class to form the candidate image classes.
12. The device according to claim 10 or 11, characterized in that the device further comprises:
a relevance calculation unit, configured to calculate, by a formula, the relevance between the entity semantic words of the neighbor image class and the image to be annotated, where ii is a neighbor image in the neighbor image class S, iq is the image to be annotated, and p(ii|iq) equals the visual feature similarity between ii and iq;
and the entity semantic annotation subunit is specifically configured to select a predetermined number of entity semantic words from the neighbor image class in descending order of the relevance and annotate the entity semantics of the image to be annotated with them.
13. The device according to claim 8, characterized in that the topic extraction unit is specifically configured to build an LDA model from the accompanying text and extract the topics with the LDA model;
the first association unit is specifically configured to build the image-topic correlation matrix Rvt based on the LDA model;
and the clustering unit comprises:
a topic correlation matrix unit, configured to build the topic correlation matrix Rt of the accompanying text;
a visual similarity matrix unit, configured to compute the visual similarity matrix Rv of the semantically similar images from their visual similarity;
a complex graph model unit, configured to build the complex graph model G(Rv, Rt, Rvt) from Rt, Rvt and Rv;
a complex graph clustering unit, configured to cluster the complex graph G(Rv, Rt, Rvt) to form the set of image classes and the set of topic classes.
14. The device according to claim 8, characterized in that the hot topic annotation unit is specifically configured to extract, with the chi-square (χ²) test, the top K words most strongly correlated with the network hot topic and semantically annotate the image to be annotated with them.
15. An image retrieval method based on network hot topics, characterized in that the network hot topic and the annotation information of images are obtained with the method according to any one of claims 1-7, and the method comprises:
receiving image retrieval text input by a user, the image retrieval text comprising at least one network hot topic word;
crawling images on the Internet and their annotation information;
judging whether the annotation information of an image matches the image retrieval text;
if they match, outputting the matched image and its annotation information;
wherein, when the image retrieval text contains both a network hot topic word and an entity semantic word, judging whether the annotation information of the image matches the image retrieval text comprises:
judging whether the annotation information of the image contains the network hot topic word.
16. The method according to claim 15, characterized in that, if they match, outputting the matched image and its annotation information comprises:
if they match, outputting the matched image and the content of its annotation information relevant to the image retrieval text.
17. An image retrieval device based on network hot topics, characterized in that the network hot topic and the annotation information of images are obtained with the method according to any one of claims 1-7, and the device comprises:
a text receiving unit, configured to receive image retrieval text input by a user, the image retrieval text comprising at least one network hot topic word;
a crawling unit, configured to crawl images on the Internet and their annotation information;
a judging unit, configured to judge whether the annotation information of an image matches the image retrieval text;
an output unit, configured to output the matched image and its annotation information when the annotation information of the image matches the image retrieval text;
wherein the judging unit is specifically configured to, when the image retrieval text contains both a network hot topic word and an entity semantic word, judge whether the annotation information of the image contains the network hot topic word.
18. The device according to claim 17, characterized in that the output unit is specifically configured to output, when the annotation information of the image matches the image retrieval text, the matched image and the content of its annotation information relevant to the image retrieval text.
CN201210431912.5A 2012-11-01 2012-11-01 The image high-level semantics mark of much-talked-about topic Network Based, search method and device Expired - Fee Related CN102902821B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210431912.5A CN102902821B (en) 2012-11-01 2012-11-01 The image high-level semantics mark of much-talked-about topic Network Based, search method and device

Publications (2)

Publication Number Publication Date
CN102902821A CN102902821A (en) 2013-01-30
CN102902821B true CN102902821B (en) 2015-08-12

Family

ID=47575053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210431912.5A Expired - Fee Related CN102902821B (en) 2012-11-01 2012-11-01 The image high-level semantics mark of much-talked-about topic Network Based, search method and device

Country Status (1)

Country Link
CN (1) CN102902821B (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226600A (en) * 2013-04-25 2013-07-31 广东欧珀移动通信有限公司 Method and system for mobile terminal word retrieve
CN103631890B (en) * 2013-11-15 2017-05-17 北京奇虎科技有限公司 Method and device for mining image principal information
CN104657375B (en) * 2013-11-20 2018-01-26 中国科学院深圳先进技术研究院 A kind of picture and text subject description method, apparatus and system
WO2015078022A1 (en) * 2013-11-30 2015-06-04 Xiaoou Tang Visual semantic complex network and method for forming network
CN104750751B (en) * 2013-12-31 2018-02-23 华为技术有限公司 Track data mask method and device
CN103984738B (en) * 2014-05-22 2017-05-24 中国科学院自动化研究所 Role labelling method based on search matching
CN105446973B (en) * 2014-06-20 2019-02-26 华为技术有限公司 The foundation of user's recommended models and application method and device in social networks
CN104317867B (en) * 2014-10-17 2018-02-09 上海交通大学 The system that entity cluster is carried out to the Web page picture that search engine returns
CN104765796A (en) * 2015-03-25 2015-07-08 无锡天脉聚源传媒科技有限公司 Image recognizing searching method and device
CN106294344B (en) * 2015-05-13 2019-06-18 北京智谷睿拓技术服务有限公司 Video retrieval method and device
CN104951545B (en) * 2015-06-23 2018-07-10 百度在线网络技术(北京)有限公司 Export the data processing method and device of object
US11514244B2 (en) * 2015-11-11 2022-11-29 Adobe Inc. Structured knowledge modeling and extraction from images
CN105653701B (en) 2015-12-31 2019-01-15 百度在线网络技术(北京)有限公司 Model generating method and device, word assign power method and device
CN107562742B (en) * 2016-06-30 2021-02-05 江苏苏宁云计算有限公司 Image data processing method and device
CN108305296B (en) * 2017-08-30 2021-02-26 深圳市腾讯计算机系统有限公司 Image description generation method, model training method, device and storage medium
CN108228720B (en) * 2017-12-07 2019-11-08 北京字节跳动网络技术有限公司 Identify method, system, device, terminal and the storage medium of target text content and original image correlation
CN110427542A (en) * 2018-04-26 2019-11-08 北京市商汤科技开发有限公司 Sorter network training and data mask method and device, equipment, medium
US11244013B2 (en) * 2018-06-01 2022-02-08 International Business Machines Corporation Tracking the evolution of topic rankings from contextual data
CN109800818A (en) * 2019-01-25 2019-05-24 宝鸡文理学院 A kind of image meaning automatic marking and search method and system
WO2020191706A1 (en) * 2019-03-28 2020-10-01 香港纺织及成衣研发中心有限公司 Active learning automatic image annotation system and method
CN110298386B (en) * 2019-06-10 2023-07-28 成都积微物联集团股份有限公司 Label automatic definition method based on image content
CN110598739B (en) * 2019-08-07 2023-06-23 广州视源电子科技股份有限公司 Image-text conversion method, image-text conversion equipment, intelligent interaction method, intelligent interaction system, intelligent interaction equipment, intelligent interaction client, intelligent interaction server, intelligent interaction machine and intelligent interaction medium
CN110489593B (en) * 2019-08-20 2023-04-28 腾讯科技(深圳)有限公司 Topic processing method and device for video, electronic equipment and storage medium
CN111177444A (en) * 2020-01-02 2020-05-19 杭州创匠信息科技有限公司 Image marking method and electronic equipment
CN111460206B (en) * 2020-04-03 2023-06-23 百度在线网络技术(北京)有限公司 Image processing method, apparatus, electronic device, and computer-readable storage medium
CN111666752B (en) * 2020-04-20 2023-05-09 中山大学 Circuit teaching material entity relation extraction method based on keyword attention mechanism
CN112215837B (en) * 2020-10-26 2023-01-06 北京邮电大学 Multi-attribute image semantic analysis method and device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101271476A (en) * 2008-04-25 2008-09-24 清华大学 Relevant feedback retrieval method based on clustering in network image search
CN101419606A (en) * 2008-11-13 2009-04-29 浙江大学 Semi-automatic image labeling method based on semantic and content
CN101685464A (en) * 2009-06-18 2010-03-31 浙江大学 Method for automatically labeling images based on community potential subject excavation
CN102270234A (en) * 2011-08-01 2011-12-07 北京航空航天大学 Image search method and search engine

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
An Adaptive Automatic Semantic Annotation Method for Web Images; Xu Hongtao et al.; Journal of Software (软件学报); 2010-09-30; Vol. 21, No. 9; pp. 2183-2195 *
A Survey of Automatic Image Semantic Annotation Techniques; Sun Junding et al.; Computer Systems & Applications (计算机系统应用); 2012-12-31; Vol. 21, No. 7; pp. 257-261 *
Automatic Image Annotation Fusing Semantic Topics; Li Zhixin et al.; Journal of Software (软件学报); 2011-04-30; Vol. 22, No. 4; pp. 801-812 *

Similar Documents

Publication Publication Date Title
CN102902821B (en) The image high-level semantics mark of much-talked-about topic Network Based, search method and device
CN110162593B (en) Search result processing and similarity model training method and device
US10438091B2 (en) Method and apparatus for recognizing image content
GB2544379B (en) Structured knowledge modeling, extraction and localization from images
CN104899253B (en) Towards the society image across modality images-label degree of correlation learning method
CN103744981B (en) System for automatic classification analysis for website based on website content
Nagarajan et al. Fuzzy ontology based multi-modal semantic information retrieval
CN110362660A (en) A kind of Quality of electronic products automatic testing method of knowledge based map
CN104008203B (en) A kind of Users' Interests Mining method for incorporating body situation
CN110674407A (en) Hybrid recommendation method based on graph convolution neural network
CN109918560A (en) A kind of answering method and device based on search engine
CN103064903B (en) Picture retrieval method and device
CN103116588A (en) Method and system for personalized recommendation
CN106796600A (en) The computer implemented mark of relevant item
GB2544853A (en) Structured knowledge modeling and extraction from images
CN110414581B (en) Picture detection method and device, storage medium and electronic device
CN104462327A (en) Computing method, search processing method, computing device and search processing device for sentence similarity
CN116955730A (en) Training method of feature extraction model, content recommendation method and device
CN113569118B (en) Self-media pushing method, device, computer equipment and storage medium
Wajid et al. Neutrosophic-CNN-based image and text fusion for multimodal classification
Wu et al. Reducing noisy labels in weakly labeled data for visual sentiment analysis
KR101955920B1 (en) Search method and apparatus using property language
Rabello Lopes et al. Two approaches to the dataset interlinking recommendation problem
Huang et al. Tag refinement of micro-videos by learning from multiple data sources
CN105608183A (en) Method and apparatus for providing answer of aggregation type

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 2015-08-12; termination date: 2021-11-01)