Background technology
At present, the global Internet has grown to an unprecedented scale and reach. People have become accustomed to searching for all kinds of information online, and the search engine has become the center of the Internet. The major domestic Internet companies are sparing no effort to improve their own search engines, and the national "He Gao Ji" (core electronic devices, high-end generic chips and basic software) major science and technology project has listed "new-generation search engines and browsers" as a key development direction to be supported during the 12th Five-Year Plan period. However, information on the Internet is growing exponentially and comes in many types, and complex associations exist among information in different media formats. These cross-associations give Internet data a cross-media character, which places higher demands on the analysis and retrieval of Internet information. Introducing a knowledge graph into a cross-media retrieval system helps to obtain context data in multiple dimensions, better supports users in expressing retrieval intent through natural language, multimedia samples or combinations of different types of media data, and allows features under different contexts to be discovered through further reasoning, so that user query semantics can be analyzed and retrieval performed more accurately. The present invention therefore provides an implementation of a cross-media retrieval system from the perspective of the knowledge graph.
The knowledge graph was developed by Google after it acquired the open database company Metaweb in 2010. At that time Metaweb focused on linking different textual expressions to the same entity and on exploring the attributes of these entities (for example, the age of a celebrity) and the relations among them, ultimately providing a new form of search. Although it could not completely replace keyword search, Metaweb's indexing and search methods were more efficient at handling natural-language queries. Likewise, in cross-media retrieval a knowledge graph makes it possible to understand the user's query request better, to summarize the content semantics relevant to the query, and to find more accurate and more in-depth related information for the user. A knowledge graph also helps the user understand the relations between things. When a user expresses a query through natural language, multimedia samples or a combination of different types of media data, the query may carry several meanings; the knowledge graph can distinguish among them and narrow the search results to the meaning the user intends. Moreover, because the knowledge graph builds a complete knowledge system related to the search results, integrating many disciplines, and presents the knowledge system related to the user's query semantics systematically, the user may discover a new fact or a new connection during retrieval. This prompts a series of entirely new search queries and gives the search greater depth and breadth. Introducing a knowledge graph into cross-media retrieval therefore plays an important role in improving retrieval performance.
The present invention therefore takes the key technologies of knowledge-graph-oriented cross-media retrieval as its research object and proposes: a cross-media attribute perception model and a unified quantitative expression of multiple associations; a consistent expression of cross-media knowledge; a knowledge-graph-based method for analyzing user query semantics; and an implementation of a knowledge-graph-oriented cross-media retrieval system. Judging from current domestic developments in the field of information retrieval, the move toward knowledge graphs and cross-media retrieval has become an inevitable trend, so the present invention has great practical value and broad application prospects.
Summary of the invention
The object of the present invention is to provide a cross-media information retrieval tool that introduces a knowledge graph into cross-media retrieval and performs semantic analysis and reasoning based on the cross-media semantic associations and knowledge contained in the knowledge graph, thereby realizing cross-media retrieval. Specifically, the invention comprises the following points.
(1) For the complex cross-media data on the Internet, a cross-media attribute perception model is built and the associations it contains are analyzed, and a unified description mechanism for cross-media data associations is proposed. The natural attributes and social attributes of the cross-media data are obtained through techniques such as text parsing, entity extraction, metadata analysis, semantic annotation and user behavior analysis. Association modeling is then performed on the complex relations between natural and social attributes in the cross-media data. The modeling considers the multiple associations that exist among cross-media data, including content correlation (same modality), semantic association (different modalities), temporal association and structural association. Based on the links between the web pages on which the multimedia objects reside, a probabilistic graphical model is used to model and analyze cross-media content and links probabilistically, so that different types of association are expressed in a unified quantitative form.
(2) To meet the needs of cross-media semantic description and knowledge acquisition, a method is proposed for mapping data of different modalities into the same semantic label space, achieving a semantically consistent expression. When heterogeneous, complementary media modalities such as text and images jointly express one semantic, a mapping is learned that projects the heterogeneous modal information into a single semantic label space, so that similarity between heterogeneous data can be measured directly under one expression framework. An evaluation function is built from semantic similarity, semantic coverage and semantic discrimination to evaluate the candidate semantic labels. The semantic label information is then used to train a classifier for each modality, and the classification results are used as shared features, so that data of different modalities can also be mapped into the same semantic label space and a semantically consistent expression is achieved.
(3) A method is proposed that, when a user expresses a query request through natural language, multimedia samples or a combination of different types of media data, performs semantic analysis and reasoning on the request using the associations contained in the knowledge graph. For the query content entered by the user, the text and multimedia parts of the query are analyzed both separately and jointly, so that the user's query intent is analyzed at the semantic level. To this end, sufficient cross-media information is first collected from the Internet and a semantic model is built for each media type, so that cross-media data can be described by features in the same semantic space. The semantic distributions of the image data and the text data are then analyzed together to identify the semantics of the user query, and further associated semantics are mined using the knowledge graph. Based on the semantic, temporal and structural associations of the data contained in the knowledge graph, multi-dimensional context data related to the user's query content are obtained, and features under different contexts are discovered through reasoning, yielding more complete query semantics.
(4) A system architecture and an implementation method for cross-media retrieval incorporating a knowledge graph are proposed. In addition to components such as user query analysis, indexing, retrieval and ranking, the system also creates a knowledge-graph knowledge base of a certain scale and integrates it into the system. The user query analysis component supports query content entered as natural language, cross-media samples or data of different media types. During query semantic analysis, semantic analysis is performed separately on each type of media data entered by the user, and in addition joint semantic analysis and further reasoning are performed using the knowledge graph, so that the user's query intent is better understood from context knowledge in the knowledge graph such as time, place, entities and their social relations. The cross-media hash indexing and ranking components mainly invoke existing algorithms.
Embodiment
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
1. Cross-media attribute perception and association analysis
Knowledge is increasingly disseminated in a cross-media manner: the knowledge and information about a single entity often come from multiple channels, are expressed jointly in multiple media forms, and carry multiple natural attributes and social attributes. To exploit the association knowledge contained in cross-media data and use it in cross-media retrieval, the construction of the knowledge graph must consider not only the semantic relations between entities but also the perception of the entities' cross-media attributes; a cross-media attribute perception model is therefore built and association analysis is performed on it. To express different types of association in a unified quantitative form, to predict potential associations effectively, and to allow different associations to make use of one another, a unified description mechanism for cross-media data associations is established.
The entity-related information considered here comes from multiple channels (microblogs, WeChat, forums, news websites, professional websites and so on), is expressed jointly in multiple media forms (text, sound, image, video), and carries multiple natural attributes (time, place, persons, appearance information and so on) and social attributes (such as popularity, ratings and preferences). The high-level semantics of non-text media data are first extracted from the complementary information between that data and its accompanying text. The natural and social attributes of the cross-media data are then obtained through techniques such as text parsing, entity extraction, metadata analysis, semantic annotation and user behavior analysis. New data can then be classified by a group of support vector machine classifiers, so that targets of the same category are automatically extracted and identified from noisy web image sets. Alternatively, the attention that real-world users pay to cross-media data can be modeled by analyzing the forwarding behavior of network users: the data content and user forwarding behavior on microblogs, WeChat and social networks are analyzed, a forwarding tree model is built, and frequent subtrees are used to discover repetitive and tendentious patterns in user behavior, so that community interest can be tracked and predicted more accurately.
Next, association modeling is performed on the complex relations between natural and social attributes in the cross-media data. The modeling considers the multiple associations that exist among cross-media data, including content correlation (same modality), semantic association (different modalities), temporal association and structural association. Based on the links between the web pages on which the multimedia objects reside, a probabilistic graphical model is used to model and analyze cross-media content and links probabilistically, so that different types of association are expressed in a unified quantitative form and association prediction for cross-media data is further achieved, as shown in Figure 1.
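The forwarding-tree step can be illustrated with a minimal Python sketch. It is a deliberate simplification and not the method of the invention: the record layout `(cascade_id, parent_user, child_user)`, the function names and the toy user names are all assumptions, and only single-edge subtrees (one user forwarding from another) are counted, whereas a full frequent-subtree miner would also consider larger patterns.

```python
from collections import Counter, defaultdict

def build_forwarding_trees(records):
    """Group forwarding records into one tree per cascade.

    records: iterable of (cascade_id, parent_user, child_user), meaning that
    child_user forwarded the post from parent_user (hypothetical layout).
    Returns {cascade_id: {parent_user: [child_user, ...]}}.
    """
    trees = defaultdict(lambda: defaultdict(list))
    for cascade_id, parent, child in records:
        trees[cascade_id][parent].append(child)
    return trees

def frequent_edges(trees, min_support=2):
    """Return the forwarding edges that occur in at least min_support cascades."""
    support = Counter()
    for children_of in trees.values():
        edges = {(p, c) for p, kids in children_of.items() for c in kids}
        support.update(edges)  # each cascade counts a given edge at most once
    return {edge: n for edge, n in support.items() if n >= min_support}

# Toy data, invented for illustration only.
records = [
    ("post1", "alice", "bob"),
    ("post1", "bob", "carol"),
    ("post2", "alice", "bob"),
    ("post2", "alice", "dave"),
    ("post3", "alice", "bob"),
    ("post3", "bob", "carol"),
]
trees = build_forwarding_trees(records)
patterns = frequent_edges(trees, min_support=2)
print(patterns)
```

Edges that recur across cascades indicate repetitive forwarding habits, which is the kind of regularity the text uses to track and predict community interest.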
2. Consistent expression of cross-media knowledge
Existing knowledge representations and knowledge base sources are still largely confined to a single modality and cannot meet the needs of cross-media semantic description and knowledge acquisition. Therefore, once the constructed knowledge graph contains cross-media attribute knowledge, and in order to use that knowledge in cross-media retrieval, a method is proposed, on the basis of an analysis of how semantic knowledge is expressed in single-modality data, for mapping data of different modalities into the same semantic label space, thereby achieving a semantically consistent expression. To extend knowledge representation from a single modality to cross-media, a method is proposed for computing and measuring the content of the various media types in the knowledge graph: in theory, the structural information of the media data is mapped uniformly into a common space in which structural analysis, fusion and reasoning can be carried out.
Once sufficient cross-media attribute knowledge and associations have been obtained, a consistent cross-media knowledge representation mechanism must be established across different data granularities and knowledge levels before they can be used in cross-media retrieval. When heterogeneous, complementary media modalities such as text and images jointly express one semantic, learning a mapping that projects the heterogeneous modal information into a shared subspace allows similarity between heterogeneous data to be measured directly under a single expression framework. For cross-media data that are correlated in content and semantics, a probabilistic generative model is used to transform data of different media types into a unified latent variable space for description, and the distribution of the cross-media data over the latent variables serves as its semantic label. An evaluation function is built from semantic similarity, semantic coverage and semantic discrimination to evaluate the candidate semantic labels and to form semantic groups. Using the semantic label information of the semantic groups, data of the same modality are extracted from the different multimedia documents, a classifier is trained for each modality using the group's semantic labels, and the classification results are used as shared features. Data of different modalities can thus be mapped into the same semantic label space, achieving a semantically consistent expression.
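The idea of training one classifier per modality and using its output as a shared feature can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the classifier, so a nearest-centroid classifier with a softmax output stands in for it, and the feature dimensions (50 for text, 128 for images), the three labels and the synthetic data are all invented.

```python
import numpy as np

class CentroidClassifier:
    """Stand-in per-modality classifier: softmax over negative squared distances."""

    def fit(self, X, y, n_labels):
        self.centroids = np.stack([X[y == k].mean(axis=0) for k in range(n_labels)])
        return self

    def predict_proba(self, X):
        d2 = ((X[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        logits = -d2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        return p / p.sum(axis=1, keepdims=True)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic training data: both modalities are annotated with the same 3 semantic labels.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 20)
text_means = rng.normal(size=(3, 50)) * 3
image_means = rng.normal(size=(3, 128)) * 3
text_X = text_means[labels] + rng.normal(size=(60, 50))
image_X = image_means[labels] + rng.normal(size=(60, 128))

text_clf = CentroidClassifier().fit(text_X, labels, 3)
image_clf = CentroidClassifier().fit(image_X, labels, 3)

# A text query and two images; the classifier outputs live in the same label space.
text_query = text_means[1] + rng.normal(size=50)
image_same = image_means[1] + rng.normal(size=128)
image_other = image_means[2] + rng.normal(size=128)

q = text_clf.predict_proba(text_query[None])[0]
s_same = cosine(q, image_clf.predict_proba(image_same[None])[0])
s_other = cosine(q, image_clf.predict_proba(image_other[None])[0])
print(s_same, s_other)
```

Although the raw text and image features have different dimensions and cannot be compared directly, their label-space representations have the same length, so a single similarity measure applies to both.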
The key to selecting a semantic label is to compute its semantic relevance to the cross-media content, that is, the match between the semantic label and the semantic model. So that the semantic label and the semantic model can be compared directly, the semantic label is represented as a semantic distribution, and the KL distance is used to compute the semantic similarity between the semantic label and the semantic model. The semantic distribution {p(w|l)} of a semantic label l is estimated approximately from a cross-media data set D as {p(w|l, D)}. The KL distance is then used to compute the semantic similarity between the semantic label {p(w|l)} and the semantic model {p(w|θ)}.
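A minimal Python sketch of this scoring step follows. It assumes the similarity takes the form S(l, θ) = −d(θ ‖ l), which is the form equation (5) uses for θ with subscript −i; the additive smoothing, the co-occurrence estimate of p(w|l, D), the function names and the toy documents are assumptions made for illustration. In a real cross-media setting the "words" would include visual words as well as text words.

```python
import math
from collections import Counter

def smoothed_distribution(tokens, vocab, eps=1e-3):
    """Word distribution over vocab with additive smoothing (eps is an assumption)."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab) + eps * len(vocab)
    return {w: (counts[w] + eps) / total for w in vocab}

def label_distribution(label, dataset, vocab):
    """Approximate p(w | l) by p(w | l, D): words of the documents in D that contain l."""
    tokens = [w for doc in dataset if label in doc for w in doc]
    return smoothed_distribution(tokens, vocab)

def kl_distance(p, q):
    """d(p || q) = sum_w p(w) * log(p(w) / q(w)); both must be strictly positive."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def label_score(label_dist, model_dist):
    """Assumed form S(l, theta) = -d(theta || l); higher means a better match."""
    return -kl_distance(model_dist, label_dist)

# Toy data set D, invented for illustration.
dataset = [
    ["cat", "pet", "fur", "meow"],
    ["cat", "pet", "vet"],
    ["stock", "market", "price"],
    ["market", "price", "trade"],
]
vocab = sorted({w for doc in dataset for w in doc})
theta = smoothed_distribution(["pet", "pet", "fur", "meow", "vet"], vocab)

scores = {
    label: label_score(label_distribution(label, dataset, vocab), theta)
    for label in ("cat", "market")
}
best = max(scores, key=scores.get)
print(best, scores)
```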
To ensure that the semantic labels cover the semantic content of the cross-media data well, each newly selected semantic word should cover other semantic components rather than content already covered by the semantic words selected so far. The maximal marginal relevance (MMR) method is adopted: semantic words that are both highly relevant and mutually different are obtained by maximizing the maximal marginal relevance criterion, where S denotes the set of semantic words already selected.
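The selection step can be sketched with the commonly used greedy form of maximal marginal relevance, in which each step picks the candidate that maximizes λ·relevance minus (1 − λ)·(maximum similarity to the words already in S). The exact criterion used by the invention is not reproduced in this text, so the weighting λ, the function names and the toy relevance and similarity values below are assumptions.

```python
def mmr_select(candidates, relevance, similarity, k, lam=0.7):
    """Greedy MMR: pick k items that are relevant yet dissimilar to those already chosen."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr(c):
            # Redundancy is the highest similarity to any already selected word.
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance[c] - (1 - lam) * redundancy
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy values, invented for illustration.
relevance = {"cat": 0.90, "kitten": 0.85, "vet": 0.60}
pair_sim = {frozenset(("cat", "kitten")): 0.95}

def similarity(a, b):
    return pair_sim.get(frozenset((a, b)), 0.1)

chosen = mmr_select(["cat", "kitten", "vet"], relevance, similarity, k=2)
print(chosen)
```

In this toy run "kitten" is more relevant than "vet" but almost duplicates "cat", so the second pick goes to "vet", which covers a different semantic component.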
In addition, when several semantic contents are labeled, a semantic word should not be highly relevant to more than one of them, so the distinction between different semantic contents, that is, semantic discrimination, must also be considered. In this case a semantic similarity measure that takes discrimination into account is used:
$$S'(l, \theta_i) = S(l, \theta_i) - \alpha\, S(l, \theta_{-i}) \qquad (4)$$
$$S(l, \theta_{-i}) = -d(\theta_{-i} \,\|\, l) \qquad (5)$$
where $\theta_{-i}$ denotes the other $k-1$ semantic features apart from the semantic feature $\theta_i$, that is, $\theta_{1,\ldots,i-1,i+1,\ldots,k}$, and $k$ is the number of semantic features. The similarities across semantic features are computed with $S'(l, \theta_i)$ and ranked, so that semantic words that are relevant and have a certain coverage and discrimination can be generated for multiple semantic contents.
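Equations (4) and (5) can be tried out with a short Python sketch. One point is an assumption rather than something the text states: the distribution for θ with subscript −i is taken here to be the uniform mixture of the other k − 1 semantic features. The value α = 0.5, the function names and the toy distributions are likewise illustrative.

```python
import math

def kl_distance(p, q):
    """d(p || q) over a shared vocabulary; all probabilities must be positive."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def score(label, theta):
    """S(l, theta) = -d(theta || l), the form given by equation (5)."""
    return -kl_distance(theta, label)

def discriminative_score(label, thetas, i, alpha=0.5):
    """S'(l, theta_i) = S(l, theta_i) - alpha * S(l, theta_{-i}), equation (4)."""
    others = [t for j, t in enumerate(thetas) if j != i]
    # Assumption: theta_{-i} is modeled as the average of the other features.
    theta_rest = {w: sum(t[w] for t in others) / len(others) for w in label}
    return score(label, thetas[i]) - alpha * score(label, theta_rest)

# Two toy semantic features and two candidate labels over a 4-word vocabulary.
pets = {"cat": 0.4, "dog": 0.4, "price": 0.1, "trade": 0.1}
finance = {"cat": 0.1, "dog": 0.1, "price": 0.4, "trade": 0.4}
thetas = [pets, finance]

specific = {"cat": 0.45, "dog": 0.45, "price": 0.05, "trade": 0.05}
generic = {"cat": 0.25, "dog": 0.25, "price": 0.25, "trade": 0.25}

s_specific = discriminative_score(specific, thetas, i=0)
s_generic = discriminative_score(generic, thetas, i=0)
print(s_specific, s_generic)
```

The generic label matches both features equally well, so the penalty term cancels much of its score, while the specific label is rewarded for matching only the first feature.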
3. Knowledge-graph-based analysis of user query semantics
For the query content entered by the user, the text and multimedia parts of the query must be analyzed both separately and jointly, so that the user's query intent is analyzed at the semantic level. To this end, sufficient cross-media information is first collected from the Internet and a semantic model is built for each media type, as shown in Figure 2: a text semantic model described by text words and a visual semantic model described by visual words. These two models are then used to transform both the text data and the image data of the document under analysis into the same semantic space, where they are described as semantic probability distributions. The semantic mapping between data of different media types is then achieved through semantic learning. To relate data of different media types, the shared subspace that exists between correlated data of different media types is mined. For cross-media data that are semantically correlated, such as images, videos and other visual data related to text semantics, the text data are used to learn the visual semantics: text semantics are described in the form of visual words and a mapping between text semantics and visual semantics is established, so that cross-media data can be described by features in the same semantic space.
Once the semantic feature description of the cross-media data has been obtained, the semantic distributions of the image data and the text data are analyzed together to identify the semantics of the user query, and further associated semantics are mined using the knowledge graph. Based on the semantic, temporal and structural associations of the data contained in the knowledge graph, multi-dimensional context data related to the user's query content are obtained, such as time, place, entities and their social relations, and features under different contexts are discovered through reasoning, yielding more complete query semantics. Because the reasoning involves cross-media data, the cross-media data are first converted into text form and given a formal representation before reasoning, using techniques such as image annotation and the recognition of moving-object actions in video; reasoning is then carried out with text-based inference techniques. The conversion requires the cross-media data to be processed at the semantic level, which can be done on the basis of the cross-media semantic model already established.
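The context lookup and reasoning described above can be sketched on a toy triple store. Everything in the example is hypothetical: the entity names, the relation names, the mapping from relations to context dimensions and the single transitivity rule are stand-ins, since the text does not specify the knowledge graph schema or the inference technique.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, relation, object) triples, invented for illustration.
TRIPLES = [
    ("ConcertA", "held_at", "StadiumB"),
    ("ConcertA", "held_on", "2014-05-01"),
    ("ConcertA", "performed_by", "SingerX"),
    ("StadiumB", "located_in", "CityC"),
    ("CityC", "located_in", "CountryD"),
    ("SingerX", "friend_of", "ActorY"),
]

# Assumed mapping from relation to the context dimension it contributes to.
DIMENSION = {
    "held_on": "time",
    "held_at": "place",
    "located_in": "place",
    "performed_by": "entity",
    "friend_of": "social",
}

def infer_transitive(triples, relation):
    """Add (a, r, c) whenever (a, r, b) and (b, r, c) hold, until nothing changes."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for a, r1, b in list(facts):
            for c, r2, d in list(facts):
                if r1 == r2 == relation and b == c and (a, relation, d) not in facts:
                    facts.add((a, relation, d))
                    changed = True
    return facts

def context(entity, facts, hops=2):
    """Collect values reachable within `hops` edges, grouped by context dimension."""
    result = defaultdict(set)
    frontier = {entity}
    for _ in range(hops):
        next_frontier = set()
        for s, r, o in facts:
            if s in frontier:
                result[DIMENSION.get(r, "other")].add(o)
                next_frontier.add(o)
        frontier = next_frontier
    return dict(result)

facts = infer_transitive(TRIPLES, "located_in")
ctx = context("ConcertA", facts, hops=2)
print(ctx)
```

If the entity "ConcertA" has been identified in a query, the lookup returns its time, place, related entities and social relations, including the place "CountryD" that is only reachable within two hops because of the inferred fact.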
4. Knowledge-graph-oriented cross-media retrieval system
To realize a knowledge-graph-oriented cross-media retrieval system, a cross-media retrieval system architecture incorporating a knowledge graph is first proposed, as shown in Figure 3. In addition to components such as user query analysis, indexing, retrieval and ranking, the system adds components for cross-media attribute perception and association analysis and for consistent expression. Sufficient multimedia data are first collected from the Internet, and the natural and social attributes of the cross-media data are obtained with the cross-media attribute perception model. Association analysis and description are then performed on the entity-object associations they contain and on the semantic, temporal and structural associations among the various types of media data. On this basis a knowledge graph of a certain scale is built, and the cross-media knowledge it contains is represented with the proposed consistent expression framework so that it can be used.
The user query analysis component supports query content entered as natural language, cross-media samples or data of different media types. During query semantic analysis, semantic analysis is performed separately on each type of media data entered by the user, and in addition joint semantic analysis and further reasoning are performed using the knowledge graph, so that the user's query intent is better understood from context knowledge in the knowledge graph such as time, place, entities and their social relations. The cross-media hash indexing and ranking components mainly invoke existing algorithms.
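The text does not name the existing hashing and ranking algorithms that are invoked. As a stand-in, the Python sketch below uses random-hyperplane locality-sensitive hashing over vectors in the shared semantic label space and ranks items by Hamming distance; the choice of algorithm, the 32-bit code length, the centering step and the item identifiers are all assumptions.

```python
import numpy as np

def make_hasher(dim, n_bits=32, seed=0):
    """Random-hyperplane hashing: one bit per hyperplane, set by the side v falls on."""
    planes = np.random.default_rng(seed).normal(size=(n_bits, dim))

    def hash_vector(v):
        return (planes @ np.asarray(v, dtype=float) > 0).astype(np.uint8)

    return hash_vector

def hamming(a, b):
    return int(np.count_nonzero(a != b))

def rank(query_code, index):
    """Return item ids ordered by increasing Hamming distance to the query code."""
    return sorted(index, key=lambda item_id: hamming(index[item_id], query_code))

# Items of different media types, already mapped to a 3-label semantic space (toy values).
items = {
    "text_doc_1": [0.8, 0.1, 0.1],
    "image_7": [0.7, 0.2, 0.1],
    "video_3": [0.1, 0.1, 0.8],
}
center = 1.0 / 3.0  # center the label distributions so that hyperplanes can separate them
hasher = make_hasher(dim=3)
index = {item_id: hasher(np.array(v) - center) for item_id, v in items.items()}

query_code = hasher(np.array([0.7, 0.2, 0.1]) - center)
ranked = rank(query_code, index)
print(ranked)
```

Because text, image and video items are hashed from the same label space, a single index serves queries of any media type.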