Background art
At present, the global network has developed and spread on an unprecedented scale. People have become accustomed to searching for all kinds of information on the internet, and the search engine has become the center of the internet. The major domestic internet companies are sparing no effort to improve their own search engines, and the national "Hegaoji" (core electronic devices, high-end general-purpose chips and basic software) major science and technology project has listed "new-generation search engines and browsers" as an important development direction supported during the "12th Five-Year Plan" period. However, information on the internet grows exponentially and is of many types, and complex associations exist between information in different media forms. These cross-associations give internet data a cross-media character, which places higher demands on the analysis and retrieval of internet information. Introducing a knowledge graph into a cross-media retrieval system helps obtain multi-dimensional context data, better supports users in expressing retrieval intent with natural language, multimedia samples, or combinations of different types of media data, and allows features under different contexts to be discovered through further reasoning, so that user queries are analyzed semantically and retrieved more accurately. The present invention therefore provides an implementation of a cross-media retrieval system from the perspective of the knowledge graph.
The knowledge graph was developed after Google acquired the open database company Metaweb in 2010. At that time Metaweb was mainly devoted to linking different textual expressions to the same entity, exploring the attributes of these entities (such as a celebrity's age) and the connections among them, and ultimately providing a new form of search. Although it cannot completely replace keyword search, Metaweb's indexing and search methods are more efficient at handling natural-language queries. Likewise, in cross-media retrieval, a knowledge graph makes it possible to better understand the user's query request, summarize content semantically related to the query need, and find more accurate and deeper relevant information for the user. In addition, a knowledge graph can help the user perceive and understand the relationships between objects. When the user expresses a query request with natural language, multimedia samples, or a combination of different types of media data, such a request may carry several meanings; the knowledge graph can recognize the differences among them and narrow the search results to the meaning the user most wants. Furthermore, because the knowledge graph builds a complete knowledge system related to the search results, integrates many disciplines, and systematically presents to the user the knowledge system related to the query semantics, the user may learn a new fact or a new connection during retrieval, which prompts a series of entirely new search queries and gives the search more depth and breadth. Introducing the knowledge graph into cross-media retrieval can therefore play an important role in improving retrieval performance.
Therefore, taking the key technologies of knowledge-graph-oriented cross-media retrieval as its subject, the present invention proposes a perception model for cross-media attributes together with a unified quantitative representation of multiple associations, a consistent representation of cross-media knowledge, a knowledge-graph-based method for semantic analysis of user queries, and an implementation of a knowledge-graph-oriented cross-media retrieval system. Judging from current developments in the field of information retrieval, orientation toward knowledge graphs and cross-media data has become an inevitable trend, so the present invention has great practical value and broad application prospects.
Summary of the invention
The purpose of the present invention is to provide a cross-media information retrieval tool that introduces a knowledge graph into cross-media retrieval and performs semantic analysis and reasoning based on the cross-media semantic associations and knowledge covered by the knowledge graph, thereby realizing cross-media retrieval.
Specifically, the invention includes the following.
(1) For the complex cross-media data on the internet, a cross-media attribute perception model is established, the association relations it covers are analyzed, and a unified cross-media data association description mechanism is proposed. The natural attributes and social attributes of cross-media data are obtained by techniques such as text parsing, entity extraction, metadata analysis, semantic annotation and user behavior analysis. The complex relations between the natural attributes and social attributes in the cross-media data are then modeled as associations. The modeling takes into account the multiple associations that exist among cross-media data, such as content association (same modality), semantic association (different modalities), temporal association and structural association. According to the links between the web pages where the multimedia objects reside, cross-media content and links are modeled and analyzed probabilistically on the basis of a probabilistic graphical model, so that different types of association relations receive a unified quantitative representation.
(2) To meet the needs of cross-media semantic description and knowledge acquisition, a method is proposed for mapping data of different forms into the same semantic label space, realizing a semantically consistent representation. When heterogeneous, complementary media modalities such as text and images jointly express one meaning, the heterogeneous modal information is mapped into one semantic label space by learning a mapping relation, so that similarity can be measured directly on heterogeneous data under one representation framework. An evaluation function is established according to semantic similarity, semantic coverage and semantic discrimination, and the candidate semantic labels are evaluated. Using the semantic label information, a classifier is trained for each modality and the classification results are used as shared features, so that data of different forms can likewise be mapped into the same semantic label space, realizing a semantically consistent representation.
(3) A method is proposed for performing semantic analysis and reasoning on a query, using the associations covered by the knowledge graph, when a user expresses the query request with natural language, multimedia samples, or a combination of different types of media data. For the query content input by the user, the text part and the multimedia part of the query are analyzed both separately and jointly, so that the user's query intent is analyzed at the semantic level. To this end, sufficient cross-media information is first collected from the internet and a semantic model is established for each media type, realizing feature description of cross-media data in the same semantic space. The semantic distributions of the image data and the text data are then combined to analyze and identify the semantics of the user query, and further mining of associated semantics is performed with the knowledge graph. Based on the semantic, temporal and structural associations of data covered by the knowledge graph, multi-dimensional context data related to the query content are obtained, and features under different contexts are discovered by reasoning, yielding more complete query semantics.
(4) A cross-media retrieval system architecture that introduces a knowledge graph, and a method for implementing it, are proposed. In addition to components such as user query analysis, indexing, retrieval and ranking, the system builds a knowledge graph knowledge base of a certain scale and integrates it into the system. The user query analysis part supports query content input in forms such as natural language, cross-media samples, and data of different media types. During query semantic analysis, besides performing semantic analysis separately on the data of each media type input by the user, combined semantic analysis and further reasoning are performed with the knowledge graph, so that the user's query intent can be understood more fully according to context knowledge on the knowledge graph such as time, place, entities and their social relations. The cross-media hash indexing and ranking parts mainly invoke existing algorithms.
Specific embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
1. Cross-media attribute perception and association analysis
Knowledge is increasingly propagated in a cross-media manner: the knowledge and information related to one entity often come from multiple channels, are expressed jointly in multiple media forms, and contain a variety of natural attributes and social attributes. To exploit the association knowledge contained in cross-media data and use it in cross-media retrieval, the construction of the knowledge graph considers not only the semantic relations between entities but also the perception of the entities' cross-media attributes; a cross-media attribute perception model is established and association analysis is performed on it. To give different types of association relations a unified quantitative representation, to predict potential associations effectively, and to let the different association relations make use of one another, a unified cross-media data association description mechanism is established.
The entity-related information considered here comes from multiple channels (including microblogs, WeChat, forums, news websites, professional websites, etc.), is expressed jointly in multiple media forms (text, sound, image, video), and contains a variety of natural attributes (time, place, person, appearance information, etc.) and social attributes (such as popularity, evaluation and preference). For such information, the high-level semantics of the non-text media data are first extracted using the complementary information between the accompanying text and the other media. Techniques such as text parsing, entity extraction, metadata analysis, semantic annotation and user behavior analysis then obtain the natural attributes and social attributes of the cross-media data. For example, a set of support vector machine classifiers classifies new data, so that targets of similar categories are automatically extracted and identified from noisy web image collections. Alternatively, the attention of real-world users is modeled by analyzing network users' forwarding behavior toward cross-media data: by analyzing data content and user forwarding behavior on microblogs, WeChat and social networks, a forwarding tree model is built, and frequent subtrees are used to discover repetitive and tendentious patterns in user behavior, so that community interest can be tracked and predicted more accurately.
Next, the complex relations between the natural attributes and social attributes in the cross-media data are modeled as associations. The modeling takes into account the multiple associations that exist among cross-media data, such as content association (same modality), semantic association (different modalities), temporal association and structural association. According to the links between the web pages where the multimedia objects reside, cross-media content and links are modeled and analyzed probabilistically on the basis of a probabilistic graphical model, so that different types of association relations receive a unified quantitative representation and association prediction for cross-media data is further realized, as shown in Figure 1.
2. Consistent representation of cross-media knowledge
Because existing knowledge representation methods and knowledge base sources are still essentially confined to a single modality, they cannot meet the needs of cross-media semantic description and knowledge acquisition. Therefore, once the constructed knowledge graph covers cross-media attribute knowledge, and in order to use that knowledge in cross-media retrieval, a method is proposed, on the basis of analyzing the semantic knowledge representation rules of single-modality data, for mapping data of different forms into the same semantic label space and thereby realizing a semantically consistent representation. To extend from a single modality to the level of cross-media knowledge representation, a method is proposed for computing and measuring the content of the various media types in the knowledge graph; in theory, it maps the structural information of media data uniformly into a certain space for structural analysis, fusion, reasoning and the like.
After sufficient cross-media attribute knowledge and association relations have been obtained, and in order to use them in cross-media retrieval, a consistent representation mechanism for cross-media knowledge is established over different data granularities and different knowledge levels. When heterogeneous, complementary media modalities such as text and images jointly express one meaning, the heterogeneous modal information is mapped into a shared subspace by learning a mapping relation, so that similarity can be measured directly on heterogeneous data under one representation framework. For cross-media data whose content is semantically correlated, a generative probabilistic model converts the data of different media types into a unified latent variable space for description, and the distribution of the cross-media data over the latent variables serves as its semantic labels. An evaluation function is established according to semantic similarity, semantic coverage and semantic discrimination, the candidate semantic labels are evaluated, and semantic groups are established. Using the semantic label information of the semantic groups, the data of the same modality are extracted from the different multimedia documents, a classifier is trained for each modality with the semantic labels of the groups, and the classification results are used as shared features, so that data of different forms can likewise be mapped into the same semantic label space, realizing a semantically consistent representation.
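The per-modality classifier step can be illustrated with a minimal Python sketch. It is not the classifier of the invention: a nearest-centroid model with a softmax over distances stands in for whatever classifier is actually trained, and the feature vectors, the label names and the helper names (`train_centroids`, `label_distribution`, `cosine`) are hypothetical. The point shown is only that the classifier outputs of a text item and an image item live in the same label space and can be compared directly.

```python
import math

def train_centroids(samples):
    """samples: list of (feature_vector, label); returns {label: centroid}."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def label_distribution(vec, centroids, labels):
    """Softmax over negative squared distances to each label centroid."""
    scores = [-sum((a - b) ** 2 for a, b in zip(vec, centroids[lab]))
              for lab in labels]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(p, q):
    """Cosine similarity between two label-space vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))

LABELS = ["sports", "music"]  # hypothetical semantic labels

# Toy training data: text features are 3-dimensional, image features 2-dimensional.
text_train = [([1.0, 0.0, 0.0], "sports"), ([0.9, 0.1, 0.0], "sports"),
              ([0.0, 1.0, 0.2], "music"), ([0.1, 0.9, 0.0], "music")]
image_train = [([0.2, 0.8], "sports"), ([0.1, 0.9], "sports"),
               ([0.9, 0.1], "music"), ([0.8, 0.3], "music")]

text_model = train_centroids(text_train)    # one classifier per modality
image_model = train_centroids(image_train)

# Classifier outputs act as the shared feature in the semantic label space.
text_query = label_distribution([0.95, 0.05, 0.0], text_model, LABELS)
image_sports = label_distribution([0.15, 0.85], image_model, LABELS)
image_music = label_distribution([0.85, 0.2], image_model, LABELS)

sim_sports = cosine(text_query, image_sports)
sim_music = cosine(text_query, image_music)
print(sim_sports > sim_music)
```

Although the raw text and image features have different dimensions, both end up as distributions over the same two labels, which is what allows the cross-modal comparison.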
The key to selecting a semantic label is computing its semantic relevance to the cross-media content, that is, the match between the semantic label and the semantic model. To compare a semantic label with a semantic model directly, the semantic label is represented as a semantic distribution, and the semantic similarity between the label and the model is computed with the KL distance (Kullback-Leibler divergence). The semantic distribution {p(w | l)} of a semantic label l is approximated by the estimate {p(w | l, D)} obtained from a cross-media data set D. The semantic similarity S(l, θ) between the semantic label {p(w | l)} and the semantic model {p(w | θ)} is then computed from the KL distance between the two distributions.
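A minimal Python sketch of this similarity follows. It assumes the sign convention that appears later in this section for S(l, θ−i), namely S(l, θ) = −d(θ ‖ l), where d is the KL distance; the smoothing constant `eps` and the toy distributions are assumptions made for illustration only.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """d(p || q) = sum over w of p(w) * log(p(w) / q(w)).

    Words missing from q are floored at eps (an assumption) to avoid log(0).
    """
    total = 0.0
    for word, pw in p.items():
        if pw > 0.0:
            total += pw * math.log(pw / max(q.get(word, 0.0), eps))
    return total

def semantic_similarity(label_dist, model_dist):
    """S(l, theta) = -d(theta || l); values closer to 0 mean a better match."""
    return -kl_divergence(model_dist, label_dist)

# Hypothetical semantic model p(w | theta) and two candidate labels p(w | l).
theta = {"goal": 0.5, "match": 0.3, "team": 0.2}
label_football = {"goal": 0.45, "match": 0.35, "team": 0.2}
label_concert = {"goal": 0.05, "match": 0.05, "team": 0.1, "stage": 0.8}

s_football = semantic_similarity(label_football, theta)
s_concert = semantic_similarity(label_concert, theta)
print(s_football > s_concert)
```

Because the KL distance is non-negative, the score is at most zero, and a label whose distribution equals the model's scores exactly zero.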
To ensure that the semantic labels have high coverage of the semantic content of the cross-media data, a newly selected semantic word should cover other semantic components rather than content already covered by the existing semantic words. The maximal marginal relevance method is used for this purpose: by maximizing the maximal marginal relevance, semantic words with both maximum relevance and maximum diversity are obtained, where S denotes the set of semantic words already selected.
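The selection step can be sketched in Python using the commonly used greedy form of maximal marginal relevance, in which each candidate is scored as λ·relevance minus (1 − λ)·(maximum redundancy with the selected set S). This general form, the weight `lam`, the Jaccard overlap used as the redundancy measure, and all data below are assumptions for illustration and are not taken from the source equation.

```python
def mmr_select(candidates, relevance, redundancy, k, lam=0.7):
    """Greedy maximal marginal relevance selection of k items.

    relevance: {candidate: relevance score to the semantic content}
    redundancy: function(candidate, selected_item) -> overlap in [0, 1]
    """
    selected = []          # plays the role of S, the words already chosen
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr(c):
            penalty = max((redundancy(c, s) for s in selected), default=0.0)
            return lam * relevance[c] - (1.0 - lam) * penalty
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical semantic words, the content each one covers, and their relevance.
covered = {
    "football": {"goal", "match", "team"},
    "soccer": {"goal", "match", "team", "pitch"},
    "stadium": {"crowd", "seat"},
    "weather": {"rain"},
}
relevance = {"football": 0.9, "soccer": 0.85, "stadium": 0.6, "weather": 0.1}

def jaccard(a, b):
    """Overlap of covered content, used here as the redundancy measure."""
    return len(covered[a] & covered[b]) / len(covered[a] | covered[b])

picked = mmr_select(list(covered), relevance, jaccard, k=2)
picked_relevance_only = mmr_select(list(covered), relevance, jaccard, k=2, lam=1.0)
print(picked, picked_relevance_only)
```

With the redundancy penalty active, the second pick is a word that covers new content; with `lam=1.0` the penalty vanishes and the two most relevant but near-duplicate words are chosen instead.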
In addition, when multiple semantic contents are annotated, a semantic word should not be highly relevant to several semantic contents at once, so the distinction between different semantic contents, that is, semantic discrimination, must also be considered. In this case a semantic similarity calculation that accounts for discrimination is used:

$$S'(l, \theta_i) = S(l, \theta_i) - \alpha\, S(l, \theta_{-i}) \tag{4}$$

$$S(l, \theta_{-i}) = -d(\theta_{-i} \,\|\, l) \tag{5}$$

where $\theta_{-i}$ denotes the other $k-1$ semantic features excluding $\theta_i$, that is, $\theta_{1,\dots,i-1,i+1,\dots,k}$, and $k$ is the number of semantic features. The semantic similarity across semantic features is computed with $S'(l, \theta_i)$ and ranked, so that semantic words that are semantically relevant and have a certain coverage and discrimination can be generated for multiple semantic contents.
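Equations (4) and (5) can be illustrated with a short Python sketch. The source does not say how the k − 1 remaining features are combined into one distribution for θ−i, so the uniform mixture in `rest_mixture` is an assumption, as are the value of α, the smoothing constant and the toy distributions.

```python
import math

def kl(p, q, eps=1e-12):
    """KL distance d(p || q), with q floored at eps (an assumption)."""
    return sum(pw * math.log(pw / max(q.get(w, 0.0), eps))
               for w, pw in p.items() if pw > 0.0)

def similarity(label, theta):
    """S(l, theta) = -d(theta || l), matching the form of equation (5)."""
    return -kl(theta, label)

def rest_mixture(thetas, i):
    """Assumed form of theta_{-i}: uniform mixture of all features except i."""
    others = [t for j, t in enumerate(thetas) if j != i]
    vocab = set().union(*others)
    return {w: sum(t.get(w, 0.0) for t in others) / len(others) for w in vocab}

def discriminative_score(label, thetas, i, alpha=0.5):
    """Equation (4): S'(l, theta_i) = S(l, theta_i) - alpha * S(l, theta_{-i})."""
    return similarity(label, thetas[i]) - alpha * similarity(label, rest_mixture(thetas, i))

# Two hypothetical semantic features over a four-word vocabulary.
theta_sports = {"goal": 0.5, "match": 0.4, "song": 0.05, "band": 0.05}
theta_music = {"goal": 0.05, "match": 0.05, "song": 0.5, "band": 0.4}
thetas = [theta_sports, theta_music]

label_specific = {"goal": 0.45, "match": 0.45, "song": 0.05, "band": 0.05}
label_generic = {"goal": 0.25, "match": 0.25, "song": 0.25, "band": 0.25}

s_specific = discriminative_score(label_specific, thetas, 0)
s_generic = discriminative_score(label_generic, thetas, 0)
print(s_specific > s_generic)
```

The generic label matches both features equally, so the second term cancels part of its score, while the label specific to the first feature is rewarded for being far from the others.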
3. Knowledge-graph-based semantic analysis of user queries
For the query content input by the user, the text part and the multimedia part of the query must be analyzed both separately and jointly, so that the user's query intent is analyzed at the semantic level. To this end, sufficient cross-media information is first collected from the internet and a semantic model is established for each media type, as shown in Figure 2: a text semantic model described with text words and a visual semantic model described with visual words. The two models then convert the text data and the image data in the document to be analyzed into the same semantic space, where both are described as semantic probability distributions. Semantic mapping between data of different media types is subsequently realized through semantic learning. To establish associations between data of different media types, the shared subspace that exists between correlated heterogeneous media data is mined. For semantically correlated cross-media data, such as images or videos and text, visual semantics are learned from the semantically related visual data with the help of the text data, text semantics are described in the form of visual words, and a mapping between text semantics and visual semantics is established, thereby realizing feature description of cross-media data in the same semantic space.
After the semantic feature description of the cross-media data has been obtained, the semantic distributions of the image data and the text data are combined to analyze and identify the semantics of the user query, and further mining of associated semantics is performed with the knowledge graph. Based on the semantic, temporal and structural associations of data covered by the knowledge graph, multi-dimensional context data related to the query content, such as time, place, entities and their social relations, are obtained, and features under different contexts are discovered by reasoning, yielding more complete query semantics. Because the reasoning involves cross-media data, techniques such as image annotation and recognition of moving-object actions in video are first used to convert the cross-media data into text form and to represent it formally; the reasoning itself is then carried out with text-based reasoning techniques. The semantic-level processing of cross-media data required during the conversion can be implemented with the cross-media semantic models already established.
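The step of gathering multi-dimensional context data from the knowledge graph can be sketched as a bounded graph traversal. The Python example below is only an illustration under stated assumptions: the triples, the relation names, the inverse-edge convention and the `max_hops` parameter are all hypothetical, and a real knowledge graph would be queried through its own store rather than an in-memory list.

```python
from collections import defaultdict, deque

# Hypothetical knowledge-graph triples: (subject, relation, object).
TRIPLES = [
    ("Alice", "performed_at", "City Arena"),
    ("City Arena", "located_in", "Springfield"),
    ("Alice", "member_of", "The Band"),
    ("Bob", "member_of", "The Band"),
    ("Alice", "performed_on", "2015-06-01"),
]

def build_index(triples):
    """Adjacency index with inverse edges so traversal works in both directions."""
    index = defaultdict(list)
    for subj, rel, obj in triples:
        index[subj].append((rel, obj))
        index[obj].append((rel + "^-1", subj))
    return index

def context(index, entity, max_hops=2):
    """Breadth-first collection of (source, relation, target, hop) context facts."""
    seen = {entity}
    queue = deque([(entity, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for rel, neighbor in index.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append((node, rel, neighbor, depth + 1))
                queue.append((neighbor, depth + 1))
    return found

index = build_index(TRIPLES)
two_hop = context(index, "Alice")
one_hop = context(index, "Alice", max_hops=1)
for fact in two_hop:
    print(fact)
```

For an entity recognized in the query, the one-hop facts give the directly associated time, place and group, while the second hop reaches indirectly related context such as the city of the venue and other members of the same group.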
4. Knowledge-graph-oriented cross-media retrieval system
To realize the knowledge-graph-oriented cross-media retrieval system, a cross-media retrieval system architecture that introduces a knowledge graph is first proposed, as shown in Figure 3. In addition to components such as user query analysis, indexing, retrieval and ranking, the system adds cross-media attribute perception and association analysis and consistent representation. Sufficient multimedia data are first collected from the internet, and the natural attributes and social attributes of the cross-media data are obtained with the cross-media attribute perception model. The entity-object associations they contain, together with the semantic, temporal and structural associations of the data of the various media types, are then analyzed and described. A knowledge graph of a certain scale is built on this basis, and, in order to use the cross-media knowledge covered by the knowledge graph, that knowledge is represented with the proposed consistent representation framework.
The user query analysis part supports query content input in forms such as natural language, cross-media samples, and data of different media types. During query semantic analysis, besides performing semantic analysis separately on the data of each media type input by the user, combined semantic analysis and further reasoning are performed with the knowledge graph, so that the user's query intent can be understood more fully according to context knowledge on the knowledge graph such as time, place, entities and their social relations. The cross-media hash indexing and ranking parts mainly invoke existing algorithms.
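Since the text does not name the hashing or ranking algorithms it invokes, the following Python sketch only illustrates the general idea of a hash index over the shared semantic space. The thresholding hash, the `HashIndex` class, the item identifiers and the per-label score vectors are all hypothetical; a production system would substitute an established cross-media hashing method.

```python
def hash_code(vec, threshold=0.5):
    """Toy binarization: one bit per semantic label, set when the score >= threshold."""
    code = 0
    for value in vec:
        code = (code << 1) | (1 if value >= threshold else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")

class HashIndex:
    """Stores items of any media type by their hash code in the shared space."""

    def __init__(self):
        self.items = []  # (code, item_id, media_type)

    def add(self, item_id, media_type, vec):
        self.items.append((hash_code(vec), item_id, media_type))

    def search(self, query_vec, top_k=3):
        """Rank by Hamming distance to the query code, ties broken by item id."""
        q = hash_code(query_vec)
        ranked = sorted(self.items, key=lambda it: (hamming(q, it[0]), it[1]))
        return [(item_id, media, hamming(q, code))
                for code, item_id, media in ranked[:top_k]]

# Hypothetical per-label scores (sports, music, outdoor, night) for each item.
index = HashIndex()
index.add("img_01", "image", [0.9, 0.1, 0.8, 0.2])
index.add("txt_07", "text", [0.8, 0.2, 0.7, 0.1])
index.add("vid_03", "video", [0.1, 0.9, 0.2, 0.9])
index.add("aud_02", "audio", [0.2, 0.8, 0.6, 0.3])

results = index.search([0.85, 0.05, 0.9, 0.3])
for item_id, media, distance in results:
    print(item_id, media, distance)
```

Because every item is hashed from its representation in the same semantic label space, a single query can return images, text and audio in one ranked list.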