CN105550190A

CN105550190A - Knowledge graph-oriented cross-media retrieval system

Info

Publication number: CN105550190A
Application number: CN201510358374.5A
Authority: CN
Inventors: 杨月华; 张铃丽; 平源; 王亚
Original assignee: Xuchang University
Current assignee: Jiangsu Huchuan Technology Co ltd
Priority date: 2015-06-26
Filing date: 2015-06-26
Publication date: 2016-05-04
Anticipated expiration: 2035-06-26
Also published as: CN105550190B

Abstract

To meet the needs of cross-media semantic description and knowledge acquisition, and to make effective use of the cross-media attributes and the various associations covered by a knowledge graph, the present invention proposes: a cross-media attribute perception model that perceives and analyzes the associations among the natural and social attributes contained in cross-media data, together with a unified cross-media data association description mechanism that quantifies and expresses different types of associations uniformly; a method of mapping the different forms of data covered by the knowledge graph into the same semantic space, so that semantic knowledge is expressed consistently; a method that, for query requests expressed in natural language, as multimedia samples, or as combinations of different media types, uses the associations covered by the knowledge graph to analyze the query semantically, understand the user's retrieval intent, and retrieve more relevant results that better satisfy the query; and a cross-media retrieval system architecture and implementation method that incorporates a knowledge graph.

Description

Knowledge graph-oriented cross-media retrieval system

Technical Field

The invention belongs to the field of information retrieval technology and specifically relates to a knowledge graph-oriented cross-media retrieval system. Introducing a knowledge graph into cross-media retrieval helps to obtain contextual data along various dimensions, and even to discover features in different contexts through further reasoning, so that the semantics of the user's query can be better understood and retrieval results that better meet the user's needs can be returned.

Background Art

At present, the development and popularization of the global network have reached an unprecedented scale. People have become accustomed to looking up all kinds of information on the Internet, and search engines have become the center of the Internet. The major domestic Internet companies are sparing no effort to improve their search engines, and the national "He-Gao-Ji" (core electronic devices, high-end general-purpose chips, and basic software) major science and technology project has also listed "new-generation search engines and browsers" as an important development direction supported during the "Twelfth Five-Year Plan" period. However, information on the Internet is growing exponentially and is of diverse types, and there are intricate associations among information in various media forms. These cross-associations give Internet data a cross-media character, which places higher demands on Internet information analysis and retrieval. Introducing a knowledge graph into a cross-media retrieval system helps to obtain contextual data along various dimensions, better supports users in expressing retrieval intent in natural language, by multimedia samples, or by combinations of different media types, and allows features in different contexts to be discovered through further reasoning, enabling more accurate semantic analysis of user queries and more accurate retrieval. The present invention therefore provides an implementation scheme for a cross-media retrieval system from the perspective of the knowledge graph.

The knowledge graph grew out of Google's acquisition of the open database company Metaweb in 2010. At the time, Metaweb focused mainly on linking different textual expressions to the same entity and on exploring the attributes of these entities (for example, the age of a celebrity) and the connections among them, ultimately providing a new form of search. Although it cannot completely replace keyword search, Metaweb's indexing and search methods are more efficient at handling natural-language queries. Likewise, in cross-media retrieval, a knowledge graph makes it possible to better understand the user's query request and to summarize content that is semantically related to the query, finding more accurate and more in-depth relevant information for the user. In addition, a knowledge graph helps users understand the relationships between things. When a user expresses a query request in natural language, by multimedia samples, or by a combination of different media types, the request may carry multiple meanings; the knowledge graph can distinguish among them and narrow the search results to the meaning the user most likely intends. Furthermore, because the knowledge graph builds a complete body of knowledge related to the search results, integrates many disciplines, and systematically presents to the user the knowledge related to the query semantics, the user may learn a new fact or a new connection during retrieval, which prompts a whole new series of search queries and gives the search greater depth and breadth. Introducing a knowledge graph into cross-media retrieval therefore plays an important role in improving retrieval performance.

The present invention therefore takes the key technologies of knowledge graph-oriented cross-media retrieval as its subject and proposes a perception model for cross-media attributes with a unified quantitative expression of multiple associations, a consistent expression of cross-media knowledge, a knowledge graph-based method for semantic analysis of user queries, and an implementation scheme for a knowledge graph-oriented cross-media retrieval system. In the field of information retrieval, judging from current developments at home and abroad, orientation toward knowledge graphs and cross-media has become an inevitable trend, so the present invention has great practical value and broad application prospects.

Summary of the Invention

The purpose of the present invention is to provide a cross-media information retrieval tool that introduces a knowledge graph into cross-media retrieval and performs semantic analysis and reasoning based on the cross-media semantic associations and knowledge covered by the knowledge graph, thereby realizing cross-media retrieval. Specifically, the invention comprises the following points.

(1) For the intricate cross-media data on the Internet, a cross-media attribute perception model is established, the associations it covers are analyzed, and a unified cross-media data association description mechanism is proposed. The natural and social attributes of cross-media data are obtained through techniques such as text parsing, entity extraction, metadata analysis, semantic annotation, and user behavior analysis. The complex relationships between natural and social attributes in the cross-media data are then modeled as associations. The modeling takes into account the content associations (same modality), semantic associations (different modalities), temporal associations, structural associations, and other associations that exist among cross-media data. Using the links between the web pages on which multimedia objects appear, cross-media content and links are modeled and analyzed probabilistically with a probabilistic graphical model, so that different types of associations are quantified and expressed uniformly.

(2) To meet the needs of cross-media semantic description and knowledge acquisition, a method of mapping data of different forms into the same semantic label space is proposed, so as to achieve a semantically consistent expression. When heterogeneous, complementary media forms such as text and images jointly express one meaning, a mapping is learned that projects this heterogeneous modal information into a single semantic label space, so that similarity between heterogeneous data can be measured directly within one expression framework. An evaluation function is built from semantic similarity, semantic coverage, and semantic discrimination to evaluate which semantic labels should be selected. The semantic label information is used to train a classifier for each form of data, and the classification results serve as shared features, so that data of different forms can also be mapped into the same semantic label space, achieving a semantically consistent expression.

(3) A method is proposed for semantic analysis and reasoning, using the associations covered by the knowledge graph, when a user expresses a query request in natural language, by multimedia samples, or by a combination of different media types. The text and multimedia parts of the user's query are analyzed both separately and jointly, so that the user's query intent is resolved at the semantic level. To this end, sufficient cross-media information is first collected from the Internet and a semantic model is built for each media type, so that cross-media data can be described by features in the same semantic space. The semantic distributions of the image data and the text data are then combined to analyze and identify the semantics of the user's query, and further association-based semantic mining is carried out with the knowledge graph. Based on the semantic, temporal, and structural associations covered by the knowledge graph, contextual data along various dimensions related to the query content is obtained, and features in different contexts are discovered through reasoning, yielding more complete query semantics.

(4) A cross-media retrieval system architecture and implementation method that incorporates a knowledge graph is proposed. Besides the basic components of user query analysis, indexing, retrieval, and ranking, the system creates a knowledge graph knowledge base of a certain scale and integrates it into the system. The user query analysis component supports query content entered as natural language, cross-media samples, data of different media types, and so on. During query semantic analysis, the data of each media type entered by the user is analyzed separately, and joint semantic analysis and further reasoning are also carried out with the knowledge graph, so that the user's query intent is better understood from contextual knowledge in the knowledge graph such as time, place, entities, and their social relationships. The cross-media hash indexing and ranking components mainly call existing algorithms.

Brief Description of the Drawings

Figure 1 shows cross-media attribute perception and association analysis;

Figure 2 shows knowledge graph-based semantic analysis of user queries;

Figure 3 shows the architecture of the knowledge graph-oriented cross-media retrieval system.

Detailed Description

To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings.

1. Cross-media attribute perception and association analysis

Knowledge is increasingly disseminated in a cross-media manner. Knowledge and information about the same entity often come from multiple channels, are expressed jointly in multiple media forms, and carry a variety of natural and social attributes. To exploit the association knowledge contained in cross-media data and use it in cross-media retrieval, the construction of the knowledge graph must consider not only the semantic relationships between entities but also the perception of the entities' cross-media attributes: a cross-media attribute perception model is established and association analysis is performed on it. To quantify and express different types of associations uniformly, to predict potential associations effectively, and to let different associations reinforce one another, a unified cross-media data association description mechanism is established.

For entity-related information that comes from multiple channels (including Weibo, WeChat, forums, news websites, specialist websites, etc.), is expressed jointly in multiple media forms (text, sound, image, video), and carries multiple natural attributes (time, place, people, appearance information, etc.) and social attributes (such as popularity, evaluation, and preference), the high-level semantics of data of other media types are first extracted using the complementary information between that data and its accompanying text. The natural and social attributes of the cross-media data are then obtained through techniques such as text parsing, entity extraction, metadata analysis, semantic annotation, and user behavior analysis. New data can then be classified by a set of support vector machine classifiers, so that targets of the same category are automatically extracted and identified from noisy collections of web images. Alternatively, the attention of real-world users can be modeled by analyzing how network users forward cross-media data: by analyzing the data content and user forwarding behavior on Weibo, WeChat, social networks, and the like, a forwarding tree model is built, and frequent subtrees are used to discover the repetitive patterns and tendencies in user behavior, so that group attention can be tracked and predicted more accurately. Next, the complex relationships between natural and social attributes in the cross-media data are modeled as associations. The modeling takes into account the content associations (same modality), semantic associations (different modalities), temporal associations, structural associations, and other associations that exist among cross-media data. Using the links between the web pages on which multimedia objects appear, cross-media content and links are modeled and analyzed probabilistically with a probabilistic graphical model, so that different types of associations are quantified and expressed uniformly and associations among cross-media data can further be predicted, as shown in Figure 1.
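The unified quantification of typed associations can be pictured with a small sketch. The patent does not name the probabilistic graphical model it uses, so the noisy-OR combination rule, the `Association` record, and the object identifiers below are assumptions made purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four association types named in the description.
ASSOCIATION_TYPES = ("content", "semantic", "temporal", "structural")


@dataclass(frozen=True)
class Association:
    """One typed link between two media objects (hypothetical record)."""
    source: str
    target: str
    kind: str        # one of ASSOCIATION_TYPES
    strength: float  # probability-like score in [0, 1]


def unified_strength(associations):
    """Combine all typed links of each object pair into one score.

    Assumption: links are treated as independent evidence and merged
    with noisy-OR, i.e. 1 - prod(1 - strength).
    """
    miss = defaultdict(lambda: 1.0)  # probability that no link "fires"
    for a in associations:
        if a.kind not in ASSOCIATION_TYPES:
            raise ValueError(f"unknown association type: {a.kind}")
        if not 0.0 <= a.strength <= 1.0:
            raise ValueError("strength must lie in [0, 1]")
        pair = tuple(sorted((a.source, a.target)))  # undirected pair
        miss[pair] *= 1.0 - a.strength
    return {pair: 1.0 - m for pair, m in miss.items()}


links = [
    Association("img1", "txt1", "semantic", 0.5),
    Association("txt1", "img1", "temporal", 0.5),
]
scores = unified_strength(links)
```

Putting every association type on the same [0, 1] scale is what lets a single score be compared or thresholded across content, semantic, temporal, and structural links.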

2. Consistent expression of cross-media knowledge

Because existing knowledge representation methods and knowledge base resources are still largely confined to a single modality, they can no longer meet the needs of cross-media semantic description and knowledge acquisition. Therefore, once the constructed knowledge graph covers cross-media attribute knowledge, and in order to use that knowledge in cross-media retrieval, a method of mapping data of different forms into the same semantic label space is proposed, on the basis of analyzing how semantic knowledge is expressed in single-modality data, so as to achieve a semantically consistent expression. To extend from a single modality to the level of cross-media knowledge representation, a method for computing and measuring the content of the various media types in the knowledge graph is proposed, which in theory maps the structural information of multiple kinds of media data uniformly into a given space for structural analysis, fusion, reasoning, and so on.

Once sufficient cross-media attribute knowledge and associations have been obtained, a consistent representation mechanism for cross-media knowledge is established at different data granularities and different knowledge levels so that the knowledge can be used in cross-media retrieval. When heterogeneous, complementary media forms such as text and images jointly express one meaning, a mapping is learned that projects this heterogeneous modal information into a shared subspace, so that similarity between heterogeneous data can be measured directly within one expression framework. For cross-media data that is related in content and semantics, a probabilistic generative model converts data of different media types into a unified latent variable space, and the distribution of the cross-media data over the latent variables serves as its semantic label. An evaluation function is built from semantic similarity, semantic coverage, and semantic discrimination to evaluate which semantic labels should be selected, and semantic groups are established. Using the semantic label information of the semantic groups, data of the same modality is extracted from the different multimedia documents, a classifier is trained for each form of data with the groups' semantic labels, and the classification results serve as shared features, so that data of different forms can also be mapped into the same semantic label space, achieving a semantically consistent expression.
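As a rough illustration of using per-modality classification results as shared features, the sketch below trains one classifier per modality and compares a text item with image items through their label distributions. The nearest-centroid classifier with a softmax over negative squared distances, the toy feature vectors, and the labels `cat` and `dog` are all assumptions; the patent does not specify a classifier or feature type.

```python
import math


def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: centroid}."""
    sums, counts = {}, {}
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def label_distribution(centroids, x, labels):
    """Map a feature vector to a distribution over the shared labels
    (softmax over negative squared distances to each label centroid)."""
    scores = [-sum((a - b) ** 2 for a, b in zip(x, centroids[l])) for l in labels]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(p, q):
    """Cosine similarity between two label distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))


LABELS = ["cat", "dog"]  # shared semantic label space (toy example)

# Text features are 3-dimensional and image features 2-dimensional, so the
# raw features of the two modalities cannot be compared directly.
text_model = train_centroids([([1.0, 0.0, 0.0], "cat"), ([0.9, 0.1, 0.0], "cat"),
                              ([0.0, 1.0, 1.0], "dog"), ([0.0, 0.9, 1.0], "dog")])
image_model = train_centroids([([5.0, 1.0], "cat"), ([4.0, 1.0], "cat"),
                               ([1.0, 5.0], "dog"), ([1.0, 4.0], "dog")])

text_item = label_distribution(text_model, [1.0, 0.05, 0.0], LABELS)
image_cat = label_distribution(image_model, [4.5, 1.0], LABELS)
image_dog = label_distribution(image_model, [1.0, 4.5], LABELS)
```

After the mapping, every item is a vector over the same labels, so `cosine(text_item, image_cat)` and `cosine(text_item, image_dog)` are directly comparable even though the raw text and image features have different dimensions.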

The key to selecting a semantic label is computing its semantic relevance to the cross-media content, that is, the match between the semantic label and the semantic model. So that semantic labels can be compared with semantic models directly, each semantic label is represented as a semantic distribution, and the KL distance is used to compute the semantic similarity between label and model. The semantic distribution {p(w|l)} of a semantic label l is approximated by {p(w|l, D)}, estimated on the cross-media dataset D. The semantic similarity between the semantic label {p(w|l)} and the semantic model {p(w|θ)} can then be computed with the KL distance:

$S(l, \theta) = -d(\theta \,\|\, l) = -\sum_{w} p(w \mid \theta) \log \frac{p(w \mid \theta)}{p(w \mid l)} \qquad (1)$

To ensure that the semantic labels have high coverage of the semantic content of the cross-media data, each newly selected semantic word should cover other parts of the semantics rather than content already covered by the words selected so far. The maximal marginal relevance method is used: the semantic word that maximizes the marginal relevance, i.e. one that is both highly relevant and different from the words already selected, is chosen:

$\hat{l} = \underset{l \in L - S}{\arg\max} \left[ \lambda S(l, \theta) - (1 - \lambda) \max_{l' \in S} \mathrm{Sim}(l', l) \right] \qquad (2)$

$\mathrm{Sim}(l', l) = -d(l' \,\|\, l) = -\sum_{w} p(w \mid l') \log \frac{p(w \mid l')}{p(w \mid l)} \qquad (3)$

where S is the set of semantic words already selected.
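A minimal sketch of the KL-based similarity and the greedy maximal-marginal-relevance selection follows. It assumes that distributions are stored as word-to-probability dictionaries, that zero probabilities are floored at a small `eps` to keep the logarithm finite, and that the redundancy term is taken as 0 while nothing has been selected yet; the function names and the candidate labels `l1` to `l3` are hypothetical.

```python
import math


def kl(p, q, eps=1e-12):
    """d(p || q) = sum_w p(w) * log(p(w) / q(w)); q is floored at eps."""
    return sum(pw * math.log(pw / max(q.get(w, 0.0), eps))
               for w, pw in p.items() if pw > 0.0)


def similarity(a, b):
    """Negative KL distance, used for both S(l, theta) and Sim(l', l)."""
    return -kl(a, b)


def select_labels(candidates, theta, k, lam=0.7):
    """Greedily pick k labels by maximal marginal relevance.

    candidates: {label: distribution}; theta: semantic model distribution.
    """
    selected = []
    remaining = set(candidates)
    while remaining and len(selected) < k:
        def mmr(label):
            relevance = similarity(theta, candidates[label])  # -d(theta || l)
            redundancy = max(
                (similarity(candidates[s], candidates[label]) for s in selected),
                default=0.0,  # assumption: no penalty before the first pick
            )
            return lam * relevance - (1.0 - lam) * redundancy
        best = max(sorted(remaining), key=mmr)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return selected


theta = {"a": 0.5, "b": 0.5}
candidates = {
    "l1": {"a": 0.5, "b": 0.5},  # identical to the model
    "l2": {"a": 0.6, "b": 0.4},  # close to l1
    "l3": {"a": 0.1, "b": 0.9},  # far from l1
}
relevance_heavy = select_labels(candidates, theta, k=2, lam=0.7)
diversity_heavy = select_labels(candidates, theta, k=2, lam=0.2)
```

With a large λ the second pick stays close to the model (`l2`); with a small λ the diversity term dominates and the more distant `l3` is chosen instead.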

In addition, when several pieces of semantic content are labeled at once, a semantic word should not be highly relevant to more than one of them, so the distinction between different semantic contents, i.e. semantic discrimination, must also be considered. In this case a semantic similarity measure that takes discrimination into account is used:

$S'(l, \theta_i) = S(l, \theta_i) - \alpha S(l, \theta_{-i}) \qquad (4)$

$S(l, \theta_{-i}) = -d(\theta_{-i} \,\|\, l) \qquad (5)$

where $\theta_{-i}$ denotes the k-1 semantic features other than the semantic feature $\theta_i$, i.e. $\theta_{\{1, \ldots, i-1, i+1, \ldots, k\}}$, and k is the number of semantic features. The semantic similarity across semantic features is computed with $S'(l, \theta_i)$ and ranked, so that semantic words that are semantically relevant and have adequate coverage and discrimination can be generated for multiple pieces of semantic content.
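The discrimination-aware score of equations (4) and (5) can be sketched as follows. The patent does not say how the k-1 remaining semantic features are combined into a single distribution θ₋ᵢ; the uniform mixture used here, the `eps` floor, and the toy distributions are assumptions.

```python
import math


def kl(p, q, eps=1e-12):
    """d(p || q) = sum_w p(w) * log(p(w) / q(w)); q is floored at eps."""
    return sum(pw * math.log(pw / max(q.get(w, 0.0), eps))
               for w, pw in p.items() if pw > 0.0)


def mixture(models):
    """Uniform mixture of several word distributions (assumed form of theta_{-i})."""
    vocab = set().union(*models)
    return {w: sum(m.get(w, 0.0) for m in models) / len(models) for w in vocab}


def discriminative_score(label, thetas, i, alpha=0.5):
    """S'(l, theta_i) = S(l, theta_i) - alpha * S(l, theta_{-i})."""
    s_i = -kl(thetas[i], label)                      # equation (1) for theta_i
    others = thetas[:i] + thetas[i + 1:]
    s_rest = -kl(mixture(others), label) if others else 0.0  # equation (5)
    return s_i - alpha * s_rest


thetas = [
    {"cat": 0.9, "dog": 0.1},  # semantic feature 0
    {"cat": 0.1, "dog": 0.9},  # semantic feature 1
]
specific = {"cat": 0.9, "dog": 0.1}  # label matching feature 0 only
generic = {"cat": 0.5, "dog": 0.5}   # label equally close to both features

score_specific = discriminative_score(specific, thetas, i=0)
score_generic = discriminative_score(generic, thetas, i=0)
```

The generic label is penalized because it is just as close to the other feature as to feature 0, while the specific label is rewarded for being far from the rest.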

3. Knowledge graph-based semantic analysis of user queries

The text and multimedia parts of the user's query must be analyzed both separately and jointly, so that the user's query intent is resolved at the semantic level. To this end, sufficient cross-media information is first collected from the Internet and a semantic model is built for each media type, as shown in Figure 2: a text semantic model described by text words and a visual semantic model described by visual words. These two models are then used to convert both the text data and the image data in the document under analysis into the same semantic space, where they are described as semantic probability distributions. Semantic mapping between data of different media types is then achieved through semantic learning. To establish associations between data of different media types and to mine the shared subspace that exists among related heterogeneous media data, visual semantic learning is performed with text data for cross-media data that is semantically correlated, such as images, videos, and other visual data that is semantically related to text. Text semantics are described in the form of visual words and a mapping between text semantics and visual semantics is established, so that cross-media data can be described by features in the same semantic space.

Once the semantic feature descriptions of the cross-media data have been obtained, the semantic distributions of the image data and the text data are combined to analyze and identify the semantics of the user's query, and further association-based semantic mining is carried out with the knowledge graph. Based on the semantic, temporal, and structural associations covered by the knowledge graph, contextual data along various dimensions related to the query content, such as time, place, entities, and their social relationships, is obtained, and features in different contexts are discovered through reasoning, yielding more complete query semantics. Because the reasoning involves cross-media data, the cross-media data is first converted into textual form and given a formal representation, using techniques such as image annotation and action recognition of moving objects in video, and reasoning is then carried out with text-based reasoning techniques. The conversion requires processing the cross-media data at the semantic level, which can be done with the cross-media semantic model already established.
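The context-gathering step can be sketched with a toy triple store. The sketch assumes that entity recognition on the text and image parts of the query has already been done upstream (for example by image annotation), and it stands in for reasoning with a bounded multi-hop traversal; the class `TinyKnowledgeGraph`, the relation names, and the example triples are hypothetical.

```python
from collections import defaultdict


class TinyKnowledgeGraph:
    """Toy triple store: (subject, relation, object)."""

    def __init__(self, triples):
        self.out = defaultdict(list)
        for subject, relation, obj in triples:
            self.out[subject].append((relation, obj))

    def context(self, entity, hops=2):
        """Collect (relation path, value) pairs reachable within `hops` edges."""
        found = []
        frontier = [((), entity)]
        seen = {entity}
        for _ in range(hops):
            next_frontier = []
            for path, node in frontier:
                for relation, obj in self.out.get(node, []):
                    if obj in seen:
                        continue  # avoid cycles and duplicates
                    seen.add(obj)
                    item = (path + (relation,), obj)
                    found.append(item)
                    next_frontier.append(item)
            frontier = next_frontier
        return found


def enrich_query(kg, text_entities, image_entities, hops=2):
    """Merge entities found in the text and image parts of a query and
    attach the contextual facts the graph holds for each of them."""
    entities = list(dict.fromkeys(text_entities + image_entities))  # dedupe, keep order
    return {entity: kg.context(entity, hops) for entity in entities}


kg = TinyKnowledgeGraph([
    ("Eiffel Tower", "located_in", "Paris"),
    ("Paris", "capital_of", "France"),
    ("Eiffel Tower", "opened", "1889"),
])
enriched = enrich_query(kg, ["Eiffel Tower"], ["Eiffel Tower"])
```

The two-hop path `("located_in", "capital_of")` is the simplest form of inferred context: the query never mentions France, but the graph links the recognized entity to it.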

4. Knowledge graph-oriented cross-media retrieval system

To realize a knowledge graph-oriented cross-media retrieval system, a cross-media retrieval system architecture that incorporates a knowledge graph is first proposed, as shown in Figure 3. Besides the basic components of user query analysis, indexing, retrieval, and ranking, the system adds components for cross-media attribute perception and association analysis and for consistent expression. Sufficient multimedia data is first collected from the Internet, and the natural and social attributes of the cross-media data are obtained with the cross-media attribute perception model. The associations it contains, including associations between entity objects and the semantic, temporal, and structural associations among data of the various media types, are then analyzed and described. A knowledge graph of a certain scale is built on this basis, and the cross-media knowledge it covers is represented with the proposed consistent expression framework so that it can be used.

The user query analysis component supports query content entered as natural language, cross-media samples, data of different media types, and so on. During query semantic analysis, the data of each media type entered by the user is analyzed separately, and joint semantic analysis and further reasoning are also carried out with the knowledge graph, so that the user's query intent is better understood from contextual knowledge in the knowledge graph such as time, place, entities, and their social relationships. The cross-media hash indexing and ranking components mainly call existing algorithms.
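The patent leaves the hashing algorithm open ("existing algorithms"). As one possible stand-in, the sketch below indexes items by the signs of random projections of their semantic label vectors, so that text and image items mapped to similar label distributions tend to share a bucket. The class `SignHashIndex`, its parameters, and the item identifier are assumptions, not the patent's method.

```python
import random


class SignHashIndex:
    """Bucket items by the signs of random projections of their vectors."""

    def __init__(self, dim, bits=8, seed=0):
        rng = random.Random(seed)  # fixed seed keeps the hash codes reproducible
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(bits)]
        self.buckets = {}

    def code(self, vec):
        """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
        return tuple(
            int(sum(p * v for p, v in zip(plane, vec)) >= 0.0)
            for plane in self.planes
        )

    def add(self, item_id, vec):
        self.buckets.setdefault(self.code(vec), []).append(item_id)

    def query(self, vec):
        """Return the ids stored in the bucket that `vec` hashes to."""
        return list(self.buckets.get(self.code(vec), []))


index = SignHashIndex(dim=2)
index.add("img1", [0.9, 0.1])  # semantic label vector of an image (toy values)
same_direction = index.query([1.8, 0.2])  # same direction, different scale
```

Because only the sign of each projection is kept, the code depends on the direction of the vector rather than its length; candidates returned from a bucket would still need to be re-ranked with an exact similarity measure.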

Claims

1. A knowledge graph-oriented cross-media retrieval system, characterized in that the system covers the following aspects:

cross-media attribute perception and association analysis;

consistent expression of cross-media knowledge;

knowledge graph-based semantic analysis of user queries;

a knowledge graph-oriented cross-media retrieval system architecture and its implementation.

2. The system according to claim 1, characterized in that a cross-media attribute perception model is established, the associations it covers are analyzed, and a unified cross-media data association description mechanism is proposed. The natural and social attributes of cross-media data are obtained through techniques such as text parsing, entity extraction, metadata analysis, semantic annotation, and user behavior analysis; the complex relationships between natural and social attributes in the cross-media data are then modeled as associations; the modeling takes into account the content associations (same modality), semantic associations (different modalities), temporal associations, structural associations, and other associations that exist among cross-media data; and, using the links between the web pages on which multimedia objects appear, cross-media content and links are modeled and analyzed probabilistically with a probabilistic graphical model, so that different types of associations are quantified and expressed uniformly.

3. The system according to claim 1, characterized in that, to meet the needs of cross-media semantic description and knowledge acquisition, a method of mapping data of different forms into the same semantic label space is proposed, so as to achieve a semantically consistent expression. When heterogeneous, complementary media forms such as text and images jointly express one meaning, a mapping is learned that projects this heterogeneous modal information into a single semantic label space, so that similarity between heterogeneous data is measured directly within one expression framework; an evaluation function is built from semantic similarity, semantic coverage, and semantic discrimination to evaluate which semantic labels should be selected; and the semantic label information is used to train a classifier for each form of data, with the classification results serving as shared features, so that data of different forms can also be mapped into the same semantic label space, achieving a semantically consistent expression.

4. The system according to claim 1, characterized in that a method is proposed for semantic analysis and reasoning, using the associations covered by the knowledge graph, when a user expresses a query request in natural language, by multimedia samples, or by a combination of different media types. The text and multimedia parts of the user's query are analyzed both separately and jointly, so that the user's query intent is resolved at the semantic level. To this end, sufficient cross-media information is first collected from the Internet and a semantic model is built for each media type, so that cross-media data is described by features in the same semantic space. The semantic distributions of the image data and the text data are then combined to analyze and identify the semantics of the user's query, and further association-based semantic mining is carried out with the knowledge graph. Based on the semantic, temporal, and structural associations covered by the knowledge graph, contextual data along various dimensions related to the query content is obtained, and features in different contexts are discovered through reasoning, yielding more complete query semantics.

5. The system according to claim 1, characterized in that, besides the basic components of user query analysis, indexing, retrieval, and ranking, the system also creates a knowledge graph knowledge base of a certain scale and integrates it into the system. The user query analysis component supports query content entered as natural language, cross-media samples, data of different media types, and so on. During query semantic analysis, the data of each media type entered by the user is analyzed separately, and joint semantic analysis and further reasoning are also carried out with the knowledge graph, so that the user's query intent is better understood from contextual knowledge in the knowledge graph such as time, place, entities, and their social relationships.