CN112860916A - Movie-television-oriented multi-level knowledge map generation method - Google Patents
- Publication number
- CN112860916A (application number CN202110254580.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- attribute
- embedding
- knowledge
- level knowledge
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/288—Entity relationship models
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a film-and-television-oriented multi-level knowledge graph generation method, comprising: acquiring information data at different levels related to the film or television work to be processed; performing relation extraction on the acquired information data to obtain triple data, and constructing a plurality of single-level knowledge graphs from the triple data; performing structure embedding and attribute embedding using the relation triples and the attribute triples in the triple data; and performing entity alignment by combining the results of structure embedding and attribute embedding, then integrating the aligned single-level knowledge graphs to obtain a multi-level knowledge graph.
Description
Technical Field
The disclosure relates to the technical field of data mining and intelligent information processing, and in particular to a film-and-television-oriented multi-level knowledge graph generation method.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
The film and television field has many data sources, large data volumes, varied data forms, and complex data structures. Different film and television knowledge graphs store different knowledge, and this knowledge is partly redundant and partly complementary, so researchers have proposed integrating the individual knowledge graphs into a multi-level knowledge graph. A basic problem in forming a multi-level knowledge graph is aligning entities that appear in different film and television knowledge graphs but refer to the same thing.
Alignment methods fall into two main groups: conventional alignment methods and embedding-based alignment methods. The former mainly align entities by attribute similarity matching using a supervised machine learning model. The latter are mainly based on representation learning: the entities and relations of a film knowledge graph are mapped into a low-dimensional vector space for computation and reasoning, and the similarity between entities is then calculated. Most of these methods focus on how to better encode the relation triples and ignore the attribute triples. For entities that lack relations in particular, aligning with relation triples alone yields a knowledge graph of poor comprehensiveness and accuracy.
Disclosure of Invention
To overcome the defects of the prior art, the present disclosure provides a film-and-television-oriented multi-level knowledge graph generation method, which performs alignment from the perspectives of both structure and attributes, using the relation triples and attribute triples present in the single-level film knowledge graphs, and finally forms a multi-level knowledge graph.
To achieve this purpose, the disclosure adopts the following technical solutions:
A first aspect of the disclosure provides a film-and-television-oriented multi-level knowledge graph generation method.
A film-and-television-oriented multi-level knowledge graph generation method comprises the following steps:
acquiring information data at different levels related to the film or television work to be processed;
performing relation extraction on the acquired information data to obtain triple data, and constructing a plurality of single-level knowledge graphs from the triple data;
performing structure embedding and attribute embedding using the relation triples and the attribute triples in the triple data;
and performing entity alignment by combining the results of structure embedding and attribute embedding, and integrating the aligned single-level knowledge graphs to obtain a multi-level knowledge graph.
A second aspect of the present disclosure provides a film-and-television-oriented multi-level knowledge graph generation system.
A film-and-television-oriented multi-level knowledge graph generation system comprises:
a data acquisition module configured to: acquire information data at different levels related to the film or television work to be processed;
a relation extraction module configured to: perform relation extraction on the acquired information data to obtain triple data, and construct a plurality of single-level knowledge graphs from the triple data;
an embedding module configured to: perform structure embedding and attribute embedding using the relation triples and the attribute triples in the triple data;
a knowledge graph integration module configured to: perform entity alignment by combining the results of structure embedding and attribute embedding, and integrate the aligned single-level knowledge graphs to obtain a multi-level knowledge graph.
The third aspect of the disclosure provides a movie and television query method based on a multi-level knowledge graph.
A movie and television query method based on a multi-level knowledge graph comprises the following steps:
acquiring a text to be queried;
analyzing the text to be queried to obtain an analysis result;
and querying the film and television information data according to the analysis result and the multi-level knowledge graph constructed by the generation method of the first aspect of the disclosure, to obtain a query result.
Further, the movie query result is displayed.
A fourth aspect of the present disclosure provides a computer-readable storage medium on which a program is stored, the program, when executed by a processor, implementing the steps of the film-and-television-oriented multi-level knowledge graph generation method according to the first aspect of the present disclosure.
A fifth aspect of the present disclosure provides an electronic device comprising a memory, a processor, and a program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the film-and-television-oriented multi-level knowledge graph generation method according to the first aspect of the present disclosure.
Compared with the prior art, the beneficial effect of this disclosure is:
1. The generation method, system, medium, or electronic device of the disclosure performs alignment from the perspectives of structure and attributes, using the relation triples and attribute triples present in the single-level film knowledge graphs, and finally forms a multi-level knowledge graph. This overcomes the structural and relational defects of existing multi-level knowledge graphs and ensures the comprehensiveness and accuracy of the generated multi-level knowledge graph.
2. The film and television query method based on the multi-level knowledge graph uses the constructed comprehensive and accurate multi-level knowledge graph to query related film and television information data quickly, and improves query accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and are not to limit the disclosure.
Fig. 1 is a schematic flow chart of the film-and-television-oriented multi-level knowledge graph generation method provided in embodiment 1 of the present disclosure.
Fig. 2 is a schematic diagram of the multi-level knowledge graph structure provided in embodiment 1 of the present disclosure.
Fig. 3 is a schematic diagram of the attribute value and attribute type embedding process based on a Pseudo-Siamese Neural Network provided in embodiment 1 of the present disclosure.
Detailed Description
The present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
Example 1:
As shown in Fig. 1, embodiment 1 of the present disclosure provides a film-and-television-oriented multi-level knowledge graph generation method, including the following steps:
acquiring information data at different levels related to the film or television work to be processed;
performing relation extraction on the acquired information data to obtain triple data, and constructing a plurality of single-level knowledge graphs from the triple data;
performing structure embedding and attribute embedding using the relation triples and the attribute triples in the triple data;
and performing entity alignment by combining the results of structure embedding and attribute embedding, and integrating the aligned single-level knowledge graphs to obtain a multi-level knowledge graph.
Specifically, the method comprises the following steps:
S1: Constructing the single-level knowledge graphs
S1.1: First, film knowledge is analyzed at different levels. For example, film types include suspense, horror, romance, martial arts (wuxia), art films, and the like; film schools include surrealism, minimalism, and the like; by technology, films are divided into black-and-white films, ordinary color films, 3D films, and the like; and film actors come from different countries and regions of the world. At this stage, data is acquired separately for each level.
S1.2: Relation extraction is performed according to the characteristics of the acquired data, which is of only three types: text, data, and tables. For text, triples are extracted using dependency syntax analysis and HanLP; for data and tables, highly correlated relations between entities are found with the Pearson correlation coefficient method, and triples are extracted from them. A triple has the form <h, r, t>, where h is the head entity, t is the tail entity, and r is the relation between the two entities.
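The Pearson step for data and tables can be illustrated with a minimal sketch. The column names, the 0.8 threshold, and the relation label `correlated_with` are hypothetical choices made for illustration; the text above does not state how a high correlation is turned into a relation label.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equally long numeric columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(var_x * var_y)


def correlated_triples(table: Dict[str, List[float]],
                       threshold: float = 0.8) -> List[Triple]:
    """Emit one triple per column pair whose |r| reaches the (assumed) threshold."""
    triples = []
    for a, b in combinations(sorted(table), 2):
        if abs(pearson(table[a], table[b])) >= threshold:
            triples.append((a, "correlated_with", b))  # hypothetical relation label
    return triples


# Hypothetical table: one list of values per column.
table = {
    "budget": [1.0, 2.0, 3.0, 4.0],
    "box_office": [2.1, 3.9, 6.2, 8.0],
    "runtime": [120.0, 95.0, 130.0, 100.0],
}
print(correlated_triples(table))
```

Only the strongly correlated pair (`box_office`, `budget`) passes the threshold here; the weakly correlated `runtime` column produces no triple.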
S1.3: A single-level knowledge graph is constructed for each level from the extracted triples.
S2: Multi-source data fusion
Knowledge is continuously updated. To judge whether newly added knowledge is true and credible, the collected information must be fused using a multi-source data fusion model; data can be added to the knowledge graph only after its truthfulness and credibility have been judged.
The multi-source data fusion model works as follows:
(1) Data blocking: in the relation extraction stage, knowledge is extracted around entity keywords. Data from different sources is therefore partitioned and aggregated by the entity keywords of each level, and the resulting blocks serve as candidate matching knowledge. During fusion, data then only needs to be fused with the data of its own level, which avoids traversing the whole knowledge base and reduces computational complexity.
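As a minimal sketch of the blocking idea, candidate triples can be grouped by the entity keyword they mention, so that later matching only compares items inside one block. The function name `block_by_keyword` and the sample data are hypothetical, and exact string equality is assumed as the membership test.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def block_by_keyword(triples: List[Triple],
                     keywords: List[str]) -> Dict[str, List[Triple]]:
    """Group candidate triples into blocks keyed by the entity keyword they mention."""
    blocks: Dict[str, List[Triple]] = defaultdict(list)
    for head, relation, tail in triples:
        for keyword in keywords:
            # Assumption: a triple belongs to a block if its head or tail equals the keyword.
            if keyword in (head, tail):
                blocks[keyword].append((head, relation, tail))
    return dict(blocks)


candidates = [
    ("Zhang Ziyi", "age", "41"),
    ("Zhang Ziyi", "Husband", "Wang Feng"),
    ("Wang Feng", "occupation", "singer"),
]
blocks = block_by_keyword(candidates, ["Zhang Ziyi", "Wang Feng"])
print(blocks["Wang Feng"])
```

A triple that mentions two keywords lands in both blocks, so each block stays self-contained for the matching step.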
(2) Multi-source data fusion coefficient K: the candidate matching knowledge in a block is matched against the knowledge of the original knowledge base using the multi-source data fusion coefficient K. If K is greater than a set threshold, the candidate matching knowledge is considered correct and is added to the knowledge base.
The multi-source data fusion coefficient K is defined as follows:
K is composed of two parts: confidence, which is the confidence score, and the average of the entity similarity and the relation similarity. The confidence is itself composed of two parts, M and cf. M is the confidence of the data source; relatively authoritative websites or knowledge bases, such as Baidu Encyclopedia and Hopkins, have a relatively high M value. cf is a confidence calculated for each pair of entities from the distance between the two entities and the distance between the entity and the relation.
Entity_sim is the text similarity between the entities and Relationship_sim is the relation similarity; the average of the two is taken as the similarity of the knowledge, and if this similarity is greater than the set threshold of 0.5, the knowledge is considered more credible. In the formula, L denotes an entity position and R denotes the relation position: $L_i - L_j$ is the distance between entity 1 and entity 2, and $L_i - R$ is the distance between entity 1 and the relation. The larger the distance, the lower the probability that a semantic relation exists between the entity and the relation, and the lower the confidence.
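The description of K can be made concrete with a hedged sketch. The formula itself is not reproduced in this text, so two things below are assumptions rather than the patent's definition: that the parts are combined by multiplication, and that cf decays as `1 / (1 + distances)`. Only the averaging of the two similarities and the "larger distance, lower confidence" behavior come from the description. All function names are hypothetical.

```python
def position_confidence(pos_entity1: int, pos_entity2: int, pos_relation: int) -> float:
    """cf: ASSUMED decay form; shrinks as |L_i - L_j| and |L_i - R| grow."""
    entity_distance = abs(pos_entity1 - pos_entity2)
    relation_distance = abs(pos_entity1 - pos_relation)
    return 1.0 / (1.0 + entity_distance + relation_distance)


def fusion_coefficient(source_confidence: float, cf: float,
                       entity_sim: float, relationship_sim: float) -> float:
    """K from source confidence M, positional confidence cf, and the two similarities."""
    similarity = (entity_sim + relationship_sim) / 2.0  # average, as described
    confidence = source_confidence * cf                 # ASSUMPTION: M and cf multiply
    return confidence * similarity                      # ASSUMPTION: parts multiply


def accept(k: float, threshold: float) -> bool:
    """Candidate knowledge is added only if K exceeds the set threshold."""
    return k > threshold


cf_near = position_confidence(0, 2, 1)
cf_far = position_confidence(0, 9, 5)
k = fusion_coefficient(source_confidence=0.9, cf=cf_near,
                       entity_sim=1.0, relationship_sim=0.6)
print(cf_near, cf_far, k)
```

Whatever the exact combination rule is, the acceptance test stays the same: compare K with a threshold and add the candidate only when it passes.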
S3: constructing a multi-level knowledge graph
After the multi-source data fusion is completed, the knowledge graph is also a dispersed single-level knowledge graph. And integrating the single-level knowledge graph in an entity alignment mode to construct a multi-level knowledge graph. Knowledge exists in the form of triples, which are typically divided into relationship triples and attribute triples, such as relationship triples (Zhang Ziyi, Husband, Wang Feng), attribute triples (Zhang Ziyi, age, 41).
By embedding the structure and the attribute by using the relation triple and the attribute triple, the entity semantic information and the attribute information can be fully utilized, and the accuracy of the alignment process is further improved, and the specific method comprises the following steps:
s3.1: and merging the triples in the single-level knowledge graph by adopting a uniform naming method through predicate similarity, and setting 0.95 as a similarity threshold value, so that the entities and the relations can be embedded into the same vector space.
Because the same entity in different knowledge graphs has different representation modes, the same entity of different knowledge graphs is simply named uniformly, and after the name is uniform, the subsequent alignment is executed in a vector space.
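Predicate unification with the 0.95 threshold could be sketched as below. The similarity measure is an assumption: the text does not name one, so this sketch normalizes the predicate strings and compares them with `difflib.SequenceMatcher` from the Python standard library. The function names and sample predicates are hypothetical.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List


def normalize(predicate: str) -> str:
    """Lowercase and drop punctuation, underscores, and whitespace."""
    return re.sub(r"[\W_]+", "", predicate.lower())


def unify_predicates(reference: List[str], others: List[str],
                     threshold: float = 0.95) -> Dict[str, str]:
    """Map each predicate in `others` to its most similar reference predicate.

    A predicate is renamed only if the similarity reaches the threshold;
    otherwise it keeps its own name.
    """
    mapping: Dict[str, str] = {}
    for predicate in others:
        best_name, best_score = predicate, 0.0
        for candidate in reference:
            score = SequenceMatcher(None, normalize(predicate),
                                    normalize(candidate)).ratio()
            if score > best_score:
                best_name, best_score = candidate, score
        mapping[predicate] = best_name if best_score >= threshold else predicate
    return mapping


mapping = unify_predicates(["directed_by", "starring"],
                           ["Directed By", "release_date"])
print(mapping)
```

With such a high threshold, only near-identical predicates are merged, which keeps distinct relations from being collapsed into one name.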
S3.2: Structure alignment and attribute alignment are performed using the relation triples and the attribute triples, respectively.
(A) Structure embedding: structure embedding is performed using the relation triples and the training set, and vector representations of the entities and relations are learned. Given a relation triple $t_r = (h, r, t)$, the model expects $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$. To measure plausibility, the model optimizes a margin-based ranking loss so that positive triples score lower than negative triples:
where $f(t_r) = \lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert$ is the scoring function, $T_r$ is the set of all positive triples, $T_r'$ is the associated set of negative triples generated by replacing the head or the tail entity (but not both) with a random entity, and $\alpha$ is a hyperparameter weighting positive against negative triples, with values in $[0, 1]$. In this way, approximate vector representations of the entities in the knowledge graphs (KGs) are learned. The similarity measure of the entities after structure embedding is:
$$\mathrm{Sim}(h_i, h_j) = \cos(\mathbf{h}_i, \mathbf{h}_j), \quad h_i \in G_1,\ h_j \in G_2$$
where $G_1$ and $G_2$ are knowledge graph 1 and knowledge graph 2, $h_i$ and $h_j$ are an entity in knowledge graph 1 and an entity in knowledge graph 2, $\cos$ is the cosine function, and $\mathrm{Sim}$ is the similarity value after structure embedding, equal to the cosine of the two entity vectors in the vector space.
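The pieces of the structure embedding step can be illustrated with a small pure-Python sketch: a translation-style score for h + r = t, a generic margin-based ranking loss for one positive/negative pair, and the cosine similarity between two entity vectors. The loss shown is the common hinge form and is an assumption; it omits the weighting hyperparameter α, whose exact placement is not recoverable from this text. The Euclidean norm and the function names are likewise illustrative choices.

```python
from math import sqrt
from typing import Sequence

Vector = Sequence[float]


def translation_score(h: Vector, r: Vector, t: Vector) -> float:
    """Distance ||h + r - t|| (Euclidean norm assumed); lower means more plausible."""
    return sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_ranking_loss(pos_score: float, neg_score: float, margin: float = 1.0) -> float:
    """Generic hinge loss: zero once the positive triple beats the negative by `margin`."""
    return max(0.0, margin + pos_score - neg_score)


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two entity vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


h, r, t = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
corrupted_tail = [0.0, 0.0]  # negative triple: tail replaced by another entity

pos = translation_score(h, r, t)
neg = translation_score(h, r, corrupted_tail)
print(pos, neg, margin_ranking_loss(pos, neg))
print(cosine([1.0, 0.0], [1.0, 0.0]), cosine([1.0, 0.0], [0.0, 1.0]))
```

In training, the embeddings would be updated by gradient descent on this loss summed over all positive/negative pairs; the sketch only evaluates it for fixed vectors.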
(B) Attribute value and attribute type embedding based on a Pseudo-Siamese Neural Network
Attribute value embedding: a bidirectional gated recurrent unit (Bi-GRU) network encodes the attribute value from two directions into a single embedding, as detailed in the following equations:
$$Z_i = \sigma\left(W_Z [C_i, h_{i-1}]\right)$$
$$R_i = \sigma\left(W_r [C_i, h_{i-1}]\right)$$
where $Z$ and $R$ are the update gate and the reset gate of the GRU unit, respectively, and $W_Z$, $W_h$, $W_r$ are weight matrices. The Bi-GRU consists of a forward GRU and a backward GRU: the forward GRU reads the embeddings of the input characters from left to right, and the backward GRU reads the character embeddings in reverse order. When reading an attribute value $v = (c_0, c_1, \ldots, c_n)$, each of the two GRUs produces its own output.
The initial state of the Bi-GRU is set to the zero vector; after the character embeddings have been read, the attribute value embedding is finally formed.
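For reference, the two gate equations above match the first half of the commonly used GRU formulation. The remaining two lines below (candidate state and state update) and the concatenation of the final forward and backward states are the standard textbook form and are an assumption here, since this text does not reproduce them; $\tilde{h}_i$, $\overrightarrow{h}_n$, $\overleftarrow{h}_0$, and $\mathbf{v}$ are notation introduced for illustration.

```latex
\begin{aligned}
Z_i &= \sigma\left(W_Z [C_i, h_{i-1}]\right) \\
R_i &= \sigma\left(W_r [C_i, h_{i-1}]\right) \\
\tilde{h}_i &= \tanh\left(W_h [C_i, R_i \odot h_{i-1}]\right) \\
h_i &= (1 - Z_i) \odot h_{i-1} + Z_i \odot \tilde{h}_i \\
\mathbf{v} &= \left[\overrightarrow{h}_n ; \overleftarrow{h}_0\right]
\end{aligned}
```

Here $\odot$ is element-wise multiplication, $C_i$ is the embedding of character $c_i$, and $h_{-1} = \mathbf{0}$ corresponds to the zero initial state mentioned above.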
Attribute type embedding: to learn the importance of different attributes to an entity, attribute types and attribute values share one attention weight. Suppose entity e has m attribute values; an attention weight is calculated for each of them, and the weight of an attribute value should be consistent with the weight of its attribute type, as follows:
The attribute type embedding and the attribute value embedding are then concatenated to obtain the final entity attribute embedding:
The attribute embedding process relies on the fact that entities to be aligned have highly similar attributes. The attribute information of the two KGs is learned through the Pseudo-Siamese Neural Network, and a trained metric is then used to evaluate the similarity between the entity embeddings, where each entity embedding is the concatenation of the entity's attribute type embedding and attribute value embedding.
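The shared-attention idea can be sketched as follows: one weight per attribute is applied both to the attribute type embedding and to the attribute value embedding, and the two weighted sums are concatenated. How the raw attention scores are produced is not stated in this text, so they are taken as inputs here; the softmax normalization and all names are assumptions made for illustration.

```python
from math import exp
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Normalize raw attention scores into weights that sum to 1."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights: List[float], vectors: List[Vector]) -> Vector:
    """Attention-weighted sum of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors)) for k in range(dim)]


def entity_attribute_embedding(type_embs: List[Vector], value_embs: List[Vector],
                               scores: List[float]) -> Vector:
    """Concatenate type and value summaries that share one attention weight per attribute."""
    weights = softmax(scores)  # the same weight serves attribute type i and attribute value i
    return weighted_sum(weights, type_embs) + weighted_sum(weights, value_embs)


type_embs = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical embeddings of two attribute types
value_embs = [[2.0, 2.0], [4.0, 4.0]]  # hypothetical embeddings of their values
embedding = entity_attribute_embedding(type_embs, value_embs, scores=[0.0, 0.0])
print(embedding)
```

With equal scores both attributes receive weight 0.5, so the result is the plain average of the type embeddings followed by the plain average of the value embeddings.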
S3.3: Integrating the multi-level knowledge graph:
Finally, the aligned single-level knowledge graphs are integrated into a unified multi-level knowledge graph.
The attribute embedding and structure embedding processes yield a similarity matrix over the entity pairs. An optimal bipartite graph matching algorithm is then applied to it, the matched pairs are used as new entity pairs for the next iteration, and the optimally matched entities are finally found, which achieves entity alignment.
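The matching step can be illustrated on a tiny similarity matrix. This sketch finds the optimal one-to-one matching by brute force over permutations, which is only practical for very small square matrices; a real system would more likely use the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`). Combining the two similarity matrices by a weighted average is an assumption, and the iterative reuse of matched pairs is not modeled.

```python
from itertools import permutations
from typing import List, Tuple

Matrix = List[List[float]]


def combine(structure_sim: Matrix, attribute_sim: Matrix, weight: float = 0.5) -> Matrix:
    """ASSUMED fusion rule: weighted average of structure and attribute similarity."""
    return [[weight * s + (1.0 - weight) * a for s, a in zip(row_s, row_a)]
            for row_s, row_a in zip(structure_sim, attribute_sim)]


def best_matching(sim: Matrix) -> List[Tuple[int, int]]:
    """Optimal one-to-one matching maximizing total similarity (brute force, small n only)."""
    n = len(sim)
    best = max(permutations(range(n)),
               key=lambda perm: sum(sim[i][perm[i]] for i in range(n)))
    return [(i, best[i]) for i in range(n)]


# Rows: entities of graph 1, columns: entities of graph 2 (hypothetical scores).
sim = [[0.9, 0.8],
       [0.85, 0.1]]
print(best_matching(sim))
```

A greedy row-by-row choice would pair entity 0 with entity 0 (0.9) and leave entity 1 with a 0.1 match; the optimal matching accepts 0.8 for entity 0 so that entity 1 can take 0.85, for a higher total.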
Example 2:
Embodiment 2 of the present disclosure provides a film-and-television-oriented multi-level knowledge graph generation system, including:
a data acquisition module configured to: acquire information data at different levels related to the film or television work to be processed;
a relation extraction module configured to: perform relation extraction on the acquired information data to obtain triple data, and construct a plurality of single-level knowledge graphs from the triple data;
an embedding module configured to: perform structure embedding and attribute embedding using the relation triples and the attribute triples in the triple data;
a knowledge graph integration module configured to: perform entity alignment by combining the results of structure embedding and attribute embedding, and integrate the aligned single-level knowledge graphs to obtain a multi-level knowledge graph.
The system works in the same way as the film-and-television-oriented multi-level knowledge graph generation method provided in embodiment 1, and the details are not repeated here.
Example 3:
At present, people mainly search for film and television knowledge through a search engine, which collects web page information from the Internet according to the keywords entered by the user, organizes and processes the information, and then displays the retrieved web pages to the user. A search engine suits general-purpose information retrieval and is not restricted to a specific field, but it has the following disadvantages: first, web page information is usually matched mechanically by keyword, and the semantics of the user's sentence are not understood, which places demands on the user's input: the more accurate the keywords, the closer the retrieved content is to what the user wants; second, retrieval returns related web pages rather than precise answers, so the user must browse numerous web pages to obtain the required information.
In view of this, embodiment 3 of the present disclosure provides a film and television query method based on a multi-level knowledge graph, including the following steps:
acquiring a text to be queried;
analyzing the text to be queried to obtain an analysis result;
querying the film and television information data according to the analysis result and the multi-level knowledge graph constructed by the generation method of embodiment 1, to obtain a query result;
and displaying the film and television query result.
The data source of the multi-level knowledge graph is film and television text data crawled from web pages.
The analysis method is as follows:
the query text is analyzed with an intention recognition model based on an enhanced convolutional neural network to obtain the analysis result, specifically:
building a dictionary tree from the entities and attributes in the knowledge graph and the collected film and television domain terms;
matching the characters of the query text against the film and television domain terms in the dictionary tree through the intention recognition model, and taking the matched domain terms as the word information sequence;
and fusing all matched film and television domain terms to form the analysis result.
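The dictionary tree (trie) matching in the steps above can be sketched as follows. This covers only the dictionary lookup; the intention recognition model and the fusion of matched terms are not modeled, and the class names and sample vocabulary are hypothetical.

```python
from typing import Dict, Iterable, List


class TrieNode:
    """One node of the dictionary tree; each edge is a single character."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


class Trie:
    """Dictionary tree built from entities, attributes, and domain terms."""

    def __init__(self, words: Iterable[str]) -> None:
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def match_all(self, text: str) -> List[str]:
        """Return every dictionary term occurring in `text`, ordered by start position."""
        found: List[str] = []
        for start in range(len(text)):
            node = self.root
            for end in range(start, len(text)):
                node = node.children.get(text[end])
                if node is None:
                    break  # no dictionary term continues with this character
                if node.is_word:
                    found.append(text[start:end + 1])
        return found


trie = Trie(["Zhang Ziyi", "age", "director"])
terms = trie.match_all("What is the age of Zhang Ziyi")
print(terms)
```

Because matching proceeds character by character, the same structure works for Chinese query text, where terms are not separated by spaces.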
Example 4:
Embodiment 4 of the present disclosure provides a computer-readable storage medium on which a program is stored, where the program, when executed by a processor, implements the steps of the film-and-television-oriented multi-level knowledge graph generation method according to embodiment 1 of the present disclosure.
Example 5:
Embodiment 5 of the present disclosure provides an electronic device, which includes a memory, a processor, and a program stored in the memory and executable on the processor, where the processor, when executing the program, implements the steps of the film-and-television-oriented multi-level knowledge graph generation method according to embodiment 1 of the present disclosure.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Claims (10)
1. A movie-oriented multi-level knowledge map generation method is characterized by comprising the following steps: the method comprises the following steps:
acquiring information data of different layers related to a movie to be processed;
extracting the relation of the acquired information data to obtain triple data, and constructing a plurality of single-level knowledge maps according to the triple data;
embedding the structure and the attribute by utilizing the relation ternary group data and the attribute ternary group data in the ternary group data;
and combining the results of structure embedding and attribute embedding to carry out entity alignment, and integrating a plurality of single-level knowledge maps after entity alignment to obtain a multi-level knowledge map.
2. The method for generating a multi-level knowledge-graph for movies according to claim 1, characterized in that:
the information data includes text, data and tables; triples are extracted from the text by using dependency syntax analysis and HanLP, and triples are extracted from the data and tables by using the Pearson correlation coefficient method.
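By way of non-limiting illustration, the tabular branch of claim 2 can be sketched as follows. The claim does not state how a correlation coefficient becomes a triple, so the sketch assumes that two numeric columns whose Pearson coefficient has an absolute value at or above a threshold yield one triple. The function names, the `correlated_with` relation label and the 0.8 threshold are hypothetical and are not taken from the disclosure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(vx * vy)


def table_to_triples(columns, threshold=0.8):
    """Emit (column_a, 'correlated_with', column_b) for strongly correlated columns.

    `columns` maps a column name to its list of numeric values.
    The relation label and the threshold are illustrative assumptions.
    """
    names = sorted(columns)
    triples = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                triples.append((a, "correlated_with", b))
    return triples
```

Calling `table_to_triples` on a dictionary of numeric columns returns a list of triples that could then be added to the corresponding single-level graph.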
3. The method for generating a multi-level knowledge-graph for movies according to claim 1, characterized in that:
the method further comprises performing multi-source data fusion on the acquired information data and judging the authenticity of the data, comprising:
performing block aggregation on data from different sources on the basis of the entity keywords of each level, the aggregated data serving as candidate matching knowledge;
matching the candidate matching knowledge within the same block against the knowledge of the original knowledge base by using a multi-source data fusion coefficient; if the multi-source data fusion coefficient is greater than a set threshold, the candidate matching knowledge is considered correct and is added to the knowledge base; otherwise, the candidate matching knowledge is not added.
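The blocking and threshold test of claim 3 may be illustrated with the following minimal sketch. The claim does not define the multi-source data fusion coefficient, so the sketch substitutes a simple stand-in: the share of distinct sources in a block that assert a given triple, with triples already present in the knowledge base scored 1.0. That definition, the 0.5 threshold and all function names are assumptions made for illustration only.

```python
from collections import defaultdict


def block_by_entity(records):
    """Group (source, triple) records into blocks keyed by the head entity."""
    blocks = defaultdict(list)
    for source, triple in records:
        blocks[triple[0]].append((source, triple))
    return dict(blocks)


def fusion_coefficient(triple, block, knowledge_base):
    """Hypothetical coefficient: fraction of distinct sources asserting the triple."""
    if triple in knowledge_base:
        return 1.0  # knowledge already in the base is treated as confirmed
    sources = {s for s, _ in block}
    supporting = {s for s, t in block if t == triple}
    return len(supporting) / len(sources)


def fuse(records, knowledge_base, threshold=0.5):
    """Return the knowledge base extended with candidates above the threshold."""
    accepted = set(knowledge_base)
    for block in block_by_entity(records).values():
        for triple in {t for _, t in block}:
            if fusion_coefficient(triple, block, knowledge_base) > threshold:
                accepted.add(triple)
    return accepted
```

In this sketch a candidate asserted by two of three sources is kept, while a conflicting candidate asserted by only one source is discarded.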
4. The method for generating a multi-level knowledge-graph for movies according to claim 1, characterized in that:
merging the triples in the single-level knowledge graphs under a uniform naming scheme by means of predicate similarity, and embedding the entities and the relations into the same vector space;
in the vector space, performing structure alignment and attribute alignment by using the relation triples and the attribute triples, respectively;
integrating the aligned single-level knowledge graphs into a unified multi-level knowledge graph.
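The predicate unification step of claim 4 may be sketched as follows. The claim does not specify how predicate similarity is measured; the sketch assumes plain string similarity from the Python standard library (`difflib.SequenceMatcher`) and a hypothetical threshold of 0.8. The first spelling encountered is kept as the uniform name.

```python
from difflib import SequenceMatcher


def unify_predicates(triples, threshold=0.8):
    """Rename similar predicates to one canonical name (first spelling seen).

    String similarity is an assumed stand-in for the predicate similarity
    measure, which the claim leaves unspecified.
    """
    canonical = []  # canonical predicate names in order of first appearance
    unified = []
    for head, predicate, tail in triples:
        match = next(
            (c for c in canonical
             if SequenceMatcher(None, predicate.lower(), c.lower()).ratio() >= threshold),
            None,
        )
        if match is None:
            canonical.append(predicate)
            match = predicate
        unified.append((head, match, tail))
    return unified
```

After this step, triples from different single-level graphs that differ only in predicate spelling collapse to identical triples and can be merged.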
5. The method for generating a multi-level knowledge-graph for movies according to claim 1, characterized in that:
performing structure embedding on the relation triples based on TransE, comprising: combining the predicate-aligned triples, performing structure embedding by using the relation triples and the training set, learning vector representations of the entities and the relations, and making the scores of positive triples lower than those of negative triples, so as to obtain an entity similarity measurement expression after structure embedding.
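A minimal sketch of the TransE objective referred to in claim 5 is given below. It scores a triple by the squared distance between h + r and t and applies one gradient step of the margin-based ranking loss max(0, margin + score(positive) - score(negative)), which drives positive triples toward lower scores than negative ones. The use of the squared L2 distance, the omission of embedding normalization, the margin of 1.0 and the learning rate are simplifying assumptions; the claim does not fix these details.

```python
def diff(emb, triple):
    """Component-wise h + r - t for a (head, relation, tail) triple."""
    h, r, t = triple
    return [hi + ri - ti for hi, ri, ti in zip(emb[h], emb[r], emb[t])]


def score(emb, triple):
    """Squared TransE distance ||h + r - t||^2; lower means more plausible."""
    return sum(d * d for d in diff(emb, triple))


def train_step(emb, pos, neg, margin=1.0, lr=0.05):
    """One SGD step on max(0, margin + score(pos) - score(neg)); returns the loss."""
    loss = max(0.0, margin + score(emb, pos) - score(emb, neg))
    if loss == 0.0:
        return loss  # margin already satisfied, nothing to update
    d_pos, d_neg = diff(emb, pos), diff(emb, neg)  # computed before any update
    # Descend on the positive distance (sign -1), ascend on the negative (sign +1).
    for sign, (h, r, t), d in ((-1.0, pos, d_pos), (1.0, neg, d_neg)):
        for i, di in enumerate(d):
            step = sign * lr * 2.0 * di
            emb[h][i] += step
            emb[r][i] += step
            emb[t][i] -= step
    return loss
```

Here `emb` is a dictionary mapping each entity or relation name to a list of floats; repeating `train_step` over positive triples and corrupted negatives yields the learned vector representations.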
6. The method for generating a multi-level knowledge-graph for movies according to claim 1, characterized in that:
embedding attribute values and attribute types based on a pseudo-Siamese neural network, comprising:
encoding each attribute value into a single embedding from two directions by using a bidirectional gated recurrent unit (BiGRU) network, making the attribute types and the attribute values share an attention weight, and concatenating the attribute type embedding and the attribute value embedding to obtain a final attribute embedding result.
7. A movie-and-television-oriented multi-level knowledge graph generation system, characterized by comprising:
a data acquisition module configured to: acquire information data of different levels related to a movie to be processed;
a relation extraction module configured to: perform relation extraction on the acquired information data to obtain triple data, and construct a plurality of single-level knowledge graphs according to the triple data;
an embedding module configured to: perform structure embedding and attribute embedding by using, respectively, the relation triple data and the attribute triple data in the triple data;
a knowledge graph integration module configured to: combine the results of structure embedding and attribute embedding to perform entity alignment, and integrate the plurality of single-level knowledge graphs after entity alignment to obtain a multi-level knowledge graph.
8. A movie and television query method based on a multi-level knowledge graph, characterized by comprising the following steps:
acquiring a text to be queried;
analyzing the text to be queried to obtain an analysis result;
querying the movie information data according to the analysis result and the multi-level knowledge graph constructed by the generation method of any one of claims 1 to 6, to obtain a query result.
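The query flow of claim 8 may be illustrated by the following minimal sketch, in which the knowledge graph is represented as a list of triples. The claim does not describe the analysis step; the sketch assumes simple keyword matching that picks out one entity and one predicate from the text. The function names and the keyword table passed as `predicates` are hypothetical.

```python
def analyze(text, entities, predicates):
    """Return the (entity, predicate) pair mentioned in the text, if any.

    `predicates` maps a predicate name to trigger keywords; this keyword
    matching is an assumed stand-in for the unspecified analysis step.
    """
    lowered = text.lower()
    entity = next((e for e in entities if e.lower() in lowered), None)
    predicate = next(
        (p for p, keywords in predicates.items()
         if any(k in lowered for k in keywords)),
        None,
    )
    return entity, predicate


def query(text, triples, predicates):
    """Look up the tails of triples matching the analyzed entity and predicate."""
    entities = sorted({head for head, _, _ in triples})
    entity, predicate = analyze(text, entities, predicates)
    return [tail for head, p, tail in triples if head == entity and p == predicate]
```

When no entity or predicate is recognized in the text, `query` returns an empty list rather than raising an error.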
9. A computer-readable storage medium, on which a program is stored, wherein the program, when executed by a processor, implements the steps in the movie-oriented multi-level knowledge-graph generation method according to any one of claims 1 to 6.
10. An electronic device comprising a memory, a processor and a program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the movie-and-television-oriented multi-level knowledge graph generation method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110254580.7A CN112860916B (en) | 2021-03-09 | 2021-03-09 | Movie-television-oriented multi-level knowledge map generation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112860916A (en) | 2021-05-28 |
CN112860916B (en) | 2022-09-16 |
Family ID: 75993475
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110254580.7A (Active; CN112860916B (en)) | Movie-television-oriented multi-level knowledge map generation method | 2021-03-09 | 2021-03-09 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112860916B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113282764A (en) * | 2021-06-29 | 2021-08-20 | 南方电网科学研究院有限责任公司 | Network security data knowledge graph construction method and device |
CN115391569A (en) * | 2022-10-27 | 2022-11-25 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Method for automatically constructing industry chain map from research report and related equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111694966A (en) * | 2020-06-10 | 2020-09-22 | 齐鲁工业大学 | Multilevel knowledge map construction method and system for chemical industry field |
CN112131404A (en) * | 2020-09-19 | 2020-12-25 | 哈尔滨工程大学 | Entity alignment method in four-risk one-gold domain knowledge graph |
CN112200317A (en) * | 2020-09-28 | 2021-01-08 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Multi-modal knowledge graph construction method |
Non-Patent Citations (4)
Title |
---|
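Tao Sun; Jiaojiao Zhai; Qi Wang: "NovEA: A Novel Model of Entity Alignment Using Attribute Triples and Relation Triples", 《Knowledge Science, Engineering and Management, 13th International Conference, KSEM 2020》 * |
Lü Yan et al.: "A Knowledge Graph Embedding Model for Fine-Grained Representation of Multi-Valued Attributes", 《Computer and Digital Engineering》 * |
Du Wenqian et al.: "A Knowledge Graph Representation Learning Method Integrating Entity Descriptions and Types", 《Journal of Chinese Information Processing》 * |
Zhao Xiaojuan et al.: "A Survey of Multi-Source Knowledge Fusion Techniques", 《Journal of Yunnan University (Natural Sciences Edition)》 * |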
Also Published As
Publication number | Publication date |
---|---|
CN112860916B (en) | 2022-09-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | Address after: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501; Patentee after: Qilu University of Technology (Shandong Academy of Sciences); Country or region after: China. Address before: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501; Patentee before: Qilu University of Technology; Country or region before: China |