CN111782769A - Intelligent knowledge graph question-answering method based on relation prediction - Google Patents
- Publication number: CN111782769A (application CN202010628423.3A)
- Authority
- CN
- China
- Prior art keywords
- entity
- question
- kgs
- relation
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F16/3329 — Natural language query formulation or dialogue systems
- G06F16/3344 — Query execution using natural language analysis
- G06F16/35 — Clustering; Classification
- G06F16/367 — Ontology
- G06F40/295 — Named entity recognition
- G06F40/30 — Semantic analysis
Abstract
The invention relates to a knowledge graph intelligent question-answering method based on relation prediction, and belongs to the field of natural language processing. The method comprises the following steps: S1: input a question Q and preprocess it; S2: identify the entity e_question in the question using entity recognition techniques, and map e_question to the corresponding entity e_KGs in the knowledge graph (KGs); S3: query KGs for the class c of entity e_KGs, replace the entity e_question in question Q with class c, and denote the result as Q_c; S4: map out a relation r from Q_c; S5: in KGs, check whether entity e_KGs and relation r are connected; S6: learn a new vector representation of the central entity e_KGs; S7: infer the relations hidden in KGs based on existing related triples; S8: obtain the answer A through knowledge graph reasoning over the entity and the relation. The invention can find the correspondence between question entities and knowledge graph entities, and between the natural language description of a question and the semantic relations of the knowledge graph.
Description
Technical Field
The invention belongs to the field of natural language processing, and relates to a knowledge graph intelligent question-answering method based on relation prediction.
Background
The traditional keyword-based search-engine paradigm lacks semantic analysis and semantic understanding of natural language, and finds it increasingly difficult to meet users' needs. For users, an interaction mode that matches natural human language expression is ideal, and a question-answering system can satisfy this requirement once it shows sufficient intelligence. Google proposed the concept of the Knowledge Graph (KGs) in 2012, further pushing question-answering systems toward intelligence. With the development of knowledge graph technology, intelligent question-answering systems show new prospects. The rise of social networking sites provides a large amount of real question-answer corpora in different fields for research on intelligent question-answering systems, greatly helping machines understand natural language questions at the data level. The rapid development of large-scale knowledge graphs such as Freebase and DBpedia provides high-quality structured knowledge sources for intelligent question-answering systems. Knowledge-graph-based intelligent question answering is an important direction in current question-answering research. It has great application prospects in the era of artificial intelligence, and provides technical and theoretical support for the mobile internet and for society's information portals.
Disclosure of Invention
In view of this, the present invention aims to provide a knowledge graph intelligent question-answering method based on relation prediction, which infers the hidden relations of KGs by constructing an attention-based graph embedding method, thereby completing the missing relations of KGs and improving the accuracy of the knowledge-graph-based question-answering system (KGs-QA).
In order to achieve the purpose, the invention provides the following technical scheme:
a knowledge graph intelligent question-answering method based on relation prediction comprises the following steps:
S1: input a question Q and preprocess it;
S2: identify the entity e_question in the question using entity recognition techniques, and map e_question to the corresponding entity e_KGs in KGs;
S3: query KGs for the class c of entity e_KGs, replace the entity e_question in question Q with class c, and denote the result as Q_c;
S4: map out a relation r from Q_c;
S5: in KGs, check whether entity e_KGs and relation r are connected;
S6: learn a new vector representation of the central entity e_KGs;
S7: infer the relations hidden in KGs based on existing related triples;
S8: obtain the answer A through knowledge graph reasoning over the entity and the relation.
Optionally, step S1 specifically includes: the text is segmented into words or phrases using a CRF parser and a maximum-entropy dependency parser from HanLP and the Stanford parser, and quantitative descriptions of part of speech, word order, keywords, and dependency relations are obtained.
Optionally, step S2 specifically includes: predicting whether each word in the question is an entity using a bidirectional long short-term memory (Bi-LSTM) model;
processing the input sequence with a forward LSTM unit and a backward LSTM unit, and finally concatenating the output vectors of the two LSTM units;
the output vector of the model is y = (y_1, y_2, ..., y_n), where n is the length of the input sequence, so the length of the model output vector is consistent with the input sequence; y_i is the label corresponding to the i-th word of the input question: if it is "1", the word is the sought entity, otherwise it is not.
Optionally, step S3 specifically includes: conceptualizing the entities in the question using a latent Dirichlet allocation (LDA) topic model, which aids understanding of the entities and increases their interpretability;
a corpus-based, context-dependent conceptualization framework is developed by capturing semantic relationships between words, combining the LDA topic model with large-scale probabilistic KGs.
Optionally, step S4 specifically includes: introducing a convolutional neural network (CNNs) model into the relation-linking task; extracting semantic information about the relation in the question through the deep neural network model, processing all relations of the candidate entities with the same model, and matching the obtained question attribute vector against the knowledge graph attribute vectors by similarity to obtain the final, correctly linked relation.
Optionally, step S5 specifically includes: based on the entity identified in step S2 and the relation linked in step S4, simplifying knowledge graph reasoning over entities and relations into a subgraph matching problem;
in the knowledge graph, if no match is found, i.e. entity e_KGs and relation r lack a connection, the subsequent relation prediction task is performed.
Optionally, step S6 specifically includes: to solve the problem that information hidden in the neighborhood around a triple cannot be captured, optimizing the vector learning model and proposing an attention-based feature embedding method that captures entity features and relation features in the neighborhood of any given entity;
encapsulating relation clusters and multi-hop relations in the model; to obtain a new vector representation of a central entity, the feature vector of each set of related facts existing in the neighborhood of the central entity is learned by a linear transformation.
Optionally, step S7 specifically includes: inferring the initially hidden relations by identifying existing related fact triples (h, r, t), where h denotes the head semantic entity, r the semantic relation, and t the tail semantic entity; that is, multi-hop entities and multi-hop relations in the neighborhood of a central entity are learned, and auxiliary edges are introduced between n-hop neighbors to realize the relation prediction task.
Optionally, step S8 specifically includes: KGs-QA supports binary factual question answering by learning 27 million templates covering 2782 intents.
The invention has the following beneficial effects: the invention reasons over the entities and relations in a natural language question using a knowledge graph method, thereby finding the correspondence between question entities and knowledge graph entities and between the natural language description of the question and the semantic relations of the knowledge graph, and obtains the corresponding answers through template-based knowledge graph reasoning. The natural language understanding function of the invention therefore not only understands literal meaning, but is also capable of logical reasoning and of understanding deep meaning.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of the construction of an intelligent knowledge-graph-based question-answering system according to the present invention;
FIG. 2 is a schematic diagram of the Bi-LSTM-based entity recognition model of the present invention;
FIG. 3 is a schematic diagram of a CNNs-based relational link model according to the present invention;
FIG. 4 is a schematic diagram of knowledge-graph based entity identification and relationship linking according to the present invention;
FIG. 5 is a schematic diagram of a sub-graph of a knowledge graph according to the present invention;
FIG. 6 is a schematic diagram of knowledge graph inference based on entities and relationships according to the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are provided only to illustrate the invention and are not intended to limit it; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged, or reduced, and do not represent the size of an actual product; and it will be understood by those skilled in the art that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments of the present invention denote the same or similar components. In the description of the present invention, it should be understood that terms indicating an orientation or positional relationship, such as "upper", "lower", "left", "right", "front", and "rear", are based on the orientation or positional relationship shown in the drawings and are used only for convenience and simplicity of description; they do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation. Such terms are therefore illustrative only and are not to be construed as limiting the present invention; their specific meanings can be understood by those skilled in the art according to the specific situation.
FIG. 1 is a flow chart of the construction of the knowledge-graph-based intelligent question-answering system of the present invention. The natural language question-answer corpus of Yahoo! Answers serves as the semantic knowledge resource, and the knowledge graph serves as the semantic representation method. Through the data representation of a knowledge graph, the entities of natural language are described to complete the question-answering task, so that the natural language understanding function of the model not only understands literal meaning but is also capable of logical reasoning and of understanding deep meaning. An embodiment of the intelligent question-answering system using the knowledge graph is provided below to further illustrate the invention.
As shown in FIG. 1, the details of the present invention are as follows:
1. Input a question Q and preprocess it. The text is segmented into words or phrases using a CRF parser and a maximum-entropy dependency parser from HanLP and the Stanford parser, and quantitative descriptions of part of speech, word order, keywords, and dependency relations are obtained.
2. Identify the entity e_question in the question using entity recognition techniques, and map e_question to the corresponding entity e_KGs in KGs. Whether each word in the question is an entity is predicted using a bidirectional long short-term memory (Bi-LSTM) model, as shown in fig. 2. The input sequence is processed by a forward LSTM unit and a backward LSTM unit, and finally the output vectors of the two LSTM units are concatenated. Compared with the plain LSTM model, the Bi-LSTM model keeps its advantages while also taking context into account by training the forward and backward sequences separately, so deeper semantic information can be extracted. The output vector of the model is y = (y_1, y_2, ..., y_n), where n is the length of the input sequence; as can be seen, the length of the model output vector is consistent with the input sequence. y_i is the label corresponding to the i-th word of the input question: if it is 1, the word is the sought entity; otherwise it is not.
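The per-word tagging scheme above can be sketched as follows. This is a minimal illustrative stand-in, not the patent's implementation: plain tanh recurrent cells replace the gated LSTM cells, and all names (`BiRNNTagger`, `tag`) and dimensions are invented for the example. The point is only the bidirectional pass, the per-token concatenation, and a sigmoid score sequence whose length matches the input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BiRNNTagger:
    """Bidirectional recurrent tagger sketch.

    Plain tanh recurrent cells stand in for the LSTM gates; each direction
    reads the sequence once, the two hidden states are concatenated per
    token, and a sigmoid layer emits one entity score per word, so the
    output length always matches the input length."""

    def __init__(self, dim_in, dim_h, seed=0):
        rng = np.random.default_rng(seed)
        self.Wf = rng.normal(0.0, 0.1, (dim_h, dim_in + dim_h))   # forward cell
        self.Wb = rng.normal(0.0, 0.1, (dim_h, dim_in + dim_h))   # backward cell
        self.Wo = rng.normal(0.0, 0.1, (2 * dim_h,))              # sigmoid output layer

    def _run(self, W, xs):
        h = np.zeros(W.shape[0])
        states = []
        for x in xs:
            h = np.tanh(W @ np.concatenate([x, h]))
            states.append(h)
        return states

    def tag(self, xs):
        fwd = self._run(self.Wf, xs)                  # left-to-right pass
        bwd = self._run(self.Wb, xs[::-1])[::-1]      # right-to-left pass, realigned
        # concatenate forward/backward states per token, then squash to (0, 1)
        return [float(sigmoid(self.Wo @ np.concatenate([f, b])))
                for f, b in zip(fwd, bwd)]
```

In use, each score above a threshold (e.g. 0.5) would be tagged "1" (entity) and the rest "0", reproducing the tag-sequence view of entity recognition.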
3. Query KGs for the class c of entity e_KGs, replace the entity e_question in question Q with class c, and denote the result as Q_c. Entities in the question are conceptualized with a latent Dirichlet allocation (LDA) topic model, which aids understanding of the entities and increases their interpretability. A corpus-based, context-dependent conceptualization framework is developed by capturing semantic relationships between words, combining the LDA topic model with large-scale probabilistic KGs. The framework automatically disambiguates the input information (e.g., the term "apple" in the question "where is the headquarters of apple company" is mapped to a concept such as "company" rather than "fruit"). The conceptualization mechanism itself is based on a large semantic network of millions of concepts, so there is enough granularity to represent a wide variety of questions.
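The construction of Q_c, replacing the entity mention by its class, can be illustrated with a trivial sketch. The `conceptualize` helper and the `$company` placeholder are hypothetical; in the actual method the class comes from the LDA-plus-KGs conceptualization framework described above.

```python
def conceptualize(question: str, entity: str, concept: str) -> str:
    """Build Q_c by replacing the entity mention e_question in Q with its class c."""
    return question.replace(entity, concept)

# The "apple" example from the description, with a hypothetical class label:
q_c = conceptualize("where is the headquarters of apple", "apple", "$company")
```

Here `q_c` becomes "where is the headquarters of $company", which is the conceptualized question passed on to relation mapping.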
4. Map out the relation r from Q_c. A convolutional neural network (CNNs) model is introduced in the relation-linking task, as shown in fig. 3. Semantic information about the relation in the question is extracted through the deep neural network model, all relations of the candidate entities are processed with the same model, and the obtained question attribute vector is matched against the knowledge graph attribute vectors by similarity to obtain the final, correctly linked relation.
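A compact sketch of a shared-weight convolutional matcher follows, under stated assumptions: a single convolution layer with tanh and max-over-time pooling, cosine scoring, and randomly initialized weights. The function names and shapes are invented for illustration and are not taken from the patent.

```python
import numpy as np

def conv_encode(tokens, W):
    """1-D convolution over token vectors + max-over-time pooling:
    turns a variable-length text into one fixed-size semantic vector."""
    k = W.shape[1] // tokens.shape[1]                  # kernel width in tokens
    windows = np.stack([tokens[i:i + k].ravel()
                        for i in range(len(tokens) - k + 1)])
    return np.tanh(windows @ W.T).max(axis=0)          # (n_filters,)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def link_relation(question, candidates, W):
    """Encode the question and every candidate relation with the SAME
    weights W (parameter sharing), then return the index of the candidate
    whose semantic vector is closest to the question's, plus all scores."""
    q = conv_encode(question, W)
    scores = [cosine(q, conv_encode(c, W)) for c in candidates]
    return int(np.argmax(scores)), scores
```

Sharing `W` between question and candidates is the design point: both sides land in the same semantic space, so cosine similarity between them is meaningful.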
5. In KGs, check whether entity e_KGs and relation r are connected. Knowledge reasoning based on entities and relations is simplified to a subgraph matching problem; in the knowledge graph, if no match is found, i.e. entity e_KGs and relation r lack a connection, as shown in fig. 4, the subsequent relation prediction task is performed.
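For a single-relation question, the subgraph check in step 5 reduces to looking up whether an edge (e_KGs, r, ·) exists; a toy sketch (the triples and the helper name are hypothetical):

```python
def match_subgraph(triples, head, relation):
    """Look for edges (head, relation, tail) in the stored triples.
    Returns the list of tails if the connection exists, or None to signal
    that the relation is missing and relation prediction should run next."""
    tails = [t for (h, r, t) in triples if h == head and r == relation]
    return tails if tails else None

# Toy KG fragment (made-up facts):
kg = [("Bob", "brother", "Andy"), ("Andy", "born_in", "Washington")]
```

A `None` result here is exactly the "no match found" branch that triggers the relation prediction task of steps 6 and 7.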
6. Learn a new vector representation of the central entity e_KGs. To solve the problem that traditional methods cannot capture the information hidden in the neighborhood around a triple, the vector learning model is optimized and an attention-based feature embedding method is proposed. Furthermore, relation clusters and multi-hop relations are encapsulated in the model. To obtain a new vector representation of a central entity, the feature vector of each set of related facts existing in the neighborhood of the central entity is learned by a linear transformation.
7. Infer the relations hidden in KGs based on existing related triples. The initially hidden relation is inferred by finding already stored related fact triples (h, r, t), where h denotes the head semantic entity, r the semantic relation, and t the tail semantic entity. More specifically, multi-hop entities and multi-hop relations in the neighborhood of the central entity are learned, and auxiliary edges are introduced between n-hop neighbors to realize the relation prediction task, as shown in fig. 5.
8. Obtain the answer A through knowledge graph reasoning over entities and relations, as shown in fig. 6. Based on large-scale KGs and a large amount of question-answer corpora, a new question representation method is designed: the template. 27 million templates covering 2782 intents are learned. Based on these templates, KGs-QA effectively supports binary factual question answering.
FIG. 2 is a schematic diagram of the Bi-LSTM-based entity recognition model of the present invention. The method treats entity recognition as a sequence labeling problem and uses a Bi-LSTM model to predict whether each word in the question is an entity. For example, for the question "where is the capital of China", the segmentation result is "China / of / capital / is / where" and the entity is "China", so the tag sequence is (1, 0, 0, 0, 0). In this sequence, "1" marks "China" as the entity of the question. The process actually applies word segmentation, dictionary construction, and similar data processing to the natural language question, and puts all possible entities into the candidate entity set.
The forward and backward LSTMs respectively process the input vectors (x_1, x_2, ..., x_{t-1}, x_t) to obtain the output vector h_t, i.e. h_t = [h_t^f ; h_t^b], where h_t^f is the output of the forward sequence and h_t^b is the output of the backward sequence. The output of the Bi-LSTM is then fed to a sigmoid layer, i.e. y_i = sigmoid(ω · h_i + b). The output vector of the model is y = (y_1, y_2, ..., y_n), where n is the length of the input sequence; as can be seen, the length of the model output vector is consistent with the input sequence. y_i is the label corresponding to the i-th word of the input question: "1" marks the sought entity, otherwise not. In the present invention, the mean square error is used as the loss function of the model, i.e. L = Σ_{i=1}^{n} (y_i − z_i)^2 + λ‖ω‖_2^2, where ω is the weight, b is the bias, y_i is the model's prediction, z_i is the target value, λ is the hyper-parameter controlling regularization, and ‖ω‖_2^2 is the L2 norm.
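The loss just described, a squared error between prediction y_i and target z_i plus an L2 penalty λ‖ω‖_2^2 on the weights, can be computed as below; a sketch with an invented function name, assuming the sum form of the squared-error term:

```python
import numpy as np

def bilstm_loss(y_pred, z_true, omega, lam):
    """Squared-error tagging loss with an L2 weight penalty:
    L = sum_i (y_i - z_i)^2 + lam * ||omega||_2^2."""
    y, z, w = np.asarray(y_pred), np.asarray(z_true), np.asarray(omega)
    return float(np.sum((y - z) ** 2) + lam * np.sum(w ** 2))
```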
Fig. 3 is a schematic diagram of the CNNs-based relation-linking model of the present invention. The relation-linking process essentially measures the relevance between the relation expressed in the question and each candidate relation. Based on this idea, the CNNs model shown in fig. 3 is adopted. It should be noted that the named entity in the input question vector is replaced by its concept, which avoids the influence of the named entity on the mapping result. As can be seen from the figure, CNNs perform convolution on the question vector and on each candidate relation vector to obtain the corresponding semantic vectors. Finally, the relevance between the obtained question and relation semantic vectors is computed to obtain the linking result. After the question and each candidate relation are processed by the CNNs model, their semantic vectors are obtained, and their semantic similarity is measured by the cosine similarity, i.e. cos θ = (Σ_k Q_k R_k) / (sqrt(Σ_k Q_k^2) · sqrt(Σ_k R_k^2)), where θ is the angle between vector Q and vector R, Q_k is the k-th element of the question semantic vector, and R_k is the k-th element of the candidate relation vector. The cosine distance computes the cosine of the angle between the two vectors; in the same vector space, the smaller the angle, the closer the two vectors.
FIG. 5 is a diagram of a subgraph of a knowledge graph according to the present invention. In the figure, the solid lines represent existing relations and the dashed lines represent the introduced auxiliary relations (hidden relations). Consider, for example, the question "Where was Bob born?" The analysis process includes entity recognition and relation linking, as shown in FIG. 5. After finding the entities and relations in KGs, we note that KGs may be incomplete (i.e., a relation between entities is missing), as shown in FIG. 5. The relation prediction task is realized by assigning different attention to nearby entities; the attention is propagated through the layers iteratively, and the contribution of an entity becomes smaller as the number of iterations grows. A promising way to address this is the composition of relations, achieved by introducing an auxiliary edge (dashed line) between n-hop neighbors; in this case n = 2, as shown in FIG. 5. From learning, we note that the importance of <Bob, brother, Andy> + <Andy, born_in, Washington> is greater. In our model, an auxiliary relation (dashed line) is introduced between two entities that are 2-hop neighbors. The vector representation of the auxiliary relation is the sum of the vector representations of all the relevant relations (solid lines). In this example, the auxiliary relation (dashed line) can be understood as <Bob, brother, Andy> plus <Andy, born_in, Washington>.
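Since the description states that an auxiliary edge's vector is the sum of the vectors of the relations along the path, the 2-hop Bob/Andy example can be sketched directly; the two relation vectors below are made-up toy values.

```python
import numpy as np

def auxiliary_relation(path_relations):
    """Vector for an introduced n-hop auxiliary edge: the sum of the
    vector representations of the relations along the path (solid lines)."""
    return np.sum(path_relations, axis=0)

# 2-hop example from the description, with toy relation vectors:
brother = np.array([0.2, 0.1])     # <Bob, brother, Andy>
born_in = np.array([0.3, -0.1])    # <Andy, born_in, Washington>
aux = auxiliary_relation([brother, born_in])   # dashed edge Bob -> Washington
```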
FIG. 6 is a schematic diagram of knowledge graph inference based on entities and relations according to the present invention. The concept of word embedding is introduced to convert the acquired knowledge graph training samples into low-dimensional space vectors, so that knowledge reasoning becomes the problem of processing a natural language question through constructed templates, finding the correspondence between question entities and knowledge graph entities and between the natural language description of the question and the semantic relations of the knowledge graph. Deep-learning-based entity recognition is studied: Bi-LSTM makes full use of context information to locate the candidate entity positions in the question. Deep-learning-based relation linking is studied: CNNs extract deep semantic features, an attention mechanism is integrated to obtain the semantic vectors of the relevant relations in the question, and a parameter-sharing mechanism feeds the candidate relations in the triples into the same model to obtain their semantic vectors. Finally, the most correct candidate relation is selected by cosine similarity. In KGs, it is then checked whether entity e_KGs and relation r are connected, as shown in fig. 4, and the subsequent relation prediction task is performed. First, a new vector representation of the central entity e_KGs is learned. To solve the problem that traditional methods cannot capture the information hidden in the neighborhood around a triple, the vector learning model is optimized and an attention-based feature embedding method is proposed. Furthermore, relation clusters and multi-hop relations are encapsulated in the model.
To obtain a new vector representation of a central entity, the feature vector of each set of related facts existing in the neighborhood of the central entity is learned by a linear transformation. The relations hidden in KGs are then inferred based on the existing related triples. The initially hidden relation is inferred by finding already stored related fact triples (h, r, t), where h denotes the head semantic entity, r the semantic relation, and t the tail semantic entity. More specifically, multi-hop entities and multi-hop relations in the neighborhood of the central entity are learned, and auxiliary edges are introduced between n-hop neighbors to realize the relation prediction task, as shown in fig. 5. Finally, knowledge graph reasoning based on entities and relations is realized, and the answer A is obtained by querying the knowledge graph.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.
Claims (9)
1. A knowledge graph intelligent question-answering method based on relation prediction, characterized by comprising the following steps:
s1: inputting a problem Q, and preprocessing the problem;
s2: identifying an entity e in a problem using entity identification techniquesquestionAnd an entity equestionMapping to corresponding entity e in KGsKGs;
S3: query KGs for entity eKGsClass c, replacing entity e in question Q with class cquestionIs marked as Qc;
S4: from QcMapping a relation r;
s5: at KGs, if entity eKGsAnd the relation r;
s6: learning center entity eKGsA new vector representation of (2);
s7: inferring KGs relationships hidden in the object based on existing related triples;
s8: and obtaining an answer A based on knowledge graph reasoning of the entity and the relation.
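For orientation only, the S1-S8 flow can be imitated end-to-end on a toy triple store. The naive string matching below stands in for the claimed Bi-LSTM entity recognition and CNN relation linking; all entity and relation names are hypothetical examples.

```python
# toy triple store standing in for the KGs (all names hypothetical)
KG = {("Paris", "capital_of", "France"),
      ("France", "located_in", "Europe")}

def find_entity(question, kg):
    # S2 stand-in: match any KG entity mentioned verbatim in the question
    entities = {h for h, _, _ in kg} | {t for _, _, t in kg}
    for e in entities:
        if e.lower() in question.lower():
            return e
    return None

def find_relation(question, kg, entity):
    # S4/S5 stand-in: pick the outgoing relation whose name tokens
    # overlap the question text the most
    best, best_score = None, 0
    for h, r, _ in kg:
        if h == entity:
            score = sum(tok in question.lower() for tok in r.split("_"))
            if score > best_score:
                best, best_score = r, score
    return best

def answer(question, kg):
    # S8 stand-in: query the graph with the linked entity and relation
    e = find_entity(question, kg)
    r = find_relation(question, kg, e)
    for h, rel, t in kg:
        if h == e and rel == r:
            return t
    return None
```

When the entity-relation pair is missing from the graph, the patented method proceeds to relation prediction (S5-S7) instead of returning None as this sketch does.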
2. The intelligent knowledge graph question-answering method based on relation prediction according to claim 1, characterized in that step S1 specifically comprises: segmenting the text into words or phrases by means of the CRF syntactic analyzer and the maximum entropy dependency parser in HanLP and Stanford Parser, and obtaining a quantitative description of part of speech, word order, keywords, and dependency relations.
3. The intelligent knowledge graph question-answering method based on relation prediction according to claim 1, characterized in that step S2 specifically comprises: predicting whether each word in the question is an entity by using a bidirectional long short-term memory (Bi-LSTM) network model;
processing the input sequence with a forward LSTM unit and a backward LSTM unit, and finally splicing the output vectors of the two LSTMs;
the output vector of the model is y = (y_1, y_2, ..., y_n), where n is the length of the input sequence, so that the length of the model output is consistent with that of the input sequence; y_i is the label corresponding to the i-th word of the input question: if y_i is 1, the word belongs to a sought entity, otherwise it does not.
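The splicing and per-token tagging described in this claim can be illustrated schematically. A real Bi-LSTM would produce the two output sequences with trained recurrent cells and a trained classifier; both are stubbed out here with toy values, so only the shape of the computation is shown.

```python
def bilstm_tag(forward_out, backward_out, classify):
    # splice the per-token outputs of the forward and backward LSTMs
    # (vector concatenation), then label each token: 1 = entity word
    assert len(forward_out) == len(backward_out)
    spliced = [f + b for f, b in zip(forward_out, backward_out)]
    return [classify(v) for v in spliced]
```

The output sequence has the same length as the input, matching the claim's y = (y_1, ..., y_n).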
4. The intelligent knowledge graph question-answering method based on relation prediction according to claim 1, characterized in that step S3 specifically comprises: conceptualizing the entities in the question by using a latent Dirichlet allocation (LDA) topic model, so as to facilitate understanding of the entities and increase their interpretability;
a corpus-based, context-dependent conceptualization framework is developed by combining the LDA topic model with large-scale probabilistic KGs to capture the semantic relations between words.
5. The intelligent knowledge graph question-answering method based on relation prediction according to claim 1, characterized in that step S4 specifically comprises: introducing a convolutional neural network (CNN) model into the relation linking task; extracting semantic information about the relation in the question through the deep neural network model, processing all relations of the candidate entities with the same model, and performing similarity matching between the obtained question attribute vector and the knowledge graph attribute vectors to obtain the final correctly linked relation.
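The CNN feature extraction over the question can be illustrated with a single convolution filter and max-over-time pooling, the standard text-CNN building block. Dimensions and weights are toy assumptions; parameter sharing in the claim means the same filters would also encode the candidate relation texts.

```python
def conv1d_maxpool(token_vecs, kernel):
    # slide a window of len(kernel) tokens over the embedded question,
    # compute the filter activation at each position, and keep the
    # maximum (max-over-time pooling), as in standard text CNNs
    k = len(kernel)
    acts = []
    for i in range(len(token_vecs) - k + 1):
        window = token_vecs[i:i + k]
        acts.append(sum(sum(w * x for w, x in zip(kw, tv))
                        for kw, tv in zip(kernel, window)))
    return max(acts)
```

A bank of such filters yields the semantic feature vector that is then compared by cosine similarity.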
6. The intelligent knowledge graph question-answering method based on relation prediction according to claim 1, characterized in that step S5 specifically comprises: based on the entity identified in step S2 and the relation linked in step S4, simplifying knowledge graph reasoning over the entity and the relation into a subgraph matching problem;
in the knowledge graph, if no match is found, that is, the entity e_KGs and the relation r lack a connection, the next relation prediction task is performed.
7. The intelligent knowledge graph question-answering method based on relation prediction according to claim 1, characterized in that step S6 specifically comprises: to address the problem that information hidden in the neighborhood around a triple cannot be captured, optimizing the vector learning model and proposing an attention-based feature embedding method that captures the entity features and relation features in the neighborhood of any given entity;
encapsulating relation clusters and multi-hop relations in the model; to obtain a new vector representation of the central entity, the feature vectors of each set of related facts in the central entity's neighborhood are learned by linear transformation.
8. The intelligent knowledge graph question-answering method based on relation prediction according to claim 1, characterized in that step S7 specifically comprises: inferring an initially hidden relation by identifying existing related fact triples (h, r, t), where h denotes the head semantic entity, r the semantic relation, and t the tail semantic entity; that is, multi-hop entities and multi-hop relations in the neighborhood of the central entity are learned, and auxiliary edges are introduced between n-hop neighborhoods to realize the relation prediction task.
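The introduction of auxiliary edges between n-hop neighborhoods can be sketched for the 2-hop case: every path h -r1-> m -r2-> t contributes a composite edge from h to t, making multi-hop neighbors directly visible to the embedding model. The "r1/r2" naming is an illustrative convention, not from the disclosure.

```python
def add_auxiliary_edges(triples):
    # for every 2-hop path h -r1-> m -r2-> t, add a composite auxiliary
    # edge h -"r1/r2"-> t so multi-hop neighbors become one hop away
    out = set(triples)
    outgoing = {}
    for h, r, t in triples:
        outgoing.setdefault(h, []).append((r, t))
    for h, r1, m in triples:
        for r2, t in outgoing.get(m, []):
            out.add((h, r1 + "/" + r2, t))
    return out
```

Iterating this construction (or generalizing the inner loop) would cover n-hop neighborhoods for n > 2.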
9. The intelligent knowledge graph question-answering method based on relation prediction according to claim 1, characterized in that step S8 specifically comprises: the KGs-QA supports binary factual question answering by learning 2782 intents from 27 million templates.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010628423.3A CN111782769B (en) | 2020-07-01 | 2020-07-01 | Intelligent knowledge graph question-answering method based on relation prediction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111782769A true CN111782769A (en) | 2020-10-16 |
CN111782769B CN111782769B (en) | 2022-07-08 |
Family
ID=72758091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010628423.3A Active CN111782769B (en) | 2020-07-01 | 2020-07-01 | Intelligent knowledge graph question-answering method based on relation prediction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111782769B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160019280A1 (en) * | 2013-03-15 | 2016-01-21 | Google Inc. | Identifying question answerers in a question asking system |
CN109492077A (en) * | 2018-09-29 | 2019-03-19 | 北明智通(北京)科技有限公司 | The petrochemical field answering method and system of knowledge based map |
CN110322959A (en) * | 2019-05-24 | 2019-10-11 | 山东大学 | A kind of Knowledge based engineering depth medical care problem method for routing and system |
CN110888946A (en) * | 2019-12-05 | 2020-03-17 | 电子科技大学广东电子信息工程研究院 | Entity linking method based on knowledge-driven query |
CN111291156A (en) * | 2020-01-21 | 2020-06-16 | 同方知网(北京)技术有限公司 | Question-answer intention identification method based on knowledge graph |
Non-Patent Citations (3)
Title |
---|
FEN ZHAO et al.: "Improving question answering over incomplete knowledge graphs with relation prediction", 《NEURAL COMPUTING AND APPLICATIONS》 *
ZHANG, Jie: "Research on Fine-Grained Retrieval Technology for Structured Literature" (文献结构化的细粒度检索技术研究), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》) *
CHENG, Xianyi et al.: "Research on Opinion Sentence Recognition Algorithms Based on Knowledge Graphs" (基于知识图的观点句识别算法研究), Computer Science (《计算机科学》) *
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112487827A (en) * | 2020-12-28 | 2021-03-12 | 科大讯飞华南人工智能研究院(广州)有限公司 | Question answering method, electronic equipment and storage device |
CN112579795A (en) * | 2020-12-28 | 2021-03-30 | 重庆邮电大学 | Intelligent question-answering method based on knowledge graph embedded representation |
CN112487827B (en) * | 2020-12-28 | 2024-07-02 | 科大讯飞华南人工智能研究院(广州)有限公司 | Question answering method, electronic equipment and storage device |
CN112765312A (en) * | 2020-12-31 | 2021-05-07 | 湖南大学 | Knowledge graph question-answering method and system based on graph neural network embedding matching |
CN113590782A (en) * | 2021-07-28 | 2021-11-02 | 北京百度网讯科技有限公司 | Training method, reasoning method and device of reasoning model |
CN113590782B (en) * | 2021-07-28 | 2024-02-09 | 北京百度网讯科技有限公司 | Training method of reasoning model, reasoning method and device |
CN113792132A (en) * | 2021-09-24 | 2021-12-14 | 泰康保险集团股份有限公司 | Target answer determination method, device, equipment and medium |
CN113792132B (en) * | 2021-09-24 | 2023-11-17 | 泰康保险集团股份有限公司 | Target answer determining method, device, equipment and medium |
CN114186068A (en) * | 2021-11-04 | 2022-03-15 | 国网天津市电力公司 | Audit system basis question-answering method based on multi-level attention network |
CN114860877A (en) * | 2022-04-29 | 2022-08-05 | 华侨大学 | Problem chain generation method and system based on knowledge graph relation prediction |
CN118520957A (en) * | 2024-07-23 | 2024-08-20 | 成都信通信息技术有限公司 | Intelligent question-answering method based on deep learning |
CN118520957B (en) * | 2024-07-23 | 2024-10-01 | 成都信通信息技术有限公司 | Intelligent question-answering method based on deep learning |
Also Published As
Publication number | Publication date |
---|---|
CN111782769B (en) | 2022-07-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111782769B (en) | Intelligent knowledge graph question-answering method based on relation prediction | |
CN108733792B (en) | Entity relation extraction method | |
CN112015868B (en) | Question-answering method based on knowledge graph completion | |
CN113535917A (en) | Intelligent question-answering method and system based on travel knowledge map | |
CN112115238A (en) | Question-answering method and system based on BERT and knowledge base | |
US20240233877A1 (en) | Method for predicting reactant molecule, training method, apparatus, and electronic device | |
CN112766507B (en) | Complex problem knowledge base question-answering method based on embedded and candidate sub-graph pruning | |
CN111339407B (en) | Implementation method of information extraction cloud platform | |
CN112417170B (en) | Relationship linking method for incomplete knowledge graph | |
CN113707339A (en) | Method and system for concept alignment and content inter-translation among multi-source heterogeneous databases | |
CN114004237A (en) | Intelligent question-answering system construction method based on bladder cancer knowledge graph | |
Sharath et al. | Question answering over knowledge base using language model embeddings | |
Huo et al. | Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research | |
CN117973519A (en) | Knowledge graph-based data processing method | |
Li et al. | Using context information to enhance simple question answering | |
Zulfiqar et al. | Logical layout analysis using deep learning | |
CN117609436A (en) | College scientific research management question-answering system combining knowledge graph and large language model | |
Yang et al. | Named entity recognition of power substation knowledge based on transformer-BiLSTM-CRF network | |
CN116680407A (en) | Knowledge graph construction method and device | |
Liao et al. | The sg-cim entity linking method based on bert and entity name embeddings | |
Deng et al. | Covidia: COVID-19 Interdisciplinary Academic Knowledge Graph | |
Wu et al. | A Text Emotion Analysis Method Using the Dual‐Channel Convolution Neural Network in Social Networks | |
Deroy et al. | Question Generation: Past, Present & Future | |
Japa et al. | Question answering over knowledge base using language model embeddings | |
CN113779211B (en) | Intelligent question-answering reasoning method and system based on natural language entity relationship |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||