CN113342952A - Knowledge graph question-answering method based on problem graph iterative retrieval - Google Patents

Info

Publication number
CN113342952A
Authority
CN
China
Prior art keywords
question
graph
map
knowledge
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110663953.6A
Other languages
Chinese (zh)
Inventor
张帆
陶思雨
赵前
李倩倩
戚瑶瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Electric Group Corp
Original Assignee
Shanghai Electric Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Electric Group Corp
Priority to CN202110663953.6A
Publication of CN113342952A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a knowledge graph question-answering method based on problem graph iterative retrieval, in which a problem graph is generated by iterative graph retrieval and used to answer questions over the knowledge graph. The method specifically comprises: identifying the entities contained in the input question, locating the corresponding nodes in the knowledge graph, and constructing an initial problem graph from them; using a retrieval module to select adjacent nodes that have a high degree of association with the nodes in the problem graph and with the question, and adding those adjacent nodes to the problem graph; and, after multiple iterations yield the final problem graph, finding the likely answer nodes in it by a node classification method. Compared with the prior art, the method narrows the retrieval range for question answering, which shortens the answer retrieval response time, avoids the heavy overhead of retrieving over the whole graph, and improves question-answering accuracy and the user experience.

Description

Knowledge graph question-answering method based on problem graph iterative retrieval
Technical Field
The invention relates to the technical field of knowledge graph question answering, in particular to a knowledge graph question answering method based on problem graph iterative retrieval.
Background
Knowledge graph question answering is one of the current research hotspots in natural language processing and is widely applied in both academia and industry; its research and application will have a significant influence on everyday life. How to realize intelligent question answering over a large-scale knowledge graph is a problem that urgently needs to be solved, and it currently hinders the practical application of knowledge graph question-answering technology. Using deep learning to achieve highly intelligent knowledge graph question answering is the current mainstream approach.
Existing knowledge graph question-answering methods are generally implemented by keyword retrieval over the knowledge graph. However, because the graph data is usually huge and contains a great deal of useless information, considerable noise arises during question answering. At the same time, because the amount of data stored in the knowledge graph is so large, traditional retrieval methods incur a heavy performance overhead, which greatly slows answer retrieval. As knowledge graphs continue to grow, the difficulty of answer retrieval and the large amount of noise degrade question-answering quality and the practical user experience. Reducing the noise in the graph to improve question-answering quality, and narrowing the retrieval range to shorten response time, are currently the two major difficulties in realizing large-scale knowledge graph question answering.
Disclosure of Invention
The invention aims to remedy the defects of the prior art by providing a knowledge graph question-answering method based on problem graph iterative retrieval. The method generates a problem graph by iterative graph retrieval and uses it to answer questions over the knowledge graph. This eliminates much of the noise in the graph and confines the answer search to the finally generated problem graph, which narrows the retrieval range, shortens the answer retrieval response time, avoids the heavy overhead of retrieving over the whole graph, and improves question-answering quality and the user experience.
The specific technical scheme for achieving this aim is as follows: a knowledge graph question-answering method based on problem graph iterative retrieval, characterized in that a problem graph is generated by iterative graph retrieval and used to answer questions over the knowledge graph, the method comprising the following steps:
Step 1: identify the entities that appear in the input question; an entity may be a single word or a continuous span of text.
Step 2: find the nodes corresponding to the entities by fuzzy query on the node names in the knowledge graph, and take the nodes found as the initial nodes of the problem graph.
Step 3: search for the adjacent nodes of the nodes in the problem graph, calculate the degree of association between each adjacent node and the question with a pre-trained model, and add the nodes whose degree of association reaches a threshold to the problem graph; iterate this step multiple times.
Step 4: perform answer prediction on the nodes in the finally formed problem graph.
Step 1 uses the open-source NER tool LAC to identify the entities that appear in the input question.
In step 3, the degree of association between an adjacent node and the question is calculated with the pre-trained language model BERT: the question is spliced with the text information of the adjacent node, such as the node name, and the result is input into the BERT language model.
Compared with the prior art, the method narrows the retrieval range for question answering, which shortens the answer retrieval response time, avoids the heavy overhead of retrieving over the whole graph, and improves question-answering accuracy and the user experience.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flowchart of the operation of example 1.
Detailed Description
Referring to FIG. 1, the invention generates a problem graph by iterative graph retrieval to answer questions over the knowledge graph, and specifically comprises the following steps:
Step 1: input a question and identify the entities that appear in it;
Step 2: find the node corresponding to each entity by fuzzy query on the node names in the knowledge graph, and take that node as an initial node of the problem graph;
Step 3: search for the adjacent nodes of the nodes in the problem graph, calculate the degree of association between each adjacent node and the question with a pre-trained model, and add the nodes whose degree of association reaches a threshold to the problem graph;
Step 4: iterate step 3 n times, where n ≥ 3, perform answer prediction on the nodes in the finally formed problem graph, and return the predicted node with the highest confidence.
Step 1 uses the open-source NER tool LAC to identify the entities that appear in the input question.
Step 2 finds the node corresponding to each entity in the graph by fuzzy query on the node names in the knowledge graph; the fuzzy query is implemented with regular expressions.
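As an illustration of the regular-expression fuzzy query in step 2, the following is a minimal sketch. The matching rule (entity tokens must appear in order, case-insensitively, with anything in between), the function name `fuzzy_find_nodes`, and the sample node names are assumptions made for this example; the patent does not specify the actual pattern.

```python
import re


def fuzzy_find_nodes(entity, node_names):
    """Return the node names that loosely match an entity string.

    Assumed rule: the entity's whitespace-separated tokens must occur in
    order inside the node name, ignoring case, with any text in between.
    """
    tokens = [re.escape(tok) for tok in entity.split()]
    if not tokens:
        return []  # an empty entity should not match every node
    pattern = re.compile(".*?".join(tokens), re.IGNORECASE)
    return [name for name in node_names if pattern.search(name)]


# Hypothetical node names, loosely based on the router example.
node_names = [
    "MSR series router",
    "SA interface",
    "sa-interface status",
    "Ethernet interface",
]
matches = fuzzy_find_nodes("SA interface", node_names)
print(matches)
```

The nodes returned this way would serve as the initial nodes of the problem graph. A production system would likely push the pattern down into the graph database query rather than scan node names in memory.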
Step 3 is implemented with the pre-trained language model BERT: the question text is spliced with the name text of an adjacent node and input into the BERT language model to calculate the degree of association between that adjacent node and the question.
Step 4 performs answer prediction on the nodes in the problem graph: the semantic features of the question text are spliced with the semantic features of the name text of a node in the current problem graph and input into a three-layer neural network, which calculates the confidence that the node is the answer. The semantic features of the question text and of the node names are obtained with the pre-trained model BERT.
Because the problem graph is generated by iterative graph retrieval, the noise that may arise during answer retrieval is minimized and retrieval is confined to a subgraph of the knowledge graph. This greatly reduces the amount of data to be retrieved and gives higher question-answering efficiency and accuracy than other methods.
The present invention will be described in further detail with reference to the following specific examples and the accompanying drawings.
Example 1
Referring to FIG. 2, the following describes knowledge graph question answering for the question of what the "unconnected" state of the SA interface in MSR series routers is:
1) Building an initial problem graph
1-1: input the question, identify its entities, and align them with knowledge graph nodes;
1-2: use the NER tool LAC to identify and extract the entities in the question, and keep the entities whose recognition confidence is higher than 0.5;
1-3: use fuzzy matching to search the knowledge graph for the graph nodes related to the identified question entities, and use them to construct the initial problem graph.
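Steps 1-1 to 1-3 can be sketched as follows. The input format (a list of `(entity_text, confidence)` pairs) is a hypothetical stand-in for the NER output, since the patent does not describe LAC's output format, and plain case-insensitive substring matching stands in for the fuzzy matching step. Only the 0.5 confidence cutoff comes from the text.

```python
def build_initial_problem_graph(recognized, node_names, min_conf=0.5):
    """Filter recognized entities by confidence and align them to graph nodes.

    recognized: list of (entity_text, confidence) pairs (assumed format).
    node_names: iterable of node names in the knowledge graph.
    Returns the kept entities and the set of seed nodes.
    """
    # Step 1-2: keep entities whose recognition confidence exceeds 0.5.
    kept = [text for text, conf in recognized if conf > min_conf]

    # Step 1-3: align entities to nodes. Substring containment is an
    # assumption used here in place of the patent's fuzzy matching.
    seeds = set()
    for text in kept:
        for name in node_names:
            if text.lower() in name.lower():
                seeds.add(name)
    return kept, seeds


# Hypothetical NER output and node names for the router question.
recognized = [("MSR", 0.92), ("SA interface", 0.81), ("state", 0.31)]
node_names = ["MSR series router", "SA interface", "Ethernet interface"]
kept, seeds = build_initial_problem_graph(recognized, node_names)
print(kept, sorted(seeds))
```

The low-confidence entity is dropped before alignment, so it cannot pull unrelated nodes into the initial problem graph.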
2) Iterative search
2-1: First iteration retrieval
After the initial problem graph is obtained, the degree of association between the question and each adjacent node of every node in the problem graph is calculated. The question text, the current node name, and the adjacent node name are spliced into a new text, which is input into the BERT pre-trained model to calculate the degree of association. Adjacent nodes whose association confidence is greater than 0.5 are added to the problem graph to form the new problem graph.
2-2: second iteration retrieval
Repeat the iterative retrieval of step 2-1, adding the adjacent nodes whose association confidence is greater than 0.5 to the problem graph to form the updated problem graph.
2-3: third iteration retrieval
Repeat the iterative retrieval of step 2-1, adding the adjacent nodes whose association confidence is greater than 0.5 to the problem graph to form the updated problem graph.
The above retrieval process is iterated multiple times, and the nodes in the finally formed problem graph are used for answer node prediction.
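The iterative retrieval in steps 2-1 to 2-3 can be sketched as a bounded graph expansion. In this sketch the adjacency dictionary, the node names, and `toy_score` are all assumptions: `toy_score` is a simple token-overlap measure standing in for the BERT-based association model, which is not reproduced here. The 0.5 threshold and the three iterations follow the example above.

```python
def expand_problem_graph(question, adjacency, seeds, score,
                         threshold=0.5, iterations=3):
    """Grow the problem graph by adding sufficiently relevant neighbors."""
    graph = set(seeds)
    for _ in range(iterations):
        added = set()
        for node in graph:
            for neighbor in adjacency.get(node, ()):
                if neighbor in graph:
                    continue
                # score() receives the question, the current node name,
                # and the adjacent node name, as in step 2-1.
                if score(question, node, neighbor) > threshold:
                    added.add(neighbor)
        if not added:
            # Nothing new was added, so later iterations with a
            # deterministic scorer would not change the graph either.
            break
        graph |= added
    return graph


def toy_score(question, node, neighbor):
    """Stand-in scorer: share of the neighbor's tokens found in the question."""
    q_tokens = set(question.lower().split())
    n_tokens = set(neighbor.lower().split())
    return len(q_tokens & n_tokens) / len(n_tokens) if n_tokens else 0.0


question = "what is the unconnected state of the SA interface"
adjacency = {
    "SA interface": ["interface state", "MSR router"],
    "interface state": ["unconnected state", "connected state"],
}
problem_graph = expand_problem_graph(
    question, adjacency, {"SA interface"}, toy_score
)
print(sorted(problem_graph))
```

Each pass only looks at neighbors of nodes already in the problem graph, so the scored set stays small compared with the whole knowledge graph, which is the source of the reduced retrieval range described above.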
3) Answer node prediction
Answer node prediction is performed on all nodes in the finally formed problem graph. For each node, the node name and the question text are first spliced and input into the pre-trained model BERT to obtain the degree of association between the node and the question. The association confidence, the semantic features of the node text, and the semantic features of the question text are then spliced into a feature array and input into a three-layer neural network, which computes the final answer prediction confidence.
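The feature splicing and the three-layer network can be sketched in plain Python as a forward pass. Everything beyond the splicing order and the layer count is an assumption for illustration: the 4-dimensional feature vectors (real BERT features would be much larger), the hidden sizes, the ReLU and sigmoid activations, and the untrained random weights. In the method itself the network would be trained on a question-answer training set.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_layer(n_in, n_out, rng):
    """Random, untrained weights; used only to make the sketch runnable."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def answer_confidence(assoc_conf, node_feat, question_feat, layers):
    """Splice the features and run them through three dense layers."""
    x = [assoc_conf] + list(node_feat) + list(question_feat)
    (w1, b1), (w2, b2), (w3, b3) = layers
    h = relu(dense(x, w1, b1))
    h = relu(dense(h, w2, b2))
    return sigmoid(dense(h, w3, b3)[0])


rng = random.Random(0)
node_feat = [0.2, -0.1, 0.4, 0.0]      # placeholder for node name features
question_feat = [0.3, 0.1, -0.2, 0.5]  # placeholder for question features
layers = [init_layer(9, 8, rng), init_layer(8, 4, rng), init_layer(4, 1, rng)]
conf = answer_confidence(0.87, node_feat, question_feat, layers)
print(0.0 < conf < 1.0)
```

Scoring every node in the final problem graph this way and taking the node with the highest confidence corresponds to the answer returned in step 4.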
With a knowledge graph in place, the invention only needs a question-answer training set to train the graph iterative retrieval module and the node classification module in order to realize knowledge-graph-based question answering.
The above embodiments are merely illustrative of the technical solutions and advantages of the present invention, and are not intended to limit the present invention, and any modifications, additions and equivalents made within the scope of the principles of the present invention are intended to be included in the scope of the claims of the present invention.

Claims (5)

1. A knowledge graph question-answering method based on problem graph iterative retrieval, characterized in that a problem graph is generated by iterative graph retrieval to realize question answering over a knowledge graph, the method specifically comprising the following steps:
Step 1: input a question and identify the entities that appear in it;
Step 2: find the node corresponding to each entity by fuzzy query on the node names in the knowledge graph, and take that node as an initial node of the problem graph;
Step 3: search for the adjacent nodes of the nodes in the problem graph, calculate the degree of association between each adjacent node and the question with a pre-trained model, and add the nodes whose degree of association reaches a threshold to the problem graph;
Step 4: iterate step 3 n times, where n ≥ 3, perform answer prediction on the nodes in the finally formed problem graph, and return the predicted node with the highest confidence.
2. The knowledge graph question-answering method based on problem graph iterative retrieval as claimed in claim 1, wherein step 1 identifies the entities that appear in the input question using the open-source NER tool LAC.
3. The knowledge graph question-answering method based on problem graph iterative retrieval as claimed in claim 1, wherein step 2 finds the node corresponding to each entity in the graph by fuzzy query on the node names in the knowledge graph, the fuzzy query being implemented with regular expressions.
4. The knowledge graph question-answering method based on problem graph iterative retrieval as claimed in claim 1, wherein step 3 is implemented with the pre-trained language model BERT, and the degree of association between an adjacent node and the question is calculated by splicing the question text with the name text of the adjacent node and inputting the spliced text into the BERT language model.
5. The knowledge graph question-answering method based on problem graph iterative retrieval as claimed in claim 1, wherein step 4 performs answer prediction on the nodes in the problem graph by splicing the semantic features of the question text with the semantic features of the name text of a node in the current problem graph and inputting the result into a three-layer neural network, which calculates the confidence that the node is the answer, the semantic features of the question text and of the node names being obtained with the pre-trained model BERT.

Application CN202110663953.6A (priority date 2021-06-16, filing date 2021-06-16): Knowledge graph question-answering method based on problem graph iterative retrieval. Status: Pending. Publication: CN113342952A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110663953.6A CN113342952A (en) 2021-06-16 2021-06-16 Knowledge graph question-answering method based on problem graph iterative retrieval

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110663953.6A CN113342952A (en) 2021-06-16 2021-06-16 Knowledge graph question-answering method based on problem graph iterative retrieval

Publications (1)

Publication Number Publication Date
CN113342952A true CN113342952A (en) 2021-09-03

Family

ID=77477262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110663953.6A Pending CN113342952A (en) 2021-06-16 2021-06-16 Knowledge graph question-answering method based on problem graph iterative retrieval

Country Status (1)

Country Link
CN (1) CN113342952A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103729395A (en) * 2012-10-12 2014-04-16 国际商业机器公司 Method and system for inferring inquiry answer
CN111666399A (en) * 2020-06-23 2020-09-15 中国平安人寿保险股份有限公司 Intelligent question and answer method and device based on knowledge graph and computer equipment
CN111831880A (en) * 2020-02-21 2020-10-27 桂林电子科技大学 Intelligent question and answer method based on micro hotel platform
CN112487168A (en) * 2020-12-11 2021-03-12 润联软件系统(深圳)有限公司 Semantic questioning and answering method and device for knowledge graph, computer equipment and storage medium
CN112632250A (en) * 2020-12-23 2021-04-09 南京航空航天大学 Question and answer method and system under multi-document scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination