CN117076688A - Knowledge question-answering method and device based on domain knowledge graph and electronic equipment - Google Patents
- Publication number
- CN117076688A (CN202311049695.8A)
- Authority
- CN
- China
- Prior art keywords
- target
- entity
- knowledge
- information
- question
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a knowledge question-answering method and device based on a domain knowledge graph, and electronic equipment, and relates to the field of artificial intelligence, the field of financial technology, or other related fields. The knowledge question-answering method comprises: receiving a target question and processing it to obtain question prompt information; extracting a target entity set or a target relation set based on the target question; retrieving, based on the target entity set, triplet information matching the target entities from a preset domain knowledge graph to obtain a triplet information set; constructing input knowledge information based on the triplet information set and the question prompt information; and inputting the input knowledge information into a preset reasoning model to output a target answer to the target question. The invention solves the technical problem in the related art that knowledge question-answering reasoning over questions has low accuracy.
Description
Technical Field
The invention relates to the field of artificial intelligence, in particular to a knowledge question-answering method based on a domain knowledge graph, a knowledge question-answering device based on the domain knowledge graph and electronic equipment.
Background
Currently, in financial service scenarios (e.g., financial market trading), pre-trade research, data analysis, and trading decisions are carried out manually. Practitioners often cannot quickly obtain effective information from massive volumes of data or integrate and analyze it in time, which makes decisions difficult and inaccurate and in turn affects business efficiency and growth.
With the release of generative models (for example, ChatGPT, short for Chat Generative Pre-trained Transformer, a chatbot program), the ability to understand and generate natural-language text has improved significantly, and the service level of intelligent knowledge question-answering reasoning has improved greatly. However, current generative models still suffer from unreliable logical reasoning and low factual accuracy of generated results, and cannot provide professional and accurate answers to domain-specific questions (for example, in the financial domain).
In the related art, the following schemes are often adopted for intelligent knowledge question answering: (1) intelligent question-answering reasoning based on a knowledge graph; (2) question-answer pair matching; and (3) intelligent question answering based on a generative model, wherein:
(1) Intelligent question-answering reasoning based on a knowledge graph: a knowledge graph for a general or professional domain can be constructed using knowledge graph technology, enabling intelligent question-answering applications with knowledge reasoning.
FIG. 1 is a schematic diagram of optional knowledge-graph-based intelligent question-answering reasoning according to the related art. As shown in FIG. 1, the system includes a question analysis module, a question answering module, and an answer generation module. The question analysis module comprises question classification and NLP (Natural Language Processing) technology: it first classifies the input question and then applies NLP processing such as keyword extraction and semantic analysis. The question answering module comprises pattern matching and knowledge question answering: it performs semantic understanding and analysis on the data passed from the question analysis module and obtains answers by querying and reasoning over a knowledge base. The answer generation module scores the candidate answers according to the data passed from the question analysis module and selects the best answer.
(2) Question-answer pair matching: answers are matched by calculating semantic similarity against a question-answer library.
(3) Intelligent question answering based on a generative model: intent recognition and semantic analysis are performed with a pre-trained model according to information such as the context scenario and the user's question, and an answer is generated.
FIG. 2 is a schematic diagram of optional intelligent question answering based on a generative model according to the related art. As shown in FIG. 2, the model includes modules for intent analysis, semantic analysis, answer generation, and the like. A question is input into the model, intent analysis and semantic analysis are performed on it, and the answer is obtained through the answer generation module.
However, the intelligent knowledge question-answering schemes in the related art have the following problems. (1) The model-based intelligent question-answering reasoning scheme suffers from unreliable reasoning results, low controllability of results, and the like. On the one hand, the model is pre-trained mainly on self-collected and self-labeled data, and if the training samples are imbalanced, the model may exhibit bias and fairness problems. On the other hand, for domain scenarios, the pre-training data contains insufficient professional samples (or few domain samples), and the information the model collects from the network may also be non-factual, so the content generated by model reasoning has low reliability and the model cannot truly provide professional and reliable answers to domain questions. If professional domain samples are newly labeled and injected into the model for pre-training, the labor and computing costs are high, and there is no guarantee that the model achieves the expected effect in every domain sub-scenario. (2) The knowledge-graph-based intelligent question-answering reasoning scheme suffers from a framework that is difficult to adjust, difficulty in modifying and adapting to new data or scenarios, weak reasoning capability, high graph construction cost, and the like.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the invention provides a knowledge question-answering method based on a domain knowledge graph, a knowledge question-answering device based on the domain knowledge graph, and electronic equipment, and aims to at least solve the technical problem in the related art that knowledge question-answering reasoning over questions has low accuracy.
According to an aspect of the embodiment of the invention, a knowledge question-answering method based on a domain knowledge graph is provided, comprising: receiving a target question, and processing the target question to obtain question prompt information; extracting a target entity set or a target relation set based on the target question, wherein the target entity set comprises a plurality of target entities and the target relation set comprises a plurality of target relations; in the case where the target entity set is extracted, retrieving triplet information matching the target entities from a preset domain knowledge graph based on the target entity set, or, in the case where the target relation set is extracted, retrieving triplet information matching the target relations from the preset domain knowledge graph based on the target relation set, so as to obtain a triplet information set; and constructing input knowledge information based on the triplet information set and the question prompt information, inputting the input knowledge information into a preset reasoning model, and outputting a target answer to the target question.
Optionally, the step of processing the target question to obtain question prompt information includes: constructing a question prompt template, wherein the question prompt template comprises a question instruction; and adding the question instruction to the target question based on the question prompt template to generate the question prompt information.
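The template step above might be sketched as follows. This is a minimal illustration only: the template text, the default instruction wording, and the name `build_question_prompt` are assumptions, since the source states only that the template contains a question instruction.

```python
# Hypothetical template text; the source only says the template
# contains a question instruction.
QUESTION_PROMPT_TEMPLATE = "{instruction}\nQuestion: {question}"
DEFAULT_INSTRUCTION = "Answer the question using only the knowledge provided."


def build_question_prompt(question, instruction=DEFAULT_INSTRUCTION):
    """Add the question instruction to the target question via the template."""
    return QUESTION_PROMPT_TEMPLATE.format(
        instruction=instruction, question=question.strip()
    )


if __name__ == "__main__":
    print(build_question_prompt("Which institution issued Bond X?"))
```

Keeping the instruction as a parameter lets the same template serve different question types without changing the downstream steps.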
Optionally, the step of extracting a target entity set or a target relation set based on the target question includes: performing word segmentation on the target question to obtain a plurality of segmented words; analyzing each segmented word to determine its word type; in the case where the word type is a first preset type, determining the segmented word indicated by the word type as a target entity, or, in the case where the word type is a second preset type, determining the segmented word indicated by the word type as a target relation, so as to obtain the target relation set; in the case where none of the word types is the first preset type or the second preset type, determining context information of the target question and supplementing the target entity corresponding to the target question based on the context information; and generating the target entity set based on all the target entities.
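A minimal sketch of this extraction step, under loud assumptions: word segmentation is approximated by a regular expression, and the "first preset type" and "second preset type" are approximated by two small hypothetical lexicons. The source does not specify the segmenter or how word types are determined.

```python
import re

# Hypothetical lexicons standing in for the word-type analysis.
ENTITY_WORDS = {"bond_x", "bank_a"}    # plays the role of the first preset type
RELATION_WORDS = {"issuer", "rating"}  # plays the role of the second preset type


def segment(question):
    """Toy word segmentation: lowercase alphanumeric tokens."""
    return re.findall(r"\w+", question.lower())


def extract(question, context_entities=()):
    """Return (target_entities, target_relations) for a question.

    If no token is an entity or a relation, fall back to entities
    supplied from the conversation context.
    """
    tokens = segment(question)
    entities = [t for t in tokens if t in ENTITY_WORDS]
    relations = [t for t in tokens if t in RELATION_WORDS]
    if not entities and not relations:
        entities = list(context_entities)
    return entities, relations


if __name__ == "__main__":
    print(extract("What is the issuer of bond_x?"))
    print(extract("And what about last year?", context_entities=["bank_a"]))
```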
Optionally, the triplet information includes a subject entity, an object entity, and an entity relationship, and, in the case where the target entity set is extracted, the step of retrieving, based on the target entity set, triplet information matching the target entities from the preset domain knowledge graph to obtain the triplet information set includes: determining a retrieval hop count threshold and an initial retrieval hop count; retrieving a knowledge-graph entity matching the target entity from the preset domain knowledge graph, wherein the knowledge-graph entity is a subject entity or an object entity; in the case where a first knowledge-graph entity matching the target entity is retrieved, updating the initial retrieval hop count to obtain a current retrieval hop count; in the case where the current retrieval hop count is less than the retrieval hop count threshold, determining, based on the entity relationship, a second knowledge-graph entity associated with the first knowledge-graph entity; updating the current retrieval hop count and continuing to determine, based on the entity relationship, a third knowledge-graph entity associated with the second knowledge-graph entity, until the current retrieval hop count is greater than or equal to the retrieval hop count threshold, so as to obtain a knowledge-graph entity set; and determining, based on the knowledge-graph entity set, the triplet information to which each knowledge-graph entity belongs, so as to obtain the triplet information set.
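The hop-limited retrieval above can be read as a bounded breadth-first expansion over the graph. The sketch below is one possible reading, not the patented implementation: the in-memory triple list, the function names, and the choice to count the first match as hop 1 are assumptions.

```python
# Toy domain knowledge graph as (subject, relation, object) triples.
TRIPLES = [
    ("Bank A", "issues", "Bond X"),
    ("Bond X", "rated_by", "Agency R"),
    ("Agency R", "based_in", "City C"),
]


def collect_entities(target_entities, triples, hop_threshold=2):
    """Expand from matched entities until the hop count reaches the threshold."""
    kg_entities = {node for s, _, o in triples for node in (s, o)}
    frontier = set(target_entities) & kg_entities  # first matched entities
    collected = set(frontier)
    hops = 1 if frontier else 0  # initial count 0, updated after the first match
    while frontier and hops < hop_threshold:
        neighbours = set()
        for s, _, o in triples:
            if s in frontier:
                neighbours.add(o)
            if o in frontier:
                neighbours.add(s)
        frontier = neighbours - collected
        collected |= frontier
        hops += 1
    return collected


def triples_for(entities, triples):
    """Triples to which any collected entity belongs, in graph order."""
    return [t for t in triples if t[0] in entities or t[2] in entities]


if __name__ == "__main__":
    found = collect_entities(["Bank A"], TRIPLES, hop_threshold=2)
    print(sorted(found))
    print(triples_for(found, TRIPLES))
```

A small threshold keeps the retrieved context short; raising it pulls in more distant facts at the cost of a longer model input.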
Optionally, in the case where the target relation set is extracted, the step of retrieving, based on the target relation set, triplet information matching the target relations from the preset domain knowledge graph to obtain the triplet information set includes: retrieving entity relationships matching the target relation from the preset domain knowledge graph until retrieval succeeds or the number of retrievals reaches a preset retrieval threshold, so as to obtain a retrieval result; in the case where retrieval succeeds, determining, based on the retrieval result, the target entity relationships matching the target relation, so as to obtain a target entity relationship set; and determining, based on the target entity relationship set, the triplet information to which each target entity relationship belongs, so as to obtain the triplet information set.
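A rough sketch of relation-based retrieval with a bounded number of attempts. The source does not say what changes between attempts; here each attempt is assumed to use a progressively looser string matcher (exact, case-insensitive, substring). The matcher list and all names are illustrative only.

```python
# Hypothetical matching strategies, tried in order, one per attempt.
MATCHERS = [
    lambda target, rel: target == rel,
    lambda target, rel: target.lower() == rel.lower(),
    lambda target, rel: target.lower() in rel.lower(),
]


def retrieve_by_relation(target_relation, triples, max_attempts=3):
    """Return (matching_triples, attempts_used).

    Stops at the first attempt that finds a match, or once
    max_attempts (the preset retrieval threshold) is reached.
    """
    attempts = 0
    for matches in MATCHERS[:max_attempts]:
        attempts += 1
        hits = [t for t in triples if matches(target_relation, t[1])]
        if hits:
            return hits, attempts
    return [], attempts


if __name__ == "__main__":
    graph = [
        ("Bank A", "issues", "Bond X"),
        ("Bond X", "rated_by", "Agency R"),
    ]
    print(retrieve_by_relation("issues", graph))
    print(retrieve_by_relation("RATED", graph))
```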
Optionally, after retrieving, based on the target entity set, the triplet information matching the target entities from the preset domain knowledge graph to obtain the triplet information set, the method further includes: concatenating the subject entity, the object entity, and the entity relationship in each piece of triplet information to obtain an answer text, wherein each piece of triplet information has a corresponding association value indicating its association with the target entity; and sorting all the answer texts based on the association values to obtain an answer text set.
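As an illustration of the concatenate-and-sort step, the sketch below assumes that a higher association value means a more relevant triple and that the answer text is the space-joined subject, relation, and object. The source fixes neither the sort direction nor the joining format, so both are assumptions.

```python
def to_answer_text(triple):
    """Join subject, relation, and object into one answer text."""
    subject, relation, obj = triple
    return f"{subject} {relation} {obj}"


def rank_answer_texts(scored_triples):
    """Sort (triple, association_value) pairs, highest value first."""
    ranked = sorted(scored_triples, key=lambda item: item[1], reverse=True)
    return [to_answer_text(triple) for triple, _ in ranked]


if __name__ == "__main__":
    scored = [
        (("Bond X", "rated_by", "Agency R"), 0.4),
        (("Bank A", "issues", "Bond X"), 0.9),
    ]
    print(rank_answer_texts(scored))
```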
Optionally, the step of constructing the input knowledge information based on the triplet information set and the question prompt information includes: constructing an answer prompt template, wherein the answer prompt template comprises an answer instruction; adding the answer instruction to each answer text in the answer text set based on the answer prompt template to generate an answer prompt information set; and splicing the question prompt information and the answer prompt information set to obtain the input knowledge information.
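The splicing step could look like the following sketch. The answer template wording, the newline separator, and the choice to place the question prompt before the facts are all assumptions made for illustration.

```python
# Hypothetical answer prompt template; the source only says it
# contains an answer instruction.
ANSWER_PROMPT_TEMPLATE = "Known fact: {text}"


def build_input_knowledge(question_prompt, answer_texts):
    """Wrap each answer text in the answer template and splice with the question prompt."""
    answer_prompts = [ANSWER_PROMPT_TEMPLATE.format(text=t) for t in answer_texts]
    return "\n".join([question_prompt, *answer_prompts])


if __name__ == "__main__":
    print(build_input_knowledge(
        "Question: Which institution issued Bond X?",
        ["Bank A issues Bond X", "Bond X rated_by Agency R"],
    ))
```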
Optionally, the step of inputting the input knowledge information into a preset reasoning model and outputting a target answer to the target question includes: analyzing the question prompt information using the preset reasoning model to obtain an answer set, wherein the preset reasoning model is a reasoning model trained in advance with a training data set, and the training data set comprises a set of historical questions and a historical answer corresponding to each historical question in the set; taking the input knowledge information as a preset condition, and determining a conditional probability value of each answer in the answer set based on the preset condition; and determining the answer indicated by the maximum conditional probability value as the target answer.
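The selection rule above amounts to choosing the candidate answer with the highest probability conditioned on the input knowledge. The sketch below illustrates only that selection rule: the word-overlap scorer is a toy stand-in for the trained reasoning model, and the softmax normalization is an assumption about how scores become probabilities.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def overlap_score(answer, knowledge):
    """Toy stand-in for a model score: number of shared lowercase words."""
    return len(set(answer.lower().split()) & set(knowledge.lower().split()))


def select_answer(candidates, knowledge, score=overlap_score):
    """Return the candidate with the highest conditional probability, plus all probabilities."""
    probs = softmax([score(a, knowledge) for a in candidates])
    best = max(range(len(candidates)), key=probs.__getitem__)
    return candidates[best], probs


if __name__ == "__main__":
    answer, probs = select_answer(["Bank A", "Bank B"], "Bank A issues Bond X")
    print(answer, probs)
```

In a real system the `score` argument would be replaced by the model's log-likelihood of each candidate given the input knowledge.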
According to another aspect of the embodiment of the present invention, there is also provided a knowledge question-answering device based on a domain knowledge graph, including: a receiving unit, configured to receive a target question and process the target question to obtain question prompt information; an extraction unit, configured to extract a target entity set or a target relation set based on the target question, wherein the target entity set comprises a plurality of target entities and the target relation set comprises a plurality of target relations; a retrieval unit, configured to, in the case where the target entity set is extracted, retrieve triplet information matching the target entities from a preset domain knowledge graph based on the target entity set, or, in the case where the target relation set is extracted, retrieve triplet information matching the target relations from the preset domain knowledge graph based on the target relation set, so as to obtain a triplet information set; and a construction unit, configured to construct input knowledge information based on the triplet information set and the question prompt information, input the input knowledge information into a preset reasoning model, and output a target answer to the target question.
Optionally, the receiving unit includes: a first construction module, configured to construct a question prompt template, wherein the question prompt template comprises a question instruction; and a first generation module, configured to add the question instruction to the target question based on the question prompt template to generate the question prompt information.
Optionally, the extraction unit includes: a first processing module, configured to perform word segmentation on the target question to obtain a plurality of segmented words; a first analysis module, configured to analyze each segmented word and determine its word type; a first determining module, configured to determine the segmented word indicated by the word type as a target entity in the case where the word type is a first preset type, or determine the segmented word indicated by the word type as a target relation in the case where the word type is a second preset type, so as to obtain the target relation set; a second determining module, configured to, in the case where none of the word types is the first preset type or the second preset type, determine context information of the target question and supplement the target entity corresponding to the target question based on the context information; and a second generation module, configured to generate the target entity set based on all the target entities.
Optionally, the triplet information includes a subject entity, an object entity, and an entity relationship, and the retrieval unit includes: a third determining module, configured to determine a retrieval hop count threshold and an initial retrieval hop count; a first retrieval module, configured to retrieve a knowledge-graph entity matching the target entity from the preset domain knowledge graph, wherein the knowledge-graph entity is a subject entity or an object entity; a first updating module, configured to update the initial retrieval hop count to obtain a current retrieval hop count in the case where a first knowledge-graph entity matching the target entity is retrieved; a fourth determining module, configured to determine, based on the entity relationship, a second knowledge-graph entity associated with the first knowledge-graph entity in the case where the current retrieval hop count is less than the retrieval hop count threshold; a second updating module, configured to update the current retrieval hop count and continue to determine, based on the entity relationship, a third knowledge-graph entity associated with the second knowledge-graph entity, until the current retrieval hop count is greater than or equal to the retrieval hop count threshold, so as to obtain a knowledge-graph entity set; and a fifth determining module, configured to determine, based on the knowledge-graph entity set, the triplet information to which each knowledge-graph entity belongs, so as to obtain the triplet information set.
Optionally, the retrieval unit further includes: a second retrieval module, configured to retrieve entity relationships matching the target relation from the preset domain knowledge graph until retrieval succeeds or the number of retrievals reaches a preset retrieval threshold, so as to obtain a retrieval result; a sixth determining module, configured to determine, in the case where retrieval succeeds, the target entity relationships matching the target relation based on the retrieval result, so as to obtain a target entity relationship set; and a seventh determining module, configured to determine, based on the target entity relationship set, the triplet information to which each target entity relationship belongs, so as to obtain the triplet information set.
Optionally, the knowledge question-answering device further includes: a first connection module, configured to, after the triplet information matching the target entities is retrieved from the preset domain knowledge graph based on the target entity set to obtain the triplet information set, concatenate the subject entity, the object entity, and the entity relationship in each piece of triplet information to obtain an answer text, wherein each piece of triplet information has a corresponding association value indicating its association with the target entity; and a first sorting module, configured to sort all the answer texts based on the association values to obtain an answer text set.
Optionally, the construction unit includes: a second construction module, configured to construct an answer prompt template, wherein the answer prompt template comprises an answer instruction; a third generation module, configured to add the answer instruction to each answer text in the answer text set based on the answer prompt template to generate an answer prompt information set; and a first splicing module, configured to splice the question prompt information and the answer prompt information set to obtain the input knowledge information.
Optionally, the construction unit further includes: a second analysis module, configured to analyze the question prompt information using the preset reasoning model to obtain an answer set, wherein the preset reasoning model is a reasoning model trained in advance with a training data set, and the training data set comprises a set of historical questions and a historical answer corresponding to each historical question in the set; an eighth determining module, configured to take the input knowledge information as a preset condition and determine a conditional probability value of each answer in the answer set based on the preset condition; and a ninth determining module, configured to determine the answer indicated by the maximum conditional probability value as the target answer.
According to another aspect of the embodiment of the present invention, there is further provided a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, and when the computer program runs, the device on which the computer-readable storage medium resides is controlled to execute any one of the above knowledge question-answering methods based on the domain knowledge graph.
According to another aspect of the embodiment of the present invention, there is also provided an electronic device, including one or more processors and a memory, where the memory is configured to store one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement any one of the knowledge question and answer methods based on a domain knowledge graph.
In the present disclosure, a target question is received and processed to obtain question prompt information; a target entity set or a target relation set is extracted based on the target question; in the case where the target entity set is extracted, triplet information matching the target entities is retrieved from a preset domain knowledge graph based on the target entity set, or, in the case where the target relation set is extracted, triplet information matching the target relations is retrieved from the preset domain knowledge graph based on the target relation set, so as to obtain a triplet information set; input knowledge information is constructed based on the triplet information set and the question prompt information; and the input knowledge information is input into a preset reasoning model to output a target answer to the target question. By combining the preset domain knowledge graph, the question can be constructed into input knowledge information that is then processed by the preset reasoning model, which reduces bias in the model's reasoning process and improves the accuracy of knowledge question-answering reasoning over questions, thereby solving the technical problem in the related art that such reasoning has low accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic diagram of optional knowledge-graph-based intelligent question-answering reasoning according to the related art;
FIG. 2 is a schematic diagram of optional intelligent question answering based on a generative model according to the related art;
FIG. 3 is a flowchart of an optional knowledge question-answering method based on a domain knowledge graph according to an embodiment of the application;
FIG. 4 is a schematic diagram of an optional knowledge question-answering reasoning process based on a domain knowledge graph according to an embodiment of the application;
FIG. 5 is a schematic diagram of an optional knowledge question-answering device based on a domain knowledge graph according to an embodiment of the application;
FIG. 6 is a block diagram of the hardware structure of an electronic device (or mobile device) for the knowledge question-answering method based on a domain knowledge graph according to an embodiment of the application.
Detailed Description
In order that those skilled in the art will better understand the present application, a technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
To facilitate an understanding of the invention by those skilled in the art, some terms or nouns involved in the various embodiments of the invention are explained below:
The knowledge graph is a method that uses a graph model to describe knowledge and to model the association relationships between entities; it consists of nodes and edges.
The language model is used for generating a probability distribution over word sequences, that is, it assigns a probability to a text that represents how likely the text is.
It should be noted that, the knowledge question and answer method and the device based on the domain knowledge graph in the disclosure may be used in the case that the knowledge question and answer is performed based on the domain knowledge graph in the artificial intelligence domain, and may also be used in any domain except the artificial intelligence domain, where the knowledge question and answer is performed based on the domain knowledge graph, and the application field of the knowledge question and answer method and the device based on the domain knowledge graph in the disclosure is not limited.
It should be noted that, related information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present disclosure are information and data authorized by a user or sufficiently authorized by each party, and the collection, use and processing of related data need to comply with related laws and regulations and standards of related countries and regions, and be provided with corresponding operation entries for the user to select authorization or rejection. For example, an interface is provided between the system and the relevant user or institution, before acquiring the relevant information, the system needs to send an acquisition request to the user or institution through the interface, and acquire the relevant information after receiving the consent information fed back by the user or institution.
The following embodiments of the present invention are applicable to various systems, applications and devices that perform knowledge question answering based on domain knowledge graphs. The invention provides a knowledge question-answering reasoning method based on a domain knowledge graph, which can address the low reliability, prejudice and fairness problems of intelligent reasoning question-answering results in the related technology.
The invention utilizes the field knowledge graph data asset, can enhance the controllability and reliability of the model reasoning result, reduce the non-factual error of the model, promote the intelligent question-answer reasoning capability level of the model, promote the applicability of the model in the scenes of intelligent customer service, virtual assistant and the like, and can flexibly adapt to the knowledge question-answer reasoning in various scenes without retraining the model according to the scene types.
The present invention will be described in detail with reference to the following examples.
Example 1
According to an embodiment of the present invention, there is provided an embodiment of a knowledge question-answering method based on a domain knowledge graph. It is to be noted that the steps shown in the flowchart of the drawings may be performed in a computer system, such as one executing a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from that shown here.
Fig. 3 is a flowchart of an alternative knowledge question-answering method based on domain knowledge graph according to an embodiment of the present invention, as shown in fig. 3, the method includes the steps of:
Step S301, receiving a target problem, and processing the target problem to obtain problem prompt information.
Step S302, extracting a target entity set or a target relationship set based on the target problem, where the target entity set includes: a plurality of target entities, a set of target relationships comprising: a plurality of target relationships.
Step S303, retrieving the triplet information matched with the target entity from the preset domain knowledge graph based on the target entity set in the case of extracting the target entity set, or retrieving the triplet information matched with the target relationship from the preset domain knowledge graph based on the target relationship set in the case of extracting the target relationship set, so as to obtain the triplet information set.
Step S304, based on the triplet information set and the question prompt information, constructing input knowledge information, inputting the input knowledge information into a preset reasoning model, and outputting a target answer of the target question.
Through the above steps, a target problem can be received and processed to obtain problem prompt information; a target entity set or a target relation set is extracted based on the target problem; in the case where the target entity set is extracted, triplet information matched with the target entity is retrieved from a preset domain knowledge graph based on the target entity set, or, in the case where the target relation set is extracted, triplet information matched with the target relation is retrieved from the preset domain knowledge graph based on the target relation set, so as to obtain a triplet information set; input knowledge information is constructed based on the triplet information set and the problem prompt information; and the input knowledge information is input into a preset reasoning model, which outputs a target answer of the target problem. In the embodiment of the invention, because the problem is combined with the preset domain knowledge graph to construct the input knowledge information, and the input knowledge information is then processed by the preset reasoning model, prejudice in the model reasoning process can be reduced and the accuracy of knowledge question-answering reasoning on the problem is improved, which solves the technical problem in the related technology that the accuracy of knowledge question-answering reasoning on a problem is low.
Embodiments of the present invention will be described in detail with reference to the following steps.
Step S301, receiving a target problem, and processing the target problem to obtain problem prompt information.
Optionally, the step of processing the target problem to obtain the problem prompt information includes: constructing a problem prompting template, wherein the problem prompting template comprises: a question instruction; based on the problem prompt template, adding a problem instruction into the target problem to generate problem prompt information.
In the embodiment of the invention, the target problem for which knowledge question answering is required may be received first and then processed to obtain the problem prompt information, specifically as follows: a problem prompt template may be constructed first, the problem prompt template comprising a question instruction (e.g., "Please answer the following question"); the question instruction is then added to the target problem according to the problem prompt template to generate the problem prompt information. For example, if the target problem x is "Who is the author of a book?", the problem prompt information x' generated according to the problem prompt template is "Please answer the following question: Who is the author of a book?".
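The template step above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `build_question_prompt` and the constant `QUESTION_INSTRUCTION` are hypothetical, and the instruction wording is taken from the example in the text rather than from any fixed template defined by the disclosure.

```python
# Hypothetical names; the instruction wording mirrors the example in the text.
QUESTION_INSTRUCTION = "Please answer the following question"


def build_question_prompt(target_question: str,
                          instruction: str = QUESTION_INSTRUCTION) -> str:
    """Add the question instruction to the target question (step S301)."""
    return f"{instruction}: {target_question.strip()}"


# x is the target problem, x_prime the resulting problem prompt information.
x = "Who is the author of a book?"
x_prime = build_question_prompt(x)
print(x_prime)
```

Keeping the instruction as a parameter lets different question-answering scenes swap in their own wording without changing the function.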
Step S302, extracting a target entity set or a target relationship set based on the target problem, where the target entity set includes: a plurality of target entities, a set of target relationships comprising: a plurality of target relationships.
Optionally, the step of extracting the target entity set or the target relation set based on the target problem includes: performing word segmentation processing on the target problem to obtain a plurality of segmented words; analyzing the word segmentation to determine the word type of the word segmentation; determining the word segmentation indicated by the word type as a target entity under the condition that the word type is a first preset type, or determining the word segmentation indicated by the word type as a target relationship under the condition that the word type is a second preset type, so as to obtain a target relationship set; determining context information of a target problem under the condition that all word types are not the first preset type and the second preset type, and supplementing a target entity corresponding to the target problem based on the context information; a set of target entities is generated based on all target entities.
In the embodiment of the invention, the content of the problem or sentence (that is, the target problem) may be extracted through entity linking or a natural language model. The extracted content may be an entity or a relation, for example a noun in the subject position, a noun in the object position, or a relation (such as a spouse relation or a master-slave relation). If no entity or relation is present, the entity in the problem may be supplemented by a knowledge completion method (that is, a target entity set or a target relation set is extracted based on the target problem, where the target entity set comprises a plurality of target entities and the target relation set comprises a plurality of target relations). Specifically: word segmentation may be performed on the target problem to obtain a plurality of segmented words, and each segmented word is then analyzed to determine its word type (e.g., noun, verb, adjective). If the word type is a first preset type (e.g., a noun representing an object name), the segmented word indicated by the word type may be determined as a target entity; if the word type is a second preset type (e.g., a noun representing a relation), the segmented word indicated by the word type may be determined as a target relation, so as to obtain the target relation set. If none of the word types is the first preset type or the second preset type, the context information of the target problem may be determined first, the target entity corresponding to the target problem is then supplemented according to the context information, and a target entity set is generated from all the target entities.
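A minimal sketch of this extraction logic follows. It assumes a tiny hand-written lexicon (`WORD_TYPES`) in place of a real word-segmentation and part-of-speech tool, and whitespace splitting in place of a real segmenter; all names here are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical lexicon standing in for a real word-type analyser.
WORD_TYPES = {
    "book": "entity_noun",
    "novel": "entity_noun",
    "author": "relation_noun",
    "translator": "relation_noun",
}
FIRST_PRESET_TYPE = "entity_noun"     # words of this type become target entities
SECOND_PRESET_TYPE = "relation_noun"  # words of this type become target relations


def segment(text: str) -> List[str]:
    """Naive word segmentation: split on whitespace and strip punctuation."""
    words = (w.strip("?.,!").lower() for w in text.split())
    return [w for w in words if w]


def extract_targets(question: str,
                    context: Optional[str] = None) -> Tuple[List[str], List[str]]:
    """Return (target entity set, target relation set) for a question."""
    words = segment(question)
    entities = [w for w in words if WORD_TYPES.get(w) == FIRST_PRESET_TYPE]
    relations = [w for w in words if WORD_TYPES.get(w) == SECOND_PRESET_TYPE]
    # No word of either preset type: supplement entities from the context.
    if not entities and not relations and context is not None:
        entities = [w for w in segment(context)
                    if WORD_TYPES.get(w) == FIRST_PRESET_TYPE]
    return entities, relations


print(extract_targets("Who is the author of a book?"))
print(extract_targets("Who is it?", context="We were discussing the novel."))
```

In practice the lexicon lookup would be replaced by the entity linking or natural language model mentioned above; the branching on the two preset types and the context fallback are the parts this sketch is meant to show.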
Step S303, retrieving the triplet information matched with the target entity from the preset domain knowledge graph based on the target entity set in the case of extracting the target entity set, or retrieving the triplet information matched with the target relationship from the preset domain knowledge graph based on the target relationship set in the case of extracting the target relationship set, so as to obtain the triplet information set.
Optionally, the triplet information comprises: a subject entity, an object entity and an entity relationship. In the case where the target entity set is extracted, the step of retrieving, based on the target entity set, the triplet information matched with the target entity from the preset domain knowledge graph to obtain the triplet information set comprises: determining a search hop count threshold and an initial search hop count; retrieving a knowledge-graph entity matched with a target entity from the preset domain knowledge graph, wherein the knowledge-graph entity is a subject entity or an object entity; in the case where a first knowledge-graph entity matched with the target entity is retrieved, updating the initial search hop count to obtain the current search hop count; determining a second knowledge-graph entity associated with the first knowledge-graph entity based on the entity relationship if the current search hop count is less than the search hop count threshold; updating the current search hop count, and continuing to determine a third knowledge-graph entity associated with the second knowledge-graph entity based on the entity relationship until the current search hop count is greater than or equal to the search hop count threshold, to obtain a knowledge-graph entity set; and determining, based on the knowledge-graph entity set, the triplet information to which each knowledge-graph entity belongs, to obtain the triplet information set.
In the embodiment of the invention, based on the extracted entities, triples (subject entity, object entity and relation) related to the entities extracted from the problem are retrieved from the domain knowledge graph (that is, based on the target entity set, the triplet information matched with the target entity is retrieved from the preset domain knowledge graph to obtain the triplet information set). The triples retrieved from the knowledge graph can be used as facts related to the input problem; there may be multiple triples, and the triplet information comprises: a subject entity, an object entity and an entity relationship.
In the embodiment of the invention, the size of the search space affects the number of triples retrieved, so the number of hops searched outward from the problem can be set according to the task complexity of the question-answering scene. Considering that the retrieved triples may be unrelated to the target problem or too numerous, a symmetric knowledge retriever or an asymmetric knowledge retriever may be used for retrieval.
In the embodiment of the present invention, a search hop count threshold (which may be set according to the actual situation, for example, 2) and an initial search hop count (for example, 0) may be determined first. A knowledge-graph entity (a subject entity or an object entity) matching the target entity is then searched for in the preset domain knowledge graph. If a first knowledge-graph entity matching the target entity is found, the initial search hop count is updated (that is, increased by 1) to obtain the current search hop count. If the current search hop count is less than the search hop count threshold, a second knowledge-graph entity associated with the first knowledge-graph entity is determined based on the entity relationship, the current search hop count is updated again, and the process continues until the current search hop count is greater than or equal to the search hop count threshold, which yields the knowledge-graph entity set; the triplet information to which each knowledge-graph entity belongs then forms the triplet information set. In the triplet information set, s_i represents a subject entity, r_i represents an object entity, o_i represents an entity relationship, and N represents the number of pieces of triplet information.
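The hop-limited retrieval can be illustrated with the following sketch. It assumes the domain knowledge graph is held in memory as a list of (subject entity, entity relationship, object entity) tuples and that matching is exact string equality; both are simplifying assumptions, and the function name `retrieve_triples` is hypothetical. The hop count at which each triple was reached is returned alongside it, since a later step ranks answer texts by that value.

```python
from typing import Dict, Iterable, List, Set, Tuple

# (subject entity, entity relationship, object entity); an assumed in-memory layout.
Triple = Tuple[str, str, str]


def retrieve_triples(graph: List[Triple],
                     target_entities: Iterable[str],
                     hop_threshold: int = 2) -> List[Tuple[Triple, int]]:
    """Collect triples reachable from the target entities within the hop limit."""
    graph_entities: Set[str] = {t[0] for t in graph} | {t[2] for t in graph}
    frontier = {e for e in target_entities if e in graph_entities}
    if not frontier:
        return []  # no knowledge-graph entity matches a target entity

    hops = 0   # initial search hop count
    hops += 1  # a matching entity was found: update the hop count
    reached_at: Dict[str, int] = {e: hops for e in frontier}

    # Expand along entity relationships until the threshold is reached.
    while hops < hop_threshold and frontier:
        next_frontier: Set[str] = set()
        for subject, _relation, obj in graph:
            if subject in frontier and obj not in reached_at:
                next_frontier.add(obj)
            if obj in frontier and subject not in reached_at:
                next_frontier.add(subject)
        hops += 1
        for entity in next_frontier:
            reached_at[entity] = hops
        frontier = next_frontier

    # Triples to which each collected entity belongs, with the smallest hop count.
    results: List[Tuple[Triple, int]] = []
    for triple in graph:
        endpoint_hops = [reached_at[e] for e in (triple[0], triple[2])
                         if e in reached_at]
        if endpoint_hops:
            results.append((triple, min(endpoint_hops)))
    return results


graph = [
    ("Book A", "author", "Person A"),
    ("Person A", "spouse", "Person B"),
    ("Person B", "born_in", "City C"),
    ("Book Z", "author", "Person Z"),
]
for triple, hop in retrieve_triples(graph, ["Book A"], hop_threshold=2):
    print(hop, triple)
```

A larger threshold widens the search space and returns more triples, which is the trade-off between coverage and irrelevant facts discussed above.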
Optionally, under the condition of extracting the target relation set, retrieving the triplet information matched with the target relation from the preset domain knowledge graph based on the target relation set to obtain the triplet information set, which comprises the following steps: searching entity relations matched with the target relations from a preset domain knowledge graph until searching is successful or searching times reach a preset searching threshold value, and obtaining a searching result; under the condition that the retrieval is successful, determining a target entity relation matched with the target relation based on the retrieval result to obtain a target entity relation set; and determining the triple information of each target entity relation based on the target entity relation set to obtain a triple information set.
In the embodiment of the invention, if the target relation set is extracted from the target problem, the entity relation matched with the target relation can be searched from the preset domain knowledge graph, if the searching is successful, the searching is stopped, the triplet information of the relation in the searched domain knowledge graph is acquired (under the condition that the searching is successful, the target entity relation matched with the target relation is determined based on the searching result, the target entity relation set is obtained, and the triplet information of each target entity relation is determined based on the target entity relation set, so as to obtain the triplet information set); if the search times reach the preset search threshold (which may be set according to the actual situation, for example, 3 times) and the search is not successful yet, the search may be stopped.
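The bounded relation search can be sketched as below. The text does not say how successive search attempts differ, so this sketch assumes, purely for illustration, that each attempt applies a progressively looser matching rule (exact, case-insensitive, then substring). The function name `retrieve_by_relation` and the tuple layout (subject entity, entity relationship, object entity) are likewise hypothetical.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject entity, entity relationship, object entity)

# Assumed matching strategies, tried in order; not specified by the source text.
MATCHERS: List[Callable[[str, str], bool]] = [
    lambda relation, target: relation == target,
    lambda relation, target: relation.lower() == target.lower(),
    lambda relation, target: target.lower() in relation.lower(),
]


def retrieve_by_relation(graph: List[Triple],
                         target_relation: str,
                         max_attempts: int = 3) -> Tuple[List[Triple], int]:
    """Search for a matching entity relationship, stopping on success or at the limit.

    Returns the matched triples and the number of attempts actually made.
    """
    attempts = 0
    for matches in MATCHERS[:max_attempts]:
        attempts += 1
        found = [t for t in graph if matches(t[1], target_relation)]
        if found:
            return found, attempts  # retrieval succeeded: stop searching
    return [], attempts             # threshold reached without success


graph = [
    ("Book A", "author", "Person A"),
    ("Book A", "translation_author", "Person C"),
]
print(retrieve_by_relation(graph, "Author"))
print(retrieve_by_relation(graph, "publisher"))
```

The second call exhausts all attempts and returns an empty list, which corresponds to stopping the search once the preset retrieval threshold is reached.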
Optionally, after retrieving the triplet information matched with the target entity from the preset domain knowledge graph based on the target entity set to obtain the triplet information set, the method further includes: connecting a main entity, an object entity and an entity relation in the triplet information to obtain an answer text, wherein the triplet information corresponds to an association value associated with a target entity; and sorting all the answer texts based on the association value to obtain an answer text set.
In the embodiment of the invention, the input of the inference model is in text form, so the triples related to the problem that are retrieved from the domain knowledge graph need to be converted into a variable-length text sequence. Specifically, the subject entity, entity relationship and object entity of each triple may be connected in a linear fashion to generate a knowledge text (that is, the subject entity, object entity and entity relationship in the triplet information are connected to obtain an answer text). In this embodiment, the association value between the triplet information and the target entity may be determined according to the number of search hops taken to reach the triplet information, and all the answer texts are then ranked according to the association value to obtain a ranked answer text set.
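As an illustration of the conversion and ranking, the sketch below joins each triple into a single string and orders the strings by an association value. The text only says the association value depends on the number of search hops; taking it as the reciprocal of the hop count (fewer hops, stronger association) is an assumption made here, and the function names are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject entity, entity relationship, object entity)


def verbalize(triple: Triple) -> str:
    """Connect subject, relationship and object linearly into an answer text."""
    subject, relation, obj = triple
    return f"{subject} {relation} {obj}"


def association_value(hops: int) -> float:
    """Assumed scoring rule: triples reached in fewer hops are more closely associated."""
    return 1.0 / hops


def rank_answer_texts(triples_with_hops: List[Tuple[Triple, int]]) -> List[str]:
    """Return answer texts sorted from the highest association value to the lowest."""
    scored = [(association_value(hops), verbalize(triple))
              for triple, hops in triples_with_hops]
    # sorted() is stable, so triples with equal scores keep their retrieval order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [text for _score, text in ranked]


retrieved = [
    (("Person A", "spouse", "Person B"), 2),
    (("Book A", "author", "Person A"), 1),
]
print(rank_answer_texts(retrieved))
```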
Step S304, based on the triplet information set and the question prompt information, constructing input knowledge information, inputting the input knowledge information into a preset reasoning model, and outputting a target answer of the target question.
In the embodiment of the invention, the answer text set k needs to be converted into the answer prompt information set k'; the answer prompt information set is then combined with the question prompt information to obtain the input knowledge information, the input knowledge information is input into the preset reasoning model, and the preset reasoning model generates an answer and returns the final question-answer result (namely the target answer).
Optionally, the step of constructing the input knowledge information based on the triplet information set and the problem prompt information includes: constructing an answer prompt template, wherein the answer prompt template comprises: answering the instruction; adding an answer instruction into each answer text in the answer text set based on the answer prompt template to generate an answer prompt information set; and splicing the question prompt information and the answer prompt information set to obtain the input knowledge information.
In the embodiment of the invention, an answer prompt template may be constructed first, the answer prompt template comprising an answer instruction (for example, "Refer to the following when answering the question"). The answer instruction is then added to each answer text in the answer text set according to the answer prompt template to generate the answer prompt information set. For example, if the answer text set k is "the author of a book is A; the author of a book is A+B; the translation author of a book is C", the answer prompt information set k' generated according to the answer prompt template is "Refer to the following when answering the question: the author of a book is A; the author of a book is A+B; the translation author of a book is C". The question prompt information and the answer prompt information set are then spliced to obtain the input knowledge information [x', k'], where [·] denotes concatenation.
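A short sketch of this templating and splicing step is given below. The instruction wording follows the example in the text; the separator characters, the newline between the two prompts, and the function names are assumptions made for illustration. The question prompt is placed first to follow the [x', k'] ordering used above.

```python
from typing import List

# Wording taken from the example in the text; the constant name is hypothetical.
ANSWER_INSTRUCTION = "Refer to the following when answering the question"


def build_answer_prompt(answer_texts: List[str],
                        instruction: str = ANSWER_INSTRUCTION) -> str:
    """Add the answer instruction to the ranked answer texts to obtain k'."""
    return f"{instruction}: " + "; ".join(answer_texts)


def build_input_knowledge(question_prompt: str, answer_prompt: str) -> str:
    """Splice x' and k' into the input knowledge information [x', k']."""
    return f"{question_prompt}\n{answer_prompt}"


k = [
    "the author of a book is A",
    "the author of a book is A+B",
    "the translation author of a book is C",
]
x_prime = "Please answer the following question: Who is the author of a book?"
k_prime = build_answer_prompt(k)
print(build_input_knowledge(x_prime, k_prime))
```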
Optionally, the step of inputting the input knowledge information into a preset inference model and outputting a target answer of the target question includes: analyzing the question prompt information by adopting a preset reasoning model to obtain an answer set, wherein the preset reasoning model is a pre-trained reasoning model by adopting a training data set, and the training data set comprises: a set of historical questions, a historical answer corresponding to each of the set of historical questions; characterizing the input knowledge information as preset conditions, and determining a conditional probability value of each answer in the answer set based on the preset conditions; and determining the answer indicated by the maximum conditional probability value as a target answer.
In the embodiment of the present invention, the preset inference model is an inference model pre-trained using a training data set (including a historical question set and a historical answer corresponding to each historical question in the historical question set), and the inference model may be various algorithm models, for example, a model for learning a complex relationship between a question and an answer using a deep neural network, a model for learning a statistical relationship between a question and an answer by analyzing a large amount of data, etc., without limitation.
In the embodiment of the invention, after the input knowledge information [x', k'] is obtained, it can be injected into the preset reasoning model. The preset reasoning model analyzes the question prompt information x' to obtain an answer set, takes the input knowledge information as the preset condition, and determines the conditional probability value P(y | [x', k']) of each answer in the answer set according to the preset condition, where y represents an answer in the answer set. Finally, the answer indicated by the maximum conditional probability value is determined as the target answer and output.
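The selection rule amounts to taking the argmax of P(y | [x', k']) over the answer set. The sketch below shows only that rule; the scoring function `toy_cond_prob` is a deliberately simple stand-in (smoothed counts of how often a candidate appears in the conditioning text) and is not the pre-trained reasoning model described in the text. All names are hypothetical.

```python
from typing import Callable, List


def toy_cond_prob(answer: str, candidates: List[str], condition: str) -> float:
    """Stand-in for P(y | [x', k']): smoothed, normalised mention counts."""
    counts = {c: condition.count(c) + 1 for c in candidates}
    return counts[answer] / sum(counts.values())


def select_target_answer(candidates: List[str],
                         condition: str,
                         cond_prob: Callable[[str, List[str], str], float]) -> str:
    """Return the candidate answer with the maximum conditional probability."""
    return max(candidates, key=lambda y: cond_prob(y, candidates, condition))


input_knowledge = (
    "Please answer the following question: Who is the author of Book A?\n"
    "Refer to the following when answering the question: Book A author Person A"
)
answer_set = ["Person A", "Person B"]
print(select_target_answer(answer_set, input_knowledge, toy_cond_prob))
```

With a real model, `cond_prob` would be replaced by the model's own likelihood of each candidate given the spliced prompt; the argmax step stays the same.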
The following detailed description is directed to alternative embodiments.
Fig. 4 is a schematic diagram of an alternative knowledge-based question-and-answer reasoning process based on domain knowledge graph according to an embodiment of the invention, as shown in fig. 4, including the following processes:
(1) Acquiring a problem, and adding an instruction into the problem according to a template mode to obtain a problem prompt;
(2) Extracting entities from the problem to obtain entity and relation elements, and then carrying out knowledge retrieval in the domain knowledge graph according to the entity and relation elements to obtain triples (related to the problem);
(3) Verbalizing the triples (related to the problem) to generate a knowledge text;
(4) Adding an instruction to the knowledge text according to a template to obtain a knowledge prompt (facts), and combining the knowledge prompt (facts) with the problem prompt to form a fused prompt;
(5) Injecting the fused prompt into the inference model, and analyzing it through the inference model to generate an answer to the problem.
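The five stages of FIG. 4 can be strung together as in the following compact sketch. It is a simplified illustration under several assumptions: entities are found by substring lookup in a caller-supplied lexicon, retrieval covers a single hop only, the prompt wording is hypothetical, and the inference model is passed in as an arbitrary callable rather than a specific model.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject entity, entity relationship, object entity)


def answer_question(question: str,
                    graph: List[Triple],
                    entity_lexicon: List[str],
                    model: Callable[[str], str]) -> str:
    """Run stages (1) to (5) on one question and return the model output."""
    # (1) Problem prompt: add an instruction to the problem.
    question_prompt = f"Please answer the following question: {question}"
    # (2) Entity extraction (lexicon lookup) and one-hop knowledge retrieval.
    entities = [e for e in entity_lexicon if e.lower() in question.lower()]
    facts = [t for t in graph if t[0] in entities or t[2] in entities]
    # (3) Triple verbalization into a knowledge text.
    knowledge_text = "; ".join(" ".join(t) for t in facts)
    # (4) Knowledge prompt, fused with the problem prompt.
    knowledge_prompt = f"Refer to the following facts: {knowledge_text}"
    fused_prompt = f"{question_prompt}\n{knowledge_prompt}"
    # (5) Inject the fused prompt into the inference model.
    return model(fused_prompt)


graph = [("Book A", "author", "Person A"), ("Book Z", "author", "Person Z")]
# An identity "model" is used here only to display the fused prompt.
print(answer_question("Who is the author of Book A?", graph,
                      ["Book A", "Book Z"], model=lambda prompt: prompt))
```

Because the model parameters are never touched, updating the knowledge only requires changing `graph`, which matches the no-fine-tuning property described for this embodiment.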
In the embodiment of the invention, a method is provided for generating knowledge prompts based on a domain knowledge graph to enhance the professional knowledge reasoning capability of a model. It can address the poor reliability, possible prejudice and fairness problems of model generation results in the related technology. By making full use of the already constructed domain knowledge graph data assets, the model's reasoning capability can be enhanced iteratively, giving the method stronger scene adaptability, higher flexibility and low cost. In addition, because the embodiment generates factual answers with the inference model conditioned on factual knowledge, it can effectively prevent the model from generating logically confused answers. The parameters of the model can be kept unchanged and no fine-tuning is needed when the knowledge is updated, so for application fields where knowledge is updated quickly and frequently the method is more flexible to apply and lower in cost.
The following describes in detail another embodiment.
Example two
The knowledge question-answering device based on the domain knowledge graph provided in the present embodiment includes a plurality of implementation units, each of which corresponds to each implementation step in the first embodiment.
Fig. 5 is a schematic diagram of an alternative knowledge question-answering device based on a domain knowledge graph according to an embodiment of the invention. As shown in fig. 5, the knowledge question-answering device may include: a receiving unit 50, an extracting unit 51, a retrieving unit 52 and a constructing unit 53, wherein,
the receiving unit 50 is configured to receive a target problem, and process the target problem to obtain problem prompt information;
the extracting unit 51 is configured to extract a target entity set or a target relationship set based on the target problem, where the target entity set includes: a plurality of target entities, a set of target relationships comprising: a plurality of target relationships;
a retrieving unit 52, configured to retrieve, based on the target entity set, triplet information matched with the target entity from a preset domain knowledge graph, or retrieve, based on the target relationship set, triplet information matched with the target relationship from the preset domain knowledge graph, to obtain a triplet information set;
The construction unit 53 is configured to construct input knowledge information based on the triplet information set and the question prompt information, input the input knowledge information to a preset inference model, and output a target answer of the target question.
The knowledge question-answering device can receive the target problem through the receiving unit 50 and process the target problem to obtain the problem prompt information; extract the target entity set or the target relation set based on the target problem through the extracting unit 51; through the retrieving unit 52, retrieve the triplet information matched with the target entity from the preset domain knowledge graph based on the target entity set in the case where the target entity set is extracted, or retrieve the triplet information matched with the target relation from the preset domain knowledge graph based on the target relation set in the case where the target relation set is extracted, so as to obtain the triplet information set; and, through the constructing unit 53, construct the input knowledge information based on the triplet information set and the problem prompt information, input the input knowledge information into the preset reasoning model, and output the target answer of the target problem. In the embodiment of the invention, because the problem is combined with the preset domain knowledge graph to construct the input knowledge information, and the input knowledge information is then processed by the preset reasoning model, prejudice in the model reasoning process can be reduced and the accuracy of knowledge question-answering reasoning on the problem is improved, which solves the technical problem in the related technology that the accuracy of knowledge question-answering reasoning on a problem is low.
Optionally, the receiving unit includes: the first construction module is used for constructing a problem prompt template, wherein the problem prompt template comprises: a question instruction; the first generation module is used for adding a problem instruction into the target problem based on the problem prompt template to generate problem prompt information.
Optionally, the extracting unit includes: the first processing module is used for performing word segmentation processing on the target problem to obtain a plurality of segmented words; the first analysis module is used for analyzing the word segmentation and determining the word type of the word segmentation; the first determining module is used for determining the word segmentation indicated by the word type as a target entity under the condition that the word type is a first preset type, or determining the word segmentation indicated by the word type as a target relationship under the condition that the word type is a second preset type, so as to obtain a target relationship set; the second determining module is used for determining the context information of the target problem and supplementing the target entity corresponding to the target problem based on the context information under the condition that all word types are not the first preset type and the second preset type; and the second generation module is used for generating a target entity set based on all the target entities.
Optionally, the triplet information includes a main entity, an object entity, and an entity relationship, and the retrieving unit includes: a third determining module, configured to determine a retrieval hop count threshold and an initial retrieval hop count; a first retrieval module, configured to retrieve, from the preset domain knowledge graph, a knowledge-graph entity matched with the target entity, wherein the knowledge-graph entity is a main entity or an object entity; a first updating module, configured to update the initial retrieval hop count to obtain a current retrieval hop count when a first knowledge-graph entity matched with the target entity is retrieved; a fourth determining module, configured to determine, based on the entity relationship, a second knowledge-graph entity associated with the first knowledge-graph entity when the current retrieval hop count is less than the retrieval hop count threshold; a second updating module, configured to update the current retrieval hop count and continue to determine, based on the entity relationship, a third knowledge-graph entity associated with the second knowledge-graph entity, until the current retrieval hop count is greater than or equal to the retrieval hop count threshold, so as to obtain a knowledge-graph entity set; and a fifth determining module, configured to determine, based on the knowledge-graph entity set, the triplet information to which each knowledge-graph entity belongs, so as to obtain the triplet information set.
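The hop-limited retrieval reads like a breadth-first expansion over the graph, which the sketch below implements. The function name, the graph representation as a list of triples, and the choice to count the initial match as the first hop are assumptions made for illustration.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (main entity, entity relationship, object entity)


def multi_hop_retrieve(
    graph: List[Triple],
    target_entity: str,
    hop_threshold: int,
    initial_hops: int = 0,
) -> List[Triple]:
    """Collect triples reachable from target_entity within hop_threshold hops."""
    nodes = {s for s, _, _ in graph} | {o for _, _, o in graph}
    if target_entity not in nodes:
        return []  # no knowledge-graph entity matches the target entity

    entity_set: Set[str] = {target_entity}
    frontier: Set[str] = {target_entity}
    current_hops = initial_hops + 1  # updated once the first entity is matched

    while current_hops < hop_threshold and frontier:
        next_frontier: Set[str] = set()
        for s, _, o in graph:
            # Follow entity relationships in either direction.
            if s in frontier and o not in entity_set:
                next_frontier.add(o)
            if o in frontier and s not in entity_set:
                next_frontier.add(s)
        entity_set |= next_frontier
        frontier = next_frontier
        current_hops += 1

    # Triplet information to which each collected entity belongs.
    return [t for t in graph if t[0] in entity_set or t[2] in entity_set]
```

A larger threshold pulls in more distant facts at the cost of a longer and noisier prompt, so the threshold acts as a relevance cutoff.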
Optionally, the retrieving unit further includes: a second retrieval module, configured to retrieve, from the preset domain knowledge graph, an entity relationship matched with the target relationship until the retrieval succeeds or the number of retrievals reaches a preset retrieval threshold, so as to obtain a retrieval result; a sixth determining module, configured to determine, when the retrieval succeeds, a target entity relationship matched with the target relationship based on the retrieval result, so as to obtain a target entity relationship set; and a seventh determining module, configured to determine, based on the target entity relationship set, the triplet information to which each target entity relationship belongs, so as to obtain the triplet information set.
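The source does not say what changes between successive retrieval attempts. The sketch below assumes, purely for illustration, that each attempt applies a looser matching rule (exact, case-insensitive, then substring) and that the loop stops on the first success or when the attempt limit is reached. The matcher list and function name are hypothetical.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (main entity, entity relationship, object entity)

# Hypothetical matching rules, tried in order from strictest to loosest.
MATCHERS: List[Callable[[str, str], bool]] = [
    lambda query, relation: query == relation,
    lambda query, relation: query.lower() == relation.lower(),
    lambda query, relation: query.lower() in relation.lower(),
]


def retrieve_by_relation(
    graph: List[Triple], target_relation: str, max_attempts: int
) -> List[Triple]:
    """Retry relation matching until it succeeds or max_attempts is reached."""
    attempts = 0
    while attempts < max_attempts and attempts < len(MATCHERS):
        matches = MATCHERS[attempts]
        attempts += 1
        matched = {r for _, r, _ in graph if matches(target_relation, r)}
        if matched:
            # Retrieval succeeded: return the triples carrying those relations.
            return [t for t in graph if t[1] in matched]
    return []  # attempt limit reached without a match
```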
Optionally, the knowledge question-answering device further includes: a first connection module, configured to, after the triplet information set has been obtained by retrieving the triplet information matched with the target entity from the preset domain knowledge graph based on the target entity set, connect the main entity, the object entity, and the entity relationship in each piece of triplet information to obtain an answer text, wherein each piece of triplet information corresponds to an association value with the target entity; and a first ordering module, configured to sort all the answer texts based on the association values to obtain an answer text set.
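Connecting and ordering the triples might look like the following sketch. The sort direction (highest association value first) and the space-joined sentence form are assumptions; the source only states that the texts are sorted by association value.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (main entity, entity relationship, object entity)


def to_answer_texts(scored_triples: List[Tuple[Triple, float]]) -> List[str]:
    """Join each triple into a text and order texts by association value."""
    # Assumption: a higher association value means a more relevant triple.
    ordered = sorted(scored_triples, key=lambda item: item[1], reverse=True)
    return [f"{s} {r} {o}" for (s, r, o), _ in ordered]
```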
Optionally, the constructing unit includes: a second construction module, configured to construct an answer prompt template, wherein the answer prompt template includes an answer instruction; a third generation module, configured to add the answer instruction to each answer text in the answer text set based on the answer prompt template to generate an answer prompt information set; and a first splicing module, configured to splice the question prompt information and the answer prompt information set to obtain the input knowledge information.
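A minimal sketch of the answer prompt template and the splicing step follows. The template text, the default instruction, and the order of splicing (question prompt first, then the answer prompts, one per line) are assumptions not fixed by the source.

```python
from typing import List

# Hypothetical template: the source only says it contains an answer instruction.
ANSWER_TEMPLATE = "{instruction} {answer_text}"


def build_answer_prompts(answer_texts: List[str], instruction: str = "Known fact:") -> List[str]:
    """Add the answer instruction to every answer text in the set."""
    return [
        ANSWER_TEMPLATE.format(instruction=instruction, answer_text=text)
        for text in answer_texts
    ]


def splice(question_prompt: str, answer_prompts: List[str]) -> str:
    """Splice the question prompt and the answer prompts into one input."""
    return "\n".join([question_prompt, *answer_prompts])
```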
Optionally, the constructing unit further includes: a second analysis module, configured to analyze the question prompt information using the preset inference model to obtain an answer set, wherein the preset inference model is trained in advance on a training data set that includes a set of historical questions and a historical answer corresponding to each historical question; an eighth determining module, configured to take the input knowledge information as a preset condition and determine a conditional probability value of each answer in the answer set under that condition; and a ninth determining module, configured to determine the answer with the maximum conditional probability value as the target answer.
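The final selection amounts to choosing the candidate answer that maximizes the conditional probability given the input knowledge information. The sketch below assumes a caller-supplied scoring function and normalizes its scores with a softmax. The word-overlap scorer is only a stand-in for a trained inference model, and all names are hypothetical.

```python
import math
from typing import Callable, List, Tuple


def select_answer(
    candidates: List[str],
    input_knowledge: str,
    score: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Return the candidate with the highest conditional probability."""
    if not candidates:
        raise ValueError("answer set is empty")
    scores = [score(answer, input_knowledge) for answer in candidates]
    peak = max(scores)
    # Softmax over the scores, shifted by the peak for numerical stability.
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    probabilities = [w / total for w in weights]
    best = max(range(len(candidates)), key=probabilities.__getitem__)
    return candidates[best], probabilities[best]


def overlap_score(answer: str, knowledge: str) -> float:
    """Stand-in scorer: number of words the answer shares with the knowledge."""
    return float(len(set(answer.lower().split()) & set(knowledge.lower().split())))
```

With a real model, `score` would be replaced by the model's log-likelihood of the answer conditioned on the input knowledge information; the selection step stays the same.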
The knowledge question-answering apparatus may further include a processor and a memory, wherein the receiving unit 50, the extracting unit 51, the retrieving unit 52, the constructing unit 53, and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize the corresponding functions.
The processor includes a kernel, and the kernel fetches the corresponding program unit from the memory. One or more kernels may be provided. By adjusting kernel parameters, the input knowledge information is constructed based on the triplet information set and the question prompt information and input into the preset inference model, and the target answer to the target question is output.
The memory may include, among computer-readable media, volatile memory such as random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
The application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program that initializes the following method steps: receiving a target question and processing it to obtain question prompt information; extracting a target entity set or a target relationship set based on the target question; when the target entity set is extracted, retrieving, based on the target entity set, triplet information matched with the target entity from a preset domain knowledge graph, or, when the target relationship set is extracted, retrieving, based on the target relationship set, triplet information matched with the target relationship from the preset domain knowledge graph, so as to obtain a triplet information set; constructing input knowledge information based on the triplet information set and the question prompt information; and inputting the input knowledge information into a preset inference model and outputting a target answer to the target question.
According to another aspect of the embodiment of the present invention, there is further provided a computer readable storage medium, where the computer readable storage medium includes a stored computer program, and when the computer program runs, a device where the computer readable storage medium is located is controlled to execute the knowledge question-answering method based on the domain knowledge graph.
According to another aspect of the embodiment of the present invention, there is also provided an electronic device, including one or more processors and a memory, where the memory is configured to store one or more programs, and the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the knowledge question-answering method based on the domain knowledge graph.
Fig. 6 is a block diagram of the hardware structure of an electronic device (or mobile device) for implementing the knowledge question-answering method based on a domain knowledge graph, in accordance with an embodiment of the invention. As shown in Fig. 6, the electronic device may include one or more processors 602 (shown in Fig. 6 as 602a, 602b, ..., 602n), which may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA), and a memory 604 for storing data. In addition, the electronic device may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a keyboard, a power supply, and/or a camera. It will be appreciated by those of ordinary skill in the art that the configuration shown in Fig. 6 is merely illustrative and does not limit the configuration of the electronic device described above. For example, the electronic device may include more or fewer components than shown in Fig. 6, or have a different configuration from that shown in Fig. 6.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, each embodiment is described with its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed between components may be made through some interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied, in essence or in whole or in part, in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing program code.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and refinements without departing from the principles of the present invention, and such modifications and refinements shall also fall within the scope of protection of the present invention.
Claims (11)
1. The knowledge question-answering method based on the domain knowledge graph is characterized by comprising the following steps of:
receiving a target question, and processing the target question to obtain question prompt information;
extracting a target entity set or a target relationship set based on the target question, wherein the target entity set comprises a plurality of target entities, and the target relationship set comprises a plurality of target relationships;
retrieving triple information matched with the target entity from a preset domain knowledge graph based on the target entity set under the condition of extracting the target entity set, or retrieving triple information matched with the target relationship from the preset domain knowledge graph based on the target relationship set under the condition of extracting the target relationship set, so as to obtain a triple information set;
and constructing input knowledge information based on the triplet information set and the question prompt information, inputting the input knowledge information into a preset reasoning model, and outputting a target answer of the target question.
2. The knowledge question answering method according to claim 1, wherein the step of processing the target question to obtain question prompting information comprises:
constructing a question prompt template, wherein the question prompt template comprises a question instruction;
and adding the question instruction to the target question based on the question prompt template to generate the question prompt information.
3. The knowledge question answering method according to claim 1, wherein the step of extracting a set of target entities or a set of target relationships based on the target questions comprises:
performing word segmentation on the target question to obtain a plurality of segmented words;
analyzing each segmented word to determine its word type;
determining a segmented word as the target entity when its word type is a first preset type, or determining a segmented word as the target relationship when its word type is a second preset type, so as to obtain the target relationship set;
determining context information of the target question when none of the word types is the first preset type or the second preset type, and supplementing the target entity corresponding to the target question based on the context information;
and generating the target entity set based on all the target entities.
4. The knowledge question-answering method according to claim 1, wherein the triplet information includes: a main entity, an object entity, and an entity relationship; and, in the case that the target entity set is extracted, the step of retrieving, based on the target entity set, the triplet information matched with the target entity from the preset domain knowledge graph to obtain the triplet information set comprises:
determining a search hop count threshold and an initial search hop count;
retrieving a knowledge-graph entity matched with the target entity from the preset domain knowledge-graph, wherein the knowledge-graph entity is the main entity or the object entity;
under the condition that a first knowledge-graph entity matched with the target entity is retrieved, updating the initial retrieval hop count to obtain the current retrieval hop count;
determining a second knowledge-graph entity associated with the first knowledge-graph entity based on the entity relationship if the current search hop count is less than the search hop count threshold;
updating the current retrieval hop count, and continuously determining a third knowledge-graph entity associated with the second knowledge-graph entity based on the entity relationship until the current retrieval hop count is greater than or equal to the retrieval hop count threshold value to obtain a knowledge-graph entity set;
and determining, based on the knowledge-graph entity set, the triplet information to which each knowledge-graph entity belongs, to obtain the triplet information set.
5. The knowledge question-answering method according to claim 1, wherein in the case of extracting the target relationship set, based on the target relationship set, the step of retrieving triplet information matched with the target relationship from a preset domain knowledge graph to obtain a triplet information set includes:
searching entity relations matched with the target relations from the preset domain knowledge graph until searching is successful or searching times reach a preset searching threshold value, and obtaining a searching result;
under the condition that the retrieval is successful, determining a target entity relationship matched with the target relationship based on the retrieval result to obtain a target entity relationship set;
and determining the triplet information of each target entity relation based on the target entity relation set to obtain the triplet information set.
6. The knowledge question-answering method according to claim 1, further comprising, after obtaining the triplet information set:
connecting a main entity, an object entity and an entity relationship in the triplet information to obtain an answer text, wherein the triplet information corresponds to an association value associated with the target entity;
and sorting all the answer texts based on the association value to obtain an answer text set.
7. The knowledge question answering method according to claim 6, wherein the step of constructing input knowledge information based on the triplet information set and the question prompting information comprises:
constructing an answer prompt template, wherein the answer prompt template comprises an answer instruction;
adding the answer instruction into each answer text in the answer text set based on the answer prompt template to generate an answer prompt information set;
and splicing the question prompt information and the answer prompt information set to obtain the input knowledge information.
8. The knowledge question answering method according to claim 7, wherein the step of inputting the input knowledge information to a preset inference model and outputting a target answer to the target question comprises:
analyzing the question prompt information by adopting the preset reasoning model to obtain an answer set, wherein the preset reasoning model is a reasoning model which is trained in advance by adopting a training data set, and the training data set comprises: a set of historical questions, a historical answer corresponding to each historical question in the set of historical questions;
characterizing the input knowledge information as a preset condition, and determining a conditional probability value of each answer in the answer set based on the preset condition;
and determining the answer indicated by the maximum conditional probability value as the target answer.
9. The knowledge question-answering device based on the domain knowledge graph is characterized by comprising:
the receiving unit is used for receiving a target question and processing the target question to obtain question prompt information;
the extraction unit is used for extracting a target entity set or a target relationship set based on the target question, wherein the target entity set comprises a plurality of target entities, and the target relationship set comprises a plurality of target relationships;
the retrieval unit is used for retrieving the triplet information matched with the target entity from a preset domain knowledge graph based on the target entity set under the condition of extracting the target entity set, or retrieving the triplet information matched with the target relationship from the preset domain knowledge graph based on the target relationship set under the condition of extracting the target relationship set, so as to obtain a triplet information set;
the construction unit is used for constructing input knowledge information based on the triplet information set and the question prompt information, inputting the input knowledge information into a preset reasoning model and outputting a target answer of the target question.
10. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored computer program, wherein the computer program, when run, controls a device in which the computer readable storage medium is located to execute the knowledge question-answering method based on the domain knowledge graph according to any one of claims 1 to 8.
11. An electronic device comprising one or more processors and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the domain knowledge-graph based knowledge-question-answering method of any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311049695.8A CN117076688A (en) | 2023-08-18 | 2023-08-18 | Knowledge question-answering method and device based on domain knowledge graph and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311049695.8A CN117076688A (en) | 2023-08-18 | 2023-08-18 | Knowledge question-answering method and device based on domain knowledge graph and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117076688A true CN117076688A (en) | 2023-11-17 |
Family
ID=88703770
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311049695.8A Pending CN117076688A (en) | 2023-08-18 | 2023-08-18 | Knowledge question-answering method and device based on domain knowledge graph and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117076688A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117436531A (en) * | 2023-12-21 | 2024-01-23 | 安徽大学 | Question answering system and method based on rice pest knowledge graph |
CN117493582A (en) * | 2023-12-29 | 2024-02-02 | 珠海格力电器股份有限公司 | Model result output method and device, electronic equipment and storage medium |
CN117634617A (en) * | 2024-01-25 | 2024-03-01 | 清华大学 | Knowledge-intensive reasoning question-answering method, device, electronic equipment and storage medium |
CN118069817A (en) * | 2024-04-18 | 2024-05-24 | 国家超级计算天津中心 | Knowledge graph-based generation type question-answering method, device and storage medium |
-
2023
- 2023-08-18 CN CN202311049695.8A patent/CN117076688A/en active Pending
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117436531A (en) * | 2023-12-21 | 2024-01-23 | 安徽大学 | Question answering system and method based on rice pest knowledge graph |
CN117493582A (en) * | 2023-12-29 | 2024-02-02 | 珠海格力电器股份有限公司 | Model result output method and device, electronic equipment and storage medium |
CN117493582B (en) * | 2023-12-29 | 2024-04-05 | 珠海格力电器股份有限公司 | Model result output method and device, electronic equipment and storage medium |
CN117634617A (en) * | 2024-01-25 | 2024-03-01 | 清华大学 | Knowledge-intensive reasoning question-answering method, device, electronic equipment and storage medium |
CN117634617B (en) * | 2024-01-25 | 2024-05-17 | 清华大学 | Knowledge-intensive reasoning question-answering method, device, electronic equipment and storage medium |
CN118069817A (en) * | 2024-04-18 | 2024-05-24 | 国家超级计算天津中心 | Knowledge graph-based generation type question-answering method, device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111209384B (en) | Question-answer data processing method and device based on artificial intelligence and electronic equipment | |
CN112270196B (en) | Entity relationship identification method and device and electronic equipment | |
CN117076688A (en) | Knowledge question-answering method and device based on domain knowledge graph and electronic equipment | |
CN114565104A (en) | Language model pre-training method, result recommendation method and related device | |
CN110675023B (en) | Litigation request rationality prediction model training method based on neural network, and litigation request rationality prediction method and device based on neural network | |
CN112016313B (en) | Spoken language element recognition method and device and warning analysis system | |
CN116561538A (en) | Question-answer scoring method, question-answer scoring device, electronic equipment and storage medium | |
CN117520523B (en) | Data processing method, device, equipment and storage medium | |
CN109857865B (en) | Text classification method and system | |
CN113537206B (en) | Push data detection method, push data detection device, computer equipment and storage medium | |
Wu et al. | Chinese text classification based on character-level CNN and SVM | |
CN111368096A (en) | Knowledge graph-based information analysis method, device, equipment and storage medium | |
CN114647713A (en) | Knowledge graph question-answering method, device and storage medium based on virtual confrontation | |
CN112632248A (en) | Question answering method, device, computer equipment and storage medium | |
CN112199958A (en) | Concept word sequence generation method and device, computer equipment and storage medium | |
CN117290488A (en) | Man-machine interaction method and device based on large model, electronic equipment and storage medium | |
CN114416929A (en) | Sample generation method, device, equipment and storage medium of entity recall model | |
EP4030355A1 (en) | Neural reasoning path retrieval for multi-hop text comprehension | |
CN116049376B (en) | Method, device and system for retrieving and replying information and creating knowledge | |
CN113705207A (en) | Grammar error recognition method and device | |
Tannert et al. | FlowchartQA: the first large-scale benchmark for reasoning over flowcharts | |
CN115859973A (en) | Text feature extraction method and device, nonvolatile storage medium and electronic equipment | |
CN113656548A (en) | Text classification model interpretation method and system based on data envelope analysis | |
CN111191448A (en) | Word processing method, device, storage medium and processor | |
CN117009532B (en) | Semantic type recognition method and device, computer readable medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||