CN117033608A - Knowledge graph generation type question-answering method and system based on large language model - Google Patents

Knowledge graph generation type question-answering method and system based on large language model

Info

Publication number
CN117033608A
CN202311266604.6A (application) · CN117033608A (publication)
Authority
CN
China
Prior art keywords
question
entity
model
answer
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311266604.6A
Other languages
Chinese (zh)
Other versions
CN117033608B (en)
Inventor
陈莹
崔莹
谢达
代翔
雋兆波
何健军
陈伟晴
王侃
戴礼灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 10 Research Institute
Original Assignee
CETC 10 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 10 Research Institute filed Critical CETC 10 Research Institute
Priority to CN202311266604.6A priority Critical patent/CN117033608B/en
Publication of CN117033608A publication Critical patent/CN117033608A/en
Application granted granted Critical
Publication of CN117033608B publication Critical patent/CN117033608B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/33 - Querying
    • G06F 16/332 - Query formulation
    • G06F 16/3329 - Natural language query formulation or dialogue systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/36 - Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 - Ontology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 - Computing arrangements using knowledge-based models
    • G06N 5/02 - Knowledge representation; Symbolic representation
    • G06N 5/022 - Knowledge engineering; Knowledge acquisition
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a knowledge graph generation type question-answering method and system based on a large language model. The method comprises: building large-language-model fine-tuning training data, the training data comprising a prompt statement, a question set and an answer set, the prompt statement comprising a prompt template and instance data; fine-tuning the large language model based on LoRA; providing a question-answer knowledge base for the LoRA-fine-tuned large language model through a subgraph retrieval strategy; and taking the LoRA-fine-tuned large language model as the question-answering inference model, inputting the question text into the inference model, and generating the question answer based on the provided knowledge base. According to the invention, the large model does not generate answers from the question text alone; instead, graph information and the question are combined into model prompt statements from which the answer is generated, making answers more accurate and traceable.

Description

Knowledge graph generation type question-answering method and system based on large language model
Technical Field
The invention relates to the technical field of knowledge graphs, in particular to a knowledge graph generation type question-answering method and system based on a large language model.
Background
Intelligent question answering has become an important way for people to solve problems and quickly acquire related information. A question-answering system is a program that answers questions posed by a user in natural language as quickly and accurately as possible. The most common intelligent question-answering technology at present is a pre-trained model built on deep learning; its core is to recognize the intent of a question through semantic understanding and then retrieve related answers from a knowledge base. Compared with a traditional question-answering database, a knowledge graph shows data relations and data characteristics more intuitively, so knowledge graphs of specific domains are built and used as the main data-source base of intelligent question answering.
A knowledge graph can be regarded as a structured representation of knowledge, composed of triples (head entity, relation, tail entity) that represent the relation between two entities. Knowledge-graph question answering is mainly realized through subgraph query and semantic-similarity matching, but currently still has the following shortcomings:
first, question generalization is poor: the question entity and question relation must be identified during knowledge-graph question answering, and related graph nodes must be matched to related triples to retrieve an answer; if a question is phrased more loosely, the question entities and relations may fail to match any subgraph and no answer can be retrieved.
Second, questions are strongly limited: knowledge-graph question answering returns answers based on the entity names and entity relations of triples, so only questions about node associations and attribute-related content are supported, and other content cannot be answered.
Third, graph reasoning ability is weak: current knowledge-graph question answering has no reasoning capability and cannot perform multi-hop reasoning or statistics over subgraph content.
The invention therefore provides a knowledge-graph question-answering method and system based on a large model, which accurately understands the intent of a question during question answering and generates the answer based on knowledge-graph data.
Disclosure of Invention
In view of the above, the invention provides a knowledge graph generation type question-answering method and system based on a large language model, which parse a user-input question and extract its key elements, retrieve subgraphs from the knowledge graph to serve as the question-answer knowledge base, and then construct model prompt statements from the subgraph information and the question, from which the large language model finally generates the answer.
The invention discloses a knowledge graph generation type question-answering method based on a large language model, which comprises the following steps:
step 1: building a large language model fine tuning training data, wherein the training data comprises a prompt sentence, a question set and an answer set; the prompt statement comprises a prompt template and instance data;
step 2: fine tuning a large language model based on the LoRA;
step 3: providing a question-answer knowledge base for the large language model subjected to LoRA fine adjustment through a subgraph searching strategy;
step 4: and (3) taking the large language model subjected to LoRA fine tuning as a question-answering reasoning model, inputting a question text into the question-answering reasoning model, and generating a question answer based on the question-answering knowledge base provided in the step (3).
Further, the step 1 includes:
step 11: constructing a question set and an answer set of training data:
generating graph single-hop and multi-hop question sets and the corresponding Answer set by acquiring triple-structured data of the domain knowledge graph and calling a BERT-based pre-trained model; for a graph triple triple = [node1, rel, node2], node1 is the head entity, rel is the relation edge, and node2 is the tail entity; the pre-trained model takes [node1, rel] as input and outputs a question according to [node1, rel], while node2 serves as the question Answer, so that a batch of question-answer data is constructed automatically;
step 12: constructing a prompt statement of training data:
obtaining structured instance data through the domain knowledge graph: labeling the entities and associated triple information of the graph data so that the knowledge-graph data is $X = \{name: name, k_1: v_1, k_2: v_2, \dots, k_n: v_n\}$, where X is the relation data of one entity node in the knowledge graph, name is the node name, $k_n$ is an attribute or relation-edge name of the node, and $v_n$ is the corresponding attribute value or the name of the node connected by the relation edge; converting the graph data X into list data $X' = [[name, k_1, v_1], [name, k_2, v_2], \dots, [name, k_n, v_n]]$, where X' is the instance data of the model training data;
then constructing a prompt template P oriented to the knowledge-graph question-answering task, and adding the instance data to the prompt template to generate the model-training Prompt statement.
Further, the step 2 includes:
step 21: splitting the LoRA model parameters into two parts: the pre-training weight $w \in \mathbb{R}^{d \times d}$ and the fine-tune delta weight $\Delta w \in \mathbb{R}^{d \times d}$, where w is the frozen pre-training weight and $\Delta w$ is the weight update produced during fine-tuning; let x be the input and h the output, then $h = wx + \Delta w x$; during training the pre-training weight w is fixed and $\Delta w$ is approximated by two low-rank matrices A and B, i.e. $h = wx + BAx$ with $A \in \mathbb{R}^{r \times d}$ and $B \in \mathbb{R}^{d \times r}$ (so that $BA \in \mathbb{R}^{d \times d}$); A is initialized with a Gaussian distribution and B with zeros; the low-rank matrices A and B are trained during training, and only the low-rank matrix part is saved as the model weight;
step 22: tokenizing the training data of step 1, setting the training parameters, loading the ChatGLM-6B model, starting fine-tuning training, and saving the model weights with the lowest Loss value during training;
after fine-tuning, verifying the model effect with part of the training data: the trained matrix product BA is added to the originally fixed weight matrix w as the new weight matrix, i.e. $h = (w + BA)x$, and h replaces the original pre-trained language-model parameters as the new weight parameters; the model's test result res1 is compared with the pre-training test result res0, using the question-answer average accuracy Acc as the comparison metric;
if res1 > res0, fine-tuning has improved the model and it can serve as the inference model for subsequent graph question answering; if res1 < res0, the training parameters are readjusted and steps 21 and 22 are repeated until the fine-tuned test result is better than the pre-training result.
Further, the comparison metric is the question-answer average accuracy Acc, which is calculated from the following quantities:
n is the total number of test samples; $x_i$ is the indicator for test-set question $q_i$, taking the value 1 if the predicted answer is identical to the standard answer and 0 otherwise; $t_i$ is the time required to return an answer for test-set question $q_i$.
Further, the step 3 includes:
step 31: constructing a vertical-domain knowledge-graph database and an entity-name vector library; importing structured entity-relation-entity and entity-attribute-value triple data into the graph database in batches; after the knowledge graph is built, retrieving entity nodes by label, converting entity names into text vectors with the TransE method, and storing the text vectors and node IDs in a vector database;
step 32: completing the entity-recognition task with the information-extraction model UIE: when the UIE framework is used, a target query entity list is first built by screening the nodes of the knowledge graph and extracting the target entities, and a corresponding AC-automaton model is constructed; after text data is obtained and named entities are recognized, it is judged whether each extracted entity is a target entity;
step 33: converting the target entity recognized in step 32 into a text vector $x_1$, comparing $x_1$ by cosine similarity with all vectors in the name vector library constructed in step 31, and finally returning from the vector library the vector with the maximum cosine similarity to $x_1$ together with its node id;
step 34: entity linking links the entity mention to the corresponding knowledge-graph node: based on the node id returned in step 33, the corresponding node description is found in the candidate knowledge graph; taking that graph node as the center, all graph data connected by relation edges is retrieved and returned as a subgraph, providing data support for subsequent question answering.
Further, in said step 31:
the TransE method defines a relation in the knowledge graph as a mapping from head entity $e_1$ to tail entity $e_2$: given a head entity $e_1$, the corresponding tail entity $e_2$ is predicted from the relation representation in the knowledge graph; likewise, a head entity can be predicted from the relation representation and the tail entity; the relation between entities is expressed as $r = e_1 - e_2$; meanwhile, the higher the similarity between the feature expression vector $s_i$ of an entity co-occurrence sentence and the target relation vector r, the higher the probability that the co-occurrence sentence correctly expresses the target relation, and the higher the attention weight $\alpha_i$.
Further, in the step 32, determining whether the extracted entity is a target entity includes:
directly judging, through the AC automaton, which entities to be searched exist in the text; or, alternatively,
first converting the target entity list into a word-vector list, converting the extracted entity into a word vector as well, comparing the extracted entity's word vector with the word vectors in the list, and judging whether the target entity list contains an entity semantically identical to the extracted entity; if so, the extracted entity is added to the subsequent knowledge-extraction work, and fuzzy entity extraction is realized on this basis;
the input of the information-extraction model UIE comprises two parameters, schema and text: schema is the entity type to extract, and text is the text to be extracted.
Further, in said step 34:
given a text S containing a set of recognized mentions $M = \{m_1, m_2, \dots, m_n\}$, the goal of the entity-linking system is to find a mapping that links each mention $m_i$ to its target entity $e_i$; the target entity $e_i$ refers to a specific page in the knowledge base, or it is predicted that the mentioned entity does not exist in the corresponding graph; before entity disambiguation, for each mention $m_i$ the potential candidate entities $O_i = \{e_{i1}, e_{i2}, \dots, e_{iK}\}$ are first selected from the specified graph, and each candidate $e_{ij}$ has a corresponding description $D_{ij}$ in the knowledge graph as a supporting description; K is a predefined parameter used to prune the candidate set.
Further, the step 4 includes:
step 41: after the question text Quest is input, setting the schema list required by the UIE model according to the question scenario and returning the question Entity; question-related subgraph information KG is returned by the subgraph retrieval method of step 3;
step 42: combining the question information, question entity and graph subgraph information of step 41 to construct the question-answer Prompt statement;
step 43: model reasoning: the Prompt of step 42 and History are input, where History is empty on the first input and is used to store the user's question-answer dialogue records; the large language model finally generates output in plain string format, and the question answer is returned after regex matching, word segmentation and structured assembly.
The invention also discloses a system suitable for the knowledge graph generation type question-answering method based on the large language model, which comprises:
the building module is used for building the large language model fine tuning training data, wherein the training data comprises a prompt sentence, a question set and an answer set; the prompt statement comprises a prompt template and instance data;
the fine tuning module is used for fine tuning the large language model based on the LoRA;
the providing module is used for providing a question-answer knowledge base for the large language model subjected to LoRA fine tuning through the subgraph searching strategy;
the generation module is used for taking the large language model subjected to LoRA fine tuning as a question-answer reasoning model, inputting the question text into the question-answer reasoning model, and generating a question answer based on the provided question-answer knowledge base.
Due to the adoption of the technical scheme, the invention has the following advantages:
1. according to the knowledge-graph-based large-model generative question-answering method, the large model does not generate answers from the question text alone; instead, graph information and the question are combined into model prompt statements from which the answer is generated, making answers more accurate and traceable.
2. compared with traditional knowledge-graph question answering, the method has stronger question generalization, embodied in both question understanding and the types of questions that can be answered. In question understanding, the large language model has stronger semantic-understanding capability than a traditional graph question-answering model and accommodates more colloquial, more natural user questions; in answerable question types, traditional graph question answering only supports questions about graph relations and attributes and cannot answer the specific content contained in nodes, whereas the large model supports questions about all content contained in graph nodes, relations and attributes, breaking through the limits imposed by the graph structure.
3. compared with traditional knowledge-graph question answering, the method has stronger reasoning and statistical-analysis capability. Traditional knowledge-graph question answering offers single-hop and multi-hop capability but no reasoning or statistics; reasoning or statistics can only be realized through question-answering strategies and rules built for specific scenarios, which no longer apply once the business scenario changes, so there is no generalization capability. The large model has strong semantic understanding, can understand complex questions and clarify the query intent, possesses chain-of-thought reasoning, can perform inference over the provided knowledge-graph data, and adds incremental capabilities such as multi-hop reasoning and statistical analysis.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for describing the embodiments are briefly introduced below; the drawings described below are obviously only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a knowledge graph generation type question-answering method based on a large language model according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a LoRA-based fine tuning large model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a graph correlation sub-graph retrieval process according to an embodiment of the present invention;
fig. 4 is a schematic diagram of an answer generation flow based on the ChatGLM-6B model according to an embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and embodiments; the embodiments described are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on this basis fall within the protection scope of the present invention.
Referring to fig. 1, the present invention provides an embodiment of a knowledge graph generation type question-answering method based on a large language model, which includes the steps of:
step 1: building large language model fine tuning training data, wherein the training data comprises prompt sentences, a question set and an answer set, and the prompt sentences comprise a prompt template and instance data;
step 1.1: constructing a question set and an answer set of training data;
in this embodiment, triple-structured data of the domain knowledge graph is acquired and a BERT-based pre-trained model is called to generate graph single-hop and multi-hop question sets and the Answer set; for a graph triple triple = [node1, rel, node2], node1 is the head entity, rel is the relation edge, and node2 is the tail entity; the pre-trained model takes [node1, rel] as input and outputs a question, while node2 is the question answer. For example: triple = [Mid-Autumn Festival, lunar date, fifteenth day of the eighth lunar month]; model output Quest: On which lunar month and day does the Mid-Autumn Festival fall? Answer: fifteenth day of the eighth lunar month. A batch of question-answer data is automatically constructed in this manner, with 60% of the data used for model training, 20% for model validation, and 20% for effect testing after training;
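A minimal Python sketch of this generation step is given below; since the patent does not name its BERT-based question-generation model, the seq2seq pipeline (t5-small) and the build_qa_pairs helper are illustrative assumptions, not the production setup:

import random
from transformers import pipeline

# Placeholder question generator; the patent's actual BERT-based model is unspecified.
question_gen = pipeline("text2text-generation", model="t5-small")

def build_qa_pairs(triples):
    """triples: list of [node1, rel, node2] taken from the domain knowledge graph."""
    qa_pairs = []
    for node1, rel, node2 in triples:
        # [node1, rel] is the generation input; node2 is the gold answer.
        quest = question_gen(f"generate question: {node1} {rel}", max_length=64)[0]["generated_text"]
        qa_pairs.append({"question": quest, "answer": node2})
    return qa_pairs

pairs = build_qa_pairs([["Mid-Autumn Festival", "lunar date", "fifteenth day of the eighth lunar month"]])
random.shuffle(pairs)  # then split 60% train / 20% validation / 20% test, as described above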
step 1.2: constructing a prompt sentence;
in this embodiment, structured instance data is obtained through the domain knowledge graph, the data is preprocessed, and the entities and associated triple information of the graph data are labeled, so that the knowledge-graph data is $X = \{name: name, k_1: v_1, k_2: v_2, \dots, k_n: v_n\}$, where X is the relation data of one entity node in the knowledge graph, name is the node name, $k_n$ is an attribute or relation-edge name of the node, and $v_n$ is the corresponding attribute value or the name of the node connected by the relation edge; after preprocessing, the graph data X is converted into list data $X' = [[name, k_1, v_1], [name, k_2, v_2], \dots, [name, k_n, v_n]]$, and X' is the instance data of the model training data. A prompt template oriented to the knowledge-graph question-answering task is then constructed, P = [Please answer the question according to the following triple information: xxx; triple information: xxx; requirement: if there is no related answer, output "no answer for now"; if there is an answer, output only the answer, without the reasoning and analysis process]. Adding instance data to the prompt template generates the model-training Prompt statement, for example: Prompt = [Please answer the question according to the following triple information: When was Wenxin Yiyan opened to the whole of society? Triple information: [[Wenxin Yiyan, launch time, service online], [Wenxin Yiyan, developer, Baidu], [Wenxin Yiyan, English name, ERNIE Bot]]; requirement: if there is no related answer, output "no answer for now"; if there is an answer, output only the answer, without the reasoning and analysis process];
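The flattening and templating above can be sketched as follows; the template wording paraphrases the patent's example and the helper names are assumptions:

def graph_to_list(x: dict) -> list:
    """X = {"name": name, k1: v1, ...} -> X' = [[name, k1, v1], ...]."""
    name = x["name"]
    return [[name, k, v] for k, v in x.items() if k != "name"]

def build_prompt(question: str, x_list: list) -> str:
    # Paraphrase of the patent's prompt template P; not the exact production wording.
    return (
        "Please answer the question according to the following triple information: "
        f"{question} Triple information: {x_list} "
        'Requirement: if there is no related answer, output "no answer for now"; '
        "if there is an answer, output only the answer, without the reasoning process."
    )

x = {"name": "Wenxin Yiyan", "developer": "Baidu", "English name": "ERNIE Bot"}
prompt = build_prompt("Which company developed Wenxin Yiyan?", graph_to_list(x))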
step 2: fine tuning a large language model based on the LoRA;
referring to fig. 2, LoRA is an efficient and flexible fine-tuning approach: compared with full fine-tuning of a large model, the trainable parameters of LoRA fine-tuning are greatly reduced while the post-fine-tuning effect is essentially close to full fine-tuning, so strong general generation capability is retained. First, the LoRA model parameters are split into two parts: the pre-training weight $w \in \mathbb{R}^{d \times d}$ and the fine-tune delta weight $\Delta w \in \mathbb{R}^{d \times d}$, where w is the frozen pre-training weight and $\Delta w$ is the weight update produced during fine-tuning; with input x and output h, $h = wx + \Delta w x$. During training the pre-training weight w is fixed and $\Delta w$ is approximated by two low-rank matrices A and B, i.e. $h = wx + BAx$ with $A \in \mathbb{R}^{r \times d}$ and $B \in \mathbb{R}^{d \times r}$; A is initialized with a Gaussian distribution and B with zeros, so that B is 0 at the start of training and brings no extra noise to the model; the low-rank matrices A and B are trained, and only the low-rank matrix part is saved as the model weight;
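A minimal PyTorch sketch of the decomposition $h = wx + BAx$ follows; the rank r, dimensions and initialization scale are illustrative, and in practice the frozen weight comes from the pre-trained ChatGLM-6B layers rather than being randomly created:

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """h = Wx + BAx with W frozen; only A (Gaussian init) and B (zero init) are trained."""
    def __init__(self, d: int, r: int):
        super().__init__()
        self.W = nn.Parameter(torch.randn(d, d), requires_grad=False)  # stand-in for a frozen pre-trained weight
        self.A = nn.Parameter(torch.randn(r, d) * 0.01)                # A in R^{r x d}, Gaussian init
        self.B = nn.Parameter(torch.zeros(d, r))                       # B in R^{d x r}, zero init -> BA = 0 at start

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.W.T + (x @ self.A.T) @ self.B.T  # Wx + B(Ax)

    def merged_weight(self) -> torch.Tensor:
        """Post-training merge described below: the new weight matrix is w + BA, i.e. h = (w + BA)x."""
        return self.W + self.B @ self.A

layer = LoRALinear(d=64, r=8)
h = layer(torch.randn(2, 64))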
the training data of step 1 is tokenized and the training parameters are set, with train_step = 5, train_epoch = 20, training_rate = 0.0001; the ChatGLM-6B model is loaded, fine-tuning training is started, and the model weights with the lowest Loss value during training are saved;
after fine-tuning, the model effect is verified with the 20% of the constructed data reserved for testing; since the pre-training weight w of the LoRA model is unchanged, the trained matrix product BA is simply added to the originally fixed weight matrix w as the new weight matrix, i.e. $h = (w + BA)x$, and h replaces the original pre-trained language-model parameters as the new weight parameters; the model's test result res1 is compared with the pre-training test result res0, using the question-answer average accuracy Acc as the comparison metric, calculated from the following quantities:
n is the total number of test samples; $x_i$ is the indicator for test-set question $q_i$, taking the value 1 if the predicted answer is identical to the standard answer and 0 otherwise; $t_i$ is the time required to return an answer for test-set question $q_i$. If res1 > res0, fine-tuning has improved the model and it can serve as the inference model for subsequent graph question answering; if res1 < res0, the training parameters are readjusted and step 2 is repeated until the fine-tuned test result is better than the pre-training result;
step 3: retrieving graph-related subgraphs;
a large language model has free text-generation capability, and in the graph question-answering task, if a question has no relevant answer, the model may generate an irrelevant answer from its learned knowledge, harming the accuracy and user experience of the question-answering system. This embodiment therefore provides a knowledge-graph-based related-subgraph retrieval strategy: the subgraph information serves as a designated knowledge base, the large model generates answers based on that knowledge base, and answer tracing is supported. The specific implementation steps are as follows (see fig. 3);
step 3.1: constructing a vertical-domain knowledge-graph database and an entity-name vector library; structured entity-relation-entity and entity-attribute-value triple data are imported into the Neo4j graph library in batches; after the knowledge graph is built, entity nodes are retrieved by label, entity names are converted into text vectors with the TransE method, and the vectors and node IDs are stored in the vector library;
the TransE method defines a relation in the knowledge graph as a mapping from head entity $e_1$ to tail entity $e_2$. Based on this idea, given a head entity $e_1$, the corresponding tail entity $e_2$ can be predicted from the relation representation in the knowledge graph; likewise, the head entity can be predicted from the relation representation and the tail entity. Borrowing the translation idea, the relation between entities is represented as $r = e_1 - e_2$; meanwhile, the higher the similarity between the feature expression vector $s_i$ of an entity co-occurrence sentence and the target relation vector r, the higher the probability that the sentence correctly expresses the target relation, and the higher the attention weight $\alpha_i$;
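A small numpy sketch of this translation idea follows; the embedding sources and dimensions are illustrative, and the softmax weighting is one plausible realization of the attention weights described above:

import numpy as np

def relation_vector(e1: np.ndarray, e2: np.ndarray) -> np.ndarray:
    """Translation idea as defined above: r = e1 - e2."""
    return e1 - e2

def attention_weights(sentence_vecs: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Weight each co-occurrence sentence vector s_i by its cosine similarity to r:
    the more similar s_i is to the target relation, the higher its weight."""
    sims = sentence_vecs @ r / (np.linalg.norm(sentence_vecs, axis=1) * np.linalg.norm(r) + 1e-8)
    exp = np.exp(sims - sims.max())
    return exp / exp.sum()

e1, e2 = np.random.rand(100), np.random.rand(100)   # illustrative entity embeddings
weights = attention_weights(np.random.rand(5, 100), relation_vector(e1, e2))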
step 3.2: recognizing the question entity. Because different question types involve different entity types, a traditional entity-recognition model can only recognize the entity types specified during training; the invention therefore proposes to complete the entity-recognition task with the information-extraction model UIE. In open-domain information extraction the extracted types are not limited and can be defined by the user, so key-information extraction with undefined domains and extraction targets is supported, enabling zero-shot rapid cold start together with few-shot fine-tuning capability;
when the UIE framework is used, a target query entity list must first be built by screening the nodes of the knowledge graph and extracting the target entities, and a corresponding AC-automaton model is constructed. After the text data is obtained and named entities are recognized, there are two ways to judge whether an extracted entity is a target entity: one is to judge directly, through the AC automaton, which entities to be searched exist in the text; the other is to first convert the target entity list into a word-vector list, convert the extracted entity into a word vector as well, compare the extracted entity's word vector with the word vectors in the list, and judge whether the target entity list contains an entity semantically identical to the extracted entity; if so, the extracted entity is added to the subsequent knowledge-extraction work, and fuzzy entity extraction is realized on this basis;
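The first judgment method can be sketched with the pyahocorasick library; the library choice is an assumption, since the patent does not name a specific AC-automaton implementation:

import ahocorasick  # pip install pyahocorasick; library choice is an assumption

def build_automaton(target_entities):
    automaton = ahocorasick.Automaton()
    for idx, name in enumerate(target_entities):
        automaton.add_word(name, (idx, name))
    automaton.make_automaton()
    return automaton

def match_targets(automaton, text):
    """Return the target entities that literally occur in the text."""
    return {name for _, (_, name) in automaton.iter(text)}

A = build_automaton(["ERNIE Bot", "ChatGLM-6B", "Baidu"])
print(match_targets(A, "ERNIE Bot was developed by Baidu."))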
the UIE model input contains two parameters, schema and text. The schema is the entity type to extract and can be user-defined, such as "time", "player" and "event name", constructed as ['time', 'player', 'event name']; text is the text to be extracted, such as "On the morning of a certain day of a certain month, a certain player won the gold medal with a high score in a certain event!". The model then outputs a result of the form {'time': ..., 'player': ..., 'event name': ...};
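The schema-driven call below follows PaddleNLP's Taskflow interface for UIE; treat the default model and the English schema as assumptions, since the released UIE models are primarily trained on Chinese text:

from paddlenlp import Taskflow  # pip install paddlenlp

# schema defines the entity types to extract; it is user-definable
schema = ["time", "player", "event name"]
ie = Taskflow("information_extraction", schema=schema)

text = ("On the morning of a certain day of a certain month, a certain player "
        "won the gold medal with a high score in a certain event!")
print(ie(text))  # -> [{'time': [...], 'player': [...], 'event name': [...]}]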
step 3.3: the recognition result of step 3.2 is converted into a text vector $x_1$ and compared by cosine similarity with all vectors in the name vector library constructed in step 3.1. Let the question entity vector be $x_1 = (x_{11}, x_{12}, \dots, x_{1k})$ and let any vector in the name vector library be $x_2 = (x_{21}, x_{22}, \dots, x_{2k})$; the cosine similarity is $s = \frac{\sum_{j=1}^{k} x_{1j} x_{2j}}{\sqrt{\sum_{j=1}^{k} x_{1j}^2} \sqrt{\sum_{j=1}^{k} x_{2j}^2}}$. The more similar the two vectors, the larger the absolute value of the cosine similarity s; therefore the vector with the maximum cosine similarity to $x_1$, together with its node id, is finally returned from the vector library;
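A numpy sketch of this nearest-name lookup is given below; the brute-force scan stands in for whatever index the vector database actually uses:

import numpy as np

def best_match(x1: np.ndarray, name_vectors: np.ndarray, node_ids: list):
    """Return the (vector, node id, score) with maximum cosine similarity to x1."""
    sims = name_vectors @ x1 / (np.linalg.norm(name_vectors, axis=1) * np.linalg.norm(x1) + 1e-8)
    i = int(np.argmax(sims))
    return name_vectors[i], node_ids[i], float(sims[i])

library = np.random.rand(1000, 128)            # illustrative entity-name vectors
vec, node_id, score = best_match(np.random.rand(128), library, list(range(1000)))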
step 3.4: entity linking links the entity mention to the corresponding entity in the knowledge base, i.e. to a knowledge-graph node. Given a text S containing a set of recognized mentions $M = \{m_1, m_2, \dots, m_n\}$, the goal of the entity-linking system is to find a mapping that links each mention $m_i$ to its target entity $e_i$; the target entity $e_i$ refers to a specific page in the knowledge base, or it is predicted that the mentioned entity does not exist in the corresponding graph. Before entity disambiguation, for each mention $m_i$ the potential candidate entities $O_i = \{e_{i1}, e_{i2}, \dots, e_{iK}\}$ are first selected from the specified graph, where K is a predefined parameter used to prune the candidate set. Notably, each candidate $e_{ij}$ has a corresponding description $D_{ij}$ in the knowledge graph as a supporting description. Therefore, based on the node id returned in step 3.3, the corresponding node description can be found in the candidate knowledge graph. Taking that graph node as the center, all graph data connected by relation edges is retrieved and returned as a subgraph, providing data support for subsequent question answering;
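A sketch of this one-hop subgraph retrieval against the Neo4j library mentioned in step 3.1 follows; the connection details and the name property are assumptions about how the graph was loaded:

from neo4j import GraphDatabase  # pip install neo4j

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))  # assumed credentials

def one_hop_subgraph(node_id: int):
    """All triples whose head or tail is the linked node, returned as subgraph KG."""
    query = (
        "MATCH (n)-[r]-(m) WHERE id(n) = $node_id "
        "RETURN n.name AS head, type(r) AS rel, m.name AS tail"
    )
    with driver.session() as session:
        return [[rec["head"], rec["rel"], rec["tail"]] for rec in session.run(query, node_id=node_id)]

kg = one_hop_subgraph(42)  # node id returned by the vector-library lookup of step 3.3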
step 4: returning a generated answer based on the ChatGLM-6B model;
the ChatGLM-6B large model fine-tuned with LoRA in step 2 is used as the question-answering inference model, and the subgraph retrieval strategy of step 3 provides the question-answer knowledge base, realizing generative question answering with knowledge embedded from an external graph base; the specific flow is shown in fig. 4;
step 4.1: after the question text Quest is input, the schema list required by the UIE model is set according to the question scenario and the question Entity is returned; question-related subgraph information KG is returned by the subgraph retrieval method of step 3;
step 4.2: the question information, question entity and graph subgraph information of step 4.1 are combined to construct the question-answer Prompt statement, Prompt = [Please answer the question according to the following triple information and return an answer; if there is no related answer, return "no answer for now"; if there is, output the answers in list form, without the reasoning process.\nQuestion: Quest\nTriple information: KG];
step 4.3: model reasoning: the result Prompt of step 4.2 and History are input, where History is empty on the first input and is used to store the user's question-answer dialogue records. The ChatGLM-6B model has multi-turn interactive dialogue capability and supports repeated queries by the user about a topic event or target; the final model output is in plain string format, and the question answer is returned after regex matching, word segmentation and structured assembly.
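A sketch of step 4.3 using ChatGLM-6B's published chat interface follows; the regex post-processing is schematic, since the patent does not give the actual patterns used for word segmentation and structured assembly:

import re
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda().eval()

history = []  # empty on the first turn; accumulates the question-answer dialogue records

def answer(prompt: str) -> str:
    global history
    # ChatGLM-6B's multi-turn API: returns the response plus the updated history.
    response, history = model.chat(tokenizer, prompt, history=history)
    # Schematic post-processing stand-in for the patent's regex matching and assembly.
    return re.sub(r"\s+", " ", response).strip()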
The invention also provides an embodiment of a knowledge graph generation type question-answering system based on a large language model, applicable to the above method embodiment; the system comprises:
the building module is used for building the large language model fine tuning training data, wherein the training data comprises a prompt sentence, a question set and an answer set; the prompt statement comprises a prompt template and instance data;
the fine tuning module is used for fine tuning the large language model based on the LoRA;
the providing module is used for providing a question-answer knowledge base for the large language model subjected to LoRA fine tuning through the subgraph searching strategy;
the generation module is used for taking the large language model subjected to LoRA fine tuning as a question-answer reasoning model, inputting the question text into the question-answer reasoning model, and generating a question answer based on the provided question-answer knowledge base.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the invention without departing from the spirit and scope of the invention, which is intended to be covered by the claims.

Claims (10)

1. A knowledge graph generation type question-answering method based on a large language model is characterized by comprising the following steps:
step 1: building large-language-model fine-tuning training data, wherein the training data comprises a prompt statement, a question set and an answer set; the prompt statement comprises a prompt template and instance data;
step 2: fine-tuning a large language model based on LoRA;
step 3: providing a question-answer knowledge base for the LoRA-fine-tuned large language model through a subgraph retrieval strategy;
step 4: taking the LoRA-fine-tuned large language model as the question-answering inference model, inputting the question text into the inference model, and generating the question answer based on the knowledge base provided in step 3.
2. The method according to claim 1, wherein the step 1 comprises:
step 11: constructing a question set and an answer set of training data:
generating graph single-hop and multi-hop question sets and the corresponding Answer set by acquiring triple-structured data of the domain knowledge graph and calling a BERT-based pre-trained model; for a graph triple triple = [node1, rel, node2], node1 is the head entity, rel is the relation edge, and node2 is the tail entity; the pre-trained model takes [node1, rel] as input and outputs a question according to [node1, rel], while node2 serves as the question Answer, so that a batch of question-answer data is constructed automatically;
step 12: constructing a prompt statement of training data:
obtaining structured instance data through the domain knowledge graph: labeling the entities and associated triple information of the graph data so that the knowledge-graph data is $X = \{name: name, k_1: v_1, k_2: v_2, \dots, k_n: v_n\}$, where X is the relation data of one entity node in the knowledge graph, name is the node name, $k_n$ is an attribute or relation-edge name of the node, and $v_n$ is the corresponding attribute value or the name of the node connected by the relation edge; converting the graph data X into list data $X' = [[name, k_1, v_1], [name, k_2, v_2], \dots, [name, k_n, v_n]]$, where X' is the instance data of the model training data;
then constructing a prompt template P facing the knowledge graph question-answering task; and adding the instance data to the Prompt template to generate a Prompt statement Prompt of model training.
3. The method according to claim 1, wherein the step 2 comprises:
step 21: splitting the LoRA model parameters into two parts: the pre-training weight $w \in \mathbb{R}^{d \times d}$ and the fine-tune delta weight $\Delta w \in \mathbb{R}^{d \times d}$, where w is the frozen pre-training weight and $\Delta w$ is the weight update produced during fine-tuning; let the input be x and the output be h, then $h = wx + \Delta w x$; during training the pre-training weight w is fixed and $\Delta w$ is approximated by two low-rank matrices A and B, i.e. $h = wx + BAx$ with $A \in \mathbb{R}^{r \times d}$ and $B \in \mathbb{R}^{d \times r}$; A is initialized with a Gaussian distribution and B with zeros; the low-rank matrices A and B are trained during training, and only the low-rank matrix part is saved as the model weight;
step 22: tokenizing the training data of step 1, setting the training parameters, loading the ChatGLM-6B model, starting fine-tuning training, and saving the model weights with the lowest Loss value during training;
after fine-tuning, verifying the model effect with part of the training data: the trained matrix product BA is added to the originally fixed weight matrix w as the new weight matrix, i.e. $h = (w + BA)x$, and the weight matrix h replaces the original pre-trained language-model parameters as the new weight parameters; the model's test result res1 is compared with the pre-training test result res0, using the question-answer average accuracy Acc as the comparison metric;
if res1 > res0, fine-tuning has improved the model and it serves as the inference model for subsequent graph question answering; if res1 < res0, the training parameters are readjusted and steps 21 and 22 are repeated until the fine-tuned test result is better than the pre-training result.
4. The method according to claim 3, wherein the comparison metric is the question-answer average accuracy Acc, calculated from the following quantities:
n is the total number of test samples; $x_i$ is the indicator for test-set question $q_i$, taking the value 1 if the predicted answer is identical to the standard answer and 0 otherwise; $t_i$ is the time required to return an answer for test-set question $q_i$.
5. The method according to claim 1, wherein the step 3 comprises:
step 31: constructing a vertical-domain knowledge-graph database and an entity-name vector library; importing structured entity-relation-entity and entity-attribute-value triple data into the graph database in batches; after the knowledge graph is built, retrieving entity nodes by label, converting entity names into text vectors with the TransE method, and storing the text vectors and node IDs in a vector database;
step 32: completing the entity-recognition task with the information-extraction model UIE: when the UIE framework is used, a target query entity list is first built by screening the nodes of the knowledge graph and extracting the target entities, and a corresponding AC-automaton model is constructed; after text data is obtained and named entities are recognized, it is judged whether each extracted entity is a target entity;
step 33: converting the target entity recognized in step 32 into a text vector $x_1$, comparing $x_1$ by cosine similarity with all vectors in the name vector library constructed in step 31, and finally returning from the vector library the vector with the maximum cosine similarity to $x_1$ together with its node id;
step 34: entity linking links the entity mention to the corresponding knowledge-graph node: based on the node id returned in step 33, the corresponding node description is found in the candidate knowledge graph; taking that graph node as the center, all graph data connected by relation edges is retrieved and returned as a subgraph, providing data support for subsequent question answering.
6. The method according to claim 5, characterized in that in said step 31:
the TransE method defines a relation in the knowledge graph as a mapping from head entity $e_1$ to tail entity $e_2$: given a head entity $e_1$, the corresponding tail entity $e_2$ is predicted from the relation representation in the knowledge graph; likewise, a head entity is predicted from the relation representation and the tail entity; the relation between entities is expressed as $r = e_1 - e_2$; meanwhile, the higher the similarity between the feature expression vector $s_i$ of an entity co-occurrence sentence and the target relation vector r, the higher the probability that the co-occurrence sentence correctly expresses the target relation, and the higher the attention weight $\alpha_i$.
7. The method according to claim 5, wherein in the step 32, determining whether the extracted entity is a target entity includes:
directly judging, through the AC automaton, which entities to be searched exist in the text; or, alternatively,
first converting the target entity list into a word-vector list, converting the extracted entity into a word vector as well, comparing the extracted entity's word vector with the word vectors in the list, and judging whether the target entity list contains an entity semantically identical to the extracted entity; if so, the extracted entity is added to the subsequent knowledge-extraction work, and fuzzy entity extraction is realized on this basis;
the input of the information-extraction model UIE comprises two parameters, schema and text: schema is the entity type to extract, and text is the text to be extracted.
8. The method according to claim 5, characterized in that in said step 34:
given a text S containing a set of recognized mentions $M = \{m_1, m_2, \dots, m_n\}$, the goal of the entity-linking system is to find a mapping that links each mention $m_i$ to its target entity $e_i$; the target entity $e_i$ refers to a specific page in the knowledge base, or it is predicted that the mentioned entity does not exist in the corresponding graph; before entity disambiguation, for each mention $m_i$ the potential candidate entities $O_i = \{e_{i1}, e_{i2}, \dots, e_{iK}\}$ are first selected from the specified graph, and each candidate $e_{ij}$ has a corresponding description $D_{ij}$ in the knowledge graph as a supporting description; K is a predefined parameter used to prune the candidate set.
9. The method according to claim 1, wherein the step 4 comprises:
step 41: after the question text Quest is input, setting the schema list required by the UIE model according to the question scenario and returning the question Entity; question-related subgraph information KG is returned by the subgraph retrieval method of step 3;
step 42: combining the question information, question entity and graph subgraph information of step 41 to construct the question-answer Prompt statement;
step 43: model reasoning: the model inputs two parameters, History and the Prompt of step 42; History is empty on the first input and is used to store the user's question-answer dialogue records; the large language model finally generates output in plain string format, and the question answer is returned after regex matching, word segmentation and structured assembly.
10. A system adapted for the large language model based knowledge graph generation type question-answering method according to any one of claims 1 to 9, the system comprising:
the building module is used for building the large language model fine tuning training data, wherein the training data comprises a prompt sentence, a question set and an answer set; the prompt statement comprises a prompt template and instance data;
the fine tuning module is used for fine tuning the large language model based on the LoRA;
the providing module is used for providing a question-answer knowledge base for the large language model subjected to LoRA fine tuning through the subgraph searching strategy;
the generation module is used for taking the large language model subjected to LoRA fine tuning as a question-answer reasoning model, inputting the question text into the question-answer reasoning model, and generating a question answer based on the provided question-answer knowledge base.
CN202311266604.6A 2023-09-28 2023-09-28 Knowledge graph generation type question-answering method and system based on large language model Active CN117033608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311266604.6A CN117033608B (en) 2023-09-28 2023-09-28 Knowledge graph generation type question-answering method and system based on large language model

Publications (2)

Publication Number Publication Date
CN117033608A (en) 2023-11-10
CN117033608B (en) 2023-12-22

Family

ID=88632069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311266604.6A Active CN117033608B (en) 2023-09-28 2023-09-28 Knowledge graph generation type question-answering method and system based on large language model

Country Status (1)

Country Link
CN (1) CN117033608B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111949787A (en) * 2020-08-21 2020-11-17 平安国际智慧城市科技股份有限公司 Automatic question-answering method, device, equipment and storage medium based on knowledge graph
US20220383126A1 (en) * 2021-05-19 2022-12-01 Microsoft Technology Licensing, Llc Low-Rank Adaptation of Neural Network Models
CN115687595A (en) * 2022-11-15 2023-02-03 浙江大学 Comparison and interpretation generation method based on template prompt and oriented to common sense question answering
CN115840805A (en) * 2022-12-14 2023-03-24 河北工业大学 Method for constructing intelligent question-answering system based on knowledge graph of computer science
CN116028613A (en) * 2023-03-29 2023-04-28 上海数字大脑科技研究院有限公司 General knowledge question answering method, system, computer device and storage medium
CN116776895A (en) * 2023-06-06 2023-09-19 江西师范大学 Knowledge-guided large language model query clarification method and system for API recommendation
CN116502711A (en) * 2023-06-27 2023-07-28 北京智谱华章科技有限公司 Knowledge graph construction and dynamic expansion method, device, equipment and medium
CN116775906A (en) * 2023-06-29 2023-09-19 中科云谷科技有限公司 Knowledge graph construction method, system, computer equipment and storage medium
CN116703337A (en) * 2023-08-08 2023-09-05 金现代信息产业股份有限公司 Project document examination system and method based on artificial intelligence technology

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GUANGYU WANG et al.: "ClinicalGPT: Large Language Models Finetuned with Diverse Medical Data and Comprehensive Evaluation", arXiv preprint, https://doi.org/10.48550/arXiv.2306.09968, pages 1-11 *
ZHANGKUI LIU et al.: "KBMQA: medical question and answering model based on Knowledge Graph and BERT", Second International Conference on Electronic Information Technology (EIT 2023), vol. 12719, pages 1-5 *

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235240B (en) * 2023-11-14 2024-02-20 神州医疗科技股份有限公司 Multi-model result fusion question-answering method and system based on asynchronous consumption queue
CN117235240A (en) * 2023-11-14 2023-12-15 神州医疗科技股份有限公司 Multi-model result fusion question-answering method and system based on asynchronous consumption queue
CN117235211B (en) * 2023-11-15 2024-03-19 暗物智能科技(广州)有限公司 Knowledge question-answering method and system
CN117235211A (en) * 2023-11-15 2023-12-15 暗物智能科技(广州)有限公司 Knowledge question-answering method and system
CN117252251A (en) * 2023-11-20 2023-12-19 新华三技术有限公司 Private domain data generation method, device, equipment and storage medium
CN117252251B (en) * 2023-11-20 2024-03-12 新华三技术有限公司 Private domain data generation method, device, equipment and storage medium
CN117539996A (en) * 2023-11-21 2024-02-09 北京拓医医疗科技服务有限公司 Consultation question-answering method and system based on user portrait
CN117272052A (en) * 2023-11-22 2023-12-22 北京壹永科技有限公司 Large language model training method, device, equipment and storage medium
CN117272052B (en) * 2023-11-22 2024-02-09 北京壹永科技有限公司 Large language model training method, device, equipment and storage medium
CN117312535A (en) * 2023-11-28 2023-12-29 中国平安财产保险股份有限公司 Method, device, equipment and medium for processing problem data based on artificial intelligence
CN117371973A (en) * 2023-12-06 2024-01-09 武汉科技大学 Knowledge-graph-retrieval-based enhanced language model graduation service system
CN117370994A (en) * 2023-12-08 2024-01-09 浙江君同智能科技有限责任公司 Large language model vulnerability detection method and device based on fuzzy test
CN117370994B (en) * 2023-12-08 2024-02-27 浙江君同智能科技有限责任公司 Large language model vulnerability detection method and device based on fuzzy test
CN117390169B (en) * 2023-12-11 2024-04-12 季华实验室 Form data question-answering method, device, equipment and storage medium
CN117390169A (en) * 2023-12-11 2024-01-12 季华实验室 Form data question-answering method, device, equipment and storage medium
CN117421415A (en) * 2023-12-18 2024-01-19 北京海纳数聚科技有限公司 Data processing method, device, electronic equipment and storage medium
CN117669737A (en) * 2023-12-20 2024-03-08 中科星图数字地球合肥有限公司 Method for constructing and using large language model in end-to-end geographic industry
CN117669737B (en) * 2023-12-20 2024-04-26 中科星图数字地球合肥有限公司 Method for constructing and using large language model in end-to-end geographic industry
CN117494806B (en) * 2023-12-28 2024-03-08 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Relation extraction method, system and medium based on knowledge graph and large language model
CN117494806A (en) * 2023-12-28 2024-02-02 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Relation extraction method, system and medium based on knowledge graph and large language model
CN117520524B (en) * 2024-01-04 2024-03-29 北京环球医疗救援有限责任公司 Intelligent question-answering method and system for industry
CN117520524A (en) * 2024-01-04 2024-02-06 北京环球医疗救援有限责任公司 Intelligent question-answering method and system for industry
CN117540811A (en) * 2024-01-09 2024-02-09 北京大学深圳研究生院 System and method for solving illusion problem of large language model
CN117540811B (en) * 2024-01-09 2024-04-09 北京大学深圳研究生院 System and method for solving illusion problem of large language model
CN117573844A (en) * 2024-01-15 2024-02-20 深圳市加推科技有限公司 Data recommendation method and device based on context awareness and related medium
CN117573844B (en) * 2024-01-15 2024-04-05 深圳市加推科技有限公司 Data recommendation method and device based on context awareness and related medium
CN117591661A (en) * 2024-01-18 2024-02-23 之江实验室 Question-answer data construction method and device based on large language model
CN117591661B (en) * 2024-01-18 2024-04-26 之江实验室 Question-answer data construction method and device based on large language model
CN117725188A (en) * 2024-02-08 2024-03-19 亚信科技(中国)有限公司 Question and answer method and device based on artificial intelligence, electronic equipment and storage medium
CN117743315A (en) * 2024-02-20 2024-03-22 浪潮软件科技有限公司 Method for providing high-quality data for multi-mode large model system

Also Published As

Publication number Publication date
CN117033608B (en) 2023-12-22

Similar Documents

Publication Publication Date Title
CN117033608B (en) Knowledge graph generation type question-answering method and system based on large language model
CN109885672B (en) Question-answering type intelligent retrieval system and method for online education
CN108829757B (en) Intelligent service method, server and storage medium for chat robot
CN110502621B (en) Question answering method, question answering device, computer equipment and storage medium
CN110188358B (en) Training method and device for natural language processing model
CN111767368B (en) Question-answer knowledge graph construction method based on entity link and storage medium
CN109271505A (en) A kind of question answering system implementation method based on problem answers pair
CN110727779A (en) Question-answering method and system based on multi-model fusion
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
CN108932342A (en) A kind of method of semantic matches, the learning method of model and server
CN113360616A (en) Automatic question-answering processing method, device, equipment and storage medium
CN111159367B (en) Information processing method and related equipment
CN114036281B (en) Knowledge graph-based citrus control question-answering module construction method and question-answering system
CN112328800A (en) System and method for automatically generating programming specification question answers
CN115599899B (en) Intelligent question-answering method, system, equipment and medium based on aircraft knowledge graph
CN111666374A (en) Method for integrating additional knowledge information into deep language model
CN113377844A (en) Dialogue type data fuzzy retrieval method and device facing large relational database
CN116562280A (en) Literature analysis system and method based on general information extraction
CN113468311B (en) Knowledge graph-based complex question and answer method, device and storage medium
KR102439321B1 (en) System for Providing Semantic Analysis Finding and Analyzing Sentence Meaning
Duan et al. A Neural Network-Powered Cognitive Method of Identifying Semantic Entities in Earth Science Papers
CN115730058A (en) Reasoning question-answering method based on knowledge fusion
CN115577080A (en) Question reply matching method, system, server and storage medium
Karpagam et al. Deep learning approaches for answer selection in question answering system for conversation agents
CN115983269A (en) Intelligent community data named entity identification method, terminal and computer medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant