CN114357195A - Knowledge graph-based question-answer pair generation method, device, equipment and medium - Google Patents

Knowledge graph-based question-answer pair generation method, device, equipment and medium

Info

Publication number
CN114357195A
Authority
CN
China
Prior art keywords
question
template
entity
templates
answer pair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210029916.4A
Other languages
Chinese (zh)
Inventor
叶泳坚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority to CN202210029916.4A
Publication of CN114357195A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a knowledge graph-based question-answer pair generation method, device, equipment and medium, belonging to the technical field of artificial intelligence. The knowledge graph-based question-answer pair generation method comprises the following steps: generating a plurality of question templates based on an original question; normalizing each of the question templates based on a set knowledge graph; acquiring, from the knowledge graph, question-answer pairs respectively corresponding to the question templates according to the normalized question templates; and storing the question-answer pairs in a question-answer pair library, so that an answer corresponding to a user question can be obtained from the question-answer pair library when the user question is received. The application can improve the practicability of intelligent question answering.

Description

Knowledge graph-based question-answer pair generation method, device, equipment and medium
Technical Field
The application belongs to the technical field of artificial intelligence, and particularly relates to a question-answer pair generation method based on a knowledge graph, a question-answer pair generation device based on the knowledge graph, a computer readable medium and electronic equipment.
Background
Intelligent question answering is an important scenario in which artificial intelligence is currently put into practical use. As internet services grow more complex and the cost of manual customer service rises, intelligent question answering becomes more and more important. It not only saves the time users spend waiting for manual customer service, but also greatly reduces operating costs. Intelligent question answering can resolve most user questions, so that scarce manual customer service resources can be reserved for the small number of difficult questions, which reduces users' queuing time and improves overall operating efficiency.
However, existing knowledge graph-based intelligent question answering usually needs a large amount of labeled data for training, and the effect of the model depends on the accuracy of the labeled data. The amount of computation is large, the cost is high, and the applicable scenarios are limited, so its practicability is low.
Therefore, a solution is needed to improve the practicability of intelligent question answering.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present application and therefore may include information that does not constitute prior art known to a person of ordinary skill in the art.
Disclosure of Invention
The application aims to provide a knowledge graph-based question-answer pair generation method, device, equipment and medium, which solve, at least to some extent, the technical problem that the practicability of intelligent question answering in the related art is low.
Other features and advantages of the present application will be apparent from the following detailed description, or may be learned by practice of the application.
According to an aspect of an embodiment of the present application, there is provided a knowledge graph-based question-answer pair generation method, including: generating a plurality of question templates based on an original question; normalizing each of the question templates based on a set knowledge graph; acquiring, from the knowledge graph, question-answer pairs corresponding to the question templates according to the normalized question templates; and storing the question-answer pairs in a question-answer pair library, so that an answer corresponding to a user question can be obtained from the question-answer pair library when the user question is received.
In some embodiments, the question template includes a first entity, a second entity, and a link relation between the first entity and the second entity. Normalizing each of the question templates based on the set knowledge graph includes: acquiring, from the question template, the first entity, the second entity and the link relation corresponding to the question template; and normalizing the first entity, the second entity and the link relation respectively, so that the expressions of the first entity, the second entity and the link relation match the expressions in the set knowledge graph.
In some embodiments, acquiring, from the set knowledge graph, the question-answer pairs respectively corresponding to the question templates according to the normalized question templates includes: extracting a triple template set from the set knowledge graph; matching each normalized question template against the triple templates according to a preset matching condition; and extracting the target question-answer pairs from the triple templates that meet the preset matching condition, and using the target question-answer pairs as the question-answer pairs of the question template matched with those triple templates.
In some embodiments, before extracting the target question-answer pairs from the triple templates that meet the preset matching condition and using them as the question-answer pairs of the question template matched with those triple templates, the method further includes: vectorizing the question template and each triple template respectively; calculating the vector dot product between the question template and each triple template; and taking each triple template whose vector dot product is greater than a set threshold as a template meeting the preset matching condition.
In some embodiments, before extracting the target question-answer pairs from the triple templates that meet the preset matching condition and using them as the question-answer pairs of the question template matched with those triple templates, the method further includes: classifying each triple template in the triple template set; calculating the similarity between the link relation of the question template and the link relation of the triple templates of each category; and taking the triple templates of the category with the highest similarity as the templates meeting the preset matching condition.
In some embodiments, generating a plurality of question templates based on the original question includes: performing entity recognition and entity link recognition on the original question to obtain the entities corresponding to the original question and the link relations between the entities; and determining a plurality of question templates corresponding to the original question according to the entities and the link relations.
In some embodiments, before performing entity recognition and entity link recognition on the original question, the method further includes: extracting, based on the usage frequency of original questions, the original questions whose usage frequency is greater than a preset frequency; and generating the question templates from the original questions whose usage frequency is greater than the preset frequency.
According to an aspect of an embodiment of the present application, there is provided a knowledge-graph-based question-answer pair generating apparatus including:
a generating unit, used for generating a plurality of question templates based on an original question;
a processing unit, used for normalizing each of the question templates based on a set knowledge graph;
a first acquisition unit, used for acquiring, from the knowledge graph, question-answer pairs respectively corresponding to the question templates according to the normalized question templates;
and a second acquisition unit, used for storing the question-answer pairs in a question-answer pair library, so that an answer corresponding to a user question can be acquired from the question-answer pair library when the user question is received.
According to an aspect of the embodiments of the present application, there is provided a computer-readable medium, on which a computer program is stored, which when executed by a processor implements a method for generating a knowledge-graph-based question-answer pair as in the above technical solutions.
According to an aspect of an embodiment of the present application, there is provided an electronic apparatus including: a processor; and a memory for storing executable instructions for the processor; wherein the processor is configured to execute the knowledge-graph based question-answer pair generating method as in the above technical solution via executing the executable instructions.
According to an aspect of embodiments herein, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device executes the method for generating the knowledge-graph-based question-answer pair according to the above technical solution.
According to the technical scheme provided by the embodiments of the application, question templates are selectively generated from the original question, the generated question templates are normalized, and a plurality of question-answer pairs corresponding to the question templates are then obtained from the set knowledge graph. On the one hand, normalizing the question templates ensures the accuracy of the generated question templates, and in turn the accuracy of the corresponding question-answer pairs obtained from the set knowledge graph; on the other hand, the obtained question-answer pairs are stored in the question-answer pair library, so that the user can query an accurately generated question-answer pair library, and access in both online and offline modes is supported. This increases the diversity of scenarios in which intelligent question answering can be used and improves its practicability.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 schematically shows a block diagram of an exemplary system architecture to which the solution of the present application applies.
Fig. 2 is a flowchart of a method for generating a knowledge-graph-based question-answer pair according to an embodiment of the present application.
Fig. 3 is a flowchart of obtaining question-answer pairs from a knowledge-graph according to an embodiment of the present application.
Fig. 4 is a flowchart of a method for determining whether a preset matching condition is met according to an embodiment of the present application.
Fig. 4a is a schematic structural diagram of a problem template provided in an embodiment of the present application.
Fig. 4b is a schematic structural diagram of a triplet template in a knowledge-graph according to an embodiment of the present application.
Fig. 5 is a flowchart of a method for determining whether a preset matching condition is met according to another embodiment of the present application.
Fig. 6 schematically shows a block diagram of a knowledge-graph-based question-answer pair generating apparatus 600 provided in an embodiment of the present application.
FIG. 7 schematically illustrates a block diagram of a computer system suitable for use in implementing an electronic device of an embodiment of the present application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the subject matter of the present application can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the application.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
Fig. 1 schematically shows a block diagram of an exemplary system architecture to which the solution of the present application applies.
As shown in fig. 1, system architecture 100 may include a terminal device 110, a network 120, and a server 130. The terminal device 110 may include various electronic devices such as a smart phone, a tablet computer, a notebook computer, and a desktop computer. The server 130 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. Network 120 may be a communication medium of various connection types capable of providing a communication link between terminal device 110 and server 130, such as a wired communication link or a wireless communication link.
The system architecture in the embodiments of the present application may have any number of terminal devices, networks, and servers, according to implementation needs. For example, the server 130 may be a server group composed of a plurality of server devices. In addition, the technical solution provided in the embodiment of the present application may be applied to the terminal device 110, or may be applied to the server 130, or may be implemented by both the terminal device 110 and the server 130, which is not particularly limited in this application.
For example, the scheme of the application is suitable for any computer device that needs to generate knowledge graph-based question-answer pairs, or for a platform consisting of a plurality of computer devices. The computer device or platform is provided with a main control program used for generating the knowledge graph-based question-answer pairs, and the main control program can be a plug-in program or an independent program. In addition, the main control program can also run on one or more of the computer devices of the platform. A storage device may store data generated by the operation of the main control program, result information produced by the knowledge graph-based question-answer pair generation, and intermediate data produced during that generation. The storage device may be further configured to store the software version files of the respective software versions involved in generating the knowledge graph-based question-answer pairs, each software version file including at least one code file, and each code file including at least one line of code. The specific manner in which the storage device stores the various data in the present application may vary. For example, in one possible implementation, the at least one storage device may store the data in the form of a blockchain. A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, in which each data block contains information on a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block.
The blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer.
It should be further noted that the technical scheme of the application can selectively and accurately expand the question-answer pairs in the question-answer pair library and supports intelligent question answering in both online and offline states, so that the applicable scenarios of intelligent question answering are greatly expanded and its practicability is improved.
The knowledge-graph-based question-answer generation method provided by the application is described in detail below with reference to specific embodiments.
Fig. 2 is a flowchart of a method for generating a knowledge-graph-based question-answer pair according to an embodiment of the present application.
Step S210, generating a plurality of question templates based on the original question.
Here, the original question may be a question preset according to a usage scenario; in other words, different original questions may be generated for different scenarios, which improves practicability in each scenario. For example, if the knowledge graph-based question-answer pair generation method is applied to a robot used in a shopping mall, the question preset according to the scenario may be "Where is restaurant A?", and the extracted question templates may then be "Where is the address of <restaurant>?" and "What is the address of <restaurant>?". In another embodiment, the original question may also be a question previously asked by a user in a certain scenario, for example "Who is the father of sushi?". Based on this original question, the extracted question templates may be "Who is the father of <person>?", "Who is the son of <person>?", and so on.
In one embodiment, the existing question-answer pair library may contain many questions with a similar structure, such as "Who is the father of sushi?" and "Who is the father of the canadian?". From these, the template "Who is the father of <person>?" can be extracted, and a set of similar templates can then be derived from it, e.g. "Who is the son of <person>?", "Who is the daughter of <person>?", "Who is the mother of <person>?", and the like.
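The template extraction described above can be sketched as follows. This is only an illustration under stated assumptions: the entity dictionary `ENTITY_TYPES`, the sample names and the helper functions are hypothetical, and a real system would obtain entity mentions from the entity recognition step rather than from a fixed lookup table.

```python
from collections import Counter

# Hypothetical lookup table mapping entity mentions to entity types (an assumption).
ENTITY_TYPES = {
    "Alice": "person",
    "Bob": "person",
    "Restaurant A": "restaurant",
}

def to_template(question: str) -> str:
    """Replace every known entity mention with a <type> placeholder."""
    # Longer mentions are replaced first so that a short name never
    # clobbers part of a longer one.
    for mention in sorted(ENTITY_TYPES, key=len, reverse=True):
        question = question.replace(mention, f"<{ENTITY_TYPES[mention]}>")
    return question

def extract_templates(questions):
    """Count how many original questions collapse to each template."""
    return Counter(to_template(q) for q in questions)

questions = [
    "Who is the father of Alice?",
    "Who is the father of Bob?",
    "Where is Restaurant A?",
]
templates = extract_templates(questions)
for template, count in templates.items():
    print(count, template)
```

Questions that differ only in the entity they mention collapse to the same template, which is what allows a single template to be expanded into many question-answer pairs later.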
In one embodiment, generating a plurality of question templates based on the original question may specifically include the following steps: performing entity recognition and entity link recognition on the original question to obtain the entities corresponding to the original question and the link relations between the entities; and determining a plurality of question templates corresponding to the original question according to the entities and the link relations.
Specifically, the entities can be identified by means of entity tagging. For example, the BMES four-tag sequence labeling scheme can be adopted, in which B marks the first character of a word, M a middle character of a word, E the last character of a word, and S a single-character word. For the sentence "Are you Chinese?", each character of the sentence is labeled with one of these tags, and the entity in the question template "Where is the country of <person>?" is then identified based on the entity tagging.
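The BMES scheme can be illustrated with a short sketch. It assumes the sentence has already been segmented into words; the function names `bmes_tags` and `decode_words` and the sample segmentation are hypothetical and are not taken from the application.

```python
def bmes_tags(words):
    """Return one BMES tag per character for a list of segmented words."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first character, zero or more middle characters, last character
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags

def decode_words(chars, tags):
    """Rebuild the words from characters and their BMES tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("S", "E"):  # a word ends on S or E
            words.append(current)
            current = ""
    if current:  # tolerate a truncated tag sequence
        words.append(current)
    return words

# Illustrative segmentation of a six-character sentence (an assumption).
words = ["你", "是", "中国人", "吗"]
tags = bmes_tags(words)
print(list(zip("".join(words), tags)))
```

In practice the tags would be predicted by a sequence labeling model, and `decode_words` shows how word (and hence entity) boundaries are recovered from the predicted tags.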
In one embodiment, before performing entity recognition and entity link recognition on the original question, the method further includes: extracting, based on the usage frequency of original questions, the original questions whose usage frequency is greater than a preset frequency, so that the question templates are generated from those original questions. In this way, question templates with a high usage frequency are expanded first, which improves the practicability of the question-answer pair library.
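The frequency filter can be sketched in a few lines. The question log and the threshold value below are made up for illustration; the application does not specify how usage frequency is recorded.

```python
from collections import Counter

def frequent_questions(question_log, preset_frequency):
    """Keep the questions asked more often than the preset frequency."""
    counts = Counter(question_log)
    return [q for q, n in counts.items() if n > preset_frequency]

# Hypothetical usage log: one entry per time a question was asked.
log = [
    "Where is Restaurant A?",
    "Who is the father of Alice?",
    "Where is Restaurant A?",
    "Where is Restaurant A?",
]
selected = frequent_questions(log, preset_frequency=2)
print(selected)
```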
Step S220, normalizing each of the question templates based on the set knowledge graph.
In natural language, the same relationship may be expressed in many different ways. Therefore, if the question template extracted from the original question is normalized so that it matches the expression of the link relation in the set knowledge graph, the efficiency with which the question template retrieves information from the set knowledge graph can be improved. Taking the question above as an example, the question template extracted from "Where is the address of <restaurant>?" may be "entity (restaurant)" - "at" - "?", where "?" represents the answer the user wants to know. The normalized question template may be "entity A" - "location" - "?", so that the more standard expression "location" makes the question-answer pairs extracted from the knowledge graph more accurate.
In one embodiment, the question template includes a first entity, a second entity, and a link relation between the first entity and the second entity. Normalizing each of the question templates based on the set knowledge graph includes: acquiring, from the question template, the first entity and the second entity corresponding to the question template and the link relation between them; and normalizing the first entity, the second entity and the link relation respectively, so that they match the expressions in the knowledge graph.
Specifically, since the knowledge graph can be a knowledge graph crawled from a specific field, after the first entity and the second entity corresponding to the problem template and the link relation between the first entity and the second entity are respectively normalized according to the special vocabulary in the specific field, the accuracy of obtaining information in the set knowledge graph can be improved, the calculated amount can be reduced, and the obtaining efficiency can be improved.
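One simple way to realize this normalization is an alias table that maps each surface expression to the canonical expression used in the knowledge graph. The sketch below assumes such tables exist; `Template`, `ENTITY_ALIASES`, `RELATION_ALIASES` and their contents are hypothetical, and the application does not state how the mapping is built.

```python
from typing import NamedTuple, Optional

class Template(NamedTuple):
    first_entity: Optional[str]   # None marks the unknown slot ("?")
    relation: str
    second_entity: Optional[str]

# Hypothetical alias tables: canonical expression -> alternative expressions.
ENTITY_ALIASES = {"restaurant": ["eatery", "diner"]}
RELATION_ALIASES = {"location": ["at", "address", "located in"]}

def build_index(aliases):
    """Map every lowercased alias (and the canonical form itself) to the canonical form."""
    index = {}
    for canonical, variants in aliases.items():
        index[canonical.lower()] = canonical
        for variant in variants:
            index[variant.lower()] = canonical
    return index

ENTITY_INDEX = build_index(ENTITY_ALIASES)
RELATION_INDEX = build_index(RELATION_ALIASES)

def canonical(value, index):
    """Return the canonical expression, or the value unchanged if it is unknown."""
    if value is None:
        return None
    return index.get(value.strip().lower(), value)

def normalize(template: Template) -> Template:
    """Normalize both entities and the link relation of a question template."""
    return Template(
        canonical(template.first_entity, ENTITY_INDEX),
        canonical(template.relation, RELATION_INDEX),
        canonical(template.second_entity, ENTITY_INDEX),
    )

print(normalize(Template("eatery", "at", None)))
```

Expressions that are not in the tables are passed through unchanged, so an incomplete alias table degrades matching quality rather than raising an error.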
Step S230, acquiring question-answer pairs respectively corresponding to the question templates from the knowledge graph according to the processed question templates.
The knowledge graph can be constructed by adopting the following method: crawling data in the corresponding domain through a crawler based on the determined domain; carrying out entity and relation identification on the acquired data by adopting a semantic identification mode and the like; and establishing a triple knowledge graph of the entity, the relation and the entity according to the identified entity and relation. Therefore, the knowledge graph in a specific field can be generated for selection. Furthermore, a plurality of question-answer pairs respectively corresponding to a plurality of question templates can be obtained from the knowledge graph obtained according to the question templates, so that the question-answer pairs in the specific field in the question-answer pair library can be expanded.
In one embodiment, a plurality of question-answer pairs respectively corresponding to the question templates can be obtained from the set knowledge graph by means of vector dot products. Specifically, the triple of the question template and the triples of the knowledge graph may be vectorized, and by calculating the vector dot products, the triple with the highest similarity may be selected as the triple matched with the question template, so as to obtain the question-answer pairs corresponding to that triple. For example, for the question template "Who is the parent of <person>?", the corresponding triple form in the knowledge graph is <person, relation, person>, so the question-answer pairs corresponding to triples of the form <person, relation, person> can be obtained from the knowledge graph.
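The dot-product matching can be sketched as below. The application does not say how triples are vectorized, so this sketch substitutes a toy bag-of-tokens count vector purely for illustration; the function names and sample triple templates are likewise assumptions.

```python
from collections import Counter

def vectorize(triple):
    """Toy vectorization: token counts over the three slots of a triple."""
    tokens = []
    for slot in triple:
        tokens.extend(slot.lower().split())
    return Counter(tokens)

def dot(u, v):
    """Dot product of two sparse count vectors."""
    return sum(u[k] * v[k] for k in u.keys() & v.keys())

def matches_above(template, triple_templates, threshold):
    """Triple templates whose dot product with the question template exceeds the threshold."""
    q = vectorize(template)
    return [t for t in triple_templates if dot(q, vectorize(t)) > threshold]

def best_match(template, triple_templates):
    """Triple template with the highest dot product."""
    q = vectorize(template)
    return max(triple_templates, key=lambda t: dot(q, vectorize(t)))

question_template = ("person", "father", "?")
candidates = [
    ("person", "father", "person"),
    ("restaurant", "location", "address"),
    ("person", "mother", "person"),
]
print(matches_above(question_template, candidates, threshold=2))
print(best_match(question_template, candidates))
```

With learned embeddings in place of the count vectors, the same two selection rules apply: keep everything above a set threshold, or keep only the highest-scoring triple template.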
In one embodiment, a model may be trained to obtain a plurality of question-answer pairs corresponding to the question templates from the set knowledge graph. In particular, the fields of the knowledge graph, namely the link relations between the entities, are provided with explanatory texts such as aliases. Therefore, the triples can be classified, the trained model can be used to obtain the similarity between the question template and each category of triples, and the entities of the category with the highest similarity, together with the relations between them, are converted into question-answer pairs and stored.
Step S240, storing the question-answer pairs into a question-answer pair library, so as to obtain answers corresponding to the user questions from the question-answer pair library when receiving the user questions.
When a user question is received, it may first be preprocessed. The preprocessing includes: converting traditional Chinese characters into simplified Chinese characters, removing redundant punctuation and symbols, correcting wrongly written characters, converting uppercase letters into lowercase, and the like. The preprocessed sentence is then analyzed with a natural language processing tool to obtain, according to the grammar, the entity in the sentence and the relation between the entity and the required answer. These are matched against the stored question-answer pairs, and the answer with the highest matching degree is used to answer the user's question.
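Part of this preprocessing can be sketched with the Python standard library. The sketch covers only case folding, width normalization and punctuation cleanup; traditional-to-simplified conversion and typo correction need external resources and are deliberately left out. Collapsing each run of punctuation to its first character is one possible reading of "removing redundant punctuation", not the application's stated rule.

```python
import re
import unicodedata

def preprocess(question: str) -> str:
    """Normalize a raw user question before matching."""
    # Fold full-width forms (e.g. a full-width question mark) to their ASCII equivalents.
    text = unicodedata.normalize("NFKC", question)
    # Convert uppercase letters to lowercase.
    text = text.lower()
    # Collapse each run of punctuation or symbols to its first character.
    text = re.sub(r"([^\w\s])[^\w\s]*", r"\1", text)
    # Collapse whitespace runs and trim the ends.
    return re.sub(r"\s+", " ", text).strip()

print(preprocess("  Who is  the FATHER of Alice??!! "))
```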
Therefore, in the method and the device, the question template is selectively generated according to the original question, the generated question template is subjected to normalization processing, and then a plurality of question-answer pairs corresponding to the question template are obtained from the set knowledge graph. On one hand, the accuracy of the generated question template can be ensured by carrying out standardized processing on the question template, and further the accuracy of a plurality of corresponding question-answer pairs acquired in the set knowledge graph can be ensured; on the other hand, the obtained question-answer pairs are stored in the question-answer pair library, so that the user can inquire in the accurately generated question-answer pair library, and access in an online mode and an offline mode is supported, so that the diversity of scenes using intelligent question-answers can be improved, and the practicability of the intelligent question-answers is further improved.
Fig. 3 is a flowchart of obtaining question-answer pairs from a knowledge-graph according to an embodiment of the present application. As shown in fig. 3, the step of obtaining the question-answer pairs from the knowledge-graph may include the following steps.
Step S310, extracting a triple template set from the set knowledge graph;
Step S320, matching each processed question template against the triple templates according to a preset matching condition;
Step S330, extracting the target question-answer pairs from the triple templates that meet the preset matching condition, and using the target question-answer pairs as the question-answer pairs of the question template matched with those triple templates.
Specifically, the triple template is a template conforming to "entity-link relationship-entity". A number of triple template sets may be extracted from the knowledge-graph based on entity extraction, relationship extraction, attribute extraction, and the like.
The question template includes a given relationship and at least one known entity of the two entities. Knowledge related to the given relationship, or to the at least one known entity, may first be mined from the triple template set. Specifically, all triples associated with the given relationship may be extracted from the triple template set, and all triples related to the known entity may then be extracted from the triple template set according to the known entity. The question-answer pairs corresponding to the extracted triples are taken as target question-answer pairs, which in turn serve as the question-answer pairs corresponding to the question template. In this way, question-answer pairs can be selectively expanded based on a specific question template, which improves practicability in the offline state.
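The extraction of triples by given relationship and known entity might look like the following sketch. The `Triple` type, the function names, and the question wording in `to_question_answer` are all hypothetical. The text does not say whether the two extractions are combined as a union or an intersection; this sketch assumes a union, in keeping with the stated goal of expanding the question-answer pairs.

```python
from typing import List, NamedTuple, Optional, Tuple


class Triple(NamedTuple):
    """An "entity - link relationship - entity" triple (hypothetical layout)."""
    head: str
    relation: str
    tail: str


def related_triples(triples: List[Triple], relation: str,
                    known_entity: Optional[str] = None) -> List[Triple]:
    """Collect triples that share the given relationship or mention the known entity."""
    selected = []
    for triple in triples:
        by_relation = triple.relation == relation
        by_entity = known_entity is not None and known_entity in (triple.head, triple.tail)
        # Assumption: a triple qualifies if either condition holds (union).
        if by_relation or by_entity:
            selected.append(triple)
    return selected


def to_question_answer(triple: Triple) -> Tuple[str, str]:
    """Turn a triple into a question-answer pair using an assumed question wording."""
    return (f"What is the {triple.relation} of {triple.head}?", triple.tail)


triples = [
    Triple("Policy A", "insurer", "Company X"),
    Triple("Policy B", "insurer", "Company Y"),
    Triple("Policy A", "term", "10 years"),
    Triple("Policy C", "term", "20 years"),
]
qa_pairs = [to_question_answer(t) for t in related_triples(triples, "insurer", "Policy A")]
print(qa_pairs)
```

Switching `or` to `and` in the condition would give the stricter intersection reading instead.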
The preset matching conditions are used to match the processed question templates with the triple templates respectively. The preset matching conditions are described below through two specific examples.
Fig. 4 is a flowchart of a method for determining whether a preset matching condition is met according to an embodiment of the present application. As shown in fig. 4, in this embodiment, before extracting the target question-answer pair in the triple template that meets the preset matching condition and taking the target question-answer pair as the question-answer pair of the question template matched with the triple template, the method may further include the following steps:
step S410, vectorizing the question template and each triple template in the triple template set;
step S420, calculating the vector dot product between the question template and each triple template respectively;
step S430, taking the triple templates whose vector dot product is larger than the set threshold value as the templates meeting the preset matching condition.
Fig. 4a is a schematic structural diagram of a question template provided in an embodiment of the present application, and fig. 4b is a schematic structural diagram of a triple template in a knowledge graph provided in an embodiment of the present application.
As shown in fig. 4a and 4b, the question template 420 is composed of a first entity 401, a first link relationship 402, and a second entity 403, wherein the first link relationship 402 is used to characterize the relationship between the first entity 401 and the second entity 403. The triple template 430 is composed of a third entity 404, a second link relationship 405, and a fourth entity 406, wherein the second link relationship 405 is used to characterize the relationship between the third entity 404 and the fourth entity 406. In one embodiment, the question template 420 may be generated from frequently used question-answer pairs extracted from the question-answer pair library 410. The triple template 430 is extracted from the set knowledge graph 440.
Specifically, a first vector is generated from the first entity 401, the first link relationship 402, and the second entity 403, a second vector is generated from the third entity 404, the second link relationship 405, and the fourth entity 406, and the vector dot product between the question template 420 and the triple template 430 is calculated. The vector dot product is used to indicate the degree of similarity between two vectors. When the vectors are normalized to unit length, the dot product is 1 for two identical vectors and -1 for two exactly opposite vectors. That is, the larger the value of the vector dot product, the more similar the two vectors are.
Therefore, the similarity between the question template and each triple template can be obtained by calculating the vector dot product, and the triple templates whose similarity is larger than the set threshold value are taken as the templates meeting the preset matching condition.
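Steps S410 to S430 can be illustrated with the sketch below. The vectorization is an assumption made purely for the example: the text does not specify how templates are vectorized, so a toy bag-of-words count scaled to unit length stands in for it. The function names and the threshold of 0.5 are likewise hypothetical. Because word counts are non-negative, the dot product here ranges from 0 to 1 rather than from -1 to 1.

```python
import math
from collections import Counter


def vectorize(template):
    """Toy vectorization: word counts over the three fields, scaled to unit length."""
    counts = Counter(word for field in template for word in field.lower().split())
    norm = math.sqrt(sum(count * count for count in counts.values()))
    return {word: count / norm for word, count in counts.items()} if norm else {}


def dot(u, v):
    """Dot product of two sparse vectors stored as dictionaries."""
    return sum(value * v.get(word, 0.0) for word, value in u.items())


def matching_templates(question_template, triple_templates, threshold=0.5):
    """Keep the triple templates whose dot product with the question exceeds the threshold."""
    question_vector = vectorize(question_template)
    return [
        template for template in triple_templates
        if dot(question_vector, vectorize(template)) > threshold
    ]


question = ("Policy A", "insurer", "company")
candidates = [
    ("Policy A", "insurer", "Company X"),
    ("Policy B", "term", "10 years"),
]
print(matching_templates(question, candidates))
```

Scaling each vector to unit length is what makes a single fixed threshold meaningful across templates of different lengths.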
Fig. 5 is a flowchart of a method for determining whether a preset matching condition is met according to another embodiment of the present application. As shown in fig. 5, in this embodiment, before extracting a target question-answer pair in a triple template that meets a preset matching condition and using the target question-answer pair as a question-answer pair of a question template that matches the triple template, the method for measuring similarity between the question template and the triple template includes:
step S510, classifying all the triple templates in the triple template set;
step S520, calculating the similarity between the question template and the triple templates of each category respectively;
step S530, taking the triple templates corresponding to the category with the highest similarity as the templates meeting the preset matching condition.
Specifically, in this embodiment, the triples can be classified because the fields of the triples in the knowledge graph may carry explanatory text such as aliases and colloquial expressions. Illustratively, "father", "dad", "son", and "mother" each denote a kinship relationship and can be grouped into a kinship category. When the link relationship field of the question template denotes a kinship relationship, the triple templates of the kinship category have the highest similarity and can be taken as the templates meeting the preset matching condition. In an embodiment, the similarity between the question template and each category of triple templates may also be calculated by means of the vector dot product, in the same way as described above, which is not repeated here.
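Steps S510 to S530 might be sketched as follows. The category table, the function names, and the use of string similarity from Python's standard `difflib` module are assumptions made for illustration; the text itself only says that a similarity is calculated per category, for example by vector dot product.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical category table mapping each category to its known relationship words.
RELATION_CATEGORIES = {
    "kinship": {"father", "dad", "son", "mother"},
    "employment": {"employer", "employee", "colleague"},
}


def classify_triples(triples):
    """Group (head, relation, tail) triples by the category of their relationship."""
    by_category = defaultdict(list)
    for head, relation, tail in triples:
        for category, aliases in RELATION_CATEGORIES.items():
            if relation in aliases:
                by_category[category].append((head, relation, tail))
    return dict(by_category)


def category_similarity(relation, aliases):
    """Best string similarity between the relationship and any alias of a category."""
    return max(SequenceMatcher(None, relation, alias).ratio() for alias in aliases)


def match_by_category(question_relation, triples):
    """Return the triples of the category most similar to the question's relationship."""
    by_category = classify_triples(triples)
    if not by_category:
        return []
    best = max(
        by_category,
        key=lambda category: category_similarity(
            question_relation, RELATION_CATEGORIES[category]),
    )
    return by_category[best]


triples = [
    ("Alice", "father", "Bob"),
    ("Carol", "mother", "Dana"),
    ("Eve", "employer", "Acme"),
]
print(match_by_category("dad", triples))
```

Here a question about "dad" also retrieves the "father" and "mother" triples, which is the effect the category-based matching is meant to achieve.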
It should be noted that although the various steps of the methods in this application are depicted in the drawings in a particular order, this does not require or imply that these steps must be performed in this particular order, or that all of the shown steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.
The following describes embodiments of an apparatus of the present application, which may be used to implement the method for generating a question-answer pair based on a knowledge graph in the above embodiments of the present application. Fig. 6 schematically shows a block diagram of a knowledge-graph-based question-answer pair generating apparatus 600 provided in an embodiment of the present application. As shown in fig. 6, the knowledge-graph-based question-answer pair generating apparatus includes the following parts.
A generating unit 610, configured to generate a plurality of question templates based on the original question;
the processing unit 620 is configured to perform normalization processing on the plurality of question templates based on the set knowledge graph;
a first obtaining unit 630, configured to obtain question-answer pairs corresponding to the question templates from the knowledge graph according to the processed question templates;
the second obtaining unit 640 is configured to store the question-answer pairs in the question-answer pair library, so as to obtain answers corresponding to the user questions from the question-answer pair library when receiving the user questions.
The specific details of the knowledge-graph-based question-answer pair generating apparatus provided in the embodiments of the present application have been described in detail in the corresponding method embodiments, and are not described herein again.
Fig. 7 schematically shows a block diagram of a computer system of an electronic device for implementing an embodiment of the present application.
It should be noted that the computer system 700 of the electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU) 701 that can perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The random access memory 703 also stores various programs and data necessary for system operation. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An Input/Output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the input/output interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a local area network card, a modem, and the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the input/output interface 705 as necessary. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read from it is installed into the storage section 708 as necessary.
In particular, according to embodiments of the present application, the processes described in the various method flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program, when executed by the central processor 701, performs various functions defined in the system of the present application.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the application, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by a plurality of modules or units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to enable a computing device (which can be a personal computer, a server, a touch terminal, a network device, etc.) to execute the method according to the embodiments of the present application.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A question-answer pair generation method based on a knowledge graph is characterized by comprising the following steps:
generating a plurality of question templates based on the original question;
based on a set knowledge graph, performing normalization processing on the question templates respectively;
according to the processed question template, acquiring question-answer pairs respectively corresponding to the question template from the knowledge graph;
and storing the question-answer pairs into a question-answer pair library so as to obtain answers corresponding to the user questions from the question-answer pair library when receiving the user questions.
2. The method of claim 1, wherein the question template comprises a first entity, a second entity, and a link relationship between the first entity and the second entity; and performing normalization processing on the plurality of question templates respectively based on the set knowledge graph comprises:
acquiring the first entity, the second entity and the link relation corresponding to the question template according to the question template;
and respectively carrying out normalization processing on the first entity, the second entity and the link relation so as to enable the expression of the first entity, the second entity and the link relation to be matched with the expression in the set knowledge graph.
3. The method of claim 2, wherein obtaining, from the set knowledge-graph, a plurality of question-answer pairs corresponding to a plurality of the question templates, respectively, according to the processed question templates, comprises:
extracting a triple template set from the set knowledge graph;
matching the processed question templates with the triple templates respectively according to preset matching conditions;
and extracting the target question-answer pair in the triple template which meets the preset matching condition, and taking the target question-answer pair as the question-answer pair of the question template matched with the triple template.
4. The method according to claim 3, wherein before the extracting the target question-answer pair in the triple template meeting the preset matching condition and using the target question-answer pair as a question-answer pair of a question template matching with the triple template, the method further comprises:
vectorizing the question template and each triple template respectively;
calculating the vector dot product between the question template and each triple template respectively;
and taking the triple template with the vector dot product larger than a set threshold value as the template meeting the preset matching condition.
5. The method according to claim 3, wherein before extracting the target question-answer pair in the triple template meeting the preset matching condition and using the target question-answer pair as a question-answer pair of the question template matching with the triple template, the method further comprises:
classifying each of the triple templates in the triple template set;
calculating the similarity between the link relationship of the question template and the link relationship of the triple templates of each category;
and taking the triple template corresponding to the category with the highest similarity as the template meeting the preset matching condition.
6. The method of claim 1, wherein generating a number of question templates based on the original question comprises:
performing entity recognition and entity link recognition on an original question to obtain the entities corresponding to the original question and the link relationships between the entities;
and determining a plurality of question templates corresponding to the original question according to the entities and the link relationships.
7. The method of claim 6, wherein prior to performing the entity recognition and the entity link recognition on the original question, the method further comprises:
extracting, based on the use frequency of original questions, the original questions whose use frequency is greater than a preset frequency;
and generating the question templates according to the original questions whose use frequency is greater than the preset frequency.
8. A knowledge-graph-based question-answer pair generating apparatus, comprising:
the generating unit is used for generating a plurality of question templates based on the original question;
the processing unit is used for performing normalization processing on the plurality of question templates respectively based on a set knowledge graph;
the first acquisition unit is used for acquiring question-answer pairs respectively corresponding to the question templates from the knowledge graph according to the processed question templates;
and the second acquisition unit is used for storing the question-answer pairs into a question-answer pair library so as to acquire answers corresponding to the user questions from the question-answer pair library when receiving the user questions.
9. A computer-readable medium, characterized in that a computer program is stored thereon, which computer program, when executed by a processor, carries out the knowledge-graph-based question-answer pair generation method according to any one of claims 1 to 7.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of knowledge-graph-based question-answer pair generation of any one of claims 1 to 7 via execution of the executable instructions.
CN202210029916.4A 2022-01-12 2022-01-12 Knowledge graph-based question-answer pair generation method, device, equipment and medium Pending CN114357195A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210029916.4A CN114357195A (en) 2022-01-12 2022-01-12 Knowledge graph-based question-answer pair generation method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210029916.4A CN114357195A (en) 2022-01-12 2022-01-12 Knowledge graph-based question-answer pair generation method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN114357195A true CN114357195A (en) 2022-04-15

Family

ID=81110081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210029916.4A Pending CN114357195A (en) 2022-01-12 2022-01-12 Knowledge graph-based question-answer pair generation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114357195A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114996419A (en) * 2022-05-09 2022-09-02 成都数之联科技股份有限公司 Intelligent question answering method and device for weapon equipment, electronic equipment and storage medium
CN115292457A (en) * 2022-06-30 2022-11-04 腾讯科技(深圳)有限公司 Knowledge question answering method and device, computer readable medium and electronic equipment

Similar Documents

Publication Publication Date Title
US11501182B2 (en) Method and apparatus for generating model
CN111859960B (en) Semantic matching method, device, computer equipment and medium based on knowledge distillation
CN110705301B (en) Entity relationship extraction method and device, storage medium and electronic equipment
CN111159220B (en) Method and apparatus for outputting structured query statement
CN116629275B (en) Intelligent decision support system and method based on big data
CN110555451A (en) information identification method and device
CN110543633B (en) Sentence intention identification method and device
CN114357195A (en) Knowledge graph-based question-answer pair generation method, device, equipment and medium
CN111428010A (en) Man-machine intelligent question and answer method and device
CN110807311A (en) Method and apparatus for generating information
CN113761190A (en) Text recognition method and device, computer readable medium and electronic equipment
CN110377733A (en) A kind of text based Emotion identification method, terminal device and medium
CN116303537A (en) Data query method and device, electronic equipment and storage medium
CN112836019B (en) Public medical health named entity identification and entity linking method and device, electronic equipment and storage medium
CN113705207A (en) Grammar error recognition method and device
CN112926341A (en) Text data processing method and device
CN113704393A (en) Keyword extraction method, device, equipment and medium
CN111461757B (en) Information processing method and device, computer storage medium and electronic equipment
CN110705308A (en) Method and device for recognizing field of voice information, storage medium and electronic equipment
CN116502147A (en) Training method of anomaly detection model and related equipment
CN116127066A (en) Text clustering method, text clustering device, electronic equipment and storage medium
CN113268575B (en) Entity relationship identification method and device and readable medium
CN115186096A (en) Recognition method, device, medium and electronic equipment for specific type word segmentation
CN113254612A (en) Knowledge question-answering processing method, device, equipment and storage medium
CN113569567A (en) Text recognition method and device, computer readable medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination