CN114647713A - Knowledge graph question-answering method, device and storage medium based on virtual confrontation

Info

Publication number
CN114647713A
Authority
CN
China
Prior art keywords
entity
information
target
intention
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210317565.7A
Other languages
Chinese (zh)
Inventor
刘攀
李金龙
刘弘一
季江舟
杨一枭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Merchants Bank Co Ltd
Original Assignee
China Merchants Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Merchants Bank Co Ltd
Priority to CN202210317565.7A
Publication of CN114647713A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The application discloses a knowledge-graph question-answering method, system, device, and storage medium based on virtual confrontation. The method comprises: obtaining text information to be queried from a target user; performing entity recognition on the text information to be queried based on an entity extraction model to obtain target entity information; and, if the target entity information contains an entity of a preset target type, performing intention recognition on the text information to be queried based on an intention recognition model and a preset recognition rule to obtain a target intention recognition result, wherein the entity extraction model and the intention recognition model are both trained on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm. A data query is then performed on the graph database based on the target entity information, the target intention recognition result, and the text information to be queried to obtain target query information. The method and the device address the technical problem of low model recognition accuracy.

Description

Knowledge graph question-answering method, device and storage medium based on virtual confrontation
Technical Field
The application relates to the technical field of machine learning, in particular to a knowledge graph question-answering method, a knowledge graph question-answering system, knowledge graph question-answering equipment and a storage medium based on virtual confrontation.
Background
With the development of the internet, question-answering systems have advanced rapidly. An intelligent question-answering system finds, from a large amount of data, the text information that best matches the user's intention. The classification and extraction models currently used in question-answering systems are generally implemented with convolutional neural networks or recurrent neural networks and are trained on large amounts of existing real data. When training data is scarce, the model can learn features only from the few samples available and lacks substantial prior knowledge, so its recognition accuracy is low.
Disclosure of Invention
The application mainly aims to provide a knowledge-graph question-answering method, a knowledge-graph question-answering system, knowledge-graph question-answering equipment and a storage medium based on virtual confrontation, and aims to solve the technical problem that in the prior art, the accuracy of model identification is low.
In order to achieve the above object, the present application provides a knowledge-graph question-answering method based on virtual confrontation, including:
acquiring text information to be inquired of a target user;
performing entity recognition on the text information to be queried based on a trained entity extraction model to obtain target entity information, wherein the entity extraction model is obtained by training based on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm;
if the target entity information has a preset target type entity, respectively performing intention recognition on the text information to be queried based on a trained intention recognition model and a preset recognition rule to obtain a target intention recognition result, wherein the intention recognition model is obtained by training based on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm;
and performing data query on a preset constructed graph database based on the target entity information, the target intention recognition result and the text information to be queried to obtain target query information.
The application also provides a knowledge-graph question-answering system based on virtual confrontation, which is a virtual system, and comprises:
the acquisition module is used for acquiring text information to be queried of a target user;
the entity recognition module is used for performing entity recognition on the text information to be queried based on a trained entity extraction model to obtain target entity information, wherein the entity extraction model is obtained by training based on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm;
the intention recognition module is used for, if the target entity information has a preset target type entity, respectively performing intention recognition on the text information to be queried based on a trained intention recognition model and a preset recognition rule to obtain a target intention recognition result, wherein the intention recognition model is obtained by training based on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm;
and the query module is used for performing data query on a preset constructed graph database based on the target entity information, the target intention recognition result and the text information to be queried to obtain target query information.
The application also provides a knowledge-graph question-answering device based on virtual confrontation, which is a physical device comprising: a memory, a processor, and a virtual confrontation-based knowledge-graph question-answering program stored on the memory, the program being executed by the processor to implement the steps of the virtual confrontation-based knowledge-graph question-answering method as described above.
The present application further provides a storage medium which is a computer-readable storage medium, on which a virtual confrontation-based knowledge-graph question-and-answer program is stored, and the virtual confrontation-based knowledge-graph question-and-answer program is executed by a processor to implement the steps of the virtual confrontation-based knowledge-graph question-and-answer method as described above.
The application provides a knowledge-graph question-answering method, system, device, and storage medium based on virtual confrontation. The method first obtains text information to be queried from a target user and performs entity recognition on it based on a trained entity extraction model to obtain target entity information, wherein the entity extraction model is trained on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm. If the target entity information contains an entity of a preset target type, intention recognition is performed on the text information to be queried based on a trained intention recognition model and on a preset recognition rule, respectively, to obtain a target intention recognition result, wherein the intention recognition model is likewise trained on pre-collected corpus information to be trained in combination with the virtual confrontation training algorithm. A data query is then performed on a pre-constructed graph database based on the target entity information, the target intention recognition result, and the text information to be queried to obtain target query information. Training the models with the virtual confrontation training algorithm extends a supervised learning model to a semi-supervised learning model and improves the generalization of the overall model when samples are few. This improves model recognition accuracy, that is, the accuracy of entity recognition by the entity extraction model and of intention recognition by the intention recognition model, so that a more accurate answer is matched for the text information to be queried.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious to those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
FIG. 1 is a schematic flow chart of a first embodiment of a knowledge-graph question-answering method based on virtual confrontation according to the present application;
FIG. 2 is a schematic flow chart of a second embodiment of the virtual confrontation-based knowledge-graph question-answering method according to the present application;
FIG. 3 is a schematic flow chart of a third embodiment of the virtual confrontation-based knowledge-graph question-answering method according to the present application;
FIG. 4 is a schematic flow chart of a fourth embodiment of the virtual confrontation-based knowledge-graph question-answering method according to the present application;
FIG. 5 is a schematic flow chart of a fifth embodiment of the virtual confrontation-based knowledge-graph question-answering method according to the present application;
FIG. 6 is a schematic diagram of a virtual confrontation-based knowledge-graph question-answering device in a hardware operating environment according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the functional modules of the knowledge-graph question-answering device based on virtual confrontation.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In a first embodiment of the virtual confrontation-based knowledge-graph question-answering method, referring to fig. 1, the virtual confrontation-based knowledge-graph question-answering method includes:
Step S10, acquiring text information to be queried of a target user;
In this embodiment, it should be noted that the text information to be queried is text in a natural language. The user may input natural language as text or as speech. If the input is in text form, the query information is obtained directly; if it is in speech form, the speech is converted into query information in text form.
Specifically, the statement information entered by the user on a system or platform is obtained and then preprocessed to yield the text information to be queried. The preprocessing includes operations such as full-width/half-width character conversion, stop-word removal, and question rewriting.
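The preprocessing described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the stop-word list, the whitespace tokenization, and the function names are hypothetical and do not come from the source, and question rewriting is omitted because the source gives no detail on it.

```python
# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {"please", "the", "a", "um"}


def to_half_width(text: str) -> str:
    """Convert full-width characters to their half-width equivalents."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:                # full-width (ideographic) space
            code = 0x20
        elif 0xFF01 <= code <= 0xFF5E:    # full-width forms of ASCII '!'..'~'
            code -= 0xFEE0
        out.append(chr(code))
    return "".join(out)


def preprocess(sentence: str) -> str:
    """Normalize width and case, then drop stop words (whitespace tokens assumed)."""
    normalized = to_half_width(sentence).strip().lower()
    tokens = [tok for tok in normalized.split() if tok not in STOP_WORDS]
    return " ".join(tokens)
```

A production system for Chinese text would presumably use a word segmenter rather than whitespace splitting; the split here only keeps the sketch self-contained.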
Step S20, performing entity recognition on the text information to be queried based on a trained entity extraction model to obtain target entity information, wherein the entity extraction model is obtained by training based on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm;
In this embodiment, it should be noted that the entity extraction model is trained as follows. First, dialog templates annotated with the entity information to be extracted are obtained together with entity attribute information, where the entity attribute information consists of the entity attributes of the target field; in the insurance field, for example, the entities include insurance, insurance company, and insurance category. The dialog templates and the entity attribute information are then combined to generate the corpus information to be trained, which is labeled with the corresponding entity types, entity positions, and intention labels. Next, in combination with the virtual confrontation training algorithm, the corpus information to be trained is input into a pre-trained model, such as a BERT deep pre-trained language model, an LSTM language model, or an ELMo language model, which outputs the text vectors corresponding to the corpus information to be trained. The text vectors serve as the input of a custom neural network, which in this embodiment is the entity extraction model, to obtain a target output result. Finally, the model parameters of the entity extraction model are adjusted based on the target output result, the entity types, and the entity positions to obtain the final entity extraction model.
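The source does not spell out the objective used by the virtual confrontation training algorithm. As a point of reference only, and assuming it follows the commonly published formulation of virtual adversarial training, the regularizer can be written as below. The symbols $x$ (an input text vector), $\theta$ (model parameters), $\hat{\theta}$ (the current parameters held fixed), $\epsilon$ (perturbation radius), and $\alpha$ (regularization weight) are introduced here for illustration and are not taken from the source.

```latex
\begin{aligned}
r_{\mathrm{vadv}} &= \operatorname*{arg\,max}_{\|r\|_2 \le \epsilon}
  D_{\mathrm{KL}}\!\left( p(\cdot \mid x; \hat{\theta}) \,\middle\|\, p(\cdot \mid x + r; \theta) \right), \\
\mathcal{L}_{\mathrm{vadv}}(x, \theta) &=
  D_{\mathrm{KL}}\!\left( p(\cdot \mid x; \hat{\theta}) \,\middle\|\, p(\cdot \mid x + r_{\mathrm{vadv}}; \theta) \right), \\
\mathcal{L}(\theta) &= \mathcal{L}_{\mathrm{sup}}(\theta)
  + \alpha \cdot \frac{1}{N} \sum_{i=1}^{N} \mathcal{L}_{\mathrm{vadv}}(x_i, \theta).
\end{aligned}
```

Because $\mathcal{L}_{\mathrm{vadv}}$ needs no label, the sum can run over labeled and unlabeled inputs alike, which is consistent with the source's statement that the supervised learning model is extended to a semi-supervised one.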
Specifically, standard entity information and similar entity information of the target field are collected, and an entity search prefix tree is constructed from them. The text information to be queried is then input into the entity extraction model, which extracts its entity information. To further improve the accuracy of entity recognition, the similarity between the extracted entity information and the standard and similar entity information in the entity search prefix tree is calculated, and the entity information with high similarity is taken as the target entity information.
After step S20, the method further includes:
Step A1, if the target entity information does not contain an entity of the preset target type, determining that the intention of the text information to be queried is a recommended query intention associated with the target entity information, and taking the recommended query intention as the target intention recognition result.
In this embodiment, it should be noted that the preset target type entity is a manually configured entity type. For example, in the insurance field the entities include insurance, insurance company, and insurance category; when the target entity information contains only insurance-category and insurance-company entities, the intention of the current query is directly determined to be the recommended query intention associated with the target entity information, and the recommended query intention is taken as the target intention recognition result.
Step S30, if the target entity information has a preset target type entity, respectively performing intention recognition on the text information to be queried based on a trained intention recognition model and a preset recognition rule to obtain a target intention recognition result, wherein the intention recognition model is obtained by training based on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm;
In this embodiment, it should be noted that the training process of the intention recognition model is substantially the same as that of the entity extraction model in step S20 and is not repeated here. The preset recognition rule includes at least one text similarity calculation algorithm, for example LCS, GST, or the shortest edit distance.
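To make two of the named similarity measures concrete, here is a minimal sketch. It assumes that "LCS" means longest common subsequence (the source does not expand the abbreviation) and that the shortest edit distance is the Levenshtein distance; the function names and the normalization into a 0..1 score are illustrative choices, not the source's.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map edit distance to a score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

Each rule could then score the query against every standard question and report the intention of the best-scoring one; how the source combines scores is not specified.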
Specifically, if the target entity information contains an entity of the preset target type, intention recognition is performed on the text information to be queried based on the intention recognition model, and, according to the preset recognition rules, the target similarity between the text information to be queried and the pre-collected standard question information and intention keywords is calculated, so that each yields its own intention recognition result. The target intention recognition result is then determined from all of these results. For example, if there are three preset recognition rules, the intention recognition model outputs intention A, and the three recognition rules yield intentions A, B, and C respectively, then A is taken as the target intention recognition result.
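One way to read the example above is as a majority vote over the model's result and the rule results (the source mentions a voting mechanism in a later embodiment). The sketch below assumes exactly that; the function name and the tie-break in favor of the model's prediction are assumptions, since the source does not say how ties are resolved.

```python
from collections import Counter


def vote_intent(model_intent, rule_intents):
    """Pick the intention with the most votes among the model and the rules."""
    votes = Counter([model_intent, *rule_intents])
    best = max(votes.values())
    tied = [intent for intent, count in votes.items() if count == best]
    # Assumption: on a tie, prefer the model's prediction if it is among the leaders.
    if model_intent in tied:
        return model_intent
    return tied[0]
```

With the source's example, `vote_intent("A", ["A", "B", "C"])` selects A because it receives two of the four votes.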
And step S40, performing data query on a preset constructed graph database based on the target entity information, the target intention recognition result and the text information to be queried to obtain target query information.
In this embodiment, it should be noted that the graph database is built by performing entity mapping on collected product information according to a preset graph data model to obtain entity relationship data, where the preset graph data model is constructed from the data structure of the product information. Further, when the product information of all target fields is queried online through a preset interface, the product information is mapped through the preset graph data model, and the entity relationship data corresponding to the product information is determined using the number as the primary key, so that the entity relationship data can be imported into the graph database.
Specifically, the target entity information, the target intention recognition result, and the text information to be queried are concatenated, and the graph database is queried with the concatenated information to obtain the target query information. After the target query information is retrieved, it is assembled into a target reply, which is returned to the target user.
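The query step can be illustrated with a toy in-memory stand-in for the graph database. Everything concrete here is hypothetical: the source names neither a graph database product nor a query language, and the sample facts, the intent-to-relation mapping, and the reply wording are invented purely for the sketch.

```python
# Hypothetical toy graph: (entity, relation) -> value. Not data from the source.
GRAPH = {
    ("Plan A", "insurance_age"): "18 to 60 years",
    ("Plan A", "insurer"): "Example Insurance Co.",
}

# Hypothetical mapping from a recognized intention to a graph relation.
INTENT_TO_RELATION = {
    "ask_insurance_age": "insurance_age",
    "ask_insurer": "insurer",
}


def query_graph(entity, intent):
    """Look up the value reached from `entity` via the relation implied by `intent`."""
    relation = INTENT_TO_RELATION.get(intent)
    if relation is None:
        return None
    return GRAPH.get((entity, relation))


def build_reply(entity, intent, query_text):
    """Assemble the retrieved value into a reply for the user."""
    answer = query_graph(entity, intent)
    if answer is None:
        return f"Sorry, no answer was found for: {query_text}"
    return f"{entity} / {INTENT_TO_RELATION[intent]}: {answer}"
```

In a real deployment the lookup would be a query against the graph database built from the product information; the dictionary only shows how entity and intention together select what to retrieve.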
This embodiment of the application provides a knowledge-graph question-answering method based on virtual confrontation. Text information to be queried is first obtained from a target user, and entity recognition is performed on it based on a trained entity extraction model to obtain target entity information, wherein the entity extraction model is trained on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm. If the target entity information contains an entity of a preset target type, intention recognition is performed on the text information to be queried based on a trained intention recognition model and on a preset recognition rule, respectively, to obtain a target intention recognition result, wherein the intention recognition model is likewise trained on pre-collected corpus information to be trained in combination with the virtual confrontation training algorithm. A data query is then performed on a pre-constructed graph database based on the target entity information, the target intention recognition result, and the text information to be queried to obtain target query information. Training the models with the virtual confrontation training algorithm extends a supervised learning model to a semi-supervised learning model and improves the generalization of the overall model when samples are few. This improves model recognition accuracy, that is, the accuracy of entity recognition by the entity extraction model and of intention recognition by the intention recognition model, so that a more accurate answer is matched for the text information to be queried.
Further, referring to fig. 2, based on the first embodiment in the present application, in another embodiment of the present application, step S20: based on the trained entity extraction model, performing entity identification on the text information to be queried to obtain target entity information, which specifically comprises the following steps:
step A10, collecting standard entity information and similar entity information of a target field, and constructing an entity search prefix tree based on the standard entity information and the similar entity information;
step A20, retrieving and matching the text information to be queried with the standard entity information and the similar entity information in the entity search prefix tree;
step A30, if the matching fails, performing entity extraction on the text information to be inquired based on an entity extraction model to obtain entity information;
step A40, calculating the similarity between the entity information and the standard entity information and the similar entity information in the entity search prefix tree;
step A50, based on each similarity, determining the target entity information corresponding to the similarity exceeding a preset similarity threshold.
In this embodiment, it should be noted that the entity search prefix tree, also called a trie, word search tree, or key tree, is a tree structure and a variant of a hash tree; search engine systems often use it for word-frequency statistics over text.
Specifically, standard entity information and similar entity information of the target field are collected, and an entity search prefix tree is constructed from them. To improve the efficiency and accuracy of entity detection, before any entity recognition by the entity extraction model, the text information to be queried is looked up directly in the entity search prefix tree for an exact match against the standard and similar entity information. If the match succeeds, the text information to be queried contains standard or similar entity information; the matched entity information is taken directly as the target entity information and entity recognition ends. If the match fails, the text information to be queried is input into the entity extraction model, which outputs an entity recognition result. The similarity between this entity information and the standard and similar entity information in the entity search prefix tree is then calculated, and the entity information whose similarity exceeds a preset similarity threshold is selected as the target entity information, which improves the accuracy of entity recognition. The preset similarity threshold is a preset cut-off for similarity selection and is not specifically limited here.
With the above scheme, building the entity search prefix tree from the collected standard and similar entity information allows retrieval matching to be performed directly on the tree, which improves the efficiency of entity recognition. When retrieval matching fails, the entity extraction model performs entity extraction on the text information to be queried, the similarity between the extracted entity information and the standard and similar entity information in the tree is calculated, and the target entity information is determined as the entity information whose similarity exceeds the preset similarity threshold, which improves the accuracy of entity recognition.
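Steps A10 to A50 can be sketched as below. This is an illustration under assumptions, not the source's implementation: the class and function names are hypothetical, `extract` stands in for the trained entity extraction model, similarity is computed with Python's `difflib.SequenceMatcher` because the source does not name a measure for this step, and the 0.8 threshold is an arbitrary placeholder.

```python
from difflib import SequenceMatcher


class EntityTrie:
    """Prefix tree mapping standard and similar entity names to a standard entity."""

    def __init__(self):
        self.root = {}
        self.names = {}  # every inserted name -> its standard entity

    def insert(self, name, standard):
        node = self.root
        for ch in name:
            node = node.setdefault(ch, {})
        node[None] = standard  # the key None marks the end of a complete name
        self.names[name] = standard

    def find_in(self, text):
        """Return the standard entity of the longest name occurring in text, or None."""
        best, best_len = None, 0
        for start in range(len(text)):
            node = self.root
            for end in range(start, len(text)):
                node = node.get(text[end])
                if node is None:
                    break
                if None in node and end - start + 1 > best_len:
                    best, best_len = node[None], end - start + 1
        return best


def resolve_entity(text, trie, extract, threshold=0.8):
    """Exact trie lookup first; on failure, model extraction plus similarity check."""
    hit = trie.find_in(text)                      # step A20
    if hit is not None:
        return hit
    candidate = extract(text)                     # step A30 (stand-in for the model)
    scored = [(SequenceMatcher(None, candidate, name).ratio(), standard)
              for name, standard in trie.names.items()]   # step A40
    score, standard = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    return standard if score > threshold else None         # step A50
```

The exact lookup handles the common case without running the model at all, which is the efficiency gain the source describes; the similarity check only runs on the fallback path.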
Further, referring to fig. 3, based on the first embodiment in the present application, in another embodiment of the present application, step S30: if the target entity information has a preset target type entity, respectively performing intention recognition on the text information to be queried based on a trained intention recognition model and a preset recognition rule to obtain a target intention recognition result, specifically comprising:
step B10, matching the text information to be inquired with the standard question information collected in advance;
step B20, if the question matching fails, performing keyword matching between the text information to be queried and pre-collected intention keywords;
step B30, if the keyword matching fails, respectively identifying the text information to be inquired based on the intention identification model and the preset identification rule to obtain each intention identification result;
step B40, determining the target intention recognition result based on each intention recognition result.
In this embodiment, it should be noted that a large amount of standard question information and intention keywords is collected before the system goes online, for example the standard question "What is the insurance application age?" and intention keywords such as "insuring age" and "insured person's identity".
It can be understood that, to improve the efficiency of intention recognition, the text information to be queried is first matched exactly against the pre-collected standard question information. If the question matching succeeds, the intention associated with the matched standard question information is taken directly as the target intention recognition result, and no further intention recognition is needed. If the question matching fails, keyword matching is performed between the text information to be queried and the pre-collected intention keywords; if the keyword matching succeeds, no further intention recognition is needed. If the keyword matching also fails, the similarity between the text information to be queried and the standard question information and the intention keywords is calculated by the intention recognition model and by the preset recognition rule, respectively, to obtain each intention recognition result, and the target intention recognition result is determined from these results by a voting mechanism.
With the above scheme, if the target entity information contains an entity of the preset target type, the text information to be queried is first matched against the pre-collected standard question information and, if that fails, against the pre-collected intention keywords. Exact matching against the standard question information and intention keywords improves the efficiency of intention recognition. If both matches fail, the text information to be queried is recognized by the intention recognition model and by the preset recognition rule, respectively, to obtain each intention recognition result, and the target intention recognition result is determined from these results, which improves the accuracy of intention recognition.
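The cascade of steps B10 to B40 can be sketched as a short function. The names are hypothetical, the dictionaries mapping questions and keywords to intentions are assumed data structures, and `fallback` stands in for the combined model-plus-rules recognition; the source also does not state which intention is returned on a keyword hit, so returning the intention associated with the keyword is an assumption.

```python
def recognize_intent(text, standard_questions, intent_keywords, fallback):
    """Try exact question match, then keyword match, then the model/rule fallback.

    standard_questions: dict mapping a standard question to its intention.
    intent_keywords:    dict mapping a keyword to its intention.
    fallback:           callable used only when both cheap matches fail.
    """
    # Step B10: exact match against a pre-collected standard question.
    if text in standard_questions:
        return standard_questions[text]
    # Step B20: keyword match (assumed to yield the keyword's intention).
    for keyword, intent in intent_keywords.items():
        if keyword in text:
            return intent
    # Steps B30 and B40: model and rule recognition, combined by the caller.
    return fallback(text)


# Usage with hypothetical data:
questions = {"what is the insurance application age": "ask_insurance_age"}
keywords = {"insured person": "ask_insured_identity"}
result = recognize_intent("who can be the insured person", questions, keywords,
                          fallback=lambda t: "unknown")
```

Ordering the checks from cheapest to most expensive means the model is invoked only for queries that neither lookup can resolve.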
Further, referring to fig. 4, based on the first embodiment in the present application, in another embodiment of the present application, before step S20, the method further includes:
step C10, carrying out blacklist matching on the text information to be inquired and preset blacklist field information;
step C20, if the matching of the blacklist fails, carrying out white list matching on the text information to be inquired and preset white list field information;
step C30, if the white list is successfully matched, recognizing the text information to be inquired through a preset rejection recognition model;
step C40, if the recognition is successful, returning to execute the step of: carrying out entity recognition on the text information to be inquired based on the trained entity extraction model to obtain target entity information.
In this embodiment, it can be understood that the preset blacklist field information is a list of vocabulary unrelated to the target field, and the preset white list field information is a list of vocabulary related to the target field. For example, in the insurance field, the preset blacklist field information includes vocabularies such as fund and financing, and the preset white list field information includes vocabularies such as accident risk and continuous risk. Specifically, blacklist matching is first performed between the text information to be queried and the preset blacklist field information. If the blacklist matching succeeds, this indicates that the text information to be queried is unrelated to the target field, and rejection information is returned directly so that the target user can input the text information to be queried again. If the blacklist matching fails, white list matching is performed between the text information to be queried and the preset white list field information. If the white list matching succeeds, this indicates that the text information to be queried contains vocabulary related to the target field, and rejection recognition is then performed on the text information to be queried through a preset rejection recognition model. If the text information to be queried passes the preset rejection recognition model, the method returns to execute the step of: performing entity recognition on the text information to be queried based on the trained entity extraction model to obtain target entity information, so as to execute the subsequent flow steps.
Through the above steps, before entity recognition and intention recognition are carried out, rejection recognition is performed on the text information to be queried using, in sequence, the preset blacklist field information, the preset white list field information and the preset rejection recognition model, so as to judge whether the text information to be queried belongs to the target field. If it does not, it is rejected directly, and subsequent operations such as entity recognition and intention recognition are not needed, which improves the effect of target-field recognition.
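A minimal sketch of the three-stage rejection check might look like the following. The word lists reuse the insurance example from the text; `rejection_model_accepts` is a hypothetical stand-in for the preset rejection recognition model, and treating a white list miss as a rejection is an assumption, since the text does not say what happens in that case.

```python
# Vocabulary taken from the insurance example in the description.
BLACKLIST = {"fund", "financing"}
WHITELIST = {"accident risk", "continuous risk"}


def rejection_model_accepts(text):
    """Stand-in for the preset rejection recognition model (assumption)."""
    return len(text.split()) >= 2


def should_process(text):
    """Return True if the query may proceed to entity recognition."""
    lowered = text.lower()
    # Stage 1: a blacklist hit means the query is outside the target field.
    if any(term in lowered for term in BLACKLIST):
        return False
    # Stage 2: white list matching. Rejecting on a miss is an assumption;
    # the description does not specify this branch.
    if not any(term in lowered for term in WHITELIST):
        return False
    # Stage 3: final decision by the rejection recognition model.
    return rejection_model_accepts(lowered)
```

Only queries for which `should_process` returns True would continue to entity recognition; all others would receive the rejection information.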
Further, referring to fig. 5, based on the first embodiment in the present application, in another embodiment of the present application, before step S210, the method further includes:
step D10, collecting corpus information to be trained, wherein the corpus information to be trained is corpus information labeled with an entity type label, an entity position label and an intention category label;
In this embodiment, specifically, a question template pre-labeled with the entity information to be extracted and the known entity attributes of the target field are obtained, a large number of question corpora are generated through a combination algorithm, and each question corpus is labeled with the corresponding entity type, entity position and intention category, where the entity position is the position of the entity in the text.
step D20, respectively carrying out iterative training on the entity recognition model to be trained and the intention recognition model to be trained based on the corpus information to be trained by combining a transfer learning algorithm and a virtual confrontation training algorithm to obtain the entity recognition model and the intention recognition model.
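As an illustration of generating labeled corpora by combination, the sketch below fills hypothetical templates with every product and attribute pair and records the entity span and the intention label. The templates, product names, attribute names and intent names are all assumptions made up for the example.

```python
from itertools import product

# Hypothetical templates and slot values (assumptions for illustration).
TEMPLATES = [
    "what is the {attribute} of {product}",
    "tell me the {attribute} for {product}",
]
PRODUCTS = ["accident insurance", "travel insurance"]
ATTRIBUTES = {"premium": "price_query", "coverage period": "term_query"}


def generate_corpus():
    """Combine templates with slot values and label each generated question."""
    samples = []
    for template, prod, (attribute, intent) in product(
        TEMPLATES, PRODUCTS, ATTRIBUTES.items()
    ):
        text = template.format(attribute=attribute, product=prod)
        start = text.index(prod)  # character offset of the entity in the text
        samples.append(
            {
                "text": text,
                "entities": [
                    {"type": "product", "start": start, "end": start + len(prod)}
                ],
                "intent": intent,
            }
        )
    return samples
```

Because the entity is inserted by the generator, its position label comes for free and no manual annotation is needed for these samples.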
In this embodiment, it should be noted that the transfer learning algorithm applies the knowledge or patterns learned in one field or task to a different but related field or problem; that is, the parameters of an already trained model (the pre-training model) are transferred to a new model to help train the new model. Specifically, a pre-training model trained on massive training data is constructed; the common parameters or prior distributions shared between the source data of the pre-training model and the target data of the entity recognition model and the intention classification language model are analyzed; a mapping of related knowledge between the source domain and the target domain is established; and a high-precision, reliable learning model is thereby established in the target field.
Virtual confrontation training algorithm: virtual confrontation training (also known as virtual adversarial training) is an effective technique for enforcing local smoothness of the model output. Specifically, it looks for pairs of data points that are very close in the input space but far apart in the model output space, and then trains the model so that its outputs for such pairs are close to each other. To this end, for a given input, a perturbation is found for which the model gives a very different output, and the model is then penalized for its sensitivity to that perturbation. Because this penalty does not require labels, the supervised learning model is extended to a semi-supervised learning model, and the generalization performance of the whole model with few samples is improved.
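The idea can be made concrete with a toy sketch. It assumes a binary logistic model `p = sigmoid(w·x + b)` rather than the language models used in the application, and it uses the common one-step approximation: start from a small random direction, take the gradient of the KL divergence between the clean and perturbed predictions, and rescale it to length `epsilon`. The model, `epsilon`, `xi` and all function names are assumptions for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of the positive class under a linear logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def bernoulli_kl(p, q):
    """KL(p || q) for two Bernoulli distributions, clamped for stability."""
    p = min(max(p, 1e-12), 1 - 1e-12)
    q = min(max(q, 1e-12), 1 - 1e-12)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def virtual_adversarial_perturbation(w, b, x, epsilon=0.5, xi=1e-3, seed=0):
    """One power-iteration step towards the most output-changing direction."""
    rng = random.Random(seed)
    d = [rng.gauss(0.0, 1.0) for _ in x]
    d_norm = math.sqrt(sum(v * v for v in d))
    d = [v / d_norm for v in d]
    p = predict(w, b, x)
    q = predict(w, b, [xv + xi * dv for xv, dv in zip(x, d)])
    # For this model, d KL(p || q) / d r = (q - p) * w.
    grad = [(q - p) * wi for wi in w]
    g_norm = math.sqrt(sum(g * g for g in grad))
    if g_norm == 0.0:
        return [0.0] * len(x)
    return [epsilon * g / g_norm for g in grad]


def vat_loss(w, b, x, epsilon=0.5):
    """Smoothness penalty: KL between clean and perturbed predictions."""
    r = virtual_adversarial_perturbation(w, b, x, epsilon=epsilon)
    p = predict(w, b, x)
    q = predict(w, b, [xv + rv for xv, rv in zip(x, r)])
    return bernoulli_kl(p, q)
```

Note that `vat_loss` never looks at a label, which is why the penalty can also be computed on unlabeled questions and added to the ordinary supervised loss during training.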
Specifically, the model parameters of the trained pre-training model are transferred to the entity recognition model to be trained and the intention recognition model to be trained through the transfer learning algorithm. Iterative training is then carried out on the entity recognition model to be trained and the intention recognition model to be trained respectively, based on the corpus information to be trained and the virtual confrontation training algorithm, and the corresponding recognition results are output. The model loss is calculated based on the entity type, the entity position and the intention category output by the entity recognition model to be trained and the intention recognition model to be trained, and the model parameters are adjusted based on this model loss to obtain the final entity recognition model and intention recognition model, which improves the generalization performance and the accuracy of the whole model with few samples.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a virtual confrontation-based knowledge-graph question-answering apparatus of a hardware operating environment according to an embodiment of the present application.
As shown in fig. 6, the virtual confrontation-based knowledge-graph question-answering apparatus may include: a processor 1001, such as a CPU, a memory 1005, and a communication bus 1002. The communication bus 1002 is used for realizing connection communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory. The memory 1005 may alternatively be a memory device separate from the processor 1001 described above.
Optionally, the virtual confrontation-based knowledge-graph question-answering device may further include a rectangular user interface, a network interface, a camera, RF (Radio Frequency) circuitry, sensors, audio circuitry, a WiFi module, and so forth. The rectangular user interface may include a display screen (Display) and an input sub-module such as a keyboard (Keyboard), and the optional rectangular user interface may also include a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface).
Those skilled in the art will appreciate that the virtual confrontation-based knowledge-graph question-answering machine architecture shown in FIG. 6 does not constitute a limitation of the virtual confrontation-based knowledge-graph question-answering machine, and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.
As shown in fig. 6, a memory 1005, which is a type of computer storage medium, may include an operating system, a network communication module, and a virtual confrontation-based knowledge-graph question-answering program. The operating system is a program that manages and controls the hardware and software resources of the virtual confrontation-based knowledge-graph question-answering device, supporting the operation of the virtual confrontation-based knowledge-graph question-answering program, as well as other software and/or programs. The network communication module is used to enable communication between the various components within the memory 1005, as well as with other hardware and software in the virtual confrontation-based knowledge-graph question-answering system.
In the apparatus of fig. 6, the processor 1001 is configured to execute the virtual confrontation-based knowledge-graph question-answering program stored in the memory 1005, and to implement the steps of any one of the virtual confrontation-based knowledge-graph question-answering methods described above.
The specific implementation of the knowledge-graph question-answering device based on the virtual confrontation is basically the same as that of the knowledge-graph question-answering method based on the virtual confrontation, and the detailed description is omitted here.
In addition, referring to fig. 7, fig. 7 is a schematic diagram of functional modules of the virtual confrontation-based knowledge-graph question-answering device according to the present application, and the present application further provides a virtual confrontation-based knowledge-graph question-answering system, which includes:
the acquisition module is used for acquiring text information to be inquired of a target user;
the entity identification module is used for carrying out entity identification on the text information to be inquired based on a trained entity extraction model to obtain target entity information, wherein the entity extraction model is obtained by training based on pre-collected corpus information to be trained and combined with a virtual confrontation training algorithm;
the intention recognition module is used for respectively performing intention recognition on the text information to be queried based on a trained intention recognition model and a preset recognition rule to obtain a target intention recognition result if the target entity information has a preset target type entity, wherein the intention recognition model is obtained by training based on pre-collected corpus information to be trained and combined with a virtual confrontation training algorithm;
and the query module is used for performing data query on a preset constructed graph database based on the target entity information, the target intention recognition result and the text information to be queried to obtain target query information.
Optionally, the entity identification module is further configured to:
acquiring standard entity information and similar entity information of a target field, and constructing an entity search prefix tree based on the standard entity information and the similar entity information;
retrieving and matching the text information to be queried with standard entity information and similar entity information in the entity search prefix tree;
if the matching fails, performing entity extraction on the text information to be inquired based on an entity extraction model to obtain entity information;
calculating the similarity between the entity information and the standard entity information and the similar entity information in the entity search prefix tree;
and determining the target entity information corresponding to the similarity exceeding a preset similarity threshold based on the similarities.
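The prefix-tree lookup with a similarity fallback can be sketched as below. The sketch assumes that each similar entity name maps to one standard entity name, uses `difflib.SequenceMatcher` as a stand-in similarity measure, and takes the entity extraction model as a caller-supplied function `extract`; the application does not specify any of these details.

```python
from difflib import SequenceMatcher

_END = ""  # end-of-name marker; never collides with a single character key


class EntityTrie:
    """Prefix tree over standard and similar entity names."""

    def __init__(self):
        self.root = {}

    def insert(self, name, standard):
        node = self.root
        for ch in name:
            node = node.setdefault(ch, {})
        node[_END] = standard  # store the standard name at the terminal node

    def longest_match(self, text):
        """Return the standard name of the longest entity found in text."""
        best = None
        for start in range(len(text)):
            node = self.root
            for end in range(start, len(text)):
                node = node.get(text[end])
                if node is None:
                    break
                length = end + 1 - start
                if _END in node and (best is None or length > best[1]):
                    best = (node[_END], length)
        return best[0] if best else None


def resolve_entity(text, trie, names, extract, threshold=0.8):
    """Try the trie first; otherwise extract and compare by similarity."""
    hit = trie.longest_match(text)
    if hit is not None:
        return hit
    candidate = extract(text)  # stand-in for the entity extraction model
    best_score, best_standard = 0.0, None
    for name, standard in names.items():
        score = SequenceMatcher(None, candidate, name).ratio()
        if score > best_score:
            best_score, best_standard = score, standard
    # Keep the match only if its similarity exceeds the preset threshold.
    return best_standard if best_score > threshold else None
```

Here `names` is a dictionary from every known name (standard or similar) to its standard form, and the same dictionary is used to populate the trie.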
Optionally, the intention recognition module is further configured to:
if the target entity information has a preset target type entity, performing question matching on the text information to be inquired and standard question information collected in advance;
if the question matching fails, performing keyword matching on the text information to be inquired and pre-collected intention keywords;
if the keyword matching fails, respectively identifying the text information to be inquired based on the intention identification model and the preset identification rule to obtain each intention identification result;
and determining the target intention recognition result based on each intention recognition result.
Optionally, the virtual confrontation-based knowledge-graph question-answering system is further configured to:
and if the target entity information does not have a preset target type entity, judging that the intention of the text information to be inquired is a recommended inquiry intention associated with the target entity information, and taking the recommended inquiry intention as a target intention identification result.
Optionally, the query module is further configured to:
splicing the target entity information, the target intention recognition result and the text information to be inquired to obtain a target splicing result;
and performing data traversal query on the graph database based on a target splicing result to obtain the target query information.
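One possible reading of the splicing and traversal steps is sketched below with an in-memory adjacency list in place of a real graph database. The graph contents, the mapping from intention to relation, and the shape of the spliced result are all hypothetical; the values stored in the graph are placeholders, not real product data.

```python
# Hypothetical in-memory graph: entity -> list of (relation, target) edges.
GRAPH = {
    "accident insurance": [
        ("premium", "example premium value"),
        ("coverage_period", "example period value"),
    ],
}

# Hypothetical mapping from recognized intention to graph relation.
INTENT_TO_RELATION = {"price_query": "premium", "term_query": "coverage_period"}


def splice(entity, intent, text):
    """Combine entity, intention and original text into one query object."""
    return {
        "entity": entity,
        "relation": INTENT_TO_RELATION.get(intent),
        "text": text,
    }


def query_graph(spliced):
    """Traverse the entity's outgoing edges and collect matching targets."""
    answers = []
    for relation, target in GRAPH.get(spliced["entity"], []):
        if relation == spliced["relation"]:
            answers.append(target)
    return answers
```

In a real deployment the spliced result would more likely be rendered into the query language of the chosen graph database; the dictionary form is used here only to keep the sketch self-contained.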
Optionally, the virtual confrontation-based knowledge-graph question-answering system is further configured to:
carrying out blacklist matching on the text information to be inquired and preset blacklist field information;
if the blacklist matching fails, carrying out white list matching on the text information to be inquired and preset white list field information;
if the white list is successfully matched, identifying the text information to be inquired through a preset rejection identification model;
if the identification is successful, returning to the execution step: and carrying out entity recognition on the text information to be inquired based on the trained entity extraction model to obtain target entity information.
Optionally, the virtual confrontation-based knowledge-graph question-answering system is further configured to:
collecting corpus information to be trained, wherein the corpus information to be trained is corpus information marked with an entity type label, an entity position label and an intention category label;
and performing iterative training on the entity recognition model to be trained and the intention recognition model to be trained respectively based on the corpus information to be trained and by combining a transfer learning algorithm and a virtual confrontation training algorithm to obtain the entity recognition model and the intention recognition model.
Optionally, the virtual confrontation-based knowledge-graph question-answering system is further configured to:
acquiring product information of a target field;
and mapping the product information based on the preset graph data model to obtain the entity relationship data, and importing the entity relationship data into the graph database, wherein the preset graph data model is constructed based on a data structure of the product information.
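The mapping from product information to entity relationship data could be sketched as follows. The graph data model, field names and relation names are hypothetical, and a plain dictionary stands in for the graph database.

```python
# Hypothetical graph data model derived from the product data structure.
GRAPH_DATA_MODEL = {
    "entity_key": "name",
    "relations": {"premium": "HAS_PREMIUM", "insurer": "ISSUED_BY"},
}


def map_product(product_info):
    """Turn one product record into (head, relation, tail) triples."""
    head = product_info[GRAPH_DATA_MODEL["entity_key"]]
    return [
        (head, relation, product_info[field])
        for field, relation in GRAPH_DATA_MODEL["relations"].items()
        if field in product_info
    ]


def import_triples(graph_db, triples):
    """Load triples into a dict that stands in for the graph database."""
    for head, relation, tail in triples:
        graph_db.setdefault(head, []).append((relation, tail))
    return graph_db
```

Fields of a product record that the data model does not mention are simply skipped, so the model acts as the single place where the graph schema is defined.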
The specific implementation of the knowledge-graph question-answering system based on the virtual confrontation is basically the same as that of each embodiment of the knowledge-graph question-answering method based on the virtual confrontation, and is not described in detail herein.
The present application provides a storage medium, which is a computer-readable storage medium, and the computer-readable storage medium stores one or more programs, which are further executable by one or more processors for implementing the steps of the virtual confrontation-based knowledge-graph question-answering method described in any one of the above.
The specific implementation of the computer-readable storage medium of the present application is substantially the same as the above-mentioned embodiments of the knowledge-graph question-answering method based on virtual confrontation, and is not described herein again.
The above description is only a preferred embodiment of the present application and is not intended to limit the scope of the present application. All equivalent structures or equivalent processes, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of the present application.

Claims (10)

1. A knowledge-graph question-answering method based on virtual confrontation is characterized by comprising the following steps:
acquiring text information to be inquired of a target user;
performing entity recognition on the text information to be queried based on a trained entity extraction model to obtain target entity information, wherein the entity extraction model is obtained by training based on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm;
if the target entity information has a preset target type entity, respectively performing intention recognition on the text information to be queried based on a trained intention recognition model and a preset recognition rule to obtain a target intention recognition result, wherein the intention recognition model is obtained by training based on pre-collected corpus information to be trained in combination with a virtual confrontation training algorithm;
and performing data query on a preset constructed graph database based on the target entity information, the target intention recognition result and the text information to be queried to obtain target query information.
2. The knowledge-graph question-answering method based on virtual confrontation according to claim 1, wherein the step of performing entity recognition on the text information to be queried based on the trained entity extraction model to obtain target entity information further comprises:
acquiring standard entity information and similar entity information of a target field, and constructing an entity search prefix tree based on the standard entity information and the similar entity information;
retrieving and matching the text information to be queried with standard entity information and similar entity information in the entity search prefix tree;
if the matching fails, performing entity extraction on the text information to be inquired based on an entity extraction model to obtain entity information;
calculating the similarity between the entity information and the standard entity information and the similar entity information in the entity search prefix tree;
and determining the target entity information corresponding to the similarity exceeding a preset similarity threshold value based on each similarity.
3. The method as claimed in claim 1, wherein if there is a preset target type entity in the target entity information, the step of performing intent recognition on the target entity information based on a trained intent recognition model and a preset recognition rule to obtain a target intent recognition result comprises:
if the target entity information has a preset target type entity, matching question sentences of the text information to be inquired and standard question information collected in advance;
if the question matching fails, performing keyword matching on the text information to be inquired and the pre-collected intention keywords;
if the keyword matching fails, respectively identifying the text information to be queried based on the intention identification model and the preset identification rule to obtain each intention identification result;
and determining the target intention recognition result based on each intention recognition result.
4. The knowledge-graph question-answering method based on virtual confrontation according to claim 1, wherein after the step of performing entity recognition on the text information to be queried based on the trained entity extraction model to obtain the target entity information, the method further comprises:
and if the target entity information does not have a preset target type entity, judging that the intention of the text information to be inquired is a recommended inquiry intention associated with the target entity information, and taking the recommended inquiry intention as a target intention identification result.
5. The knowledge-graph question-answering method based on virtual confrontation according to claim 1, wherein the step of performing data query in the preset constructed graph database based on the target entity information, the target intention recognition result and the text information to be queried to obtain target query information comprises:
splicing the target entity information, the target intention recognition result and the text information to be inquired to obtain a target splicing result;
and performing data traversal query on the graph database based on a target splicing result to obtain the target query information.
6. The knowledge-graph question-answering method based on virtual confrontation according to claim 1, wherein before the step of performing entity recognition on the text information to be queried based on the entity extraction model to obtain the target entity information, the method further comprises:
carrying out blacklist matching on the text information to be inquired and preset blacklist field information;
if the blacklist matching fails, carrying out white list matching on the text information to be inquired and preset white list field information;
if the white list is successfully matched, identifying the text information to be inquired through a preset rejection identification model;
if the identification is successful, returning to the execution step: and performing entity identification on the text information to be inquired based on the trained entity extraction model to obtain target entity information.
7. The virtual confrontation-based knowledge-graph question-answering method according to claim 1, characterized in that before the step of obtaining the text information to be queried of the target user, the virtual confrontation-based knowledge-graph question-answering method further comprises:
collecting corpus information to be trained, wherein the corpus information to be trained is corpus information marked with an entity type label, an entity position label and an intention category label;
and respectively performing iterative training on an entity recognition model to be trained and an intention recognition model to be trained on the basis of the corpus information to be trained by combining a transfer learning algorithm and a virtual confrontation training algorithm to obtain the entity recognition model and the intention recognition model.
8. The virtual confrontation-based knowledge-graph question-answering method according to claim 1, characterized in that before the step of obtaining the text information to be queried of the target user, the virtual confrontation-based knowledge-graph question-answering method further comprises:
acquiring product information of a target field;
and mapping the product information based on the preset graph data model to obtain the entity relationship data, and importing the entity relationship data into the graph database, wherein the preset graph data model is constructed based on a data structure of the product information.
9. A virtual confrontation-based knowledge-graph question-answering apparatus, characterized in that the virtual confrontation-based knowledge-graph question-answering apparatus comprises: a memory, a processor, and a virtual confrontation-based knowledge-graph question-answering program stored on the memory,
the virtual confrontation-based knowledge-graph question-answering program is executed by the processor to implement the virtual confrontation-based knowledge-graph question-answering method according to any one of claims 1 to 8.
10. A storage medium being a computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a virtual confrontation-based knowledge-graph question-answering program, which is executed by a processor to implement the steps of the virtual confrontation-based knowledge-graph question-answering method according to any one of claims 1 to 8.
CN202210317565.7A 2022-03-29 2022-03-29 Knowledge graph question-answering method, device and storage medium based on virtual confrontation Pending CN114647713A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210317565.7A CN114647713A (en) 2022-03-29 2022-03-29 Knowledge graph question-answering method, device and storage medium based on virtual confrontation

Publications (1)

Publication Number Publication Date
CN114647713A true CN114647713A (en) 2022-06-21

Family

ID=81995876

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210317565.7A Pending CN114647713A (en) 2022-03-29 2022-03-29 Knowledge graph question-answering method, device and storage medium based on virtual confrontation

Country Status (1)

Country Link
CN (1) CN114647713A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114818740A (en) * 2022-06-30 2022-07-29 江苏微皓智能科技有限公司 Man-machine cooperation method and system based on domain knowledge graph
CN115525804A (en) * 2022-09-23 2022-12-27 中电金信软件有限公司 Information query method and device, electronic equipment and storage medium
CN115982391A (en) * 2023-03-17 2023-04-18 恒生电子股份有限公司 Information processing method and device
CN115982391B (en) * 2023-03-17 2023-07-25 恒生电子股份有限公司 Information processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination