Disclosure of Invention
The embodiments of the invention aim to provide a question answering method, an electronic device, and a computer-readable storage medium, which effectively improve the accuracy of answers to questions.
In order to solve the above technical problem, an embodiment of the present invention provides a question answering method, including the following steps: acquiring an entity in a question to be answered as a question entity; acquiring, from a knowledge graph, a plurality of graph entities associated with the question entity and a plurality of graph relations associated with the graph entities, and acquiring entity scores of the graph entities and relation scores of the graph relations; determining a preset weight according to a plurality of training questions with known answers; performing weighted calculation on the entity scores and the relation scores according to the preset weight, and acquiring a target entity and a target relation according to the calculation result; and obtaining an answer to the question to be answered according to the target entity and the target relation.
An embodiment of the present invention also provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the question-answering method as described above.
An embodiment of the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the question answering method described above.
Compared with the prior art, in the embodiments of the invention, after the question to be answered is obtained, graph entities and graph relations are obtained from the knowledge graph according to the question entity in the question to be answered, the entity score of each graph entity and the relation score of each graph relation are obtained, and the entity scores and relation scores are weighted according to a preset weight. Because the preset weight is determined from training questions with known answers, the target entity and target relation obtained from the result of the weighted calculation are more accurate, and the accuracy of the answer to the question to be answered is thereby effectively improved.
In addition, the obtaining of the preset weight specifically includes: obtaining a plurality of first training questions and their standard answers, and setting a plurality of initial weights; acquiring a training entity set and a training relation set corresponding to each first training question from the knowledge graph; obtaining, according to the training entity set and the training relation set, the answers of the first training questions under each initial weight as test answers; comparing the test answers with the standard answers to obtain the accuracy of each initial weight; and taking the initial weight with the highest accuracy as the preset weight. Using the initial weight with the highest accuracy as the preset weight effectively improves the accuracy of the question answer.
In addition, the acquiring of a plurality of graph entities associated with the question entity from the knowledge graph specifically includes: acquiring the name of the question entity; and acquiring, as graph entities, a plurality of entities in the knowledge graph whose name similarity with the question entity is greater than a preset threshold.
In addition, the obtaining of the entity score of each graph entity specifically includes: obtaining a question word vector of the question to be answered; obtaining word vectors of the graph entities to obtain a plurality of entity word vectors; obtaining the word vector similarity between each entity word vector and the question word vector; and taking the word vector similarity as the entity score of each graph entity.
In addition, before the relation score of each graph relation is obtained, the method further includes: constructing a relation prediction model; and performing data training on the relation prediction model according to a plurality of second training questions, a plurality of training entities corresponding to the second training questions, and a plurality of training relations corresponding to the training entities. The obtaining of the relation score of each graph relation specifically includes: after the relation prediction model is trained, inputting the question to be answered, each graph entity, and each graph relation into the relation prediction model; and taking the probability output by the relation prediction model for each graph relation as the score of that graph relation.
In addition, before the data training of the relation prediction model, the method further includes: judging whether the number of training relations is greater than or equal to a preset threshold; and if the number of training relations is smaller than the preset threshold, randomly adding at least one relation to the training relations so that the number of training relations reaches the preset threshold. Adding relations when the number falls below the threshold guarantees a sufficient number of training relations for data training of the relation prediction model, ensures the effectiveness of the training, improves the accuracy of relation prediction by the trained model, and thereby further improves the accuracy of the answers produced by the question answering method.
In addition, before the question to be answered, each graph entity, and each graph relation are input into the relation prediction model, the method further includes: converting all the graph entities into a preset entity. The inputting of the question to be answered, each graph entity, and each graph relation into the relation prediction model then specifically includes: inputting the question to be answered, the preset entity, and the graph relations into the relation prediction model.
In addition, before the weighted calculation of the entity scores and the relation scores according to the preset weight, the method further includes: obtaining the N graph relations with the highest scores, where N is an integer greater than zero. The weighted calculation then specifically includes: performing weighted calculation on the entity scores and the relation scores of the N graph relations according to the preset weight.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the various embodiments in order to provide a better understanding of the present application; however, the technical solution claimed in the present application can be implemented without these technical details, and various changes and modifications may be made based on the following embodiments.
A first embodiment of the present invention relates to a question answering method, the specific flow of which is shown in Fig. 1 and includes the following steps:
Step S101: an entity in the question to be answered is acquired as the question entity.
Specifically, in this step, the question to be answered is the question input by the user. After the question to be answered is obtained, it is first normalized: redundant punctuation (for example, punctuation at the end of the question) is deleted, meaningless character strings are deleted, and the question is segmented into words.
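The normalization described above can be sketched as follows. This is an illustrative example only, not part of the claimed method; the filler-string list and the whitespace-based segmentation are assumptions (a real system would use a tokenizer suited to the question language).

```python
import re

def normalize_question(question: str) -> list[str]:
    # Remove redundant punctuation at the end of the question.
    question = question.rstrip("?!. ")
    # Strip filler strings that carry no meaning (illustrative list).
    for filler in ("please tell me", "i want to know"):
        question = question.lower().replace(filler, "")
    # Segment the question into words (whitespace/punctuation split;
    # a real system would use a language-appropriate tokenizer).
    return [tok for tok in re.split(r"\W+", question) if tok]

tokens = normalize_question("Please tell me who founded Apple?")
```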
Step S102: a plurality of graph entities associated with the question entity are obtained.
Specifically, in the present embodiment, the entity name of the question entity is first obtained, and entities having the same entity name as the question entity are obtained from the knowledge graph as graph entities. For example, if the question entity is "apple", a plurality of entities named "apple" (including the fruit apple, the Apple phone, and the like) are obtained from the knowledge graph as graph entities.
It is to be understood that obtaining entities with the same entity name as the question entity from the knowledge graph is merely a specific example in the present embodiment and not a limitation; in other embodiments of the present invention, entities whose names are similar to that of the question entity may be obtained from the knowledge graph as graph entities. Specifically, in another embodiment of the present invention, entities whose name similarity with the question entity is greater than a preset value may be obtained from the knowledge graph as graph entities. For example, if the question entity is "worker ant", entities such as "ant" are obtained from the knowledge graph as graph entities.
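The name-similarity variant might be sketched as below. The use of `difflib.SequenceMatcher` as the similarity measure and the threshold value are assumptions for illustration; the embodiment does not fix a particular measure.

```python
from difflib import SequenceMatcher

def candidate_graph_entities(question_entity: str,
                             kg_entity_names: list[str],
                             threshold: float = 0.4) -> list[str]:
    """Return knowledge-graph entities whose name similarity with the
    question entity exceeds the preset threshold."""
    def name_sim(a: str, b: str) -> float:
        # Edit-distance-based ratio in [0, 1]; one possible measure.
        return SequenceMatcher(None, a, b).ratio()
    return [name for name in kg_entity_names
            if name_sim(question_entity.lower(), name.lower()) > threshold]

cands = candidate_graph_entities("worker ant", ["ant", "anteater", "apple"])
```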
In particular, in this embodiment, the knowledge graph is essentially a database of triples composed of entities and the relations between them.
Further, in this embodiment, after the knowledge graph is constructed, word vector representations of all entities and relations in the knowledge graph are obtained through an algorithm such as TransE. TransE is a distributed word vector representation of entities and relations: the relation in each triple instance (head entity, relation, tail entity) is regarded as a translation from the head entity to the tail entity, so that adding the relation word vector to the head entity word vector should yield the tail entity word vector, that is, h + r = t, where h denotes the word vector of the head entity, r the word vector of the relation, and t the word vector of the tail entity. During training, h, r, and t are continuously adjusted to make (h + r) as close to t as possible. Continuously adjusting the entity and relation vectors through the TransE algorithm makes the word vector representations of the entities and relations in the knowledge graph more accurate.
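The translation assumption h + r = t can be illustrated numerically. The embedding values below are toy numbers chosen so that the relation holds; real TransE training adjusts the vectors by gradient descent over the triples of the knowledge graph.

```python
import math

def transe_score(h, r, t):
    # TransE treats the relation as a translation from head to tail:
    # a triple (h, r, t) is plausible when h + r is close to t, i.e.
    # the distance ||h + r - t|| is small.
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Toy 3-dimensional embeddings (illustrative values only).
h = [0.1, 0.2, 0.3]   # head entity
r = [0.4, 0.1, 0.0]   # relation
t = [0.5, 0.3, 0.3]   # tail entity

good = transe_score(h, r, t)               # h + r ≈ t, distance near zero
bad = transe_score(h, r, [-x for x in t])  # mismatched tail, large distance
```

A plausible triple thus scores near zero, while an implausible one scores noticeably higher; training pushes plausible triples toward zero distance.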
Step S103: the score of each graph entity is obtained.
Specifically, in this step, the word vector representation of the question to be answered is first obtained, the word vector representations of the respective graph entities are obtained, and the word vector similarity between each graph entity and the question to be answered is calculated as the score of that graph entity.
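A minimal sketch of this scoring step, assuming cosine similarity as the word vector similarity (the embodiment does not fix a particular similarity function) and pre-computed toy embeddings:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def entity_scores(question_vec, entity_vecs):
    # Score each candidate graph entity by the similarity of its word
    # vector to the word vector of the question to be answered.
    return {name: cosine_similarity(question_vec, vec)
            for name, vec in entity_vecs.items()}

q = [1.0, 0.0, 1.0]  # toy question word vector
scores = entity_scores(q, {"apple (fruit)": [1.0, 0.1, 0.9],
                           "apple (phone)": [0.0, 1.0, 0.1]})
```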
It should be understood that the above is only one specific example of obtaining the score of each graph entity in the present embodiment and is not a limitation; in other embodiments of the present invention, the score of each graph entity may be obtained by methods such as semantic analysis, which are not enumerated here and may be chosen flexibly according to actual needs.
Step S104: a plurality of graph relations associated with the graph entities are obtained.
Specifically, in the present embodiment, all relations associated with each graph entity are obtained from the knowledge graph as graph relations.
Step S105: the relation score of each graph relation is obtained.
Specifically, in the present embodiment, the score of each graph relation is obtained by a trained relation prediction model: the question to be answered, each graph entity, and each graph relation are input into the trained relation prediction model, and the probability output by the model for each graph relation is taken as the score of that graph relation.
Further, in this embodiment, the relation prediction model may be a neural network model that, given an input question, entity, and candidate relations, outputs the probability that the relation expressed in the question is each candidate relation. Before data training, an initial relation prediction model is constructed; a plurality of second training questions, a plurality of training entities corresponding to the second training questions, and a plurality of training relations corresponding to the training entities are obtained; and the second training questions, training entities, and training relations are input into the initial relation prediction model for data training.
Preferably, in this embodiment, before data training of the initial relation prediction model, it is first determined whether the number of training relations is greater than or equal to a preset threshold; if it is smaller than the preset threshold, at least one relation is randomly added to the training relations so that their number reaches the preset threshold. This guarantees a sufficient number of training relations for data training of the relation prediction model, ensures the effectiveness of the training, improves the accuracy of relation prediction by the trained model, and thereby further improves the accuracy of the answers produced by the question answering method.
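The padding check might be sketched as follows, assuming the added relations are drawn at random from the remaining relation vocabulary of the knowledge graph (the embodiment only states that at least one relation is randomly added):

```python
import random

def pad_training_relations(training_relations: list[str],
                           all_relations: list[str],
                           threshold: int) -> list[str]:
    """Ensure at least `threshold` training relations by randomly
    adding relations drawn from the rest of the knowledge graph."""
    padded = list(training_relations)
    candidates = [r for r in all_relations if r not in padded]
    while len(padded) < threshold and candidates:
        # Randomly pick a relation not yet in the training set.
        padded.append(candidates.pop(random.randrange(len(candidates))))
    return padded

padded = pad_training_relations(
    ["founded_by"],
    ["founded_by", "ceo_of", "located_in", "born_in"],
    threshold=3)
```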
Step S106: a preset weight is determined according to a plurality of training questions with known answers.
Specifically, in this step, a plurality of first training questions and their standard answers are first obtained, and a plurality of initial weights are set; a training entity set and a training relation set corresponding to each first training question are acquired from the knowledge graph; according to the training entity set and the training relation set, the answers of all first training questions under each initial weight are obtained as test answers; the test answers are compared with the standard answers to obtain the accuracy of each initial weight; and the initial weight with the highest accuracy is taken as the preset weight.
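The weight selection can be sketched as a simple search over the candidate initial weights. Here `answer_fn` is a hypothetical stand-in for the full entity/relation scoring pipeline, and the toy lookup table replaces a real knowledge graph:

```python
def select_preset_weight(initial_weights, training_questions,
                         standard_answers, answer_fn):
    """Return the initial weight whose test answers best match the
    standard answers of the first training questions."""
    best_weight, best_acc = None, -1.0
    for w in initial_weights:
        correct = sum(answer_fn(q, w) == gold
                      for q, gold in zip(training_questions, standard_answers))
        acc = correct / len(training_questions)
        if acc > best_acc:
            best_weight, best_acc = w, acc
    return best_weight

# Toy pipeline: under weight 0.7 both questions are answered correctly.
toy = {("q1", 0.3): "a", ("q1", 0.7): "a",
       ("q2", 0.3): "x", ("q2", 0.7): "b"}
w = select_preset_weight([0.3, 0.7], ["q1", "q2"], ["a", "b"],
                         lambda q, wt: toy[(q, wt)])
```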
Step S107: weighted calculation is performed on the entity scores and the relation scores according to the preset weight, and a target entity and a target relation are acquired according to the calculation result.
Specifically, in this step, the entity scores and the relation scores are subjected to weighted summation according to the preset weight, and the entity-relation pair of graph entity and graph relation with the largest sum is taken as the target entity and target relation.
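A sketch of the weighted selection, assuming the preset weight is applied to the entity score and its complement to the relation score (the embodiment specifies a weighted summation but not this exact form):

```python
def best_entity_relation(entity_scores, relation_scores, weight):
    """Weighted sum of entity and relation scores; return the
    entity-relation pair with the largest combined score.
    `weight` is the preset weight applied to the entity score
    (the relation score receives 1 - weight)."""
    best_pair, best_sum = None, float("-inf")
    for entity, e_score in entity_scores.items():
        for relation, r_score in relation_scores.items():
            s = weight * e_score + (1 - weight) * r_score
            if s > best_sum:
                best_pair, best_sum = (entity, relation), s
    return best_pair

pair = best_entity_relation(
    {"apple (company)": 0.9, "apple (fruit)": 0.4},
    {"founded_by": 0.8, "color": 0.2},
    weight=0.6)
```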
Step S108: the answer to the question to be answered is obtained according to the target entity and the target relation.
Specifically, in this step, after the target entity and the target relation are obtained, the entity corresponding to the target entity and the target relation is retrieved from the knowledge graph as the answer to the question to be answered.
Compared with the prior art, the question answering method provided by the first embodiment of the invention obtains the question to be answered, obtains graph entities and graph relations from the knowledge graph according to the question entity in the question to be answered, obtains the entity score of each graph entity and the relation score of each graph relation, and performs weighted calculation on the entity scores and relation scores according to the preset weight. Because the preset weight is determined from training questions with known answers, the target entity and target relation obtained from the weighted result are more accurate, and the accuracy of the answer to the question to be answered is thereby effectively improved.
A second embodiment of the present invention relates to a question answering method, which, as shown in Fig. 2, includes the following steps:
Step S201: an entity in the question to be answered is acquired as the question entity.
Step S202: a plurality of graph entities associated with the question entity are obtained.
Step S203: the score of each graph entity is obtained.
Step S204: a plurality of graph relations associated with the graph entities are obtained.
It is to be understood that steps S201 to S204 in the present embodiment are substantially the same as steps S101 to S104 in the first embodiment, and specific reference may be made to the specific description in the first embodiment, which is not repeated herein.
Step S205: all the graph entities are converted into a preset entity.
Step S206: the relation score of each graph relation is obtained according to the preset entity.
Specifically, in this step, the question to be answered, each graph relation, and the preset entity are input into the trained relation prediction model, and the probability output by the model for each graph relation is taken as the score of that graph relation.
Step S207: a preset weight is determined according to a plurality of training questions with known answers.
Step S208: weighted calculation is performed on the entity scores and the relation scores according to the preset weight, and a target entity and a target relation are acquired according to the calculation result.
Step S209: the answer to the question to be answered is obtained according to the target entity and the target relation.
It is to be understood that steps S207 to S209 in the present embodiment are substantially the same as steps S106 to S108 in the first embodiment, and specific reference may be made to the specific description in the first embodiment, which is not repeated herein.
Compared with the prior art, the second embodiment of the invention retains all the technical effects of the first embodiment and, before the score of each graph relation is obtained through the relation prediction model, converts the graph entities into a preset entity, thereby avoiding the influence of numerous complicated graph entities on the prediction result of the relation prediction model and improving the accuracy of the prediction result.
A third embodiment of the present invention relates to a question answering method, which, as shown in Fig. 3, includes the following steps:
Step S301: an entity in the question to be answered is acquired as the question entity.
Step S302: a plurality of graph entities associated with the question entity are obtained.
Step S303: the score of each graph entity is obtained.
Step S304: a plurality of graph relations associated with the graph entities are obtained.
Step S305: the relation score of each graph relation is obtained.
It is to be understood that steps S301 to S305 in the present embodiment are substantially the same as steps S101 to S105 in the first embodiment, and specific reference may be made to the specific description in the first embodiment, which is not repeated herein.
Step S306: the N graph relations with the highest scores are obtained.
Specifically, N is an integer greater than zero.
Further, in this step, the graph relations are sorted in descending order of score, and the top N graph relations are obtained.
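This top-N pruning step can be sketched directly:

```python
def top_n_relations(relation_scores: dict[str, float], n: int) -> list[str]:
    # Sort the graph relations in descending order of score and keep
    # only the first N, pruning low-scoring relations before the
    # weighted calculation.
    ranked = sorted(relation_scores, key=relation_scores.get, reverse=True)
    return ranked[:n]

top = top_n_relations({"founded_by": 0.8, "color": 0.2, "ceo_of": 0.5}, n=2)
```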
Step S307: a preset weight is determined according to a plurality of training questions with known answers.
It is to be understood that step S307 in this embodiment is substantially the same as step S106 in the first embodiment, and specific reference may be made to the specific description in the first embodiment, which is not repeated herein.
Step S308: weighted calculation is performed on the entity scores and the relation scores of the N graph relations according to the preset weight, and a target entity and a target relation are acquired according to the calculation result.
Step S309: the answer to the question to be answered is obtained according to the target entity and the target relation.
It is to be understood that step S309 in this embodiment is substantially the same as step S108 in the first embodiment, and specific reference may be made to the specific description in the first embodiment, which is not repeated herein.
Compared with the prior art, the third embodiment of the invention retains all the technical effects of the first embodiment and performs the weighted calculation only on the N graph relations with the highest scores, thereby eliminating the calculation for graph relations with overly low scores, effectively reducing the amount of weighted calculation, and improving the response efficiency of the question answering method provided by the third embodiment.
The steps of the above methods are divided for clarity of description. In implementation, steps may be combined into one step, or a single step may be split into multiple steps; as long as the same logical relationship is contained, such variations are within the protection scope of this patent. Adding insignificant modifications to the algorithms or processes, or introducing insignificant design changes, without changing the core design of the algorithms or processes is likewise within the scope of this patent.
A fourth embodiment of the present invention relates to an electronic device, as shown in Fig. 4, including: at least one processor 401; and a memory 402 communicatively coupled to the at least one processor 401; wherein the memory 402 stores instructions executable by the at least one processor 401, and the instructions are executed by the at least one processor 401 to enable the at least one processor 401 to perform the question answering method described above.
The memory 402 and the processor 401 are connected by a bus, which may include any number of interconnected buses and bridges linking together various circuits of the one or more processors 401 and the memory 402. The bus may also connect various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore not described further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as multiple receivers and transmitters, providing a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor 401 may be transmitted over a wireless medium via an antenna, which may also receive data and pass it to the processor 401.
The processor 401 is responsible for managing the bus and general processing and may provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And memory 402 may be used to store data used by processor 401 in performing operations.
A fifth embodiment of the present invention relates to a computer program product for use with an electronic device. Since its principle is the same as that of the above question answering method, its implementation can be found in the above method embodiments, and repeated details are omitted. The computer program product comprises a computer-readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising instructions for performing any of the foregoing method embodiments.
A sixth embodiment of the present invention relates to a computer-readable storage medium storing a computer program. The computer program, when executed by a processor, implements the above-described method embodiments.
That is, as can be understood by those skilled in the art, all or part of the steps in the methods of the above embodiments may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that various changes in form and details may be made therein without departing from the spirit and scope of the invention in practice.