CN112784066A

CN112784066A - Information feedback method, device, terminal and storage medium based on knowledge graph

Info

Publication number: CN112784066A
Application number: CN202110278630.5A
Authority: CN
Inventors: 毋杰; 周凯捷
Original assignee: Ping An Life Insurance Company of China Ltd
Current assignee: Ping An Life Insurance Company of China Ltd
Priority date: 2021-03-15
Filing date: 2021-03-15
Publication date: 2021-05-11
Anticipated expiration: 2041-03-15
Also published as: CN112784066B

Abstract

The embodiment of the invention discloses a knowledge graph-based information feedback method, a knowledge graph-based information feedback device, a knowledge graph-based information feedback terminal and a storage medium, wherein the method comprises the steps of extracting a target entity in a target text, obtaining N incidence relation vectors corresponding to the target entity based on the knowledge graph, replacing the target entity in the target text based on target preset characters to obtain a target replacement text, determining the target replacement text vector corresponding to the target replacement text, determining the target incidence relation vector with the highest similarity between the N incidence relation vectors and the target replacement text vector, determining feedback information corresponding to the target text from the knowledge graph based on the target entity vector corresponding to the target entity and the target incidence relation vector, and outputting the feedback information. By implementing the method, the knowledge graph can be constructed based on the vectors, corresponding feedback information is inquired in the knowledge graph based on the correlation among the vectors, and the accuracy rate of information feedback based on the knowledge graph is improved.

Description

Information feedback method, device, terminal and storage medium based on knowledge graph

Technical Field

The invention relates to the technical field of computers, in particular to a knowledge graph-based information feedback method, a knowledge graph-based information feedback device, a knowledge graph-based information feedback terminal and a storage medium.

Background

Knowledge-graph-based information feedback means that, given a natural language question, the user's intention is obtained by analyzing the question, and the knowledge in the graph is then queried and reasoned over to obtain accurate feedback information. Because the knowledge is structured and the answers are accurate, this approach has received increasing research attention.

Currently, in industries that use knowledge-graph-based information feedback, the number of products is large and iteration is fast: new products and associated new attributes go online almost every week. When a user raises a question about a new product or a new attribute, the corresponding feedback cannot be obtained from the knowledge graph because the information in the knowledge graph has not been updated in time, so the knowledge-graph-based information feedback fails. That is, the accuracy of current knowledge-graph-based information feedback is low.

Disclosure of Invention

The embodiment of the invention provides a knowledge graph-based information feedback method, a knowledge graph-based information feedback device, a knowledge graph-based information feedback terminal and a storage medium, wherein the knowledge graph can be constructed based on vectors, corresponding feedback information is inquired in the knowledge graph based on the correlation among the vectors, and the accuracy rate of knowledge graph-based information feedback is improved.

In one aspect, an embodiment of the present invention provides an information feedback method based on a knowledge graph, where the method includes:

acquiring a target text and extracting a target entity in the target text;

acquiring N incidence relations corresponding to the target entity from a knowledge graph, and vectorizing the N incidence relations to obtain N incidence relation vectors, wherein N is a positive integer;

replacing a target entity in the target text based on a target preset character to obtain a target replacement text, and determining a target replacement text vector corresponding to the target replacement text;

calculating the similarity between the target replacement text vector and each incidence relation vector in the N incidence relation vectors, and determining the target incidence relation vector having the highest similarity to the target replacement text vector;

determining feedback information corresponding to the target text from the knowledge graph based on a target entity vector corresponding to the target entity and the target incidence relation vector;

and outputting feedback information corresponding to the target text.

In one aspect, an embodiment of the present invention provides an information feedback apparatus based on a knowledge graph, where the apparatus includes:

the acquisition module is used for acquiring a target text and extracting a target entity in the target text;

the acquisition module is further configured to acquire N association relationships corresponding to the target entity from a knowledge graph;

the processing module is used for vectorizing the N incidence relations to obtain N incidence relation vectors, wherein N is a positive integer;

the replacing module is used for replacing a target entity in the target text based on a target preset character to obtain a target replacing text;

the determining module is used for determining a target replacing text vector corresponding to the target replacing text;

the calculation module is used for calculating the similarity between the target replacement text vector and each incidence relation vector in the N incidence relation vectors;

the determining module is further configured to determine a target association relationship vector with the highest similarity to the target replacement text vector;

the determining module is further configured to determine, based on a target entity vector corresponding to the target entity and the target association relation vector, feedback information corresponding to the target text from the knowledge graph;

and the output module is used for outputting the feedback information corresponding to the target text.

In one aspect, an embodiment of the present invention provides a terminal, including a processor and a memory, where the memory is configured to store a computer program, and the computer program includes program instructions, and is characterized in that the processor is configured to call the program instructions to execute the knowledge-graph-based information feedback method.

In one aspect, an embodiment of the present invention provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program includes program instructions, which, when executed by a processor, cause the processor to execute the method for knowledge-graph-based information feedback.

In the embodiment of the invention, a terminal acquires a target text, extracts a target entity in the target text, acquires N incidence relations corresponding to the target entity from a knowledge graph, vectorizes the N incidence relations to obtain N incidence relation vectors, replaces the target entity in the target text based on target preset characters to obtain a target replacement text, and determines a target replacement text vector corresponding to the target replacement text; calculating the similarity between the target replacement text vector and each incidence relation vector in the N incidence relation vectors, and determining the target incidence relation vector with the highest similarity with the target replacement text vector; determining feedback information corresponding to the target text from the knowledge graph based on the target entity vector corresponding to the target entity and the target incidence relation vector; and outputting feedback information corresponding to the target text. By implementing the method, the knowledge graph can be constructed based on the vectors, and corresponding feedback information is inquired in the knowledge graph based on the correlation between the vectors, so that no matter what text is acquired, the most relevant feedback information of the text can be acquired through the knowledge graph, and the accuracy rate of information feedback based on the knowledge graph is improved.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a flow chart of a knowledge-graph-based information feedback method according to an embodiment of the present invention;

FIG. 2 is a diagram illustrating a structure of a dictionary tree according to an embodiment of the present invention;

FIG. 3 is a flow chart of another knowledge-graph based information feedback method according to an embodiment of the present invention;

FIG. 4 is a schematic flow chart of determining vector representations of triples in a knowledge-graph according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a knowledge-graph-based information feedback apparatus according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The information feedback method based on the knowledge graph provided by the embodiment of the invention is realized on a terminal, and the terminal comprises electronic equipment such as a smart phone, a tablet computer, a digital audio and video player, an electronic reader, a handheld game machine or vehicle-mounted electronic equipment and the like.

Fig. 1 is a schematic flow chart of an information feedback method based on a knowledge graph in an embodiment of the present invention, and as shown in fig. 1, the flow chart of the information feedback method based on a knowledge graph in the embodiment may include:

S101, obtaining a target text and extracting a target entity in the target text.

In the embodiment of the present invention, the target text may be composed of a plurality of characters, and may specifically be a question input by a user, text converted from speech, or the like. After the target text is received, the target entity in the target text may be extracted. The target entity may be extracted by a method based on rules and dictionaries (for example, manually written rules that extract features such as keywords, indicator words, and position words as entities), by a traditional statistics-based machine learning method, by a deep-learning-based method, or the like.

The method based on the rules and the dictionary specifically includes defining a plurality of word groups as entity word groups in advance, storing the entity word groups in a database, performing word segmentation processing on a target text after receiving the target text to obtain at least one word group, and taking the target word group as a target entity in the target text when detecting that the target word group matched with the entity word groups in the database exists in the at least one word group. Or, the user may input the target text according to a preset template, and the target entity may be directly extracted from a specified position in the target text. Alternatively, the initial recognition model may be trained based on a large number of samples, resulting in a recognition model for recognizing the target entity in the text. Specifically, N sample sets are obtained, each sample set comprises a sample text and a sample entity, and the sample sets are input into an initial recognition model for iterative training so as to update parameters in the initial recognition model; and when detecting that the initial recognition model after the parameter updating meets the preset condition, determining the initial recognition model after the parameter updating as a recognition model. The preset condition may be that the recognition accuracy of the initial recognition model to each sample text is higher than a preset accuracy, and when the predicted entity output by the initial recognition model matches the sample entity, it is determined that the recognition is accurate.
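
The dictionary-based variant described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the phrase set `ENTITY_PHRASES` and the function `extract_target_entity` are hypothetical names, the dictionary is held in memory rather than in a database, and whitespace splitting stands in for a real word segmenter.

```python
from typing import Optional, Set

# Hypothetical entity dictionary; the text above stores these phrases in a database.
ENTITY_PHRASES: Set[str] = {"computer", "display", "processor"}


def extract_target_entity(text: str, entity_phrases: Set[str]) -> Optional[str]:
    """Return the first word of `text` that matches a predefined entity phrase."""
    # Whitespace tokenization is an assumption standing in for word segmentation.
    tokens = [token.strip("?.,!").lower() for token in text.split()]
    for token in tokens:
        if token in entity_phrases:
            return token
    # No word group matched the dictionary.
    return None
```

When no phrase matches, the sketch returns `None`; a caller could then fall back to the template-based or model-based extraction described above.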

S102, acquiring N incidence relations corresponding to the target entity from the knowledge graph, and vectorizing the N incidence relations to obtain N incidence relation vectors.

In the embodiment of the invention, the knowledge graph comprises a plurality of entity vectors and a plurality of incidence relation vectors. The knowledge graph may be constructed as follows: a sample text set is obtained, wherein the sample text set comprises at least one sample text; a triple corresponding to each sample text in the sample text set is extracted to obtain a triple set, wherein each triple in the triple set comprises a head entity, a tail entity and an association relationship, and the association relationship is the relationship between the head entity and the tail entity; and the vector representation of each triple in the triple set in the knowledge graph is determined to obtain a triple vector set, wherein each triple vector in the triple vector set comprises a head entity vector, a tail entity vector and an incidence relation vector, and the operation vector obtained by performing a preset operation on the head entity vector and the incidence relation vector of a triple corresponds to the tail entity vector of that triple. Specifically, each triple vector may be stored at a designated position in a database to obtain the knowledge graph, such that performing the preset operation on any head entity vector and its relation vector in the knowledge graph yields the corresponding tail entity vector.

It should be noted that the terminal determines the vector representation in the knowledge graph in the same manner for every triple in the triple set; the manner is therefore described here for an arbitrary reference triple in the triple set. Specifically, the terminal may determine the vector representation of the reference triple as follows: determine an identifier corresponding to each reference element in the reference triple, where a reference element is the reference head entity, the reference tail entity, or the reference association relationship; splice each reference element with its identifier to obtain three reference spliced texts; invoke a language model to encode each reference spliced text to obtain a reference encoding feature corresponding to each reference spliced text, where the language model is a model that encodes text based on its semantics; invoke a vectorization model to vectorize each reference encoding feature to obtain a reference head entity vector corresponding to the reference head entity, a reference tail entity vector corresponding to the reference tail entity, and a reference association relationship vector corresponding to the reference association relationship, where a reference operation vector obtained by performing the preset operation on the reference head entity vector and the reference incidence relation vector corresponds to the reference tail entity vector; and construct the vector representation of the reference triple in the knowledge graph from the reference head entity vector, the reference tail entity vector and the reference incidence relation vector.
Each reference element and its identifier may be spliced by using preset splicers, where the identifier is the category to which the element belongs, such as head entity, association relationship, or tail entity. For example, when the element is the head entity "computer" and the preset splicers are [CLS] and [SEP], the spliced text obtained may be [CLS] <computer> [SEP] <head entity> [SEP]. In one embodiment, the language model may be a BERT model and the vectorization model may be a TransE model.
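
The splicing rule and the correspondence between head, relation and tail vectors can be sketched as follows. The function names `splice` and `translation_residual` are hypothetical, the vectors are toy values rather than outputs of a language model, and the preset operation is assumed here to be addition, as in a TransE-style embedding.

```python
import math
from typing import Sequence


def splice(element: str, identifier: str) -> str:
    """Join an element and its category identifier with [CLS]/[SEP] splicers."""
    return f"[CLS] <{element}> [SEP] <{identifier}> [SEP]"


def translation_residual(
    head: Sequence[float], relation: Sequence[float], tail: Sequence[float]
) -> float:
    """Euclidean norm of head + relation - tail.

    Assumes the preset operation is addition; a value near zero means the
    operation vector corresponds to the tail entity vector.
    """
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))
```

In a full pipeline the spliced text would be encoded by the language model before vectorization; that step is omitted here because it depends on the chosen model.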

In a specific implementation, a specific way for the terminal to obtain the N association relations corresponding to the target entity from the knowledge graph may be to obtain a dictionary tree constructed based on a triple set corresponding to the knowledge graph, find a target entity element matched with the target entity from the dictionary tree, and determine a target node where the target entity element is located; determining N associated nodes and associated entity elements corresponding to the associated nodes, wherein the distance between the associated nodes and a target node in the dictionary tree is smaller than a preset distance; and determining the incidence relation between the target entity element and each incidence entity element as N incidence relations corresponding to the target entity. The dictionary tree comprises at least one node, each node corresponds to each entity element of the triple set, the entity elements comprise head entity elements or tail entity elements, and connection is established among the nodes based on each incidence relation of the triple set.

In one embodiment, the knowledge graph is formed by a triple vector set, and the triples corresponding to the vectors in the triple vector set form the triple set corresponding to the knowledge graph. The dictionary tree may be constructed from the triple set by extracting all entity elements in the triple set, including head entity elements and tail entity elements, and connecting the head entity element and the tail entity element that belong to the same triple. Because the head entity element or the tail entity element of multiple triples may be the same, one head entity element or one tail entity element may appear in multiple triples, and the head entity element of one triple may also be the tail entity element of another triple.

In an embodiment, the distance between the nodes in the dictionary tree is the number of edges included in the shortest path between the nodes, and the specific manner of determining the N associated nodes in the dictionary tree, the distance between which and the target node is less than the preset distance, and the associated entity element corresponding to each associated node may be to determine the number of edges included in the shortest path between each node and the target node in the dictionary tree, and use the number of edges as the distance between each node and the target node. The terminal determines N associated nodes, the distance between the associated nodes and the target node is smaller than the preset distance, the associated entity elements included in each associated node are obtained, the associated relation between the target entity element and each associated entity element is determined from the triple, and further, the terminal determines the associated relation between the target entity element and each associated entity element as N associated relations corresponding to the target entity.

For example, the triple set corresponding to the knowledge graph includes 3 triples, where triple 1 includes the head entity element "computer" and the tail entity element "processor", triple 2 includes the head entity element "computer" and the tail entity element "display", and triple 3 includes the head entity element "display" and the tail entity element "liquid crystal screen". The constructed dictionary tree is shown in FIG. 2, in which "computer" is the root node and is connected with "processor" and "display" respectively, and "display" is connected with "liquid crystal screen". When the target entity is "computer", the target entity element in the dictionary tree is determined to be "computer". When the preset distance is 2, "processor" and "display", whose distance to the target node is 1, are determined to be the associated entity elements corresponding to the target entity. The terminal finds the association relation "component" between "computer" and "processor" from triple 1, finds the association relation "display device" between "computer" and "display" from triple 2, and determines "component" and "display device" to be the association relations corresponding to the target entity.
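
The lookup in this example can be sketched with a breadth-first search over the connected entity elements. The names `TRIPLES`, `shortest_distances` and `association_relations` are hypothetical, and the relation of triple 3 is not given in the text, so "component" is assumed for it purely for illustration.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, association relation, tail entity)

TRIPLES: List[Triple] = [
    ("computer", "component", "processor"),
    ("computer", "display device", "display"),
    ("display", "component", "liquid crystal screen"),  # relation assumed
]


def shortest_distances(adjacency: Dict[str, Set[str]], start: str) -> Dict[str, int]:
    """Number of edges on the shortest path from `start` to each reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def association_relations(
    triples: List[Triple], target: str, preset_distance: int
) -> List[str]:
    """Relations linking `target` to nodes closer than `preset_distance`."""
    adjacency: Dict[str, Set[str]] = {}
    for head, _, tail in triples:
        adjacency.setdefault(head, set()).add(tail)
        adjacency.setdefault(tail, set()).add(head)
    distances = shortest_distances(adjacency, target)
    associated = {n for n, d in distances.items() if 0 < d < preset_distance}
    # Read the relation from every triple that directly joins the target
    # and an associated entity element.
    return [
        relation
        for head, relation, tail in triples
        if (head == target and tail in associated)
        or (tail == target and head in associated)
    ]
```

With the target "computer" and a preset distance of 2, the sketch returns the relations "component" and "display device", matching the example above. Nodes that are within the preset distance but share no triple with the target contribute no relation in this sketch.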

Further, after the terminal obtains the N association relationships, vectorization processing may be performed on the N association relationships to obtain N association relationship vectors. Specifically, each association relationship may be spliced with its corresponding identifier to obtain N relationship spliced texts, the language model is called to encode each relationship spliced text to obtain a relationship encoding feature corresponding to each relationship spliced text, and the vectorization model is called to vectorize each relationship encoding feature to obtain the N association relationship vectors.

S103, replacing the target entity in the target text based on the target preset characters to obtain a target replacement text, and determining a target replacement text vector corresponding to the target replacement text.

In the embodiment of the invention, the terminal can replace the target entity in the target text based on a target preset character to obtain the target replacement text. Different entities can correspond to different preset characters, and the preset characters are used for replacing entities in a text. The target preset character may be a null character, in which case the terminal replaces the target entity with the null character to obtain the target replacement text. For example, if the target text is "what are the components of the computer" and the target entity is "the computer", replacing the entity in the target text with the null character yields the target replacement text "what are the components of". Replacing the target entity with the null character leaves only the information that represents the incidence relation in the target text, which is subsequently matched against the relations in the graph. Optionally, stop words may also be removed from the target replacement text, so that the resulting target replacement text is "what are the components". In one embodiment, the target preset character may be a standard-form character corresponding to the target entity. If the target entity is a variant expression whose corresponding standard-form character is "computer", the standard-form character replaces the entity in the target text, yielding the target replacement text "what are the components of the computer".

In an implementation manner, a specific manner of obtaining a target replacement text by replacing a target entity in a target text by a terminal based on a target preset character may be to obtain a target character set where the target entity is located from a database, and determine a target preset character corresponding to the target character set; and replacing the target entity in the target text with the target preset character to obtain a target replacement text. The database stores at least one character set, each character set comprises at least one entity, and each character set corresponds to a preset character; in an embodiment, the database may be specifically constructed by clustering a plurality of characters based on a preset clustering rule to obtain at least one character set, determining a calling frequency of each character in the character set, and determining a character with the highest calling frequency as a preset character corresponding to the character set, where the character may be a phrase, the preset clustering rule may be a semantic cluster, a character number cluster, or the like, and may be specifically preset by a developer, and the calling frequency is a frequency with which the character is used by a user in actual application. Alternatively, the preset character may be preset by the developer, for example, as a null character. A plurality of character sets are stored in the database, each character set comprises at least one character with the same semantic meaning and a standard character corresponding to the character set, and the standard character is a standard form of each character in the set. Through the method, the target entity can be converted into the expression in the standard form, and the vectorization error caused by character difference is reduced.
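
The replacement step can be sketched as follows, assuming each character set is stored as a mapping from phrase to calling frequency. The data in `CHARACTER_SETS` and the function names are hypothetical; when an entity belongs to no set, the sketch falls back to the null character, which is one of the options described above.

```python
from typing import Dict, List

# Hypothetical character set: phrase -> calling frequency in actual use.
CHARACTER_SETS: List[Dict[str, int]] = [
    {"computer": 120, "PC": 45, "desktop machine": 8},
]


def preset_character_for(entity: str, character_sets: List[Dict[str, int]]) -> str:
    """Return the most frequently called phrase of the set containing `entity`."""
    for character_set in character_sets:
        if entity in character_set:
            return max(character_set, key=character_set.get)
    # Assumed fallback: the null character (empty string).
    return ""


def replace_entity(text: str, entity: str, preset_character: str) -> str:
    """Replace `entity` in `text` and collapse any leftover whitespace."""
    replaced = text.replace(entity, preset_character)
    return " ".join(replaced.split())
```

Plain substring replacement is used for brevity; a production system would more likely replace at the token positions found during entity extraction.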

Furthermore, the terminal can call the trained deep learning model to perform vectorization processing on the target replacement text to obtain a target replacement text vector, wherein the deep learning model can be specifically used for converting the input text into the vector, when the input text is similar, the converted vector is also similar, and the deep learning model can be specifically trained on the basis of a large number of sample texts, so that the deep learning model has a function of converting the text into the corresponding vector.

S104, calculating the similarity between the target replacement text vector and each incidence relation vector in the N incidence relation vectors, and determining the target incidence relation vector with the highest similarity with the target replacement text vector.

In the embodiment of the invention, after the terminal obtains the target replacement text vector and the N incidence relation vectors, the similarity between the target replacement text vector and each incidence relation vector in the N incidence relation vectors can be calculated.

In a specific implementation, for any reference incidence relation vector in the N incidence relation vectors, the terminal may determine the similarity between the target replacement text vector and the reference incidence relation vector as follows: determine the distance between the target replacement text vector and the reference incidence relation vector, and calculate a first similarity based on the distance; determine the number of identical characters between the target replacement text vector and the reference incidence relation vector, and calculate a second similarity based on the ratio of the number of identical characters to the total number of characters in the target replacement text vector; calculate a third similarity based on a first variation range corresponding to the target replacement text vector and a second variation range corresponding to the reference incidence relation vector; and call a weight model to process the first similarity, the second similarity and the third similarity to obtain the similarity between the target replacement text vector and the reference incidence relation vector. The first variation range is the variation range of the values in the target replacement text vector, and may specifically be the interval bounded by the minimum and maximum values in the target replacement text vector; the second variation range is the interval bounded by the minimum and maximum values in the reference incidence relation vector.
The terminal may specifically determine the first similarity corresponding to the distance based on a first preset correspondence, determine the second similarity corresponding to the ratio based on a second preset correspondence, calculate the range difference between the first variation range and the second variation range, and determine the third similarity corresponding to the range difference based on a third preset correspondence.

The weight model may be a model obtained through machine learning, and is used to assign an appropriate weight to each input value and then compute the weighted sum of the input values. If the weight model is an XGBoost model, after the first similarity, the second similarity and the third similarity are input to the weight model, the weight model may obtain a first weight corresponding to the first similarity, a second weight corresponding to the second similarity and a third weight corresponding to the third similarity, weight the first similarity by the first weight to obtain a first weighted similarity, weight the second similarity by the second weight to obtain a second weighted similarity, weight the third similarity by the third weight to obtain a third weighted similarity, and sum the first weighted similarity, the second weighted similarity and the third weighted similarity to obtain the similarity between the target replacement text vector and the reference association relation vector. The weight model can assign different weights to the similarities in different application scenarios. For example, the first, second and third weights for a sales scenario may differ from those for a shopping scenario.
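
The three-part similarity and its weighted sum can be sketched as follows. Several choices here are assumptions rather than details given in the text: the preset correspondences are taken to be 1 / (1 + x), the character overlap is computed on the underlying texts instead of on the vectors, the variation range is taken as the maximum minus the minimum component, and fixed weights stand in for the learned weight model.

```python
import math
from typing import Sequence, Tuple


def distance_similarity(text_vec: Sequence[float], relation_vec: Sequence[float]) -> float:
    """First similarity: assumed mapping of Euclidean distance into (0, 1]."""
    return 1.0 / (1.0 + math.dist(text_vec, relation_vec))


def overlap_similarity(text: str, relation_text: str) -> float:
    """Second similarity: share of characters of `text` found in `relation_text`."""
    if not text:
        return 0.0
    relation_chars = set(relation_text)
    return sum(ch in relation_chars for ch in text) / len(text)


def range_similarity(text_vec: Sequence[float], relation_vec: Sequence[float]) -> float:
    """Third similarity: assumed mapping of the difference in value ranges."""
    text_range = max(text_vec) - min(text_vec)
    relation_range = max(relation_vec) - min(relation_vec)
    return 1.0 / (1.0 + abs(text_range - relation_range))


def combined_similarity(
    text_vec: Sequence[float],
    relation_vec: Sequence[float],
    text: str,
    relation_text: str,
    weights: Tuple[float, float, float] = (0.5, 0.3, 0.2),  # illustrative only
) -> float:
    """Weighted sum of the three similarities."""
    scores = (
        distance_similarity(text_vec, relation_vec),
        overlap_similarity(text, relation_text),
        range_similarity(text_vec, relation_vec),
    )
    return sum(score * weight for score, weight in zip(scores, weights))
```

The target incidence relation vector is then the candidate for which `combined_similarity` is largest; per-scenario weights could be supplied through the `weights` argument.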

Further, after the terminal calculates the similarity between the target replacement text vector and each incidence relation vector in the N incidence relation vectors, the target incidence relation vector with the highest similarity to the target replacement text vector can be determined.

S105, determining feedback information corresponding to the target text from the knowledge graph based on the target entity vector corresponding to the target entity and the target incidence relation vector.

In the embodiment of the invention, after the terminal determines the target incidence relation vector, the feedback information corresponding to the target text is determined from the knowledge graph based on the target entity vector corresponding to the target entity and the target incidence relation vector.

In a specific implementation, the terminal performs a preset operation on the target entity vector and the target incidence relation vector to obtain a target operation vector, acquires a target tail entity vector matched with the target operation vector from the knowledge graph, and determines the information corresponding to the target tail entity vector as the feedback information corresponding to the target text. The preset operation may be an addition operation, a subtraction operation, a dot product operation, or the like, and is determined by how the knowledge graph was constructed: if adding each head entity vector and its relation vector in the knowledge graph yields the corresponding tail entity vector, the preset operation is the addition operation. The information corresponding to the target tail entity vector may specifically be the target tail entity, a text that includes the target tail entity, or the like.
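
The tail lookup can be sketched as follows, assuming the preset operation is addition and that "matched" means nearest in Euclidean distance. The function name `find_feedback` and the toy vectors in the usage note are hypothetical.

```python
import math
from typing import Dict, Sequence


def find_feedback(
    head_vec: Sequence[float],
    relation_vec: Sequence[float],
    tail_vectors: Dict[str, Sequence[float]],
) -> str:
    """Return the tail entity whose vector is nearest to head + relation."""
    # Target operation vector under the assumed addition operation.
    target = [h + r for h, r in zip(head_vec, relation_vec)]
    return min(tail_vectors, key=lambda name: math.dist(tail_vectors[name], target))
```

For instance, with a head vector `[1.0, 0.0]`, a relation vector `[0.0, 1.0]` and candidate tails `{"processor": [1.0, 1.0], "display": [3.0, 0.0]}`, the operation vector is `[1.0, 1.0]` and the sketch selects "processor".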

S106, outputting feedback information corresponding to the target text.

In the embodiment of the invention, after the terminal determines the feedback information corresponding to the target text, the feedback information can be output. Further, the terminal may store the target text and the feedback information in correspondence, so that when the same target text is received again, the stored feedback information is output directly. The storage location may be a database or a blockchain. When the target text and the feedback information are to be stored in a blockchain, the terminal broadcasts the target text and the feedback information so that the nodes in the blockchain perform consensus verification on them; if the nodes pass the verification, the target text and the feedback information are packed into a block and the block is linked into the blockchain, thereby storing the target text and the feedback information in correspondence.
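The store-and-reuse behaviour can be illustrated with an in-memory mapping. This is only a sketch: `FeedbackCache` is a hypothetical name, a Python dict stands in for the database, and the blockchain variant is not modelled.

```python
from typing import Callable, Dict

class FeedbackCache:
    """Stores feedback per target text and reuses it on repeated input."""

    def __init__(self, compute: Callable[[str], str]) -> None:
        self._compute = compute            # full knowledge-graph lookup
        self._store: Dict[str, str] = {}   # stand-in for a database table

    def get(self, target_text: str) -> str:
        # Compute feedback only the first time a text is seen.
        if target_text not in self._store:
            self._store[target_text] = self._compute(target_text)
        return self._store[target_text]
```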

In one implementation, before outputting the feedback information corresponding to the target text, the terminal checks whether the similarity between the target entity vector corresponding to the target entity and the target association relation vector is greater than a preset similarity. If so, the terminal outputs the feedback information corresponding to the target text; if the similarity is smaller than the preset similarity, the terminal requests manual service and outputs the corresponding manual feedback. Further, the terminal outputs prompt information indicating that the association relation in the target text may be a new relation and, after the new relation is confirmed manually, updates the knowledge graph based on it, so that new association relations are discovered and the knowledge graph is kept up to date. In this way, a user who inputs a question receives accurate feedback in time, which improves the accuracy of information feedback.
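The threshold check can be sketched as a simple branch. The names `choose_feedback` and `manual_service` are hypothetical, and treating a similarity exactly equal to the preset similarity as the manual case is an assumption, since the text only covers "greater than" and "smaller than".

```python
from typing import Callable

def choose_feedback(similarity: float,
                    graph_feedback: str,
                    preset_similarity: float,
                    manual_service: Callable[[], str]) -> str:
    """Return graph feedback above the threshold, else fall back to manual."""
    if similarity > preset_similarity:
        return graph_feedback
    # Assumption: equality is handled like the "smaller than" case.
    return manual_service()
```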

In the embodiment of the invention, a terminal acquires a target text, extracts a target entity in the target text, acquires N incidence relations corresponding to the target entity from a knowledge graph, vectorizes the N incidence relations to obtain N incidence relation vectors, replaces the target entity in the target text based on target preset characters to obtain a target replacement text, and determines a target replacement text vector corresponding to the target replacement text; calculating the similarity between the target replacement text vector and each incidence relation vector in the N incidence relation vectors, and determining the target incidence relation vector with the highest similarity with the target replacement text vector; determining feedback information corresponding to the target text from the knowledge graph based on the target entity vector corresponding to the target entity and the target incidence relation vector; and outputting feedback information corresponding to the target text. By implementing the method, the knowledge graph can be constructed based on the vectors, corresponding feedback information is inquired in the knowledge graph based on the correlation among the vectors, and the accuracy rate of information feedback based on the knowledge graph is improved.

Fig. 3 is a schematic flow chart of another knowledge-graph-based information feedback method in the embodiment of the present invention, and as shown in fig. 3, the flow chart of the knowledge-graph-based information feedback method in the embodiment may include:

S301, obtaining a sample text set.

In the embodiment of the present invention, the sample text set includes at least one sample text. A sample text may be a text obtained historically, and each sample text may include a head entity, a tail entity, and an association relation between the head entity and the tail entity. For example, if the sample text is "a component of a computer includes a mouse", the head entity is "computer", the tail entity is "mouse", and the association relation is "component".

S302, extracting a triple corresponding to each sample text in the sample text set.

In the embodiment of the invention, after the terminal acquires the sample text set, the terminal can extract the triple corresponding to each sample text in the sample text set to obtain a triple set, wherein each triple in the triple set comprises a head entity, a tail entity and an association relation, and the association relation is the relation between the head entity and the tail entity. The triples may be extracted with a rule- and dictionary-based method, for example by using manually written rules to extract features such as nouns, indicator words and position words as entities and the characters between the entities as relations; alternatively, if all sample texts share the same format, the corresponding triples may be extracted from specified positions in the sample texts.
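For sample texts that share one fixed format, extraction from specified positions can be sketched with a regular expression. The pattern below is a hypothetical manually written rule that only covers sentences shaped like the example "a component of a computer includes a mouse"; real sample texts would need their own rules.

```python
import re
from typing import Optional, Tuple

# Hypothetical rule: "<a|an> RELATION of <a|an> HEAD includes <a|an> TAIL".
TRIPLE_PATTERN = re.compile(
    r"^an? (?P<relation>\w+) of an? (?P<head>\w+) includes an? (?P<tail>\w+)$"
)

def extract_triple(sample_text: str) -> Optional[Tuple[str, str, str]]:
    """Return (head, relation, tail), or None if the text does not fit the rule."""
    match = TRIPLE_PATTERN.match(sample_text.strip())
    if match is None:
        return None
    return match.group("head"), match.group("relation"), match.group("tail")

triple = extract_triple("a component of a computer includes a mouse")
```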

S303, determining the vector representation of each triple in the triple set in the knowledge graph to obtain a triple vector set.

In specific implementation, each triplet vector in the triplet vector set includes a head entity vector, a tail entity vector, and an association relation vector, and an operation vector obtained by performing a preset operation based on the head entity vector and the association relation vector in the triplet has a corresponding relationship with the tail entity vector. If the preset operation is an addition operation, the addition operation is performed based on the head entity vector and the incidence relation vector in the triple, and then the tail entity vector can be obtained.

Specifically, the method for the terminal to determine the vector representation of any reference triple in the triple set in the knowledge graph may be that, an identifier corresponding to each reference element in the reference triple is determined, each reference element and the identifier corresponding to each reference element are spliced to obtain three reference spliced texts, a language model is invoked to encode each reference spliced text to obtain a reference encoding feature corresponding to each reference spliced text, and a vectorization model is invoked to vectorize each reference encoding feature to obtain a reference head entity vector corresponding to a reference head entity in the reference triple, a reference tail entity vector corresponding to a reference tail entity, and a reference association relationship vector corresponding to a reference association relationship, where the reference element includes the reference head entity, the reference tail entity, or the reference association relationship; the language model is a model for coding based on the semantic meaning of the text; and a corresponding relation exists between a reference operation vector obtained by performing preset operation on the basis of the reference head entity vector and the reference incidence relation vector and the reference tail entity vector.

For example, the language model is a BERT model, the vectorization model is a TransE model, and the preset operation is an addition operation, so that the reference tail entity vector can be obtained by adding the reference head entity vector and the reference association relation vector. The process of determining the vector representation of a reference triple in the knowledge graph is shown in fig. 4. In 401, after determining the identifier corresponding to each reference element in the reference triple, the terminal splices the reference head entity with the head entity identifier to obtain reference spliced text 1, splices the reference association relation with the relation identifier to obtain reference spliced text 2, and splices the reference tail entity with the tail entity identifier to obtain reference spliced text 3. The specific form of a reference spliced text is shown in 402: [CLS] <entity or relation> [SEP] <entity identifier or relation identifier> [SEP]. In 403, the first four layers of the BERT model are called to encode each reference spliced text to obtain the reference encoding feature corresponding to each reference spliced text, namely the reference head entity feature, the reference association relation feature and the reference tail entity feature. In 404, the TransE model is called to process each reference encoding feature to obtain the reference head entity vector, the reference association relation vector and the reference tail entity vector.
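The spliced-text form in 402 can be sketched as plain string construction. The identifier strings in `IDENTIFIERS` are hypothetical placeholders, since the text does not give their values, and in practice a BERT tokenizer would typically insert the [CLS] and [SEP] tokens itself rather than receive them as literal text.

```python
from typing import Dict

# Hypothetical identifier text for each element role.
IDENTIFIERS: Dict[str, str] = {
    "head": "head entity",
    "relation": "relation",
    "tail": "tail entity",
}

def build_spliced_texts(head: str, relation: str, tail: str) -> Dict[str, str]:
    """Build the three reference spliced texts for one triple."""
    elements = {"head": head, "relation": relation, "tail": tail}
    return {
        role: f"[CLS] {text} [SEP] {IDENTIFIERS[role]} [SEP]"
        for role, text in elements.items()
    }

spliced = build_spliced_texts("computer", "component", "mouse")
```

Each of the three strings would then be encoded by the language model and passed on to the vectorization model.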

S304, constructing a knowledge graph based on the triple vector set.

In the embodiment of the invention, the terminal stores each triple vector at a specified position in the database to obtain the knowledge graph, and the corresponding tail entity vector can be obtained by performing the preset operation on a head entity vector and a relation vector in the knowledge graph.

S305, obtaining the target text and extracting the target entity in the target text.

In the embodiment of the present invention, the target text may be composed of a plurality of characters, and may specifically be a question input by a user, text converted from speech, and the like. The target entity may be extracted by a rule- and dictionary-based method (for example, manually written rules that extract features such as keywords, indicator words and position words as entities), by a traditional statistics-based machine learning method, by a deep learning method, and the like.

S306, acquiring N incidence relations corresponding to the target entity from the knowledge graph, and vectorizing the N incidence relations to obtain N incidence relation vectors.

In the embodiment of the invention, a specific way for the terminal to obtain the N association relations corresponding to the target entity from the knowledge graph may be to obtain a dictionary tree constructed based on a triple set corresponding to the knowledge graph, find a target entity element matched with the target entity from the dictionary tree, and determine a target node where the target entity element is located; determining N associated nodes and associated entity elements corresponding to the associated nodes, wherein the distance between the associated nodes and a target node in the dictionary tree is smaller than a preset distance; and determining the incidence relation between the target entity element and each incidence entity element as N incidence relations corresponding to the target entity.
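The neighbourhood search over the structure built from the triple set can be sketched as a breadth-first traversal. This is a sketch under several assumptions: the structure is modelled as an undirected adjacency list, distance is counted in edges, and the relation recorded for an associated node is the label of the edge by which it is first reached. The names `build_graph` and `associated_relations` and the sample triples are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)
Graph = Dict[str, List[Tuple[str, str]]]

def build_graph(triples: Iterable[Triple]) -> Graph:
    """Connect head and tail nodes by their association relation."""
    graph: Graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((tail, relation))
        graph[tail].append((head, relation))
    return graph

def associated_relations(graph: Graph, target: str,
                         preset_distance: int) -> List[Tuple[str, str]]:
    """Return (entity, relation) for nodes closer than preset_distance."""
    seen = {target}
    queue = deque([(target, 0)])
    found: List[Tuple[str, str]] = []
    while queue:
        node, dist = queue.popleft()
        if dist + 1 >= preset_distance:
            continue  # neighbours of this node would be too far away
        for neighbour, relation in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                found.append((neighbour, relation))
                queue.append((neighbour, dist + 1))
    return found

sample_graph = build_graph([
    ("computer", "component", "mouse"),
    ("computer", "purpose", "office work"),
    ("mouse", "material", "plastic"),
])
near = associated_relations(sample_graph, "computer", 2)
```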

Further, after the terminal obtains N association relationships, vectorization processing may be performed on the N association relationships to obtain N association relationship vectors.

S307, replacing the target entity in the target text based on the target preset characters to obtain a target replacement text, and determining a target replacement text vector corresponding to the target replacement text.

In the embodiment of the invention, the terminal can replace the target entity in the target text based on the target preset character to obtain the target replacement text, wherein different entities can correspond to different preset characters, the preset character is used for replacing the entity in the text, and the target preset character can be a null character.
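Entity replacement can be sketched as a lookup followed by a string substitution. The `CHARACTER_SETS` table and the preset characters `[DEVICE]` and `[PART]` are hypothetical; an entity that belongs to no set is replaced by the empty string here, reflecting the remark that the preset character can be a null character.

```python
from typing import Dict, Set

# Hypothetical mapping: preset character -> set of entities it stands for.
CHARACTER_SETS: Dict[str, Set[str]] = {
    "[DEVICE]": {"computer", "phone"},
    "[PART]": {"mouse", "keyboard"},
}

def replace_entity(text: str, entity: str, default: str = "") -> str:
    """Replace the entity in the text with its preset character."""
    preset = next(
        (char for char, entities in CHARACTER_SETS.items() if entity in entities),
        default,  # assumed fallback: empty (null) preset character
    )
    return text.replace(entity, preset)

replaced = replace_entity("what are the components of a computer", "computer")
```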

S308, calculating the similarity between the target replacement text vector and each incidence relation vector in the N incidence relation vectors, and determining the target incidence relation vector with the highest similarity with the target replacement text vector.

S309, based on the target entity vector and the target incidence relation vector corresponding to the target entity, determining feedback information corresponding to the target text from the knowledge graph.

S310, outputting feedback information corresponding to the target text.

In the embodiment of the invention, after the terminal determines the feedback information corresponding to the target text, the feedback information corresponding to the target text can be output.

In an implementation manner, after the feedback information corresponding to the target text is output, it is detected whether the target association relationship is stored in an intention set corresponding to a target application scene, and if not, it is determined that the target association relationship is a new intention, so as to implement discovery of the new intention, where the target application scene may be an insurance scene, a financial scene, or the like, and is not limited herein. Through the mode, discovery of new intentions can be achieved based on the knowledge graph.

In the embodiment of the invention, the terminal constructs the knowledge graph and feeds back information of the input text based on the constructed knowledge graph. By implementing the method, the knowledge graph can be constructed based on the vectors, corresponding feedback information is inquired in the knowledge graph based on the correlation among the vectors, and the accuracy rate of information feedback based on the knowledge graph is improved.

The knowledge-graph based information feedback device provided by the embodiment of the invention is described in detail below with reference to fig. 5. It should be noted that the device shown in fig. 5 is used for executing the methods of the embodiments shown in fig. 1 and fig. 3; for convenience of description, only the parts related to the embodiment of the present invention are shown, and for specific technical details that are not disclosed, refer to the embodiments shown in fig. 1 and fig. 3.

Referring to fig. 5, which is a schematic structural diagram of a knowledge-graph based information feedback device according to the present invention, the knowledge-graph based information feedback device 50 includes: an obtaining module 501, a processing module 502, a replacing module 503, a determining module 504, a calculating module 505, and an output module 506.

An obtaining module 501, configured to obtain a target text and extract a target entity in the target text;

the obtaining module 501 is further configured to obtain N association relationships corresponding to the target entity from a knowledge graph;

a processing module 502, configured to perform vectorization processing on the N association relations to obtain N association relation vectors, where N is a positive integer;

a replacing module 503, configured to replace a target entity in the target text based on a target preset character to obtain a target replacement text;

a determining module 504, configured to determine a target replacement text vector corresponding to the target replacement text;

a calculating module 505, configured to calculate a similarity between the target replacement text vector and each of the N association relationship vectors;

the determining module 504 is further configured to determine a target association relationship vector with the highest similarity to the target replacement text vector;

the determining module 504 is further configured to determine, based on a target entity vector corresponding to the target entity and the target association relationship vector, feedback information corresponding to the target text from the knowledge graph;

and an output module 506, configured to output feedback information corresponding to the target text.

In one implementation, the processing module 502 is further configured to:

obtaining a sample text set, wherein the sample text set comprises at least one sample text;

extracting a triple corresponding to each sample text in the sample text set to obtain a triple set, wherein each triple in the triple set comprises a head entity, a tail entity and an association relationship, and the association relationship is the relationship between the head entity and the tail entity;

determining the vector representation of each triplet in the triplet set in a knowledge graph to obtain a triplet vector set, wherein each triplet vector in the triplet vector set comprises a head entity vector, a tail entity vector and an incidence relation vector;

and constructing a knowledge graph based on the triple vector set.

In one implementation, the processing module 502 is further configured to:

determining an identifier corresponding to each reference element in a reference triple, and splicing each reference element and the identifier corresponding to each reference element to obtain three reference spliced texts, wherein the reference elements comprise a reference head entity, a reference tail entity or a reference association relation;

calling a language model to encode each reference spliced text to obtain a reference encoding feature corresponding to each reference spliced text, wherein the language model is a model that encodes based on the semantics of the text;

invoking a vectorization model to perform vectorization processing on each reference encoding feature to obtain a reference head entity vector corresponding to the reference head entity in the reference triple, a reference tail entity vector corresponding to the reference tail entity and a reference association relation vector corresponding to the reference association relation, wherein a corresponding relation exists between the reference tail entity vector and a reference operation vector obtained by performing a preset operation on the reference head entity vector and the reference association relation vector;

and constructing vector representation of the reference triple in the knowledge graph according to the reference head entity vector, the reference tail entity vector and the reference incidence relation vector.

In an implementation manner, the obtaining module 501 is specifically configured to:

acquiring a dictionary tree constructed based on a triple set corresponding to a knowledge graph, wherein the dictionary tree comprises at least one node, each node corresponds to each entity element of the triple set, the entity elements comprise head entity elements or tail entity elements, and each node is connected based on each incidence relation of the triple set;

finding a target entity element matched with the target entity from the dictionary tree, and determining a target node where the target entity element is located;

determining N associated nodes and associated entity elements corresponding to the associated nodes, wherein the distance between the N associated nodes and the target node in the dictionary tree is smaller than a preset distance;

and determining the association relationship between the target entity element and each associated entity element as N association relationships corresponding to the target entity.

In one implementation, the replacing module 503 is specifically configured to:

acquiring a target character set where the target entity is located from a database, wherein at least one character set is stored in the database, each character set comprises at least one entity, and each character set corresponds to a preset character;

determining a target preset character corresponding to the target character set;

and replacing the target entity in the target text with the target preset character to obtain a target replacement text.

In one implementation, the calculating module 505 is specifically configured to:

determining the distance between the target replacement text vector and the reference incidence relation vector, and calculating to obtain a first similarity based on the distance;

determining the number of the same characters between the target replacement text vector and the reference incidence relation vector, and calculating to obtain a second similarity based on the ratio of the number of the same characters to the total number of characters in the target replacement text vector;

calculating to obtain a third similarity based on a first variation range corresponding to the target replacement text vector and a second variation range corresponding to the reference incidence relation vector;

and calling a weight model to process the first similarity, the second similarity and the third similarity to obtain the similarity between the target replacement text vector and the reference incidence relation vector.

In an implementation, the determining module 504 is specifically configured to:

performing preset operation on the target entity vector and the target incidence relation vector to obtain a target operation vector;

acquiring a target tail entity vector matched with the target operation vector from the knowledge graph;

and determining information corresponding to the target tail entity vector as feedback information corresponding to the target text.

In the embodiment of the present invention, the obtaining module 501 obtains a target text, extracts a target entity in the target text, and obtains N association relations corresponding to the target entity from a knowledge graph; the processing module 502 performs vectorization processing on the N association relations to obtain N association relation vectors; the replacing module 503 replaces the target entity in the target text based on a target preset character to obtain a target replacement text; the determining module 504 determines a target replacement text vector corresponding to the target replacement text; the calculating module 505 calculates the similarity between the target replacement text vector and each of the N association relation vectors; the determining module 504 determines the target association relation vector with the highest similarity to the target replacement text vector, and determines feedback information corresponding to the target text from the knowledge graph based on the target entity vector corresponding to the target entity and the target association relation vector; and the output module 506 outputs the feedback information corresponding to the target text. By implementing the method, the knowledge graph can be constructed based on the vectors, corresponding feedback information is inquired in the knowledge graph based on the correlation among the vectors, and the accuracy rate of information feedback based on the knowledge graph is improved.

Fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention. As shown in fig. 6, the terminal includes: at least one processor 601, an input device 603, an output device 604, a memory 605, and at least one communication bus 602. The communication bus 602 is used to enable connection and communication between these components. The input device 603 may be a control panel, a microphone, or the like, and the output device 604 may be a display screen or the like. The memory 605 may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory. The memory 605 may optionally be at least one storage device located remotely from the processor 601. The processor 601 may be combined with the apparatus described in fig. 5, the memory 605 stores a set of program codes, and the processor 601, the input device 603, and the output device 604 call the program codes stored in the memory 605 to perform the following operations:

the processor 601 is configured to obtain a target text and extract a target entity in the target text;

the processor 601 is configured to obtain N association relationships corresponding to the target entity from a knowledge graph, and perform vectorization processing on the N association relationships to obtain N association relationship vectors, where N is a positive integer;

the processor 601 is configured to replace a target entity in the target text based on a target preset character to obtain a target replacement text, and determine a target replacement text vector corresponding to the target replacement text;

a processor 601, configured to calculate similarity between the target replacement text vector and each of the N association relationship vectors, and determine a target association relationship vector with the highest similarity to the target replacement text vector;

a processor 601, configured to determine, based on a target entity vector corresponding to the target entity and the target association relationship vector, feedback information corresponding to the target text from the knowledge graph;

and the processor 601 is configured to output feedback information corresponding to the target text.

In one implementation, the processor 601 is specifically configured to:

and constructing a knowledge graph based on the triple vector set.

In one implementation, the processor 601 is specifically configured to:

In the embodiment of the invention, a processor 601 obtains a target text, extracts a target entity in the target text, obtains N association relations corresponding to the target entity from a knowledge graph, performs vectorization processing on the N association relations to obtain N association relation vectors, replaces the target entity in the target text based on target preset characters to obtain a target replacement text, and determines a target replacement text vector corresponding to the target replacement text; calculating the similarity between the target replacement text vector and each incidence relation vector in the N incidence relation vectors, and determining the target incidence relation vector with the highest similarity with the target replacement text vector; determining feedback information corresponding to the target text from the knowledge graph based on the target entity vector corresponding to the target entity and the target incidence relation vector; and outputting feedback information corresponding to the target text. By implementing the method, the knowledge graph can be constructed based on the vectors, corresponding feedback information is inquired in the knowledge graph based on the correlation among the vectors, and the accuracy rate of information feedback based on the knowledge graph is improved.

The modules in the embodiment of the present invention may be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).

It should be understood that, in the embodiment of the present invention, the processor 601 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor or any conventional processor.

The bus 602 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and the like. For convenience of illustration, the bus is represented by only one thick line in fig. 6, but this does not mean that there is only one bus or only one type of bus.

It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer storage medium and may include the processes of the embodiments of the methods described above when executed. The computer storage medium may be a magnetic disk, an optical disk, a Read-only Memory (ROM), a Random Access Memory (RAM), or the like.

The above disclosure describes only preferred embodiments of the present invention and is not intended to limit the scope of the claims of the invention.

Claims

1. A knowledge graph-based information feedback method is characterized by comprising the following steps:

acquiring a target text and extracting a target entity in the target text;

and outputting feedback information corresponding to the target text.

2. The method according to claim 1, wherein before obtaining the N associations corresponding to the target entity from the knowledge-graph, the method further comprises:

and constructing a knowledge graph based on the triple vector set.

3. The method of claim 2, wherein determining the manner in which any reference triplet in the set of triples is represented by a vector in the knowledge-graph comprises:

4. The method according to claim 2, wherein the obtaining N association relationships corresponding to the target entity from the knowledge-graph includes:

5. The method of claim 1, wherein the replacing the target entity in the target text based on the target preset character to obtain a target replacement text comprises:

6. The method of claim 1, wherein calculating the similarity between the target replacement text vector and any reference incidence relation vector of the N incidence relation vectors comprises:

7. The method according to claim 3, wherein the determining feedback information corresponding to the target text from the knowledge graph based on the target entity vector corresponding to the target entity and the target association relationship vector comprises:

8. An apparatus for knowledge-graph based information feedback, the apparatus comprising:

9. A terminal, comprising a processor and a memory, wherein the memory is configured to store a computer program comprising program instructions, wherein the processor is configured to invoke the program instructions to perform the method of any of claims 1-7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-7.