CN114925681A - Knowledge graph question-answering entity linking method, device, equipment and medium - Google Patents
- Publication number
- CN114925681A (application number CN202210649326.1A)
- Authority
- CN
- China
- Prior art keywords
- entity
- candidate
- entities
- description
- character string
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F40/00—Handling natural language data > G06F40/20—Natural language analysis > G06F40/279—Recognition of textual entities
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F16/00—Information retrieval; Database structures therefor; File system structures therefor > G06F16/30—Information retrieval of unstructured textual data > G06F16/36—Creation of semantic tools, e.g. ontology or thesauri > G06F16/367—Ontology
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F40/00—Handling natural language data > G06F40/20—Natural language analysis > G06F40/205—Parsing > G06F40/216—Parsing using statistical methods
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00—Computing arrangements based on biological models > G06N3/02—Neural networks > G06N3/04—Architecture, e.g. interconnection topology > G06N3/045—Combinations of networks
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00—Computing arrangements based on biological models > G06N3/02—Neural networks > G06N3/08—Learning methods
Abstract
The disclosure provides a knowledge graph question-answering entity linking method, apparatus, device, storage medium and program product, applicable to the technical field of artificial intelligence. The method comprises the following steps: acquiring a user question; extracting an entity mention from the user question; retrieving, from the knowledge graph, T candidate entities matching the entity mention; acquiring an entity description refinement encoding for each of the T candidate entities, wherein the entity description refinement encoding is obtained based on character string differences in entity description content between the candidate entity and N similar entities of the candidate entity; acquiring an encoded representation of the entity mention; calculating a second similarity between the entity mention and each candidate entity based on the encoded representation of the entity mention and the entity description refinement encoding of each of the T candidate entities; and determining the candidate entity with the largest second similarity among the T candidate entities as the linked entity of the entity mention.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular to a knowledge graph question-answering entity linking method, apparatus, device, medium, and program product.
Background
The knowledge graph represents knowledge in the form of entity-relation-entity triples, forming, as a whole, a graph structure in which entities are nodes and the relations between entities are edges. Knowledge graph question answering is one of the typical applications of knowledge graphs: semantic understanding is performed on a natural language question input by a user, and the corresponding answer is then obtained by querying and reasoning over the knowledge graph, so as to meet the user's need.
In the knowledge graph question answering process, the correctness of question entity linking is critical for answering user questions correctly. Question entity linking comprises two steps: entity mention extraction and entity disambiguation. Entity mention extraction means extracting from the user question the character string used to express an entity; this character string may be the same as, or different from, the name character string of the entity in the knowledge graph. Entity disambiguation means, after the entity mention has been extracted, correctly linking it to a unique entity in the knowledge graph. For example, when the same person name appears in different questions, the referenced objects may be different entities. Clearly, if accurate and effective entity disambiguation cannot be achieved during linking, the user question cannot be answered correctly.
In the related art, question entity linking methods based on entity description representations extend entity description content (usually content that explains the entity) into the knowledge graph to enrich the description information of entities. When question entities are linked, the entity description information can then be used to improve entity disambiguation. However, because entity description content is extensive, noise information in it that is unrelated to the user question may limit the improvement in entity disambiguation.
Disclosure of Invention
In view of the above, the present disclosure provides a knowledge graph question-answering entity linking method, apparatus, device, medium, and program product that can reduce noise interference.
According to a first aspect of the present disclosure, a knowledge graph question-answering entity linking method is provided. The method comprises the following steps: acquiring a user question; extracting an entity mention from the user question, the entity mention being the character string by which an entity of the knowledge graph is expressed in the user question; retrieving, from the knowledge graph, T candidate entities matching the entity mention, where T is an integer greater than 1; acquiring an entity description refinement encoding for each of the T candidate entities, wherein the entity description refinement encoding is obtained based on character string differences in entity description content between the candidate entity and N similar entities of the candidate entity, the N similar entities being the N entities in the knowledge graph whose first similarity to the candidate entity satisfies a preset condition, and N being an integer greater than or equal to 1; obtaining an encoded representation of the entity mention; calculating a second similarity between the entity mention and each of the T candidate entities based on the encoded representation of the entity mention and the entity description refinement encoding of each candidate entity; and determining the candidate entity with the largest second similarity among the T candidate entities as the linked entity of the entity mention.
According to an embodiment of the present disclosure, acquiring the entity description refinement encoding of each of the T candidate entities includes: acquiring entity description key content of the candidate entity, wherein the character strings in the entity description key content belong to the entity description content of the candidate entity but not to the entity description content of any of the N similar entities; obtaining an encoded representation of the entity description key content; obtaining an encoded representation of the candidate entity; and obtaining the entity description refinement encoding of the candidate entity based on the encoded representation of the candidate entity and the encoded representation of the entity description key content.
According to an embodiment of the present disclosure, acquiring the entity description key content of the candidate entity includes: processing the character strings in the entity description content of the candidate entity according to a first rule to obtain a first character string set; processing the character strings in the entity description content of the N similar entities according to the first rule to obtain a second character string set; and subtracting the second character string set from the first character string set to obtain the entity description key content of the candidate entity. The first rule comprises splitting the character strings of the entity description content at the granularity of its constituent units and then deduplicating them.
According to an embodiment of the present disclosure, obtaining the encoded representation of the entity description key content includes: concatenating the user question with the entity description key content of each of the T candidate entities to form a first combined character string; encoding the first combined character string with a second text encoder to obtain an encoded representation of the first combined character string; extracting, from the encoded representation of the first combined character string, the encoding corresponding to the position range of each candidate entity's entity description key content within the first combined character string; and obtaining the encoded representation of the entity description key content based on the extracted encoding of each candidate entity's entity description key content.
According to an embodiment of the present disclosure, obtaining the entity description refinement encoding of the candidate entity based on the encoded representation of the candidate entity and the encoded representation of the entity description key content includes: taking a preset hyperparameter as the weight coefficient of the encoded representation of the entity description key content, and computing the weighted sum of the encoded representation of the candidate entity and the encoded representation of the entity description key content to obtain the entity description refinement encoding of the candidate entity.
According to an embodiment of the present disclosure, obtaining the encoded representation of the candidate entity includes: encoding a second combined character string composed of the candidate entity and its entity description content with a first text encoder to obtain an encoded representation of the second combined character string; and extracting, from the encoded representation of the second combined character string, the encoding corresponding to the position range of the candidate entity within the second combined character string to obtain the encoded representation of the candidate entity.
According to an embodiment of the present disclosure, before the entity description refinement encoding of each of the T candidate entities is acquired, the method further comprises: calculating the first similarity between the candidate entity and the other entities in the knowledge graph based on the encoded representation of the candidate entity and the encoded representations of the other entities; and selecting, from the knowledge graph, the N entities whose first similarity satisfies the preset condition to obtain the N similar entities of the candidate entity.
According to an embodiment of the present disclosure, before the user question is acquired, the method further includes: encoding the entities in the knowledge graph using a first text encoder to obtain encoded representations of the entities in the knowledge graph.
According to an embodiment of the present disclosure, extracting the entity mentions in the user question further includes predicting the entity mentions in the user question and their prediction probabilities using a neural network. The method then further comprises: normalizing the T second similarities corresponding to the T candidate entities to obtain, for each of the T candidate entities, a link probability representing how likely that candidate entity is the correct link; and outputting the entity mention and its linked entity when the product of the prediction probability of the entity mention and the link probability of its linked entity is greater than a probability threshold. A sketch of this scoring step follows.
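As an illustration only (the patent does not prescribe a concrete implementation), the normalization and thresholding described above might look like the following sketch, assuming softmax normalization and hypothetical names such as `mention_prob` and `threshold`:

```python
import numpy as np

def select_link(second_sims, mention_prob, threshold=0.5):
    """Normalize T candidate similarities into link probabilities and
    apply the joint probability threshold described above.

    second_sims : array of T second-similarity scores
    mention_prob: prediction probability of the entity mention
    threshold   : probability threshold (hypothetical default)
    """
    # Softmax normalization turns the T similarities into link probabilities.
    exp = np.exp(second_sims - np.max(second_sims))
    link_probs = exp / exp.sum()

    best = int(np.argmax(link_probs))           # candidate with largest similarity
    joint = mention_prob * link_probs[best]     # product of the two probabilities
    return best if joint > threshold else None  # output only above the threshold
```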
According to an embodiment of the present disclosure, retrieving the T candidate entities matching the entity mention from the knowledge graph includes: calculating matching scores based on a comparison of the entity mention's character string with the name character strings of the entities in the knowledge graph; and screening the T entities with the highest matching scores from the knowledge graph to obtain the T candidate entities.
According to an embodiment of the present disclosure, obtaining the encoded representation of the entity mention includes: encoding the character string of the user question with a second text encoder to obtain an encoded representation matrix of the user question; extracting, from the encoded representation matrix of the user question, the group of vectors corresponding to the position range of the entity mention's character string within the user question to obtain the encoding matrix of the entity mention; and averaging the vectors in that encoding matrix element-wise to obtain the encoded representation of the entity mention.
According to an embodiment of the present disclosure, the method further comprises: encoding the entities in the knowledge graph with a first text encoder to obtain encoded representations of the entities in the knowledge graph, wherein the first similarity is calculated based on the vectors of the entities' encoded representations; encoding the character string of the user question with a second text encoder to obtain an encoded representation matrix of the user question, wherein the encoded representation of the entity mention is obtained by extracting from that matrix the vectors corresponding to the position range of the entity mention's character string within the user question; and predicting the entity mentions in the user question with a fully-connected neural network. The first text encoder, the second text encoder and the fully-connected neural network are obtained through collaborative training, where the sample data used in training comprises the knowledge graph, sample user questions, and the correct entity mentions and correct linked entities in the sample user questions.
In a second aspect of the embodiments of the present disclosure, a knowledge graph question-answering entity linking apparatus is provided. The apparatus comprises a first acquisition module, a first extraction module, a matching module, a second acquisition module, a third acquisition module, a second similarity calculation module and a linking module. The first acquisition module is used to acquire a user question. The first extraction module is used to extract entity mentions from the user question, an entity mention being the character string by which an entity of the knowledge graph is expressed in the user question. The matching module is used to retrieve, from the knowledge graph, T candidate entities matching the entity mention, where T is an integer greater than 1. The second acquisition module is used to acquire the entity description refinement encoding of each of the T candidate entities, the entity description refinement encoding being obtained based on character string differences in entity description content between the candidate entity and its N similar entities, where the N similar entities are the N entities in the knowledge graph whose first similarity to the candidate entity satisfies a preset condition, and N is an integer greater than or equal to 1. The third acquisition module is used to obtain the encoded representation of the entity mention. The second similarity calculation module is used to calculate the second similarity between the entity mention and each of the T candidate entities based on the encoded representation of the entity mention and the entity description refinement encoding of each candidate entity. The linking module is configured to determine the candidate entity with the largest second similarity among the T candidate entities as the linked entity of the entity mention.
According to an embodiment of the present disclosure, the first extraction module comprises a fully-connected neural network for predicting the entity mentions in the user question and their prediction probabilities. Accordingly, the linking module is further configured to: normalize the T second similarities corresponding to the T candidate entities to obtain, for each of the T candidate entities, a link probability representing how likely that candidate entity is the correct link; and output the entity mention and its linked entity when the product of the prediction probability of the entity mention and the link probability of its linked entity is greater than a probability threshold.
According to an embodiment of the present disclosure, the second acquisition module is specifically configured to: acquire the entity description key content of the candidate entity, wherein the character strings in the entity description key content belong to the entity description content of the candidate entity but not to the entity description content of any of the N similar entities; obtain an encoded representation of the entity description key content; obtain an encoded representation of the candidate entity; and obtain the entity description refinement encoding of the candidate entity based on the encoded representation of the candidate entity and the encoded representation of the entity description key content.
According to an embodiment of the present disclosure, the apparatus further comprises a first text encoder configured to encode the entities in the knowledge graph to obtain encoded representations of the entities in the knowledge graph.
According to an embodiment of the present disclosure, the apparatus further comprises a first similarity calculation module configured to: calculate the first similarity between the candidate entity and the other entities in the knowledge graph based on the encoded representation of the candidate entity and the encoded representations of the other entities; and select, from the knowledge graph, the N entities whose first similarity satisfies the preset condition to obtain the N similar entities of the candidate entity.
According to an embodiment of the present disclosure, the apparatus further comprises a second text encoder configured to encode the character string of the user question to obtain an encoded representation matrix of the user question. Accordingly, the third acquisition module is configured to: extract, from the encoded representation matrix of the user question, the group of vectors corresponding to the position range of the entity mention's character string within the user question to obtain the encoding matrix of the entity mention; and average the vectors in that encoding matrix element-wise to obtain the encoded representation of the entity mention.
According to an embodiment of the present disclosure, the first text encoder, the second text encoder and the fully-connected neural network are obtained through collaborative training, where the sample data used in training comprises the knowledge graph, sample user questions, and the correct entity mentions and correct linked entities in the sample user questions.
In a third aspect of the embodiments of the present disclosure, an electronic device is provided. The electronic device includes one or more processors and a memory. The memory is used to store one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the above method.
In a fourth aspect of the embodiments of the present disclosure, there is also provided a computer-readable storage medium having stored thereon executable instructions, which when executed by a processor, cause the processor to perform the above-mentioned method.
In a fifth aspect of the embodiments of the present disclosure, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the above method.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:
fig. 1 schematically illustrates an application scenario of the knowledge graph question-answering entity linking method, apparatus, device, storage medium and program product according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flowchart of a knowledge graph question-answering entity linking method according to an embodiment of the present disclosure;
fig. 3 schematically shows a flowchart of obtaining the entity description refinement encoding in the knowledge graph question-answering entity linking method according to an embodiment of the present disclosure;
fig. 4 schematically shows a flowchart of acquiring the entity description key content in the knowledge graph question-answering entity linking method according to an embodiment of the present disclosure;
fig. 5 schematically shows a flowchart of obtaining the encoded representation of the entity description key content in the knowledge graph question-answering entity linking method according to an embodiment of the present disclosure;
fig. 6 schematically shows a flowchart of encoding the entities in the knowledge graph in a knowledge graph question-answering entity linking method according to another embodiment of the present disclosure;
FIG. 7 schematically illustrates a flowchart of a knowledge graph question-answering entity linking method according to yet another embodiment of the present disclosure;
FIG. 8 schematically illustrates a flowchart of a knowledge graph question-answering entity linking method according to still another embodiment of the present disclosure;
FIG. 9 schematically illustrates the data flow in the knowledge graph question-answering entity linking method shown in FIG. 8;
fig. 10 schematically illustrates a block diagram of the structure of a knowledge graph question-answering entity linking apparatus according to an embodiment of the present disclosure; and
fig. 11 schematically shows a block diagram of an electronic device adapted to implement the knowledge graph question-answering entity linking method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that these descriptions are illustrative only and are not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
The terms used herein are to be interpreted as follows:
Knowledge graph: a structured semantic knowledge base that describes the concepts and entities of the physical world, and the relations between them, in symbolic form; at the data level it forms a heterogeneous graph structure with typed, attributed entities as nodes and the relations between entities as edges.
Entity: a specific object in the objective world, such as a person name, a place name, or an organization name. Category names such as "human", "animal" and "plant" are not entities.
Entity mention: a character string by which an entity is expressed in text. For example, "the United States" and "the United States of America" are two different mentions of the entity the United States.
Encoding: converting natural language into a real number matrix using a deep learning network.
Fully-connected neural network: the most basic neural network, which transforms an input Di-dimensional vector into a Do-dimensional vector.
Training: learning (determining) the ideal values of the weights and biases of a model from labeled/correct samples.
Loss function/loss: a function that maps the value of a random event, or of its associated random variable, to a non-negative real number representing the "risk" or "loss" of that event.
Gradient backpropagation: a common method for training artificial neural networks; it calculates the gradient of the loss function with respect to all weights in the network, and this gradient is used to update the weights so as to minimize the loss function.
Entity description content: also referred to as "entity description text"; it typically contains content that explains the entity. For example, the entity description content of the entity "Jilin Province" may be "Jilin Province, a provincial administrative region of China, whose capital is Changchun." As another example, the entity description content of "Jilin City" may be "Jilin City, also known as the River City of the North, a prefecture-level city in Jilin Province."
Entity description key content: the character strings of an entity's entity description key content belong to the entity description content of that entity, but do not belong to the entity description content of any of the entity's N similar entities. The N similar entities of an entity are the N entities in the knowledge graph whose first similarity to the entity satisfies a preset condition.
Constituent unit: the encoding granularity preset when a text encoder encodes the character strings of a language text; it can be a character, a word, a segmented word, or a subword, etc., set in advance according to actual needs.
First text encoder and second text encoder: natural language understanding encoder models; given natural language sentences as input, after training they output an encoded representation (which can be expressed in vector form) of each constituent unit of the sentences.
In the disclosed embodiments, the first text encoder is mainly used to encode the entities in the knowledge graph. Its input is a character string formed by concatenating an entity name with its description information, and its output is a real number matrix composed of the encoding vectors of all constituent units in the character string. The encoding vectors corresponding to the position of the entity name are extracted from this matrix (if there are several vectors, they are averaged element-wise) to obtain the encoded representation of the entity.
In the embodiments of the present disclosure, the second text encoder is configured to encode the character strings of the user question, the entity mention, the entity description content and/or the entity description key content, concatenated separately or jointly. After training, the second text encoder outputs the encoded representation of each constituent unit of the character string, where the dimension of each constituent unit's encoded representation is the same as the dimension of an entity's encoded representation.
The embodiments of the present disclosure provide a knowledge graph question-answering entity linking method, apparatus, device, medium, and program product. In the method, when the encoded representation of an entity is obtained, the entity description refinement encoding can be obtained by incorporating the entity description key content. Because the entity description key content is derived from the character string differences between the entity description content of the entity and that of its similar entities, the entity description refinement encoding strengthens the differences between the entity and its similar entities and highlights the entity's key points and uniqueness. When the entity description refinement encoding is used for entity link matching, noise interference in entity linking is reduced and the entity disambiguation effect is improved.
It should be noted that the method and apparatus provided by the embodiments of the present disclosure may be used in the financial field, and may also be used in any field other than the financial field; the present disclosure does not limit the application field.
Fig. 1 schematically illustrates an application scenario of the knowledge graph question-answering entity linking method, apparatus, device, storage medium and program product according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied, intended to help those skilled in the art understand the technical content of the present disclosure; it does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the application scenario 100 according to this embodiment may include at least one terminal device (three are shown in the figure, terminal devices 101, 102, 103), a network 104, and a server 105. Network 104 is the medium that provides communication links between terminal devices 101, 102, 103 and server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. A knowledge graph may be deployed in the server 105.
The user can send a user question to the server 105 using the terminal devices 101, 102, 103. The server 105 can perform semantic understanding on the user question, then query and infer from the knowledge graph to obtain corresponding answers, and send the answers to the terminal devices 101, 102, and 103, so that intelligent knowledge graph question answering is realized.
It should be noted that the knowledge graph question-answering entity linking method provided by the embodiments of the present disclosure may generally be executed by the server 105. Accordingly, the knowledge graph question-answering entity linking apparatus provided by the embodiments of the present disclosure may generally be disposed in the server 105. The method may also be performed by a server or server cluster that is different from the server 105 and can communicate with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the apparatus may also be disposed in a server or server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The knowledge graph question-answering entity linking method according to the embodiments of the present disclosure will be described in detail below with reference to fig. 2 to 9, based on the scenario described in fig. 1.
Fig. 2 schematically shows a flowchart of a knowledge graph question-answering entity linking method according to an embodiment of the present disclosure.
As shown in fig. 2, the knowledge graph question-answering entity linking method of this embodiment may include operations S210 to S270.
In operation S210, a user question is acquired.
In operation S220, the entity mentions in the user question are extracted; an entity mention is the character string by which an entity of the knowledge graph is expressed in the user question. In one embodiment, a trained fully-connected neural network may be used to predict the entity mentions in the user question.
In operation S230, T candidate entities matching the entity mention are retrieved from the knowledge graph, where T is an integer greater than 1. The T candidate entities may be found by fuzzy matching, a technique in which strings matching a pattern are found approximately rather than exactly.
In one implementation, matching scores may be calculated by comparing the entity mention's character string with the name character strings of the entities in the knowledge graph, after which the T entities with the highest matching scores are screened from the knowledge graph as the T candidate entities. The comparison may be based, for example, on the overlap ratio of characters or character combinations between the entity mention's character string and an entity's name character string, or on the edit distance determined by the difference between the two character strings, with the edit distance used as the matching score. A sketch of edit-distance-based retrieval follows.
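A minimal sketch only (the patent does not fix a concrete implementation), assuming a plain Levenshtein distance and a hypothetical `entity_names` list:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (single-row dynamic programming)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def retrieve_candidates(mention: str, entity_names: list[str], T: int = 10) -> list[str]:
    """Screen the T entities whose names are closest to the mention."""
    ranked = sorted(entity_names, key=lambda name: edit_distance(mention, name))
    return ranked[:T]  # smallest edit distance = highest matching score
```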
In operation S240, the entity description refinement encoding of each of the T candidate entities is acquired. The entity description refinement encoding is obtained based on the character string differences in entity description content between the candidate entity and its N similar entities. The N similar entities are the N entities in the knowledge graph whose first similarity to the candidate entity satisfies a preset condition, where N is an integer greater than or equal to 1.
In one embodiment, the N similar entities of each entity may be selected as follows: after the entities in the knowledge graph have been encoded by the first text encoder according to a uniform rule, the similarity between the vectors of each entity's encoded representation and those of the other entities is calculated (characterized, for example, by the vector inner product, the included angle, or the cosine similarity), and for each entity either the N most similar entities or the entities whose similarity reaches a threshold (e.g., 90%) are selected. In this way, once a candidate entity has been determined, its similar entities can be obtained from the knowledge graph in operation S240.
When the first text encoder is a pre-trained language representation model such as BERT, it may encode an entity in the knowledge graph as follows. The entity name and the entity's description information (including but not limited to the entity description content) are concatenated into a character string, for example "[CLS] entity name [ENT] description information [SEP]", where [CLS] is the start token of the string, [ENT] is the separator between the entity name and the description information, and [SEP] is the end token; the output is an encoding matrix. During encoding, the [CLS] token may be treated as one constituent unit that summarizes the whole string. In the embodiments of the present disclosure, the encoding vector at the [CLS] position may therefore be taken from the encoding matrix output by the first text encoder as the encoded representation of the entity; its dimension is h, where h may be set to 768 or 1024, for example. A sketch of this entity-encoding step follows.
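A minimal sketch of this entity-encoding step, assuming the Hugging Face `transformers` library and a BERT checkpoint (the patent itself names neither a library nor a checkpoint):

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")  # stands in for the first text encoder

def encode_entity(name: str, description: str) -> torch.Tensor:
    """Concatenate entity name and description, encode with BERT, and
    take the [CLS] vector as the entity's encoded representation."""
    # Off-the-shelf BERT has no [ENT] token; [SEP] is used here as the separator.
    text = f"{name} [SEP] {description}"
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = encoder(**inputs)
    return outputs.last_hidden_state[0, 0]  # [CLS] position, dimension h = 768
```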
In other embodiments, the N similar entities of an entity may also be determined by character-level comparison, for example according to the degree of overlap of the characters, or of the character combinations, of two entity names.
It should be understood that the N similar entities of each candidate entity in operation S240 may be selected offline before operation S240, or may be obtained by real-time screening after the candidate entities are determined in operation S230.
The entity description refinement encoding of each candidate entity is based on the character string differences in entity description content between the candidate entity and its N similar entities. The character string difference can be expressed as the set of character strings that belong only to the entity description content of the candidate entity and not to that of any of the N similar entities. This set of character strings (hereinafter the "entity description key content") may take characters as its minimum constituent unit, or segmented words (e.g., Chinese word segments or subwords), or words (e.g., for languages such as English).
The entity description refinement encoding may be derived from the entity description key content of a candidate entity in various ways.
For example, the entity description key content may be used as the candidate entity's description information, or as part of it, and the candidate entity encoded with the first text encoder to obtain the entity description refinement encoding of the candidate entity.
As another example, the entity description key content may be encoded and then fused (e.g., by weighted sum) with the encoded representation of the corresponding candidate entity to obtain the entity description refinement encoding of the candidate entity.
When encoding the entity description key content, in some embodiments it may be encoded separately with the second text encoder; in other embodiments, the entity description key content and the user question may be concatenated and then encoded with the second text encoder, so that the encoded representation of the entity description key content carries the interactive matching relation between the entity description key content and the user question.
For example, the second text encoder may encode a character string formed by concatenating the user question with the entity description key content of multiple candidate entities, after which the encoding of each candidate entity's entity description key content part is extracted from the encoder's output. In this way, the encoded representation of each candidate entity's entity description key content fuses information about its relation to the user question. Furthermore, after the encoded representation of the entity description key content is fused with the encoded representation of the candidate entity to obtain the entity description refinement encoding, the refinement encoding reflects both the difference between the candidate entity and its N similar entities and the matching information between the candidate entity's entity description key content and the user question. When used for entity link matching, it therefore improves entity disambiguation more effectively.
In operation S250, an encoded representation of the entity mention is obtained.
In one embodiment, the entity mention may be encoded separately by the second text encoder, and the encoder's output taken as the encoded representation of the entity mention.
In another embodiment, the character string of the user question may be encoded with the second text encoder to obtain the encoded representation matrix of the user question; the group of vectors corresponding to the position range of the entity mention's character string within the user question is then extracted from that matrix to obtain the encoding matrix of the entity mention; finally, the vectors in the encoding matrix of the entity mention are averaged element-wise to obtain the encoded representation of the entity mention. This process encodes the entity mention together with the context in which it appears, so the resulting encoded representation better reflects the real meaning of the entity mention in the user question and matches the practice, in natural language understanding, of interpreting language in its specific context. A sketch of this mention-encoding step follows.
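A minimal sketch, assuming the same `transformers` setup as above and using the tokenizer's offset mapping to align character positions with token positions (an implementation choice, not prescribed by the patent):

```python
import torch
from transformers import BertTokenizerFast, BertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")  # stands in for the second text encoder

def encode_mention(question: str, start: int, end: int) -> torch.Tensor:
    """Encode the question, slice the token vectors covering the mention
    span [start, end) of the question string, and mean-pool them."""
    inputs = tokenizer(question, return_tensors="pt", return_offsets_mapping=True)
    offsets = inputs.pop("offset_mapping")[0]  # (char_start, char_end) per token
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state[0]  # (seq_len, h)
    # Keep the tokens whose character span overlaps the mention span.
    keep = [i for i, (s, e) in enumerate(offsets.tolist())
            if s < end and e > start and e > s]  # e > s skips special tokens
    return hidden[keep].mean(dim=0)  # element-wise average = mention encoding
```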
In operation S260, the second similarity between the entity mention and each of the T candidate entities is calculated based on the encoded representation of the entity mention and the entity description refinement encoding of each candidate entity. The second similarity may be characterized, for example, by the inner product, the included angle, or the cosine similarity between the vector of the entity mention's encoded representation and the vector of each candidate entity's entity description refinement encoding, as in the formula below.
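For illustration only, write $v_m$ for the encoded representation of the entity mention and $\tilde{v}_{e_i}$ for the entity description refinement encoding of the i-th candidate entity (these symbols are introduced here and do not appear in the patent). The cosine form of the second similarity is then:

$$ \operatorname{sim}_2(m, e_i) = \frac{v_m \cdot \tilde{v}_{e_i}}{\lVert v_m \rVert \, \lVert \tilde{v}_{e_i} \rVert}, \qquad i = 1, \dots, T. $$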
In operation S270, the candidate entity with the largest second similarity among the T candidate entities is determined as the linked entity of the entity mention.
According to the embodiments of the present disclosure, the entity description refinement encoding strengthens the difference between an entity and its similar entities and highlights the entity's key points and uniqueness; when used for entity link matching, it therefore reduces the noise interference of question entity linking methods based on entity description representations and improves the entity disambiguation effect.
Moreover, as introduced above, different ways of obtaining the entity description refinement encoding improve entity disambiguation to different degrees. In particular, when the second text encoder encodes the character string formed by concatenating the user question with the entity description key content, and the encoded representation of the entity description key content (and from it the entity description refinement encoding) is obtained from that string, the resulting refinement encoding is derived in combination with the specific context of the user question, is more targeted, and improves the accuracy of entity linking.
According to an embodiment of the present disclosure, before operations S210 to S270, the machine learning models used in them, such as the first text encoder, the second text encoder, and the fully-connected neural network, may be trained cooperatively. The sample data used in training comprises the knowledge graph, sample user questions, and the correct entity mentions and correct linked entities in the sample user questions. During training, the entity mention and linked entity predicted in operation S270 may be compared with the correct entity mention and its correct linked entity to obtain the value of a loss function, and each machine learning model is then trained through gradient backpropagation. A sketch of one possible training step follows.
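A highly simplified sketch of one cooperative training step, assuming PyTorch and hypothetical modules `encoder1`, `encoder2` and `mention_net` standing in for the first text encoder, the second text encoder and the fully-connected network; the patent does not fix a loss, so cross-entropy over the mention spans and over the T candidates is assumed here:

```python
import torch
import torch.nn.functional as F

def training_step(batch, encoder1, encoder2, mention_net, optimizer):
    """One cooperative training step over a batch of sample questions.

    batch is assumed to provide: question inputs, candidate-entity inputs,
    gold mention span labels, and the index of the correct linked entity.
    """
    q_hidden = encoder2(**batch["question_inputs"]).last_hidden_state
    span_logits = mention_net(q_hidden)                 # predict mention spans
    mention_loss = F.cross_entropy(
        span_logits.view(-1, span_logits.size(-1)), batch["span_labels"].view(-1))

    # Refined encodings of the T candidates vs. the mention encoding.
    cand_vecs = encoder1(**batch["candidate_inputs"]).last_hidden_state[:, 0]
    mention_vec = q_hidden[:, 0]                        # simplification: [CLS] pooling
    sims = mention_vec @ cand_vecs.T                    # second similarities
    link_loss = F.cross_entropy(sims, batch["gold_candidate"])

    loss = mention_loss + link_loss                     # joint loss
    optimizer.zero_grad()
    loss.backward()                                     # gradient backpropagation
    optimizer.step()
    return loss.item()
```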
In the entity disambiguation of the embodiments of the present disclosure, the entity is represented by the entity description refinement encoding. Its representation learning cost is lower than that of question entity linking methods based on knowledge graph context representations; and compared with question entity linking methods based on entity description representations, it highlights the key information in the entity representation that is most relevant to the user question and reduces noise information, thereby optimizing the entity's encoded representation and improving the entity linking effect. A question entity linking method based on knowledge graph context representations extends the original representation of an entity with relevance information such as the entity's related entities and relations in the knowledge graph; feature representation learning is usually performed on the graph structure of the knowledge graph, the learned entity encoding integrates the relevance information, and the encoding is then used in the subsequent entity mention recognition and entity disambiguation steps. A question entity linking method based on entity description representations extends the original description of an entity in the knowledge graph with the entity's description text (usually content that explains the entity); the extended entity description, for example a text comprising the entity name and the entity description, is then input into a first text encoder to obtain the entity's encoded representation, after which the subsequent entity mention recognition and entity disambiguation steps are performed.
Thus, question entity linking methods based on knowledge graph context representations have high information and representation acquisition costs, while methods based on entity description representations have low acquisition costs but contain much irrelevant noise information that harms the linking effect. Addressing both defects, the embodiments of the present disclosure reduce noise information by extracting and encoding the entity description key content: the information and representation acquisition cost remains low, while the entity's encoded representation is optimized and the entity linking effect is improved.
Fig. 3 schematically shows a flowchart of operation S240, obtaining the entity description refinement encoding, in the knowledge graph question-answering entity linking method according to an embodiment of the present disclosure.
As shown in fig. 3, operation S240 may include operations S241 to S244 according to the embodiment.
In operation S241, the entity description key content of the candidate entity is acquired. The character strings in the entity description key content belong to the entity description content of the candidate entity, but not to the entity description content of any of the N similar entities. In one embodiment, the entity description key content may be obtained by a set difference operation, as illustrated in fig. 4 below.
An encoded representation of the entity description key content is then obtained in operation S242. In one embodiment, the entity description key content may be encoded separately with the second text encoder. In another embodiment, the entity description key content may first be concatenated with related information such as the user question and then encoded with the second text encoder, as illustrated in fig. 5 below.
In operation S243, an encoded representation of the candidate entity is obtained. For example, a second combined character string composed of the candidate entity and its entity description content may be encoded with the first text encoder to obtain an encoded representation of the second combined character string, and the encoding corresponding to the position range of the candidate entity within the second combined character string is then extracted from that encoded representation to obtain the encoded representation of the candidate entity.
In operation S244, the entity description refinement encoding of the candidate entity is obtained based on the encoded representation of the candidate entity and the encoded representation of the entity description key content. For example, a preset hyperparameter may be used as the weight coefficient of the encoded representation of the entity description key content, and the weighted sum of the encoded representation of the candidate entity and the encoded representation of the entity description key content taken as the entity description refinement encoding of the candidate entity, as formalized below.
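For illustration only, write $v_e$ for the encoded representation of the candidate entity, $v_k$ for the encoded representation of its entity description key content, and $\lambda$ for the preset hyperparameter (symbols introduced here, not in the patent). One natural reading of the weighted sum, with the candidate entity's weight fixed to 1, is:

$$ \tilde{v}_e = v_e + \lambda \, v_k. $$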
Fig. 4 schematically shows a flowchart of operation S241, acquiring the entity description key content, in the knowledge graph question-answering entity linking method according to an embodiment of the present disclosure.
As shown in fig. 4, operation S241 may include operations S401 to S403 according to the embodiment.
In operation S401, the character strings in the entity description content of the candidate entity are processed according to the first rule, so as to obtain a first character string set. The first rule comprises that the composition units of the entity description content are used as splitting granularity, and the character strings in the entity description content are split and then are deduplicated.
The constituent elements may be determined according to actual requirements. For example, a character in a character string may be taken as a constituent unit, where a character string in the first character string set is a character; or for example, for a Chinese knowledge graph, a word segmentation tool can be used to segment the entity description content, so that one character string in the first character string set is a word; still alternatively, for languages such as english, words may be used as constituent elements, so that one string in the first string set is a word.
After splitting the entity description content of the candidate entity into a plurality of character strings according to the granularity of the constituent units, the plurality of character strings may be deduplicated, so that only one character string is retained in the first character string set.
In operation S402, the character strings in the entity description contents of the N similar entities are processed according to the first rule, so as to obtain a second character string set. And processing the second character string set in the same way as the first character string set to obtain a second character string set corresponding to the entity description contents of the N similar entities.
In operation S403, the entity description key content of the candidate entity is obtained by subtracting the second character string set from the first character string set.
The character strings in the obtained entity description key contents belong to the entity description contents of the candidate entities, but do not belong to the entity description contents of any one similar entity of the N similar entities.
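For illustration, operations S401 to S403 might be sketched as follows in Python, assuming characters are chosen as the constituent units (a word-level variant would swap in a segmentation tool); the function names are hypothetical:

```python
# Minimal sketch of operations S401–S403 at character granularity.
def char_set(description: str) -> set:
    # First rule: split at the constituent-unit (character) granularity,
    # then deduplicate — a Python set does both at once.
    return set(description)

def key_content(candidate_desc: str, similar_descs: list) -> set:
    first = char_set(candidate_desc)          # first character string set
    second = set()                            # second character string set
    for desc in similar_descs:
        second |= char_set(desc)
    return first - second                     # set difference (operation S403)
```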
Fig. 5 schematically shows a flowchart of operation S242 in the question-answer sentence entity linking method according to the embodiment of the present disclosure to obtain the encoded representation of the entity description key content.
As shown in fig. 5, operation S242 may include operations S501 to S504 according to the embodiment.
In operation S501, a user question is concatenated with entity description key contents of each of the T candidate entities to form a first combined character string.
In operation S502, the first combined string is encoded using a second text encoder, resulting in an encoded representation of the first combined string. When the second text encoder encodes the combined character string, an encoding vector of each constituent unit is output according to the constituent units of the first combined character string.
In operation S503, the encoding corresponding to the position range, in the first combined string, of the entity description key content of each candidate entity is intercepted from the encoded representation of the first combined string.
In operation S504, an encoded representation of the entity description key content is obtained based on the intercepted encoding corresponding to the entity description key content of each candidate entity. When the entity description key content comprises a plurality of constituent units, the intercepted encoding correspondingly comprises a plurality of vectors, and these vectors can be averaged element-wise to obtain the encoded representation of the entity description key content.
According to the embodiment of the disclosure, when the encoding of the entity description key content is obtained, the entity description key content is spliced with the user question into one character string before encoding. The interactive relation between the two can thus be fused during encoding, so that the encoded representation of the entity description key content carries information about its degree of match with the user question, which improves the pertinence of the subsequent entity linking for entity mentions in the user question.
Fig. 6 schematically shows a flowchart of a process of encoding entities in a knowledge-graph in a method for linking knowledge-graph question-answer sentence entities according to another embodiment of the present disclosure.
As shown in fig. 6, according to the embodiment of the present disclosure, before operations S210 to S270, an encoding process may be performed on an entity in a knowledge graph, and specifically, operations S610 to S630 may be included.
In operation S610, an entity in a knowledge-graph is encoded using a first text encoder, resulting in an encoded representation of the entity in the knowledge-graph.
In operation S620, a first similarity of each entity to other entities in the knowledge-graph is calculated based on the encoded representation of each entity and the encoded representations of the other entities in the knowledge-graph.
For example, the first similarity is calculated by an operation between the vector corresponding to the encoded representation of one entity and the vector corresponding to the encoded representation of another entity, and may be characterized by the inner product of the vectors, the included angle between the vectors, or the cosine similarity of the vectors, etc.
Then, in operation S630, for each entity, N entities with the first similarity satisfying a preset condition may be selected from the knowledge graph to obtain N similar entities corresponding to the candidate entity. Thus, after T candidate entities are matched in operation S230, N similar entities of each candidate entity can be directly obtained from the knowledge-graph.
The preset condition may be that, after the first similarities are ranked from high to low, the entity falls within the top N. When the first similarity is characterized by different parameters, its physical meaning differs. For example, when the first similarity is characterized by the inner product of vectors, a larger inner product means the two vectors are more similar, i.e., the first similarity is larger; when it is characterized by the included angle of the vectors, a smaller angle means the two vectors are more similar, i.e., the first similarity is larger; when it is characterized by cosine similarity, a larger cosine value means the two vectors are more similar, i.e., the first similarity is larger.
Alternatively, the preset condition may be that the first similarity is better than a predetermined threshold, i.e., the inner product or cosine similarity exceeds the threshold or, for the vector included angle, falls below it; the predetermined threshold may be set according to the parameter chosen to characterize the first similarity.
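For illustration, the three interchangeable parameterizations of the first similarity might be sketched as follows; the function name is hypothetical:

```python
import numpy as np

def first_similarity(u: np.ndarray, v: np.ndarray, kind: str = "inner") -> float:
    """Compare two entity encodings; which direction means 'more similar'
    depends on the chosen parameterization, as described above."""
    cos = float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    if kind == "inner":    # larger inner product -> more similar
        return float(u @ v)
    if kind == "cosine":   # larger cosine -> more similar
        return cos
    if kind == "angle":    # smaller included angle -> more similar
        return float(np.arccos(np.clip(cos, -1.0, 1.0)))
    raise ValueError(f"unknown kind: {kind}")
```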
According to the embodiment of the disclosure, before the knowledge graph is used for the question-answer interaction of the user, the entities in the knowledge graph are subjected to offline coding, and similar entities are screened out, so that the interaction efficiency with the user during the question-answer of the knowledge graph can be improved.
Fig. 7 schematically shows a flowchart of a method for linking knowledge-graph question-answer sentence entities according to still another embodiment of the present disclosure.
As shown in fig. 7, the knowledge-graph question-answer sentence entity linking method of this embodiment may include operations S210, S720, S230 to S270, and S780 to S790.
First, in operation S210, a user question is acquired.
Then, in operation S720, entity mentions in the user question and their predicted probabilities are predicted using the fully-connected neural network. Operation S720 is a specific embodiment of operation S220.
Next, through operations S230 to S270, T candidate entities matching the entity mention are retrieved from the knowledge graph, a second similarity between each candidate entity and the entity mention is calculated based on the encoded representation of the entity mention and the entity description refinement code of each candidate entity, and the candidate entity with the largest second similarity is determined as the linked entity of the entity mention. The details of operations S230 to S270 may refer to the foregoing description and are not repeated here.
Then, in operation S780, the T second similarities corresponding to the T candidate entities are normalized to obtain, for each of the T candidate entities, a link probability representing how likely its link is correct.
The T second similarities may be processed by softmax normalization, for example. With softmax normalization, the T second similarities are compressed into a T-dimensional real vector σ(z) such that each element of σ(z) lies in the range (0, 1) and all elements sum to 1.
Next, in operation S790, when the product of the predicted probability of the entity mention and the link probability of the entity mention's linked entity is greater than a probability threshold, the entity mention and its linked entity are output.
When the product is greater than the probability threshold, the credibility of the predicted entity mention and its linked entity is considered to meet the requirement, and the entity mention and its linked entity may be output. When the product is not greater than the probability threshold, the credibility of the predicted entity mention and its linked entity may be considered too low, and the entity mention and its linked entity may be deleted.
In this way, when the user question is answered by referring to the entity mention and its linked entity output by operation S790, the accuracy of the knowledge-graph answer may be improved.
Next, the application of the knowledge-graph question-answer sentence entity linking method according to the embodiments of the present disclosure is exemplarily described with reference to the specific embodiments of fig. 8 and 9. In the examples of fig. 8 and 9, "the opening time of the Jilin Songhuajiang culture festival" is used as the user question. It is to be understood that the examples of fig. 8 and 9 are illustrative and do not limit the present disclosure.
Fig. 8 schematically shows a flowchart of a method for linking an entity of a knowledge-graph question-answer sentence according to still another embodiment of the present disclosure. Fig. 9 schematically shows a data flow diagram in the method for linking an entity of a knowledge-graph question-answer sentence shown in fig. 8.
As shown in fig. 8, in conjunction with fig. 9, the method for linking a knowledge-graph question-answer sentence entity of this embodiment may include steps S1 to S9.
Step S1: entity offline encoding. In this step, the entities in the knowledge graph are encoded using a first text encoder. This step is a specific embodiment of operation S610 described above.
With reference to fig. 9, the specific process of encoding the entity in S1 is as follows:
(1) Input: the set of names of all entities, and the entity description text of each entity. For example, the entity description text of the entity "Jilin province" is "Jilin province, a Chinese province-level administrative district whose capital is Changchun."; the entity description text of "Jilin City" is "Jilin City, also known as the Northern Jiangcheng, a prefecture-level city of Jilin province.". Of course, in some embodiments, the input may also include information such as the association relationships between each entity and the other entities in the knowledge graph;
(2) Output: the encoded representation of each entity, i.e., an h-dimensional real vector, where h is an integer;
(3) Encoding process: offline, splice the character string of the entity name and the character string of its entity description text into one character string, input it into the first text encoder for vector encoding, and obtain the encoding corresponding to the entity name string from the encoder output as the encoded representation of the entity:
for example, with the Bert model as the first text encoder, the process of performing entity description encoding is as follows:
(3.1) Splice the entity name and its entity description text into one character string: "[CLS] entity name [ENT] entity description text [SEP]", where [CLS] is the start token of the string, [ENT] is the separator between the entity name and the entity description text, and [SEP] is the end token of the string;
(3.2) Input the character string into the Bert model for vector encoding and take the output encoding at the [CLS] position as the encoding of the entity; its dimension h is configurable, e.g., 768 or 1024.
For example, the entity names and entity description texts of "Jilin province" and "Jilin City" are respectively spliced into the character strings "[CLS] Jilin province [ENT] Jilin province, a Chinese province-level administrative district whose capital is Changchun. [SEP]" and "[CLS] Jilin City [ENT] Jilin City, also known as the Northern Jiangcheng, a prefecture-level city of Jilin province. [SEP]", which are then input to Bert for vector encoding, and the output encodings at the [CLS] position are taken as the respective encoded representations of the two entities "Jilin province" and "Jilin City".
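A minimal sketch of this step with the Hugging Face transformers library; the checkpoint bert-base-chinese and the example description strings are assumptions, and [ENT] must be registered as a new special token because it is not in Bert's stock vocabulary:

```python
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")

# [ENT] is not in Bert's vocabulary; register it and resize the embeddings.
tokenizer.add_special_tokens({"additional_special_tokens": ["[ENT]"]})
model.resize_token_embeddings(len(tokenizer))

def encode_entity(name: str, description: str) -> torch.Tensor:
    # The tokenizer adds [CLS] ... [SEP] around the text automatically.
    inputs = tokenizer(f"{name} [ENT] {description}", return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state[0, 0]   # encoding at the [CLS] position, h-dim

e1 = encode_entity("吉林省", "吉林省，中国省级行政区，省会是长春。")
e2 = encode_entity("吉林市", "吉林市，别称北国江城，是吉林省地级市。")
```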
Step S2: offline selection of similar entities. This step is a specific embodiment of operations S620 to S630 described above. With reference to fig. 9, the specific process of this step is as follows:
(1) Input: the encoded representation of each entity output by S1, i.e., an h-dimensional vector;
(2) Output: the set Se of the N most similar entities of each entity e, N being an integer;
(3) Selection process: calculate the first similarity of every pair of entities based on their encoded representations, and then select offline the N entities most similar to each entity (i.e., its N similar entities). Specifically, for each entity e, compute the inner product between the h-dimensional vector corresponding to e and the h-dimensional vectors corresponding to all other entities, and add the N entities with the largest inner products to Se.
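A sketch of this offline selection, assuming the entity encodings from S1 are stacked into a matrix E with one row per entity; N = 8 is an arbitrary example value:

```python
import numpy as np

def select_similar(E: np.ndarray, n: int) -> np.ndarray:
    """E: (num_entities, h) matrix of entity encodings from step S1.
    Row e of the result lists the indices of the N most similar entities Se."""
    sims = E @ E.T                       # pairwise inner products
    np.fill_diagonal(sims, -np.inf)      # an entity is not its own neighbour
    return np.argsort(-sims, axis=1)[:, :n]

E = np.random.randn(1000, 768).astype(np.float32)   # dummy encodings
Se = select_similar(E, n=8)
```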
Step S3: offline extraction of the entity description key content. This step is a specific embodiment of operation S241 described above. With reference to fig. 9, the specific process of this step is as follows:
(1) Input: the description texts of all entities in the knowledge graph, and the set Se of N similar entities of each entity e output by S2;
(2) Output: the entity description key content of each entity e;
(3) Extraction process: offline, extract from the entity description text of each entity e the part that differs from the description texts of its N similar entities, as the entity description key content of that entity. With characters as the constituent units of the entity description text, the extraction process is as follows:
(3.1) Convert the entity description text of entity e into a character set Ce, removing duplicate characters and invalid characters, where the invalid characters include punctuation such as commas, function words (such as the possessive particle rendered as "'s"), and digits;
(3.2) Splice the entity description texts of the N similar entities of entity e into one character string, convert it into a character set Cs, and likewise remove duplicate and invalid characters;
(3.3) Entity description key content of entity e = Ce − Cs, where the subtraction denotes the set difference. Thus, the entity description key content of entity e is the set of characters that belong to the description of entity e but not to the descriptions of any of its N similar entities.
Step S4: user question encoding. This step encodes with a second text encoder; with reference to fig. 9, the specific process is as follows:
(1) Input: the question character string input by the user, for example, "the opening time of the Jilin Songhuajiang culture festival";
(2) Output: the encoded representation of the user question, i.e., a q_len × h real matrix Mq, where q_len and h are integers, q_len is the length of the unit sequence obtained by segmenting the question into the encoder's constituent units (e.g., characters), and h is the dimension of the vector corresponding to each unit;
(3) Encoding process: input the character string of the user question into an existing text encoder for vector encoding, and obtain the encoding of each constituent unit of the string from the output:
For example, with Bert as the second text encoder, the user question is encoded as follows:
(3.1) Splice the user question into a character string: "[CLS] user question [SEP]";
(3.2) Input the character string into Bert for vector encoding, and obtain the encodings of all constituent units of the string from the output; for the Chinese strings processed by the Bert encoder in this example, the constituent units are characters. For instance, for the question "the opening time of the Jilin Songhuajiang culture festival", the Mq obtained from the Bert encoder is a 13 × h real matrix, the user question containing 13 characters.
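Continuing the transformers-based sketch from step S1 (same assumed checkpoint), the token-level output of the encoder yields Mq; Bert's own [CLS] and [SEP] rows are stripped so that one row is left per question character:

```python
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")  # assumed
model = BertModel.from_pretrained("bert-base-chinese")

inputs = tokenizer("吉林松花江文化节的开幕时间", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state[0]  # includes [CLS]/[SEP] rows
Mq = hidden[1:-1]                                  # 13 × h, one row per character
```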
Step S5: entity mention prediction. This step is a specific embodiment of operation S220 or S720 described above. With reference to fig. 9, the entity mentions in the user question can be predicted with a fully-connected neural network as follows:
(1) Input: the encoded representation of the user question output by S4, i.e., the q_len × h real matrix Mq;
(2) Output: the M entity mentions of the user question, as (entity mention, prediction probability) tuples consisting of each mention and its prediction probability;
(3) Prediction process:
(3.1) For 0 ≤ i ≤ q_len−1, calculate the probability ps(i) that the i-th constituent unit (e.g., the i-th character) of the user question is the starting unit of an entity mention. Specifically, the vector corresponding to each unit is converted into a probability value by a fully-connected neural network, for example a 1-layer fully-connected network Mq × Ws + bs, where Ws is an h × 1 matrix and bs is a real number;
(3.2) For 0 ≤ i ≤ q_len−1, calculate the probability pe(i) that the i-th constituent unit (e.g., the i-th character) of the question is the ending unit of an entity mention, likewise with a 1-layer fully-connected network Mq × We + be, where We is an h × 1 matrix and be is a real number;
(3.3) For 0 ≤ i ≤ q_len−1, calculate the probability pm(i) that the i-th constituent unit (e.g., the i-th character) of the user question is a constituent unit of an entity mention, likewise with a 1-layer fully-connected network Mq × Wm + bm, where Wm is an h × 1 matrix and bm is a real number;
(3.4) For 0 ≤ i ≤ j ≤ q_len−1, calculate the probability p([i, j]) that the continuous unit interval [i, j] of the user question (e.g., the interval from the i-th to the j-th character) is an entity mention; specifically, p([i, j]) = Sigmoid(ps(i) + pe(j) + Σ_{i ≤ t ≤ j} pm(t)), where Sigmoid is the neural network activation function that maps a variable to between 0 and 1;
(3.5) Select the top-M continuous unit intervals by the probability p([i, j]) as the entity mentions. For example, for the question "the opening time of the Jilin Songhuajiang culture festival", probabilities are calculated for continuous unit intervals such as "Jilin", "Linsong" and "Songhuajiang", and the two intervals with the largest probabilities are selected together with their probability values, for example ("Jilin", 0.6) and ("Songhuajiang", 0.3).
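A sketch of this span-scoring scheme with randomly initialized 1-layer fully-connected heads (so the outputs are illustrative only); following the formula in (3.4) as reconstructed above, ps, pe and pm are treated as per-unit scores whose sum is squashed by the sigmoid:

```python
import torch

h, q_len, M = 768, 13, 2
Mq = torch.randn(q_len, h)                # question encoding from step S4

Ws = torch.nn.Linear(h, 1)                # start-unit head  (Mq × Ws + bs)
We = torch.nn.Linear(h, 1)                # end-unit head    (Mq × We + be)
Wm = torch.nn.Linear(h, 1)                # member-unit head (Mq × Wm + bm)
ps, pe, pm = Ws(Mq).squeeze(-1), We(Mq).squeeze(-1), Wm(Mq).squeeze(-1)

def p_interval(i: int, j: int) -> float:
    # p([i, j]) = Sigmoid(ps(i) + pe(j) + sum_{i <= t <= j} pm(t))
    return torch.sigmoid(ps[i] + pe[j] + pm[i:j + 1].sum()).item()

scores = {(i, j): p_interval(i, j)
          for i in range(q_len) for j in range(i, q_len)}
mentions = sorted(scores, key=scores.get, reverse=True)[:M]   # top-M intervals
```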
Step S6: candidate entity recall. This step is a specific embodiment of operation S230 described above. With reference to fig. 9, the specific process is as follows:
(1) Input: the M entity mentions of the user question output by S5 and their prediction probabilities;
(2) Output: T candidate entities returned for each entity mention in the user question;
(3) Recall process: for each entity mention, perform fuzzy matching against each entity in the knowledge graph using any character matching method, and calculate a matching score f:
For example, when the matching score f is calculated from the edit distance between the entity mention string and the entity name string, the entity mention "Jilin" has f = 1 against the entity "Jilin City", f = 1 against "Jilin province", and f = 3 against "Changchun City";
Then screen out the T best-matching candidate entities by the matching score f. For example, when f is the edit distance between the entity mention string and the entity name string, a smaller f indicates a better match, so the entity mention "Jilin" matches "Jilin City" and "Jilin province" more closely than "Changchun City".
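The Levenshtein edit distance is one concrete choice for the "any character matching method" above; a sketch reproducing the f values of the example, computed on the original Chinese strings (吉林 "Jilin", 吉林市 "Jilin City", 吉林省 "Jilin province", 长春市 "Changchun City"):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; a smaller score f means a better match."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

mention = "吉林"
for entity in ("吉林市", "吉林省", "长春市"):
    print(entity, edit_distance(mention, entity))   # f = 1, 1, 3
```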
Step S7: entity description key content encoding of the candidate entities. This step is a specific embodiment of operation S242 described above. With reference to fig. 9, the specific process is as follows:
(1) Input: the character string of the user question, the M entity mentions and their prediction probabilities output by S5, the T candidate entities of each entity mention output by S6, and the entity description key content of each entity output by S3;
(2) Output: the encoded representation of the entity description key content of each candidate entity, an h-dimensional real vector;
(3) Encoding process: first compute the set of all candidate entities over all entity mentions; then splice the user question string and the entity description key content strings of all candidate entities into one character string and input it into the second text encoder (i.e., the same text encoder used in S4) for vector encoding; then obtain the encoding of the entity description key content of each candidate entity from the output. The specific process is as follows:
(3.1) Take the T candidate entities of each entity mention as a set; the set of all candidate entities is then the union of the candidate entity sets of all entity mentions;
(3.2) Splice the character string of the user question and the entity description key content strings of all candidate entities into one character string, input it into the second text encoder for vector encoding, and output the encodings of all constituent units of the whole string;
(3.3) Extract the encoding of the entity description key content of each candidate entity according to the start and end positions of that key content in the spliced string. That is, the encoding corresponding to the position range of the candidate entity's key content string is intercepted from the encoded representation matrix of the spliced string and averaged element-wise, yielding the encoded representation of the entity description key content of the candidate entity.
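Continuing the same transformers sketch (the second text encoder is shared with step S4), and assuming one Bert token per Chinese character so that string positions line up with token positions (a production version would use the tokenizer's offset mapping); the key content strings below are hypothetical:

```python
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")  # assumed
model = BertModel.from_pretrained("bert-base-chinese")              # shared with S4

question = "吉林松花江文化节的开幕时间"
key_contents = ["省会长春", "北国江城"]   # hypothetical key contents of two candidates

enc = tokenizer(question + "".join(key_contents), return_tensors="pt")
with torch.no_grad():
    H = model(**enc).last_hidden_state[0]  # one vector per constituent unit

reps, offset = [], 1 + len(question)       # +1 skips the [CLS] position
for key in key_contents:
    span = H[offset:offset + len(key)]     # intercept the key-content range
    reps.append(span.mean(dim=0))          # element-wise average -> h-dim vector
    offset += len(key)
```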
Step S8: entity description refinement encoding of the candidate entities. This step is a specific embodiment of operation S240 described above. With reference to fig. 9, the specific process is as follows:
(1) Input: the encoded representation of each entity output by S1 (an h-dimensional real vector), the M entity mentions of the user question and their prediction probabilities output by S5, the T candidate entities of each entity mention output by S6, and the encoded representations of the entity description key content of all candidate entities output by S7;
(2) Output: the entity description refinement codes of all candidate entities, each an h-dimensional real vector.
(3) Refinement process: for each candidate entity, fuse the encoded representation of the entity output by S1 with the encoded representation of its entity description key content output by S7 to obtain the entity description refinement code. For example, a refinement process based on weighted summation is:
Take the T candidate entities matched by each entity mention as a set; the full set of candidate entities is the union of the candidate entity sets of all entity mentions;
For each candidate entity e in the full set of candidate entities:
- query the encoded entity representations output by S1 to obtain the encoded representation De of entity e;
- query the encoded representations of entity description key content output by S7 to obtain the encoded representation Ie of the entity description key content of e;
- calculate the entity description refinement code of entity e as Re = De + a × Ie, where a is a hyper-parameter determined empirically, with value interval [0, 1].
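The fusion itself is a one-line weighted sum; in the sketch below the value a = 0.5 and the random vectors are stand-ins for the real outputs of S1 and S7:

```python
import torch

a = 0.5                      # hyper-parameter in [0, 1], tuned empirically
De = torch.randn(768)        # encoded representation of entity e, from S1
Ie = torch.randn(768)        # key-content encoding of entity e, from S7
Re = De + a * Ie             # entity description refinement code
```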
Step S9: matching and ranking of entity mention encodings against entity description refinement codes. This step is an embodiment of operations S270, S780 and S790 described above. With reference to fig. 9, the detailed process is as follows:
(1) Input: the encoded representation of the user question output by S4, i.e., the q_len × h real matrix Mq; the M entity mentions of the user question and their prediction probabilities output by S5; the T candidate entities of each entity mention output by S6; and the entity description refinement code of each entity output by S8 (an h-dimensional vector);
(2) Output: the set of (entity mention, linked entity) pairs corresponding to the user question;
(3) Calculation process:
(3.1) For each entity mention m, compute the encoded representation of m from the encoded representation Mq of the user question: intercept from Mq the encoding vectors of all constituent units covered by the entity mention m and average them element-wise;
(3.2) For the T candidate entities of each entity mention m, query the entity description refinement codes output by S8 to obtain the refinement codes of the T candidates;
(3.3) For each entity mention m and its T candidate entities, compute the inner product of the encoded representation of m with the entity description refinement code of each candidate, normalize the T inner products with softmax to obtain a probability distribution over the T candidates, and select the candidate with the largest probability value as the linked entity of the entity mention m. A set of M (entity mention, linked entity) pairs is output for the user question;
(3.4) For each entity mention m, compute the product of the prediction probability of the mention output by S5 and the probability of its linked entity, and delete from the M (entity mention, linked entity) pairs those whose probability product is below the probability threshold; the remaining (entity mention, linked entity) pairs are output as the entity linking result after entity disambiguation.
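A sketch of (3.1) to (3.4) for a single mention, with random tensors standing in for the outputs of the earlier steps; the 0.1 probability threshold is an assumed value:

```python
import torch

h, T = 768, 3
Mq = torch.randn(13, h)                        # question encoding from S4
i, j = 0, 1                                    # mention span, e.g. "Jilin"
m_vec = Mq[i:j + 1].mean(dim=0)                # (3.1) intercept and average

R = torch.randn(T, h)                          # (3.2) refinement codes from S8
link_probs = torch.softmax(R @ m_vec, dim=0)   # (3.3) inner products + softmax
best = int(link_probs.argmax())                # index of the linked entity

p_mention = 0.6                                # prediction probability from S5
if p_mention * float(link_probs[best]) >= 0.1: # (3.4) threshold filter
    print("keep:", (i, j), "->", best)
```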
As can be seen from fig. 9, in the solution of the embodiment of the present disclosure, a first text encoder is used in step S1, a second text encoder is used in steps S4 and S7, and a fully-connected neural network is used in step S5. The first text encoder, the second text encoder and the fully-connected neural network can be obtained through collaborative training. The training uses a training set consisting of user questions, the knowledge graph, and the correct entity mentions of each given user question together with their correct linked entities. In each training iteration, after the steps of S1 to S9 are executed, a loss value is calculated as the sum of the cross-entropy loss of the entity mention prediction result and the log-likelihood loss of the correct linked entity; cross-entropy loss and log-likelihood loss are two loss functions measuring the deviation between a neural network's predicted value and the true value. Gradient back-propagation is then performed based on the loss value to train the model.
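The loss composition described above might be sketched as follows, with random tensors standing in for real model outputs and a single gold mention and gold entity assumed:

```python
import torch
import torch.nn.functional as F

# Mention prediction: cross-entropy between interval probabilities p([i, j])
# and 0/1 labels marking the correct entity mentions.
span_probs = torch.rand(10, requires_grad=True)   # p([i, j]) per interval
gold_spans = torch.zeros(10)
gold_spans[3] = 1.0                               # the correct mention
mention_loss = F.binary_cross_entropy(span_probs, gold_spans)

# Entity linking: negative log-likelihood of the correct linked entity
# under the softmax distribution over the T candidates (step S9).
link_logits = torch.randn(3, requires_grad=True)
link_loss = -torch.log_softmax(link_logits, dim=0)[0]   # gold index 0

loss = mention_loss + link_loss   # sum of the two losses
loss.backward()                   # gradient back-propagation for training
```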
Therefore, according to the embodiment of the disclosure, with the user question and the entities in the knowledge graph as input, a first text encoder is first used to encode the entities offline; entity similarities are computed from these encoded representations so that the N similar entities of each entity can be selected offline, and the information of the similar entities can then be used to extract the entity description key content offline. Next, a second text encoder encodes the user question, entity mention prediction and candidate entity recall are performed, the entity description key content of the candidate entities is encoded and further fused into entity description refinement codes; finally, based on the matching and ranking of the encoded representation of each entity mention against the entity description refinement codes of its T candidate entities, the candidate entity with the largest probability is selected as the linked entity, and the entity mentions with their linked entities are returned as the entity linking result.
In this way, the high cost of acquiring information and representations in context-based entity linking over a knowledge graph can be overcome to a certain extent, and the large amount of irrelevant noise contained in entity descriptions, which makes description-based entity linking cheap but noisy, can be largely avoided. Through the extraction and encoding of entity description key content, the embodiment of the disclosure can effectively reduce noise interference in entity linking at a low information and representation acquisition cost, while also optimizing the encoded representation of the entities and improving the entity linking effect.
Based on the knowledge-graph question-answer entity linking method of the foregoing embodiments, the embodiments of the disclosure also provide a knowledge-graph question-answer entity linking apparatus. The apparatus is described in detail below with reference to fig. 10.
Fig. 10 schematically shows a block diagram of the structure of a knowledge-graph question-answer entity linking apparatus 1000 according to an embodiment of the present disclosure.
As shown in fig. 10, according to an embodiment of the present disclosure, the apparatus 1000 may include a first obtaining module 1010, a first extracting module 1020, a matching module 1030, a second obtaining module 1040, a third obtaining module 1050, a second similarity calculation module 1060, and a linking module 1070. According to other embodiments of the present disclosure, the apparatus 1000 may further include a first text encoder 1080, a second text encoder 1090, and/or a first similarity calculation module 10100. The apparatus 1000 may be used to implement the methods described with reference to fig. 2-9.
The first obtaining module 1010 is configured to obtain a question from a user. In one embodiment, the first obtaining module 1010 may perform the operation S210 described above.
The first extraction module 1020 is configured to extract entity mentions in the user question, where an entity mention is the character string representation, in the user question, of an entity in the knowledge graph. In one embodiment, the first extraction module 1020 may perform operation S220 or operation S720 described previously. In one embodiment, the first extraction module 1020 includes a fully-connected neural network for predicting the entity mentions in the user question and their prediction probabilities.
The matching module 1030 is configured to retrieve, from the knowledge graph, T candidate entities that match the entity mention, where T is an integer greater than 1. In one embodiment, the matching module 1030 may perform operation S230 described previously.
The second obtaining module 1040 is configured to obtain the entity description refinement code of each candidate entity in the T candidate entities; the entity description refinement code is obtained based on the character string differences in entity description content between the candidate entity and its N similar entities, where the N similar entities are the N entities in the knowledge graph whose first similarity to the candidate entity satisfies a preset condition, N being an integer greater than or equal to 1. In one embodiment, the second obtaining module 1040 may perform operation S240 described previously.
According to some embodiments of the present disclosure, the second obtaining module 1040 is specifically configured to: acquire the entity description key content of a candidate entity, wherein the character strings in the entity description key content belong to the entity description content of the candidate entity but not to the entity description content of any one of the N similar entities; obtain an encoded representation of the entity description key content; obtain an encoded representation of the candidate entity; and obtain the entity description refinement code of the candidate entity based on the encoded representation of the candidate entity and the encoded representation of the entity description key content.
The third obtaining module 1050 is used to obtain the encoded representation mentioned by the entity. In one embodiment, the third obtaining module 1050 may perform the operation S250 described above.
In some embodiments, the third obtaining module 1050 may obtain the encoded representation of the entity mention by means of the second text encoder 1090. Specifically, the second text encoder 1090 is configured to encode the character string of the user question to obtain the encoding representation matrix of the user question. Accordingly, the third obtaining module 1050 may specifically be configured to: intercept, from the encoding representation matrix of the user question, the vector group corresponding to the position range of the entity mention's character string in the user question, obtaining the encoding matrix corresponding to the entity mention; and average the vectors in that encoding matrix element-wise to obtain the encoded representation of the entity mention.
The second similarity calculation module 1060 is configured to calculate a second similarity of the entity mention to each of the T candidate entities based on the encoded representation of the entity mention and the entity description refinement code of each candidate entity.
The linking module 1070 is configured to determine the candidate entity with the largest second similarity among the T candidate entities as the linked entity of the entity mention. According to an embodiment of the present disclosure, the linking module 1070 may perform operation S270 described above.
In other embodiments, where the first extraction module 1020 predicts the entity mention in the user question and its prediction probability through the fully-connected neural network, the linking module 1070 may be further configured to: normalize the T second similarities respectively corresponding to the T candidate entities to obtain a link probability representing the link correctness of each of the T candidate entities, and output the entity mention and its linked entity when the product of the prediction probability of the entity mention and the link probability of its linked entity is greater than a probability threshold. Accordingly, the linking module 1070 may perform operations S780 to S790 described above.
A first text encoder 1080 may be used to encode the entities in the knowledge-graph resulting in an encoded representation of the entities in the knowledge-graph. In one embodiment, the first text encoder may perform operation S610 described above.
The first similarity calculation module 10100 is configured to: calculate a first similarity of the candidate entity to other entities in the knowledge graph based on the encoded representation of the candidate entity and the encoded representations of the other entities; and select, from the knowledge graph, N entities whose first similarity satisfies a preset condition to obtain the N similar entities corresponding to the candidate entity. In one embodiment, the first similarity calculation module 10100 may perform operations S620 and S630 described previously.
According to an embodiment of the present disclosure, the apparatus 1000 may further include a training module. The training module may be configured to cooperatively train the first text encoder 1080, the second text encoder 1090, and the fully-connected neural network, where sample data used in the training process includes the knowledge-graph, the sample user question, and a correct entity mention in the sample user question and a correct linked entity thereof.
According to the embodiment of the present disclosure, any of the first obtaining module 1010, the first extracting module 1020, the matching module 1030, the second obtaining module 1040, the third obtaining module 1050, the second similarity calculating module 1060, the linking module 1070, the first text encoder 1080, the second text encoder 1090, the first similarity calculating module 10100, the fully-connected neural network, and the training module may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. At least one of the first obtaining module 1010, the first extracting module 1020, the matching module 1030, the second obtaining module 1040, the third obtaining module 1050, the second similarity calculating module 1060, the linking module 1070, the first text encoder 1080, the second text encoder 1090, the first similarity calculating module 10100, the fully-connected neural network, and the training module according to embodiments of the present disclosure may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or any other reasonable manner of integrating or packaging a circuit, etc., or as any one of or a suitable combination of software, hardware, and firmware. Alternatively, at least one of the first obtaining module 1010, the first extracting module 1020, the matching module 1030, the second obtaining module 1040, the third obtaining module 1050, the second similarity calculation module 1060, the linking module 1070, the first text encoder 1080, the second text encoder 1090, the first similarity calculation module 10100, the fully-connected neural network, and the training module may be at least partially implemented as a computer program module that, when executed, may perform corresponding functions.
Fig. 11 schematically shows a block diagram of an electronic device adapted to implement a method of linking knowledge-graph question-answer entities according to an embodiment of the present disclosure.
As shown in fig. 11, an electronic device 1100 according to an embodiment of the present disclosure includes a processor 1101, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage section 1108 into a random access memory (RAM) 1103. The processor 1101 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and/or related chip sets, and/or a special-purpose microprocessor (e.g., an application-specific integrated circuit (ASIC)), among others. The processor 1101 may also include on-board memory for caching purposes. The processor 1101 may include a single processing unit or multiple processing units for performing different actions of the method flows according to embodiments of the present disclosure.
In the RAM 1103, various programs and data necessary for the operation of the electronic device 1100 are stored. The processor 1101, the ROM 1102, and the RAM 1103 are connected to each other by a bus 1104. The processor 1101 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1102 and/or the RAM 1103. It is to be noted that the programs may also be stored in one or more memories other than the ROM 1102 and the RAM 1103. The processor 1101 may also perform various operations of the method flows according to the embodiments of the present disclosure by executing programs stored in the one or more memories.
The present disclosure also provides a computer-readable storage medium, which may be embodied in the device/apparatus/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but not limited to: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 1102 and/or the RAM 1103 and/or one or more memories other than the ROM 1102 and the RAM 1103 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer system, the program code is used for causing the computer system to realize the method provided by the embodiment of the disclosure.
The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 1101. The above described systems, devices, modules, units, etc. may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of a signal over a network medium, and downloaded and installed through the communication portion 1109 and/or installed from the removable medium 1111. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing. When executed by the processor 1101, the computer program performs the above-described functions defined in the system of the embodiment of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In accordance with embodiments of the present disclosure, program code for executing the computer programs provided by the embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined in various ways, even if such combinations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims may be combined without departing from the spirit or teaching of the present disclosure, and all such combinations fall within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.
Claims (16)
1. A method for linking knowledge-graph question-answer entities, comprising the following steps:
acquiring a question of a user;
extracting entity mentions in the user question, wherein an entity mention is the character string representation, in the user question, of an entity in the knowledge graph;
retrieving T candidate entities matching the entity mention from the knowledge-graph, wherein T is an integer greater than 1;
acquiring entity description refinement codes of each candidate entity in the T candidate entities; wherein the entity description refinement coding representation is obtained based on character string differences in entity description contents between the candidate entity and N similar entities of the candidate entity, wherein the N similar entities are N entities in the knowledge graph, of which a first similarity to the candidate entity satisfies a preset condition, and N is an integer greater than or equal to 1;
obtaining a coded representation of the entity mention;
calculating a second similarity of the entity mention to each of the T candidate entities based on the encoded representation of the entity mention and the entity description refinement code of each candidate entity; and
and determining the candidate entity with the largest second similarity in the T candidate entities as the link entity mentioned by the entity.
2. The method of claim 1, wherein said obtaining entity description refinement coding for each of the T candidate entities comprises:
acquiring entity description key content of the candidate entity; wherein the character string in the entity description key content belongs to the entity description content of the candidate entity but does not belong to the entity description content of any one of the N similar entities;
obtaining an encoded representation of the entity description key content;
obtaining a coded representation of the candidate entity; and
obtaining the entity description refinement code for the candidate entity based on the encoded representation of the candidate entity and the encoded representation of the entity description key content.
3. The method of claim 2, wherein the obtaining entity description key content for the candidate entity comprises:
processing character strings in the entity description content of the candidate entity according to a first rule to obtain a first character string set;
processing the character strings in the entity description contents of the N similar entities according to the first rule to obtain a second character string set; and
subtracting a second character string set from the first character string set to obtain entity description key content of the candidate entity;
the first rule comprises taking the constituent units of the entity description content as the splitting granularity, splitting the character strings in the entity description content and then deduplicating them.
4. The method of claim 2, wherein said obtaining an encoded representation of the entity description key content comprises:
splicing the user question and the entity description key content of each candidate entity in the T candidate entities to form a first combined character string;
encoding the first combined character string by using a second text encoder to obtain an encoded representation of the first combined character string;
intercepting, from the encoded representation of the first combined string, an encoding corresponding to a range of positions of the entity description key content of each candidate entity in the first combined string; and
obtaining an encoded representation of the entity description key content based on the intercepted code corresponding to the entity description key content of each candidate entity.
5. The method of claim 2, wherein the deriving the entity description refinement encoding of the candidate entity based on the encoded representation of the candidate entity and the encoded representation of the entity description key content comprises:
and taking a preset hyper-parameter as a weight coefficient of the coded representation of the entity description key content, and carrying out weighted summation on the coded representation of the candidate entity and the coded representation of the entity description key content to obtain the entity description refinement code of the candidate entity.
6. The method of claim 2, wherein the obtaining the encoded representation of the candidate entity comprises:
encoding a second combined character string consisting of the candidate entity and the entity description content thereof by using a first text encoder to obtain an encoded representation of the second combined character string; and
and intercepting codes corresponding to the position range of the candidate entity in the second combined character string from the coded representation of the second combined character string to obtain the coded representation of the candidate entity.
7. The method of claim 1, wherein prior to said obtaining entity description refinement coding for each of said T candidate entities, said method further comprises:
calculating the first similarity of the candidate entity to other entities in the knowledge-graph based on the encoded representations of the candidate entity and the encoded representations of the other entities in the knowledge-graph; and
and selecting N entities with the first similarity meeting the preset condition from the knowledge graph to obtain the N similar entities corresponding to the candidate entities.
8. The method of claim 1, wherein prior to said obtaining a user question, the method further comprises:
and encoding the entities in the knowledge graph by using a first text encoder to obtain encoded representation of the entities in the knowledge graph.
9. The method of claim 1, wherein,
the extracting entity mentions in the user question further comprises: predicting the entity mention and the prediction probability thereof in the user question by utilizing a full-connection neural network;
the method further comprises the following steps:
normalizing the T second similarity degrees corresponding to the T candidate entities respectively to obtain a link probability for representing the correct link of each of the T candidate entities; and
outputting the entity mention and its linked entity when the product of the predicted probability of the entity mention and the linking probability of the linked entity of the entity mention is greater than a probability threshold.
10. The method of claim 1, wherein said retrieving from the knowledge-graph the T candidate entities that match the entity mention comprises:
calculating a matching score based on a comparison of the entity-referenced character strings to name character strings of entities in the knowledge-graph; and
and screening the T entities with the highest matching scores from the knowledge graph to obtain the T candidate entities.
11. The method of claim 1, wherein said obtaining the encoded representation of the entity mention comprises:
coding the character string of the user question by using a second text coder to obtain a coding expression matrix of the user question;
intercepting a vector group corresponding to the position range of the character string mentioned by the entity in the user question from the coding expression matrix of the user question to obtain a coding matrix corresponding to the entity mention; and
and averaging the vectors in the encoding matrix corresponding to the entity mention element-wise to obtain the encoded representation of the entity mention.
12. The method of claim 1, wherein the method further comprises:
encoding the entities in the knowledge graph by using a first text encoder to obtain encoded representations of the entities in the knowledge graph; wherein the first similarity is calculated based on a vector of the encoded representation of the entity;
coding the character string of the user question by using a second text coder to obtain a coding expression matrix of the user question; wherein the coded representation of the entity mention is obtained by intercepting a vector corresponding to a position range of the character string mentioned by the entity in the user question from the coded representation matrix; and
predicting the entity mention in the user question using a fully connected neural network;
wherein,
the first text encoder, the second text encoder and the fully-connected neural network are obtained through collaborative training, wherein sample data used in the training process comprise the knowledge map, the sample user question and correct entity mentions and correct link entities in the sample user question.
13. A knowledge graph question-answering entity linking device, comprising:
a first obtaining module, configured to obtain a user question;
a first extraction module, configured to extract an entity mention in the user question, wherein the entity mention is a character string in the user question that expresses an entity in the knowledge graph;
a matching module, configured to retrieve, from the knowledge graph, T candidate entities that match the entity mention, wherein T is an integer greater than 1;
a second obtaining module, configured to obtain an entity description refinement encoding of each of the T candidate entities, wherein the entity description refinement encoding is obtained based on character string differences in entity description content between the candidate entity and N similar entities of the candidate entity, the N similar entities being N entities in the knowledge graph whose first similarity to the candidate entity satisfies a preset condition, and N being an integer greater than or equal to 1;
a third obtaining module, configured to obtain the encoded representation of the entity mention;
a second similarity calculation module, configured to calculate a second similarity between the entity mention and each of the T candidate entities based on the encoded representation of the entity mention and the entity description refinement encoding of each candidate entity; and
a linking module, configured to determine, among the T candidate entities, the candidate entity with the largest second similarity as the linked entity of the entity mention.
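The "character string differences in entity description content" that feed the refinement encoding could be extracted in many ways; the sketch below is one hypothetical reading using `difflib`, keeping only the substrings of the candidate's description that each similar entity's description lacks, so the encoder sees the discriminative text.

```python
from difflib import SequenceMatcher

def description_difference(candidate_desc: str, similar_descs: list[str]) -> str:
    """Collect substrings of candidate_desc that each similar description lacks."""
    pieces = []
    for other in similar_descs:
        matcher = SequenceMatcher(None, candidate_desc, other)
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
            if tag in ("replace", "delete"):  # text present only in the candidate
                pieces.append(candidate_desc[i1:i2])
    return "".join(pieces)
```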
14. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-12.
15. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 12.
16. A computer program product comprising a computer program which, when executed by a processor, carries out the method according to any one of claims 1 to 12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210649326.1A CN114925681B (en) | 2022-06-08 | 2022-06-08 | Knowledge graph question-answering question-sentence entity linking method, device, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114925681A true CN114925681A (en) | 2022-08-19 |
CN114925681B CN114925681B (en) | 2024-08-27 |
Family
ID=82813523
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210649326.1A Active CN114925681B (en) | 2022-06-08 | 2022-06-08 | Knowledge graph question-answering question-sentence entity linking method, device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114925681B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118116620A (en) * | 2024-04-28 | 2024-05-31 | 支付宝(杭州)信息技术有限公司 | Medical question answering method and device and electronic equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106934012A (en) * | 2017-03-10 | 2017-07-07 | 上海数眼科技发展有限公司 | A knowledge graph-based natural language question answering method and system |
CN108388582A (en) * | 2012-02-22 | 2018-08-10 | 谷歌有限责任公司 | Methods, systems and devices for identifying related entities |
CN110188182A (en) * | 2019-05-31 | 2019-08-30 | 中国科学院深圳先进技术研究院 | Model training method, dialogue generation method, device, equipment and medium |
CN111259653A (en) * | 2020-01-15 | 2020-06-09 | 重庆邮电大学 | Knowledge graph question-answering method, system and terminal based on entity relationship disambiguation |
EP3933700A1 (en) * | 2020-06-30 | 2022-01-05 | Siemens Aktiengesellschaft | A method and apparatus for performing entity linking |
Also Published As
Publication number | Publication date |
---|---|
CN114925681B (en) | 2024-08-27 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US11501182B2 (en) | Method and apparatus for generating model | |
WO2022037256A1 (en) | Text sentence processing method and device, computer device and storage medium | |
CN112613314A (en) | Electric power communication network knowledge graph construction method based on BERT model | |
CN113672708B (en) | Language model training method, question-answer pair generation method, device and equipment | |
CN109902301B (en) | Deep neural network-based relationship reasoning method, device and equipment | |
CN113392209B (en) | Text clustering method based on artificial intelligence, related equipment and storage medium | |
CN114880991B (en) | Knowledge graph question-answering question-sentence entity linking method, device, equipment and medium | |
CN112100332A (en) | Word embedding expression learning method and device and text recall method and device | |
WO2023137918A1 (en) | Text data analysis method and apparatus, model training method, and computer device | |
CN112528654A (en) | Natural language processing method and device and electronic equipment | |
CN115408525A (en) | Petition text classification method, device, equipment and medium based on multi-level label | |
CN116467417A (en) | Method, device, equipment and storage medium for generating answers to questions | |
CN110852066A (en) | Multi-language entity relation extraction method and system based on confrontation training mechanism | |
CN116975267A (en) | Information processing method and device, computer equipment, medium and product | |
CN114925681B (en) | Knowledge graph question-answering question-sentence entity linking method, device, equipment and medium | |
CN117272937B (en) | Text coding model training method, device, equipment and storage medium | |
CN117828024A (en) | Plug-in retrieval method, device, storage medium and equipment | |
CN117971420A (en) | Task processing, traffic task processing and task processing model training method | |
CN111507108B (en) | Alias generation method and device, electronic equipment and computer readable storage medium | |
CN117708324A (en) | Text topic classification method, device, chip and terminal | |
CN117131273A (en) | Resource searching method, device, computer equipment, medium and product | |
CN117076946A (en) | Short text similarity determination method, device and terminal | |
CN113157892B (en) | User intention processing method, device, computer equipment and storage medium | |
CN114398903B (en) | Intention recognition method, device, electronic equipment and storage medium | |
CN115712732A (en) | Method, system, equipment and medium for constructing knowledge graph ontology of power equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||