CN114003735A - Knowledge graph question and answer oriented entity disambiguation method based on intelligence document - Google Patents


Info

Publication number
CN114003735A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111595751.9A
Other languages
Chinese (zh)
Other versions
CN114003735B (en)
Inventor
刘禹汐
侯立旺
姜青涛
董勤娇
王飞虎
牟善强
段龙海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Daoda Tianji Technology Co ltd
Original Assignee
Beijing Daoda Tianji Technology Co ltd
Application filed by Beijing Daoda Tianji Technology Co ltd
Priority to CN202111595751.9A
Publication of CN114003735A
Application granted
Publication of CN114003735B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention relates to an entity disambiguation method for knowledge graph question answering based on intelligence documents, comprising the following steps: generating a plurality of candidate entities corresponding to an entity through entity linking; using the candidate entities as training data, constructing a neural-network-based ranking model with RankNet, and optimizing the ranking model by gradient descent; and training the constructed ranking model by combining the BP (back-propagation) algorithm with a conjugate gradient algorithm, then using the trained ranking model to select the candidate entity most relevant to the entity, thereby disambiguating it from the other candidate entities. The core innovation of the invention is to convert the entity linking task into an information retrieval problem: candidate entities are identified and generated by a conventional rule-and-dictionary method, and the mention is then linked to the most likely candidate entity by an improved LTR (learning to rank) method, so that the entity disambiguation task becomes an LTR problem in information retrieval and the candidate entities are disambiguated by a ranking model.

Description

Entity disambiguation method for knowledge graph question answering based on intelligence documents
Technical Field
The invention relates to the technical field of document entity linking, and in particular to an entity disambiguation method for knowledge graph question answering based on intelligence documents.
Background
Entity linking is a key technology in KBQA (Knowledge Base Question Answering) systems. It has research significance and practical value in fields such as knowledge organization, information retrieval, and semantic publishing, and can be widely applied to knowledge base expansion, machine translation, and automatic question answering in information systems. Entity linking for knowledge base question answering maps an entity mentioned in a natural language question to the corresponding entity in the knowledge graph, giving the mention a real and definite meaning; the main basis for establishing the link is the degree of match between the textual context and the entity in the specific knowledge base. Entity linking mainly resolves polysemy (one word with several meanings) and synonymy (several words with one meaning) of entities in natural language text, and helps to understand the specific meaning of a natural language question. Entity linking is therefore an important bridge between natural language and the knowledge base, and a necessary condition for understanding natural language questions.
At present, entity linking is divided into two subtasks: candidate entity generation and candidate entity disambiguation. Candidate entity generation is the prerequisite, since the same mention may correspond to several entities in the knowledge base. Candidate entity disambiguation aims to find, in the candidate entity set, the one candidate entity that best fits the context of the sentence as the target entity. The final result can therefore be improved from both sides. The main current method of candidate entity generation is to construct a dictionary that maps mentions to candidate entities, that is, an alias dictionary covering acronyms, fuzzy matches, nicknames, misspellings, and so on. The main methods of candidate entity disambiguation include directly computing the similarity between the mention context and the description text of each candidate entity, graph-based methods, probabilistic-model-based methods, and topic-model-based methods.
The main difficulty in entity linking research is entity ambiguity. Applying existing entity linking technology to knowledge graph question answering has the following main shortcomings:
1) mentions lack context: a question posed to a knowledge base contains only a few words and cannot provide enough context to assist entity disambiguation;
2) most short texts contain only one mention, so joint (collective) entity disambiguation cannot be used;
3) a structured knowledge base lacks textual descriptions of its entities, so the entities in the knowledge base are difficult to represent.
Disclosure of Invention
The invention aims to overcome the above shortcomings of the prior art by providing an entity linking method for knowledge graph question answering based on intelligence documents.
To achieve this aim, embodiments of the invention provide the following technical solution:
An entity disambiguation method for knowledge graph question answering based on intelligence documents, comprising the following steps:
Step S1: generating a plurality of candidate entities corresponding to an entity through entity linking;
Step S2: using the candidate entities as training data, constructing a neural-network-based ranking model with RankNet, and optimizing the ranking model by gradient descent;
Step S3: training the constructed ranking model by combining the BP (back-propagation) algorithm with a conjugate gradient algorithm, and using the trained ranking model to select the candidate entity most relevant to the entity, thereby disambiguating it from the other candidate entities.
The step of generating a plurality of candidate entities corresponding to the entity through entity linking comprises:
identifying entity name boundaries with templates defined from preset spelling rules, word-formation rules, indicator words, and prefix and suffix strings; and recognizing entity names using the entries of an existing dictionary, thereby generating the plurality of candidate entities corresponding to the entity.
The step of using the candidate entities as training data and constructing a neural-network-based ranking model with RankNet comprises:
manually labeling the candidate entities to obtain an idealized scoring function g;
using a neural network to compute the model probability $P_{u,v}$ for every two candidate entities $x_u$ and $x_v$, and constructing the corresponding target probability $\bar{P}_{u,v}$; and computing the cross entropy between the target probability $\bar{P}_{u,v}$ and the model probability $P_{u,v}$, which is defined as the loss function $C$.
The step of manually labeling the candidate entities to obtain an idealized scoring function g comprises: manually labeling each candidate entity with a score and comparing the scores of every two candidate entities; in each comparison, the candidate entity with the higher score is assigned +1 and the candidate entity with the lower score is assigned -1, which yields a ranking order over the candidate entities; and obtaining the idealized scoring function g from this ranking order.
The step of using a neural network to compute the model probability $P_{u,v}$ for every two candidate entities $x_u$ and $x_v$, constructing the corresponding target probability $\bar{P}_{u,v}$, and defining the cross entropy between the target probability $\bar{P}_{u,v}$ and the model probability $P_{u,v}$ as the loss function $C$ comprises the following steps:
given two candidate entities $x_u$ and $x_v$ in the training data, computing their scores with a neural network, where s denotes the prediction of the scoring function f:
$$s_u = f(x_u)$$
$$s_v = f(x_v)$$
computing the scores of the candidate entities $x_u$ and $x_v$ with the scoring function f and defining the model probability $P_{u,v}$, a sigmoid function that gives the probability that candidate entity $x_u$ ranks ahead of candidate entity $x_v$, namely:
$$P_{u,v} = \frac{1}{1 + e^{-(s_u - s_v)}}$$
constructing the target probability $\bar{P}_{u,v}$ from the true annotations:
$$\bar{P}_{u,v} = \frac{1}{2}\left(1 + S_{uv}\right)$$
where $S_{uv}$ takes a value in {1, -1}: the value is 1 if candidate entity u is more relevant to the searched entity than candidate entity v, and -1 otherwise;
defining the loss function as the cross entropy between the target probability $\bar{P}_{u,v}$ and the model probability $P_{u,v}$ on the candidate entities, namely:
$$C = -\bar{P}_{u,v}\log P_{u,v} - \left(1 - \bar{P}_{u,v}\right)\log\left(1 - P_{u,v}\right)$$
which yields:
$$C = \frac{1}{2}\left(1 - S_{uv}\right)\left(s_u - s_v\right) + \log\left(1 + e^{-(s_u - s_v)}\right)$$
where $C$ is the loss function.
The step of constructing the neural-network-based ranking model with RankNet and optimizing the ranking model by gradient descent comprises:
learning the scoring function f with gradient descent as the optimization algorithm: the loss is computed by the loss function $C$, and the weights w of the neural network are updated by gradient descent;
differentiating the loss function $C$ with respect to a weight w gives:
$$\frac{\partial C}{\partial w} = \frac{\partial C}{\partial s_u}\frac{\partial s_u}{\partial w} + \frac{\partial C}{\partial s_v}\frac{\partial s_v}{\partial w}$$
$$\frac{\partial C}{\partial s_u} = \frac{1}{2}\left(1 - S_{uv}\right) - \frac{1}{1 + e^{s_u - s_v}} = -\frac{\partial C}{\partial s_v}$$
where the partial derivative of the score s with respect to the weight w depends on the specific learning process.
The step of training the constructed ranking model by combining the BP (back-propagation) algorithm with the conjugate gradient algorithm comprises:
obtaining the output error E of the ranking model with the BP network:
$$E = \frac{1}{2}\sum_{k=1}^{l}\left(d_k - o_k\right)^2$$
where
$$y_j = \varphi\left(net_j\right), \qquad net_j = \sum_{i=0}^{n} v_{ij}\,x_i$$
is the function of the hidden layer of the BP network,
$$o_k = \varphi\left(net_k\right), \qquad net_k = \sum_{j=0}^{m} w_{jk}\,y_j$$
is the function of the output layer of the BP network, $\varphi$ is the activation function, $w_{jk}$ and $v_{ij}$ are the weights of the output layer and the hidden layer respectively, and $d$ is the expected output vector; k denotes the k-th output neuron, j the j-th hidden-layer neuron, and i the i-th input;
assuming that the initial weights of the output layer are w(0) and the initial weights of the hidden layer are v(0), the initial search direction $d^{(0)}$ is the negative gradient direction $h^{(0)}$, namely:
$$d^{(0)} = h^{(0)} = -\nabla E\big(w(0), v(0)\big)$$
the adjustment of each weight is proportional to the negative gradient of the error, namely:
$$\Delta w_{jk} = -\eta\,\frac{\partial E}{\partial w_{jk}}$$
$$\Delta v_{ij} = -\eta\,\frac{\partial E}{\partial v_{ij}}$$
where j = 0, 1, 2, ..., m; k = 1, 2, ..., l; i = 0, 1, 2, ..., n; the negative sign denotes gradient descent, and the constant $\eta$ is the proportionality factor, which acts as the learning rate during training;
the iterative step size of the conjugate gradient algorithm is:
$$\alpha^{(t)} = \frac{h^{(t)\top} d^{(t)}}{d^{(t)\top} Q\, d^{(t)}}$$
$$w^{(t+1)} = w^{(t)} + \alpha^{(t)} d^{(t)}$$
where $h^{(t)}$ is the negative gradient at iteration t, Q is an n × n real symmetric matrix, and the directions d are conjugate with respect to Q:
$$d^{(i)\top} Q\, d^{(j)} = 0, \quad i \neq j$$
Compared with the prior art, the invention has the following beneficial effects:
The core innovation of the invention is to convert the entity linking task into an information retrieval problem: candidate entities are identified and generated by a conventional rule-and-dictionary method, and the mention is then linked to the most likely candidate entity by an improved LTR (learning to rank) method, so that the entity disambiguation task becomes an LTR problem in information retrieval and the candidate entities are disambiguated by a ranking model.
Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the invention and should therefore not be regarded as limiting its scope; a person skilled in the art can derive other related drawings from them without inventive effort.
FIG. 1 is a flow chart of the entity linking method of the invention;
FIG. 2 is a schematic diagram of the pairwise ranking method according to an embodiment of the invention;
FIG. 3 is a schematic diagram of conjugate direction vectors according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Also, in the description of the present invention, the terms "first", "second", and the like are used for distinguishing between descriptions and not necessarily for describing a relative importance or implying any actual relationship or order between such entities or operations.
Embodiment:
The invention is realized by the following technical solution. As shown in FIG. 1, the entity disambiguation method for knowledge graph question answering based on intelligence documents is an entity linking technique for knowledge base question answering whose key part is entity disambiguation. The innovation of the scheme is to convert the entity linking task into an information retrieval problem: candidate entities are identified and generated by a conventional rule-and-dictionary method, and the mention is then linked to the most likely candidate entity by an improved LTR (learning to rank) method, so that the entity disambiguation task becomes an LTR problem in information retrieval and the candidate entities are disambiguated by a ranking model.
The method comprises the following steps:
Step S1: generate a plurality of candidate entities corresponding to the entity through entity linking.
Candidate entities are generated by a conventional method, mainly rules and dictionaries. After word segmentation and part-of-speech tagging of the text, entity name boundaries are identified by a rule-based method, using templates defined from preset spelling rules, word-formation rules, indicator words, and prefix and suffix strings; entity names are then recognized using the entries of an existing dictionary, and the candidate entities are generated.
Target entities take diverse surface forms in open-source intelligence, including aliases, abbreviations, and nicknames; according to statistics, each named entity has 3.3 different surface forms on average. To handle this diversity, the scheme obtains all candidate entities corresponding to an entity by linking it against sources such as Wikipedia (Chinese edition), Interactive Encyclopedia, and Baidu Encyclopedia, and then generalizes and summarizes the different surface forms of the entity to build a synonym table. For example, when the entity name "Zhang San" is linked, the corresponding entries include a male swimmer, a university professor, and an enterprise department manager; these are the candidate entities corresponding to the name "Zhang San", and together they form the candidate entity set.
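The dictionary lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the table contents, the names `ALIAS_TABLE`, `normalize`, and `generate_candidates`, and the whitespace and case normalization are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical alias (synonym) table mapping a normalized surface form to
# candidate entities. Every entry below is an illustrative placeholder.
ALIAS_TABLE: Dict[str, List[str]] = {
    "zhang san": [
        "Zhang San (male swimmer)",
        "Zhang San (university professor)",
        "Zhang San (enterprise department manager)",
    ],
}


def normalize(mention: str) -> str:
    """Lower-case the mention and collapse runs of whitespace."""
    return " ".join(mention.lower().split())


def generate_candidates(mention: str,
                        table: Dict[str, List[str]] = ALIAS_TABLE) -> List[str]:
    """Return a copy of the candidate entity set for a mention, or [] if unknown."""
    return list(table.get(normalize(mention), []))


candidates = generate_candidates("  Zhang   San ")
print(candidates)
```

A real system would populate the table from the encyclopedia sources and add the rule templates for boundary detection; the sketch only shows the mention-to-candidate-set mapping.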
Step S2: use the candidate entities as training data, construct a neural-network-based ranking model with RankNet, and optimize the ranking model by gradient descent.
In this step, a selection is predicted from the feature text constructed from the entity mention and the candidate entity set: for example, the step selects which candidate entity in the candidate entity set is most relevant to the searched "Zhang San", and thereby disambiguates it from the other candidate entities.
The scheme converts the entity disambiguation task into an LTR problem in information retrieval and adopts an improved pairwise method. The pairwise method focuses on the order relation between candidate entities: its input is pairs of candidate entities and its output is the local preference relation within each pair. The ranking problem is thereby reduced to a binary classification problem, which can be handled by methods such as boosting, SVMs, and neural networks.
In detail, referring to FIG. 2, assume that an entity corresponds to three candidate entities. After the three candidate entities are manually labeled, $d_1 = 5$ has the highest score, followed by $d_2 = 3$, and $d_3 = 2$ is the lowest. This is converted into the relative relations:
$d_1 > d_2$, $d_1 > d_3$, $d_2 > d_3$
For any two candidate entities with different labels, a training instance $(d_i, d_j)$ is obtained: if $d_i > d_j$, then $d_i$ is assigned +1 and $d_j$ is assigned -1. This yields the training samples needed to train the binary classifier, and the relevance ranking order can be recovered from the pairwise relations. Because the scores are manually labeled, they can be regarded as the standard answer, which amounts to assuming an optimal scoring function g; the next task is to construct another scoring function f whose results agree with g as closely as possible. At test time, the scoring function f alone is applied to all candidate entities, which yields the ranking relation over all candidate entities and thus the ranking.
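The conversion from manual scores to pairwise training instances can be sketched as below. The function name `make_pairs` and the single signed label per pair (+1 when the first candidate outranks the second, -1 otherwise) are assumptions for illustration; the label plays the same role as assigning +1 to the higher-scored candidate and -1 to the lower-scored one. Skipping ties is also an assumption, since the text only discusses differently labeled candidates.

```python
from itertools import combinations


def make_pairs(scores):
    """Turn {candidate: manual score} into (a, b, label) training instances.

    label is +1 if a is scored higher than b and -1 if lower.
    Pairs with equal scores carry no preference and are skipped.
    """
    pairs = []
    for a, b in combinations(scores, 2):
        if scores[a] == scores[b]:
            continue
        pairs.append((a, b, 1 if scores[a] > scores[b] else -1))
    return pairs


pairs = make_pairs({"d1": 5, "d2": 3, "d3": 2})
print(pairs)  # [('d1', 'd2', 1), ('d1', 'd3', 1), ('d2', 'd3', 1)]
```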
The pairwise method turns the ranking of candidate entities into order judgments on pairs. It has several implementations, such as Ranking SVM, RankNet, FRank, and RankBoost; this scheme improves the neural-network ranking learning method RankNet. RankNet is a representative neural-network-based learning-to-rank method. It operates on pairs of candidate entities and defines a probability-based loss function; the scheme uses a neural network and gradient descent to minimize the cross-entropy loss function.
Given two associated candidate entities $x_u$ and $x_v$ in the training data, a neural network computes their scores, where s denotes the prediction of the scoring function f:
$$s_u = f(x_u)$$
$$s_v = f(x_v)$$
The scores of the candidate entities $x_u$ and $x_v$ are computed with the scoring function f, and the model probability $P_{u,v}$ is defined. It is a sigmoid function that gives the probability that candidate entity $x_u$ ranks ahead of candidate entity $x_v$, namely:
$$P_{u,v} = \frac{1}{1 + e^{-(s_u - s_v)}}$$
The target probability $\bar{P}_{u,v}$ is constructed from the true annotations:
$$\bar{P}_{u,v} = \frac{1}{2}\left(1 + S_{uv}\right)$$
where $S_{uv}$ takes a value in {1, -1}: the value is 1 if candidate entity u is more relevant to the searched entity than candidate entity v, and -1 otherwise.
The loss function is defined as the cross entropy between the target probability $\bar{P}_{u,v}$ and the model probability $P_{u,v}$ on the candidate entities, namely:
$$C = -\bar{P}_{u,v}\log P_{u,v} - \left(1 - \bar{P}_{u,v}\right)\log\left(1 - P_{u,v}\right)$$
which yields:
$$C = \frac{1}{2}\left(1 - S_{uv}\right)\left(s_u - s_v\right) + \log\left(1 + e^{-(s_u - s_v)}\right)$$
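As a numerical check of the pairwise loss, the sketch below evaluates the cross entropy between the target and model probabilities and compares it with the closed form obtained by substituting the sigmoid. It assumes the standard RankNet definitions with labels in {1, -1}; the function names are illustrative only.

```python
import math


def model_probability(s_u, s_v):
    """Sigmoid of the score difference: probability that x_u ranks ahead of x_v."""
    return 1.0 / (1.0 + math.exp(-(s_u - s_v)))


def target_probability(S_uv):
    """Map a label S_uv in {1, -1} to a target probability in {1, 0}."""
    return 0.5 * (1 + S_uv)


def cross_entropy_loss(s_u, s_v, S_uv):
    """Cross entropy between the target and model probabilities."""
    p = model_probability(s_u, s_v)
    p_bar = target_probability(S_uv)
    return -p_bar * math.log(p) - (1.0 - p_bar) * math.log(1.0 - p)


def simplified_loss(s_u, s_v, S_uv):
    """Closed form after substituting the sigmoid into the cross entropy."""
    diff = s_u - s_v
    return 0.5 * (1 - S_uv) * diff + math.log(1.0 + math.exp(-diff))


for s_u, s_v, S_uv in [(2.0, 0.5, 1), (2.0, 0.5, -1), (-1.0, 3.0, 1)]:
    print(cross_entropy_loss(s_u, s_v, S_uv), simplified_loss(s_u, s_v, S_uv))
```

The two forms agree, and the loss is small when the score order matches the label and large when it contradicts it.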
The ultimate goal is to minimize the sum of the loss $C$ over all training pairs. A neural network is used as the model, and gradient descent is used as the optimization algorithm to learn the scoring function f. The loss is computed by the loss function $C$, and the weights w of the neural network are updated by gradient descent. Differentiating the loss function C with respect to a weight w gives:
$$\frac{\partial C}{\partial w} = \frac{\partial C}{\partial s_u}\frac{\partial s_u}{\partial w} + \frac{\partial C}{\partial s_v}\frac{\partial s_v}{\partial w}$$
$$\frac{\partial C}{\partial s_u} = \frac{1}{2}\left(1 - S_{uv}\right) - \frac{1}{1 + e^{s_u - s_v}} = -\frac{\partial C}{\partial s_v}$$
In the above formulas, the partial derivative of the score s with respect to the weight w depends on the specific learning process. The original RankNet method uses a neural network model, and the ranking model in RankNet is found by gradient descent. Using cross entropy as the loss function makes differentiation convenient and fits the gradient descent framework.
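The gradient descent update can be illustrated with the simplest possible scorer. The sketch below assumes a linear scoring function s = w · x in place of the neural network, so that the partial derivative of s with respect to w is just x; the feature vectors, learning rate, and epoch count are made-up values for the example.

```python
import math


def score(w, x):
    """Linear scoring function s = w . x (a stand-in for the neural network)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pair_loss(w, x_u, x_v, S_uv):
    """Pairwise cross-entropy loss in its closed form."""
    diff = score(w, x_u) - score(w, x_v)
    return 0.5 * (1 - S_uv) * diff + math.log(1.0 + math.exp(-diff))


def pair_gradient(w, x_u, x_v, S_uv):
    """dC/dw = dC/ds_u * (x_u - x_v), because dC/ds_v = -dC/ds_u and ds/dw = x."""
    diff = score(w, x_u) - score(w, x_v)
    dC_ds_u = 0.5 * (1 - S_uv) - 1.0 / (1.0 + math.exp(diff))
    return [dC_ds_u * (a - b) for a, b in zip(x_u, x_v)]


def train(pairs, dim, lr=0.1, epochs=200):
    """Plain stochastic gradient descent over the labeled pairs."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x_u, x_v, S_uv in pairs:
            g = pair_gradient(w, x_u, x_v, S_uv)
            w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


# Hypothetical feature vectors for three candidates, best to worst.
best, middle, worst = [1.0, 0.2], [0.3, 0.1], [0.1, 0.9]
pairs = [(best, middle, 1), (middle, worst, 1)]
w = train(pairs, dim=2)
print(score(w, best), score(w, middle), score(w, worst))
```

After training, sorting candidates by `score` reproduces the labeled order; with a neural network the only change is how the derivative of s with respect to w is computed.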
Step S3: train the constructed ranking model by combining the BP (back-propagation) algorithm with a conjugate gradient algorithm, and use the trained ranking model to select the candidate entity most relevant to the entity, thereby disambiguating it from the other candidate entities.
The scheme improves and optimizes the way the ranking model is trained. The BP network back-propagation algorithm (hereinafter, the BP network) adjusts the weights and thresholds along the negative gradient direction of the network performance function. In this method the search directions of adjacent iterations are orthogonal, so oscillation tends to occur near an extremum; and although the BP network reduces the value of the network performance function at the fastest rate, it easily falls into a local minimum. This is a shortcoming of the BP learning algorithm.
The output error E of the ranking model is obtained with the BP network:
$$E = \frac{1}{2}\sum_{k=1}^{l}\left(d_k - o_k\right)^2$$
where
$$y_j = \varphi\left(net_j\right), \qquad net_j = \sum_{i=0}^{n} v_{ij}\,x_i$$
is the function of the hidden layer of the BP network,
$$o_k = \varphi\left(net_k\right), \qquad net_k = \sum_{j=0}^{m} w_{jk}\,y_j$$
is the function of the output layer of the BP network, $\varphi$ is the activation function, $w_{jk}$ and $v_{ij}$ are the weights of the output layer and the hidden layer respectively, and $d$ is the expected output vector; k denotes the k-th output neuron, j the j-th hidden-layer neuron, and i the i-th input. It can be seen that adjusting the weights changes the output error E.
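To make the dependence of E on the weights concrete, the sketch below runs a forward pass through a network with one hidden layer and applies one negative-gradient update to the output-layer weights. The sigmoid activation, the omission of bias terms, the array layout (`V[j][i]`, `W[k][j]`), and all numeric values are assumptions for illustration, not details taken from the patent.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, V, W):
    """V[j][i]: weight from input i to hidden unit j.
    W[k][j]: weight from hidden unit j to output unit k."""
    y = [sigmoid(sum(v * xi for v, xi in zip(row, x))) for row in V]
    o = [sigmoid(sum(w * yj for w, yj in zip(row, y))) for row in W]
    return y, o


def output_error(d, o):
    """E = 1/2 * sum_k (d_k - o_k)^2."""
    return 0.5 * sum((dk - ok) ** 2 for dk, ok in zip(d, o))


def output_layer_step(x, d, V, W, eta=0.5):
    """One update W <- W - eta * dE/dW for the output layer only.

    For a sigmoid output, dE/dW[k][j] = -(d_k - o_k) * o_k * (1 - o_k) * y_j.
    """
    y, o = forward(x, V, W)
    return [
        [w + eta * (dk - ok) * ok * (1.0 - ok) * yj for w, yj in zip(row, y)]
        for row, dk, ok in zip(W, d, o)
    ]


x, d = [1.0, 0.0], [1.0]
V = [[0.0, 0.0], [0.0, 0.0]]
W = [[0.0, 0.0]]
error_before = output_error(d, forward(x, V, W)[1])
W = output_layer_step(x, d, V, W)
error_after = output_error(d, forward(x, V, W)[1])
print(error_before, error_after)
```

With all weights at zero the output is 0.5, so the error starts at 0.125, and a single step along the negative gradient lowers it.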
The principle of weight adjustment is to reduce the error continually. To train the BP network better, a conjugate gradient algorithm is combined with the training of the BP network; referring to FIG. 3, the main algorithm is described as follows:
Assuming that the initial weights of the output layer are w(0) and the initial weights of the hidden layer are v(0), the initial search direction $d^{(0)}$ is the negative gradient direction $h^{(0)}$, namely:
$$d^{(0)} = h^{(0)} = -\nabla E\big(w(0), v(0)\big)$$
The adjustment of each weight is proportional to the negative gradient of the error, namely:
$$\Delta w_{jk} = -\eta\,\frac{\partial E}{\partial w_{jk}}$$
$$\Delta v_{ij} = -\eta\,\frac{\partial E}{\partial v_{ij}}$$
where j = 0, 1, 2, ..., m; k = 1, 2, ..., l; i = 0, 1, 2, ..., n. The negative sign denotes gradient descent, and the constant $\eta$ is the proportionality factor, which acts as the learning rate during training.
The iterative step size of the conjugate gradient algorithm is:
$$\alpha^{(t)} = \frac{h^{(t)\top} d^{(t)}}{d^{(t)\top} Q\, d^{(t)}}$$
$$w^{(t+1)} = w^{(t)} + \alpha^{(t)} d^{(t)}$$
where $h^{(t)}$ is the negative gradient at iteration t, Q is an n × n real symmetric matrix, and the directions d are conjugate with respect to Q:
$$d^{(i)\top} Q\, d^{(j)} = 0, \quad i \neq j$$
The first search direction is the negative gradient direction; the search then proceeds along conjugate directions to reach the minimum. The conjugate directions are generated as the iteration proceeds: in each iteration, a new direction is constructed as a linear combination of the gradient vector at the current iterate and the previous search direction, such that the new direction is conjugate to the previously generated search directions.
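For the quadratic case, the procedure can be sketched on the objective ½ xᵀQx − bᵀx with a symmetric positive definite Q. The example matrix, the vector b, and the helper names are assumptions for illustration; the step size is the exact minimizer along the current direction, written here with the gradient g (the negative of the negative-gradient direction).

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(Q, x):
    return [dot(row, x) for row in Q]


def conjugate_gradient(Q, b, x0, tol=1e-12):
    """Minimize 0.5 * x^T Q x - b^T x for symmetric positive definite Q."""
    x = list(x0)
    g = [qi - bi for qi, bi in zip(matvec(Q, x), b)]   # gradient Qx - b
    d = [-gi for gi in g]                              # first direction: negative gradient
    directions = []
    for _ in range(len(b)):                            # at most n steps in exact arithmetic
        if dot(g, g) < tol:
            break
        Qd = matvec(Q, d)
        alpha = -dot(g, d) / dot(d, Qd)                # exact step along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        directions.append(d)
        g_new = [qi - bi for qi, bi in zip(matvec(Q, x), b)]
        beta = dot(g_new, g_new) / dot(g, g)           # weight on the previous direction
        d = [-gi + beta * di for gi, di in zip(g_new, d)]
        g = g_new
    return x, directions


Q = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_star, directions = conjugate_gradient(Q, b, [0.0, 0.0])
print(x_star)                                          # approximately [1/11, 7/11]
print(dot(directions[0], matvec(Q, directions[1])))    # approximately 0: Q-conjugate
```

In two dimensions the minimum is reached after two steps, and the two search directions satisfy the conjugacy condition with respect to Q.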
The above applies to a quadratic objective. For a non-quadratic objective, the objective function is approximated to second order by a Taylor expansion, and the combination coefficient is computed with the Fletcher-Reeves formula. In the implementation, a pure steepest-descent step is taken every n steps as a spacer step; because the other steps do not increase the objective function, global convergence of the algorithm is ensured. A spacer step means that one step of another algorithm is inserted into the main algorithm; the only requirement is that the steps do not increase the value of the descent function, which ensures convergence of the composite process, namely:
$$E\big(w^{(t+1)}\big) \le E\big(w^{(t)}\big)$$
With this improved conjugate gradient method, global convergence is ensured while the convergence speed of the algorithm is retained, and the defect that BP network learning easily falls into a local minimum is overcome.
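A minimal sketch of the Fletcher-Reeves iteration with a steepest-descent spacer step every n iterations is given below. Several parts are assumptions not stated in the text: the Armijo backtracking line search (chosen because it never increases the objective), the fallback to the negative gradient when a direction is not a descent direction, and the toy non-quadratic objective `f`, which stands in for the network error E.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def backtracking(f, x, d, g, alpha=1.0, shrink=0.5, c=0.3):
    """Armijo backtracking line search along a descent direction d."""
    fx, slope = f(x), dot(g, d)
    while alpha > 1e-16:
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        if f(trial) <= fx + c * alpha * slope:
            break
        alpha *= shrink
    return alpha


def fletcher_reeves(f, grad, x0, tol=1e-6, max_iter=5000):
    x, n = list(x0), len(x0)
    g = grad(x)
    d = [-gi for gi in g]
    history = [f(x)]
    for t in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        if dot(g, d) >= 0.0:           # safeguard: fall back to steepest descent
            d = [-gi for gi in g]
        alpha = backtracking(f, x, d, g)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x)
        if (t + 1) % n == 0:           # spacer step: pure steepest descent every n steps
            d = [-gi for gi in g_new]
        else:                          # Fletcher-Reeves coefficient
            beta = dot(g_new, g_new) / dot(g, g)
            d = [-gi + beta * di for gi, di in zip(g_new, d)]
        g = g_new
        history.append(f(x))
    return x, history


def f(p):
    """Toy non-quadratic objective with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + 2.0 * (p[1] + 2.0) ** 2 + 0.25 * (p[0] - 1.0) ** 4


def grad(p):
    return [2.0 * (p[0] - 1.0) + (p[0] - 1.0) ** 3, 4.0 * (p[1] + 2.0)]


x_min, history = fletcher_reeves(f, grad, [3.0, 1.0])
print(x_min, len(history))
```

The recorded objective values never increase from one iteration to the next, which is the property the spacer-step argument relies on.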
The trained ranking model can thus rank the generated candidate entities: the entity disambiguation task is converted into an LTR problem in information retrieval, and the candidate entities are disambiguated by the ranking model.
The above are only specific embodiments of the invention, and the scope of protection of the invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be subject to the scope of protection of the claims.

Claims (7)

1. An intelligence document-based knowledge-graph question-answer oriented entity disambiguation method, characterized in that the method comprises the following steps:
step S1: generating a plurality of candidate entities corresponding to an entity through entity linking;
step S2: using the plurality of candidate entities as training data, constructing a neural network-based ranking model through RankNet, and optimizing the ranking model using a gradient descent method;
step S3: training the constructed ranking model by combining the BP network back propagation algorithm with a conjugate gradient algorithm, and using the trained ranking model to select the candidate entity most relevant to the entity, thereby disambiguating the candidate entities.
2. The intelligence document-based knowledge-graph question-answer oriented entity disambiguation method of claim 1, characterized in that: the step of generating a plurality of candidate entities corresponding to the entity through entity linking comprises:
identifying entity name boundaries based on set spelling rules, word-formation rules, indicator words, and templates defined by prefix and suffix strings; and recognizing entity names using the entries of an existing dictionary, so as to generate a plurality of candidate entities corresponding to the entity.
3. The intelligence document-based knowledge-graph question-answer oriented entity disambiguation method of claim 1, characterized in that: the step of using the plurality of candidate entities as training data and constructing a neural network-based ranking model through RankNet comprises:
after the candidate entities are each manually labeled, obtaining an idealized scoring function g;
computing, using a neural network, the model probability P_{u,v} of every two candidate entities x_u, x_v, and constructing, according to the model probability P_{u,v}, the target probability
Figure 792382DEST_PATH_IMAGE001
and, according to the target probability
Figure 902421DEST_PATH_IMAGE001
and the model probability P_{u,v}, calculating the cross entropy, the calculated cross entropy being defined as the loss function
Figure 759256DEST_PATH_IMAGE002
thereby obtaining the loss function
Figure 169509DEST_PATH_IMAGE002
4. The intelligence document-based knowledge-graph question-answer oriented entity disambiguation method of claim 3, characterized in that: the step of obtaining an idealized scoring function g after the candidate entities are each manually labeled comprises: manually labeling each of the candidate entities with a score and comparing the scores of every two candidate entities; in each pairwise comparison, assigning +1 to the candidate entity with the higher score and -1 to the candidate entity with the lower score, thereby forming a ranking order of the candidate entities; and obtaining the idealized scoring function g according to the ranking order of the candidate entities.
5. The intelligence document-based knowledge-graph question-answer oriented entity disambiguation method of claim 3, characterized in that: the step of computing, using a neural network, the model probability P_{u,v} of every two candidate entities x_u, x_v, and constructing, according to the model probability P_{u,v}, the target probability
Figure 220641DEST_PATH_IMAGE003
and, according to the target probability
Figure 083555DEST_PATH_IMAGE003
and the model probability P_{u,v}, calculating the cross entropy, the calculated cross entropy being defined as the loss function
Figure 738222DEST_PATH_IMAGE004
thereby obtaining the loss function
Figure 737402DEST_PATH_IMAGE004
comprises:
given two candidate entities x_u, x_v, calculating scores on the training data
Figure 959436DEST_PATH_IMAGE005
using a neural network, where s denotes the prediction of the scoring function f:
Figure 309646DEST_PATH_IMAGE006
Figure 773863DEST_PATH_IMAGE007
based on the scores computed by the scoring function f for candidate entities x_u and x_v, defining the model probability P_{u,v}, a sigmoid function representing the probability that candidate entity x_u ranks ahead of candidate entity x_v, namely:
Figure 893129DEST_PATH_IMAGE008
constructing, based on the true labels, the target probability
Figure 020485DEST_PATH_IMAGE009
Figure 857991DEST_PATH_IMAGE010
where
Figure 860320DEST_PATH_IMAGE011
takes a value in {1, -1}: the value is 1 if candidate entity u is more relevant to the searched entity than candidate entity v, and -1 otherwise;
defining the loss function as the cross entropy, on a pair of candidate entities, between the target probability
Figure 365250DEST_PATH_IMAGE009
and the model probability P_{u,v}, i.e.:
Figure 397928DEST_PATH_IMAGE012
from which it follows that:
Figure 722730DEST_PATH_IMAGE013
where
Figure 528750DEST_PATH_IMAGE014
is the loss function.
6. The intelligence document-based knowledge-graph question-answer oriented entity disambiguation method of claim 5, characterized in that: the step of constructing a neural network-based ranking model through RankNet and optimizing the ranking model using a gradient descent method comprises:
using the gradient descent method as the optimization algorithm to learn the scoring function f, calculating the loss through the loss function
Figure 357029DEST_PATH_IMAGE014
and updating the weight w of the neural network based on the gradient descent method;
differentiating the loss function
Figure 091767DEST_PATH_IMAGE014
with respect to the weight w gives:
Figure 638286DEST_PATH_IMAGE015
Figure 253856DEST_PATH_IMAGE016
where the partial derivative of the score s with respect to the weight w depends on the specific learning process.
7. The intelligence document-based knowledge-graph question-answer oriented entity disambiguation method of claim 6, characterized in that: the step of training the constructed ranking model by combining the BP network back propagation algorithm with the conjugate gradient algorithm comprises:
obtaining the output error E of the ranking model using the BP network:
Figure 998958DEST_PATH_IMAGE017
where
Figure 904597DEST_PATH_IMAGE018
Figure 938412DEST_PATH_IMAGE019
is the function of the hidden layer of the BP network,
Figure 086234DEST_PATH_IMAGE020
Figure 889105DEST_PATH_IMAGE021
is the function of the output layer of the BP network,
Figure 965646DEST_PATH_IMAGE022
Figure 486757DEST_PATH_IMAGE023
are the weights of the output layer and the hidden layer, respectively,
Figure 172691DEST_PATH_IMAGE024
is the expected output vector, k denotes the k-th neuron, j denotes the j-th hidden layer, and i denotes the i-th output layer;
assuming that the initial weight of the output layer is w(0) and the initial weight of the hidden layer is v(0), the initial search direction d^(0) is the negative gradient direction h^(0), namely:
Figure 095648DEST_PATH_IMAGE025
the adjustment of the weights is proportional to the negative gradient direction of the error, namely:
Figure 343089DEST_PATH_IMAGE026
Figure 085917DEST_PATH_IMAGE027
where j = 0, 1, 2, ..., m; k = 1, 2, ..., l; i = 0, 1, 2, ..., n; the negative sign of d indicates gradient descent, and the constant
Figure DEST_PATH_IMAGE028
is a proportionality coefficient that reflects the learning rate in training;
the iterative learning-rate step size of the conjugate gradient algorithm is:
Figure 513225DEST_PATH_IMAGE029
Figure DEST_PATH_IMAGE030
where Q is an n × n real symmetric matrix,
Figure 228372DEST_PATH_IMAGE031
the directions d are conjugate with respect to Q.
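The pairwise quantities named in claims 4 to 6 can be sketched as follows. Because the claim formulas survive here only as figure placeholders, the sketch assumes the standard RankNet formulation from the cited Burges et al. paper with a unit sigmoid scale; the function names and the tie label 0 are assumptions introduced for illustration.

```python
import math

def pair_label(manual_u, manual_v):
    """+1 if candidate u has the higher manual score, -1 if it has the lower one."""
    if manual_u > manual_v:
        return 1
    if manual_u < manual_v:
        return -1
    return 0  # assumption: ties are not covered by the claims

def model_probability(s_u, s_v):
    """Sigmoid of the score difference: probability that u ranks ahead of v."""
    return 1.0 / (1.0 + math.exp(-(s_u - s_v)))

def target_probability(label):
    """Target probability built from the pair label (assumed form (1 + S) / 2)."""
    return 0.5 * (1.0 + label)

def cross_entropy_loss(s_u, s_v, label):
    """Cross entropy between target probability and model probability."""
    p = model_probability(s_u, s_v)
    t = target_probability(label)
    return -t * math.log(p) - (1.0 - t) * math.log(1.0 - p)

def simplified_loss(s_u, s_v, label):
    """Closed form obtained by substituting the sigmoid into the cross entropy."""
    diff = s_u - s_v
    return 0.5 * (1.0 - label) * diff + math.log1p(math.exp(-diff))

def loss_gradient_wrt_scores(s_u, s_v, label):
    """Partial derivatives of the loss with respect to s_u and s_v."""
    g_u = 0.5 * (1.0 - label) - 1.0 / (1.0 + math.exp(s_u - s_v))
    return g_u, -g_u

label = pair_label(3, 1)
loss = cross_entropy_loss(1.2, 0.4, label)
grad_u, grad_v = loss_gradient_wrt_scores(1.2, 0.4, label)
```

By the chain rule, the derivative with respect to a network weight w would be `grad_u * ds_u/dw + grad_v * ds_v/dw`, where the score derivatives come from back propagation through whichever network is used.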
Application CN202111595751.9A (priority date 2021-12-24, filing date 2021-12-24): Knowledge graph question and answer oriented entity disambiguation method based on intelligence document. Status: Active; publication CN114003735B (en).

Publications (2)

Publication Number Publication Date
CN114003735A (en) 2022-02-01
CN114003735B (en) 2022-03-18

Family

ID=79931956

Country Status (1)

Country Link
CN (1) CN114003735B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968419A (en) * 2011-08-31 2013-03-13 微软公司 Disambiguation method for interactive Internet entity name
US20160342684A1 (en) * 2010-07-01 2016-11-24 Match.Com, L.L.C. System for determining and optimizing for relevance in match-making systems
US20180246899A1 (en) * 2017-02-28 2018-08-30 Laserlike Inc. Generate an index for enhanced search based on user interests
CN109299221A (en) * 2018-09-04 2019-02-01 广州神马移动信息科技有限公司 Entity extraction and sort method and device
CN109522465A (en) * 2018-10-22 2019-03-26 国家电网公司 The semantic searching method and device of knowledge based map
CN109564636A (en) * 2016-05-31 2019-04-02 微软技术许可有限责任公司 Another neural network is trained using a neural network
CN110941722A (en) * 2019-10-12 2020-03-31 中国人民解放军国防科技大学 Knowledge graph fusion method based on entity alignment
CN111581973A (en) * 2020-04-24 2020-08-25 中国科学院空天信息创新研究院 Entity disambiguation method and system
CN112100356A (en) * 2020-09-17 2020-12-18 武汉纺织大学 Knowledge base question-answer entity linking method and system based on similarity
CN112236766A (en) * 2018-04-20 2021-01-15 脸谱公司 Assisting users with personalized and contextual communication content
CN112651244A (en) * 2020-12-25 2021-04-13 上海交通大学 TopK entity extraction method and system based on paper abstract QA
CN112765983A (en) * 2020-12-14 2021-05-07 四川长虹电器股份有限公司 Entity disambiguation method based on neural network combined with knowledge description
CN113361283A (en) * 2021-06-28 2021-09-07 东南大学 Web table-oriented paired entity joint disambiguation method
CN113392182A (en) * 2021-05-11 2021-09-14 宜通世纪物联网研究院(广州)有限公司 Knowledge matching method, device, equipment and medium fusing context semantic constraints
CN113641707A (en) * 2018-01-25 2021-11-12 北京百度网讯科技有限公司 Knowledge graph disambiguation method, device, equipment and storage medium
CN113807520A (en) * 2021-11-16 2021-12-17 北京道达天际科技有限公司 Knowledge graph alignment model training method based on graph neural network

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
CHRIS BURGES 等: "Learning to rank using gradient descent", 《PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON MACHINE LEARNING》 *
VIKAS C. RAYKAR 等: "A Fast Algorithm for Learning a Ranking Function from Large-Scale Data Sets", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *
YANG SONG 等: "Adapting Deep RankNet for Personalized Search", 《PROCEEDINGS OF THE 7TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING》 *
张平: "基于直接优化信息检索评价方法的排序学习算法研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *
林泽斐 等: "多特征融合的中文命名实体链接方法研究", 《情报学报》 *
毛存礼: "有色金属领域实体检索关键技术研究", 《中国优秀博硕士学位论文全文数据库(博士)信息科技辑》 *
苏高利 等: "论基于MATLAB 语言的BP神经网络的改进算法", 《科技通报》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100085 room 703, 7 / F, block C, 8 malianwa North Road, Haidian District, Beijing

Patentee after: Beijing daoda Tianji Technology Co.,Ltd.

Address before: 100085 room 703, 7 / F, block C, 8 malianwa North Road, Haidian District, Beijing

Patentee before: Beijing daoda Tianji Technology Co.,Ltd.
