CN111858896A - Knowledge base question-answering method based on deep learning - Google Patents

Knowledge base question-answering method based on deep learning Download PDF

Info

Publication number
CN111858896A
CN111858896A (application CN202010751026.5A)
Authority
CN
China
Prior art keywords
granularity
question
word
user
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010751026.5A
Other languages
Chinese (zh)
Other versions
CN111858896B (en)
Inventor
翁兆琦
张琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University filed Critical Shanghai Maritime University
Priority to CN202010751026.5A priority Critical patent/CN111858896B/en
Publication of CN111858896A publication Critical patent/CN111858896A/en
Application granted granted Critical
Publication of CN111858896B publication Critical patent/CN111858896B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a knowledge base question-answering method based on deep learning, which comprises the following steps: performing topic entity identification on a natural language question of a user to identify a plurality of topic entities; assigning weights to the plurality of topic entities to obtain a plurality of central entities with different weights; selecting candidate answer paths according to the plurality of central entities with different weights and calculating a similarity total score; ranking the candidate answer paths according to the similarity total score to obtain a plurality of candidate answer paths with different weights; and performing function matching calculation on the plurality of central entities with different weights and the plurality of candidate answer paths with different weights to obtain a final answer that is fed back to the user. The invention solves the problem that traditional question-answering methods identify wrong topic entities or cannot identify the topic entities at all, reduces the error rate of the topic entity recognition model, improves the accuracy of the attribute relation detection model, and improves the accuracy of question answering over the whole knowledge base.

Description

Knowledge base question-answering method based on deep learning
Technical Field
The invention relates to the technical field of computer question-answering systems, in particular to a knowledge base question-answering method based on deep learning.
Background
In recent years, with continuous innovation in science and technology, artificial intelligence has developed dramatically. Automatic question-answering systems, an important branch of artificial intelligence, have received increasing attention. The exponential growth of online information has become an inevitable feature of the development of the internet, and faced with such a huge increase in the amount of information, quickly extracting the effective information a user needs has become critically important. The advent of search engine technology has largely satisfied people's needs for information acquisition, and search engines have become a convenient way to acquire knowledge and screen information. However, a conventional search engine returns a series of web pages or documents in response to the user's question, rather than directly returning the answer the user needs. Question-answering systems have emerged to address this need: the user poses a question to the system in natural language, and the system quickly returns the most accurate answer. The question-answering system is expected to be the basic modality of the next-generation search engine.
Knowledge base question answering is one type of automatic question answering. A knowledge base, also called a knowledge graph, is a database system for storing large-scale structured knowledge. Generally, the structured knowledge in the knowledge base can be represented in the form of a triple, i.e., (subject, predicate, object), where the subject and the object represent the head entity and the tail entity of the triple respectively, and the predicate represents the relationship between them. A large-scale knowledge base usually contains tens of millions of triples; many triples are connected with each other to form a directed graph in which each entity is a node and each predicate relationship is a directed edge between nodes. The massive knowledge stored in the knowledge base provides abundant factual resources for the question-answering system, and the structured knowledge representation is also very suitable for factual question-answering tasks.
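To make the triple representation concrete, the following is a tiny illustrative Python sketch, not part of the original disclosure, of how (subject, predicate, object) triples form a directed graph with entities as nodes and predicates as labelled edges; the example facts and variable names are invented for illustration.

from collections import defaultdict

triples = [
    ("Shanghai Maritime University", "located_in", "Shanghai"),
    ("Shanghai", "part_of", "China"),
]

# entities become nodes; each predicate becomes a directed, labelled edge head -> tail
graph = defaultdict(list)
for subject, predicate, obj in triples:
    graph[subject].append((predicate, obj))

# graph["Shanghai Maritime University"] == [("located_in", "Shanghai")]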
The research methods of the knowledge base-based question-answering system can be divided into a semantic analysis-based method, an information extraction-based method and a vector modeling-based method.
Semantic-parsing-based methods first convert the natural language question into a logical-form expression, and then convert that expression into a knowledge base query language such as SPARQL to query the corresponding information from the knowledge base and return the answer.
Information-extraction-based methods first extract an entity from the question and query it in the knowledge base to obtain a knowledge base subgraph centered on that entity node, with each node or edge in the subgraph serving as a candidate answer; information is then extracted from the question according to certain rules or templates to obtain a question feature vector, and a classifier fed with this feature vector is built to screen the candidate answers and obtain the final answer.
Vector-modeling-based methods first extract the topic entity from the natural language question and find candidate answers in the knowledge base according to that entity; they then extract a series of features of the question and the candidate answers to obtain distributed representations of both, and rank and select the candidate answers by scoring the vector representations of the question against those of the candidate answers.
The existing knowledge base question-answer method based on deep learning generally divides knowledge base question-answer into three steps, namely topic entity identification, attribute relation detection and answer selection. The task of topic entity recognition is to recognize the topic entities in the natural language questions posed by the user through a topic entity recognition model. The task of attribute relation detection is to search a corresponding head entity from a knowledge base according to the topic entity identified in the previous step, take all paths connected with the head entity as candidate answers of the user question, and then sort all the candidate answers through an attribute relation detection model. The task of answer selection is to return the highest-ranking candidate path returned by the attribute relation detection step to the user as the final answer of the knowledge base question and answer.
An existing topic entity recognition model usually applies a simple neural network trained end to end, and the final output is only the single most salient topic entity in the question, which is then passed to the attribute relation detection model of the next step. For example, for the question "Which publisher published 'Computer Theory and Technology'?", the labeled sequence produced by the topic entity recognition model may contain two candidates, "Computer Theory" and "Computer Theory and Technology", but the model finally returns only the higher-probability entity "Computer Theory" to the attribute relation detection step, even though both entities have corresponding head entities and paths in the knowledge base. Similarly, for the question "Who is the author of 'The Story of the Stone'?", the topic entity recognition model recognizes the entity "The Story of the Stone", yet no head entity with that name exists in the knowledge base. When only the single most salient topic entity, such as "Computer Theory" or "The Story of the Stone" above, is selected as the central entity for the attribute relation detection step, no matter how accurate the attribute relation detection model is, a correct answer cannot be given to the user, and the subsequent work is wasted.
In existing attribute relation detection models, the n paths (comprising entities and relations) connected to the central entity and the user question are encoded into low-dimensional dense vectors in the same semantic space, and the similarity between the vectors is calculated to obtain a ranking of the paths. However, encoding the n paths uniformly ignores the effect of the relations on the individual paths. In addition, the entities and relations on a path are usually short phrases and are not of the same magnitude as a sentence, whereas the user question, as a sentence, contains much more semantic information; simply encoding the user question therefore fails to make full use of the semantic information of the question and the interaction information between the question and the path, which is the key reason for the lower accuracy of attribute relation detection models.
Previous answer selection models usually simply select the top-ranked path from the attribute relation detection step as the result of the knowledge base question answering. However, this ignores the connection between the topic entity recognition step and the attribute relation detection step and treats the two steps as independent, which reduces the accuracy of the knowledge base question answering.
Disclosure of Invention
The invention aims to provide a knowledge base question-answering method based on deep learning. The method aims to solve the problem that the traditional question answering method identifies wrong subject entities or cannot identify the subject entities, reduces the error rate of a subject entity identification model, improves the accuracy rate of an attribute relation detection model, and improves the accuracy rate of question answering of the whole knowledge base.
In order to achieve the aim, the invention provides a knowledge base question-answering method based on deep learning, which comprises the following steps:
step 1: identifying a subject entity for a natural language question of a user based on a subject entity identification SAMM model, and identifying a plurality of subject entities;
step 2: carrying out weight assignment according to a plurality of subject entities to obtain a plurality of central entities with different weights;
step 3: based on an attribute relation detection MGQR model, selecting candidate answer paths according to the plurality of central entities with different weights, and calculating the similarity total score of the natural language question of the user and the candidate answer paths;
step 4: ranking the candidate answer paths according to the similarity total score to obtain a plurality of candidate answer paths with different weights;
step 5: performing function matching calculation on the plurality of central entities with different weights and the plurality of candidate answer paths with different weights based on a linear function to obtain a final answer and feed the final answer back to the user.
Most preferably, the subject entity identification further comprises the steps of:
step 1.1: performing word segmentation and splicing on a natural language question of a user to obtain a spliced user question;
step 1.2: extracting context information of the spliced user question to obtain semantic features containing the context information;
step 1.3: and performing feature recognition according to semantic features containing context information to recognize a plurality of topic entities.
Most preferably, word segmentation concatenation further comprises the steps of:
step 1.1.1: performing word segmentation and updating on a natural language question of a user to obtain a user question;
step 1.1.2: calculating the dependency relationship between any two word vectors in the user question based on a self-attention mechanism to obtain a self-attention interaction matrix of the user question;
step 1.1.3: and splicing the user question with the self-attention interaction matrix to obtain a spliced user question.
Most preferably, the feature identification further comprises the steps of:
step 1.3.1: inputting semantic features containing context information into a conditional random field CRF to obtain a tagged question sequence;
step 1.3.2: and identifying a plurality of topic entities according to the marked question sentence sequence.
Most preferably, the weight assignment further comprises the steps of:
step 2.1: selecting a plurality of topic entities ranked in the front from the plurality of topic entities according to the ranking order of the topic entities;
step 2.2: and respectively endowing different weights to the plurality of topic entities which are ranked at the front, obtaining a plurality of topic entities with different weights, and taking the topic entities as a plurality of central entities with different weights.
Most preferably, the step of calculating the total similarity score further comprises the following steps:
step 3.1: performing granularity vector representation on the characters, the words and the entity relations in the candidate answer path, and respectively obtaining the character granularity vector, the word granularity vector and the entity relation granularity vector of the candidate answer path;
step 3.2: extracting semantic features according to the character granularity vector, the word granularity vector and the entity relation granularity vector to obtain character granularity semantic features, word granularity semantic features and entity relation granularity semantic features;
step 3.3: calculating the similarity according to the character granularity vector, the word granularity vector, the entity relation granularity vector, the character granularity semantic features, the word granularity semantic features and the entity relation granularity semantic features, and calculating the total similarity score.
Most preferably, the particle size vector representation further comprises the steps of:
step 3.1.1: performing character granularity vector representation on the characters in the candidate answer path based on a random initialization mode to obtain the character granularity vector;
step 3.1.2: performing word granularity vector representation on the words in the candidate answer path based on a word2vec pre-trained word vector tool to obtain the word granularity vector;
step 3.1.3: and based on a random initialization mode, carrying out entity relation granularity vector representation on the entity relation in the candidate answer path to obtain an entity relation granularity vector.
Most preferably, the semantic feature extraction further comprises the steps of:
step 3.2.1: respectively carrying out weighted calculation on the character granularity vector, the word granularity vector and the entity relation granularity vector to respectively obtain a character granularity user question under the character granularity candidate path, a word granularity user question under the word granularity candidate path and an entity relation granularity user question under the entity relation granularity candidate path;
step 3.2.2: respectively extracting the character granularity user question, the word granularity user question and the entity relation granularity user question to respectively obtain the character granularity semantic features, the word granularity semantic features and the entity relation granularity semantic features.
Most preferably, the similarity calculation further comprises the steps of:
step 3.3.1: carrying out character granularity similarity calculation on the character granularity vector and the character granularity semantic features to calculate the character granularity similarity score of the natural language question of the user and the candidate answer path under the character granularity candidate path;
step 3.3.2: carrying out word granularity similarity calculation on the word granularity vector and the word granularity semantic features to calculate the word granularity similarity score of the natural language question of the user and the candidate answer path under the word granularity candidate path;
step 3.3.3: carrying out entity relation granularity similarity calculation on the entity relation granularity vector and the entity relation granularity semantic features to calculate the entity relation granularity similarity score of the natural language question of the user and the candidate answer path under the entity relation granularity candidate path;
step 3.3.4: carrying out weight distribution on the character granularity similarity score, the word granularity similarity score and the entity relation granularity similarity score, and calculating the similarity total score.
Most preferably, the ranking weights further comprise the steps of:
step 4.1: according to the similarity total score, sorting the candidate answer paths to obtain sorted candidate answer paths;
step 4.2: screening out a plurality of candidate answer paths from the sorted candidate answer paths;
step 4.3: and respectively giving different weights to the candidate answer paths to obtain the candidate answer paths with different weights.
By applying the method and the system, the problem that the traditional question answering method identifies wrong subject entities or cannot identify the subject entities is solved, the error rate of the subject entity identification model is reduced, the accuracy rate of the attribute relation detection model is improved, and the accuracy rate of the question answering of the whole knowledge base is improved.
Compared with the prior art, the invention has the following beneficial effects:
the knowledge base question-answering method based on deep learning provided by the invention solves the problem that an error subject entity is identified or can not be identified in the subject entity identification step in the three steps of the knowledge base question-answering, and improves the method, so that the error rate of the subject entity identification model is reduced. In the second step, the function of the relation on the candidate path and more semantic information expressed by the question of the user are considered, and the accuracy of the attribute relation detection model is improved. The results of the first two steps are combined in the third step, so that the accuracy of the question answering of the whole knowledge base is improved.
Drawings
Fig. 1 is a flow chart of a knowledge base question-answering method based on deep learning provided by the invention.
Fig. 2 is a flowchart of a method for identifying and weighting subject entities according to the present invention.
Fig. 3 is a flowchart of a method for calculating a similarity total score according to the present invention.
FIG. 4 is a flow chart of a method for ranking weights according to the present invention.
Detailed Description
The invention will be further described by the following specific examples in conjunction with the drawings, which are provided for illustration only and are not intended to limit the scope of the invention.
The invention relates to a knowledge base question-answering method based on deep learning, which comprises the following steps as shown in figure 1:
step 1: as shown in fig. 2, subject entity recognition is performed on a natural language question of a user based on a subject entity recognition SAMM model, and a plurality of subject entities are recognized.
Wherein the subject entity identification further comprises the steps of:
step 1.1: performing word segmentation and splicing on a natural language question of a user to obtain a spliced user question; the word segmentation and splicing further comprises the following steps:
step 1.1.1: the natural language question of the user is segmented into words, and each word after segmentation is mapped through the word2vec pre-trained word vector tool to obtain n word vectors of dimension d, which are taken as the user question E_q; wherein the user question E_q satisfies:

E_q = [e_1^q, e_2^q, …, e_n^q], E_q ∈ R^(n×d)

wherein n is the number of word vectors; d is the word vector dimension; e_i^q is the word vector of the ith word in the user question. In this embodiment, the dimension d is 300.
Step 1.1.2: calculating a user question E based on a self-attention mechanismqDependency between any two word vectors in the middle or inside, i.e. the word vector of the ith word
Figure BDA0002610064130000071
And the word vector of the jth word
Figure BDA0002610064130000072
Get user question EqSelf-attention interaction matrix of
Figure BDA0002610064130000073
And satisfies the following conditions:
Figure BDA0002610064130000074
wherein k is ∈ [1, n ]];
Figure BDA0002610064130000075
For the user to ask sentence EqA word vector of the jth word; alpha is alphaijFor the user to ask sentence EqThe point-multiplication similarity between the ith word and the jth word satisfies the following conditions:
Figure BDA0002610064130000076
step 1.1.3: will ask user for sentence EqInteraction matrix with self-attention
Figure BDA0002610064130000077
Splicing to obtain a spliced user question sentence CqAnd satisfies the following conditions:
Figure BDA0002610064130000078
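For illustration only (not part of the original disclosure), the following minimal NumPy sketch shows how the self-attention interaction matrix and the splicing of steps 1.1.1 to 1.1.3 could be computed, assuming the word vectors are already available as an array; the function and variable names are hypothetical.

import numpy as np

def self_attention_concat(E_q: np.ndarray) -> np.ndarray:
    """E_q: (n, d) word vectors of the user question; returns C_q: (n, 2d)."""
    # alpha[i, j] = dot-product similarity between the i-th and j-th question words
    alpha = E_q @ E_q.T                                   # (n, n)
    # softmax over j so each row of attention weights sums to 1
    weights = np.exp(alpha - alpha.max(axis=1, keepdims=True))
    weights = weights / weights.sum(axis=1, keepdims=True)
    # self-attention interaction matrix: each row is a weighted sum of all word vectors
    E_q_hat = weights @ E_q                               # (n, d)
    # splice the original question representation with the interaction matrix
    return np.concatenate([E_q, E_q_hat], axis=-1)        # (n, 2d)

# usage: n = 6 question words, d = 300 as in the embodiment
C_q = self_attention_concat(np.random.randn(6, 300))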
step 1.2: based on a bidirectional long short-term memory network (BiLSTM), context information is extracted from the spliced user question C_q to obtain the semantic features S_q containing the context information, which satisfy:

S_q = BiLSTM(C_q) = [s_1^q, s_2^q, …, s_n^q]

wherein s_t^q is the semantic feature of the spliced user question C_q at time step t.
step 1.3: feature recognition is performed according to the semantic features S_q containing the context information, and a plurality of subject entities are recognized. The feature recognition further comprises the steps of:
step 1.3.1: the semantic features S_q containing the context information are input into a conditional random field (CRF) to obtain a tagged question sequence;
step 1.3.2: a plurality of subject entities are identified according to the tagged question sequence; in this embodiment, N topic entities are identified.
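A minimal PyTorch sketch of the encoder and tagger of steps 1.2 and 1.3 is given below purely as an assumption for illustration, not as the original implementation: a bidirectional LSTM extracts the context features S_q and a linear layer emits per-token tag scores. In the described method these scores would be decoded by a CRF layer rather than the independent argmax used here; the class and tag names are hypothetical.

import torch
import torch.nn as nn

class TopicEntityTagger(nn.Module):
    def __init__(self, input_dim=600, hidden_dim=200, num_tags=3):  # e.g. B/I/O tags
        super().__init__()
        # bidirectional LSTM over the spliced question C_q (dimension 2d = 600 when d = 300)
        self.bilstm = nn.LSTM(input_dim, hidden_dim, bidirectional=True, batch_first=True)
        # per-token tag scores; a CRF layer would normally sit on top of these emissions
        self.emission = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, C_q):                  # C_q: (batch, n, input_dim)
        S_q, _ = self.bilstm(C_q)            # context-aware semantic features, (batch, n, 2*hidden)
        scores = self.emission(S_q)          # (batch, n, num_tags)
        return scores.argmax(dim=-1)         # simplified decoding; CRF decoding in the real model

tags = TopicEntityTagger()(torch.randn(1, 6, 600))   # one question of 6 tokens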
Step 2: carrying out weight assignment according to a plurality of subject entities to obtain a plurality of central entities with different weights; wherein the weight assignment further comprises the steps of:
step 2.1: selecting a plurality of topic entities ranked in the front from the plurality of topic entities according to the ranking order of the topic entities;
step 2.2: according to the sorting sequence, different weights are respectively given to a plurality of topic entities which are sorted at the front, so that a plurality of topic entities with different weights are obtained and are respectively used as a plurality of central entities with different weights.
In this embodiment, the number of the plurality of individual subject entities ranked in the top is K, and the number of the central entities is K.
The different weights are assigned according to the ranking order: topic entities ranked earlier are given larger weights and topic entities ranked later are given smaller weights, which provides a basis for selecting the knowledge base question-answering result in the subsequent answer selection step.
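As a small sketch of the weight assignment in step 2, the snippet below assumes (this is not stated in the text) that the per-entity scores from the recognition model are simply normalized so that earlier-ranked entities receive larger weights; the helper name and example scores are hypothetical.

def assign_entity_weights(ranked_entities, K=3):
    """ranked_entities: list of (entity, score) sorted best-first; returns K weighted central entities."""
    top = ranked_entities[:K]
    total = sum(score for _, score in top) or 1.0
    # earlier-ranked (higher-scoring) entities receive larger weights
    return [(entity, score / total) for entity, score in top]

central_entities = assign_entity_weights(
    [("Computer Theory and Technology", 0.55), ("Computer Theory", 0.40), ("Technology", 0.05)])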
Step 3: as shown in fig. 3, based on the attribute relation detection MGQR model, candidate answer paths are selected according to the plurality of central entities with different weights, and the similarity total score between the natural language question of the user and the candidate answer paths is calculated.
With the K central entities, K corresponding subgraphs that take each central entity as the center and have a radius of n nodes are selected from the knowledge base, and all paths in these subgraphs are taken as candidate answer paths.
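For illustration only, the sketch below shows how candidate answer paths could be enumerated from a triple-structured knowledge base around a central entity, here limited to a small radius; the toy adjacency mapping (built from triples as in the earlier background sketch), the radius value and the helper name are assumptions, not the patent's knowledge base.

graph = {
    "Computer Theory and Technology": [("publisher", "Press A")],
    "Press A": [("located_in", "Shanghai")],
}

def candidate_paths(graph, central_entity, radius=2):
    """Enumerate all paths of length <= radius starting from the central entity;
    every such path is treated as one candidate answer path."""
    paths = []
    def walk(node, path, depth):
        if depth == radius:
            return
        for relation, tail in graph.get(node, []):
            new_path = path + [(node, relation, tail)]
            paths.append(new_path)
            walk(tail, new_path, depth + 1)
    walk(central_entity, [], 0)
    return paths

paths = candidate_paths(graph, "Computer Theory and Technology")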
The method for calculating the similarity total score between the natural language question of the user and the candidate answer path further comprises the following steps:
step 3.1: granularity vector representation is performed on the characters, the words and the entity relations in the candidate answer path, and the character granularity vector, the word granularity vector and the entity relation granularity vector of the candidate answer path are respectively obtained; wherein the granularity vector representation further comprises the steps of:
step 3.1.1: character granularity vector representation is performed on the characters in the candidate answer path based on a random initialization mode, and the character granularity vector of the candidate answer path is obtained; wherein the character granularity vector is P_char and satisfies:

P_char = [p_1^char, p_2^char, …, p_v^char]

wherein v is the number of characters in the candidate answer path.
step 3.1.2: word granularity vector representation is performed on the words in the candidate answer path based on the word2vec pre-trained word vector tool, and the word granularity vector of the candidate answer path is obtained; wherein the word granularity vector is P_word and satisfies:

P_word = [p_1^word, p_2^word, …, p_h^word]

wherein h is the number of words in the candidate answer path.
step 3.1.3: the word granularity vector P_word is updated; that is, entity relation granularity vector representation is performed on the entity relations in the candidate answer path based on a random initialization mode, and the entity relation granularity vector of the candidate answer path is obtained; wherein the entity relation granularity vector is P_relation and satisfies:

P_relation = [p_1^rel, p_2^rel, …, p_l^rel]

wherein l is the number of entity relations in the candidate answer path.
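A sketch of the three granularity representations of step 3.1 follows, assuming (for illustration) that the pre-trained word2vec lookup is available as a plain dictionary and that characters and entity relations are randomly initialized as the text describes; the dictionary, dimension and helper names are placeholders.

import numpy as np

d = 300
rng = np.random.default_rng(0)
word2vec = {"publisher": rng.standard_normal(d)}          # stand-in for a pre-trained word2vec table

def char_granularity(path_chars):          # P_char: random initialization per character
    return np.stack([rng.standard_normal(d) for _ in path_chars])

def word_granularity(path_words):          # P_word: word2vec lookup, random for out-of-vocabulary words
    return np.stack([word2vec.get(w, rng.standard_normal(d)) for w in path_words])

def relation_granularity(path_relations):  # P_relation: random initialization per entity relation
    return np.stack([rng.standard_normal(d) for _ in path_relations])

P_char = char_granularity(list("publisher"))
P_word = word_granularity(["publisher"])
P_relation = relation_granularity(["publisher"])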
Step 3.2: according toWord granularity vector P of candidate answer pathcharWord granularity vector PwordAnd entity relationship granularity vector PrelationAnd extracting semantic features to obtain word granularity semantic features, word granularity semantic features and entity relation granularity semantic features of the candidate answer paths.
The semantic feature extraction further comprises the following steps:
step 3.2.1: weighted calculation is respectively performed on the character granularity vector P_char, the word granularity vector P_word and the entity relation granularity vector P_relation of the candidate answer path, so as to respectively obtain a character granularity user question under the character granularity candidate path, a word granularity user question under the word granularity candidate path and an entity relation granularity user question under the entity relation granularity candidate path.
The character granularity user question is α_q = [α_1^q, α_2^q, …, α_n^q] and satisfies:

α_i^q = Σ_{j=1..v} [exp(e_ij) / Σ_{k=1..v} exp(e_ik)] · p_j^char

wherein α_i^q is the matching information between the ith word of the user question and each character in the character granularity candidate path; e_ij is the dot-product similarity between the ith word of the user question and the jth character in the character granularity candidate path, and satisfies:

e_ij = e_i^q · p_j^char

The word granularity user question is β_q = [β_1^q, β_2^q, …, β_n^q] and satisfies:

β_i^q = Σ_{j=1..h} [exp(b_ij) / Σ_{k=1..h} exp(b_ik)] · p_j^word

wherein β_i^q is the matching information between the ith word of the user question and each word in the word granularity candidate path; b_ij is the dot-product similarity between the ith word of the user question and the jth word in the word granularity candidate path, and satisfies:

b_ij = e_i^q · p_j^word

The entity relation granularity user question is γ_q = [γ_1^q, γ_2^q, …, γ_n^q] and satisfies:

γ_i^q = Σ_{j=1..l} [exp(f_ij) / Σ_{k=1..l} exp(f_ik)] · p_j^rel

wherein γ_i^q is the matching information between the ith word of the user question and each entity relation in the entity relation granularity candidate path; f_ij is the dot-product similarity between the ith word of the user question and the jth entity relation in the entity relation granularity candidate path, and satisfies:

f_ij = e_i^q · p_j^rel
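The three weighted calculations of step 3.2.1 share the same form; the single NumPy helper below (illustrative, with hypothetical names) makes the pattern explicit: each question word attends over the path items of one granularity and collects a weighted sum.

import numpy as np

def question_under_path(E_q, P):
    """E_q: (n, d) question word vectors; P: (m, d) path vectors of one granularity.
    Returns (n, d): for each question word, the attention-weighted sum of path vectors."""
    sims = E_q @ P.T                                        # dot-product similarities e_ij / b_ij / f_ij
    w = np.exp(sims - sims.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)                    # softmax over the path items
    return w @ P                                            # alpha_q / beta_q / gamma_q

q = np.random.randn(4, 300)      # question words
p = np.random.randn(7, 300)      # e.g. character-granularity path vectors P_char
alpha_q = question_under_path(q, p)
# beta_q and gamma_q reuse the same routine with P_word and P_relation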
step 3.2.2: based on a convolutional neural network, the character granularity user question α_q, the word granularity user question β_q and the entity relation granularity user question γ_q are respectively extracted, and the character granularity semantic features, the word granularity semantic features and the entity relation granularity semantic features are respectively obtained; wherein the character granularity semantic feature is c_char_q and satisfies:

c_char_q = CNN(α_q);

the word granularity semantic feature is c_word_q and satisfies:

c_word_q = CNN(β_q);

the entity relation granularity semantic feature is c_re_q and satisfies:

c_re_q = CNN(γ_q).
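A PyTorch sketch of the convolutional feature extraction in step 3.2.2 is shown below as an assumption for illustration (the kernel size and channel counts are not given in the text): a 1-D convolution over the token dimension followed by max pooling yields a fixed-size semantic feature for each granularity.

import torch
import torch.nn as nn

class GranularityCNN(nn.Module):
    def __init__(self, dim=300, out_channels=128, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(dim, out_channels, kernel_size, padding=1)

    def forward(self, x):                # x: (batch, seq_len, dim), e.g. alpha_q / beta_q / gamma_q
        h = torch.relu(self.conv(x.transpose(1, 2)))   # (batch, out_channels, seq_len)
        return h.max(dim=-1).values                    # max-pool over the sequence -> (batch, out_channels)

cnn = GranularityCNN()
c_char_q = cnn(torch.randn(1, 6, 300))   # character-granularity semantic feature c_char_q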
step 3.3: based on a bidirectional long short-term memory network, similarity calculation is performed according to the character granularity vector P_char, the word granularity vector P_word and the entity relation granularity vector P_relation of the candidate answer path and the character granularity semantic feature c_char_q, the word granularity semantic feature c_word_q and the entity relation granularity semantic feature c_re_q, and the similarity total score between the natural language question of the user and the candidate answer path is calculated.
Wherein, the similarity calculation also comprises the following steps:
step 3.3.1: the character granularity vector P_char and the character granularity semantic feature c_char_q are correspondingly input into the bidirectional long short-term memory network, character granularity similarity calculation is performed, and the character granularity similarity score Score_1 between the natural language question of the user and the candidate answer path under the character granularity candidate path is calculated;
step 3.3.2: the word granularity vector P_word and the word granularity semantic feature c_word_q are correspondingly input into the bidirectional long short-term memory network, word granularity similarity calculation is performed, and the word granularity similarity score Score_2 between the natural language question of the user and the candidate answer path under the word granularity candidate path is calculated;
step 3.3.3: the entity relation granularity vector P_relation and the entity relation granularity semantic feature c_re_q are correspondingly input into the bidirectional long short-term memory network, entity relation granularity similarity calculation is performed, and the entity relation granularity similarity score Score_3 between the natural language question of the user and the candidate answer path under the entity relation granularity candidate path is calculated;
step 3.3.4: weights are distributed to the character granularity similarity score Score_1, the word granularity similarity score Score_2 and the entity relation granularity similarity score Score_3, and the similarity total score Score between the user question and the candidate path is calculated, satisfying:

Score = μ · Score_1 + ν · Score_2 + ω · Score_3

wherein μ, ν and ω are the weight coefficients of the character granularity, the word granularity and the entity relation granularity respectively; the weight coefficient μ of the character granularity, the weight coefficient ν of the word granularity and the weight coefficient ω of the entity relation granularity are the optimal solution determined by experiments.
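A condensed sketch of steps 3.3.1 to 3.3.4 follows for illustration; the text does not specify how the BiLSTM outputs are compared with the question features, so cosine similarity is assumed here, and the weight values and names are placeholders.

import torch
import torch.nn as nn
import torch.nn.functional as F

bilstm = nn.LSTM(input_size=128, hidden_size=64, bidirectional=True, batch_first=True)

def granularity_score(path_vectors, question_feature):
    """Encode one granularity of the candidate path with the BiLSTM and compare it with the
    corresponding question semantic feature (cosine similarity assumed)."""
    encoded, _ = bilstm(path_vectors)          # (batch, seq_len, 2 * hidden)
    path_feature = encoded.mean(dim=1)         # pool the path sequence into one vector
    return F.cosine_similarity(path_feature, question_feature, dim=-1)

def total_score(score1, score2, score3, mu=0.3, nu=0.3, omega=0.4):
    # mu, nu, omega: granularity weight coefficients, tuned experimentally in the described method
    return mu * score1 + nu * score2 + omega * score3

score1 = granularity_score(torch.randn(1, 5, 128), torch.randn(1, 128))  # character granularity
score = total_score(score1, score1, score1)   # word and relation granularities reuse the same routine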
Step 4: as shown in fig. 4, according to the similarity total score, the candidate answer paths are ranked and a plurality of candidate answer paths with different weights are obtained; wherein the ranking and weighting further comprises the steps of:
step 4.1: the candidate answer paths are sorted from high to low according to the similarity total score, and the sorted candidate answer paths are obtained;
step 4.2: a plurality of candidate answer paths are screened out from the sorted candidate answer paths; in this embodiment, the number of screened candidate answer paths is Z, where 1 ≤ Z < the total number of candidate answer paths.
step 4.3: different weights are respectively assigned to the Z candidate answer paths, and the candidate answer paths with different weights are obtained.
Among the Z candidate answer paths, paths ranked earlier are assigned larger weights and paths ranked later are assigned smaller weights.
Step 5: based on a linear function, function matching calculation is performed on the plurality of central entities with different weights and the plurality of candidate answer paths with different weights to obtain a final answer, which is fed back to the user; wherein the final answer is given by Selection_Function, which satisfies the following condition:
Selection_Function = λ_1 · K_top_entity + λ_2 · Z_top_path

wherein K_top_entity denotes the K central entities with different weights; Z_top_path denotes the Z candidate answer paths with different weights; λ_1 and λ_2 are the linear function coefficients.
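Step 5 combines the two weighted lists; since the exact linear form is only shown as an image in the original publication, the sketch below assumes a simple weighted sum of the central-entity weight and the candidate-path weight for every compatible pair, with hypothetical coefficients and helper names.

def select_answer(weighted_entities, weighted_paths, lambda1=0.4, lambda2=0.6):
    """weighted_entities: [(entity, weight)]; weighted_paths: [(head_entity, relation, tail, weight)].
    Returns the tail of the best-scoring path whose head matches a weighted central entity."""
    entity_weight = dict(weighted_entities)
    best, best_score = None, float("-inf")
    for head, rel, tail, w_path in weighted_paths:
        if head not in entity_weight:
            continue
        score = lambda1 * entity_weight[head] + lambda2 * w_path   # linear function matching
        if score > best_score:
            best, best_score = tail, score
    return best

answer = select_answer(
    [("Computer Theory and Technology", 0.6)],
    [("Computer Theory and Technology", "publisher", "Press A", 0.9)])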
The working principle of the invention is as follows:
topic entity recognition is performed on the natural language question of the user based on the subject entity recognition SAMM model, and a plurality of topic entities are recognized; weight assignment is carried out according to the plurality of topic entities to obtain a plurality of central entities with different weights; based on the attribute relation detection MGQR model, candidate answer paths are selected according to the plurality of central entities with different weights, and the similarity total score between the natural language question of the user and the candidate answer paths is calculated; according to the similarity total score, the candidate answer paths are ranked to obtain a plurality of candidate answer paths with different weights; and based on a linear function, function matching calculation is performed on the plurality of central entities with different weights and the plurality of candidate answer paths with different weights to obtain a final answer, which is fed back to the user.
In conclusion, the knowledge base question-answering method based on deep learning solves the problem that the traditional question-answering method identifies wrong subject entities or cannot identify the subject entities, reduces the error rate of a subject entity identification model, improves the accuracy rate of an attribute relation detection model, and improves the accuracy rate of the question-answering of the whole knowledge base.
While the present invention has been described in detail with reference to the preferred embodiments, it should be understood that the above description should not be taken as limiting the invention. Various modifications and alterations to this invention will become apparent to those skilled in the art upon reading the foregoing description. Accordingly, the scope of the invention should be determined from the following claims.

Claims (10)

1. A knowledge base question-answering method based on deep learning is characterized by comprising the following steps:
step 1: identifying a subject entity for a natural language question of a user based on a subject entity identification SAMM model, and identifying a plurality of subject entities;
step 2: carrying out weight assignment according to the plurality of subject entities to obtain a plurality of central entities with different weights;
step 3: based on an attribute relation detection MGQR model, selecting candidate answer paths according to the plurality of central entities with different weights, and calculating the similarity total score of a natural language question of a user and the candidate answer paths;
step 4: ranking the candidate answer paths according to the similarity total score to obtain a plurality of candidate answer paths with different weights;
step 5: performing function matching calculation on the plurality of central entities with different weights and the plurality of candidate answer paths with different weights based on a linear function to obtain a final answer and feed the final answer back to the user.
2. The deep learning based knowledge base question-answering method according to claim 1, wherein the subject entity identification further comprises the steps of:
step 1.1: performing word segmentation and splicing on a natural language question of a user to obtain a spliced user question;
step 1.2: extracting context information from the spliced user question to obtain semantic features containing the context information;
step 1.3: and performing feature recognition according to the semantic features containing the context information to recognize a plurality of topic entities.
3. The knowledge base question-answering method based on deep learning of claim 2, wherein the segmentation stitching further comprises the following steps:
step 1.1.1: performing word segmentation and updating on a natural language question of a user to obtain a user question;
step 1.1.2: calculating the dependency relationship between any two word vectors in the user question based on a self-attention mechanism to obtain a self-attention interaction matrix of the user question;
step 1.1.3: and splicing the user question and the self-attention interaction matrix to obtain a spliced user question.
4. The deep learning based knowledge base question-answering method according to claim 2, wherein the feature recognition further comprises the steps of:
step 1.3.1: inputting the semantic features containing the context information into a conditional random field CRF to obtain a tagged question sequence;
step 1.3.2: and identifying a plurality of subject entities according to the marked question sentence sequence.
5. The deep learning based knowledge base question-answering method according to claim 1, wherein the weight assignment further comprises the steps of:
step 2.1: selecting a plurality of topic entities ranked in the front from the plurality of topic entities according to the ranking order of the topic entities;
step 2.2: and respectively giving different weights to the plurality of topic entities which are ranked at the front, so as to obtain a plurality of topic entities with different weights, wherein the topic entities are used as a plurality of central entities with different weights.
6. The deep learning-based knowledge base question-answering method according to claim 1, wherein calculating the similarity total score further comprises the steps of:
step 3.1: performing granularity vector representation on the characters, the words and the entity relations in the candidate answer path, and respectively obtaining the character granularity vector, the word granularity vector and the entity relation granularity vector of the candidate answer path;
step 3.2: extracting semantic features according to the character granularity vector, the word granularity vector and the entity relation granularity vector to obtain character granularity semantic features, word granularity semantic features and entity relation granularity semantic features;
step 3.3: calculating the similarity according to the character granularity vector, the word granularity vector, the entity relation granularity vector, the character granularity semantic features, the word granularity semantic features and the entity relation granularity semantic features, and calculating the similarity total score.
7. The deep learning based knowledge base question-answering method according to claim 6, wherein the granular vector representation further comprises the steps of:
step 3.1.1: performing character granularity vector representation on the characters in the candidate answer path based on a random initialization mode to obtain the character granularity vector;
step 3.1.2: performing word granularity vector representation on words in the candidate answer path based on a word2vec pre-training word vector tool to obtain the word granularity vector;
step 3.1.3: and performing entity relation granularity vector representation on the entity relation in the candidate answer path based on a random initialization mode to obtain the entity relation granularity vector.
8. The deep learning based knowledge base question-answering method according to claim 6, wherein the semantic feature extraction further comprises the steps of:
step 3.2.1: respectively carrying out weighted calculation on the character granularity vector, the word granularity vector and the entity relation granularity vector to respectively obtain a character granularity user question under the character granularity candidate path, a word granularity user question under the word granularity candidate path and an entity relation granularity user question under the entity relation granularity candidate path;
step 3.2.2: extracting the character granularity user question, the word granularity user question and the entity relation granularity user question respectively to obtain the character granularity semantic features, the word granularity semantic features and the entity relation granularity semantic features respectively.
9. The deep learning-based knowledge base question-answering method according to claim 6, wherein the similarity calculation further comprises the steps of:
step 3.3.1: performing character granularity similarity calculation on the character granularity vector and the character granularity semantic features, and calculating a character granularity similarity score of the natural language question of the user and the candidate answer path under the character granularity candidate path;
step 3.3.2: performing word granularity similarity calculation on the word granularity vector and the word granularity semantic features, and calculating a word granularity similarity score of the natural language question of the user and the candidate answer path under the word granularity candidate path;
step 3.3.3: carrying out entity relation granularity similarity calculation on the entity relation granularity vector and the entity relation granularity semantic features, and calculating the entity relation granularity similarity score of the natural language question of the user and the candidate answer path under the entity relation granularity candidate path;
step 3.3.4: carrying out weight distribution on the character granularity similarity score, the word granularity similarity score and the entity relation granularity similarity score, and calculating the similarity total score.
10. The deep learning based knowledge base question-answering method according to claim 1, wherein the ranking weights further comprise the steps of:
step 4.1: sorting the candidate answer paths according to the similarity total score to obtain sorted candidate answer paths;
step 4.2: screening out a plurality of candidate answer paths from the sorted candidate answer paths;
step 4.3: and respectively giving different weights to the candidate answer paths to obtain the candidate answer paths with different weights.
CN202010751026.5A 2020-07-30 2020-07-30 Knowledge base question-answering method based on deep learning Active CN111858896B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010751026.5A CN111858896B (en) 2020-07-30 2020-07-30 Knowledge base question-answering method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010751026.5A CN111858896B (en) 2020-07-30 2020-07-30 Knowledge base question-answering method based on deep learning

Publications (2)

Publication Number Publication Date
CN111858896A true CN111858896A (en) 2020-10-30
CN111858896B CN111858896B (en) 2024-03-29

Family

ID=72945675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010751026.5A Active CN111858896B (en) 2020-07-30 2020-07-30 Knowledge base question-answering method based on deep learning

Country Status (1)

Country Link
CN (1) CN111858896B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112579752A (en) * 2020-12-10 2021-03-30 上海明略人工智能(集团)有限公司 Entity relationship extraction method and device, storage medium and electronic equipment
CN112650845A (en) * 2020-12-30 2021-04-13 西安交通大学 Question-answering system and method based on BERT and knowledge representation learning
CN112818675A (en) * 2021-02-01 2021-05-18 北京金山数字娱乐科技有限公司 Knowledge base question-answer-based entity extraction method and device
CN115292461A (en) * 2022-08-01 2022-11-04 北京伽睿智能科技集团有限公司 Man-machine interaction learning method and system based on voice recognition
CN117521814A (en) * 2023-12-05 2024-02-06 北京科技大学 Question answering method and device based on multi-modal input and knowledge graph

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016112679A1 (en) * 2015-01-14 2016-07-21 百度在线网络技术(北京)有限公司 Method, system and storage medium for realizing intelligent answering of questions
CN110232113A (en) * 2019-04-12 2019-09-13 中国科学院计算技术研究所 A kind of method and system improving the accuracy of knowledge base question and answer

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016112679A1 (en) * 2015-01-14 2016-07-21 百度在线网络技术(北京)有限公司 Method, system and storage medium for realizing intelligent answering of questions
CN110232113A (en) * 2019-04-12 2019-09-13 中国科学院计算技术研究所 A kind of method and system improving the accuracy of knowledge base question and answer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
安波; 韩先培; 孙乐: "Knowledge Base Question Answering System Integrating Knowledge Representation" (融合知识表示的知识库问答系统), Scientia Sinica Informationis (中国科学: 信息科学), no. 11 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112579752A (en) * 2020-12-10 2021-03-30 上海明略人工智能(集团)有限公司 Entity relationship extraction method and device, storage medium and electronic equipment
CN112650845A (en) * 2020-12-30 2021-04-13 西安交通大学 Question-answering system and method based on BERT and knowledge representation learning
CN112650845B (en) * 2020-12-30 2023-01-03 西安交通大学 Question-answering system and method based on BERT and knowledge representation learning
CN112818675A (en) * 2021-02-01 2021-05-18 北京金山数字娱乐科技有限公司 Knowledge base question-answer-based entity extraction method and device
CN115292461A (en) * 2022-08-01 2022-11-04 北京伽睿智能科技集团有限公司 Man-machine interaction learning method and system based on voice recognition
CN115292461B (en) * 2022-08-01 2024-03-12 北京伽睿智能科技集团有限公司 Man-machine interaction learning method and system based on voice recognition
CN117521814A (en) * 2023-12-05 2024-02-06 北京科技大学 Question answering method and device based on multi-modal input and knowledge graph

Also Published As

Publication number Publication date
CN111858896B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN112115238B (en) Question-answering method and system based on BERT and knowledge base
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN111639171B (en) Knowledge graph question-answering method and device
Neculoiu et al. Learning text similarity with siamese recurrent networks
KR102194837B1 (en) Method and apparatus for answering knowledge-based question
CN112214610B (en) Entity relationship joint extraction method based on span and knowledge enhancement
CN111858896B (en) Knowledge base question-answering method based on deep learning
CN111985239A (en) Entity identification method and device, electronic equipment and storage medium
CN110597961A (en) Text category labeling method and device, electronic equipment and storage medium
CN112307182B (en) Question-answering system-based pseudo-correlation feedback extended query method
CN113515632B (en) Text classification method based on graph path knowledge extraction
CN111581364B (en) Chinese intelligent question-answer short text similarity calculation method oriented to medical field
CN110968708A (en) Method and system for labeling education information resource attributes
CN114238653B (en) Method for constructing programming education knowledge graph, completing and intelligently asking and answering
CN112328800A (en) System and method for automatically generating programming specification question answers
CN115564393A (en) Recruitment requirement similarity-based job recommendation method
CN112131876A (en) Method and system for determining standard problem based on similarity
CN114048305A (en) Plan recommendation method for administrative penalty documents based on graph convolution neural network
CN113806493A (en) Entity relationship joint extraction method and device for Internet text data
CN112579666A (en) Intelligent question-answering system and method and related equipment
Celikyilmaz et al. A graph-based semi-supervised learning for question-answering
CN112397201B (en) Intelligent inquiry system-oriented repeated sentence generation optimization method
CN111581365A (en) Predicate extraction method
CN114707615B (en) Ancient character similarity quantification method based on duration Chinese character knowledge graph
CN113780832B (en) Public opinion text scoring method, public opinion text scoring device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant