CN111858896B - Knowledge base question-answering method based on deep learning


Info

Publication number: CN111858896B (application CN202010751026.5A)
Authority: CN (China)
Prior art keywords: granularity, word, question, entity, user
Legal status: Active (granted)
Other versions: CN111858896A
Inventors: 翁兆琦, 张琳
Assignee: Shanghai Maritime University (original and current)
Application filed by Shanghai Maritime University; priority to CN202010751026.5A
Publication of CN111858896A, application granted, publication of CN111858896B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/332 - Query formulation
    • G06F16/3329 - Natural language query formulation or dialogue systems
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 - Named entity recognition
    • G06F40/30 - Semantic analysis
    • G06F40/35 - Discourse or dialogue representation
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning

Abstract

The invention discloses a knowledge base question-answering method based on deep learning, comprising the following steps: performing topic entity recognition on a user's natural language question to identify a plurality of topic entities; assigning weights to the plurality of topic entities to obtain a plurality of center entities with different weights; selecting candidate answer paths according to the plurality of center entities with different weights and calculating a similarity total score; ranking and weighting the candidate answer paths according to the similarity total score to obtain a plurality of candidate answer paths with different weights; and performing function matching calculation on the plurality of center entities with different weights and the plurality of candidate answer paths with different weights to obtain the final answer, which is fed back to the user. The method addresses the problem that traditional question-answering methods may identify a wrong topic entity or fail to identify one at all, reduces the error rate of the topic entity recognition model, improves the accuracy of the attribute relation detection model, and improves the accuracy of the whole knowledge base question-answering process.

Description

Knowledge base question-answering method based on deep learning
Technical Field
The invention relates to the technical field of computer question-answering systems, in particular to a knowledge base question-answering method based on deep learning.
Background
In recent years, with continuous technological innovation, artificial intelligence has advanced rapidly, and automatic question-answering systems, as an important branch of artificial intelligence, have received increasing attention. The amount of information on the network grows exponentially as the internet develops, so quickly extracting the effective information a user needs has become critical. The advent of search engine technology has largely met this need for information acquisition, and search engines have become a convenient way for people to acquire knowledge and filter information. However, a traditional search engine returns a list of web pages or documents in response to the user's question rather than directly returning the answer the user needs. Question-answering systems emerged to address this need: the user poses a question in natural language, and the question-answering system quickly returns the most accurate answer. Question-answering systems are expected to be the basic form of the next generation of search engines.
Knowledge base question answering is one type of automatic question answering. A knowledge base, also known as a knowledge graph, is a database system used to store large-scale structured knowledge. The structured knowledge in a common knowledge base can be represented as triples, i.e. (subject, predicate, object), where the subject and object are the head and tail entities of the triple and the predicate is the relationship between them. A large-scale knowledge base typically contains tens of millions of triples; many triples are interconnected to form a directed graph in which each entity is a node and each predicate relationship is a directed edge between nodes. The massive knowledge stored in a knowledge base provides rich factual resources for a question-answering system, and its structured knowledge representation is well suited to practical question-answering tasks.
Research methods for knowledge-base question-answering systems can be classified into semantic-parsing-based methods, information-extraction-based methods, and vector-modeling-based methods.
The semantic-parsing-based method first converts the natural language question into a logical-form expression, then converts that expression into a knowledge base query language such as SPARQL to query the corresponding information from the knowledge base and return an answer.
The information-extraction-based method extracts an entity from the question and queries the knowledge base for it to obtain a knowledge base subgraph centered on that entity's node; every node or edge in the subgraph can serve as a candidate answer. The question is then analyzed with rules or templates to extract a question feature vector, a classifier is built, and the candidate answers are filtered by feeding the feature vector to the classifier, yielding the final answer.
The vector-modeling-based method first extracts a topic entity from the natural language question and finds candidate answers in the knowledge base according to that entity; it then extracts a series of features from the question and the candidate answers to obtain their distributed representations, and ranks and selects the candidate answers by scoring these vector representations.
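As an illustration of the final stage of the semantic-parsing approach, the following minimal Python sketch issues a hand-written SPARQL query against a public knowledge base endpoint. It assumes the third-party SPARQLWrapper package, the DBpedia endpoint and the dbo:author property purely as illustrative stand-ins; the approach itself is independent of any particular knowledge base.

    from SPARQLWrapper import SPARQLWrapper, JSON

    # The last stage of a semantic-parsing pipeline: the question has already
    # been converted into a SPARQL query, which is now run against the
    # knowledge base endpoint to retrieve the answer.
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery("""
        PREFIX dbo: <http://dbpedia.org/ontology/>
        SELECT ?author WHERE {
            <http://dbpedia.org/resource/Dream_of_the_Red_Chamber> dbo:author ?author .
        }
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    for row in results["results"]["bindings"]:
        print(row["author"]["value"])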
Existing deep-learning-based knowledge base question-answering methods generally divide the task into three steps: topic entity recognition, attribute relation detection, and answer selection. Topic entity recognition identifies, with a dedicated model, the topic entities in the natural language question posed by the user. Attribute relation detection looks up the head entity corresponding to the identified topic entity in the knowledge base, treats all paths connected to that head entity as candidate answers to the user's question, and then ranks all candidate answers with an attribute relation detection model. Answer selection returns the highest-ranked candidate path from the attribute relation detection step to the user as the final answer of the knowledge base question-answering process.
Existing topic entity recognition models usually perform end-to-end training with a simple neural network and pass only the single most prominent topic entity in the question to the subsequent attribute relation detection model. Take the question "Which publishing house published Computer Theory and Technology?": the labeling sequence produced by the topic entity recognition model probably contains 1. "Computer Theory" and 2. "Computer Theory and Technology", but the model finally passes only the entity with the larger probability to the attribute relation detection step, even though both entities have corresponding head entities and paths in the knowledge base. Another example is the question "Who is the author of The Story of the Stone?": the topic entity recognition model recognizes the entity "The Story of the Stone", but no such entity exists in the knowledge base. If only the most prominent topic entity, "Computer Theory" or "The Story of the Stone", is selected as the central entity for the attribute relation detection step, then no matter how accurate the attribute relation detection model is, the correct answer cannot be given to the user and the subsequent work is wasted.
Existing attribute relation detection models encode the n paths (including entities and relations) connected to the central entity, together with the user question, into low-dimensional dense vectors in the same semantic space, and then compute the similarity between these vectors to rank each path. However, encoding the n paths uniformly ignores the role of the relations on individual paths. Moreover, the entities and relations on a path are usually short phrases and are not on the same order of magnitude in length as the user's question sentence, which, as a full sentence, carries more semantic information; simply encoding the user question therefore fails to fully exploit the semantic information of the question and the interaction information between the question and the path, which is a key reason for the low accuracy of attribute relation detection models.
Previous answer selection models usually just take the top-ranked path from the attribute relation detection step as the result of the knowledge base question answering. This ignores the association between the topic entity recognition step and the attribute relation detection step, treating them as independent of each other, and thereby reduces the accuracy of the knowledge base question answering.
Disclosure of Invention
The invention aims to provide a knowledge base question-answering method based on deep learning, in order to solve the problem that traditional question-answering methods may identify a wrong topic entity or fail to identify one at all, to reduce the error rate of the topic entity recognition model, to improve the accuracy of the attribute relation detection model, and to improve the accuracy of the whole knowledge base question-answering process.
In order to achieve the above purpose, the invention provides a knowledge base question-answering method based on deep learning, which comprises the following steps:
step 1: based on a topic entity identification SAMM model, carrying out topic entity identification on natural language questions of a user, and identifying a plurality of topic entities;
step 2: performing weight assignment according to the plurality of subject entities to obtain a plurality of center entities with different weights;
step 3: based on the attribute relation detection MGQR model, selecting candidate answer paths according to a plurality of central entities with different weights, and calculating the similarity total score of the natural language question of the user and the candidate answer paths;
step 4: according to the similarity total score, sorting weights are carried out on the candidate answer paths, and a plurality of candidate answer paths with different weights are obtained;
step 5: based on the linear function, performing function matching calculation on a plurality of center entities with different weights and a plurality of candidate answer paths with different weights, obtaining a final answer and feeding the final answer back to the user.
Most preferably, the topic entity identification further comprises the steps of:
step 1.1: word segmentation and splicing are carried out on the natural language question of the user, and a spliced user question is obtained;
step 1.2: extracting context information from the spliced user question to obtain semantic features containing the context information;
step 1.3: and carrying out feature recognition according to the semantic features containing the context information, and recognizing a plurality of subject entities.
Most preferably, the word segmentation and concatenation further comprises the steps of:
step 1.1.1: word segmentation and updating are carried out on the natural language question of the user, so that the user question is obtained;
step 1.1.2: based on a self-attention mechanism, calculating the dependency relationship between any two word vectors in the user question, and obtaining a self-attention interaction matrix of the user question;
step 1.1.3: and splicing the user question with the self-attention interaction matrix to obtain the spliced user question.
Most preferably, the feature recognition further comprises the steps of:
step 1.3.1: inputting semantic features containing context information into a Conditional Random Field (CRF) to obtain a labeled question sequence;
step 1.3.2: and identifying a plurality of topic entities according to the marked question sequence.
Most preferably, the weight assignment further comprises the steps of:
step 2.1: selecting a plurality of topic entities with the top ranking according to the ranking order of the topic entities;
step 2.2: and respectively giving different weights to the theme entities which are ranked at the front, and obtaining the theme entities with different weights as a plurality of center entities with different weights.
Most preferably, calculating the similarity total score further comprises the steps of:
step 3.1: performing granularity vector representation on the characters, words and entity relations in the candidate answer paths to respectively obtain character granularity vectors, word granularity vectors and entity relation granularity vectors of the candidate answer paths;
step 3.2: extracting semantic features according to the character granularity vector, the word granularity vector and the entity relation granularity vector to obtain character granularity semantic features, word granularity semantic features and entity relation granularity semantic features;
step 3.3: carrying out similarity calculation according to the character granularity vector, the word granularity vector and the entity relation granularity vector, together with the character granularity, word granularity and entity relation granularity semantic features, and calculating the similarity total score.
Most preferably, the granularity vector representation further comprises the steps of:
step 3.1.1: based on a random initialization mode, carrying out character granularity vector representation on the characters in the candidate answer paths to obtain character granularity vectors;
step 3.1.2: based on a word2vec pre-training word vector tool, carrying out word granularity vector representation on words in a candidate answer path to obtain word granularity vectors;
step 3.1.3: and based on a random initialization mode, entity relationship granularity vector representation is carried out on entity relationships in the candidate answer paths, and the entity relationship granularity vector is obtained.
Most preferably, the semantic feature extraction further comprises the steps of:
step 3.2.1: respectively carrying out weighted calculation on the character granularity vector, the word granularity vector and the entity relation granularity vector to respectively obtain a character granularity user question under the character granularity candidate path, a word granularity user question under the word granularity candidate path and an entity relation granularity user question under the entity relation granularity candidate path;
step 3.2.2: extracting the character granularity user question, the word granularity user question and the entity relation granularity user question respectively to obtain character granularity semantic features, word granularity semantic features and entity relation granularity semantic features.
Most preferably, the similarity calculation further comprises the steps of:
step 3.3.1: performing character granularity similarity calculation on the character granularity vector and the character granularity semantic features, and calculating a character granularity similarity score of the user's natural language question and the candidate answer path under the character granularity candidate path;
step 3.3.2: performing word granularity similarity calculation on the word granularity vector and the word granularity semantic features, and calculating a word granularity similarity score of the user's natural language question and the candidate answer path under the word granularity candidate path;
step 3.3.3: performing entity relation granularity similarity calculation on the entity relation granularity vector and the entity relation granularity semantic feature, and calculating an entity relation granularity similarity score of the user's natural language question and the candidate answer path under the entity relation granularity candidate path;
step 3.3.4: performing weight distribution on the character granularity similarity score, the word granularity similarity score and the entity relation granularity similarity score, and calculating the similarity total score.
Most preferably, the ranking weight further comprises the steps of:
step 4.1: sorting the candidate answer paths according to the similarity total score to obtain sorted candidate answer paths;
step 4.2: screening a plurality of candidate answer paths from the sorted candidate answer paths;
step 4.3: and respectively giving different weights to the multiple candidate answer paths to obtain multiple candidate answer paths with different weights.
By using this method, the problem that traditional question-answering methods may identify a wrong topic entity or fail to identify one at all is solved, the error rate of the topic entity recognition model is reduced, the accuracy of the attribute relation detection model is improved, and the accuracy of the whole knowledge base question-answering process is improved.
Compared with the prior art, the invention has the following beneficial effects:
The knowledge base question-answering method based on deep learning provides an improvement in each of the three steps of knowledge base question answering. In the first step, it largely solves the problem that the topic entity recognition step identifies a wrong topic entity or fails to identify one, thereby reducing the error rate of the topic entity recognition model. In the second step, it considers the role of the relation on the candidate path and the richer semantic information expressed by the user question, thereby improving the accuracy of the attribute relation detection model. In the third step, it combines the results of the first two steps, thereby improving the accuracy of the whole knowledge base question answering.
Drawings
Fig. 1 is a flowchart of a knowledge base question-answering method based on deep learning.
Fig. 2 is a flowchart of a method for identifying and assigning weights to subject entities provided by the present invention.
Fig. 3 is a flowchart of a method for calculating a similarity total score according to the present invention.
Fig. 4 is a flowchart of a method for ranking weights according to the present invention.
Detailed Description
The invention is further described by the following examples, which are given by way of illustration only and are not limiting of the scope of the invention.
The invention relates to a knowledge base question-answering method based on deep learning, which is shown in figure 1 and comprises the following steps:
step 1: as shown in fig. 2, based on the topic entity recognition SAMM model, topic entity recognition is performed on the natural language question of the user, and a plurality of topic entities are recognized.
Wherein the topic entity identification further comprises the following steps:
Step 1.1: word segmentation and splicing are carried out on the user's natural language question to obtain a spliced user question; the word segmentation and splicing further comprise the following steps:
Step 1.1.1: the user's natural language question is segmented into words, and each segmented word is mapped to a vector with the word2vec pre-trained word vector tool, yielding n word vectors of dimension d that together form the user question E_q, with E_q ∈ R^(n×d), where n is the number of word vectors, d is the word vector dimension, and e_i^q denotes the word vector of the i-th word in the user question; in this embodiment d = 300.
Step 1.1.2: based on a self-attention mechanism, the dependency between any two word vectors inside the user question E_q, i.e. between the word vector e_i^q of the i-th word and the word vector e_j^q of the j-th word, is calculated to obtain the self-attention interaction matrix of the user question, whose rows (indexed by k ∈ [1, n]) combine the word vectors e_j^q according to α_ij, the dot-product similarity between the i-th word and the j-th word of E_q.
Step 1.1.3: the user question E_q is spliced with its self-attention interaction matrix to obtain the spliced user question C_q.
Step 1.2: based on a bidirectional long short-term memory network, context information is extracted from the spliced user question C_q to obtain semantic features S_q containing the context information, with one semantic feature vector per time step t of the spliced user question C_q.
Step 1.3: feature recognition is performed on the semantic features S_q containing the context information to identify a plurality of topic entities. The feature recognition further comprises the following steps:
Step 1.3.1: the semantic features S_q containing the context information are input into a conditional random field (CRF) to obtain a labeled question sequence;
Step 1.3.2: a plurality of topic entities are identified from the labeled question sequence; in this embodiment, N topic entities are identified.
Step 2: performing weight assignment according to the plurality of subject entities to obtain a plurality of center entities with different weights; wherein the weight assignment further comprises the steps of:
step 2.1: selecting a plurality of topic entities with the top ranking according to the ranking order of the topic entities;
step 2.2: according to the sorting order, a plurality of theme entities with different weights are respectively assigned to the theme entities with the front sorting order, so that a plurality of theme entities with different weights are obtained and respectively used as a plurality of center entities with different weights.
In this embodiment, the number of top-ranked topic entities selected is K, so the number of center entities is also K.
The weights are assigned according to the ranking order: a higher-ranked topic entity receives a larger weight and a lower-ranked one a smaller weight, which gives the subsequent answer selection step a basis for choosing the knowledge base question-answering result.
Step 3: as shown in fig. 3, based on the attribute relation detection MGQR model, candidate answer paths are selected according to the plurality of center entities with different weights, and the similarity total score between the user's natural language question and each candidate answer path is calculated.
The candidate answer paths are selected through the K center entities: K corresponding subgraphs, each centered on a topic entity with a radius of n nodes, are selected from the knowledge base, and all paths on these subgraphs are used as the candidate answer paths.
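The following sketch illustrates how candidate answer paths could be collected from a triple-store view of the knowledge base for the weighted center entities. The breadth-first walk, the two-hop limit and the toy triples are assumptions for illustration only.

    from collections import defaultdict

    def candidate_paths(triples, center_entities, max_hops=2):
        """For each weighted centre entity, walk the knowledge base subgraph
        centred on it up to `max_hops` relation hops (the "radius of n nodes"
        in the text) and collect every outgoing path as a candidate answer
        path (entity, relation, ..., tail)."""
        graph = defaultdict(list)
        for head, rel, tail in triples:
            graph[head].append((rel, tail))

        paths = []
        for entity, weight in center_entities:
            frontier = [(entity, [entity])]
            for _ in range(max_hops):
                next_frontier = []
                for node, path in frontier:
                    for rel, tail in graph[node]:
                        new_path = path + [rel, tail]
                        paths.append((weight, new_path))
                        next_frontier.append((tail, new_path))
                frontier = next_frontier
        return paths

    # toy knowledge base
    kb = [("The Story of the Stone", "author", "Cao Xueqin"),
          ("Cao Xueqin", "dynasty", "Qing")]
    for w, p in candidate_paths(kb, [("The Story of the Stone", 0.7)]):
        print(w, p)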
The method for calculating the similarity total score of the user's natural language question and a candidate answer path further comprises the following steps:
Step 3.1: granularity vector representation is performed on the characters, words and entity relations in the candidate answer path to obtain, respectively, the character granularity vector, word granularity vector and entity relation granularity vector of the candidate answer path; the granularity vector representation further comprises the following steps:
Step 3.1.1: based on random initialization, character granularity vector representation is performed on the characters in the candidate answer path to obtain its character granularity vector P_char, where v is the number of characters in the candidate answer path.
Step 3.1.2: based on the word2vec pre-trained word vector tool, word granularity vector representation is performed on the words in the candidate answer path to obtain its word granularity vector P_word, where h is the number of words in the candidate answer path.
Step 3.1.3: building on the word granularity vector P_word, entity relation granularity vector representation is performed, based on random initialization, on the entity relations in the candidate answer path to obtain its entity relation granularity vector P_relation, where l is the number of entity relations in the candidate answer path.
Step 3.2: semantic feature extraction is performed according to the character granularity vector P_char, the word granularity vector P_word and the entity relation granularity vector P_relation of the candidate answer path, to obtain the character granularity, word granularity and entity relation granularity semantic features of the candidate answer path.
The semantic feature extraction further comprises the following steps:
Step 3.2.1: weighted calculation is performed separately on the character granularity vector P_char, the word granularity vector P_word and the entity relation granularity vector P_relation of the candidate answer path, to obtain the character granularity user question α_q under the character granularity candidate path, the word granularity user question β_q under the word granularity candidate path, and the entity relation granularity user question γ_q under the entity relation granularity candidate path, where:
for α_q, the matching information between the i-th word of the user question and each character of the character granularity candidate path is computed, with e_ij the dot-product similarity between the i-th word of the user question and the j-th character of the character granularity candidate path;
for β_q, the matching information between the i-th word of the user question and each word of the word granularity candidate path is computed, with b_ij the dot-product similarity between the i-th word of the user question and the j-th word of the word granularity candidate path;
for γ_q, the matching information between the i-th word of the user question and each entity relation of the entity relation granularity candidate path is computed, with f_ij the dot-product similarity between the i-th word of the user question and the j-th entity relation of the entity relation granularity candidate path.
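The exact weighting formulas of step 3.2.1 are published only as images, so the sketch below uses one standard attentive reading as an assumption: dot-product similarities between question words and path units, softmax-normalised and used to re-express the question under each granularity.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def granularity_question(Q, P):
        """Q holds the question word vectors (n x d) and P the path vectors
        at one granularity (m x d). The dot-product similarities (e_ij,
        b_ij or f_ij) are normalised and used to weight the path vectors,
        giving the question representation under that granularity
        (alpha_q, beta_q or gamma_q)."""
        sims = Q @ P.T                     # dot-product similarities (n x m)
        attn = softmax(sims, axis=1)       # matching information per question word
        return attn @ P                    # question under this granularity (n x d)

    # toy usage: a 4-word question against a 3-unit path granularity
    Q = np.random.randn(4, 300)
    P_word = np.random.randn(3, 300)
    beta_q = granularity_question(Q, P_word)    # word-granularity user question
    print(beta_q.shape)                          # (4, 300)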
Step 3.2.2: based on a convolutional neural network, the character granularity user question α_q, the word granularity user question β_q and the entity relation granularity user question γ_q are each processed to extract, respectively, the character granularity semantic feature, the word granularity semantic feature and the entity relation granularity semantic feature; the character granularity semantic feature c_char_q satisfies:
c_char_q = CNN(α_q);
the word granularity semantic feature c_word_q satisfies:
c_word_q = CNN(β_q);
the entity relation granularity semantic feature c_re_q satisfies:
c_re_q = CNN(γ_q).
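A sketch of the step 3.2.2 convolutional feature extractor follows; kernel size, channel count and max-pooling are assumptions, since the patent only states that a convolutional neural network maps α_q, β_q and γ_q to c_char_q, c_word_q and c_re_q.

    import torch
    import torch.nn as nn

    class GranularityCNN(nn.Module):
        """1-D convolution with max-pooling over a granularity-specific
        question representation, producing a fixed-size semantic feature."""
        def __init__(self, dim: int = 300, out_channels: int = 128, kernel_size: int = 3):
            super().__init__()
            self.conv = nn.Conv1d(dim, out_channels, kernel_size, padding=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, n, dim) -> Conv1d expects (batch, dim, n)
            h = torch.relu(self.conv(x.transpose(1, 2)))
            return h.max(dim=2).values          # max-pool over positions -> (batch, out_channels)

    cnn = GranularityCNN()
    alpha_q = torch.randn(1, 4, 300)            # character-granularity user question
    c_char_q = cnn(alpha_q)
    print(c_char_q.shape)                       # torch.Size([1, 128])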
Step 3.3: based on a bidirectional long short-term memory network, similarity calculation is performed according to the character granularity vector P_char, the word granularity vector P_word and the entity relation granularity vector P_relation of the candidate answer path, together with its character granularity semantic feature c_char_q, word granularity semantic feature c_word_q and entity relation granularity semantic feature c_re_q, to calculate the similarity total score of the user's natural language question and the candidate answer path.
The similarity calculation further comprises the following steps:
Step 3.3.1: the character granularity vector P_char and the character granularity semantic feature c_char_q are fed into a bidirectional long short-term memory network, and the character granularity similarity Score1 of the user's natural language question and the candidate answer path under the character granularity candidate path is calculated;
Step 3.3.2: the word granularity vector P_word and the word granularity semantic feature c_word_q are fed into a bidirectional long short-term memory network, and the word granularity similarity Score2 of the user's natural language question and the candidate answer path under the word granularity candidate path is calculated;
Step 3.3.3: the entity relation granularity vector P_relation and the entity relation granularity semantic feature c_re_q are fed into a bidirectional long short-term memory network, and the entity relation granularity similarity Score3 of the user's natural language question and the candidate answer path under the entity relation granularity candidate path is calculated;
Step 3.3.4: weights are distributed over the character granularity similarity Score1, the word granularity similarity Score2 and the entity relation granularity similarity Score3, and the similarity total Score of the user question and the candidate path is calculated as their weighted sum; the weight coefficient μ of the character granularity, the weight coefficient of the word granularity and the weight coefficient ω of the entity relation granularity are the optimal solution determined by experiments.
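The sketch below illustrates steps 3.3.1-3.3.4 under stated assumptions: the BiLSTM output is mean-pooled and compared with the corresponding semantic feature by cosine similarity, and the three weight coefficients are placeholders rather than the experimentally tuned values.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GranularitySimilarity(nn.Module):
        """A BiLSTM encodes the path vectors of one granularity (P_char,
        P_word or P_relation); its pooled state is compared with the
        matching semantic feature (c_char_q, c_word_q or c_re_q)."""
        def __init__(self, dim: int = 300, hidden: int = 64, feat_dim: int = 128):
            super().__init__()
            self.bilstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
            self.project = nn.Linear(2 * hidden, feat_dim)

        def forward(self, P: torch.Tensor, c_q: torch.Tensor) -> torch.Tensor:
            h, _ = self.bilstm(P)                       # (batch, m, 2*hidden)
            path_repr = self.project(h.mean(dim=1))     # (batch, feat_dim)
            return F.cosine_similarity(path_repr, c_q, dim=1)

    def total_score(score1, score2, score3, mu=0.3, nu=0.3, omega=0.4):
        """Step 3.3.4: weighted combination of the three granularity scores.
        mu / nu / omega are placeholder names and values; the patent tunes
        the coefficients experimentally."""
        return mu * score1 + nu * score2 + omega * score3

    sim = GranularitySimilarity()
    P_word = torch.randn(1, 3, 300)
    c_word_q = torch.randn(1, 128)
    score2 = sim(P_word, c_word_q)
    print(total_score(score2, score2, score2))   # toy call with one score reused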
Step 4: as shown in fig. 4, ranking weights are performed on the candidate answer paths according to the similarity total score, so as to obtain a plurality of candidate answer paths with different weights; wherein the ranking weight further comprises the steps of:
step 4.1: sorting the candidate answer paths according to the similarity total Score from high to low to obtain sorted candidate answer paths;
step 4.2: screening a plurality of candidate answer paths from the sorted candidate answer paths; in this embodiment, the number of candidate answer paths is Z; wherein Z is more than or equal to 1, and Z is less than the total number of candidate answer paths.
Step 4.3: and respectively giving different weights to the multiple candidate answer paths to obtain multiple candidate answer paths with different weights.
Among the Z candidate answer paths, a larger weight is given to the paths in the front order, and a smaller weight is given to the paths in the rear order.
Step 5: based on the linear function, performing function matching calculation on a plurality of center entities with different weights and a plurality of candidate answer paths with different weights to obtain a final answer and feeding the final answer back to a user; wherein, the final answer is selection_function, and satisfies:
Here k_top_entity denotes the K center entities with different weights, z_top_path denotes the Z candidate answer paths with different weights, and the coefficients of the linear function weight their contributions.
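Because the selection_function itself appears only as an image in the patent, the following sketch uses a simple weighted sum of the center entity weight and the candidate path weight over matching (entity, path) pairs as an assumed stand-in for step 5.

    def select_answer(weighted_entities, weighted_paths, lam1=0.5, lam2=0.5):
        """Combine the K weighted centre entities with the Z weighted
        candidate answer paths through a linear score and return the
        best-scoring path's tail as the final answer. lam1/lam2 are
        placeholder coefficients."""
        best_score, best_path = float("-inf"), None
        for entity, w_e in weighted_entities:
            for path, w_p in weighted_paths:
                if path[0] != entity:          # only paths rooted at this centre entity
                    continue
                score = lam1 * w_e + lam2 * w_p
                if score > best_score:
                    best_score, best_path = score, path
        return best_path[-1] if best_path else None

    answer = select_answer(
        [("The Story of the Stone", 0.7), ("Stone", 0.3)],
        [(["The Story of the Stone", "author", "Cao Xueqin"], 0.8)])
    print(answer)   # Cao Xueqin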
The working principle of the invention is as follows:
based on a topic entity identification SAMM model, carrying out topic entity identification on natural language questions of a user, and identifying a plurality of topic entities; performing weight assignment according to the plurality of subject entities to obtain a plurality of center entities with different weights; detecting an MGQR model based on attribute relations, selecting candidate answer paths according to a plurality of central entities with different weights, and calculating the similarity total score of the natural language question of the user and the candidate answer paths; according to the similarity total score, sorting weights are carried out on the candidate answer paths, and a plurality of candidate answer paths with different weights are obtained; based on the linear function, performing function matching calculation on a plurality of center entities with different weights and a plurality of candidate answer paths with different weights, obtaining a final answer and feeding the final answer back to the user.
In summary, the knowledge base question-answering method based on deep learning solves the problem that traditional question-answering methods may identify a wrong topic entity or fail to identify one at all, reduces the error rate of the topic entity recognition model, improves the accuracy of the attribute relation detection model, and improves the accuracy of the whole knowledge base question-answering process.
While the present invention has been described in detail through the foregoing description of the preferred embodiment, it should be understood that the foregoing description is not to be considered as limiting the invention. Many modifications and substitutions of the present invention will become apparent to those of ordinary skill in the art upon reading the foregoing. Accordingly, the scope of the invention should be limited only by the attached claims.

Claims (3)

1. The knowledge base question-answering method based on deep learning is characterized by comprising the following steps of:
step 1: based on a topic entity identification SAMM model, carrying out topic entity identification on natural language questions of a user, and identifying a plurality of topic entities; the topic entity identification further comprises the steps of:
step 1.1: word segmentation and splicing are carried out on the natural language question of the user, and a spliced user question is obtained;
the word segmentation and splicing further comprises the following steps:
step 1.1.1: word segmentation and updating are carried out on the natural language question of the user, so that the user question is obtained;
step 1.1.2: based on a self-attention mechanism, calculating the dependency relationship between any two word vectors in the user question, and obtaining a self-attention interaction matrix of the user question;
step 1.1.3: splicing the user question with the self-attention interaction matrix to obtain a spliced user question;
step 1.2: extracting context information from the spliced user question sentence to obtain semantic features containing the context information;
step 1.3: performing feature recognition according to the semantic features containing the context information to identify a plurality of subject entities; the feature recognition further comprises the steps of:
step 1.3.1: inputting the semantic features containing the context information into a Conditional Random Field (CRF) to obtain a labeled question sequence;
step 1.3.2: identifying a plurality of topic entities according to the noted question sequence;
step 2: performing weight assignment according to the plurality of subject entities to obtain a plurality of center entities with different weights;
step 3: based on the attribute relation detection MGQR model, selecting candidate answer paths according to the plurality of central entities with different weights, and calculating the similarity total score of the natural language question of the user and the candidate answer paths; the calculation of the similarity total score further comprises the following steps:
step 3.1: performing granularity vector representation on the characters, words and entity relations in the candidate answer paths to respectively obtain character granularity vectors, word granularity vectors and entity relation granularity vectors of the candidate answer paths; the granularity vector representation further comprises the steps of:
step 3.1.1: based on a random initialization mode, carrying out character granularity vector representation on the characters in the candidate answer paths to obtain the character granularity vector;
step 3.1.2: based on a word2vec pre-training word vector tool, carrying out word granularity vector representation on words in the candidate answer path to obtain the word granularity vector;
step 3.1.3: based on a random initialization mode, entity relation granularity vector representation is carried out on entity relations in the candidate answer paths, and the entity relation granularity vector is obtained;
step 3.2: extracting semantic features according to the character granularity vector, the word granularity vector and the entity relation granularity vector to obtain character granularity semantic features, word granularity semantic features and entity relation granularity semantic features; the semantic feature extraction further comprises the following steps:
step 3.2.1: respectively carrying out weighted calculation on the character granularity vector, the word granularity vector and the entity relation granularity vector to respectively obtain a character granularity user question under the character granularity candidate path, a word granularity user question under the word granularity candidate path and an entity relation granularity user question under the entity relation granularity candidate path;
step 3.2.2: extracting the character granularity user question, the word granularity user question and the entity relation granularity user question respectively to obtain the character granularity semantic feature, the word granularity semantic feature and the entity relation granularity semantic feature respectively;
step 3.3: performing similarity calculation according to the character granularity vector, the word granularity vector and the entity relation granularity vector, together with the character granularity, word granularity and entity relation granularity semantic features, and calculating the similarity total score; the similarity calculation further comprises the following steps:
step 3.3.1: performing character granularity similarity calculation on the character granularity vector and the character granularity semantic features, and calculating a character granularity similarity score of the user's natural language question and the candidate answer path under the character granularity candidate path;
step 3.3.2: performing word granularity similarity calculation on the word granularity vector and the word granularity semantic features, and calculating word granularity similarity scores of natural language questions of the user under the word granularity candidate path and candidate answer paths;
step 3.3.3: performing entity relationship granularity similarity calculation on the entity relationship granularity vector and the entity relationship granularity semantic feature, and calculating entity relationship granularity similarity scores of user natural language questions and candidate answer paths under entity relationship granularity candidate paths;
step 3.3.4: weight distribution is carried out on the character granularity similarity score, the word granularity similarity score and the entity relation granularity similarity score, and the similarity total score is calculated;
step 4: according to the similarity total score, sorting weights are carried out on the candidate answer paths, and a plurality of candidate answer paths with different weights are obtained;
step 5: and carrying out function matching calculation on the plurality of center entities with different weights and the plurality of candidate answer paths with different weights based on a linear function, obtaining a final answer and feeding back the final answer to a user.
2. The deep learning-based knowledge base question-answering method according to claim 1, wherein the weight assignment further comprises the steps of:
step 2.1: selecting a plurality of topic entities with the top ranking from the plurality of topic entities according to the ranking order of the plurality of topic entities;
step 2.2: and respectively giving different weights to the theme entities with the front sequence to obtain a plurality of theme entities with different weights, and taking the theme entities with different weights as a plurality of center entities with different weights.
3. The deep learning based knowledge base question-answering method according to claim 1, wherein the ranking weight further comprises the steps of:
step 4.1: sorting the candidate answer paths according to the similarity total score to obtain sorted candidate answer paths;
step 4.2: screening a plurality of candidate answer paths from the sorted candidate answer paths;
step 4.3: and respectively giving different weights to the plurality of candidate answer paths to obtain the plurality of candidate answer paths with different weights.
CN202010751026.5A 2020-07-30 2020-07-30 Knowledge base question-answering method based on deep learning Active CN111858896B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010751026.5A CN111858896B (en) 2020-07-30 2020-07-30 Knowledge base question-answering method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010751026.5A CN111858896B (en) 2020-07-30 2020-07-30 Knowledge base question-answering method based on deep learning

Publications (2)

Publication Number Publication Date
CN111858896A CN111858896A (en) 2020-10-30
CN111858896B true CN111858896B (en) 2024-03-29

Family

ID=72945675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010751026.5A Active CN111858896B (en) 2020-07-30 2020-07-30 Knowledge base question-answering method based on deep learning

Country Status (1)

Country Link
CN (1) CN111858896B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112579752A (en) * 2020-12-10 2021-03-30 上海明略人工智能(集团)有限公司 Entity relationship extraction method and device, storage medium and electronic equipment
CN112650845B (en) * 2020-12-30 2023-01-03 西安交通大学 Question-answering system and method based on BERT and knowledge representation learning
CN112818675A (en) * 2021-02-01 2021-05-18 北京金山数字娱乐科技有限公司 Knowledge base question-answer-based entity extraction method and device
CN115292461B (en) * 2022-08-01 2024-03-12 北京伽睿智能科技集团有限公司 Man-machine interaction learning method and system based on voice recognition


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016112679A1 (en) * 2015-01-14 2016-07-21 百度在线网络技术(北京)有限公司 Method, system and storage medium for realizing intelligent answering of questions
CN110232113A (en) * 2019-04-12 2019-09-13 中国科学院计算技术研究所 A kind of method and system improving the accuracy of knowledge base question and answer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
An Bo (安波), Han Xianpei (韩先培), Sun Le (孙乐). A knowledge base question answering system incorporating knowledge representation (融合知识表示的知识库问答系统). Scientia Sinica Informationis (中国科学: 信息科学), 2018, (11). *

Also Published As

Publication number Publication date
CN111858896A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN111611361B (en) Intelligent reading, understanding, question answering system of extraction type machine
CN112115238B (en) Question-answering method and system based on BERT and knowledge base
CN111858896B (en) Knowledge base question-answering method based on deep learning
CN111639171B (en) Knowledge graph question-answering method and device
CN108287822B (en) Chinese similarity problem generation system and method
KR102194837B1 (en) Method and apparatus for answering knowledge-based question
CN104794169B (en) A kind of subject terminology extraction method and system based on sequence labelling model
JPH07295989A (en) Device that forms interpreter to analyze data
CN112328800A (en) System and method for automatically generating programming specification question answers
CN115564393A (en) Recruitment requirement similarity-based job recommendation method
CN111581364B (en) Chinese intelligent question-answer short text similarity calculation method oriented to medical field
CN110968708A (en) Method and system for labeling education information resource attributes
CN112131876A (en) Method and system for determining standard problem based on similarity
CN114493783A (en) Commodity matching method based on double retrieval mechanism
CN114238653A (en) Method for establishing, complementing and intelligently asking and answering knowledge graph of programming education
CN114239828A (en) Supply chain affair map construction method based on causal relationship
CN114048305A (en) Plan recommendation method for administrative penalty documents based on graph convolution neural network
CN113806493A (en) Entity relationship joint extraction method and device for Internet text data
CN113641809A (en) XLNET-BiGRU-CRF-based intelligent question answering method
CN112084312A (en) Intelligent customer service system constructed based on knowledge graph
CN111581365A (en) Predicate extraction method
Li et al. Approach of intelligence question-answering system based on physical fitness knowledge graph
CN113780832B (en) Public opinion text scoring method, public opinion text scoring device, computer equipment and storage medium
CN113468311B (en) Knowledge graph-based complex question and answer method, device and storage medium
CN114817497A (en) Mixed question-answering method based on intention recognition and template matching

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant