CN113157888A

CN113157888A - Multi-knowledge-source-supporting query response method and device and electronic equipment

Info

Publication number: CN113157888A
Application number: CN202110424743.1A
Authority: CN
Inventors: 程渤; 赵帅; 韦翔晟; 陈俊亮
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2021-04-20
Filing date: 2021-04-20
Publication date: 2021-07-23

Abstract

The invention provides a query response method and device supporting multiple knowledge sources, and electronic equipment. The method comprises the following steps: determining a query sentence text; inputting the query sentence text into each of a plurality of reply units to obtain the reply output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources; and processing the replies of all the reply units according to a preset decision rule and outputting a reply sentence. The method, the device and the electronic equipment provided by the invention realize an optimal reply to the user's query sentence on the basis of multiple knowledge sources.

Description

Multi-knowledge-source-supporting query response method and device and electronic equipment

Technical Field

The invention relates to the technical field of automatic response, in particular to an inquiry response method and device supporting multiple knowledge sources and electronic equipment.

Background

With the rapid development of the internet in recent years, online shopping plays an increasingly important role in people's lives. The most common way for a user to learn about a product online is to browse the merchant's promotional pages or detailed product information pages. This approach is limited: the user has to browse and compare a large amount of information to find what they want, which makes for an unsatisfactory experience. When a user needs specific information, the user usually consults pre-sale customer service, asking about the product in their own words; the customer service agent then answers from experience or after looking up the relevant information pages or documents. Answering user questions manually has several disadvantages. Because of human physiological limits, an agent often needs considerable time to read, think and reply, so an instant reply is difficult to achieve, and long hours of customer service work increase the probability of mistakes. In addition, as the number of users grows, one agent often has to hold question-and-answer conversations with several users at the same time, and more agents must be hired when there are too few, which raises the cost of customer service. Question-answering systems emerged to cope with these situations. A question-answering system is a computer program that attempts to understand a question posed by a user and to answer it quickly, accurately and intelligibly in natural language that approximates human style. Applying a question-answering system to product information further meets user needs, giving efficient and accurate answers, freeing up labor and reducing customer service cost.

For specific industries and fields, some ready knowledge, such as unstructured or semi-structured product information documents or structured product information databases, generally exists, and some fields may have question-answer pairs written for common problems of users. By utilizing knowledge of these sources, a question-and-answer system can be constructed to answer the user's questions.

Existing question-answering systems generally use only one knowledge source, either a set of common question-answer pairs or a particular knowledge database, and many of them follow a technical route based on hand-written rules, or on rules combined with a simple machine learning model. As a result, the range of questions they can answer is small and their answer accuracy is difficult to improve. Many industries, however, hold information and knowledge in several forms at the same time, and a question-answering system based on a single knowledge source cannot make full use of the knowledge that already exists in the field.

Therefore, how to overcome the limitation that existing question-answering systems rely on a single knowledge source and cannot take into account the various forms of information and knowledge in an industry field remains a problem to be solved by those skilled in the art.

Disclosure of Invention

The invention provides a query response method and device supporting multiple knowledge sources, and electronic equipment, to solve the problem that existing query response systems rely on a single knowledge source and cannot take into account the multiple forms of information and knowledge in an industry field. Reply units with different knowledge sources each answer the input query sentence and output their own replies, and a preset decision rule then selects from or fuses these replies to obtain and output the final reply sentence. Because the query sentence is not answered from a single knowledge source, the reply units of at least two knowledge sources process the query sentence and output replies; the replies are finally screened and/or fused according to a fusion decision made on the basis of the characteristics of the reply units of the selected knowledge sources, and the optimal reply sentence is output.

The invention provides a query response method supporting multiple knowledge sources, which comprises the following steps:

determining an inquiry sentence text;

respectively inputting the query sentence texts into a plurality of reply units to obtain replies output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources;

and processing the replies of all the reply units by a preset decision rule and outputting reply sentences.

According to the query answering method supporting multiple knowledge sources provided by the invention, the multiple answering units comprise a first answering unit, a second answering unit and a third answering unit, correspondingly,

the first reply unit replies by using a knowledge source based on text similarity matching to output a first reply, the second reply unit replies by using a knowledge source based on a knowledge graph to output a second reply, and the third reply unit replies by using a knowledge source based on machine reading understanding to output a third reply.

According to the query reply method supporting multiple knowledge sources provided by the invention, the query sentence text is input into the first reply unit, and the first reply is output, which specifically comprises the following steps:

inputting the query sentence text into a text embedded representation model, and outputting a query sentence characteristic matrix;

inputting the query sentence feature matrix and the feature matrix corresponding to any question in a pre-constructed common question-answer library into a first similarity model, and outputting the corresponding first similarity;

determining, as the first reply, the reply sentence corresponding to the question in the common question-answer library with the largest first similarity;

the text embedded representation model is obtained by training a BERT network structure on sample query sentence texts, and the first similarity model is obtained by training on sample query sentence feature matrices, the feature matrices corresponding to questions in the common question-answer library, and similarity labels.

According to the inquiry reply method supporting multiple knowledge sources provided by the invention, the inquiry sentence text is input into the second reply unit, and the second reply is output, which specifically comprises the following steps:

inputting the query sentence characteristic matrix into a question entity recognition model, and outputting a query sentence keyword;

filling the query sentence keywords into a pre-written template to generate a query statement, and querying a pre-constructed knowledge graph database with the query statement to obtain a query result;

filling the query result into a natural language template to generate a second reply;

the text embedded representation model is obtained by training a BERT network structure on sample query sentence texts, and the question entity recognition model is obtained by training on sample query sentence feature matrices and the corresponding query sentence keyword labels.

According to the inquiry reply method supporting multiple knowledge sources provided by the invention, the inquiry sentence text is input into a third reply unit, and a third reply is output, and the method specifically comprises the following steps:

determining a second similarity of the query sentence text and any related text in a pre-constructed corpus;

determining, as the target text, the concatenation of a preset number of related texts taken from the corpus in descending order of second similarity;

inputting the query sentence text and the target text into a text embedded representation model, and outputting a query sentence feature matrix and a target text feature matrix;

inputting the query sentence feature matrix and the target text feature matrix into a selection model, and outputting a starting point and an ending point of the answer text;

determining a third reply based on the starting point, the ending point, and the target text;

the text embedded representation model is obtained by training a BERT network structure on sample query sentence texts and sample target texts, and the selection model is obtained by training on sample target texts, sample query sentence texts, and the starting point labels and ending point labels on the corresponding sample target texts.

According to the query answering method supporting multiple knowledge sources provided by the invention, the determining of the second similarity of the query sentence text and any one of the related texts in the pre-constructed corpus specifically comprises the following steps:

determining the TF-IDF features of the query sentence text, and determining the TF-IDF features of any related text in the pre-constructed corpus;

and applying a cosine similarity algorithm to the TF-IDF features of the query sentence text and the TF-IDF features of the related text to determine the second similarity between the query sentence text and that related text in the pre-constructed corpus.
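As an illustration only, the second similarity computation described above can be sketched in Python as follows. The whitespace tokenizer, the smoothed IDF formula and all function names are assumptions made for this sketch; the invention does not specify them, and Chinese text would need a word segmenter in place of `split()`.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; a real system would use a word segmenter.
    return text.lower().split()


def build_idf(corpus_tokens):
    """Smoothed inverse document frequency for every term seen in the corpus."""
    n_docs = len(corpus_tokens)
    doc_freq = Counter()
    for tokens in corpus_tokens:
        doc_freq.update(set(tokens))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector (term -> weight); terms unknown to the corpus are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: (count / total) * idf[term] for term, count in counts.items()}


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def second_similarities(query, corpus):
    """Second similarity between the query text and every related text in the corpus."""
    corpus_tokens = [tokenize(doc) for doc in corpus]
    idf = build_idf(corpus_tokens)
    query_vec = tfidf_vector(tokenize(query), idf)
    return [cosine_similarity(query_vec, tfidf_vector(tokens, idf))
            for tokens in corpus_tokens]
```

The function returns one score per related text, in corpus order, so the scores can be sorted afterwards to pick the most similar texts.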

According to the inquiry reply method supporting multiple knowledge sources provided by the invention, the replies of all reply units are processed by the preset decision rule, and the reply sentence is output, and the method specifically comprises the following steps:

if the largest of the first similarities is higher than a preset threshold, determining the first reply as the reply sentence and outputting it;

if the largest of the first similarities is not higher than the preset threshold and the second reply is not empty, determining the second reply as the reply sentence and outputting it;

and if the largest of the first similarities is not higher than the preset threshold and the second reply is empty, determining the third reply as the reply sentence and outputting it.
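The three-branch decision rule above can be sketched as a single function. This is a minimal sketch: the function name, the parameter names and the default threshold of 0.9 are assumptions, since the invention only states that the threshold is preset.

```python
from typing import Optional


def decide_reply(first_reply: str,
                 max_first_similarity: float,
                 second_reply: Optional[str],
                 third_reply: str,
                 threshold: float = 0.9) -> str:
    """Pick the final reply sentence from the three unit replies.

    threshold is a hypothetical default; the source only says it is preset.
    """
    # Branch 1: the best FAQ match is strictly above the threshold.
    if max_first_similarity > threshold:
        return first_reply
    # Branch 2: the FAQ match is not good enough, but the knowledge graph answered.
    if second_reply:
        return second_reply
    # Branch 3: fall back to the machine reading comprehension reply.
    return third_reply
```

Here an empty string and `None` are both treated as an empty second reply, which is one possible reading of "empty" in the rule.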

The invention also provides an inquiry answering device supporting multiple knowledge sources, which comprises:

a determination unit for determining an inquiry sentence text;

a reply subunit for respectively inputting the query sentence text into a plurality of reply units to obtain the reply output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources;

and the fusion unit is used for processing the replies of all the reply units according to a preset decision rule and outputting reply sentences.

The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the query-reply method supporting multiple knowledge sources as described in any one of the above when executing the program.

The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the query answering method supporting multiple knowledge sources as described in any one of the above.

The invention provides the inquiry answering method, the device and the electronic equipment supporting multiple knowledge sources, which are characterized in that the inquiry sentence text is determined; respectively inputting the query sentence texts into a plurality of reply units to obtain replies output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources; and processing the replies of all the reply units by a preset decision rule and outputting reply sentences. The method comprises the steps of adopting reply units with various knowledge sources to answer input inquiry sentences to output various replies, then using preset decision rules to select or fuse from the various replies to obtain final reply sentences and output the final reply sentences. Therefore, the method, the device and the electronic equipment provided by the invention realize the optimal response to the user inquiry sentence based on multiple knowledge sources.

Drawings

In order to more clearly illustrate the technical solutions of the present invention or the prior art, the following briefly introduces the drawings needed for the embodiments or the prior art descriptions, and obviously, the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a flow chart of a query response method supporting multiple knowledge sources according to the present invention;

FIG. 2 is an exemplary diagram of a sample data structure used during training of the question entity recognition model according to the present invention;

FIG. 3 is an exemplary diagram of a sample used in the selection model training provided by the present invention;

FIG. 4 is a schematic structural diagram of an inquiry response device supporting multiple knowledge sources according to the present invention;

FIG. 5 is a schematic physical structure diagram of an electronic device provided in the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Existing question-answering systems generally suffer from the problem that, because they rely on a single knowledge source, they cannot take into account the various forms of information and knowledge in an industry field. A query response method supporting multiple knowledge sources according to the invention is described below with reference to FIG. 1. FIG. 1 is a schematic flowchart of the query response method supporting multiple knowledge sources provided by the invention. As shown in FIG. 1, the method includes:

Step 110, determining the query sentence text.

Specifically, the method is executed by a query response device supporting multiple knowledge sources, that is, a query response system that receives a query sentence input by a user and outputs a corresponding reply sentence. Such a device is generally built for a particular field, industry or product and is designed to answer questions about that field, industry or product accurately, so the databases it uses are also built from the related content information. The content or form of these databases may differ, and the way a database is queried to find the answer to a question may differ as well; these differences give rise to query reply units with different knowledge sources. The first step of the method is to receive the user's input. The user may input speech or text, and the form of the input is not specifically limited; after receiving the input, the query response device converts it into the query sentence text for subsequent processing.

Step 120, inputting the query sentence text into a plurality of reply units respectively to obtain a reply output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources;

Specifically, as described above, databases built from the related content information of the same field, industry or product may store data with different contents or structures, so the query reply units built on them have different knowledge sources. The databases may also be queried in different ways to find the answer to a question, which likewise gives query reply units with different knowledge sources. The query response device that executes the method of this embodiment therefore includes a plurality of reply units, each of which is a query reply unit with a different knowledge source; that is, the reply units differ from one another in the data content or structure of the database each is built on and/or in the way each queries its own database. In this embodiment, every reply unit in the query response device processes the determined query sentence text and outputs its own reply. It should be noted that each reply unit is equivalent to a complete query response system: a query sentence goes in and a reply sentence comes out. The reply output by each reply unit is natural language that can be used directly as the final reply, so the optimal reply sentence can later be chosen directly from the replies without any further transformation.

Step 130, processing the replies of all the reply units according to a preset decision rule, and outputting a reply sentence.

Specifically, because this embodiment provides a query response method supporting multiple knowledge sources, the reply results output by the reply units of different knowledge sources are, once obtained, processed according to a preset decision rule. The preset decision rule may select one of the reply results as the final optimal reply sentence, according to the characteristics of the different reply units or according to a priority assigned to each reply result on the basis of some index associated with it. Alternatively, it may extract keywords from the reply results and splice them according to some fusion rule to obtain the final optimal reply sentence. The rule is not specifically limited here.

The inquiry answering method supporting multiple knowledge sources provided by the invention determines the text of an inquiry sentence; respectively inputting the query sentence texts into a plurality of reply units to obtain replies output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources; and processing the replies of all the reply units by a preset decision rule and outputting reply sentences. The method comprises the steps of adopting reply units with various knowledge sources to answer input inquiry sentences to output various replies, then using preset decision rules to select or fuse from the various replies to obtain final reply sentences and output the final reply sentences. Therefore, the method provided by the invention realizes the optimal response to the user inquiry sentence based on multiple knowledge sources.
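Steps 110 to 130 can be pictured as a fan-out over interchangeable reply units followed by a pluggable decision rule. The sketch below is illustrative only: the `Reply` type, the `answer` function and the `select_by_priority` rule are hypothetical names, and selection by fixed priority is just one of the rules the text permits.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass
class Reply:
    text: Optional[str]            # natural-language reply, or None if the unit has no answer
    score: Optional[float] = None  # optional index attached to the reply, e.g. a similarity


ReplyUnit = Callable[[str], Reply]
DecisionRule = Callable[[Dict[str, Reply]], str]


def answer(query_text: str, units: Dict[str, ReplyUnit], decide: DecisionRule) -> str:
    """Run every reply unit on the same query text, then apply the decision rule."""
    # Step 120: each unit behaves like a complete question-answering system.
    replies = {name: unit(query_text) for name, unit in units.items()}
    # Step 130: the preset decision rule selects from (or fuses) the replies.
    return decide(replies)


def select_by_priority(order: Sequence[str]) -> DecisionRule:
    """Example rule (an assumption): return the first non-empty reply in a fixed order."""
    def decide(replies: Dict[str, Reply]) -> str:
        for name in order:
            reply = replies.get(name)
            if reply is not None and reply.text:
                return reply.text
        return ""
    return decide
```

Because each unit is just a callable from query text to reply, a unit backed by a new knowledge source can be added without changing the other units or the dispatch code.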

Based on the above-described embodiment, in the method, the plurality of reply units includes the first reply unit, the second reply unit, and the third reply unit, and correspondingly,

Specifically, this embodiment further specifies that the query response device includes three reply units, namely the first reply unit, the second reply unit and the third reply unit, whose knowledge sources differ from one another.

The first reply unit replies using a knowledge source based on text similarity matching and outputs the first reply. It includes a pre-constructed frequently-asked-questions database containing the questions commonly asked in the field and accurate reply sentences for them. The database can be built from historical manual answer records, or by writing simple script files over the content of the field to generate question sentences and corresponding reply sentences in batches; sentences generated in this way have an obviously regular structure compared with the more natural and diverse manually written question-answer sentences. The first reply unit calculates the similarity between the input query sentence text and each question in the database, and takes the reply sentence corresponding to the question with the highest similarity as the first reply.

The second reply unit replies using a knowledge source based on a knowledge graph and outputs the second reply. The input query sentence text is first converted into a feature matrix, keywords of the query sentence are extracted from the feature matrix, and a query statement is constructed from the keywords. The query statement is then run against a pre-constructed knowledge graph database to obtain a query result, and the query result is filled into a natural language template to generate a sentence in natural word order as the second reply. The content information of the field is stored and associated in the knowledge graph database in a special form, and the way the database is queried differs accordingly.

The third reply unit replies using a knowledge source based on machine reading comprehension and outputs the third reply. Its database is a corpus of the field, which is simple to build: no question-answer pairs need to be written and no specially associated knowledge graph needs to be stored; the corpus is just the set of all related texts in the field. Because the content of this database has little regularity and weak internal association, querying it is more involved. One model must be trained to extract the most relevant target text from the corpus, and another to cut the reply sentence out of the target text; the two models are used in sequence, and the output of the second gives the third reply.

Based on the above embodiment, in the method, inputting the text of the query sentence into the first reply unit, and outputting the first reply specifically includes:

Specifically, the first reply unit inputs the query sentence text into the text embedded representation model, which embeds the text into a vector space and outputs the result of the embedding, namely the query sentence feature matrix. The query sentence feature matrix and the feature matrices corresponding to all question texts in the frequently-asked-questions library are input into the first similarity model, which calculates the first similarities one by one. The frequently asked question with the highest first similarity is found, and the answer corresponding to it in the library is taken as the output of the first reply unit (i.e., the first reply).
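The retrieval step of the first reply unit, scoring every library question and returning the answer of the best match, can be sketched as below. The `token_jaccard` function is a stand-in chosen only so that the example runs without a trained model; it is not the first similarity model of the invention, which operates on BERT feature matrices. All names are hypothetical.

```python
from typing import Callable, List, Tuple


def token_jaccard(a: str, b: str) -> float:
    """Stand-in similarity (an assumption): token-set overlap, NOT the trained model."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def first_reply(query: str,
                faq: List[Tuple[str, str]],
                similarity: Callable[[str, str], float] = token_jaccard
                ) -> Tuple[str, float]:
    """Return the answer of the most similar library question and its similarity."""
    if not faq:
        raise ValueError("the FAQ library must contain at least one question-answer pair")
    # One similarity per library question, computed one by one.
    scores = [similarity(query, question) for question, _ in faq]
    best = max(range(len(faq)), key=lambda i: scores[i])
    return faq[best][1], scores[best]
```

Returning the best similarity together with the answer lets a later decision step compare it with a threshold.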

The text embedded representation model is explained here. It is trained mainly with a BERT model, which is trained by unsupervised learning. Seen from the outside, the training input is a massive amount of natural language text (an unsupervised model needs a large number of training samples because there is no manual labeling to provide prior knowledge), and the model learns from the text automatically during the training stage. In the use stage, the input is a natural language text of T words and the output is an embedded representation matrix of size T × D, where D is the predefined length of the embedding vector of each word.

The first similarity model is also explained here. Its training data set consists of triplets of a sample query sentence feature matrix, the feature matrix corresponding to a question in the sample frequently-asked-questions library, and a similarity label. In practice, the similarity between each sample query sentence and each frequently asked question in the sample library is labeled manually. The similarity label takes the value 1 or 0: 1 means that the labeled query sentence is consistent with the frequently asked question, and 0 means that it is unrelated to it. After all triplets of sample query sentence, frequently asked question and similarity label have been labeled manually, the sample query sentence and the frequently asked question in each triplet are converted into their feature matrices by the text embedded representation model. Each resulting triplet of sample query sentence feature matrix, question feature matrix and similarity label is one unit of training data in the training data set of the first similarity model.

Based on the above embodiment, in the method, inputting the text of the query sentence into the second reply unit, and outputting the second reply specifically includes:

Specifically, the second reply unit inputs the query sentence text into the text embedded representation model, which embeds the text into a vector space and outputs the query sentence feature matrix. The feature matrix is input into the question entity recognition model, which outputs the query sentence keywords. The keywords are filled into a pre-written template to generate a query statement, the query statement is executed against the knowledge graph database to obtain a query result, and the query result is filled into the natural language template to generate the second reply, which is the output of the second reply unit.

It should be noted here that the text embedded representation model used in this embodiment is the same as the text embedded representation model described in the previous embodiment, and the training mode and the using method are the same, so that the text embedded representation model only needs to be trained once, and can be used by the first response unit and the second response unit after the training is completed.

The question entity recognition model is explained here. FIG. 2 is an exemplary diagram of the sample data structure used when training the question entity recognition model provided by the present invention. As shown in FIG. 2, the first column is the input user question text and the second column is the label of each word. In use, the content of the first column is input and the model outputs the content of the second column, from which the entities and entity types (i.e., the keywords and their types) can then be extracted.
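The last step, recovering entities and their types from the per-word labels, can be sketched as follows. The labeling scheme is not stated in the text, so this sketch assumes the common BIO convention ("B-TYPE" opens an entity, "I-TYPE" continues it, "O" is outside any entity); the type names in the usage note are likewise hypothetical.

```python
from typing import List, Tuple


def extract_entities(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group per-word BIO labels into (entity text, entity type) pairs."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = None

    def flush():
        # Close the entity being built, if any.
        nonlocal current_tokens, current_type
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            flush()
            current_tokens, current_type = [token], label[2:]
        elif label.startswith("I-") and current_tokens and label[2:] == current_type:
            current_tokens.append(token)
        else:
            # "O", or an "I-" label that does not continue the open entity.
            flush()
    flush()
    return entities
```

For the words "what is the price of Model X" labeled O, O, O, B-ATTR, O, B-PRODUCT, I-PRODUCT, the function yields the keyword "price" of type ATTR and the keyword "Model X" of type PRODUCT.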

The knowledge graph database and its query method are also explained here. In this embodiment, knowledge is represented in the form of triples: one triple corresponds to two nodes and one edge in the knowledge graph, and multiple triples together form a graph structure. The triples are expressed in the Resource Description Framework (RDF), a machine-readable format that can be serialized as XML; different applications can share triples with one another through the standardized RDF format and thereby exchange knowledge. RDF includes three object types: Resources, Properties and Statements. A resource is anything described by RDF, such as a person, an organization or a city. Properties describe the characteristics of resources or the relationships between them, and for a particular resource a property generally has a corresponding property value. The resources and properties of RDF are named by Uniform Resource Identifiers (URIs), so that all resources can be unambiguously distinguished and stored. A statement contains a particular resource, a particular property and the corresponding property value, where the resource is the subject, the property is the predicate and the property value is the object, which may be a string or a resource. Although a triple is simple in form, triples can be combined into a semantic network that expresses a variety of complex relationships.
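To make the triple model concrete, the sketch below stores a few statements as plain Python tuples and looks them up by pattern. It only illustrates the data model: the `http://example.org/` namespace, the resource names and the `match` helper are hypothetical, and a real deployment would use an RDF store instead of a list.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

EX = "http://example.org/"  # hypothetical namespace used only in this sketch

TRIPLES: List[Triple] = [
    (EX + "PhoneX1", EX + "madeBy", EX + "AcmeCorp"),      # object is a resource
    (EX + "PhoneX1", EX + "batteryCapacity", "5000 mAh"),  # object is a string
    (EX + "AcmeCorp", EX + "locatedIn", EX + "Beijing"),   # AcmeCorp is now a subject
]


def match(triples: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return every triple that agrees with the given parts; None matches anything."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]
```

Because the object of one statement (`AcmeCorp`) is the subject of another, two lookups in a row walk two edges of the graph, which is how combined triples express more complex relationships.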

Apache Jena is an open-source Java framework for the semantic web. It contains a set of common semantic web functions and provides rich APIs and tools for modeling, storing and querying knowledge graphs. Its TDB component provides a high-performance persistent store for semantic web data, and its Fuseki component provides a server that allows SPARQL queries to be executed over HTTP. In this embodiment, therefore, templates are written in advance for generating query statements, the keywords are converted into a SPARQL query, and the knowledge graph database built on Apache Jena is queried.
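A minimal sketch of the template step is given below: recognized keywords are filled into a pre-written SPARQL template, and the query result is filled into a natural language template. The `ex:` prefix, the predicate names, the attribute table and the `run_query` callable (which stands for whatever code sends the query to the SPARQL endpoint) are all assumptions of this sketch, not details given by the invention.

```python
from string import Template
from typing import Callable, List, Optional

# Hypothetical mapping from recognized attribute keywords to predicate local names.
ATTRIBUTE_PREDICATES = {"price": "price", "battery capacity": "batteryCapacity"}

# Pre-written query template; string.Template is used because SPARQL itself needs braces.
QUERY_TEMPLATE = Template(
    "PREFIX ex: <http://example.org/product#>\n"
    "SELECT ?value WHERE {\n"
    '  ?product ex:name "$entity" .\n'
    "  ?product ex:$predicate ?value .\n"
    "}"
)

# Pre-written natural language template for the second reply.
ANSWER_TEMPLATE = Template("The $attribute of $entity is $value.")


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes so the keyword stays inside the literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_query(entity: str, attribute: str) -> Optional[str]:
    """Fill the keywords into the query template; None if the attribute is unknown."""
    predicate = ATTRIBUTE_PREDICATES.get(attribute)
    if predicate is None:
        return None
    return QUERY_TEMPLATE.substitute(entity=escape_literal(entity), predicate=predicate)


def second_reply(entity: str, attribute: str,
                 run_query: Callable[[str], List[str]]) -> Optional[str]:
    """Build the query, run it, and phrase the first result; None means an empty reply."""
    query = build_query(entity, attribute)
    if query is None:
        return None
    values = run_query(query)
    if not values:
        return None
    return ANSWER_TEMPLATE.substitute(attribute=attribute, entity=entity, value=values[0])
```

Returning `None` when the attribute is unknown or the query has no result gives a later decision step a clear signal that the second reply is empty.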

Based on the above embodiment, in the method, inputting the query sentence text into a third reply unit, and outputting a third reply specifically includes:

Specifically, the third reply unit screens the target text for the query sentence text out of the corpus according to text similarity. In this coarse screening, the target text is not limited to the single related text with the highest similarity; instead, the first m related texts in the sequence sorted by descending similarity are selected and spliced together to form the target text, where m is a preset number. The larger m is, the higher the complexity of the subsequent computation, but also the higher the accuracy of the third reply output by the third reply unit, so m should be chosen to balance computational complexity against reply accuracy. The target text and the query sentence text are input into a text embedding representation model, which outputs a corresponding target text feature matrix and query sentence feature matrix. These two feature matrices are input into a selection model, which outputs the starting point and the ending point of the answer text. The starting point and the ending point mark the two ends at which the target text is cut; the text between them is the answer that the third reply unit outputs as the third reply.
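The coarse screening step, which keeps the first m related texts and splices them into one target text, can be sketched in Python as follows. The function name `build_target_text`, the sample texts and scores, and the use of a single space as the splice separator are assumptions for illustration.

```python
def build_target_text(scored_texts, m):
    """Splice the m related texts with the highest similarity into one target text.

    scored_texts is a list of (similarity, text) pairs.
    """
    ranked = sorted(scored_texts, key=lambda pair: pair[0], reverse=True)
    # Assumption: texts are joined with one space, in descending similarity order.
    return " ".join(text for _, text in ranked[:m])


candidates = [
    (0.12, "Shipping takes three days."),
    (0.81, "The phone has a 6.1 inch screen."),
    (0.47, "The battery lasts two days."),
]
target_text = build_target_text(candidates, m=2)
```

A larger m lengthens the target text that the downstream models must process, which is the trade-off between computation and accuracy noted above.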

The selection model is also described here. The samples in the data set used to train the selection model have a nested structure and are stored in JSON format. The data set contains one key, "data", whose value is a list of samples. FIG. 3 is an exemplary diagram of the samples used in the selection model training process provided by the present invention. As shown in FIG. 3, each sample contains two keys and their corresponding values. One key is "context", and its value is a context sentence. The other key is "qas", and its value is a list in which each entry corresponds to one user question and its answer in the context sentence. Within an entry, the key "query" holds the user question, the key "id" holds the unique number of the question, and the key "answer" holds the start position of the answer in the context sentence (the value of the key "answer_start") and the text of the answer (the value of the key "text"). In summary, FIG. 3 shows an example of a sample in the training data set used to train the selection model.
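A sample with this key layout might look like the following Python sketch. The key names follow the description above, while the sentence, the question, the identifier value, and the helper `check_sample` are invented for illustration and are not taken from FIG. 3.

```python
import json

# Invented sample that follows the key layout described above.
dataset = {
    "data": [
        {
            "context": "The phone has a 6.1 inch screen and a two-day battery.",
            "qas": [
                {
                    "query": "How big is the screen?",
                    "id": "q-0001",
                    "answer": {"answer_start": 16, "text": "6.1 inch"},
                }
            ],
        }
    ]
}


def check_sample(sample):
    """Verify that every answer text occurs at its recorded start position."""
    context = sample["context"]
    for qa in sample["qas"]:
        start = qa["answer"]["answer_start"]
        text = qa["answer"]["text"]
        if context[start:start + len(text)] != text:
            return False
    return True


# Round-trip through JSON, the storage format of the data set.
serialized = json.dumps(dataset, ensure_ascii=False)
restored = json.loads(serialized)
all_valid = all(check_sample(sample) for sample in restored["data"])
```

Checking that "answer_start" and "text" agree with the context is a cheap way to catch annotation errors before training.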

When the selection model is trained, the inputs are the text of a user query sentence and the text of a context, and the labels are the starting point and the ending point (in one-hot form) of the answer, that is, the correct reply sentence, within the context. When the model is used, the inputs are the user question text and the context text, and the outputs are the starting point and the ending point (in one-hot form) of the answer within the context; the text between the starting point and the ending point is then taken out of the context as the third reply.
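How a one-hot starting point and ending point are turned into the third reply can be sketched as follows. The token-level representation, the inclusive ending index, and the function name `extract_answer` are assumptions; the rule that a coincident or reversed pair yields no answer follows the description of the machine-reading-comprehension output given elsewhere in this document.

```python
def extract_answer(context_tokens, start_one_hot, end_one_hot):
    """Return the text between the predicted points, or None if the span is invalid."""
    start = start_one_hot.index(1)
    end = end_one_hot.index(1)
    # A coincident or reversed pair is treated as "no answer".
    if end <= start:
        return None
    # Assumption: the ending point is inclusive.
    return " ".join(context_tokens[start:end + 1])


tokens = "The phone has a 6.1 inch screen".split()
start_vec = [0, 0, 0, 0, 1, 0, 0]
end_vec = [0, 0, 0, 0, 0, 1, 0]
third_reply = extract_answer(tokens, start_vec, end_vec)
no_reply = extract_answer(tokens, end_vec, start_vec)
```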

Based on the above embodiment, in the method, the determining the second similarity between the query sentence text and any relevant text in the pre-constructed corpus specifically includes:

Specifically, the method extracts TF-IDF features and then uses the cosine similarity formula to compute the similarity between the TF-IDF features of the query sentence and the TF-IDF features of any related text, which determines the similarity between the two text segments. To build the TF-IDF features, the TF-IDF value of each word in each text is calculated using the following formulas.

The TF value of the i-th word $w_i$ of the text library $D$ in the j-th text $d_j$ is defined by the following formula:

$$TF_{ij} = \frac{n_{i,j}}{\sum_k n_{k,j}}$$

where $n_{i,j}$ is the number of occurrences of the word $w_i$ in the text $d_j$, and $\sum_k n_{k,j}$ is the total number of words in the text $d_j$. The IDF value is defined by the following formula:

$$IDF_i = \log\frac{|D|}{|\{s_j \mid w_i \in s_j\}|}$$

where $|D|$ is the total number of text passages in the text library, and $|\{s_j \mid w_i \in s_j\}|$ is the total number of text passages $s_j$ that contain the word $w_i$. The TF-IDF value is defined by the following formula:

$$TF\text{-}IDF_{ij} = TF_{ij} \cdot IDF_i$$

For each text segment $d_j$, that is, the user question text and each text passage in the text library, a corresponding text vector can be calculated:

$$v_j = (TF\text{-}IDF_{0j}, TF\text{-}IDF_{1j}, \ldots, TF\text{-}IDF_{tj})$$

where t is the size of the vocabulary of the text library.
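The TF, IDF, and TF-IDF definitions above can be sketched in Python as follows. This is a minimal illustration: whitespace tokenization, the natural logarithm, and the function name `tfidf_vectors` are assumptions, and Chinese text would first need a word segmentation step that is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(passages):
    """Return (vocabulary, vectors); vectors[j][i] is the TF-IDF of word i in passage j."""
    # Assumption: words are separated by whitespace.
    tokenized = [passage.lower().split() for passage in passages]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    # Document frequency: number of passages that contain each word.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    n_passages = len(tokenized)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for an empty passage
        vectors.append([
            (counts[word] / total) * math.log(n_passages / doc_freq[word])
            for word in vocabulary
        ])
    return vocabulary, vectors


vocabulary, vectors = tfidf_vectors(["screen size screen", "battery size"])
```

In this small example the word "size" occurs in every passage, so its IDF is log(1) = 0 and it contributes nothing to either vector, which is the intended down-weighting of uninformative words.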

In use, the similarity between the feature vector of the user question and the feature vector of each text passage in the text library is computed pair by pair, and the k passages most similar to the user question are taken as the coarse screening result. For a given user question text $d_u$ and the j-th text passage $s_j$ in the text library, the similarity $r_j$ can be calculated with the following cosine similarity formula:

$$r_j = \frac{v_u \cdot v_j}{\|v_u\| \, \|v_j\|}$$

where $v_u$ and $v_j$ are the TF-IDF text vectors of $d_u$ and $s_j$, respectively.
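The cosine similarity computation and the selection of the k most similar passages can be sketched as follows. The vectors are small hand-written stand-ins for TF-IDF vectors, and the function names and the handling of all-zero vectors are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def top_k(question_vector, passage_vectors, k):
    """Return the indices of the k passages most similar to the question."""
    scores = [
        (cosine_similarity(question_vector, vector), index)
        for index, vector in enumerate(passage_vectors)
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scores[:k]]


question = [1.0, 0.0, 1.0]
passages = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
best = top_k(question, passages, k=2)
```

Because cosine similarity normalizes by vector length, a long passage is not favored over a short one merely for containing more words.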

Based on the above embodiment, in the method, processing replies of all reply units by using the preset decision rule, and outputting a reply sentence specifically includes:

Specifically, designing the fusion decision for the final reply sentence requires analyzing the output characteristics of question answering over the three knowledge sources. Knowledge-graph-based question answering: the knowledge graph does not necessarily contain the answer to the user's question, and the generated SPARQL query may return no result from the graph, so the output of this knowledge source falls into two cases, an answer or no answer. Similarity-matching-based question answering: the model produces a similarity value when matching the user question against each frequently asked question, so the frequently asked question with the highest similarity, which is finally output, carries its similarity to the user question; this knowledge source therefore always outputs an answer, accompanied by the corresponding similarity for the question-answer fusion module to consult. Machine-reading-comprehension-based question answering: the model predicts the starting point and the ending point of the answer, and no answer is output if the two points coincide or are in the wrong order, so the output of this knowledge source also falls into two cases, an answer or no answer. Based on these characteristics, the preset decision rules for fusing the similarity-matching-based reply, the knowledge-graph-based reply, and the machine-reading-comprehension-based reply are given in Table 1 below:

TABLE 1 Preset decision rules

The similarity threshold is set by a system administrator. Because the answers output by the question similarity matching module are written manually and are relatively rich in content, the answer selection strategy is biased toward them. To improve the richness of the system's answers while avoiding, as far as possible, answers that are irrelevant to the user's question, a threshold is used to adjust how likely the system is to use the output of the question similarity matching module. The lower the threshold, the more likely the system is to select the answer output by the question similarity matching module.
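A threshold-based fusion rule of this kind might be sketched as follows. The contents of Table 1 are not reproduced in this text, so the fallback order used here (knowledge graph, then machine reading comprehension, then the similarity-matching answer as a last resort) is purely an assumption, as is the function name `fuse`; only the preference for the similarity-matching answer above the threshold is taken from the description.

```python
def fuse(faq_answer, faq_similarity, kg_answer, mrc_answer, threshold):
    """Pick the final reply; None for kg_answer or mrc_answer means no output."""
    # The manually written similarity-matching answer is preferred
    # once its similarity reaches the administrator-set threshold.
    if faq_similarity >= threshold:
        return faq_answer
    # Assumed fallback order (not taken from Table 1): knowledge graph first.
    if kg_answer is not None:
        return kg_answer
    # Assumed next fallback: machine reading comprehension.
    if mrc_answer is not None:
        return mrc_answer
    # Assumed last resort: similarity matching always produces some answer.
    return faq_answer


reply = fuse("See our FAQ on screens.", 0.55, None, "6.1 inch", threshold=0.8)
```

Lowering `threshold` makes the first branch fire more often, which matches the stated effect that a lower threshold favors the similarity-matching answer.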

The query answering device supporting multiple knowledge sources provided by the present invention is described below, and the query answering device supporting multiple knowledge sources described below and the query answering method supporting multiple knowledge sources described above can be referred to correspondingly.

Fig. 4 is a schematic structural diagram of the query answering device supporting multiple knowledge sources according to the present invention. As shown in fig. 4, the device includes a determining unit 410, a reply subunit 420 and a fusion unit 430, wherein,

the determining unit 410 is configured to determine an inquiry sentence text;

the reply subunit 420 is configured to input the query sentence text into a plurality of reply units respectively to obtain a reply output by each reply unit, where the plurality of reply units are query reply units with different knowledge sources;

the fusion unit 430 is configured to process replies of all the reply units according to a preset decision rule, and output a reply sentence.
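The cooperation of the three units can be sketched in Python as follows. The class name, the whitespace normalization in the determining step, and the stub reply units and fusion rule are all hypothetical; they stand in for the knowledge-source-specific units and the preset decision rule of the embodiment.

```python
from typing import Callable, List, Optional

Reply = Optional[str]


class QueryAnsweringDevice:
    """Wires a determining step, several reply units, and a fusion unit together."""

    def __init__(self, reply_units: List[Callable[[str], Reply]],
                 fusion_unit: Callable[[List[Reply]], str]):
        self.reply_units = reply_units
        self.fusion_unit = fusion_unit

    def determine(self, raw_input: str) -> str:
        # Determining unit (assumption: only whitespace is normalized).
        return " ".join(raw_input.split())

    def answer(self, raw_input: str) -> str:
        text = self.determine(raw_input)
        # Reply subunit: every reply unit receives the same query sentence text.
        replies = [unit(text) for unit in self.reply_units]
        # Fusion unit: applies the decision rule to all replies.
        return self.fusion_unit(replies)


def first_available(replies: List[Reply]) -> str:
    """Stub decision rule: return the first reply that is not None."""
    for reply in replies:
        if reply is not None:
            return reply
    return "Sorry, no answer was found."


device = QueryAnsweringDevice(
    reply_units=[lambda q: None, lambda q: "stub answer to: " + q],
    fusion_unit=first_available,
)
result = device.answer("  How big   is the screen? ")
```

Passing the units in as callables keeps each knowledge source replaceable without touching the fusion logic.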

The query answering device supporting multiple knowledge sources provided by the invention determines the query sentence text; inputs the query sentence text into a plurality of reply units respectively to obtain the reply output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources; and processes the replies of all the reply units with a preset decision rule and outputs a reply sentence. In this way, reply units with different knowledge sources answer the input query sentence and produce multiple replies, and the preset decision rule is then used to select from or fuse these replies into the final reply sentence, which is output. Therefore, the device provided by the invention realizes the optimal response to the user query sentence based on multiple knowledge sources.

On the basis of the above-described embodiment, in the apparatus, the plurality of reply units include a first reply unit, a second reply unit, and a third reply unit, and, correspondingly,

On the basis of the above embodiment, the apparatus, inputting the query sentence text into a first reply unit, and outputting a first reply, specifically includes:

On the basis of the above embodiment, the apparatus, inputting the query sentence text into a second reply unit, and outputting a second reply, specifically includes:

On the basis of the above embodiment, the apparatus, inputting the query sentence text into a third reply unit, and outputting a third reply, specifically includes:

On the basis of the foregoing embodiment, in the apparatus, the determining a second similarity between the query sentence text and any relevant text in the pre-constructed corpus specifically includes:

On the basis of the foregoing embodiment, in the apparatus, the fusion unit is specifically configured to:

Fig. 5 is a schematic physical structure diagram of an electronic device provided in the present invention, and as shown in fig. 5, the electronic device may include: a processor (processor)510, a communication Interface (Communications Interface)520, a memory (memory)530 and a communication bus 540, wherein the processor 510, the communication Interface 520 and the memory 530 communicate with each other via the communication bus 540. Processor 510 may invoke logic instructions in memory 530 to perform a query reply method that supports multiple knowledge sources, the method comprising: determining an inquiry sentence text; respectively inputting the query sentence texts into a plurality of reply units to obtain replies output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources; and processing the replies of all the reply units by a preset decision rule and outputting reply sentences.

Furthermore, the logic instructions in the memory 530 may be implemented in the form of software functional units and, when sold or used as independent products, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.

In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the multi-knowledge-source-enabled query response method provided by the above methods, the method comprising: determining an inquiry sentence text; respectively inputting the query sentence texts into a plurality of reply units to obtain replies output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources; and processing the replies of all the reply units by a preset decision rule and outputting reply sentences.

In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a query-response method supporting multiple knowledge sources provided by the above methods, the method comprising: determining an inquiry sentence text; respectively inputting the query sentence texts into a plurality of reply units to obtain replies output by each reply unit, wherein the plurality of reply units are query reply units with different knowledge sources; and processing the replies of all the reply units by a preset decision rule and outputting reply sentences.

The above-described server embodiments are only illustrative, and the units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A query answering method supporting multiple knowledge sources, comprising:

determining an inquiry sentence text;

2. The query response method supporting multiple knowledge sources according to claim 1, wherein said plurality of response units includes a first response unit, a second response unit and a third response unit, respectively,

3. The query response method supporting multiple knowledge sources according to claim 2, wherein the inputting of the query sentence text into the first response unit and the outputting of the first response unit specifically include:

4. The query response method supporting multiple knowledge sources according to claim 3, wherein the step of inputting the query sentence text into a second response unit and outputting a second response includes:

5. The query response method supporting multiple knowledge sources according to claim 4, wherein the step of inputting the query sentence text into a third response unit and outputting a third response includes:

6. The query response method supporting multiple knowledge sources according to claim 5, wherein the determining the second similarity between the query sentence text and any related text in the pre-constructed corpus specifically comprises:

7. The multi-knowledge-source-capable query response method according to any one of claims 3 to 6, wherein the processing of the responses of all the response units with the preset decision rule and the outputting of the response sentence specifically include:

8. A query answering device supporting multiple knowledge sources, comprising:

a determination unit for determining an inquiry sentence text;

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the multi-knowledge-source-enabled query-reply method according to any one of claims 1 to 7 when executing the program.

10. A non-transitory computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the multi-knowledge-source-enabled query response method according to any one of claims 1 to 7.