CN111488431A - Hit determination method, device and system - Google Patents


Publication number
CN111488431A
Authority
CN
China
Prior art keywords
round
question
user
answer
answering
Prior art date
Legal status
Granted
Application number
CN202010269621.5A
Other languages
Chinese (zh)
Other versions
CN111488431B
Inventor
宋雨
Current Assignee
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN202010269621.5A
Publication of CN111488431A
Application granted
Publication of CN111488431B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3331 Query processing

Abstract

The invention provides a hit determination method, device, and system. The method comprises: acquiring one round of question-answer data between a question-answering robot and a user; extracting a question-answer feature set from the question-answer data, wherein the question-answer feature set comprises: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round user question and the second-round user question; and judging the question-answer feature set based on a predetermined decision-tree hit model, and determining a result indicating whether the question-answering robot hit. The invention can provide a more accurate result as to whether the answer output by the question-answering robot hits the user question.

Description

Hit determination method, device and system
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method, an apparatus, and a system for determining a hit.
Background
The general processing logic of a question-answering robot is to search a knowledge base for the user question and return the set of answers that meet the conditions. A background engine scores every answer in the answer set, and the answer with the highest score is selected as the output answer.
If the score of the output answer is greater than a threshold, the question-answering robot is deemed to have hit the user question; if the score is less than the threshold, the robot is deemed to have missed.
However, user questions vary widely, so an output answer that the robot believes to be a hit may not actually be a complete answer for the user. The prior-art scheme of judging a hit by the score alone therefore ignores the real user's acceptance of the output answer, and its accuracy is low.
Disclosure of Invention
In view of this, the present application provides a hit determination method, device, and system, which can provide a more accurate result as to whether an answer output by the question-answering robot hits the user question.
In order to achieve the above object, the present invention provides the following technical features:
a hit determination method, comprising:
acquiring one round of question-answer data between a question-answering robot and a user;
extracting a question-answer feature set from the question-answer data, wherein the question-answer feature set comprises: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round user question and the second-round user question; and
judging the question-answer feature set based on a predetermined decision-tree hit model, and determining a result indicating whether the question-answering robot hit.
Optionally, before acquiring the question-answer data between the question-answering robot and the user, the method further includes:
acquiring multiple rounds of question-answer data between the question-answering robot and different users;
extracting a plurality of question-answer feature sets from the multiple rounds of question-answer data, together with the corresponding results indicating hit or miss;
training a decision tree model based on the plurality of question-answer feature sets and the corresponding plurality of results; and
obtaining the decision-tree hit model after training is finished.
Optionally, the one round of question-answer data includes:
a first-round user question; and
the score of the answer output by the question-answering robot in the first round; and
the user evaluation of the first-round output answer; and
a second-round user question.
Optionally, the extracting a question-answer feature set from the question-answer data includes:
performing feature extraction on the score of the first-round output answer of the question-answering robot, and determining the score feature of the first-round output answer;
performing feature extraction on the user evaluation of the first-round output answer, and determining the user-evaluation feature of the first-round output answer;
recognizing the conversation type of the second-round user question with an intention recognition model, and constructing the conversation-type feature of the second-round user question;
in the case that the conversation type of the second-round user question is a non-service type, recognizing the second-round user question with an emotion recognition model, and determining the emotional-state feature of the second-round user question; and
in the case that the conversation type of the second-round user question is a service type, calculating the similarity between the first-round user question and the second-round user question, and constructing the similarity feature between them.
Optionally, the method further includes:
determining a plurality of results of the question-answering robot using the method of claim 1; and
calculating the hit rate based on the number of hit results and the number of all results.
A hit determination apparatus, comprising:
the system comprises an acquisition unit, a processing unit and a control unit, wherein the acquisition unit is used for acquiring primary question and answer data between a question and answer robot and a user;
the extracting unit is used for extracting a question-answer characteristic set from the question-answer data; wherein the set of question-answering characteristics comprises: the question-answering robot outputs scoring value characteristics of answers for the first round, user evaluation characteristics of the answers for the first round, conversation type characteristics of user questions for the second round, emotional state characteristics of the user questions for the second round, and similarity characteristics of the user questions for the first round and the user questions for the second round;
and the determining unit is used for judging the question-answering feature set based on a predetermined decision tree hit model and determining whether the question-answering robot hits the result.
Optionally, the apparatus further includes, before the acquisition unit:
a training unit, configured to acquire multiple rounds of question-answer data between the question-answering robot and different users; extract a plurality of question-answer feature sets from the multiple rounds of question-answer data, together with the corresponding results indicating hit or miss; train a decision tree model based on the plurality of question-answer feature sets and the corresponding plurality of results; and obtain the decision-tree hit model after training is finished.
Optionally, the one round of question-answer data includes:
a first-round user question; and
the score of the answer output by the question-answering robot in the first round; and
the user evaluation of the first-round output answer; and
a second-round user question.
Optionally, the extraction unit includes:
a first extraction unit, configured to perform feature extraction on the score of the first-round output answer of the question-answering robot, and determine the score feature of the first-round output answer;
a second extraction unit, configured to perform feature extraction on the user evaluation of the first-round output answer, and determine the user-evaluation feature of the first-round output answer;
a third extraction unit, configured to recognize the conversation type of the second-round user question with an intention recognition model, and construct the conversation-type feature of the second-round user question;
a fourth extraction unit, configured to recognize the second-round user question with an emotion recognition model in the case that its conversation type is a non-service type, and determine the emotional-state feature of the second-round user question; and
a fifth extraction unit, configured to calculate the similarity between the first-round user question and the second-round user question in the case that the conversation type of the second-round user question is a service type, and construct the similarity feature between them.
Optionally, the apparatus further includes:
a hit-rate calculation unit, configured to determine a plurality of results of the question-answering robot and calculate the hit rate based on the number of hit results and the number of all results.
Through the above technical means, the following beneficial effects can be achieved:
The invention provides a hit determination method that does not determine a hit by the score alone. Instead, a question-answer feature set covering several aspects is extracted from the question-answer data: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round user question and the second-round user question.
The question-answer feature set is judged with a predetermined decision-tree hit model to determine whether the question-answering robot hit. Because the judgment integrates multiple dimensions, the result indicating whether the question-answering robot hit is more accurate.
Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for training a decision tree hit model according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a decision tree hit model according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of a hit determination method disclosed in an embodiment of the present application;
fig. 4 is a schematic structural diagram of a hit determining apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an electronic device disclosed in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments derived by those skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
Referring to fig. 1, the present invention provides a training method of a decision tree hit model, including:
step S101: and acquiring multiple times of question and answer data between the question and answer robot and different users.
Multiple rounds of question-answer data between the question-answering robot and different users are collected; after processing, they can be used as training samples.
Each piece of question-answer data may include: the first-round user question, the score of the answer output by the question-answering robot in the first round, the user evaluation of the first-round output answer, and the second-round user question.
The first-round user question entered by the user can be collected from the question input interface. While answering, the question-answering robot scores the first-round output answer, and this score is collected.
After outputting the first-round answer, the question-answering robot displays rating options (satisfied, dissatisfied) for the user. If the user clicks "satisfied", the user evaluation is satisfied; if the user clicks "dissatisfied", the user evaluation is dissatisfied; if the user does not rate, the user evaluation is null.
After the first-round output answer is displayed, the user decides whether to ask a second-round question according to the actual situation. If there is a second round of questioning, the second-round user question is collected; if not, the second-round user question is empty.
It should be understood that one piece of question-answer data includes at least one round of question-answering, and most question-answer data will be followed by a second round. If there is a second round of question-answering, there is a second-round user question; otherwise the second-round user question is empty.
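One piece of question-answer data described above can be represented as a simple record. A minimal Python sketch follows; the `QARound` type and its field names are illustrative assumptions, not taken from the patent:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QARound:
    """One piece of question-answer data (field names are illustrative)."""
    first_question: str             # first-round user question
    answer_score: float             # engine score of the first-round output answer
    user_evaluation: Optional[str]  # "satisfied", "dissatisfied", or None if not rated
    second_question: Optional[str]  # second-round user question, or None if absent

sample = QARound(
    first_question="How do I reset my card PIN?",
    answer_score=0.82,
    user_evaluation="dissatisfied",
    second_question="That is not what I asked, how do I reset the PIN?",
)
```

A record with `second_question=None` covers the case where the user does not ask again.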
Step S102: extracting a plurality of question-answer feature sets from the multiple rounds of question-answer data, together with the corresponding results indicating hit or miss.
Since each piece of question-answer data is processed in the same way, one piece of question-answer data is taken as an example for detailed description.
Perform feature extraction on the score of the first-round output answer of the question-answering robot, and determine the score feature of the first-round output answer.
Perform feature extraction on the user evaluation of the first-round output answer, and determine the user-evaluation feature of the first-round output answer.
Recognize the conversation type of the second-round user question with an intention recognition model, and construct the conversation-type feature of the second-round user question.
In the case that the conversation type of the second-round user question is a non-service type, recognize the second-round user question with an emotion recognition model, and determine the emotional-state feature of the second-round user question.
In the case that the conversation type of the second-round user question is a service type, calculate the similarity between the first-round user question and the second-round user question, and construct the similarity feature between them.
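The patent does not specify which similarity measure is used between the first-round and second-round user questions; the sketch below uses token-set Jaccard overlap purely as an illustrative stand-in:

```python
def jaccard_similarity(q1: str, q2: str) -> float:
    """Jaccard overlap of the two questions' token sets (0.0 to 1.0).

    Illustrative stand-in: the patent only says "calculate the similarity"
    without naming a metric."""
    t1, t2 = set(q1.lower().split()), set(q2.lower().split())
    if not t1 or not t2:
        return 0.0
    return len(t1 & t2) / len(t1 | t2)

sim = jaccard_similarity("how do i reset my pin", "how can i reset the pin")  # 0.5
```

A high similarity between the two rounds suggests the user is repeating the question, which is evidence that the first-round answer missed.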
Whether the first-round output answer is a hit is manually rechecked: if it hit, a hit result is given; if not, a miss result is given. A hit may be denoted by "1" and a miss by "0".
The combination of [score feature, user-evaluation feature, conversation-type feature, emotional-state feature, similarity feature] and the result, i.e. the question-answer feature set plus its label, is taken as one training sample. A plurality of training samples can be obtained in the same way.
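Assembling the feature set and the manually reviewed label into a training sample can be sketched as follows; the numeric encoding of the user evaluation (satisfied/dissatisfied/null) is an assumption for illustration, since the patent does not fix one:

```python
# Assumed encoding of the user evaluation; the patent does not specify one.
EVAL_CODE = {"satisfied": 1, "dissatisfied": -1, None: 0}

def build_sample(score, evaluation, dialog_type, emotion, similarity, hit):
    """Combine [score, evaluation, dialog-type, emotion, similarity] features
    with the manually reviewed label (1 = hit, 0 = miss) into one sample."""
    features = [score, EVAL_CODE[evaluation], dialog_type, emotion, similarity]
    return features, int(hit)

x, y = build_sample(0.82, "dissatisfied", 1, 0, 0.5, hit=False)  # x=[0.82, -1, 1, 0, 0.5], y=0
```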
Step S103: training a decision tree model based on the plurality of question-answer feature sets and the corresponding plurality of results.
The decision tree model is trained on the plurality of training samples, which may include the following steps:
S1: traverse every candidate split to find the best split point over the training samples;
S2: split the training samples into two subsets, N1 and N2;
repeat steps S1 and S2 on N1 and N2 respectively until a stopping condition is reached.
The decision-tree training process follows a greedy strategy: only the best split for the current set of training samples is considered, and no backtracking is performed. For the whole training-sample set, candidate splits are made on every feature attribute, the purities of the resulting subsets are compared, and the feature attribute yielding the purest split is chosen to divide the data set; the iteration continues until the final result is obtained.
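The greedy split selection in steps S1-S2 can be sketched with Gini impurity as the purity measure (the patent says only "purity"; Gini is an assumption, and `best_split` is an illustrative helper, not the patent's implementation):

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (lower means purer)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def best_split(samples):
    """Greedily pick the (feature, threshold) split whose children are purest.

    samples is a list of (feature_vector, label) pairs; no backtracking is
    done, mirroring the greedy strategy described above."""
    best, best_impurity = None, float("inf")
    n_features = len(samples[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in samples}):
            left = [y for x, y in samples if x[f] <= threshold]
            right = [y for x, y in samples if x[f] > threshold]
            if not left or not right:
                continue  # degenerate split, skip
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
            if weighted < best_impurity:
                best_impurity, best = weighted, (f, threshold)
    return best
```

Recursing `best_split` on each of the two resulting subsets N1 and N2 until a stopping condition (e.g. pure leaves or a maximum depth) yields the decision tree.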
Step S104: obtaining the decision-tree hit model after training is finished.
Step S105: performing model evaluation and continuous optimization on the decision-tree hit model to determine the final decision-tree hit model.
Referring to fig. 2, a schematic diagram of a hit model of a decision tree after training is completed is shown.
Referring to fig. 3, the present invention provides a hit determination method, including:
step S301: and acquiring one-time question and answer data between the question and answer robot and the user.
During online operation, the first-round user question posed to the question-answering robot, the score of the first-round output answer, the user evaluation of the first-round output answer, and the second-round user question can all be collected.
Step S302: extracting a question-answer feature set from the question-answer data, wherein the question-answer feature set comprises: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round user question and the second-round user question.
Perform feature extraction on the score of the first-round output answer of the question-answering robot, and determine the score feature of the first-round output answer.
Perform feature extraction on the user evaluation of the first-round output answer, and determine the user-evaluation feature of the first-round output answer.
Recognize the conversation type of the second-round user question with an intention recognition model, and construct the conversation-type feature of the second-round user question.
In the case that the conversation type of the second-round user question is a non-service type, recognize the second-round user question with an emotion recognition model, and determine the emotional-state feature of the second-round user question.
In the case that the conversation type of the second-round user question is a service type, calculate the similarity between the first-round user question and the second-round user question, and construct the similarity feature between them.
Take [score feature, user-evaluation feature, conversation-type feature, emotional-state feature, similarity feature] as the input features.
Step S303: judging the question-answer feature set based on the predetermined decision-tree hit model, and determining a result indicating whether the question-answering robot hit.
Input the [score feature, user-evaluation feature, conversation-type feature, emotional-state feature, similarity feature] into the decision-tree hit model; after computation, the model outputs a hit or miss result.
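Applying the trained tree at step S303 amounts to walking from the root to a leaf. In the sketch below, the node layout, tuples for internal nodes and bare strings for leaves, is an illustrative assumption:

```python
def predict_hit(node, features):
    """Descend a decision tree: internal nodes are
    (feature_index, threshold, left_child, right_child) tuples, leaves are
    bare "hit"/"miss" labels (an assumed layout, for illustration)."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if features[f] <= threshold else right
    return node

# Toy tree: first split on the score feature, then on the similarity feature.
tree = (0, 0.5, "miss", (4, 0.7, "hit", "miss"))
result = predict_hit(tree, [0.82, -1, 1, 0, 0.5])  # score > 0.5, similarity <= 0.7 -> "hit"
```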
Step S304: determining a plurality of results of the question-answering robot using steps S301-S303, and calculating the hit rate based on the number of hit results and the number of all results.
During the continuous operation of the question-answering robot, results are continually determined through steps S301-S303; after a period of time, a number of results accumulate that can characterize the overall performance of the question-answering robot.
The ratio of the number of hit results to the number of all results is taken as the hit rate. When the hit rate is low, the algorithm of the question-answering robot can be adjusted.
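The hit-rate computation of step S304 is a simple ratio; a minimal sketch:

```python
def hit_rate(results):
    """Ratio of hit results to all results, as described in step S304."""
    if not results:
        return 0.0  # no results yet: define the rate as zero
    return sum(1 for r in results if r == "hit") / len(results)

rate = hit_rate(["hit", "hit", "miss", "hit"])  # 0.75
```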
Through the above technical means, the following beneficial effects can be achieved:
The invention provides a hit determination method that does not determine a hit by the score alone. Instead, a question-answer feature set covering several aspects is extracted from the question-answer data: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round user question and the second-round user question.
The question-answer feature set is judged with a predetermined decision-tree hit model to determine whether the question-answering robot hit. Because the judgment integrates multiple dimensions, the result indicating whether the question-answering robot hit is more accurate.
Referring to fig. 4, the present invention provides a hit determining apparatus, including:
an acquisition unit 41, configured to acquire one round of question-answer data between the question-answering robot and a user;
an extraction unit 42, configured to extract a question-answer feature set from the question-answer data, wherein the question-answer feature set comprises: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round user question and the second-round user question; and
a determining unit 43, configured to judge the question-answer feature set based on a predetermined decision-tree hit model and determine a result indicating whether the question-answering robot hit.
Optionally, the apparatus further includes, before the acquisition unit 41:
a training unit 44, configured to acquire multiple rounds of question-answer data between the question-answering robot and different users; extract a plurality of question-answer feature sets from the multiple rounds of question-answer data, together with the corresponding results indicating hit or miss; train a decision tree model based on the plurality of question-answer feature sets and the corresponding plurality of results; and obtain the decision-tree hit model after training is finished.
Wherein the one round of question-answer data includes:
a first-round user question; and
the score of the answer output by the question-answering robot in the first round; and
the user evaluation of the first-round output answer; and
a second-round user question.
Wherein the extraction unit includes:
a first extraction unit, configured to perform feature extraction on the score of the first-round output answer of the question-answering robot, and determine the score feature of the first-round output answer;
a second extraction unit, configured to perform feature extraction on the user evaluation of the first-round output answer, and determine the user-evaluation feature of the first-round output answer;
a third extraction unit, configured to recognize the conversation type of the second-round user question with an intention recognition model, and construct the conversation-type feature of the second-round user question;
a fourth extraction unit, configured to recognize the second-round user question with an emotion recognition model in the case that its conversation type is a non-service type, and determine the emotional-state feature of the second-round user question; and
a fifth extraction unit, configured to calculate the similarity between the first-round user question and the second-round user question in the case that the conversation type of the second-round user question is a service type, and construct the similarity feature between them.
The hit determination apparatus further includes:
a hit-rate calculation unit 45, configured to determine a plurality of results of the question-answering robot and calculate the hit rate based on the number of hit results and the number of all results.
Through the above technical means, the following beneficial effects can be achieved:
The invention provides a hit determination method that does not determine a hit by the score alone. Instead, a question-answer feature set covering several aspects is extracted from the question-answer data: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round user question and the second-round user question.
The question-answer feature set is judged with a predetermined decision-tree hit model to determine whether the question-answering robot hit. Because the judgment integrates multiple dimensions, the result indicating whether the question-answering robot hit is more accurate.
Referring to fig. 5, the present invention provides an electronic device including:
a memory for storing a software program;
a processor for executing the software program and implementing:
acquiring one round of question-answer data between a question-answering robot and a user;
extracting a question-answer feature set from the question-answer data, wherein the question-answer feature set comprises: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round user question and the second-round user question; and
judging the question-answer feature set based on a predetermined decision-tree hit model, and determining a result indicating whether the question-answering robot hit.
Wherein, before acquiring the one session of question-and-answer data between the question-answering robot and the user, the method further comprises:
acquiring multiple sessions of question-and-answer data between the question-answering robot and different users;
extracting a plurality of question-answer feature sets from the multiple sessions, together with the corresponding results indicating whether each session was a hit;
training a decision tree model based on the plurality of question-answer feature sets and the corresponding plurality of results;
and obtaining the decision-tree hit model once training is complete.
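The offline training step above can be sketched with scikit-learn's `DecisionTreeClassifier` (an assumption for illustration — the patent does not name a library), treating each session's feature set as one labelled row:

```python
# Training sketch: fit a decision tree on historical question-answer
# feature sets labelled hit (1) / miss (0). The numeric encoding of
# the five features below is an assumption, not taken from the patent.
from sklearn.tree import DecisionTreeClassifier

# Each row: [answer score, user evaluation (1=positive),
#            conversation type (1=service), negative emotion (1=yes),
#            first/second-round question similarity]
X = [
    [0.95, 1, 1, 0, 0.10],  # good score, new topic        -> hit
    [0.90, 1, 0, 0, 0.00],  # calm chit-chat follow-up     -> hit
    [0.40, 0, 1, 0, 0.92],  # repeated service question    -> miss
    [0.30, 0, 0, 1, 0.50],  # angry non-service follow-up  -> miss
]
y = [1, 1, 0, 0]

model = DecisionTreeClassifier(random_state=0).fit(X, y)
# Classify a new session's feature set with the trained model.
print(model.predict([[0.35, 0, 1, 0, 0.88]]))
```

Once trained, the model plays the role of the "decision-tree hit model" that the online judgment step consults for each new session.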
Wherein the one session of question-and-answer data comprises:
the first-round user question; and
the score of the answer output by the question-answering robot in the first round; and
the user's evaluation of the first-round output answer; and
the second-round user question.
Wherein extracting the question-answer feature set from the question-and-answer data comprises:
performing feature extraction on the score of the answer output by the question-answering robot in the first round, to determine the score feature of the first-round output answer;
performing feature extraction on the user's evaluation of the first-round output answer, to determine the user-evaluation feature of the first-round output answer;
recognizing the conversation type of the second-round user question with an intention recognition model, and constructing the conversation-type feature of the second-round user question;
when the conversation type of the second-round user question is a non-service type, recognizing the second-round user question with an emotion recognition model, and determining the emotional-state feature of the second-round user question;
and when the conversation type of the second-round user question is a service type, calculating the similarity between the first-round and second-round user questions, and constructing the similarity feature between them.
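The similarity computation in the last step is not pinned to a particular measure in the text; as a stand-in, Python's standard-library `difflib` gives a 0-to-1 character-level ratio:

```python
# Hypothetical similarity measure for the first- and second-round
# user questions; the patent does not specify one, so difflib's
# character-level ratio serves as a stand-in here.
from difflib import SequenceMatcher

def question_similarity(q1: str, q2: str) -> float:
    """Return a similarity score in [0, 1] for two questions."""
    return SequenceMatcher(None, q1, q2).ratio()

# A near-repeat second-round question scores high, suggesting the
# first-round answer may have missed.
s = question_similarity("How do I reset my card PIN?",
                        "How can I reset my card PIN?")
print(round(s, 2))
```

In practice a semantic measure (e.g. embedding cosine similarity) would likely replace this character-level ratio, but the role of the feature is the same.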
Wherein the method further comprises:
determining a plurality of results of the question-answering robot, and calculating the hit rate based on the number of hit results and the total number of results.
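The hit-rate step above reduces to a single ratio; a minimal sketch:

```python
def hit_rate(results: list) -> float:
    """Hit rate = number of hit results / total number of results."""
    if not results:
        return 0.0  # no sessions judged yet
    return sum(1 for r in results if r) / len(results)

print(hit_rate([True, True, False, True]))  # → 0.75
```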
If the functions described in the method of this embodiment are implemented as software functional units and sold or used as independent products, they may be stored in a storage medium readable by a computing device. Based on this understanding, the part of the embodiments of the present application that contributes over the prior art, or part of the technical solution, may be embodied as a software product stored in a storage medium and including several instructions that cause a computing device (which may be a personal computer, a server, a mobile computing device, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiments are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts among the embodiments may be referred to one another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of hit determination, comprising:
acquiring one session of question-and-answer data between a question-answering robot and a user;
extracting a question-answer feature set from the question-and-answer data; wherein the question-answer feature set comprises: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round and second-round user questions;
and evaluating the question-answer feature set based on a predetermined decision-tree hit model, to determine whether the question-answering robot hit.
2. The method of claim 1, further comprising, before acquiring the one session of question-and-answer data between the question-answering robot and the user:
acquiring multiple sessions of question-and-answer data between the question-answering robot and different users;
extracting a plurality of question-answer feature sets from the multiple sessions, together with the corresponding results indicating whether each session was a hit;
training a decision tree model based on the plurality of question-answer feature sets and the corresponding plurality of results;
and obtaining the decision-tree hit model once training is complete.
3. The method of claim 1, wherein the one session of question-and-answer data comprises:
the first-round user question; and
the score of the answer output by the question-answering robot in the first round; and
the user's evaluation of the first-round output answer; and
the second-round user question.
4. The method of claim 3, wherein extracting the question-answer feature set from the question-and-answer data comprises:
performing feature extraction on the score of the answer output by the question-answering robot in the first round, to determine the score feature of the first-round output answer;
performing feature extraction on the user's evaluation of the first-round output answer, to determine the user-evaluation feature of the first-round output answer;
recognizing the conversation type of the second-round user question with an intention recognition model, and constructing the conversation-type feature of the second-round user question;
when the conversation type of the second-round user question is a non-service type, recognizing the second-round user question with an emotion recognition model, and determining the emotional-state feature of the second-round user question;
and when the conversation type of the second-round user question is a service type, calculating the similarity between the first-round and second-round user questions, and constructing the similarity feature between them.
5. The method of claim 1, further comprising:
determining a plurality of results of the question-answering robot using the method of claim 1;
and calculating the hit rate based on the number of hit results and the total number of results.
6. A hit determination apparatus, comprising:
an acquisition unit, configured to acquire one session of question-and-answer data between a question-answering robot and a user;
an extraction unit, configured to extract a question-answer feature set from the question-and-answer data; wherein the question-answer feature set comprises: the score feature of the answer output by the question-answering robot in the first round, the user-evaluation feature of the first-round output answer, the conversation-type feature of the second-round user question, the emotional-state feature of the second-round user question, and the similarity feature between the first-round and second-round user questions;
and a determination unit, configured to evaluate the question-answer feature set based on a predetermined decision-tree hit model, and determine whether the question-answering robot hit.
7. The apparatus of claim 6, further comprising:
a training unit, configured to acquire multiple sessions of question-and-answer data between the question-answering robot and different users; extract a plurality of question-answer feature sets from the multiple sessions, together with the corresponding results indicating whether each session was a hit; train a decision tree model based on the plurality of question-answer feature sets and the corresponding plurality of results; and obtain the decision-tree hit model once training is complete.
8. The apparatus of claim 6, wherein the one session of question-and-answer data comprises:
the first-round user question; and
the score of the answer output by the question-answering robot in the first round; and
the user's evaluation of the first-round output answer; and
the second-round user question.
9. The apparatus of claim 6, wherein the extraction unit comprises:
a first extraction unit, configured to perform feature extraction on the score of the answer output by the question-answering robot in the first round, and determine the score feature of the first-round output answer;
a second extraction unit, configured to perform feature extraction on the user's evaluation of the first-round output answer, and determine the user-evaluation feature of the first-round output answer;
a third extraction unit, configured to recognize the conversation type of the second-round user question with an intention recognition model, and construct the conversation-type feature of the second-round user question;
a fourth extraction unit, configured to recognize the second-round user question with an emotion recognition model when its conversation type is a non-service type, and determine the emotional-state feature of the second-round user question;
and a fifth extraction unit, configured to calculate the similarity between the first-round and second-round user questions when the conversation type of the second-round user question is a service type, and construct the similarity feature between them.
10. The apparatus of claim 6, further comprising:
a hit-rate calculation unit, configured to determine a plurality of results of the question-answering robot, and calculate the hit rate based on the number of hit results and the total number of results.
CN202010269621.5A 2020-04-08 2020-04-08 Hit determination method, device and system Active CN111488431B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010269621.5A CN111488431B (en) 2020-04-08 2020-04-08 Hit determination method, device and system


Publications (2)

Publication Number Publication Date
CN111488431A true CN111488431A (en) 2020-08-04
CN111488431B CN111488431B (en) 2023-03-21

Family

ID=71797716

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010269621.5A Active CN111488431B (en) 2020-04-08 2020-04-08 Hit determination method, device and system

Country Status (1)

Country Link
CN (1) CN111488431B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110145333A1 (en) * 2009-12-14 2011-06-16 International Business Machines Corporation Method and Apparatus for Enhancing Compound Documents with Questions and Answers
CN103810218A (en) * 2012-11-14 2014-05-21 北京百度网讯科技有限公司 Problem cluster-based automatic asking and answering method and device
CN106776532A (en) * 2015-11-25 2017-05-31 中国移动通信集团公司 A kind of knowledge question answering method and device
CN108073587A (en) * 2016-11-09 2018-05-25 阿里巴巴集团控股有限公司 A kind of automatic question-answering method, device and electronic equipment
CN108536852A (en) * 2018-04-16 2018-09-14 上海智臻智能网络科技股份有限公司 Question and answer exchange method and device, computer equipment and computer readable storage medium


Also Published As

Publication number Publication date
CN111488431B (en) 2023-03-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant