CN110737759A - Evaluation method and device for customer service robot, computer equipment and storage medium

Evaluation method and device for customer service robot, computer equipment and storage medium

Info

Publication number
CN110737759A
CN110737759A
Authority
CN
China
Prior art keywords
customer service
user
answer
answers
accuracy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910843141.2A
Other languages
Chinese (zh)
Other versions
CN110737759B (en)
Inventor
彭晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN201910843141.2A priority Critical patent/CN110737759B/en
Publication of CN110737759A publication Critical patent/CN110737759A/en
Application granted granted Critical
Publication of CN110737759B publication Critical patent/CN110737759B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Abstract

The invention discloses an evaluation method, an evaluation device, computer equipment and a storage medium of a customer service robot, which are applied to the technical field of customer service robots and used for solving the problem of low accuracy of evaluation results of the customer service robot.

Description

Evaluation method and device for customer service robot, computer equipment and storage medium
Technical Field
The invention relates to the technical field of customer service robots, in particular to an evaluation method and device of customer service robots, computer equipment and a storage medium.
Background
In the market, many users encounter difficulties when using a company's products and turn to the company's customer service for answers. The customer service robot is therefore particularly important in customer service work, and users' evaluation of a company's products is reflected in the service the robot provides: if the robot's answer differs greatly from the user's question, the user's evaluation will naturally be low. The accuracy requirement on the robot's replies is therefore high, and, based on this, accurately evaluating the answers the customer service robot returns for users' questions is extremely important.
At present, the accuracy of a customer service robot is evaluated directly on the corpus used for training the model, which makes it difficult to reflect the real effect of the robot's FAQ (frequently asked questions) model. Moreover, because the robot's corpus is large and time and manpower are limited, it is difficult to evaluate all the corpora.
Therefore, finding a method for improving the evaluation accuracy of the customer service robot has become an urgent problem for those skilled in the art.
Disclosure of Invention
The embodiments of the invention provide an evaluation method and device for a customer service robot, computer equipment and a storage medium, aiming to solve the problem of low accuracy in evaluating customer service robots.
An evaluation method for a customer service robot comprises the following steps:
acquiring the user questions in a verification corpus set, wherein each record in the verification corpus set comprises a pre-collected user question and the expected answer corresponding to the user question;
calling a customer service robot interface, and submitting the user questions in the verification corpus set to the customer service robot;
parsing a data packet returned by the customer service robot interface, and extracting the customer service answer corresponding to each user question from the data packet;
inputting each user question into a preset FAQ model to obtain an answer threshold;
determining the customer service answers corresponding to user questions whose answer threshold falls in a preset first interval, a preset second interval and a preset third interval as first-type answers, second-type answers and third-type answers, respectively;
comparing the first-type answers, the second-type answers and the third-type answers with the expected answers corresponding to the user questions to obtain the matching degree corresponding to each user question;
counting the number of first-type answers with a matching degree greater than a preset first threshold, the number of second-type answers with a matching degree greater than a preset second threshold, the number of third-type answers with a matching degree greater than a preset third threshold, and the total number of user questions;
calculating, from the counted numbers, a first accuracy, a second accuracy and a third accuracy corresponding to the first-type, second-type and third-type answers, respectively;
and calculating a total score for the accuracy of the customer service robot from the first accuracy, the second accuracy, the third accuracy and their respective weights.
An evaluation device for a customer service robot comprises:
a user question acquisition module, configured to acquire the user questions in a verification corpus set, wherein each record in the verification corpus set comprises a pre-collected user question and the expected answer corresponding to the user question;
a user question submission module, configured to call a customer service robot interface and submit the user questions in the verification corpus set to the customer service robot;
a customer service answer extraction module, configured to parse a data packet returned by the customer service robot interface and extract the customer service answer corresponding to each user question from the data packet;
an answer threshold calculation module, configured to input each user question into a preset FAQ model to obtain an answer threshold;
an answer type determination module, configured to determine the customer service answers corresponding to user questions whose answer threshold falls in a preset first interval, a preset second interval and a preset third interval as first-type answers, second-type answers and third-type answers, respectively;
an answer matching degree calculation module, configured to compare the first-type, second-type and third-type answers with the expected answers corresponding to the user questions to obtain the matching degree corresponding to each user question;
an answer number counting module, configured to count the number of first-type answers with a matching degree greater than a preset first threshold, the number of second-type answers with a matching degree greater than a preset second threshold, the number of third-type answers with a matching degree greater than a preset third threshold, and the total number of user questions;
an accuracy calculation module, configured to calculate, from the counted numbers, a first accuracy, a second accuracy and a third accuracy corresponding to the first-type, second-type and third-type answers, respectively;
and a total score calculation module, configured to calculate a total score for the accuracy of the customer service robot from the first accuracy, the second accuracy, the third accuracy and their respective weights.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above evaluation method for a customer service robot.
A computer-readable storage medium storing a computer program that, when executed by a processor, carries out the steps of the above evaluation method for a customer service robot.
Advantageous effects
The evaluation method and device for a customer service robot, the computer equipment and the storage medium first obtain the user questions in a verification corpus set, where each record comprises a pre-collected user question and its expected answer; then call the customer service robot interface to submit the user questions to the robot; next parse the data packet returned by the interface and extract the customer service answer corresponding to each user question; then input each user question into a preset FAQ model to obtain an answer threshold; classify the customer service answers whose answer threshold falls in the preset first, second and third intervals as first-type, second-type and third-type answers respectively; compare each type of answer with the expected answers to obtain a matching degree for each user question; count the number of answers of each type whose matching degree exceeds the corresponding preset threshold, together with the total number of user questions; compute the first, second and third accuracies from these counts; and finally compute a total score for the robot's accuracy from the three accuracies and their respective weights. Because the verification corpus is built from real user questions labeled with expected answers, and the returned answers are classified and weighted by type, the resulting total score reflects the real accuracy of the customer service robot more faithfully than evaluating directly on the training corpus.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without inventive labor.
FIG. 1 is a flow chart of an embodiment of the present invention illustrating a method for evaluating a customer service robot;
FIG. 2 is a schematic flow chart of creating the verification corpus set in the evaluation method of the customer service robot according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating the detailed process of calculating the total score in the evaluation method of the customer service robot according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of updating the verification corpus set in an application scenario of the evaluation method of the customer service robot according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart of calculating the coverage rate in an application scenario of the evaluation method of the customer service robot according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an evaluation device of a customer service robot according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an evaluation device of a customer service robot according to another embodiment of the present invention;
FIG. 8 is a schematic structural diagram of the coverage rate determination module in an embodiment of the present invention;
FIG. 9 is a schematic diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention.
In an embodiment, as shown in fig. 1, a method for evaluating a customer service robot is provided, which includes the following steps:
101. acquiring the user questions in a verification corpus set, wherein each record in the verification corpus set comprises a pre-collected user question and the expected answer corresponding to the user question;
In this embodiment, all records in the verification corpus set are loaded first and the user question of each record is obtained. Each record includes a user question and the expected answer corresponding to it, and the records are stored in a database in list or dictionary form. The verification corpus set further includes a standard question corresponding to each user question and corresponding extended questions: the standard question is the standard description of the user question, and an extended question is an alternative description of it.
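As a minimal illustration of the data structure just described, a verification-corpus record could be modeled as below. This is only a sketch: the field names and example strings are assumptions for illustration, since the patent only requires a user question, its expected answer, and optionally a standard question and extended questions.

    # A hypothetical record in the verification corpus set; field names are
    # illustrative, not prescribed by the patent.
    record = {
        "user_question": "How do I reset my account password?",
        "expected_answer": "Open Settings > Security and choose 'Reset password'.",
        "standard_question": "How to reset the account password",
        "extended_questions": [
            "Forgot my password, what should I do?",
            "Where can I change my login password?",
        ],
    }

    # The verification corpus set is then simply a list of such records.
    verification_corpus = [record]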
As shown in fig. 2, before obtaining the verification corpus, the method may further include:
201. collecting the original user questions posed to the customer service robot within a preset second time;
202. dividing the original user questions according to service type;
203. for the original user questions corresponding to each divided service type, screening out a preset number of user questions according to the frequency of appearance of the original user questions;
204. for the screened user questions of each service type, labeling the expected answer corresponding to each user question in that service type;
205. and determining the user questions and the expected answers corresponding to them as the verification corpus set.
For step 201, the original user questions posed by users to the customer service robot within the preset second time are collected; the real original user questions are extracted from the production environment log by a purpose-built log mining tool. For example, the original user questions within a fixed time period may be extracted, or a certain number of the most recent original user questions may be extracted. It can be understood that the original user questions extracted from the production environment log are real and up to date.
For step 202, it can be understood that each company has a plurality of services, so the original user questions are divided according to service type; the questions users pose to the customer service robot touch every aspect of the business. This division is performed because the volume of original user questions extracted from the production environment log is very large, and after division each service type still contains a large number of original user questions, which ensures that the extracted questions cover the services as fully as possible.
For step 203, for the original user questions of each divided service type, because the extracted data volume is huge, the questions are screened by frequency of appearance to obtain a preset number of user questions per service type. For example, for service type A, assuming 10,000 user questions are needed, the original questions of a fixed time period, or a fixed number of the most recent questions, can be extracted, and the 10,000 most representative user questions are finally screened out according to frequency of appearance.
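A minimal sketch of this frequency-based screening is given below, assuming exact-string counting of the raw questions; the patent does not fix how frequency of appearance is measured.

    from collections import Counter

    def screen_by_frequency(raw_questions, top_n=10_000):
        # Keep the top_n most frequently asked original user questions
        # of one service type (step 203).
        counts = Counter(raw_questions)
        return [question for question, _ in counts.most_common(top_n)]

In practice, near-duplicate questions would likely be clustered before counting, but exact counting is enough to show the idea.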
For step 204, for the screened user questions, the expected answer corresponding to each user question is labeled, and the corresponding standard question and extended questions can be labeled as well. In the process of labeling each user question, the number of labelers involved is not less than a preset threshold; the labeled content is the corresponding expected answer, and the standard question and extended questions corresponding to the user question can be labeled at the same time. Among the multiple labeling results, the majority opinion is taken as the final labeling result.
For step 205, the user questions and their corresponding expected answers are determined as the verification corpus set. It can be understood that the labeled expected answer is the best of the labeling results, and matching it against the answer returned by the customer service robot reflects the accuracy of the robot's returned answers more truthfully.
102. Calling a customer service robot interface, and submitting the user question sentences in the verification corpus set to a customer service robot;
In this embodiment, because the verification corpus set is large, a multithreading mechanism is adopted to call the customer service robot interface concurrently in order to improve the efficiency of a single evaluation round; according to the robot's interface protocol, the user question of each record in the verification corpus set is submitted to the customer service robot.
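The following sketch shows one way to submit the questions concurrently with a thread pool. The HTTP/JSON interface, the endpoint URL and the payload shape are assumptions; the patent only states that the robot's interface protocol is followed.

    from concurrent.futures import ThreadPoolExecutor

    import requests  # assumes an HTTP interface; the protocol is not fixed by the patent

    ROBOT_ENDPOINT = "https://robot.example.com/api/ask"  # hypothetical URL

    def ask_robot(question: str) -> dict:
        # Submit one user question and return the robot's raw data packet.
        resp = requests.post(ROBOT_ENDPOINT, json={"question": question}, timeout=10)
        resp.raise_for_status()
        return resp.json()

    def submit_all(questions, max_workers=16):
        # Concurrently submit every user question in the verification corpus set.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(ask_robot, questions))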
103. Analyzing a data packet returned by the customer service robot interface, and extracting a customer service answer corresponding to the question of the user from the data packet;
In this embodiment, after the customer service robot finishes processing each user question, it returns the processed data packet through its interface, and the packet is stored in the database. The returned data packet is then parsed: because it contains the front-end markup tags the robot uses for interactive display, those tags are removed, the robot's returned answer is restored and extracted, and each answer is stored into the dictionary or list of the corresponding user question in the verification corpus set.
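A minimal sketch of the tag removal, assuming HTML-like front-end tags and an "answer" field in the packet (both assumptions; the packet layout is not specified by the patent):

    import re

    TAG_RE = re.compile(r"<[^>]+>")  # assumes HTML-like display tags

    def extract_answer(packet: dict) -> str:
        # Strip front-end display tags from the robot's returned answer (step 103).
        raw = packet.get("answer", "")
        return TAG_RE.sub("", raw).strip()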
104. Inputting the question of the user into a preset FAQ model to obtain an answer threshold;
In this embodiment, each user question obtained from the verification corpus set is input into the preset FAQ model to obtain the returned answer threshold. It can be understood that, in the following steps, the answer type of the returned customer service answer is determined from the input user question and the output answer threshold.
105. Determining the customer service answers corresponding to user questions whose answer threshold falls in a preset first interval, a preset second interval and a preset third interval as first-type answers, second-type answers and third-type answers, respectively;
In this embodiment, the customer service answers are classified as first-type, second-type or third-type answers according to the answer threshold. For example, if the returned answer threshold is 1, which falls within the preset first interval, the customer service answer is a first-type answer, meaning the answer directly corresponds to the user's question, or to the labeled standard question or an extended question. If the returned answer threshold is 0.8, which falls within the preset second interval, the answer is a second-type answer, meaning the answer is merely associated with the question asked. If the returned answer threshold falls within the preset third interval, the answer is a third-type answer: it is not associated with the question asked, or the robot returned no answer at all, i.e. the question was answered wrongly or left unanswered.
Further, the third-type answers and their corresponding user questions may be saved in a badcase result set, which is provided to developers for analyzing and tuning the FAQ model.
It should be noted that the returned value of the answer threshold is determined according to actual conditions, and this embodiment does not limit it.
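The classification of step 105 amounts to a simple interval lookup, sketched below. The interval boundaries are placeholders; as noted above, the intervals and threshold values are chosen according to actual conditions.

    def classify_answer(threshold: float) -> str:
        # Map an FAQ-model answer threshold to an answer type (step 105).
        if threshold >= 0.9:            # preset first interval (assumed)
            return "first"
        if 0.6 <= threshold < 0.9:      # preset second interval (assumed)
            return "second"
        return "third"                  # preset third interval, incl. no answer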
106. Comparing the first-type answers, the second-type answers and the third-type answers with the expected answers corresponding to the user questions to obtain the matching degree corresponding to each user question;
In this embodiment, the first-type, second-type and third-type answers are compared one by one with the expected answers corresponding to the user questions in the verification corpus set to obtain the matching degree corresponding to each user question. The first-type, second-type and third-type answers are all answers returned by the customer service robot; comparing them with the expected answers yields a matching degree for each user question.
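The patent does not specify how the matching degree is computed; as a stand-in, character-level sequence similarity can illustrate the comparison:

    from difflib import SequenceMatcher

    def matching_degree(robot_answer: str, expected_answer: str) -> float:
        # Similarity in [0, 1] between the returned and expected answers;
        # purely a placeholder for whatever matching method the evaluator uses.
        return SequenceMatcher(None, robot_answer, expected_answer).ratio()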
107. Counting the number of first-type answers with a matching degree greater than a preset first threshold, the number of second-type answers with a matching degree greater than a preset second threshold, the number of third-type answers with a matching degree greater than a preset third threshold, and the total number of user questions;
In this embodiment, after the matching degree of each question is calculated, the number of user questions among the first-type answers whose matching degree is greater than the first threshold is counted, as are the number among the second-type answers reaching the second threshold and the number among the third-type answers reaching the third threshold; at the same time, the total number of user questions is counted.
108. Calculating, from the counted numbers, a first accuracy, a second accuracy and a third accuracy corresponding to the first-type, second-type and third-type answers, respectively;
In this embodiment, the first, second and third accuracies corresponding to the first-type, second-type and third-type answers are calculated from the numbers counted in the previous step.
For convenience of description, the numbers of first-type, second-type and third-type answers reaching their corresponding thresholds and the total number of user questions are denoted C1, C2, C3 and S, respectively. Then the first accuracy is A1 = C1 / S, the second accuracy is A2 = C2 / S, and the third accuracy is A3 = C3 / S, where "/" denotes division.
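Putting the counting of step 107 and the formulas above together, a sketch (threshold values again assumed to be configured elsewhere):

    def accuracies(match_degrees, answer_types, thresholds):
        # match_degrees: one matching degree per user question.
        # answer_types:  parallel list of "first"/"second"/"third" labels.
        # thresholds:    preset first/second/third thresholds per answer type.
        s = len(match_degrees)          # S: total number of user questions
        counts = {"first": 0, "second": 0, "third": 0}
        for degree, kind in zip(match_degrees, answer_types):
            if degree > thresholds[kind]:
                counts[kind] += 1       # C1, C2, C3
        return tuple(counts[k] / s for k in ("first", "second", "third"))  # A1, A2, A3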
109. Calculating the total score of the customer service robot's accuracy from the first accuracy, the second accuracy, the third accuracy and their respective weights.
In this embodiment, the total score of the customer service robot's accuracy is calculated from the first accuracy, the second accuracy, the third accuracy and their respective weights; at the same time, the user questions saved in the badcase result set can be provided to developers for analyzing the model.
For ease of understanding, as shown in fig. 3, the calculation of the total score of the customer service robot's accuracy from the first accuracy, the second accuracy, the third accuracy and their respective weights may further be implemented by the following steps:
301. respectively obtaining the first weight, second weight and third weight corresponding to the first accuracy, second accuracy and third accuracy;
302. respectively calculating the first score, second score and third score corresponding to the first-type, second-type and third-type answers according to the first weight, the second weight, the third weight and a preset first scoring rule;
303. calculating the total score of the customer service robot's accuracy according to the first score, the second score, the third score and a preset second scoring rule, wherein the total score is positively correlated with the accuracy of the customer service robot.
For step 301, the first, second and third weights corresponding to the first, second and third accuracies are obtained respectively. It can be understood that each of the three accuracies influences the evaluation result of the customer service robot from its own side, and their respective weights are obtained for the subsequent calculation of the total score.
For step 302, after the weights corresponding to the first, second and third accuracies are obtained, the product of the first accuracy and the first weight, the product of the second accuracy and the second weight, and the product of the third accuracy and the third weight may be calculated to obtain the corresponding first, second and third scores.
For convenience of explanation, the first, second and third accuracies and the first, second and third weights are denoted A1, A2, A3 and W1, W2, W3, respectively. Then the first score is L1 = A1 × W1, the second score is L2 = A2 × W2, and the third score is L3 = A3 × W3, where "×" denotes multiplication.
For step 303, the sum of the first score, the second score and the third score is calculated to obtain the total score of the customer service robot. The total score is positively correlated with the robot's accuracy: the higher the calculated total score, the higher the evaluated accuracy of the customer service robot; conversely, the lower the total score, the lower the evaluated accuracy.
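Steps 301-303 then reduce to a weighted sum, sketched below; the weight values themselves are left to the evaluator.

    def total_score(a1: float, a2: float, a3: float,
                    w1: float, w2: float, w3: float) -> float:
        # L1 = A1*W1, L2 = A2*W2, L3 = A3*W3; the second scoring rule is
        # taken to be their sum, as described for step 303.
        return a1 * w1 + a2 * w2 + a3 * w3

    # Example: total_score(0.92, 0.75, 0.40, w1=0.6, w2=0.3, w3=0.1)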
Considering that the verification corpus set needs to be updated after the evaluation result is obtained, as shown in fig. 4, after step 109 the method may further include:
401. updating the verification corpus set within a preset first time to obtain an updated first verification corpus set;
402. respectively acquiring the preset first thresholds corresponding to the user question coverage rates of the first verification corpus set in each service type;
403. judging whether the user question coverage rates reach their respective corresponding first thresholds; if so, executing step 404, and if not, executing step 405;
404. determining the first verification corpus set to be the target verification corpus set;
405. processing according to a preset flow.
For step 401, the verification corpus set is updated within the preset first time; for example, one week or half a month may be set as a node, which is not limited here. A verification corpus set of a number of records is created in every first period; after the creation in one period is completed, the verification corpus set of the new period is created, merged with that of the previous period, and duplicate or inappropriate records are deleted, forming the updated first verification corpus set.
It should be noted that, since the service corpus is continuously updated and iterated, the verification corpus set is also continuously updated; when the records of the verification corpus set reach the expected number, the set is updated to obtain the updated first verification corpus set.
For step 402, the preset first thresholds corresponding to the user question coverage rates of the first verification corpus set in each service type are obtained respectively. It can be understood that, since the user questions of different service types differ, the measurement standard of the user question coverage rate is not the same for every service type.
For step 403, whether the user question coverage rates reach their respective corresponding first thresholds is judged. It can be understood that, when they all do, the first verification corpus set can reflect the evaluation result of the customer service robot more effectively. As shown in fig. 5, this step may further include:
501. counting the number of user questions of each service type in the first verification corpus set and the number of user questions of each service type collected in advance;
502. respectively calculating the user question coverage rate of each service type in the first verification corpus set according to the counted numbers;
503. judging whether the user question coverage rates reach their respective corresponding first thresholds.
For step 501, it can be understood that the first verification corpus set is the corpus set merged from past records after deduplication and deletion of inappropriate records; the number of user questions of each service type in the updated first verification corpus set, and the number of user questions of each service type collected in advance, can be obtained by counting.
For step 502, the user question coverage rate of each service type in the first verification corpus set is calculated from the counted numbers: the coverage rate of each service type is the quotient of the number of user questions of that service type in the first verification corpus set and the number of user questions of that service type collected in advance.
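A sketch of this per-service-type quotient (the dictionary keys are illustrative):

    def coverage_rates(corpus_counts: dict, collected_counts: dict) -> dict:
        # Coverage rate per service type (step 502): questions of that type
        # in the first verification corpus set divided by the number of
        # questions of that type collected in advance.
        return {svc: corpus_counts.get(svc, 0) / collected_counts[svc]
                for svc in collected_counts}

    # Example: coverage_rates({"claims": 800}, {"claims": 1000}) -> {"claims": 0.8}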
For step 503, whether the user question coverage rates reach their respective first thresholds is judged by comparing the coverage rate of each service type in the first verification corpus set with the preset first threshold of that service type.
For step 404, the first verification corpus set whose user question coverage rate reaches the first threshold in every service type is determined as the target verification corpus set, which is obtained through continuous updating and iteration and can serve as a relatively stable verification corpus set for a period of time.
Further, the update frequency of the verification corpus set can be adjusted appropriately according to the target verification corpus set; the adjustment can be made according to the number of user questions in the production environment log and/or the difference between the total scores of two adjacent customer service robot evaluations.
For example, taking the time node at which the updated set becomes the target verification corpus set as the starting time: if it is detected that the user questions in the production environment log exceed a fixed value, the staff is prompted to update the verification corpus set; and if, within two months from the starting time, the number of user questions is detected not to reach the preset threshold, the staff is likewise prompted to update the set, preventing the corpus from going stale.
For another example, if the difference in accuracy between two adjacent evaluations of the customer service robot on the verification corpus set is detected to exceed a fixed value, the staff is prompted to update the set; and if the difference is low and does not exceed the threshold, the set can be updated once every two months.
It should be noted that the stable target verification corpus set is obtained by merging, deduplication and deletion of inappropriate records at every update, so the number of user questions of each service type in it needs to be recounted to calculate the coverage rate, avoiding similar user questions of overlapping service types being deduplicated or deleted by mistake, and thereby determining the target verification corpus set.
It should be noted that the time node and the time interval in this embodiment are determined according to actual situations, and this embodiment does not limit this.
For step 405, if the user question coverage rates do not reach their respective corresponding first thresholds, processing proceeds according to a preset flow; for example, the first verification corpus set continues to be updated until the coverage rates reach their respective thresholds, at which point it is taken as the target verification corpus set.
In this embodiment, the user questions in a verification corpus set are first obtained, where each record comprises a pre-collected user question and its expected answer; the customer service robot interface is then called to submit the user questions to the robot; the data packet returned by the interface is parsed and the customer service answer corresponding to each user question is extracted; each user question is input into the preset FAQ model to obtain an answer threshold; the customer service answers whose answer threshold falls in the preset first, second and third intervals are determined to be first-type, second-type and third-type answers respectively; each type of answer is compared with the expected answers to obtain the matching degree of each user question; the number of answers of each type whose matching degree exceeds the corresponding preset threshold is counted along with the total number of user questions; the first, second and third accuracies are calculated from these counts; and the total score of the customer service robot's accuracy is finally calculated from the three accuracies and their respective weights. Because the evaluation is driven by real, labeled user questions and the returned answers are classified and weighted by type, the total score reflects the real accuracy of the customer service robot.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In an embodiment, an evaluation device for a customer service robot is provided, which corresponds to the evaluation method for a customer service robot in the above embodiments. As shown in fig. 6, the evaluation device includes a user question acquisition module 601, a user question submission module 602, a customer service answer extraction module 603, an answer threshold calculation module 604, an answer type determination module 605, an answer matching degree calculation module 606, an answer number counting module 607, an accuracy calculation module 608 and a total score calculation module 609. Each functional module is detailed as follows:
a user question acquisition module 601, configured to acquire the user questions in a verification corpus set, where each record in the verification corpus set comprises a pre-collected user question and the expected answer corresponding to the user question;
a user question submission module 602, configured to call the customer service robot interface and submit the user questions in the verification corpus set to the customer service robot;
a customer service answer extraction module 603, configured to parse the data packet returned by the customer service robot interface and extract the customer service answer corresponding to each user question from the packet;
an answer threshold calculation module 604, configured to input each user question into the preset FAQ model to obtain an answer threshold;
an answer type determination module 605, configured to determine the customer service answers corresponding to user questions whose answer threshold falls in the preset first, second and third intervals as first-type, second-type and third-type answers, respectively;
an answer matching degree calculation module 606, configured to compare the first-type, second-type and third-type answers with the expected answers corresponding to the user questions to obtain the matching degree corresponding to each user question;
an answer number counting module 607, configured to count the number of first-type answers with a matching degree greater than the preset first threshold, the number of second-type answers with a matching degree greater than the preset second threshold, the number of third-type answers with a matching degree greater than the preset third threshold, and the total number of user questions;
an accuracy calculation module 608, configured to calculate, from the counted numbers, the first, second and third accuracies corresponding to the first-type, second-type and third-type answers, respectively;
and a total score calculation module 609, configured to calculate the total score of the customer service robot's accuracy from the first accuracy, the second accuracy, the third accuracy and their respective weights.
As shown in fig. 7, the evaluation device of the customer service robot may further include:
a verification corpus updating module 610, configured to update the verification corpus set within a preset first time to obtain an updated first verification corpus set;
a coverage threshold acquisition module 611, configured to acquire the preset first thresholds corresponding to the user question coverage rates of the first verification corpus set in each service type;
a coverage determination module 612, configured to judge whether the user question coverage rates of the first verification corpus set in each service type reach their preset respective first thresholds;
and a verification corpus determination module 613, configured to determine the first verification corpus set as the target verification corpus set if the judgment result of the coverage determination module is yes.
As shown in fig. 8, the coverage determination module includes:
a user question number counting unit 6111, configured to count the number of user questions of each service type in the first verification corpus set and the number of user questions of each service type collected in advance;
a coverage calculation unit 6112, configured to calculate the user question coverage rate of each service type in the first verification corpus set from the counted numbers;
and a coverage determination unit 6113, configured to judge whether the user question coverage rates reach their respective corresponding first thresholds.
For the specific limitations of the evaluation device of the customer service robot, reference may be made to the limitations of the evaluation method above, which are not repeated here. All or part of the modules in the evaluation device can be realized by software, hardware or a combination thereof. The modules can be embedded, in hardware form, in or independent of a processor in the computer device, or stored, in software form, in a memory in the computer device, so that the processor can call and execute the operations corresponding to each module.
In an embodiment, a computer device is provided, which can be a server, whose internal structure can be as shown in fig. 9. The computer device comprises a processor, a memory, a network interface and a database connected through a system bus. The processor provides computing and control capabilities. The memory comprises a nonvolatile storage medium and an internal memory; the nonvolatile storage medium stores an operating system, a computer program and a database, and the internal memory provides an environment for running the operating system and the computer program. The database stores the data involved in the evaluation method of the customer service robot. The network interface communicates with external terminals through a network connection. The computer program, when executed by the processor, implements the evaluation method of the customer service robot.
In an embodiment, a computer device is provided, comprising a memory, a processor and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the steps of the evaluation method in the above embodiments are implemented, for example steps 101 to 109 shown in fig. 1; alternatively, the processor, when executing the program, implements the functions of the modules/units of the evaluation device in the above embodiments, for example modules 601 to 609 shown in fig. 6.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the program implements the steps of the evaluation method in the above embodiments, such as steps 101 to 109 shown in fig. 1, or implements the functions of the modules/units of the evaluation device in the above embodiments, such as modules 601 to 609 shown in fig. 6.
It will be understood by those of ordinary skill in the art that all or part of the processes of the methods in the above embodiments may be implemented by a computer program, which may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database or other media used in the embodiments provided herein may include non-volatile and/or volatile memory.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. An evaluation method for a customer service robot, characterized by comprising:
acquiring the user questions in a verification corpus set, wherein each record in the verification corpus set comprises a pre-collected user question and the expected answer corresponding to the user question;
calling a customer service robot interface, and submitting the user questions in the verification corpus set to the customer service robot;
parsing a data packet returned by the customer service robot interface, and extracting the customer service answer corresponding to each user question from the data packet;
inputting each user question into a preset FAQ model to obtain an answer threshold;
determining the customer service answers corresponding to user questions whose answer threshold falls in a preset first interval, a preset second interval and a preset third interval as first-type answers, second-type answers and third-type answers, respectively;
comparing the first-type answers, the second-type answers and the third-type answers with the expected answers corresponding to the user questions to obtain the matching degree corresponding to each user question;
counting the number of first-type answers with a matching degree greater than a preset first threshold, the number of second-type answers with a matching degree greater than a preset second threshold, the number of third-type answers with a matching degree greater than a preset third threshold, and the total number of user questions;
calculating, from the counted numbers, a first accuracy, a second accuracy and a third accuracy corresponding to the first-type, second-type and third-type answers, respectively;
and calculating a total score for the accuracy of the customer service robot from the first accuracy, the second accuracy, the third accuracy and their respective weights.
2. The evaluation method for a customer service robot according to claim 1, further comprising, after calculating the total score of the customer service robot's accuracy from the first accuracy, the second accuracy, the third accuracy and their respective weights:
updating the verification corpus set within a preset first time to obtain an updated first verification corpus set;
respectively acquiring the preset first thresholds corresponding to the user question coverage rates of the first verification corpus set in each service type;
judging whether the user question coverage rates reach their respective corresponding first thresholds;
and if the user question coverage rates reach their respective corresponding first thresholds, determining the first verification corpus set to be the target verification corpus set.
3. The evaluation method for a customer service robot according to claim 2, wherein the judging whether the user question coverage rates reach their respective corresponding first thresholds comprises:
counting the number of user questions of each service type in the first verification corpus set and the number of user questions of each service type collected in advance;
respectively calculating the user question coverage rate of each service type in the first verification corpus set according to the counted numbers;
and judging whether the user question coverage rates reach their respective corresponding first thresholds.
4. The evaluation method for a customer service robot according to claim 1, wherein before obtaining the user questions in the verification corpus set, the method further comprises:
collecting the original user questions posed to the customer service robot within a preset second time;
dividing the original user questions according to service type;
for the original user questions corresponding to each divided service type, screening out a preset number of user questions according to the frequency of appearance of the original user questions;
for the screened user questions of each service type, labeling the expected answer corresponding to each user question in that service type;
and determining the user questions and the expected answers corresponding to them as the verification corpus set.
5. The evaluation method for a customer service robot according to any one of claims 1 to 4, wherein the calculating of the total score of the customer service robot's accuracy from the first accuracy, the second accuracy, the third accuracy and their respective weights comprises:
respectively obtaining the first weight, second weight and third weight corresponding to the first accuracy, second accuracy and third accuracy;
respectively calculating the first score, second score and third score corresponding to the first-type, second-type and third-type answers according to the first weight, the second weight, the third weight and a preset first scoring rule;
and calculating the total score of the customer service robot's accuracy according to the first score, the second score, the third score and a preset second scoring rule, wherein the total score is positively correlated with the accuracy of the customer service robot.
An evaluating device of a customer service robot of kinds, comprising:
the system comprises a user question acquiring module, a verification corpus collecting module and a user question acquiring module, wherein the user question acquiring module is used for acquiring user questions in a verification corpus set, and each records in the verification corpus set comprise pre-collected user questions and expected answers corresponding to the user questions;
the user question submitting module is used for calling a customer service robot interface and submitting the user question with the centralized verification corpus to a customer service robot;
the customer service answer extraction module is used for analyzing a data packet returned by the customer service robot interface and extracting a customer service answer corresponding to the question of the user from the data packet;
the answer threshold value calculation module is used for inputting the question of the user into a preset FAQ model to obtain an answer threshold value;
an answer type determining module, used for respectively determining the customer service answers corresponding to user questions whose answer threshold values fall in a preset first interval, a preset second interval and a preset third interval as first-type answers, second-type answers and third-type answers;
an answer matching degree calculation module, used for respectively comparing the first-type answers, the second-type answers and the third-type answers with the expected answers of the corresponding user questions to obtain the matching degree of each user question;
an answer number counting module, used for respectively counting the number of first-type answers with a matching degree greater than a preset first threshold, the number of second-type answers with a matching degree greater than a preset second threshold, the number of third-type answers with a matching degree greater than a preset third threshold, and the total number of user questions;
an accuracy calculation module, used for respectively calculating a first accuracy, a second accuracy and a third accuracy corresponding to the first-type answers, the second-type answers and the third-type answers according to the counted numbers;
and a total score value calculating module, used for calculating a total score value of the accuracy of the customer service robot according to the first accuracy, the second accuracy, the third accuracy and the respective corresponding weights.
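Taken together, the modules of claim 6 form a linear pipeline. The following condensed Python sketch shows the data flow; ask_robot, faq_threshold and similarity stand in for the robot interface call, the preset FAQ model and the matching-degree computation, none of which the claim specifies:

    def evaluate_robot(corpus, ask_robot, faq_threshold, similarity,
                       intervals, match_thresholds):
        # intervals: (low, high) pairs for the first/second/third interval
        # match_thresholds: first/second/third matching-degree thresholds
        hits, totals = [0, 0, 0], [0, 0, 0]
        for record in corpus:
            question, expected = record["question"], record["expected_answer"]
            answer = ask_robot(question)         # submit question, parse the returned packet
            threshold = faq_threshold(question)  # answer threshold from the FAQ model
            for i, (low, high) in enumerate(intervals):
                if low <= threshold < high:      # first-, second- or third-type answer
                    totals[i] += 1
                    if similarity(answer, expected) > match_thresholds[i]:
                        hits[i] += 1
                    break
        # first, second and third accuracy
        return [h / n if n else 0.0 for h, n in zip(hits, totals)]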
7. The evaluation device for a customer service robot according to claim 6, further comprising:
a verification corpus updating module, used for updating the verification corpus set within a preset first time period to obtain an updated first verification corpus set;
a coverage rate threshold value obtaining module, used for obtaining the first threshold values respectively corresponding to the preset user question coverage rates of the first verification corpus set in each service type;
a coverage rate judging module, used for judging whether the user question coverage rates reach the respective corresponding first threshold values;
and a verification corpus determining module, used for determining the first verification corpus set as the target verification corpus set if the judgment result of the coverage rate judging module is yes.
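A minimal sketch of the update-and-accept flow in claim 7, assuming the update merely merges records collected during the first time period and that the previous corpus is retained when coverage falls short; refresh_corpus and coverage_check are hypothetical names:

    def refresh_corpus(corpus, new_records, coverage_check):
        # merge records gathered within the preset first time period
        updated = corpus + new_records
        # accept the updated corpus as the target verification corpus set
        # only if every per-type coverage threshold is reached
        return updated if coverage_check(updated) else corpus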
8. The evaluation device for a customer service robot according to claim 7, wherein the coverage rate judging module comprises:
a user question number counting unit, used for counting the number of user questions of each service type in the first verification corpus set and the number of pre-collected user questions of each service type;
a coverage rate calculating unit, used for respectively calculating the user question coverage rate of each service type in the first verification corpus set according to the counted numbers;
and a coverage rate judging unit, used for judging whether the user question coverage rates reach the respective corresponding first threshold values.
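The coverage judgment of claim 8 is a per-type ratio check. A Python sketch under the assumption that coverage means the number of a type's questions present in the corpus divided by the number of pre-collected questions of that type:

    def coverage_ok(corpus_counts, collected_counts, thresholds):
        # corpus_counts / collected_counts: {service_type: question count}
        # thresholds: {service_type: preset first threshold value}
        for service_type, collected in collected_counts.items():
            if collected == 0:
                continue  # nothing of this type to cover
            coverage = corpus_counts.get(service_type, 0) / collected
            if coverage < thresholds[service_type]:
                return False  # this service type is under-covered
        return True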
9. A computer device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the evaluation method for a customer service robot according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the evaluation method for a customer service robot according to any one of claims 1 to 5.
CN201910843141.2A 2019-09-06 2019-09-06 Evaluation method and device of customer service robot, computer equipment and storage medium Active CN110737759B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910843141.2A CN110737759B (en) 2019-09-06 2019-09-06 Evaluation method and device of customer service robot, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110737759A 2020-01-31
CN110737759B (en) 2023-07-25

Family

ID=69267533

Country Status (1)

Country Link
CN (1) CN110737759B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104657346A (en) * 2015-01-15 2015-05-27 深圳市前海安测信息技术有限公司 Question matching system and question matching system in intelligent interaction system
CN107562789A (en) * 2017-07-28 2018-01-09 深圳前海微众银行股份有限公司 Knowledge base problem update method, customer service robot and readable storage medium storing program for executing
CN108021691A (en) * 2017-12-18 2018-05-11 深圳前海微众银行股份有限公司 Answer lookup method, customer service robot and computer-readable recording medium
CN109783617A (en) * 2018-12-11 2019-05-21 平安科技(深圳)有限公司 For replying model training method, device, equipment and the storage medium of problem

Similar Documents

Publication Publication Date Title
CN111182162B (en) Telephone quality inspection method, device, equipment and storage medium based on artificial intelligence
CN110442516B (en) Information processing method, apparatus, and computer-readable storage medium
RU2680746C2 (en) Method and device for developing web page quality model
CN110264274B (en) Guest group dividing method, model generating method, device, equipment and storage medium
CN111245667A (en) Network service identification method and device
CN112416778A (en) Test case recommendation method and device and electronic equipment
CN111368096A (en) Knowledge graph-based information analysis method, device, equipment and storage medium
CN107368526A (en) A kind of data processing method and device
CN105740434A (en) Network information scoring method and device
CN114626744A (en) Scientific and technological innovation capability-based assessment method and system and readable storage medium
CN111092769A (en) Web fingerprint identification method based on machine learning
CN107729510B (en) Information interaction method, information interaction terminal and storage medium
CN111723182B (en) Key information extraction method and device for vulnerability text
CN112559776A (en) Sensitive information positioning method and system
CN110737759A (en) Evaluation method and device for customer service robot, computer equipment and storage medium
CN116957828A (en) Method, equipment, storage medium and device for checking account
CN115658731A (en) Data query method, device, equipment and storage medium
CN114020642A (en) Big data task testing method and device, storage medium and electronic equipment
CN110046234B (en) Question-answering model optimization method and device and question-answering robot system
CN117196396A (en) Enterprise technical standard quantitative evaluation method
CN116484230B (en) Method for identifying abnormal business data and training method of AI digital person
CN112597149B (en) Data table similarity determination method and device
CN113434408B (en) Unit test case sequencing method based on test prediction
CN114692647A (en) Data processing method, device, equipment and medium
CN107871275A (en) The method and device of business approval

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant