CN112799747A - Intelligent assistant evaluation and recommendation method, system, terminal and readable storage medium - Google Patents

Intelligent assistant evaluation and recommendation method, system, terminal and readable storage medium

Info

Publication number
CN112799747A
Authority
CN
China
Prior art keywords
evaluation
intelligent assistant
comment
external
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911115568.7A
Other languages
Chinese (zh)
Inventor
林震亚
屠要峰
郭斌
周祥生
李春霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201911115568.7A
Priority to PCT/CN2020/128455
Publication of CN112799747A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G06F9/453 Help systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention provides an intelligent assistant evaluation and recommendation method, a system, a terminal and a readable storage medium, wherein the intelligent assistant evaluation method evaluates a target intelligent assistant according to a preset evaluation scheme to obtain an evaluation result, and the evaluation comprises at least one of the following steps: internal evaluation and external evaluation, and an evaluation report is generated based on the evaluation result. The invention also provides an intelligent assistant recommending method, a system, a terminal and a readable storage medium, through the implementation of the invention, the intelligent assistant is evaluated based on multiple important indexes in the preset evaluation scheme, and an evaluation result is obtained to generate an evaluation report, so that a standardized intelligent assistant evaluation method is provided, the development of the intelligent assistant industry can be promoted by the evaluation method, the service quality is promoted, and the user experience is further improved.

Description

Intelligent assistant evaluation and recommendation method, system, terminal and readable storage medium
Technical Field
The embodiment of the invention relates to the field of mobile communication, in particular to a method, a system, a terminal and a readable storage medium for evaluating and recommending intelligent assistants.
Background
For a long time, a virtual assistant or chat-partner system with sufficient intelligence seemed to be a fantasy that could exist only in science fiction movies. In recent years, however, human-computer conversation has received increasing attention from researchers because of its potential and attractive commercial value.
With the development of big data and deep learning techniques, it is no longer a fantasy to create an automatic human-machine conversation system as a personal assistant or chat partner for people. Moreover, some simple man-machine conversation system devices are also available in the market at present, which can perform simple conversation with users and provide simple services.
Dialog systems can be roughly divided into two types:
(1) task-oriented dialog systems;
(2) non-task-oriented dialog systems (also known as chatbots).
Task oriented systems are intended to assist users in performing actual specific tasks, such as assisting users in finding merchandise, reserving hotel restaurants, and the like.
A widely used approach in task-oriented systems is to treat the dialog response as a pipeline: the system first understands the information conveyed by the human as an internal state, then takes a series of corresponding actions according to the dialog-state policy, and finally converts the actions into a natural language expression.
Although language understanding is handled by statistical models, most deployed dialog systems still use manual features or hand-crafted rules for state and action space representation, intent detection, and slot filling.
Non-task-oriented dialog systems interact with humans to provide reasonable replies and entertainment, usually focusing on open-domain conversation. Although such a system appears to be merely chatting, it plays a role in many practical applications.
Data shows that in an online shopping scenario, nearly 80% of utterances are chat information, and the way these questions are handled is closely related to the user experience.
As intelligent assistants become popular, there is currently no reasonable way to evaluate them; their service quality is uneven and the user experience suffers, so establishing an intelligent assistant evaluation system has become important.
Disclosure of Invention
The intelligent assistant evaluation and recommendation method, system, terminal and readable storage medium provided herein mainly solve the technical problems that, as intelligent assistants become increasingly popular, no reasonable evaluation mode exists for intelligent assistant evaluation, their service quality is uneven, and the user experience is poor.
In order to solve the above technical problem, an embodiment of the present invention provides an intelligent assistant evaluation method, including:
evaluating the target intelligent assistant according to a preset evaluation scheme;
obtaining an evaluation result;
and generating an evaluation report according to the evaluation result.
Further, an embodiment of the present invention further provides an intelligent assistant recommendation method, including:
packaging interfaces of at least two target intelligent assistants and accessing a unified management interface;
acquiring the current requirements of external users;
determining corresponding capability items and the priority of each capability item according to the current requirement;
obtaining an evaluation report of the target intelligent assistant;
determining a preferred intelligent assistant among the target intelligent assistants according to the capability items, the priorities of the capability items and the evaluation reports;
the management interface provides an interface of the preferred intelligent assistant for the external user to use.
Further, an embodiment of the present invention further provides an intelligent assistant evaluation system, where the intelligent assistant evaluation system includes:
the evaluation module is used for evaluating the target intelligent assistant according to a preset evaluation scheme;
the first acquisition module is used for acquiring an evaluation result;
and the first generation module is used for generating an evaluation report according to the evaluation result.
Further, an embodiment of the present invention further provides an intelligent assistant recommendation system, including:
the packaging module is used for packaging the interfaces of at least two target intelligent assistants and accessing a unified management interface;
the fifth acquisition module is used for acquiring the current requirements of the external users;
the capability item determining module is used for determining corresponding capability items and the priority of each capability item according to the current requirement;
a sixth obtaining module, configured to obtain an evaluation report of the target intelligent assistant;
the preference determining module is used for determining a preferred intelligent assistant among the target intelligent assistants according to the capability items, the priorities of the capability items and the evaluation reports;
a providing module for the management interface to provide an interface of the preferred intelligent assistant for the external user to use.
Further, an embodiment of the present invention further provides an intelligent assistant evaluation terminal, including: the system comprises a first processor, a first memory and a first communication bus;
the first communication bus is used for realizing connection communication between the first processor and the first memory;
the first processor is configured to execute one or more first computer programs stored in the first memory to implement the steps of the intelligent assistant evaluation method as described in any of the above embodiments.
Further, an embodiment of the present invention further provides an intelligent assistant recommendation terminal, including: the second processor, the second memory and the second communication bus;
the second communication bus is used for realizing connection communication between the second processor and the second memory;
the second processor is configured to execute one or more second computer programs stored in the second memory to implement the steps of the intelligent assistant recommendation method according to the above embodiments.
Further, the present invention also provides a readable storage medium, which stores one or more first computer programs, where the one or more first computer programs are executable by one or more first processors to implement the steps of the intelligent assistant evaluation method according to any one of the above embodiments.
Further, the present invention also provides a readable storage medium, which stores one or more second computer programs, where the one or more second computer programs are executable by one or more second processors to implement the steps of the intelligent assistant recommendation method according to the above embodiment.
The invention has the beneficial effects that:
according to the method, the target intelligent assistant is evaluated according to a preset evaluation scheme, an evaluation result is obtained, wherein the evaluation comprises at least one of internal evaluation and external evaluation, and an evaluation report is generated according to the evaluation result. According to the method, the intelligent assistant is evaluated based on multiple important indexes in the preset evaluation scheme, the evaluation result is obtained, and the evaluation report is generated.
Additional features and corresponding advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
Fig. 1 is a schematic flow chart of an intelligent assistant evaluation method according to an embodiment of the present invention;
FIG. 2 is a diagram of an exemplary target intelligent assistant architecture according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of an internal evaluation method according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of another internal evaluation method according to an embodiment of the present invention;
FIG. 5-1 is a schematic flow chart illustrating an intention recognition method according to an embodiment of the present invention;
FIG. 5-2 is a schematic diagram illustrating an exemplary multi-target classification architecture according to an embodiment of the present invention;
fig. 6 is a schematic flowchart of a comment summary generation method according to an embodiment of the present invention;
fig. 7 is a schematic flow chart of an external evaluation method according to an embodiment of the present invention;
fig. 8-1 is a schematic flow chart of an evaluation report generation method according to an embodiment of the present invention;
FIG. 8-2 is a schematic flow chart of single sentence modeling according to an embodiment of the present invention;
FIG. 8-3 is a schematic flow chart of modeling for sequence editing according to an embodiment of the present invention;
fig. 9 is a schematic flowchart of another intelligent assistant evaluation method according to an embodiment of the present invention;
fig. 10 is a schematic flowchart of another intelligent assistant evaluation method according to an embodiment of the present invention;
fig. 11 is a schematic flowchart of another intelligent assistant evaluation method according to an embodiment of the present invention;
fig. 12 is a flowchart illustrating an intelligent assistant recommendation method according to a second embodiment of the present invention;
fig. 13 is a structural diagram of an intelligent assistant evaluation system according to a third embodiment of the present invention;
fig. 14 is a structural diagram of an intelligent assistant recommendation system according to a fourth embodiment of the present invention;
fig. 15 is a schematic structural diagram of an intelligent assistant evaluation terminal according to a fifth embodiment of the present invention;
fig. 16 is a schematic structural diagram of an intelligent assistant recommendation terminal according to a sixth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The first embodiment is as follows:
referring to fig. 1, the method for evaluating an intelligent assistant provided in this embodiment includes:
S101: evaluating the target intelligent assistant according to a preset evaluation scheme to obtain an evaluation result;
S102: generating an evaluation report according to the evaluation result.
In some embodiments, a target intelligent assistant to which embodiments of the present invention are directed includes, but is not limited to, a system or device that can engage in human-machine conversation. The dialog may help the user complete a specific task, such as querying local weather; it may also be casual chat, giving the user companionship and dispelling loneliness. Of course, the intelligent assistant may also provide services by collecting information about the user's actions, expressions, moods, etc., and may even guide the user to perform corresponding actions.
Of course, in some embodiments, the target intelligent assistant according to the embodiments of the present invention may also include, but is not limited to, a device that can communicate with other creatures that have their own ideas, not only people; for example, a device that can determine the need a dog expresses by analyzing the dog's vocalizations may also serve as a target intelligent assistant evaluated in the embodiments of the present application.
In some embodiments, the target intelligent assistant in the embodiments of the present invention may communicate with a living being, such as a human, through forms other than language, for example by acquiring other biological signals of that being, such as brain waves, analyzing them, and giving feedback. For example, by analyzing biological signals including the user's brain waves, the assistant determines that the user's intention is to know tomorrow's weather, and then, as its next action, informs the user of tomorrow's weather through voice broadcast and/or text display, so as to meet the user's current requirement.
In some embodiments, the physical form of the target intelligent assistant need not be fixed; for example, the target intelligent assistant may communicate with the user through any device that satisfies certain conditions, such as a device with a speaker.
In some embodiments, the target intelligent assistant in embodiments of the present invention is capable of understanding source information from a user, such as text, sound, voice, image, video, touch operations, etc., and performing related actions; can also understand the source information such as sensor input signals from the environment and complete the related actions; at the same time, it is also possible to understand the source information from the feedback and to perform the relevant actions. Referring to fig. 2, fig. 2 is a diagram of a typical target intelligent assistant architecture, which is described in detail as follows:
1) the user interface module obtains source information such as voice, characters, touch control, gestures and the like input by a user;
2) the user interface module outputs a source information stream to the information collection module;
3) the information collection module outputs the sorted information containing the context to the information understanding module;
4) the information understanding module outputs the analysis result of the context information to the action decision module;
5) the action decision module outputs a decision result to the information collection module for evaluating the selected optimal decision;
6) the action decision module outputs an optimal decision result to the action module;
7) the action module outputs feedback information to the information collection module;
8) the action module outputs information such as text, images, videos, sounds and the like to the user interface module;
9) the user interface outputs media streams such as voice, text, images, video, sound and the like to the user;
10) the action module outputs requests such as form submission, resource acquisition or command execution and the like to the information adaptation and exchange module;
11) the information adaptation and exchange module outputs request information such as control commands and the like to the external Internet of things equipment;
12) the information adaptation and exchange module outputs request information such as form submission, resource acquisition and the like to external application;
13) the information adaptation and exchange module outputs information such as commands to be executed to the robot;
14) the information adaptation and exchange module obtains source information such as events, signals and the like input by external sensors and the like;
15) the information adaptation and exchange module obtains source information such as new knowledge or knowledge update input by an external knowledge source and the like;
16) the information adaptation and exchange module obtains other event source information such as external cooperation requests, service state updates and the like;
17) the information adaptation and exchange module outputs the source information to the information collection module.
In some embodiments, the intelligent assistant is able to understand source information from a user, such as text, sound, voice, images, video and touch operations, and perform related actions; it can also understand source information from the environment, such as sensor input signals, and complete related actions; at the same time, it can understand source information from feedback and perform related actions. An intelligent assistant should include at least one of the following six parts: a user interface, information collection, information understanding, action decision, action, and information adaptation and exchange.
The intelligent assistant comprises the following functions (a minimal sketch of how these parts might be wired together follows the list):
a) the user interface provides keyboard, handwriting, touch, voice, gesture and other human-computer interaction modes for the user to input source information, and transmits information to the user in voice, text, image, video and other modes;
b) the information collection module fuses various source information to form context information which can be understood by the intelligent assistant;
c) the information understanding module analyzes the context information arranged by the information collecting module, and predicts and generates information for supporting behavior decision; meanwhile, the module needs to learn source information such as knowledge and feedback from the inside and the outside, so that the analysis and understanding capability is improved;
d) the action decision module selects a proper action or a group of actions according to the information generated by the information understanding module; meanwhile, the module needs to expand decision space and improve planning capability according to source information from the inside and the outside, such as knowledge, feedback and the like;
e) the action module calls internal and external resources according to the optimal decision generated by the action decision module and executes corresponding action, and simultaneously, the action execution result is fed back to the information collection module by the action module.
f) The information adaptation and exchange module is responsible for connecting the internal and external resources and completing the data format conversion of the internal and external resources.
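As a minimal illustration of how these six parts might be wired together, the following Python sketch models the flow from user input through collection, understanding, decision and action, with the action result fed back to information collection. All class and method names here are hypothetical and chosen only for this sketch; they are not defined by this document.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Context:
    """Sorted source information with context, produced by information collection."""
    turns: List[Dict[str, Any]] = field(default_factory=list)


class InformationCollection:
    def __init__(self):
        self.context = Context()

    def collect(self, source_info: Dict[str, Any]) -> Context:
        # Fuse user input, feedback and external events into one context.
        self.context.turns.append(source_info)
        return self.context


class InformationUnderstanding:
    def analyze(self, context: Context) -> Dict[str, Any]:
        # Placeholder analysis: in practice this would run NLU, emotion
        # recognition, etc., and return information supporting decisions.
        last = context.turns[-1] if context.turns else {}
        return {"intent": last.get("text", ""), "confidence": 1.0}


class ActionDecision:
    def decide(self, analysis: Dict[str, Any]) -> Dict[str, Any]:
        # Choose an action (or set of actions) from the analysis result.
        return {"action": "reply", "payload": analysis["intent"]}


class Action:
    def execute(self, decision: Dict[str, Any]) -> Dict[str, Any]:
        # Call internal/external resources and return feedback.
        return {"output": decision["payload"], "status": "ok"}


class Assistant:
    """Minimal pipeline: user interface -> collection -> understanding -> decision -> action."""

    def __init__(self):
        self.collection = InformationCollection()
        self.understanding = InformationUnderstanding()
        self.decision = ActionDecision()
        self.action = Action()

    def handle(self, user_input: Dict[str, Any]) -> Dict[str, Any]:
        context = self.collection.collect(user_input)
        analysis = self.understanding.analyze(context)
        decision = self.decision.decide(analysis)
        feedback = self.action.execute(decision)
        # The action result is fed back into information collection (step 7 above).
        self.collection.collect({"feedback": feedback})
        return feedback


if __name__ == "__main__":
    assistant = Assistant()
    print(assistant.handle({"modality": "text", "text": "what is the weather tomorrow"}))
```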
In some embodiments, the evaluating comprises at least one of:
internal evaluation and external evaluation.
It is noted that, in some embodiments, evaluations may be classified by their source, which can be determined by the identity of the evaluator: if the evaluator is a trained internal professional, that person's evaluation is an internal evaluation, and if the evaluator is an external user, that user's evaluation is an external evaluation.
In some embodiments, the source of an evaluation may also be determined by the interface through which it is transmitted: an evaluation collected through an interface provided to external users is an external evaluation, and an evaluation collected through an interface provided to internal evaluators is an internal evaluation.
In some embodiments, evaluating the target intelligent assistant may be an internal evaluation only or an external evaluation only. In some embodiments, the evaluation of the target intelligent assistant may also be a composite evaluation that combines an internal evaluation and an external evaluation. That is, for the evaluation of the target intelligent assistant, those skilled in the art may select a simple internal or external evaluation, or a combination of both, as needed.
In some embodiments, if the evaluation includes an internal evaluation, the preset evaluation scheme includes the evaluation capability items to be evaluated by an internal professional, for assessing the intelligent capability of the target intelligent assistant.
It should be noted that in some embodiments, the intelligent capability may be understood as the capability of the intelligent assistant to receive the user's requirement, determine the user's requirement, perform the next action according to the user's requirement, and improve the business capability of the intelligent assistant.
It is noted that in some embodiments, the internal evaluation may be based on a classification of users' requirements for the target intelligent assistant. Users' demands on target intelligent assistants can be summarized into the following four broad categories: emotional support, knowledge support, activity support, and decision support. The specific content of each category of requirement is as follows:
a) Emotional support
1) Giving the user encouragement, care and companionship through human-computer interaction, helping to pass the time, and reducing negative emotions such as loneliness;
2) the basic need is chit-chat, and the key needs are emotional, topical, and heuristic conversation.
b) Knowledge support
1) Providing knowledge question answering and knowledge searching for a user;
2) the basic requirements are limited domain question answering and single sentence searching, and the key requirements are open domain question answering and drill-down searching.
c) Activity support
1) The device can replace people to perform important activities in daily life, such as controlling household appliances, playing audio-visual contents, shopping, cleaning, information inquiry and the like;
2) the basic requirements are single sentence instruction control, conversational form submission and conversational form cancellation, and the key requirements are heuristic control, autonomous interactive control, scene linkage control, conversational form filling and conversational form modification.
d) Decision support
1) Making decision suggestions such as recommendation, planning and the like for the user;
2) the basic requirements are personalized recommendation (interest sensitive) and dynamic task planning, and the key requirements are personalized recommendation (time sensitive), personalized recommendation (relevance sensitive), deductive reasoning and task sequence planning (time sensitive and cost sensitive).
Additional capability requirements in the intelligent capability framework are expanded around these needs. The ability of the target intelligent assistant to cope with abnormal conditions and to learn autonomously further improves its ability to meet user needs. The working modes in which the target intelligent assistant serves the user can be divided into an active mode and a passive mode, and the working mode has an obvious influence on the ability to meet user needs.
In some embodiments, the intelligent capabilities of the target intelligent assistant include, but are not limited to, at least one of the following capabilities: interaction capability, decision capability, transaction capability, and learning capability.
In some embodiments, the interaction capability includes at least one of the following sub-capability items: information feedback, information understanding, information identification and information collection, wherein:
the information feedback includes at least one of the following capabilities: image generation, voice synthesis, abstract generation and natural language generation;
the information understanding includes at least one of the following capabilities: dynamic topic drift, spatial understanding, emotional understanding, temporal understanding, video understanding, image understanding, natural language understanding (without context), natural language understanding (with context);
the information identification includes at least one of the following capabilities: action recognition, emotion recognition, image recognition, voice recognition and knowledge extraction;
the information collection includes at least one of the following capabilities: feedback information input, image input, video input, external event source input, text input, voice input.
In some embodiments, the decision capability includes at least one of the following sub-capability items: planning, recommending and reasoning, wherein:
the planning includes at least one of the following capabilities: dynamic task planning, task sequence planning and exception handling planning;
the recommendation includes at least one of the following capabilities: personalized recommendation;
reasoning includes at least one of the following capabilities: case reasoning, uncertainty reasoning, inductive reasoning, and deductive reasoning.
In some embodiments, the transaction capability includes at least one of the following sub-capability items: third-party services, dialogue, control, task form submission, search, performance, business monitoring and handling, and knowledge question answering, wherein:
the third party service includes at least one of the following capabilities: a service access mode and a service system;
the dialog includes at least one of the following capability items: multi-modal dialog, personalized dialog, heuristic dialog, task-based dialog, emotional dialog, chatting, active dialog;
controlling includes at least one of the following capabilities: scene linkage control, single sentence instruction control, multi-mode control, heuristic control and autonomous interactive control;
task form submission includes at least one of the following capabilities: conversational forms, single biometric verification;
the search includes at least one of the following capabilities: vertical search, single sentence search, reply automatic search, heuristic search, image search, drill-down search;
the performance includes at least one of the following capabilities: reliability, transaction flow efficiency, availability, response speed and initiative;
business monitoring and handling includes at least one of the following capabilities: task exception handling, task exception notification and task state management;
the knowledge question-answer includes at least one of the following ability items: open domain question answering, context question answering, map question answering, limited domain question answering, information abstract and reading comprehension.
In some embodiments, the learning capability includes at least one of the following sub-capability items: feedback learning, personalized learning, algorithm optimization and new knowledge learning, wherein:
the feedback learning includes at least one of the following ability items: online learning of user feedback;
personalized learning includes at least one of the following capability items: real-time user portrait updating and online feature learning;
the algorithm optimization comprises at least one of the following ability items: model fusion, model optimization and small sample learning;
the new knowledge learning includes at least one of the following ability items: new logic learning, new emotional emotion learning, new task learning, new speech expression learning, knowledge discovery, new voice learning, knowledge updating and new image learning.
It should be noted that the evaluation direction of the intelligent capability can be adjusted and increased according to the development of the technology or the needs of the industry, the user, and the like.
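One possible, purely illustrative way to encode part of the capability-item hierarchy above for use in a preset evaluation scheme is a nested dictionary. The keys and the subset of items listed below are assumptions chosen for illustration, not a normative schema; the full lists of sub-capability items are given in the text above.

```python
# Illustrative subset of the intelligent-capability hierarchy.
CAPABILITY_TREE = {
    "interaction": {
        "information_feedback": ["image generation", "speech synthesis",
                                 "abstract generation", "natural language generation"],
        "information_collection": ["text input", "voice input", "image input"],
    },
    "decision": {
        "planning": ["dynamic task planning", "task sequence planning"],
        "reasoning": ["case reasoning", "deductive reasoning"],
    },
    "transaction": {
        "knowledge_qa": ["open domain QA", "limited domain QA", "reading comprehension"],
    },
    "learning": {
        "new_knowledge_learning": ["knowledge updating", "new task learning"],
    },
}


def list_capability_items(tree):
    """Flatten the hierarchy into (main capability, sub-capability, item) triples."""
    for main, subs in tree.items():
        for sub, items in subs.items():
            for item in items:
                yield main, sub, item
```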
Table 1 is an optional comparison table for the capability level division standard; those skilled in the art may make corresponding adjustments according to actual needs.
TABLE 1 comparison table of capability rating
(Table 1 is provided as images in the original publication and is not reproduced here.)
It should be noted that the evaluation level division standard given in Table 1 is an exemplary feasible standard. The evaluation criteria of each main capability item of the intelligent capability evaluation, and of the specific capability items classified under it, may of course be changed to some extent or have capability items added. Table 2 therefore gives an example of the intelligent assistant capability level division rules, and those skilled in the art may adjust the evaluation standard of Table 1 according to the capability level divisions of Table 2.
It should be noted that the intelligent assistant intelligent capability rating specification is not constant, and those skilled in the art can make appropriate revisions as needed. As shown in table 2:
TABLE 2 Intelligent Assistant Smart capability ratings provisions
(Table 2 is provided as images in the original publication and is not reproduced here.)
In some embodiments, if the evaluation includes an internal evaluation, the preset evaluation scheme includes an evaluation capability item to be evaluated, in which an internal professional evaluates the intelligent capability of the target intelligent assistant;
the evaluation results include internal evaluation results.
In some embodiments, referring to fig. 3, when the evaluation is an internal evaluation, evaluating the target intelligent assistant according to a preset evaluation scheme and obtaining the evaluation result includes:
S301: acquiring a preset evaluation scheme;
in some embodiments, the preset evaluation scheme includes the evaluation level division standard corresponding to each capability item to be evaluated, the evaluation cases of the capability items to be evaluated, and the benchmark rating corresponding to each capability item to be evaluated;
S302: obtaining an internal evaluation result.
Wherein the internal evaluation result includes an internal actual rating and a standard-reaching ratio;
the internal actual rating is the rating of each capability item to be evaluated of the target intelligent assistant, obtained when an internal professional evaluates the target intelligent assistant according to the preset evaluation scheme;
the standard-reaching ratio is the ratio of the number of capability items to be evaluated whose internal actual rating is greater than or equal to the benchmark rating to the total number of capability items to be evaluated.
In some embodiments, the internal evaluation result further includes a comprehensive rating,
where the comprehensive rating is obtained by calculating over the internal actual ratings of the target intelligent assistant.
It should be noted that the standard-reaching ratio is thus the proportion of capability items to be evaluated whose internal actual rating reaches or exceeds the benchmark rating;
it should be noted that the preset evaluation scheme may be at least a part of the evaluation equivalent division criteria shown in table 1, the reference rating of each item to be evaluated set by the relevant person or algorithm, and the evaluation case of the item to be evaluated.
It should be noted that the evaluation case may be a command which is specifically available for the target intelligent assistant to execute based on the capability item to be evaluated. For example, when the image input capability item of a certain target intelligent assistant is taken as the capability item to be evaluated, taking the equivalent division standard of table 1 as an example, the evaluation case at least includes common format images (gif, jpg, png, etc.) provided to the target intelligent assistant and required to be input by the target intelligent assistant, and the target intelligent assistant is required to collect images, that is, whether the target intelligent assistant can support a camera to take a picture is detected, the target intelligent assistant is required to take a picture, a useful image is captured from a captured film, and the target intelligent assistant is required to focus.
The process of evaluating the target intelligent assistant according to the preset evaluation scheme is further explained below through a specific embodiment:
According to the preset evaluation scheme of the current target intelligent assistant A, the following three capability items to be evaluated are rated: case reasoning, uncertainty reasoning and scene linkage control. The level division standard of each capability item is given in Table 3, and the benchmark ratings of the capability items to be evaluated are determined as: case reasoning level 3, uncertainty reasoning level 4, and scene linkage control level 2. Target intelligent assistant A obtains the preset evaluation scheme and executes the evaluation cases of the capability items to be evaluated, generating an internal actual rating for each capability item. If the internal actual ratings of target intelligent assistant A are case reasoning level 1, uncertainty reasoning level 3, and scene linkage control level 5, its standard-reaching ratio is 33.3% (1/3); the internal actual ratings and the standard-reaching ratio are then filled into the report template. A minimal sketch of this computation is given after Table 3.
TABLE 3 comparison table of grade division standard of item of capability to be measured
(Table 3 is provided as images in the original publication and is not reproduced here.)
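A minimal sketch of the computation in the example above, assuming the ratings of target intelligent assistant A given in the text:

```python
# Benchmark ratings from the preset evaluation scheme and the internal actual
# ratings produced by evaluating target intelligent assistant A.
benchmark = {"case reasoning": 3, "uncertainty reasoning": 4, "scene linkage control": 2}
actual = {"case reasoning": 1, "uncertainty reasoning": 3, "scene linkage control": 5}

# A capability item reaches the standard when its internal actual rating is
# greater than or equal to its benchmark rating.
reached = [item for item, level in actual.items() if level >= benchmark[item]]
standard_reaching_ratio = len(reached) / len(benchmark)

print(reached)                           # ['scene linkage control']
print(f"{standard_reaching_ratio:.1%}")  # 33.3%
```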
In some embodiments, after the target intelligent assistant obtains the preset evaluation scheme and generates the internal actual rating of each item of evaluation capability to be tested of the target intelligent assistant, the method further includes:
calculating the internal actual rating to obtain the comprehensive rating of the target intelligent assistant;
filling in the comprehensive rating into the report template.
It should be noted that the method of calculating the comprehensive rating from the internal actual ratings may be selected by those skilled in the art as needed, for example a weighted average or a simple average.
The internal evaluation process in the examples of the present invention will be further described below by way of a specific example. Referring to fig. 4, fig. 4 is a flow chart of another internal evaluation provided by the embodiment of the present invention, as shown in fig. 4:
s401: and (5) making a preset evaluation scheme.
In some embodiments, according to the evaluation purpose needs, the influence factors of the intelligent capability level of the target intelligent assistant are comprehensively considered, and an evaluation scheme conforming to the needs of the target intelligent assistant is formulated. The evaluation can be implemented by selecting a self-made scheme, or a professional organization or a third party can be entrusted to make an evaluation scheme so as to obtain a social approval result.
In some embodiments, the smart assistant product being evaluated and its characteristics, including system source, use, and manner of use, etc., are identified, defined, and described prior to evaluation. Before evaluation, the evaluation purpose and range are determined, and a preset evaluation scheme is determined according to the evaluation capacity items to be evaluated given by the evaluation grade division standard and the reference grades corresponding to the evaluation capacity items to be evaluated.
S402: and packaging the interface of the target intelligent assistant and accessing the interface into a unified management interface.
S403: and obtaining an internal evaluation result.
In some embodiments, the internal evaluation result comprises an internal actual rating, in some embodiments, an evaluation case of the item to be evaluated in a preset evaluation scheme is imported, evaluation is performed according to the evaluation case of the item to be evaluated, and each item to be evaluated of the target intelligent assistant is rated according to an evaluation grade division standard to obtain the internal actual rating.
In some embodiments, the intelligent capability level of the target intelligent assistant is evaluated, according to the purpose of the evaluation, in combination with how well the functions of the evaluated target intelligent assistant meet the demand. Here only which capability items are satisfied by the target intelligent assistant is considered, with higher-level capability items overriding lower-level ones. Taking the capability items in Table 1 as an example: feedback information input contains capability items at levels 1 and 4 (levels with no corresponding capability item are not counted), so if the intelligent assistant reaches the level-4 requirement, the capability items of both level 1 and level 4 are satisfied at the same time.
It should be noted that the internal evaluation result also includes the ratio of reaching standards.
In some embodiments, the number of capability items of the target intelligent assistant that reach the benchmark rating is counted, and the percentage of such items is calculated from the total number of evaluated capability items. That is, the number of capability items whose internal actual rating is greater than or equal to the corresponding benchmark rating is obtained, and its ratio to the total number of capability items to be evaluated gives the standard-reaching ratio.
In some embodiments, the internal evaluation result further includes a composite rating, and the composite rating is obtained by calculating the internal actual rating.
In some embodiments, the internal actual ratings corresponding to the various items of ability to be evaluated are weighted and averaged to obtain a comprehensive rating.
It should be noted that, in some embodiments, the weights may be set according to actual conditions in combination with the comprehensive rating obtained through internal evaluation; when evaluations are performed within the same industry or for the same purpose, the same weighting scheme for the capability items to be evaluated should be adopted so that the evaluation results are comparable.
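A minimal sketch of one possible weighted-average computation of the comprehensive rating; the weights below are assumptions chosen only for illustration:

```python
# Internal actual ratings and illustrative weights for the capability items.
actual = {"case reasoning": 1, "uncertainty reasoning": 3, "scene linkage control": 5}
weights = {"case reasoning": 0.5, "uncertainty reasoning": 0.3, "scene linkage control": 0.2}

# Weighted average of the internal actual ratings gives the comprehensive rating.
comprehensive_rating = sum(actual[i] * weights[i] for i in actual) / sum(weights.values())
print(round(comprehensive_rating, 2))  # 2.4
```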
In some embodiments, if the evaluation includes an external evaluation, the preset evaluation scheme includes an evaluation item of the external user evaluation target smart assistant, and an evaluation result of the external evaluation is an external evaluation result.
It should be noted that, in some embodiments, the external evaluation is mainly divided into two parts: intent recognition and comment summarization. Intent recognition identifies the intent of an external user's comment and confirms its emotional tendency; the comment summary combines the ratings and comments given by different users on various services into a comprehensive result.
In some embodiments, if the evaluation is an external evaluation, the target smart assistant is evaluated according to a preset evaluation scheme, and obtaining the evaluation result includes:
obtaining comments of external users on the ability items to be evaluated and external ratings;
identifying a comment intent of the comment;
and generating a comment abstract according to the external rating and the comment intention.
It is noted that, in some embodiments, the review summary is an external review result.
It should be noted that, in some embodiments, before obtaining the external user's comment and external rating of the item of ability to be evaluated, an evaluation index is further determined, where the evaluation index at least includes an evaluation purpose and a range, and the item of ability to be evaluated is determined according to an evaluation ability item system and an ability item given by a pre-specified external evaluation standard.
It should be noted that, in some embodiments, the comments of external users on the capability items to be evaluated may be text, emoticons, or the like. The comments may also be preliminarily screened to remove comments that obviously have no reference value and retain those that meet the comment screening conditions; comments to be screened out may include, for example, content unrelated to the target intelligent assistant, such as novels, prose and large copied segments of lyrics, as well as large identical copy-pasted comments. Furthermore, the number and frequency of screened-out comments, and the identity, region and login mode (WeChat, telephone number, etc.) of the external users concerned, can be recorded and analyzed.
It should be noted that the way the various ratings in this embodiment (such as the benchmark rating, the external rating and the internal actual rating) are expressed may be set by those skilled in the art as needed: simply as numbers, such as levels 1 to 6 in Table 1; as Chinese or English words such as "good" or "very good"; as emoticons such as crying, frowning, neutral, smiling or laughing faces; or through a progress bar adjusting brightness, color temperature, color, and the like.
In some embodiments, identifying the comment intent of the comment includes:
obtaining comment categories of comments;
acquiring a format corresponding to the comment category;
rewriting the comments according to the format;
acquiring an optimal comment recognition model corresponding to the comment category, where the recognition model includes features, feature distances and evaluation level division rules;
and inputting the comment into the optimal comment recognition model to obtain the comment intent of the comment.
It is noted that, in some embodiments, the comment category may be determined by at least one of: the comment language, such as English, Chinese or Japanese; and the comment composition, such as emoticons, text, pictures, or text plus emoticons.
In some embodiments, rewriting the comment according to a format includes: performing data cleaning on the comment, where data cleaning includes, but is not limited to, word segmentation, augmentation, stop-word removal and the like. It should be noted that word segmentation recombines a sequence of consecutive characters into a sequence of words according to a certain specification, for example, dividing "surface" into "surface" and "partial". Augmentation may be understood as adding synonyms to a comment, for example extending a comment of "good" with "satisfactory" and the like. Stop words are words that are automatically filtered out before or after processing natural language data (or text) in order to save storage space and improve search efficiency in information retrieval.
It is noted that, in some embodiments, rewriting the comment according to the format further includes: before the comments are subjected to data cleaning, the comments are subjected to data format unification. The data format unification may unify the comments according to a preset format rule, and a specific manner of the format unification may adopt a related technology known by those skilled in the art. The preset format can be defined by those skilled in the art according to actual needs.
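A minimal preprocessing sketch along the lines of the rewriting steps above, assuming Chinese-language comments and the open-source jieba segmenter; the stop-word list and synonym table are placeholders, not part of this document:

```python
import jieba  # open-source Chinese word-segmentation library (assumed choice)

STOP_WORDS = {"的", "了", "和"}        # placeholder stop-word list
SYNONYMS = {"好": ["满意", "不错"]}     # placeholder augmentation table


def rewrite_comment(comment):
    """Unify format, segment, augment with synonyms, and drop stop words."""
    text = comment.strip().lower()               # unify data format
    tokens = list(jieba.cut(text))               # word segmentation
    augmented = []
    for tok in tokens:
        augmented.append(tok)
        augmented.extend(SYNONYMS.get(tok, []))  # augmentation with synonyms
    return [tok for tok in augmented if tok not in STOP_WORDS]  # stop-word removal


print(rewrite_comment("这个助手的回复很好"))
```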
In some embodiments, obtaining the best comment recognition model corresponding to the comment category includes:
obtaining at least one target comment identification model;
embedding the comments into each target comment recognition model;
and selecting the target comment recognition model with the best embedding result as the best comment recognition model.
The target comment recognition model may be a model for each comment category that is set in advance by a person skilled in the art according to conventional technical means.
In some embodiments, when the target comment recognition model is set, after the target comment recognition model is preliminarily set, model hyper-parameter tuning and/or feature selection for the model may be performed on the target comment recognition model.
In some embodiments, obtaining the target comment recognition model comprises:
obtaining a feature extraction and dimension-reduction method for each comment category;
measuring the distances between the features within each category;
and setting the target comment recognition model according to the feature extraction and dimension-reduction method and the distances between the features.
Referring to fig. 5-1, fig. 5-1 is a schematic flow chart of an intention identification method, as shown in fig. 5-1:
S501: obtaining the category of the comment;
in some embodiments, when multiple comments are obtained, the different problems are separated and the data sets are summarized.
S502: acquiring the format corresponding to the category of the comment;
S503: rewriting the comment according to the format;
in some embodiments, rewriting comprises:
unifying data formats;
and/or,
data cleaning, word segmentation, augmentation, stop-word removal and the like of the text data.
S504: obtaining a feature extraction and dimension-reduction method for each category;
S505: measuring, for each category, the distances between the features;
S506: establishing a data set that can be read by a model according to the features;
it should be noted that the data set supports various methods such as cross-validation, and also includes functions such as feature concatenation.
S507: establishing a target comment recognition model covering the various categories of problems;
it should be noted that the target comment recognition model also stores evaluation indexes and evaluation results, to facilitate hyper-parameter tuning and embedding.
S508: performing hyper-parameter tuning on the target comment recognition model;
S509: performing feature selection for the target comment recognition model;
S510: performing embedding with a plurality of methods;
S511: selecting a model according to the embedding results (see the sketch below).
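A hedged sketch of steps S508-S511 for the small-data case, selecting among candidate comment recognition models by cross-validation with scikit-learn. The candidate models, parameter grid and placeholder data are assumptions for illustration, not the patent's prescribed implementation:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Rewritten comments and their labelled intents (placeholder data).
comments = ["very helpful and accurate", "understands me well", "great recommendations",
            "does not understand anything", "responses are too slow", "completely irrelevant answers"]
intents = ["positive", "positive", "positive", "negative", "negative", "negative"]

candidates = {
    "svm": Pipeline([("tfidf", TfidfVectorizer()), ("clf", LinearSVC())]),
    "logreg": Pipeline([("tfidf", TfidfVectorizer()), ("clf", LogisticRegression(max_iter=1000))]),
}

best_name, best_score = None, -1.0
for name, pipeline in candidates.items():
    # S508: hyper-parameter tuning (grid kept tiny for illustration).
    search = GridSearchCV(pipeline, {"tfidf__max_features": [50, 100]}, cv=2)
    search.fit(comments, intents)
    # S510/S511: compare candidates on cross-validated score and keep the best.
    score = cross_val_score(search.best_estimator_, comments, intents, cv=2).mean()
    if score > best_score:
        best_name, best_score = name, score

print(best_name, round(best_score, 3))
```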
It should be noted that, for small data volumes and short texts, a machine learning model such as SVM or XGBoost may be used; for large data volumes and complex data, a deep learning model can achieve a better effect. Fig. 5-2 provides a typical multi-target classification architecture example whose core modules use multi-head attention and inception-resnet, briefly described below.
The sequence is the embedded form of the input sentence, including word embedding, character embedding, position embedding and the like. Multi-head attention is the attention mechanism commonly used in generative models and is adopted here to better extract sentence features.
The pre_information is information other than the text, such as context information and the user rating. This information is processed into a vector or matrix form and added directly to the result of the multi-head attention.
The subsequent structure is a typical inception-resnet structure, which has demonstrated powerful feature extraction capability in the image domain. The difference is that inception_resnet_c is split up to handle the multi-target case. Because a multi-head attention mechanism and fully connected layers are applied before this structure, word vectors are not split; their abstract features are processed instead. It should be noted that in multi-target classification, the more modules a feature has passed through, the more abstract it is, and it carries the information of the preceding loss functions, so the later a loss function sits in the pipeline, the finer-grained the intent it captures. As shown in fig. 5-2, loss_intent2 needs to be finer-grained than loss_intent1.
In the above structure, if a classification target is to be added, the corresponding module is connected after the existing modules and a loss term is added. For the sake of model convergence, it is not recommended that one model contain too many classification targets unless the targets are strongly correlated. If there is no external information, or the problem has only a single target, the redundant modules are simply removed.
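To make the fig. 5-2 description concrete, the following is a compressed PyTorch sketch of a multi-target classifier built from multi-head self-attention, a pre_information fusion step, simplified inception-resnet-style 1-D blocks, and two classification heads of increasing granularity (loss_intent1, loss_intent2). All layer sizes, the block design and the head counts are illustrative assumptions, not the patented architecture.

```python
import torch
import torch.nn as nn

class InceptionResBlock1d(nn.Module):
    """Simplified 1-D inception-resnet block: parallel convolutions plus a residual add."""
    def __init__(self, dim):
        super().__init__()
        self.branch1 = nn.Conv1d(dim, dim // 2, kernel_size=1)
        self.branch3 = nn.Conv1d(dim, dim // 2, kernel_size=3, padding=1)
        self.merge = nn.Conv1d(dim, dim, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x):                                 # x: (batch, dim, seq)
        y = torch.cat([self.branch1(x), self.branch3(x)], dim=1)
        return self.act(x + self.merge(y))                # residual connection

class MultiTargetIntentModel(nn.Module):
    def __init__(self, vocab=5000, dim=128, heads=4, pre_dim=8, n_coarse=5, n_fine=20):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)             # stand-in for word/char/position embedding
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.pre_proj = nn.Linear(pre_dim, dim)           # pre_information (ratings, context)
        self.block1 = InceptionResBlock1d(dim)
        self.block2 = InceptionResBlock1d(dim)
        self.head_coarse = nn.Linear(dim, n_coarse)       # loss_intent1: coarse intent
        self.head_fine = nn.Linear(dim, n_fine)           # loss_intent2: finer-grained intent

    def forward(self, tokens, pre_info):
        x = self.embed(tokens)                            # (batch, seq, dim)
        x, _ = self.attn(x, x, x)                         # multi-head self-attention
        x = x + self.pre_proj(pre_info).unsqueeze(1)      # fuse external information
        x = x.transpose(1, 2)                             # (batch, dim, seq) for the conv blocks
        x = self.block1(x)
        coarse = self.head_coarse(x.mean(dim=2))          # pooled features -> coarse target
        x = self.block2(x)                                # deeper modules -> more abstract features
        fine = self.head_fine(x.mean(dim=2))              # deeper features -> finer target
        return coarse, fine

model = MultiTargetIntentModel()
logits1, logits2 = model(torch.randint(0, 5000, (2, 16)), torch.rand(2, 8))
print(logits1.shape, logits2.shape)                       # torch.Size([2, 5]) torch.Size([2, 20])
```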
In some embodiments, generating the review summary from the external rating and the review intent includes:
performing first statement processing on the comment intention to obtain a first statement processing result;
acquiring a weight corresponding to the comment intention;
calculating the first statement processing result and the external rating according to the weight to obtain a calculation result;
performing second statement processing on the comment intention to obtain a second statement processing result;
normalizing the second statement processing result to obtain a normalized result;
and interacting the calculation result with the normalization result to generate a comment abstract.
Fig. 6 provides a flow diagram of a review summary generation method, organized as the review summary generation framework shown in fig. 6. The framework is mainly based on a generative model; when training data are scarce, traditional extractive methods such as TextTeaser and TextRank can be used for summarization, and if the results they produce are of high enough quality, those results can, after review, serve as training corpus for the generative framework. The structure in fig. 6 mainly includes the following processes:
1) summarizing the user comments and processing the embedded results, that is, processing the comment intentions; the processing is mainly feature extraction and can use structures such as bi-GRU, multi-head attention, and TCN;
2) because the system supports user scoring, the score is vectorized and added to the result of the sentence processing; a null value must be supported for the case where the user does not score. There are multiple ways to vectorize ratings; a typical approach is to assign a trainable random vector to each rating level;
3) introducing an external trainable attention matrix and performing attention calculation on the result of the previous step to obtain weights; the weights are then multiplied with the result of the previous step to obtain the final result at the encoder end;
4) the decoder end adopts the decoder structure of a Transformer: the sentence processing of this part uses masked multi-head attention, the normalization uses layer normalization, and the obtained result then interacts with the final result of the encoder end via multi-head attention. The whole process can be repeated Nx times, and the final output is the generated comment summary.
It should be noted that when the model is used online it may require multiple runs, each run using only the result at its corresponding position.
The comment abstract generation method has the function of automatically generating an online external evaluation result for a specific service according to a large number of online user comments and scores given by users.
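A minimal PyTorch sketch of the encoder side of fig. 6 follows: comment embeddings are passed through a bi-GRU feature extractor, a trainable rating embedding (with an extra slot for "no score") is added, and an externally introduced trainable attention matrix produces the position weights that are multiplied into the encoder output; a standard Transformer decoder then interacts with that memory. The dimensions, the bi-GRU choice and the attention scoring form are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReviewSummaryEncoder(nn.Module):
    """Encoder side of the fig. 6 framework (dimensions are illustrative)."""
    def __init__(self, vocab=5000, dim=128, n_ratings=5):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.feature = nn.GRU(dim, dim // 2, batch_first=True, bidirectional=True)
        # One trainable vector per rating level, plus index 0 for "user did not score".
        self.rating_embed = nn.Embedding(n_ratings + 1, dim)
        # Externally introduced trainable attention matrix.
        self.attn_matrix = nn.Parameter(torch.randn(dim, dim))

    def forward(self, tokens, rating):
        h, _ = self.feature(self.embed(tokens))            # (batch, seq, dim) sentence features
        h = h + self.rating_embed(rating).unsqueeze(1)     # add the vectorised user rating
        scores = h @ self.attn_matrix @ h.mean(1, keepdim=True).transpose(1, 2)
        weights = F.softmax(scores, dim=1)                 # per-position attention weights
        return h * weights                                 # weighted final result of the encoder end

encoder = ReviewSummaryEncoder()
memory = encoder(torch.randint(0, 5000, (2, 20)), torch.tensor([3, 0]))   # 0 = no score
# Decoder side: masked self-attention + cross-attention with the encoder memory,
# repeated Nx times, as in a standard Transformer decoder.
decoder_layer = nn.TransformerDecoderLayer(d_model=128, nhead=4, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)
summary_states = decoder(tgt=torch.rand(2, 10, 128), memory=memory)
print(summary_states.shape)                                # torch.Size([2, 10, 128])
```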
In some embodiments, the external evaluation further comprises:
and stopping obtaining the comment and the external rating according to a test stop condition (a minimal check is sketched after this list), wherein the test stop condition comprises at least one of the following conditions:
the service time of the target intelligent assistant is longer than the preset service time;
the number of the comments is larger than the preset number of the comments;
an external stop command.
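A minimal sketch of the test stop check; the threshold names and values are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class StopConditions:
    max_service_hours: float = 72.0      # preset service time
    max_comments: int = 10000            # preset number of comments

def should_stop(service_hours: float, comment_count: int,
                external_stop: bool, cond: StopConditions) -> bool:
    """Stop collecting comments and external ratings when any condition is met."""
    return (service_hours > cond.max_service_hours
            or comment_count > cond.max_comments
            or external_stop)

print(should_stop(80.0, 1200, False, StopConditions()))   # True: service time exceeded
```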
The external evaluation process of the intelligent assistant is illustrated by a specific embodiment. Referring to fig. 7, which is a schematic flowchart of an external evaluation method for an intelligent assistant:
S701: formulating a preset evaluation scheme.
In some embodiments, according to the needs of the evaluation purpose, the factors influencing the intelligent capability level of the target intelligent assistant are comprehensively considered and an evaluation scheme meeting the needs of the target intelligent assistant is formulated. The evaluation can be implemented with a self-made scheme, or a professional organization or third party can be entrusted to formulate the evaluation scheme so as to obtain a socially recognized result.
In some embodiments, the smart assistant product being evaluated and its characteristics, including system source, use, and manner of use, etc., are identified, defined, and described prior to evaluation. Before evaluation, the evaluation purpose and range are determined, and a preset evaluation scheme is determined according to the evaluation capacity items to be evaluated given by the evaluation grade division standard and the reference grades corresponding to the evaluation capacity items to be evaluated.
S702: and packaging the interface of the target intelligent assistant and accessing the interface into a unified management interface.
S703: and obtaining the comment of the external user on the capability item to be evaluated and the external rating.
In some embodiments, the evaluation capability items to be evaluated by the target intelligent assistant are provided for external users under a unified interface, and the evaluation and the rating of the users are supported.
It should be noted that an external user is an ordinary user of the target intelligent assistant, whose comments and ratings are given based on personal experience; external users need no professional training to bring their rating or comment standards onto a common scale. If necessary, however, some specific information about external users can be obtained so that their comments and ratings can be analyzed more precisely. For example, the region of the external user is obtained; if the region is Xinjiang and, when evaluating the information collection and voice input capability items of the target intelligent assistant, external users in that region widely give low scores and poor comments, the target intelligent assistant can be trained on the Xinjiang dialect to improve its service capability for these users.
S704: and setting a test stopping condition, and stopping obtaining comments and external ratings of the user on the intelligent capacity to be evaluated of the target intelligent assistant according to the test stopping condition.
It should be noted that the test stop condition may be set by the user or another relevant person, device, or system before the external evaluation is performed; it may also be set after the external evaluation has started, according to the actual situation; and a condition set before the external evaluation begins may be adjusted during the evaluation to form a new test stop condition.
In some embodiments, the test stop condition may be at least one of:
the service time of the target intelligent assistant is longer than the preset service time;
the number of the comments is larger than the preset number of the comments;
an external stop command.
Of course, the test stop condition may be other conditions set by those skilled in the art as needed.
S705: and acquiring the comment intention of the user.
S706: and obtaining the review abstract.
It should be noted that the review summary includes the actual rating of the target intelligent assistant by the external user and the actual review after processing.
In some embodiments, the intelligent capability level of the target intelligent assistant is evaluated according to the evaluation purpose in combination with the capability of the function of the evaluated target intelligent assistant to meet the demand.
In some embodiments, a reasonable evaluation result is formed by applying a comprehensive evaluation method or other methods according to an evaluation capability item system corresponding to the level of the intelligent capability of the target intelligent assistant and the evaluation capability item to be evaluated, and the actual rating is calculated to obtain the comprehensive rating.
In some embodiments, the actual ratings corresponding to the various items of ability to be evaluated are weighted and averaged to obtain a comprehensive rating.
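A minimal sketch of that weighted average, with illustrative capability items and weights:

```python
def comprehensive_rating(ratings: dict, weights: dict) -> float:
    """Weighted average of the actual rating of each capability item to be evaluated."""
    total_weight = sum(weights[item] for item in ratings)
    return sum(ratings[item] * weights[item] for item in ratings) / total_weight

# Illustrative capability items, ratings and weights.
ratings = {"speech_input": 4, "info_collection": 5, "dialogue": 3}
weights = {"speech_input": 0.5, "info_collection": 0.3, "dialogue": 0.2}
print(round(comprehensive_rating(ratings, weights), 2))    # 4.1
```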
In some embodiments, generating the evaluation report according to the evaluation result comprises:
acquiring an evaluation report template, wherein the evaluation report template is obtained by acquiring and filling the content required to be filled in a preset evaluation report description template;
analyzing the evaluation result, and extracting target data and target character information;
filling target data and target character information into an evaluation report template;
and generating an evaluation report.
It should be noted that, in some embodiments, the preset evaluation report description template may be a template set by a user or an evaluator as needed, or one of a plurality of such templates preset by the system may be selected as the preset evaluation report description template.
In some embodiments, according to the needs of the evaluation purpose, the factors influencing the intelligent capability level of the target intelligent assistant are comprehensively considered and an evaluation report description template meeting the needs is formulated. The template can be formulated by the evaluator itself, or a professional organization or third party can be entrusted to formulate it so as to obtain a socially recognized result.
In some embodiments, the preset evaluation report description template may be understood as the textual description of the evaluation report, and its content may include, but is not limited to, at least one of the following:
the implementation process and situation of the evaluation program, special case descriptions, the evaluation report date, the evaluation basis, a basic outline of the intelligent assistant product, the intelligent capability grading and definition of the intelligent assistant, restrictions on the use of the evaluation report, the evaluation purpose, the evaluation method, evaluation hypotheses and limiting conditions, the evaluation object and scope, and the like.
In some embodiments, before the evaluation report description template is established, the evaluated target intelligent assistant product and its characteristics, including system source, use and manner of use, are identified, defined and described, and the template is established according to this information.
It should be noted that the evaluation result may come from an external evaluation source, i.e. the review summary, and/or an internal evaluation source, i.e. the actual rating, the standard-reaching ratio and the comprehensive rating.
In some embodiments, filling in the target data and target text information into the evaluation report template includes:
acquiring a slot position of an evaluation report template;
and filling the target data and the target character information into the slot position of the evaluation report template through single sentence modeling and sequence editing modeling.
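As a toy illustration of the slot-filling step, the sketch below substitutes extracted target data and text into named slots of a template; in the described method the slot text itself would come from the single-sentence and sequence-editing models of figs. 8-2 and 8-3, for which a plain string substitution stands in here. The template wording and slot names are assumptions.

```python
import re

TEMPLATE = (
    "Evaluation report date: {report_date}. "
    "The target intelligent assistant obtained a comprehensive rating of {composite} "
    "and a standard-reaching ratio of {ratio}. External users commented: {summary}"
)

def fill_report(template: str, slots: dict) -> str:
    """Fill every {slot} in the template; unknown slots are left visible for review."""
    return re.sub(r"\{(\w+)\}", lambda m: str(slots.get(m.group(1), m.group(0))), template)

slots = {"report_date": "2020-11-12", "composite": 4.1, "ratio": "80%",
         "summary": "the weather skill is accurate but speech input in dialect is weak."}
print(fill_report(TEMPLATE, slots))
```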
The following further illustrates a specific flow of the method for generating an evaluation report according to the evaluation result by a specific embodiment:
As shown in fig. 8-1, in order to output the overall evaluation result of the target intelligent assistant more conveniently and efficiently, the invention designs an evaluation report generation method. One implementation flow is shown in fig. 8-1:
1) Acquiring an evaluation report template.
In some embodiments, the evaluation report template may be obtained by acquiring and filling in the content required by the preset evaluation report description template. It should be noted that the preset evaluation report description template may be understood as the textual description of the evaluation report, and the content to be filled in may include, but is not limited to, at least one of the following:
basic outline of the intelligent assistant product, evaluation purpose, evaluation object and scope, intelligent assistant intelligent capability grading and definition, evaluation hypothesis and limiting condition, evaluation basis, evaluation method, evaluation program implementation process and situation, special case description, use limit description of evaluation report, evaluation report date;
In some embodiments, the evaluation report template is generated from the above contents; this process may use text summarization and text matching techniques, i.e., selecting the important words of a large number of structured reports and generating the evaluation report template from them.
2) Analyzing the evaluation result, and extracting target data and target character information.
In some embodiments, the evaluation result comprises an internal evaluation result and an external evaluation result, and the important target data and target character information are extracted by performing information extraction on both. The internal evaluation result is at least one of the actual rating, the standard-reaching ratio and the comprehensive rating; the external evaluation result comprises the review summary, which contains the actual comments and actual ratings. Since the internal and external evaluation results are structured, the content can be extracted with regular expressions.
3) Filling the target data and target character information obtained after analysis into the slots of the evaluation report template to generate the evaluation report.
The above is the basic flow. When there is enough training data, a final evaluation report can be generated directly from the content required by the preset evaluation report description template and the internal and external evaluation results; the whole flow is then a typical seq2seq problem. In addition, the final evaluation report needs to express an emotional tendency according to how well the functions of the target intelligent assistant perform. To implement this while preserving language diversity, in some embodiments the invention may adopt the QuaSE framework, see fig. 8-2 and fig. 8-3; the specific flow is as follows:
The model includes two parts: single-sentence modeling, illustrated in fig. 8-2, and sequence-editing modeling, illustrated in fig. 8-3. In the single-sentence modeling of fig. 8-2, X and R are observations, representing a sentence (e.g., a user's evaluation of a certain function) and its corresponding numerical value (e.g., the user score). Z and Y are hidden variables, which are modeling representations of the sentence content and of the properties correlated with the sentence's numerical value.
The modeling of the hidden variables Z and Y is realized with a generative model. We design two encoders (E1 and E2) and one decoder (D), with X being generated conditioned on Z and Y.
The optimization goal of the model is that the generated sentence X' reconstructs the input sentence X as closely as possible. Because the integral in the optimization objective is intractable, a variational method is used to optimize a lower bound of the objective. In addition, a regression function F is designed to learn the mapping between the hidden variable Y and the value R.
Referring to the flow diagram of sequence-editing modeling illustrated in fig. 8-3, a pseudo-parallel sentence-pair data set is first constructed. The sequence-editing modeling mainly contains three parts:
1) The relationship between the content change and the numerical change from sentence x to sentence x' is established. The change from the original sentence x to the target sentence x' must consist of adding or removing some words, which produces a change in the value property, i.e. the difference between y and y'. For this change mapping we design a first objective function Ldiff;
2) x and x' must remain consistent in their main content, e.g. both must describe the "emotional dialog function". We introduce a second objective function Lsim to make z and z' as similar as possible;
3) If the generation process produces x given z and y, i.e. p(x|z, y), then the rewriting process can produce x' given z and y', i.e. p(x'|z, y'), and at the same time produce x given z' and y, i.e. p(x|z', y), which is a bidirectional process. A third loss function Ld-rec is introduced for these two generation processes.
Finally, the single sentence modeling and the sequence editing modeling model can be fused into a unified optimization problem to be trained through an end-to-end method.
And filling the target data and the target text information into the slot of the evaluation report template through the single sentence modeling and the sequence editing modeling.
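The following is a schematic PyTorch sketch of how the single-sentence objective and the three sequence-editing losses (Ldiff, Lsim, Ld-rec) could be combined into one end-to-end objective. The encoders E1 and E2, the decoder D and the regression function F are toy linear stand-ins operating on pre-computed sentence vectors; the actual QuaSE models are sequence models, so treat this only as an illustration of the loss structure, not the described implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 64
E1, E2 = nn.Linear(dim, dim), nn.Linear(dim, dim)   # encoders: content z, value-related y
D = nn.Linear(2 * dim, dim)                          # decoder: reconstruct x from (z, y)
Freg = nn.Linear(dim, 1)                             # regression from y to the numeric rating r

def single_sentence_loss(x, r):
    z, y = E1(x), E2(x)
    x_rec = D(torch.cat([z, y], dim=-1))             # reconstruct the input sentence
    return F.mse_loss(x_rec, x) + F.mse_loss(Freg(y).squeeze(-1), r)

def sequence_editing_loss(x, x_prime, r, r_prime):
    z, y = E1(x), E2(x)
    z_p, y_p = E1(x_prime), E2(x_prime)
    # Ldiff: the change of the value-related variable should track the rating change.
    l_diff = F.mse_loss(Freg(y_p).squeeze(-1) - Freg(y).squeeze(-1), r_prime - r)
    # Lsim: the main content of x and x' must stay consistent.
    l_sim = F.mse_loss(z_p, z)
    # Ld-rec: bidirectional rewriting, x' from (z, y') and x from (z', y).
    l_drec = (F.mse_loss(D(torch.cat([z, y_p], dim=-1)), x_prime)
              + F.mse_loss(D(torch.cat([z_p, y], dim=-1)), x))
    return l_diff + l_sim + l_drec

x, x_p = torch.rand(8, dim), torch.rand(8, dim)       # pseudo-parallel sentence-pair vectors
r, r_p = torch.rand(8), torch.rand(8)
total = single_sentence_loss(x, r) + sequence_editing_loss(x, x_p, r, r_p)
total.backward()                                      # both parts trained end to end
print(float(total))
```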
In some embodiments, evaluating the target intelligent assistant according to a preset evaluation scheme comprises:
acquiring a preset evaluation scheme;
packaging interfaces of at least one target intelligent assistant and accessing a unified management interface;
and respectively evaluating the target intelligent assistants through the management interface according to a preset evaluation scheme.
By encapsulating the interfaces of the target intelligent assistants and accessing a unified management interface, internal professionals and external users can use and evaluate multiple target intelligent assistants through the same management interface, which improves working efficiency. The evaluation cases for the capability items to be evaluated in the preset evaluation scheme can also be imported directly through the unified management interface, which further improves efficiency and greatly reduces evaluation errors caused by mistakes in the evaluation cases.
In some embodiments, the input and output interfaces of different typical intelligent assistants are packaged and managed in a unified way, so that the interfaces of different types of intelligent assistants can be used and tested on line in the same way.
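An adapter-style sketch of the unified management interface idea follows; the class and method names, and the stubbed-out HTTP call, are illustrative assumptions rather than part of the described system.

```python
from abc import ABC, abstractmethod

class AssistantAdapter(ABC):
    """Uniform wrapper around one target intelligent assistant's native interface."""
    @abstractmethod
    def ask(self, utterance: str) -> str: ...

class RestAssistant(AssistantAdapter):
    def __init__(self, endpoint: str):
        self.endpoint = endpoint                               # native interface (illustrative)
    def ask(self, utterance: str) -> str:
        return f"[{self.endpoint}] reply to: {utterance}"      # stub instead of a real call

class ManagementInterface:
    """Single entry point used by internal professionals and external users."""
    def __init__(self):
        self.assistants = {}
    def register(self, name: str, adapter: AssistantAdapter):
        self.assistants[name] = adapter
    def evaluate(self, name: str, test_cases: list) -> list:
        return [self.assistants[name].ask(case) for case in test_cases]

mgr = ManagementInterface()
mgr.register("assistant_a", RestAssistant("https://assistant-a.example/api"))
print(mgr.evaluate("assistant_a", ["What is the weather tomorrow?"]))
```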
The following shows the weighting and rating process for a target intelligent assistant through a specific internal evaluation flow, and explains how to execute a normal evaluation when the evaluation grade division standard corresponding to the capability items to be evaluated lacks a capability item. Referring to fig. 9, which is a schematic flowchart of another intelligent assistant evaluation method:
S901: acquiring a preset evaluation scheme.
A preset evaluation scheme is determined, comprehensively considering, according to the evaluation purpose, the service items the user wants evaluated. The preset evaluation scheme can be formulated by the evaluator itself, or a professional organization or third party can be entrusted to formulate it so as to obtain a socially recognized result.
S902: and acquiring the ability items to be evaluated.
Before evaluation, the intelligent assistant product being evaluated and its characteristics, including system source, use, and manner of use, are identified, defined, and described. Before evaluation, the purpose and the range of evaluation are determined, and the evaluation ability item to be evaluated is determined by combining an evaluation ability item system and an ability item given by an evaluation grade division standard.
S903: and determining the current evaluation grade division standard.
If the evaluation grade division standard lacks a capability item or contains outdated capability items, missing items can be added and outdated items modified; all modifications can be voted on by industry experts as required, and once passed, the updated results are adopted as the new evaluation grade division standard. Of course, the grading standard may also be adjusted directly by those skilled in the art.
S904: and setting the weight of the ability item to be evaluated.
The weights of the capability items to be evaluated are set according to the actual situation in combination with internal expert scoring. When evaluations are carried out within the same industry or for the same purpose, a uniform index weight setting scheme is adopted so that the evaluation results are comparable.
S905: and generating an evaluation report template.
The evaluation report template comprises the preset evaluation scheme, the capability items to be evaluated, the evaluation grade division standard, and the weight of each capability item to be evaluated; a structured template is automatically generated from these contents, and a fillable position is reserved for the evaluation result of each capability item to be evaluated.
S906: and packaging the target intelligent assistant interface to be evaluated and accessing the target intelligent assistant interface into a unified management interface.
S907: and importing a data set required by evaluation, and evaluating according to the evaluation content in the evaluation scheme.
S908: and the target intelligent assistant generates actual rating, comprehensive rating and standard ratio of each item to be evaluated of the target intelligent assistant according to a preset evaluation scheme.
Each capability item to be evaluated of the target intelligent assistant is rated according to the evaluation grade division standard, and the actual ratings of the capability items are weighted and summed according to their respective weights to obtain the comprehensive rating of the target intelligent assistant. The ratio of the number of capability items whose actual rating is greater than or equal to the benchmark rating to the total number of capability items to be evaluated is taken as the standard-reaching ratio.
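A minimal sketch of the standard-reaching ratio, with illustrative capability items and benchmark ratings:

```python
def standard_ratio(actual: dict, benchmark: dict) -> float:
    """Share of capability items whose actual rating meets or exceeds the benchmark rating."""
    met = sum(1 for item, score in actual.items() if score >= benchmark[item])
    return met / len(actual)

actual = {"speech_input": 4, "info_collection": 5, "dialogue": 3, "recommendation": 2}
benchmark = {"speech_input": 3, "info_collection": 4, "dialogue": 4, "recommendation": 3}
print(standard_ratio(actual, benchmark))   # 0.5: two of the four items reach the benchmark
```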
S909: and filling the actual rating, the comprehensive rating and the standard ratio of each item of the evaluation capability to be tested into an evaluation report template.
In some embodiments, the models of fig. 8-2 and 8-3 may be used to rewrite report text according to the result tendency of the internal expert evaluation in combination with the evaluation target and content to obtain the final evaluation report.
In some embodiments, after a large number of intelligent assistants have been evaluated and their internal and external evaluation results obtained, the result processing can be performed automatically and an evaluation report generated. Referring to fig. 10, which shows another flow of an intelligent assistant evaluation method:
S1001: acquiring an evaluation report template.
The evaluation report specification template can be obtained, and at least one of the following information is input according to the corresponding requirements of the template: the method comprises the steps of basic outline, evaluation purpose, evaluation object and range, intelligent assistant intelligent capability grade division and definition, evaluation hypothesis and limiting conditions, evaluation basis, evaluation method, evaluation program implementation process and situation, special matter description, use limit description of evaluation report, evaluation report date and the like of a target intelligent assistant product, and correspondingly adjusting an evaluation report description template, for example, deleting blank slots and the like to obtain the evaluation report template. The above information can be filled in the evaluation report description template through the scheme shown in fig. 8-2 and fig. 8-3.
S1002: analyzing the evaluation result, and extracting target data and target character information.
And automatically determining a preset evaluation scheme according to the evaluation report template, importing an evaluation case for internal evaluation, opening a corresponding interface for external evaluation, and obtaining an internal evaluation result and an external evaluation result.
Wherein the evaluation result comprises an internal evaluation result and an external evaluation result.
The internal evaluation is performed according to the preset evaluation scheme to obtain information such as the actual rating, the comprehensive rating, and the standard-reaching ratio.
And finally, processing the internal evaluation result and the external evaluation result, and extracting target data and target character information.
S1003: and filling the target data and the target character information into the evaluation report template.
And automatically generating a final evaluation report according to the internal and external evaluation results by adopting the multi-input source version of the model in the figures 8-2 and 8-3. The multi-input source version is obtained by customizing a feature processing module for different input data and training based on a large amount of historical evaluation data.
In some embodiments, referring to fig. 11, which is a system architecture diagram of an intelligent assistant evaluation system, the flow of an intelligent assistant evaluation method performed with this system is roughly as follows:
S1101: analyzing and preprocessing the various evaluation data and storing them into a database.
It should be noted that the evaluation data includes various items of ability to be evaluated, their rating criteria, and other main contents of the intelligent assistant as evaluation targets, for example: basic outline of target intelligent assistant product, evaluation purpose, evaluation object and scope, intelligent assistant intelligent ability grade division and definition, evaluation hypothesis and limiting condition, evaluation basis, evaluation method, evaluation program implementation process and situation, special matter description, use limit description of evaluation report and evaluation report date, etc.
S1102: confirming a preset evaluation scheme according to the requirement of an evaluation task, wherein the preset evaluation scheme comprises a capability item to be evaluated, an evaluation case, an evaluation grade division standard and the like;
S1103: selecting a specific evaluation mode according to the preset evaluation scheme through the encapsulated intelligent assistant interface;
S1104: if internal evaluation is needed, selecting the corresponding evaluation contents of the preset evaluation scheme in the database and rating each capability item to be evaluated through the internal evaluation module;
S1105: if external evaluation is needed, performing an open online test through the external interface and rating each service with the external evaluation module based on the external ratings and comments of external users;
Users can fully experience the services provided by the various intelligent assistants through the provided unified user interface. Their usage records are kept in full; in addition, users can score each service on a five-grade scale ranging from poor to satisfactory, and can also comment on the service.
S1106: and summarizing the internal evaluation result and the external evaluation result, automatically filling slots based on an evaluation report template formulated by the evaluation, and automatically generating a short evaluation report according to the evaluation result of each item of the capability to be evaluated and the evaluation report template.
According to the embodiment of the invention, the target intelligent assistant is evaluated according to the preset evaluation scheme, the evaluation result is obtained, and the evaluation report is generated according to the evaluation result. The evaluation method can promote the development of the intelligent assistant industry, promote the improvement of the service quality and further improve the experience degree of users.
Furthermore, so that the credibility and general applicability of the evaluation report meet people's needs, at least one of internal evaluation and external evaluation can be selected according to the requirements of users. Internal evaluation provides the evaluation result of internal professionals from a professional perspective, while external evaluation captures the actual experience of a much larger sample of external users, providing a reference angle for understanding the intelligent assistant more comprehensively and improving it in a targeted way.
Further, in the internal evaluation provided in the embodiment of the present invention, the intelligent capability of the target intelligent assistant is evaluated, the preset evaluation scheme includes an evaluation level division standard, and an internal professional performs rating on the target intelligent assistant according to the evaluation level division standard to obtain an internal actual rating of each item of the evaluation capability to be tested.
Further, in the external evaluation provided in the embodiment of the present invention, the evaluation subject is an external user, and the external rating and the comment are evaluations with their own color performed by the external user according to their own experience.
Further, in the embodiment of the invention, comment intention recognition is carried out on the comment and the external rating of the external user by obtaining the comment and the external rating of the external user, so that a comment abstract is formed. For external evaluation, since there is a huge external user group in some cases, and the currently acquired comment and external rating are already sufficient to complete external evaluation, the comment and external rating may be stopped from being acquired according to the test stop condition.
Further, evaluating the target intelligent assistant according to a preset evaluation scheme may further include: encapsulating the interface of at least one target intelligent assistant and accessing a unified management interface, so that the target intelligent assistant can be evaluated through the management interface. This greatly reduces the cost of evaluation, and both internal professionals and external users can use the multiple target intelligent assistants through the management interface, which is more convenient.
Embodiment Two
Referring to fig. 12, an embodiment of the present invention provides an intelligent assistant recommendation method, including:
s1201: packaging interfaces of at least two target intelligent assistants and accessing a unified management interface;
It should be noted that the above encapsulation of the target intelligent assistant interfaces and access to the unified management interface are implemented by common technical means in the field.
S1202: acquiring the current requirements of external users;
s1203: determining corresponding capability items and the priority of each capability item according to the current requirement;
s1204: acquiring an evaluation report of the target intelligent assistant;
It should be noted that the evaluation report can be obtained by the method of the above embodiment; when an evaluation report already exists for a target intelligent assistant, it can be used directly. Alternatively, a real-time evaluation may be performed on the capability items corresponding to the current requirement of the external user; this evaluation may be an internal evaluation,
and/or,
external evaluations from external users other than this external user are acquired.
S1205: determining the optimal intelligent assistant in each target intelligent assistant according to the ability item, the priority of the ability item and the evaluation report;
In some embodiments, the preferred intelligent assistant may be determined by ranking the ratings of the corresponding capability items of each target intelligent assistant in the evaluation report. For example, the capability items corresponding to the current requirement are sorted by priority into A, B and C; intelligent assistant A currently has a rating of 5 for capability item A, 4 for capability item B, and 6 for capability item C, while intelligent assistant B has a rating of 3 for capability item A, 7 for capability item B, and 10 for capability item C. In this case, the preferred one of assistants A and B is determined according to a preset rule: if the rule is to consider only the capability item with the highest priority, the assistant that scores highest on that item is preferred. If the preset rule is that the highest weighted average over the capability items wins, the capability item ratings of assistants A and B are each weighted and averaged, and the assistant with the higher weighted average rating is selected as the preferred intelligent assistant. It should be noted that the preset rule may also be another rule set by those skilled in the art as needed.
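The two preset rules in the example above can be sketched directly, reusing the ratings given for assistants A and B (capability items a, b, c in priority order); the priority weights are illustrative assumptions:

```python
# Ratings from the example above: capability items a, b, c in priority order.
assistants = {"A": {"a": 5, "b": 4, "c": 6}, "B": {"a": 3, "b": 7, "c": 10}}

def prefer_by_top_priority(candidates: dict, top_item: str) -> str:
    """Rule 1: only the highest-priority capability item is compared."""
    return max(candidates, key=lambda name: candidates[name][top_item])

def prefer_by_weighted_average(candidates: dict, weights: dict) -> str:
    """Rule 2: the highest weighted-average rating wins."""
    def avg(name):
        return sum(candidates[name][i] * w for i, w in weights.items()) / sum(weights.values())
    return max(candidates, key=avg)

print(prefer_by_top_priority(assistants, "a"))                          # A (5 > 3)
print(prefer_by_weighted_average(assistants, {"a": 0.5, "b": 0.3, "c": 0.2}))
# B: 0.5*3 + 0.3*7 + 0.2*10 = 5.6  vs  A: 0.5*5 + 0.3*4 + 0.2*6 = 4.9
```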
S1206: the management interface provides the interface of the preferred intelligent assistant for the external user to use.
It should be noted that the external user can currently choose among a plurality of target intelligent assistants in the management interface; with the method of this embodiment, a preferred intelligent assistant is provided to the external user based on the external user's requirements, which improves the usage experience and spares the user from trying the intelligent assistants one by one. The method automatically selects an intelligent assistant, based on the evaluation results, according to the functions the external user needs, thereby improving service quality.
In some embodiments, users can also rate and comment on the various capability items, and the system records these ratings and comments, so that the shortcomings of each function can be identified, facilitating subsequent improvement.
Embodiment Three
based on the intelligent assistant evaluation method provided by the first embodiment, the present embodiment further provides an intelligent assistant evaluation system 1300, as shown in fig. 13, which includes:
the evaluation module 1301 is used for evaluating the target intelligent assistant according to a preset evaluation scheme;
a first obtaining module 1302, configured to obtain an evaluation result;
and a first generating module 1303, configured to generate an evaluation report according to the evaluation result.
In some embodiments, the evaluation module 1301 includes at least one of:
an internal evaluation module 13011 and an external evaluation module 13012.
In some embodiments, the internal evaluation module 13011 comprises:
a second obtaining module 130111, configured to obtain a preset evaluation scheme, where the preset evaluation scheme includes an evaluation level division standard corresponding to each item of capability to be evaluated, an evaluation case of the item of capability to be evaluated, and a benchmark rating corresponding to each item of capability to be evaluated;
a third obtaining module 130112, configured to obtain an internal evaluation result, where the internal evaluation result includes an internal actual rating and an up-to-standard ratio;
the internal actual rating is the internal actual rating of each item of evaluation capability to be tested of the target intelligent assistant, which is obtained by evaluating the target intelligent assistant by an internal professional according to a preset evaluation scheme;
the standard ratio comprises the ratio of the number of items to be evaluated, the actual rating of which is greater than or equal to the standard rating, to the total number of the items to be evaluated.
In some embodiments, external evaluation module 13012 comprises:
the fourth obtaining module 130121 is configured to obtain an external evaluation result, where the external evaluation result includes a comment of an external user on a to-be-evaluated ability item and an external rating;
an identification module 130122 for identifying a comment intent of the comment;
and a second generating module 130123, configured to generate the review summary according to the external rating and the review intention.
Embodiment Four
Based on the method for recommending by intelligent assistant provided in the second embodiment, the present embodiment further provides an intelligent assistant recommending system 1400, as shown in fig. 14, which includes:
an encapsulation module 1401, configured to encapsulate interfaces of at least two target intelligent assistants, and access a unified management interface;
a fifth obtaining module 1402, configured to obtain a current requirement of an external user;
a capability item determining module 1403, configured to determine, according to the current requirement, corresponding capability items and priorities of the capability items;
a sixth obtaining module 1404, configured to obtain an evaluation report of the target intelligent assistant;
a preference determination module 1405, configured to determine, according to the capability item, the priority of the capability item, and the evaluation report, a preferred intelligent assistant among the target intelligent assistants;
a module 1406 is provided for the management interface to provide an interface for an external user to prefer the intelligent assistant for use by the external user.
Embodiment Five
the present embodiment further provides an intelligent assistant evaluation terminal, which is shown in fig. 15 and includes a first processor 1501, a first memory 1503, and a first communication bus 1502, where:
the first communication bus 1502 is used to realize connection communication between the first processor 1501 and the first memory 1503;
the first processor 1501 is configured to execute one or more computer programs stored in the first memory 1503 to implement at least one step of the above-described embodiment of the intelligent assistant evaluation.
Embodiment Six
the present embodiment further provides an intelligent assistant recommending terminal, as shown in fig. 16, which includes a second processor 1601, a second memory 1603, and a second communication bus 1602, wherein:
the second communication bus 1602 is used for realizing connection communication between the second processor 1601 and the second memory 1603;
the second processor 1601 is configured to execute one or more computer programs stored in the second memory 1603 to implement at least one step of the intelligent assistant recommendation method in the second embodiment.
The present embodiments also provide a computer-readable storage medium including volatile or non-volatile, removable or non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, computer program modules or other data. Computer-readable storage media include, but are not limited to, RAM (Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other memory technology, CD-ROM (Compact Disc Read-Only Memory), Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
The computer readable storage medium in this embodiment may be used to store one or more first computer programs, and the stored one or more first computer programs may be executed by a processor to implement at least one step of the intelligent assistant evaluation method in the first embodiment.
The computer readable storage medium in this embodiment may be used to store one or more second computer programs, and the stored one or more second computer programs may be executed by the processor to implement at least one step of the recommendation of the smart assistant in the second embodiment.
The present embodiment also provides a computer program (or computer software), which can be distributed on a computer-readable medium and executed by a computing device to implement at least one step of the intelligent assistant evaluation method in the first embodiment.
The present embodiment further provides a computer program (or computer software), which can be distributed on a computer-readable medium and executed by a computing apparatus to implement at least one step of the intelligent assistant recommendation method in the second embodiment.
It should be understood that in some cases, at least one of the steps shown or described may be performed in a different order than described in the embodiments above.
The present embodiments also provide a computer program product comprising a computer readable means on which a computer program as shown above is stored. The computer readable means in this embodiment may include a computer readable storage medium as shown above.
It will be apparent to those skilled in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software (which may be implemented in computer program code executable by a computing device), firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit.
In addition, communication media typically embodies computer readable instructions, data structures, computer program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to one of ordinary skill in the art. Thus, the present invention is not limited to any specific combination of hardware and software.
The foregoing is a more detailed description of embodiments of the present invention, and the present invention is not to be considered limited to such descriptions. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (21)

1. An intelligent assistant evaluation method, wherein the intelligent assistant evaluation method comprises:
evaluating the target intelligent assistant according to a preset evaluation scheme to obtain an evaluation result, wherein the evaluation comprises at least one of the following: internal evaluation and external evaluation;
and generating an evaluation report according to the evaluation result.
2. An intelligent assistant evaluation method as defined in claim 1,
if the evaluation comprises internal evaluation, the preset evaluation scheme comprises an evaluation capability item to be tested, which is used for evaluating the intelligent capability of the target intelligent assistant by an internal professional;
the evaluation results include internal evaluation results.
3. The intelligent assistant evaluation method according to claim 2, wherein the evaluating the target intelligent assistant according to a preset evaluation scheme, and the obtaining of the evaluation result comprises:
acquiring the preset evaluation scheme, wherein the preset evaluation scheme comprises evaluation grade division standards corresponding to the to-be-evaluated ability items, evaluation cases of the to-be-evaluated ability items and benchmark ratings corresponding to the to-be-evaluated ability items;
obtaining the internal evaluation result, wherein the internal evaluation result comprises an internal actual rating and an up-to-standard ratio;
the internal actual rating is the internal actual rating of each item of evaluation capability to be tested of the target intelligent assistant, which is obtained by evaluating the target intelligent assistant by the internal professional according to the preset evaluation scheme;
the standard occupation ratio comprises the ratio of the number of items to be evaluated, of which the internal actual rating is greater than or equal to the standard rating, to the total number of items to be evaluated.
4. An intelligent assistant evaluation method as defined in claim 3, wherein the internal evaluation results further comprise a composite rating;
the composite rating includes calculating the internal actual rating to obtain a composite rating for the target intelligent assistant.
5. The intelligent assistant evaluation method according to claim 4, wherein if the evaluation comprises an external evaluation, the preset evaluation scheme comprises an external user evaluating an evaluation capability item to be evaluated of the target intelligent assistant;
the evaluation result includes an external evaluation result.
6. The intelligent assistant evaluation method according to claim 5, wherein the evaluating the target intelligent assistant according to a preset evaluation scheme, and the obtaining of the evaluation result comprises:
obtaining comments of external users on the ability items to be evaluated and external ratings;
identifying a comment intent of the comment;
and generating a comment abstract according to the external rating and the comment intention.
7. An intelligent assistant evaluation method as defined in claim 6, further comprising:
stopping obtaining the comment and the external rating according to a test stop condition, the test stop condition including at least one of:
the service time of the target intelligent assistant is longer than the preset service time;
the number of the comments is larger than the number of preset comments;
an external stop command.
8. The intelligent assistant evaluation method of claim 6, wherein the identifying a comment intent of the comment comprises:
obtaining the comment category of the comment;
acquiring a format corresponding to the comment category;
rewriting the comment according to the format;
obtaining an optimal comment identification model corresponding to the comment category, wherein the identification model comprises the characteristics, the distance of the characteristics and an evaluation grade division rule;
and inputting the comment into the optimal comment identification model to obtain the comment intention of the comment.
9. The intelligent assistant evaluation method of claim 6, wherein the generating a review summary from the external rating and the review intent comprises:
performing first statement processing on the comment intention to obtain a first statement processing result;
acquiring the weight corresponding to the comment intention;
calculating the first statement processing result and the external rating according to the weight to obtain a calculation result;
performing second statement processing on the comment intention to obtain a second statement processing result;
normalizing the second statement processing result to obtain a normalized result;
and interacting the calculation result with the normalization result to generate a comment abstract.
10. An intelligent assistant evaluation method as defined in any of claims 1-9, wherein the generating an evaluation report based on the evaluation results comprises:
acquiring an evaluation report template, wherein the evaluation report template is obtained by acquiring and filling the content required to be filled in a preset evaluation report description template;
analyzing the evaluation result, and extracting target data and target character information;
filling the target data and the target character information into the evaluation report template;
and generating an evaluation report.
11. An intelligent assistant evaluation method as defined in claim 10, wherein said populating the evaluation report template with the target data and target textual information comprises:
acquiring a slot position of the evaluation report template;
and filling the target data and the target character information into the slot position of the evaluation report template through single sentence modeling and sequence editing modeling.
12. An intelligent assistant evaluation method according to any one of claims 1-9, wherein the evaluating the target intelligent assistant according to a preset evaluation scheme comprises:
packaging at least one interface of the target intelligent assistant and accessing a unified management interface;
and evaluating the target intelligent assistants respectively through the management interfaces.
13. An intelligent assistant recommendation method, comprising:
packaging interfaces of at least two target intelligent assistants and accessing a unified management interface;
acquiring the current requirements of external users;
determining corresponding capability items and the priority of each capability item according to the current requirement;
obtaining an evaluation report of the target intelligent assistant;
determining the preferred intelligent assistant in each target intelligent assistant according to the ability item, the priority of the ability item and the evaluation report;
the management interface provides an interface of the preferred intelligent assistant for the external user to use.
14. An intelligent assistant evaluation system, the intelligent assistant evaluation system comprising:
the evaluation module is used for evaluating the target intelligent assistant according to a preset evaluation scheme, and comprises at least one of the following modules: an external evaluation module and an internal evaluation module;
the first acquisition module is used for acquiring an evaluation result;
and the first generation module is used for generating an evaluation report according to the evaluation result.
15. An intelligent assistant evaluation system as defined in claim 14, wherein the internal evaluation module comprises:
a second obtaining module, configured to obtain the preset evaluation scheme, where the preset evaluation scheme includes an evaluation level division standard corresponding to each to-be-evaluated capability item, an evaluation case of the to-be-evaluated capability item, and a benchmark rating corresponding to each to-be-evaluated capability item;
the third acquisition module is used for acquiring the internal evaluation result, and the internal evaluation result comprises an internal actual rating and a standard ratio;
the internal actual rating is the internal actual rating of each item of evaluation capability to be tested of the target intelligent assistant, which is obtained by evaluating the target intelligent assistant by the internal professional according to the preset evaluation scheme;
the standard reaching proportion comprises the ratio of the number of items to be evaluated, of which the actual rating is greater than or equal to the standard rating, to the total number of items to be evaluated.
16. An intelligent assistant evaluation system as defined in claim 14, wherein the external evaluation module comprises:
the fourth obtaining module is used for obtaining the external evaluation result, and the external evaluation result comprises the comment of the external user on the capacity item to be evaluated and the external rating;
an identification module for identifying a comment intent of the comment;
and the second generation module is used for generating a comment abstract according to the external rating and the comment intention.
17. An intelligent assistant recommendation system, comprising:
the packaging module is used for packaging the interfaces of at least two target intelligent assistants and accessing a unified management interface;
the fifth acquisition module is used for acquiring the current requirements of the external users;
the capacity item determining module is used for determining corresponding capacity items and the priority of each capacity item according to the current requirement;
a sixth obtaining module, configured to obtain an evaluation report of the target intelligent assistant;
the optimal selection determining module is used for determining optimal selection intelligent assistants in the target intelligent assistants according to the ability items, the priorities of the ability items and the evaluation reports;
a providing module for the management interface to provide an interface of the preferred intelligent assistant for the external user to use.
18. An intelligent assistant evaluation terminal, comprising: the system comprises a first processor, a first memory and a first communication bus;
the first communication bus is used for realizing connection communication between the first processor and the first memory;
the first processor is configured to execute one or more first computer programs stored in the first memory to implement the steps of the intelligent assistant evaluation method of any of claims 1 to 12.
19. An intelligent assistant recommending terminal, comprising: the second processor, the second memory and the second communication bus;
the second communication bus is used for realizing connection communication between the second processor and the second memory;
the second processor is configured to execute one or more second computer programs stored in the second memory to implement the steps of the intelligent assistant recommendation method of claim 13.
20. A readable storage medium, characterized in that the computer readable storage medium stores one or more first computer programs executable by one or more first processors to implement the steps of the intelligent assistant evaluation method according to any one of claims 1 to 12.
21. A readable storage medium, characterized in that the computer readable storage medium stores one or more second computer programs, which are executable by one or more second processors to implement the steps of the intelligent assistant recommendation method as claimed in claim 13.
CN201911115568.7A 2019-11-14 2019-11-14 Intelligent assistant evaluation and recommendation method, system, terminal and readable storage medium Pending CN112799747A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911115568.7A CN112799747A (en) 2019-11-14 2019-11-14 Intelligent assistant evaluation and recommendation method, system, terminal and readable storage medium
PCT/CN2020/128455 WO2021093821A1 (en) 2019-11-14 2020-11-12 Intelligent assistant evaluation and recommendation methods, system, terminal, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911115568.7A CN112799747A (en) 2019-11-14 2019-11-14 Intelligent assistant evaluation and recommendation method, system, terminal and readable storage medium

Publications (1)

Publication Number Publication Date
CN112799747A true CN112799747A (en) 2021-05-14

Family

ID=75803918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911115568.7A Pending CN112799747A (en) 2019-11-14 2019-11-14 Intelligent assistant evaluation and recommendation method, system, terminal and readable storage medium

Country Status (2)

Country Link
CN (1) CN112799747A (en)
WO (1) WO2021093821A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115017884A (en) * 2022-01-20 2022-09-06 昆明理工大学 Text parallel sentence pair extraction method based on image-text multi-mode gating enhancement
CN116662503A (en) * 2023-05-22 2023-08-29 深圳市新美网络科技有限公司 Private user scene phone recommendation method and system thereof

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642834B (en) * 2021-06-29 2023-08-29 合肥工业大学 Task importance evaluation method and system based on task attribute priority mapping
CN114139883B (en) * 2021-11-10 2024-03-29 云南电网有限责任公司信息中心 Calculation method of material domain evaluation index of power enterprise
CN114386794A (en) * 2021-12-28 2022-04-22 中国电子技术标准化研究院华东分院 Evaluation method for classification and grading of resource pool of industrial internet service provider
WO2023236873A1 (en) * 2022-06-10 2023-12-14 进升教育有限公司 Sports activity assessment method, related device and computer readable storage medium
CN116562947A (en) * 2023-04-18 2023-08-08 成都银翼穿戴科技有限公司 Intelligent heating clothes user experience test method
CN116611896B (en) * 2023-07-19 2023-10-24 山东省人工智能研究院 Multi-modal recommendation method based on attribute-driven decoupling characterization learning
CN116738371B (en) * 2023-08-14 2023-10-24 广东信聚丰科技股份有限公司 User learning portrait construction method and system based on artificial intelligence
CN116993296B (en) * 2023-08-15 2024-04-16 深圳市中联信信息技术有限公司 Intelligent supervision management system and method applied to engineering design interaction platform
CN117113943B (en) * 2023-10-25 2024-02-02 湖北君邦环境技术有限责任公司 Auxiliary writing method and system for criticizing report
CN117493210A (en) * 2023-11-27 2024-02-02 中国传媒大学 Micro-service tool evaluation method and system
CN117974722B (en) * 2024-04-02 2024-06-11 江西师范大学 Single-target tracking system and method based on attention mechanism and improved transducer

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140101611A1 (en) * 2012-10-08 2014-04-10 Vringo Lab, Inc. Mobile Device And Method For Using The Mobile Device
CN106933864A (en) * 2015-12-30 2017-07-07 中国科学院深圳先进技术研究院 A kind of search engine system and its searching method
CN109712624A (en) * 2019-01-12 2019-05-03 北京设集约科技有限公司 A kind of more voice assistant coordination approach, device and system
CN110210743A (en) * 2019-05-23 2019-09-06 华侨大学 A kind of AI service IQ test method
CN110807566A (en) * 2019-09-09 2020-02-18 腾讯科技(深圳)有限公司 Artificial intelligence model evaluation method, device, equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115017884A (en) * 2022-01-20 2022-09-06 昆明理工大学 Text parallel sentence pair extraction method based on image-text multi-mode gating enhancement
CN115017884B (en) * 2022-01-20 2024-04-26 昆明理工大学 Text parallel sentence pair extraction method based on graphic multi-mode gating enhancement
CN116662503A (en) * 2023-05-22 2023-08-29 深圳市新美网络科技有限公司 Private user scene phone recommendation method and system thereof
CN116662503B (en) * 2023-05-22 2023-12-29 深圳市新美网络科技有限公司 Private user scene phone recommendation method and system thereof

Also Published As

Publication number Publication date
WO2021093821A1 (en) 2021-05-20

Similar Documents

Publication Publication Date Title
WO2021093821A1 (en) Intelligent assistant evaluation and recommendation methods, system, terminal, and readable storage medium
US11341335B1 (en) Dialog session override policies for assistant systems
CN110175227B (en) Dialogue auxiliary system based on team learning and hierarchical reasoning
CN112189229B (en) Skill discovery for computerized personal assistants
KR20190125153A (en) An apparatus for predicting the status of user's psychology and a method thereof
CN114556354A (en) Automatically determining and presenting personalized action items from an event
US10521723B2 (en) Electronic apparatus, method of providing guide and non-transitory computer readable recording medium
WO2023065211A1 (en) Information acquisition method and apparatus
JP2020521210A (en) Information processing method and terminal, computer storage medium
CN109145168A (en) A kind of expert service robot cloud platform
JP7488871B2 (en) Dialogue recommendation method, device, electronic device, storage medium, and computer program
Chao et al. Emerging Technologies of Natural Language‐Enabled Chatbots: A Review and Trend Forecast Using Intelligent Ontology Extraction and Patent Analytics
CN110580516B (en) Interaction method and device based on intelligent robot
CN116070169A (en) Model training method and device, electronic equipment and storage medium
CN110019777A (en) A kind of method and apparatus of information classification
US11257482B2 (en) Electronic device and control method
CN113420136A (en) Dialogue method, system, electronic equipment, storage medium and program product
CN116882450A (en) Question-answering model editing method and device, electronic equipment and storage medium
CN117216185A (en) Comment generation method, device, equipment and storage medium for distributed content
Fellows et al. Task-oriented Dialogue Systems: performance vs. quality-optima, a review
CN114330285B (en) Corpus processing method and device, electronic equipment and computer readable storage medium
CN115688758A (en) Statement intention identification method and device and storage medium
KR20220040997A (en) Electronic apparatus and control method thereof
CN113836932A (en) Interaction method, device and system, and intelligent device
Hernández et al. User-centric Recommendation Model for AAC based on Multi-criteria Planning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination