WO2022253251A1 - Method and apparatus for evaluating interaction performance of interaction system - Google Patents

Method and apparatus for evaluating interaction performance of interaction system

Info

Publication number
WO2022253251A1
Authority
WO
WIPO (PCT)
Prior art keywords
interaction
log
success rate
sub
data collection
Prior art date
Application number
PCT/CN2022/096513
Other languages
French (fr)
Chinese (zh)
Inventor
刘建国
赵培
Original Assignee
青岛海尔科技有限公司
海尔智家股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 青岛海尔科技有限公司, 海尔智家股份有限公司
Publication of WO2022253251A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G06F11/3452 Performance evaluation by statistical analysis

Definitions

  • the present disclosure relates to the technical field of human-computer interaction, for example, to a method and device for evaluating the interaction performance of an interaction system.
  • the interaction success rate is an important index to measure the interactive system, which is defined as the number of successful interactive samples divided by the total number of samples. The higher the interactive success rate, the better the performance of the interactive system. The performance improvement of the interactive system is usually expressed by the interactive success rate.
  • the evaluation index used to evaluate the performance of the interactive system is single, which is not conducive to the comprehensive evaluation of the performance of the interactive system, and is not conducive to the further improvement of the performance of the interactive system.
  • the embodiments of the present disclosure provide a method and device for evaluating the interactive performance of an interactive system, so as to solve the technical problem that the existing interaction success rate index makes it difficult to evaluate the process by which the interactive system improves its own performance through self-learning during interaction.
  • a method for evaluating interactive performance of an interactive system includes:
  • determining the self-learning index of the interactive system includes:
  • the self-learning index is determined according to the interaction success rate of the current evaluation period.
  • the current evaluation period includes multiple sub-data collection periods; determining the interaction success rate of the current evaluation period according to the interaction success rate of the interaction content corresponding to the first failure log in the current evaluation period includes: determining the interaction success rate of each sub-data collection period according to the first failure log; obtaining the average value of the interaction success rates of the multiple sub-data collection periods; and determining the average value as the interaction success rate of the current evaluation period.
  • the set duration is one sub-data collection period; determining the interaction success rate of each sub-data collection period according to the first failure log includes: when the current sub-data collection period is the first sub-data collection period of the current evaluation period, determining the interaction success rate of the interaction content corresponding to the first failure log in the current sub-data collection period as the interaction success rate of the current sub-data collection period.
  • determining the interaction success rate of each sub-data collection period according to the first failure log further includes: when the current sub-data collection period is not the first sub-data collection period of the current evaluation period, obtaining, in the interaction log, the second failure log of interaction failures in the sub-data collection period preceding the current sub-data collection period; and determining the interaction success rate of the interaction content corresponding to the second failure log in the current sub-data collection period as the interaction success rate of the current sub-data collection period.
  • determining the interaction success rate of each sub-data collection period according to the first failure log includes: determining the interaction success rate of the interaction content corresponding to the first failure log in each sub-data collection period as the interaction success rate of the corresponding sub-data collection period.
  • determining the self-learning index according to the interaction success rate of the current evaluation period includes: obtaining the interaction failure rate of the current evaluation period; and determining the self-learning index according to the interaction success rate and the interaction failure rate of the current evaluation period, so that the self-learning index is a monotonically increasing function of the interaction success rate.
  • determining the self-learning index according to the interaction success rate and interaction failure rate of the current evaluation cycle includes:
  • S is the self-learning index
  • k' is the interaction success rate
  • r is the interaction failure rate
  • obtaining the interaction failure rate of the current evaluation period includes: obtaining, in the interaction log, a first quantity of logs in the current evaluation period and a second quantity of logs with interaction failure in the current evaluation period; and determining the interaction failure rate of the current evaluation period according to the first quantity and the second quantity.
  • obtaining the interaction status of each log in the interaction log includes: if a log in the interaction log is determined, through a preset interaction strategy, to be a reply that satisfies the user, determining the interaction status of the log as interaction success; if a log in the interaction log is determined, through the preset interaction strategy, to be a reply that does not satisfy the user, determining the interaction status of the log as interaction failure.
  • the device for evaluating the interaction performance of an interactive system includes a processor and a memory storing program instructions, and the processor is configured to execute the evaluation method provided by the foregoing embodiments when executing the program instructions.
  • the interactive system includes the apparatus for evaluating the interactive performance of the interactive system provided in the foregoing embodiments.
  • the self-learning index of the interactive system can evaluate the dynamic process of the interactive system to improve its performance through self-learning, which is beneficial to the self-learning process of the interactive system.
  • FIG. 1 is a schematic diagram of an implementation scenario of an interactive system provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of a method for evaluating the interactive performance of an interactive system provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a process of determining a self-learning index provided by an embodiment of the present disclosure
  • Fig. 4 is a schematic diagram of an apparatus for evaluating the interaction performance of an interaction system provided by an embodiment of the present disclosure.
  • A/B means: A or B.
  • A and/or B covers three relationships: A alone, B alone, or both A and B.
  • Fig. 1 is a schematic diagram of an implementation scenario of an interactive system provided by an embodiment of the present disclosure.
  • the interaction system can include an interaction model and an interaction log.
  • the interaction model can receive request information, and after receiving the request information, the interaction model can reply to the request information; the interaction log can record the request information and its corresponding reply.
  • the request information here may be user request information, such as user voice instruction information, user gesture instruction information, and the like.
  • the response of the interaction model to the request information can be to issue control commands to control other devices, such as control commands to control smart air conditioners, control commands to control smart refrigerators, or control commands to control smart TVs.
  • the interactive system in the embodiments of the present disclosure needs to be implemented by relying on specific devices, and the interactive system can be implemented by terminal devices, household electrical appliances, and the like.
  • the terminal devices here may be smart phones, tablet computers, ultra-mobile personal computers (Ultra-mobile Personal Computer, UMPC), netbooks, personal digital assistants (Personal Digital Assistant, PDA) and other terminal devices, but are not limited thereto.
  • the home appliances here may be smart TVs, smart refrigerators, smart air conditioners, and the like, but are not limited thereto.
  • Fig. 2 is a schematic diagram of a method for evaluating the interaction performance of an interaction system provided by an embodiment of the present disclosure, and the method can be executed by a terminal device or a home appliance with an interaction function.
  • the methods used to evaluate the interactive performance of the interactive system include:
  • the interaction status includes interaction success and interaction failure.
  • the interaction log records the request information (for example, the request information sent by the user in the form of voice or gesture) and the response of the interaction system to the request information.
  • obtaining the interaction status of each log in the interaction log includes: if a log in the interaction log is determined, through the preset interaction strategy, to be a reply that satisfies the user, determining the interaction status of the log as interaction success; if a log in the interaction log is determined, through the preset interaction strategy, to be a reply that does not satisfy the user, determining the interaction status of the log as interaction failure.
  • the preset interaction strategy here is the strategy used to determine whether an interaction is successful. For example, if two or more logs in the interaction log have the same request information but different replies, the interaction status of these logs is determined as interaction failure; or, after the interactive system sends a reply corresponding to the request information, it receives "correct" or "error" feedback triggered by the user. If an interaction content receives "correct" feedback, the interaction status of the log recording that interaction content is determined as interaction success; if an interaction content receives "error" feedback, the interaction status of the log recording that interaction content is determined as interaction failure.
  • marking each log in the interaction log according to the interaction status includes: when the interaction status of a log is interaction success, marking the log as interaction success; when the interaction status of a log is interaction failure, marking the log as interaction failure. After marking in this way, it is convenient to determine the failure logs and the interaction success rate of the current evaluation period in the subsequent steps.
  • the interaction status of each log can be recorded in the log.
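The labeling step above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `InteractionLog` fields, the `label_logs` function, and the status strings are hypothetical names. The source presents the two strategies as alternatives; giving explicit user feedback priority over the "same request, different replies" rule, and defaulting to success when neither rule applies, are assumptions made only for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

SUCCESS = "success"
FAILURE = "failure"


@dataclass
class InteractionLog:
    request: str                     # request information, e.g. a recognized voice command
    reply: str                       # reply issued by the interaction model
    feedback: Optional[str] = None   # "correct" or "error" if the user gave feedback
    status: Optional[str] = None     # filled in by label_logs


def label_logs(logs):
    """Mark every log as SUCCESS or FAILURE in place and return the list."""
    # Collect the distinct replies seen for each request.
    replies_by_request = defaultdict(set)
    for log in logs:
        replies_by_request[log.request].add(log.reply)

    for log in logs:
        if log.feedback == "error":
            log.status = FAILURE
        elif log.feedback == "correct":
            log.status = SUCCESS
        elif len(replies_by_request[log.request]) > 1:
            # Same request information but different replies.
            log.status = FAILURE
        else:
            # Assumption: without contrary evidence the reply is treated as satisfactory.
            log.status = SUCCESS
    return logs
```

Recording the status on the log object itself mirrors the statement that the interaction status of each log can be recorded in the log.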
  • S202 Determine the self-learning index of the interactive system according to the labeling result of each log, so as to evaluate the interactive performance of the interactive system.
  • the self-learning index of the interactive system can evaluate the dynamic process of the interactive system to improve its performance through self-learning, which is beneficial to the self-learning process of the interactive system.
  • Fig. 3 is a schematic diagram of a process of determining a self-learning index provided by an embodiment of the present disclosure. As shown in Fig. 3, determining the self-learning index of the interactive system according to the labeling result of each log includes:
  • Each log in the interaction log also records the interaction time, and the current evaluation period has a start time and an end time.
  • in the interaction log, first select the logs whose interaction time falls within the set duration before the start time of the current evaluation period, and then read the interaction status of these logs in sequence; if the interaction status of a log is interaction failure, the log is determined as a first failure log.
  • the number of logs in the first failure log may be one or more.
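A minimal sketch of this selection step, assuming each log is a dictionary with hypothetical keys `"time"` (a `datetime`) and `"status"` (`"success"` or `"failure"`); the function name and the half-open window boundaries are illustrative choices, not taken from the source.

```python
from datetime import datetime, timedelta


def select_first_failure_logs(logs, period_start, set_duration):
    """Return the failed logs whose interaction time lies in the
    window [period_start - set_duration, period_start)."""
    window_start = period_start - set_duration
    return [
        log for log in logs
        if window_start <= log["time"] < period_start
        and log["status"] == "failure"
    ]
```

The result may contain zero, one, or several logs, matching the statement that the first failure log may consist of one or more logs.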
  • the interaction success rate of the first failure log in the current evaluation period is obtained.
  • the interaction success rate of the interaction content corresponding to the first failure log in a sub-data collection period in the current evaluation period can be obtained, and the first failure log can also be obtained.
  • the interactive content here may be a control command for controlling other devices, for example, it may be a control command for controlling a smart air conditioner, a control command for controlling a smart refrigerator, or a control command for controlling a smart TV, or it may be other commands such as querying the weather.
  • the interaction content corresponding to each log in the first failure log can be performed one or more times in the current evaluation period, or not performed once.
  • the set duration can be longer than, equal to, or shorter than the current evaluation period; in each case, the interaction success rate of the first failure log in the current evaluation period is used as the interaction success rate of the current evaluation period.
  • alternatively, the interaction success rate, within a sub-data collection period, of the interaction content corresponding to the first failure log can be obtained and used as the interaction success rate of the current evaluation period.
  • the current evaluation period includes multiple sub-data collection periods, and correspondingly, the set duration may be one or more sub-data collection periods.
  • the current evaluation cycle may include 2, 3, 4 or more sub-data collection cycles.
  • a sub-data collection period can be one week: if the current evaluation period is 14 days, it includes 2 sub-data collection periods; if the current evaluation period is 21 days, it includes 3 sub-data collection periods; if the current evaluation period is 28 days, it includes 4 sub-data collection periods.
  • determining the interaction success rate of the current evaluation period according to the interaction success rate of the interaction content corresponding to the first failure log in the current evaluation period includes: determining the interaction success rate of each sub-data collection period according to the first failure log; obtaining the average value of the interaction success rates of the multiple sub-data collection periods; and determining the average value as the interaction success rate of the current evaluation period.
  • the interaction success rate of the current evaluation cycle can be determined.
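The averaging step can be written compactly. In the sketch below, $k_i$ is the interaction success rate of the $i$-th sub-data collection period and $k'$ is the interaction success rate of the current evaluation period, following the symbols used elsewhere in this description. The symbols $n$ (number of sub-data collection periods), $c_i$ (number of successful samples in sub-period $i$), and $m_i$ (total number of samples in sub-period $i$) are notation introduced here for illustration only.

```latex
k_i = \frac{c_i}{m_i}, \qquad
k' = \frac{1}{n} \sum_{i=1}^{n} k_i
```

As a purely hypothetical example, four sub-data collection periods with success rates 0.5, 0.6, 0.7, and 0.8 would give $k' = (0.5 + 0.6 + 0.7 + 0.8)/4 = 0.65$.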
  • the set duration is one sub-data collection period, and the current evaluation period includes multiple sub-data collection periods; on this basis, determining the interaction success rate of each sub-data collection period according to the first failure log includes: when the current sub-data collection period is the first sub-data collection period of the current evaluation period, determining the interaction success rate of the interaction content corresponding to the first failure log in the current sub-data collection period as the interaction success rate of the current sub-data collection period.
  • the interaction content corresponding to each log in the first failure log can be performed once or multiple times in the current sub-data collection period. The total number of times the interaction content corresponding to all logs in the first failure log is performed in the current sub-data collection period is used as the total number of samples, and the number of times that interaction content interacts successfully in the current sub-data collection period is used as the number of successful samples. Dividing the number of successful samples by the total number of samples gives the interaction success rate of the interaction content corresponding to the first failure log in the current sub-data collection period.
  • determining the interaction success rate of each sub-data collection period further includes: when the current sub-data collection period is not the first sub-data collection period of the current evaluation period, obtaining, in the interaction log, the second failure log of interaction failures in the sub-data collection period preceding the current sub-data collection period; and determining the interaction success rate of the interaction content corresponding to the second failure log in the current sub-data collection period as the interaction success rate of the current sub-data collection period.
  • the interaction content corresponding to each log in the second failure log can be performed once or multiple times in the current sub-data collection period. The total number of times the interaction content corresponding to all logs in the second failure log is performed in the current sub-data collection period is used as the total number of samples, and the number of times that interaction content interacts successfully in the current sub-data collection period is used as the number of successful samples. Dividing the number of successful samples by the total number of samples gives the interaction success rate of the interaction content corresponding to the second failure log in the current sub-data collection period.
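The per-sub-period computation and the final averaging can be sketched together. This is an illustrative sketch under stated assumptions: logs are dictionaries with hypothetical keys `"time"`, `"content"` (an identifier of the interaction content), and `"status"`; the set duration equals one sub-data collection period, so the failure logs for every sub-period come from the window immediately before it. The source does not say what to do when none of the failed content recurs in a sub-period; returning `None` and leaving that sub-period out of the average is an assumption.

```python
from datetime import datetime, timedelta


def failed_contents(logs, start, end):
    """Interaction contents that failed at least once in [start, end)."""
    return {
        log["content"] for log in logs
        if start <= log["time"] < end and log["status"] == "failure"
    }


def success_rate_for_contents(logs, contents, start, end):
    """Successful samples divided by total samples for the given contents in [start, end)."""
    samples = [
        log for log in logs
        if start <= log["time"] < end and log["content"] in contents
    ]
    if not samples:
        return None  # assumption: no sample means the rate is undefined
    successes = sum(1 for log in samples if log["status"] == "success")
    return successes / len(samples)


def evaluation_period_success_rate(logs, period_start, sub_period, n_sub_periods):
    """Average the sub-period success rates over the current evaluation period."""
    rates = []
    for i in range(n_sub_periods):
        start = period_start + i * sub_period
        end = start + sub_period
        # First or second failure log: failures in the preceding window.
        contents = failed_contents(logs, start - sub_period, start)
        rate = success_rate_for_contents(logs, contents, start, end)
        if rate is not None:
            rates.append(rate)
    return sum(rates) / len(rates) if rates else None
```

For a 28-day evaluation period with one-week sub-periods, this would be called with `sub_period=timedelta(days=7)` and `n_sub_periods=4`.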
  • the four sub-data collection cycles are as follows in order of time: the first sub-data collection cycle, the second sub-data collection cycle, the third sub-data collection cycle and the fourth sub-data collection cycle.
  • the interaction success rate of the current evaluation period first obtain the first failure log of the interaction failure in the 7 days before the current evaluation period (the last sub-data collection period in the previous evaluation period) in the interaction log, these first failures
  • the interactive content corresponding to the log can be repeated or non-repetitive; in the first sub-data collection cycle of the current evaluation cycle, the interaction success rate of the interactive content corresponding to the first failure log is obtained, and in the first sub-data collection period,
  • the interactive content corresponding to each log in the first failure log may be performed one or more times, or the interactive content corresponding to one or more logs in the first failure log may not be performed; the first failure log corresponding to The interaction success rate of the interactive content is taken as the interaction success rate k 1 of the first sub-data collection period.
  • determining the interaction success rate of each sub-data collection period according to the first failure log includes: determining the interaction success rate of the interaction content corresponding to the first failure log in each sub-data collection period as the corresponding The interaction success rate of the sub data collection cycle.
  • the interaction content corresponding to each log in the first failure log can be performed once or multiple times in a sub-data collection period. The total number of times the interaction content corresponding to all logs in the first failure log is performed in the sub-data collection period is taken as the total number of samples, and the number of times that interaction content interacts successfully in the sub-data collection period is taken as the number of successful samples. Dividing the number of successful samples by the total number of samples gives the interaction success rate of the interaction content corresponding to the first failure log within the sub-data collection period; this process is repeated to determine the interaction success rate of each sub-data collection period in turn.
  • the interaction success rate of the current evaluation period is obtained as an example.
  • the four sub-data collection cycles are as follows in order of time: the first sub-data collection cycle, the second sub-data collection cycle, the third sub-data collection cycle and the fourth sub-data collection cycle.
  • the self-learning index is determined based on the aforementioned interaction success rate and interaction failure rate, and the self-learning index is a monotonically increasing function of the interaction success rate.
  • determining the self-learning index according to the interaction success rate of the current evaluation period may include: obtaining the interaction failure rate of the current evaluation period; and determining the self-learning index according to the interaction success rate and the interaction failure rate of the current evaluation period, so that the self-learning index is a monotonically increasing function of the interaction success rate. Afterwards, the self-learning index is used as the optimization criterion for optimization.
  • the interaction success rate of the current evaluation period may be obtained by using the method provided in the foregoing embodiments.
  • the optimization method here varies with different interaction models, and those skilled in the art can adopt an appropriate optimization method according to the self-learning model that is the essence of the interaction model, and details will not be repeated here.
  • the self-learning index can be determined by the following formula:
  • S is the self-learning index
  • k' is the interaction success rate
  • r is the interaction failure rate.
  • the self-learning index S reaches the upper limit.
  • the interaction failure rate of the current evaluation period can be obtained in the following manner: obtain, in the interaction log, the first quantity of logs in the current evaluation period and the second quantity of logs with interaction failure in the current evaluation period; then determine the interaction failure rate of the current evaluation period according to the first quantity and the second quantity.
  • the interaction failure rate of the current evaluation period can be obtained by dividing the second quantity by the first quantity.
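The division above can be sketched as follows, assuming logs are dictionaries with hypothetical keys `"time"` and `"status"`. Returning `None` for an evaluation period with no logs is an assumption; the source does not cover that case.

```python
from datetime import datetime


def interaction_failure_rate(logs, period_start, period_end):
    """Second quantity (failed logs) divided by first quantity (all logs) in the period."""
    in_period = [log for log in logs if period_start <= log["time"] < period_end]
    first_quantity = len(in_period)
    second_quantity = sum(1 for log in in_period if log["status"] == "failure")
    if first_quantity == 0:
        return None  # assumption: the rate is undefined for an empty period
    return second_quantity / first_quantity
```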
  • Each log in the interaction log is marked as either interaction success or interaction failure.
  • in the interaction log, the failure logs of interactions that failed within the set duration before the current evaluation period are taken, and the interaction content corresponding to these failure logs is used as the sample for evaluating the interaction success rate in the current evaluation period. Because this sample is selected during the interaction process of the interactive system, it can be used to evaluate the dynamic performance of the interactive system during interaction, and thus to evaluate the dynamic process by which the interactive system improves its own performance through self-learning, which is beneficial to the self-learning process of the interactive system.
  • the device for evaluating the interactive performance of an interactive system includes a processor and a memory storing program instructions, and the processor is configured to execute the method for evaluating the interactive performance of an interactive system provided by the foregoing embodiments when executing the program instructions.
  • Fig. 4 is a schematic diagram of an apparatus for evaluating the interaction performance of an interaction system provided by an embodiment of the present disclosure. As shown in Fig. 4, the apparatus for evaluating the interactive performance of the interactive system includes:
  • a processor 41 and a memory 42, and may also include a communication interface 43 and a bus 44. The processor 41, the communication interface 43, and the memory 42 can communicate with each other through the bus 44.
  • the communication interface 43 can be used for information transmission.
  • the processor 41 may invoke logic instructions in the memory 42 to execute the method for evaluating the interaction performance of the interaction system provided in the foregoing embodiments.
  • the logic instructions in the memory 42 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium.
  • the memory 42 can be used to store software programs and computer-executable programs, such as program instructions/modules corresponding to the methods in the embodiments of the present disclosure.
  • the processor 41 executes functional applications and data processing by running software programs, instructions and modules stored in the memory 42, that is, implements the methods in the foregoing method embodiments.
  • the memory 42 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application required by a function; the data storage area may store data created according to the use of the terminal device, and the like.
  • the memory 42 may include a high-speed random access memory, and may also include a non-volatile memory.
  • An embodiment of the present disclosure provides an interactive system, including the device for evaluating the interactive performance of the interactive system provided in the foregoing embodiments.
  • An embodiment of the present disclosure provides a computer-readable storage medium, which stores computer-executable instructions, and the computer-executable instructions are configured to execute the method for evaluating the interaction performance of an interaction system provided in the foregoing embodiments.
  • An embodiment of the present disclosure provides a computer program product.
  • the computer program product includes a computer program stored on a computer-readable storage medium.
  • the computer program includes program instructions. When the program instructions are executed by a computer, the computer is made to execute the method for evaluating the interaction performance of an interaction system provided in the foregoing embodiments.
  • the above-mentioned computer-readable storage medium may be a transitory computer-readable storage medium, or a non-transitory computer-readable storage medium.
  • the technical solutions of the embodiments of the present disclosure can be embodied in the form of software products, which are stored in a storage medium and include one or more instructions to enable a computer device (which may be a personal computer, a server, or a network equipment, etc.) to execute all or part of the steps of the methods in the embodiments of the present disclosure.
  • the aforementioned storage medium can be a non-transitory storage medium, including: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, etc.
  • the term "comprise" and its variants "comprises" and/or "comprising" refer to the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groupings of these.
  • an element qualified by the statement "comprising a ..." does not preclude the presence of additional identical elements in the process, method, or apparatus comprising the element.
  • what each embodiment focuses on may be the difference from other embodiments, and the same and similar parts of the various embodiments may refer to each other.
  • the relevant part can refer to the description of the method part.
  • the disclosed methods and products can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of units may only be a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to implement this embodiment.
  • each functional unit in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • each block in a flowchart or block diagram may represent a module, program segment, or part of code that includes one or more executable instructions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or action, or by a combination of dedicated hardware and computer instructions.

Abstract

The present application relates to the technical field of human-computer interaction, and discloses a method for evaluating interaction performance of an interaction system. The method comprises: obtaining the interaction state of each log in interaction logs of an interaction system, and labeling each log according to the interaction state; and determining a self-learning index of the interaction system according to the labeling result of each log to evaluate the interaction performance of the interaction system. The method can be used to evaluate the dynamic process by which the interaction system improves its performance by means of self-learning. Also disclosed in the present application is an apparatus for evaluating interaction performance of an interaction system.

Description

Method and apparatus for evaluating interaction performance of interaction system
This disclosure is based on Chinese patent application No. 202110616138.4, filed on June 2, 2021, and claims priority to that application, the entire content of which is incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of human-computer interaction, for example, to a method and apparatus for evaluating the interaction performance of an interaction system.
Background
At present, users can interact with smart devices that carry an interaction system through voice, gestures and the like, in order to control those devices. The interaction success rate is an important indicator of an interaction system. It is defined as the number of successful interaction samples divided by the total number of samples; the higher the interaction success rate, the better the performance of the interaction system. Improvements in the performance of an interaction system are usually expressed in terms of the interaction success rate.
During the application of an interaction system, that is, during its interaction process, only a single indicator is used to evaluate its performance. This hinders a comprehensive evaluation of the system's performance and, in turn, its further improvement.
Summary
In order to provide a basic understanding of some aspects of the disclosed embodiments, a brief summary is given below. The summary is not an extensive overview, nor is it intended to identify key or critical elements or to delineate the scope of protection of these embodiments; rather, it serves as a prelude to the detailed description that follows.
Embodiments of the present disclosure provide a method and apparatus for evaluating the interaction performance of an interaction system, so as to solve the technical problem that, during the interaction process of an interaction system, the existing interaction success rate indicator can hardly evaluate the dynamic process by which the interaction system improves its own performance through self-learning.
In some embodiments, the method for evaluating the interaction performance of an interaction system includes:
obtaining the interaction state of each log in the interaction logs of the interaction system, and labeling each log according to the interaction state; and
determining a self-learning index of the interaction system according to the labeling result of each log, so as to evaluate the interaction performance of the interaction system.
Optionally, determining the self-learning index of the interaction system according to the labeling result of each log includes:
obtaining, from the interaction logs, first failure logs, i.e. logs of interactions that failed within a set duration before the current evaluation period;
determining the interaction success rate of the current evaluation period according to the interaction success rate, within the current evaluation period, of the interaction content corresponding to the first failure logs; and
determining the self-learning index according to the interaction success rate of the current evaluation period.
Optionally, the current evaluation period includes multiple sub-data collection periods; and determining the interaction success rate of the current evaluation period according to the interaction success rate, within the current evaluation period, of the interaction content corresponding to the first failure logs includes: determining the interaction success rate of each sub-data collection period according to the first failure logs; obtaining the average of the interaction success rates of the multiple sub-data collection periods; and determining the average as the interaction success rate of the current evaluation period.
Optionally, the set duration is one sub-data collection period; and determining the interaction success rate of each sub-data collection period according to the first failure logs includes: when the current sub-data collection period is the first sub-data collection period of the current evaluation period, determining the interaction success rate, within the current sub-data collection period, of the interaction content corresponding to the first failure logs as the interaction success rate of the current sub-data collection period.
Optionally, determining the interaction success rate of each sub-data collection period according to the first failure logs further includes: when the current sub-data collection period is not the first sub-data collection period of the current evaluation period, obtaining, from the interaction logs, second failure logs, i.e. logs of interactions that failed within the sub-data collection period preceding the current one; and determining the interaction success rate, within the current sub-data collection period, of the interaction content corresponding to the second failure logs as the interaction success rate of the current sub-data collection period.
Optionally, determining the interaction success rate of each sub-data collection period according to the first failure logs includes: determining the interaction success rate, within each sub-data collection period, of the interaction content corresponding to the first failure logs as the interaction success rate of that sub-data collection period.
Optionally, determining the self-learning index according to the interaction success rate of the current evaluation period includes: obtaining the interaction failure rate of the current evaluation period; and determining the self-learning index according to the interaction success rate and the interaction failure rate of the current evaluation period, such that the self-learning index is a monotonically increasing function of the interaction success rate.
Optionally, determining the self-learning index according to the interaction success rate and the interaction failure rate of the current evaluation period includes:
S = k′/(k′ + r)
where S is the self-learning index, k′ is the interaction success rate, and r is the interaction failure rate.
Optionally, obtaining the interaction failure rate of the current evaluation period includes: obtaining a first number, being the number of logs in the interaction logs that fall within the current evaluation period, and a second number, being the number of logs of failed interactions within the current evaluation period; and determining the interaction failure rate of the current evaluation period according to the first number and the second number.
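The formula and the failure-rate step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the failure rate is assumed to be the second number divided by the first number (the text only says it is determined "according to" them), and returning 0.0 when both rates are zero is a choice made here, not something the disclosure specifies.

```python
def interaction_failure_rate(total_logs: int, failed_logs: int) -> float:
    """Failure rate r of the evaluation period.

    Assumption: r = failed logs / all logs in the period.
    """
    if total_logs <= 0:
        raise ValueError("the evaluation period contains no logs")
    return failed_logs / total_logs


def self_learning_index(k_prime: float, r: float) -> float:
    """Self-learning index S = k' / (k' + r)."""
    denominator = k_prime + r
    if denominator == 0:
        # Assumption: the source does not define S when k' = r = 0.
        return 0.0
    return k_prime / denominator


# Example: 200 logs in the period, 20 of them failed, and previously
# failed content now succeeds 40% of the time.
r = interaction_failure_rate(total_logs=200, failed_logs=20)
s = self_learning_index(k_prime=0.4, r=r)
print(f"r = {r:.2f}, S = {s:.2f}")
```

For a fixed r > 0, S grows as k′ grows, which matches the requirement that the index be a monotonically increasing function of the interaction success rate.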
Optionally, obtaining the interaction state of each log in the interaction logs includes: when a preset interaction strategy determines that a log in the interaction logs records a reply the user is satisfied with, determining the interaction state of that log as interaction success; and when the preset interaction strategy determines that a log in the interaction logs records a reply the user is not satisfied with, determining the interaction state of that log as interaction failure.
In some embodiments, the apparatus for evaluating the interaction performance of an interaction system includes a processor and a memory storing program instructions, the processor being configured to, when executing the program instructions, perform the method for evaluating the interaction performance of an interaction system provided by the foregoing embodiments.
In some embodiments, an interaction system includes the apparatus for evaluating the interaction performance of an interaction system provided by the foregoing embodiments.
The method and apparatus for evaluating the interaction performance of an interaction system provided by the embodiments of the present disclosure can achieve the following technical effect:
The self-learning index of the interaction system can be used to evaluate the dynamic process by which the interaction system improves its own performance through self-learning, which benefits the self-learning process of the interaction system.
The foregoing general description and the following description are exemplary and explanatory only and are not intended to limit the present disclosure.
Brief Description of the Drawings
One or more embodiments are illustrated by way of example in the corresponding drawings. These illustrations and drawings do not limit the embodiments. Elements with the same reference numerals in the drawings are regarded as similar elements, and in the drawings:
Fig. 1 is a schematic diagram of an implementation scenario of an interaction system provided by an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of a method for evaluating the interaction performance of an interaction system provided by an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of a process of determining a self-learning index provided by an embodiment of the present disclosure;
Fig. 4 is a schematic diagram of an apparatus for evaluating the interaction performance of an interaction system provided by an embodiment of the present disclosure.
Detailed Description
In order to provide a more detailed understanding of the features and technical content of the embodiments of the present disclosure, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present disclosure. In the following technical description, for ease of explanation, numerous details are set forth to provide a thorough understanding of the disclosed embodiments. However, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices may be shown in simplified form in order to simplify the drawings.
The terms "first", "second" and the like in the description and claims of the embodiments of the present disclosure and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances for the purposes of the embodiments of the present disclosure described herein. Furthermore, the terms "comprising" and "having", as well as any variations thereof, are intended to cover a non-exclusive inclusion.
Unless stated otherwise, the term "plurality" means two or more.
In the embodiments of the present disclosure, the character "/" indicates an "or" relationship between the objects before and after it. For example, A/B means: A or B.
The term "and/or" describes an association between objects and indicates that three relationships are possible. For example, "A and/or B" means: A, or B, or both A and B.
Fig. 1 is a schematic diagram of an implementation scenario of an interaction system provided by an embodiment of the present disclosure. As shown in Fig. 1, the interaction system may include an interaction model and interaction logs. The interaction model may receive request information and, after receiving it, reply to it. The interaction logs may record the request information and the corresponding reply.
The request information here may be user request information, such as a user's voice instruction or gesture instruction.
The interaction model's reply to the request information may be a control instruction issued to control another device, for example a control instruction for a smart air conditioner, a smart refrigerator or a smart TV, or may be a reply to the user by means of video, audio and the like.
The interaction system in the embodiments of the present disclosure relies on a specific device for execution; it may be executed by a terminal device, a home appliance, and the like.
The terminal device here may be, without limitation, a smartphone, a tablet computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or the like.
The home appliance here may be, without limitation, a smart TV, a smart refrigerator, a smart air conditioner, or the like.
Fig. 2 is a schematic diagram of a method for evaluating the interaction performance of an interaction system provided by an embodiment of the present disclosure. The method may be executed by a terminal device or a home appliance that has an interaction function. As shown in Fig. 2, the method for evaluating the interaction performance of an interaction system includes:
S201. Obtain the interaction state of each log in the interaction logs of the interaction system, and label each log according to the interaction state.
The interaction state is either interaction success or interaction failure.
The interaction logs record the request information (for example, request information sent by the user in the form of voice or gesture) and the interaction system's reply to that request information.
In some application scenarios, obtaining the interaction state of each log in the interaction logs includes: when a preset interaction strategy determines that a log in the interaction logs records a reply the user is satisfied with, determining the interaction state of that log as interaction success; and when the preset interaction strategy determines that a log in the interaction logs records a reply the user is not satisfied with, determining the interaction state of that log as interaction failure.
Each log in the interaction logs is read one by one and judged against the preset interaction strategy, so as to determine whether its interaction state is interaction success or interaction failure.
The preset interaction strategy here is a strategy for determining whether an interaction succeeded. For example, if two or more logs in the interaction logs have the same request information but different replies, the interaction state of those logs is determined as interaction failure. Alternatively, after the interaction system sends the reply corresponding to the request information, it receives "correct" or "wrong" feedback triggered by the user: if an interaction content receives "correct" feedback, the interaction state of the log recording that interaction content is determined as interaction success; if an interaction content receives "wrong" feedback, the interaction state of the log recording that interaction content is determined as interaction failure.
Optionally, labeling each log in the interaction logs according to the interaction state includes: when the interaction state of a log is interaction success, labeling that log as interaction success; and when the interaction state of a log is interaction failure, labeling that log as interaction failure. Labeling the logs in this way makes it easier, in subsequent steps, to identify the failure logs and to determine the interaction success rate of the current evaluation period.
In some practical applications, the interaction state of each log may be recorded in the log itself.
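The two example strategies for step S201 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `LogEntry` fields and function names are hypothetical, and labeling a request with a single consistent reply as a success is an assumption, since the text only states when such logs count as failures.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional

SUCCESS = "success"
FAILURE = "failure"


@dataclass
class LogEntry:
    request: str                    # request information from the user
    reply: str                      # reply given by the interaction system
    feedback: Optional[str] = None  # "correct", "wrong", or None
    state: Optional[str] = None     # label recorded in the log itself


def label_by_conflicting_replies(logs: List[LogEntry]) -> None:
    """Same request with different replies -> every such log is a failure."""
    replies = defaultdict(set)
    for log in logs:
        replies[log.request].add(log.reply)
    for log in logs:
        # Assumption: a request with one consistent reply counts as success.
        log.state = FAILURE if len(replies[log.request]) > 1 else SUCCESS


def label_by_user_feedback(logs: List[LogEntry]) -> None:
    """Label each log from the user's "correct" / "wrong" feedback."""
    for log in logs:
        if log.feedback == "correct":
            log.state = SUCCESS
        elif log.feedback == "wrong":
            log.state = FAILURE
        # Logs without feedback keep their current label.


logs = [
    LogEntry("turn on the air conditioner", "air conditioner on"),
    LogEntry("turn on the air conditioner", "tv on"),
    LogEntry("what is the weather", "sunny"),
]
label_by_conflicting_replies(logs)
states = [log.state for log in logs]

rated = [
    LogEntry("play music", "playing music", feedback="correct"),
    LogEntry("stop", "playing music", feedback="wrong"),
]
label_by_user_feedback(rated)
rated_states = [log.state for log in rated]
```

Writing the label into the `state` field mirrors the option of recording the interaction state in the log itself.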
S202. Determine the self-learning index of the interaction system according to the labeling result of each log, so as to evaluate the interaction performance of the interaction system.
The self-learning index of the interaction system can be used to evaluate the dynamic process by which the interaction system improves its own performance through self-learning, which benefits the self-learning process of the interaction system.
Fig. 3 is a schematic diagram of a process of determining a self-learning index provided by an embodiment of the present disclosure. As shown in Fig. 3, determining the self-learning index of the interaction system according to the labeling result of each log includes:
S301. Obtain, from the interaction logs, first failure logs, i.e. logs of interactions that failed within a set duration before the current evaluation period.
Each log in the interaction logs also records an interaction time, and the current evaluation period has a start time and an end time. From the interaction logs, first select the logs whose interaction time falls within the set duration before the start time of the current evaluation period, and then read the interaction state of each of these logs in turn. If the interaction state of a log is interaction failure, that log is determined to be a first failure log. There may be one or more first failure logs.
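Step S301 amounts to a time-window filter followed by a state filter. The sketch below assumes each log is a dictionary with hypothetical keys `time`, `content` and `state`, and assumes a half-open window that includes the window start and excludes the start of the evaluation period; the disclosure does not specify how the boundaries are treated.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def select_first_failure_logs(
    logs: List[Dict], period_start: datetime, set_duration: timedelta
) -> List[Dict]:
    """Return failed logs whose time lies in [period_start - set_duration, period_start)."""
    window_start = period_start - set_duration
    return [
        log
        for log in logs
        if window_start <= log["time"] < period_start and log["state"] == "failure"
    ]


period_start = datetime(2021, 6, 8)
logs = [
    # Too old: the 7-day window starts on 2021-06-01.
    {"time": datetime(2021, 5, 30), "content": "tv off", "state": "failure"},
    {"time": datetime(2021, 6, 2), "content": "ac on", "state": "failure"},
    {"time": datetime(2021, 6, 3), "content": "weather", "state": "success"},
    # Inside the evaluation period itself, so not a first failure log.
    {"time": datetime(2021, 6, 9), "content": "ac on", "state": "failure"},
]
first_failures = select_first_failure_logs(logs, period_start, timedelta(days=7))
```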
After the first failure logs are obtained, the interaction success rate of the first failure logs within the current evaluation period is obtained. For example, when the current evaluation period includes one or more sub-data collection periods, the interaction success rate of the interaction content corresponding to the first failure logs may be obtained for one sub-data collection period of the current evaluation period, or for all sub-data collection periods of the current evaluation period.
S302. Determine the interaction success rate of the current evaluation period according to the interaction success rate, within the current evaluation period, of the interaction content corresponding to the first failure logs.
The interaction content here may be a control instruction for controlling another device, for example a control instruction for a smart air conditioner, a smart refrigerator or a smart TV, or may be another instruction such as a weather query.
The interaction content corresponding to each first failure log may be performed one or more times within the current evaluation period, or not at all.
In some application scenarios, the set duration may be greater than, equal to, or less than the current evaluation period; in each case, the interaction success rate of the first failure logs within the current evaluation period is taken as the interaction success rate of the current evaluation period.
Optionally, when the current evaluation period includes one sub-data collection period, the interaction success rate, within that sub-data collection period, of the interaction content corresponding to the first failure logs is obtained and taken as the interaction success rate of the current evaluation period.
Optionally, the current evaluation period includes multiple sub-data collection periods and, correspondingly, the set duration may be one or more sub-data collection periods. The current evaluation period may include 2, 3, 4 or more sub-data collection periods. For example, with a sub-data collection period of one week, a current evaluation period of 14 days includes 2 sub-data collection periods, one of 21 days includes 3, and one of 28 days includes 4. On this basis, determining the interaction success rate of the current evaluation period according to the interaction success rate, within the current evaluation period, of the interaction content corresponding to the first failure logs includes: determining the interaction success rate of each sub-data collection period according to the first failure logs; obtaining the average of the interaction success rates of the multiple sub-data collection periods; and determining the average as the interaction success rate of the current evaluation period.
The interaction success rate of the current evaluation period can be determined by the above steps.
In some application scenarios, the set duration is one sub-data collection period and the current evaluation period includes multiple sub-data collection periods. On this basis, determining the interaction success rate of each sub-data collection period according to the first failure logs includes: when the current sub-data collection period is the first sub-data collection period of the current evaluation period, determining the interaction success rate, within the current sub-data collection period, of the interaction content corresponding to the first failure logs as the interaction success rate of the current sub-data collection period.
The interaction content corresponding to each first failure log may be performed one or more times within the current sub-data collection period. The total number of times the interaction content corresponding to all the first failure logs is performed within the current sub-data collection period is taken as the total number of samples, and the number of those interactions that succeed is taken as the number of successful samples. Dividing the number of successful samples by the total number of samples gives the interaction success rate, within the current sub-data collection period, of the interaction content corresponding to the first failure logs.
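The sample counting just described can be sketched as below. The dictionary keys `content` and `state` and the function name are hypothetical, and matching interaction content by exact string equality is an assumption. The disclosure does not say what happens when none of the failed content is repeated in the period; this sketch returns `None` in that case.

```python
from typing import Dict, List, Optional


def content_success_rate(
    failure_logs: List[Dict], period_logs: List[Dict]
) -> Optional[float]:
    """Success rate, within one sub-period, of previously failed interaction content."""
    failed_contents = {log["content"] for log in failure_logs}
    # Every occurrence of previously failed content in this sub-period is one sample.
    samples = [log for log in period_logs if log["content"] in failed_contents]
    if not samples:
        return None  # Assumption: undefined when the content is never repeated.
    successes = sum(1 for log in samples if log["state"] == "success")
    return successes / len(samples)


first_failures = [{"content": "ac on"}, {"content": "tv off"}, {"content": "ac on"}]
period_logs = [
    {"content": "ac on", "state": "success"},
    {"content": "ac on", "state": "failure"},
    {"content": "tv off", "state": "success"},
    {"content": "weather", "state": "success"},  # not previously failed, ignored
]
rate = content_success_rate(first_failures, period_logs)
```

Here three samples match previously failed content and two of them succeed, so the rate is 2/3.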
进一步地,根据第一失败日志,确定每个子数据采集周期的交互成功率,还包括:在当前子数据采集周期非当前评价周期的第一个子数据采集周期的情况下,在交互日志中,获得当前子数据采集周期的前一子数据采集周期内的交互失败的第二失败日志;将第二失败日志对应的交互内容在当前子数据采集周期内的交互成功率,确定为当前子数据采集周期的交互成功率。Further, according to the first failure log, determining the interaction success rate of each sub-data collection period further includes: in the interaction log, in the event that the current sub-data collection period is not the first sub-data collection period of the current evaluation period, Obtain the second failure log of the interaction failure in the previous sub-data collection period of the current sub-data collection period; determine the interaction success rate of the interaction content corresponding to the second failure log in the current sub-data collection period as the current sub-data collection period Periodic interaction success rate.
第二失败日志中每条日志对应的交互内容可在当前子数据采集周期进行一次或多次,将第二失败日志中全部日志对应的交互内容在当前子数据采集周期内进行的总次数作为样例总数;将第二失败日志中全部日志对应的交互内容在当前子数据采集周期内交互成功的次数,作为样例成功数,以样例成功数除以样例总数,即可获得第二失败日志对应的交互内容在当前子数据采集周期内的交互成功率。The interaction content corresponding to each log in the second failure log can be performed once or multiple times in the current sub-data collection cycle, and the total number of times the interaction content corresponding to all logs in the second failure log is carried out in the current sub-data collection cycle is used as a sample The total number of cases; the number of successful interactions in the current sub-data collection cycle of all the interactive content corresponding to the second failure log is used as the number of successful cases, and the second failure can be obtained by dividing the number of successful cases by the total number of cases Interaction success rate of the interaction content corresponding to the log in the current sub-data collection period.
这里以一个子数据采集周期为7天,当前评价周期包括4个子数据采集周期为例,对获得当前评价周期的交互成功率进行示例性说明。Here, taking a sub-data collection cycle of 7 days and a current evaluation cycle including 4 sub-data collection cycles as an example, an exemplary description is given to obtain the interaction success rate of the current evaluation cycle.
在当前评价周期中,四个子数据采集周期按时间由先至后依次为:第一个子数据采集周期、第二个子数据采集周期、第三个子数据采集周期和第四个子数据采集周期。为了获得当前评价周期的交互成功率,首先获得交互日志中当前评价周期之前的7天(上一个评价周期内的最后一个子数据采集周期)内的交互失败的第一失败日志,这些第一失败日志对应的交互内容可以是重复的,也可以是不重复的;在当前评价周期的第一个子数据采集周期内,获得第一失败日志对应的交互内容的交互成功率,在第一个子数据采集周期内,第一失败日志中每个日志对应的交互内容可进行一次或多次,也可未进行第一失败日志中的一个或多个日志对应的交互内容;将第一失败日志对应的交互内容的交互成功率作为第一个子数据采集周期的交互成功率k 1In the current evaluation cycle, the four sub-data collection cycles are as follows in order of time: the first sub-data collection cycle, the second sub-data collection cycle, the third sub-data collection cycle and the fourth sub-data collection cycle. In order to obtain the interaction success rate of the current evaluation period, first obtain the first failure log of the interaction failure in the 7 days before the current evaluation period (the last sub-data collection period in the previous evaluation period) in the interaction log, these first failures The interactive content corresponding to the log can be repeated or non-repetitive; in the first sub-data collection cycle of the current evaluation cycle, the interaction success rate of the interactive content corresponding to the first failure log is obtained, and in the first sub-data collection period, During the data collection period, the interactive content corresponding to each log in the first failure log may be performed one or more times, or the interactive content corresponding to one or more logs in the first failure log may not be performed; the first failure log corresponding to The interaction success rate of the interactive content is taken as the interaction success rate k 1 of the first sub-data collection period.
The second failure logs of interactions that failed within the first sub-data collection period are obtained, and the interaction success rate of the interaction content corresponding to these second failure logs within the second sub-data collection period is taken as the interaction success rate k₂ of the second sub-data collection period.
The second failure logs of interactions that failed within the second sub-data collection period are obtained, and the interaction success rate of the interaction content corresponding to these second failure logs within the third sub-data collection period is taken as the interaction success rate k₃ of the third sub-data collection period.
The second failure logs of interactions that failed within the third sub-data collection period are obtained, and the interaction success rate of the interaction content corresponding to these second failure logs within the fourth sub-data collection period is taken as the interaction success rate k₄ of the fourth sub-data collection period.
The interaction success rate of the current evaluation period is then k′ = (k₁ + k₂ + k₃ + k₄)/4.
The interaction success rate of each sub-data collection period can be obtained by the above steps.
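The chained procedure in this example, where each sub-period is scored against the failures of the period immediately before it, might be sketched as follows. The data layout, the helper names, and the decision to raise an error when no reference content recurs are assumptions made for illustration; the disclosure does not prescribe them.

```python
def success_rate(reference_contents, period_logs):
    """Share of successful attempts among interactions whose content
    appears in reference_contents (assumed (content, succeeded) tuples)."""
    targets = set(reference_contents)
    outcomes = [ok for content, ok in period_logs if content in targets]
    if not outcomes:
        # Assumption: the source does not define this case.
        raise ValueError("no reference content recurred in this sub-period")
    return sum(outcomes) / len(outcomes)


def failed_contents(period_logs):
    """Interaction contents of the logs labeled as failures."""
    return [content for content, ok in period_logs if not ok]


def chained_evaluation_rate(prior_failed_contents, sub_periods):
    """k' as the mean of k_1..k_n, where k_1 uses the first failure logs and
    each later k_i uses the failures of the preceding sub-period."""
    reference = prior_failed_contents
    rates = []
    for logs in sub_periods:
        rates.append(success_rate(reference, logs))
        reference = failed_contents(logs)  # second failure logs for the next period
    return sum(rates) / len(rates)


# Hypothetical data: failures from the 7 days before the evaluation period,
# followed by four 7-day sub-periods.
prior = ["a", "b"]
periods = [
    [("a", True), ("b", False), ("c", False)],  # k1 = 1/2
    [("b", True), ("c", True), ("d", False)],   # k2 = 2/2
    [("d", False), ("d", True)],                # k3 = 1/2
    [("d", True), ("a", True)],                 # k4 = 1/1
]
k_prime = chained_evaluation_rate(prior, periods)  # (0.5 + 1.0 + 0.5 + 1.0) / 4
```

Because the reference set is refreshed every sub-period, this variant tracks how quickly the most recent failures are corrected rather than the fate of a single fixed set.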
In some application scenarios, determining the interaction success rate of each sub-data collection period according to the first failure logs includes: determining the interaction success rate, within each sub-data collection period, of the interaction content corresponding to the first failure logs as the interaction success rate of that sub-data collection period.
The interaction content corresponding to each of the first failure logs may be performed once or several times within a given sub-data collection period. The total number of times the interaction content corresponding to all of the first failure logs is performed within that sub-data collection period is taken as the total sample count, and the number of those interactions that succeed within that sub-data collection period is taken as the successful sample count. Dividing the successful sample count by the total sample count gives the interaction success rate, within that sub-data collection period, of the interaction content corresponding to the first failure logs. Repeating this process yields the interaction success rate of each sub-data collection period in turn.
The following example, in which one sub-data collection period lasts 7 days and the current evaluation period contains 4 sub-data collection periods, illustrates how the interaction success rate of the current evaluation period is obtained.
In the current evaluation period, the four sub-data collection periods are, in chronological order, the first, second, third, and fourth sub-data collection periods.
The first failure logs of interactions that failed during the 7 days before the current evaluation period are obtained from the interaction log. The interaction success rate of the interaction content corresponding to these first failure logs within the first sub-data collection period is taken as the interaction success rate k₁ of the first sub-data collection period; the rate within the second sub-data collection period is taken as the interaction success rate k₂ of the second sub-data collection period; the rate within the third sub-data collection period is taken as the interaction success rate k₃ of the third sub-data collection period; and the rate within the fourth sub-data collection period is taken as the interaction success rate k₄ of the fourth sub-data collection period.
The interaction success rate of the current evaluation period is then k′ = (k₁ + k₂ + k₃ + k₄)/4.
The interaction success rate of each sub-data collection period can be obtained by the above steps.
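For comparison, the fixed-reference variant, in which every sub-period is scored against the same first failure logs, could look like the sketch below. As before, the `(content, succeeded)` tuple format, the function names, and the error raised when no reference content recurs are illustrative assumptions rather than details taken from the disclosure.

```python
def success_rate(reference_contents, period_logs):
    """Share of successful attempts among interactions whose content
    appears in reference_contents (assumed (content, succeeded) tuples)."""
    targets = set(reference_contents)
    outcomes = [ok for content, ok in period_logs if content in targets]
    if not outcomes:
        # Assumption: the source does not define this case.
        raise ValueError("no reference content recurred in this sub-period")
    return sum(outcomes) / len(outcomes)


def fixed_reference_evaluation_rate(first_failed_contents, sub_periods):
    """k' as the mean of k_1..k_n, with every k_i computed against the
    same first failure logs."""
    rates = [success_rate(first_failed_contents, logs) for logs in sub_periods]
    return sum(rates) / len(rates)


# Hypothetical data: the same two failed contents are followed for four sub-periods.
first_failed = ["a", "b"]
periods = [
    [("a", False), ("b", False)],              # k1 = 0/2
    [("a", True), ("b", False)],               # k2 = 1/2
    [("a", True), ("b", True), ("c", False)],  # k3 = 2/2 ("c" is ignored)
    [("a", True)],                             # k4 = 1/1
]
k_prime_fixed = fixed_reference_evaluation_rate(first_failed, periods)
```

Holding the reference set fixed makes the per-period rates directly comparable, so a rising sequence of k values shows the system gradually learning to handle the same originally failed content.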
S303. Determine the self-learning index according to the interaction success rate of the current evaluation period.
To evaluate more accurately the dynamic process by which the interaction system improves its own performance through self-learning, the concept of a self-learning index is introduced here. The self-learning index is determined from the aforementioned interaction success rate and the interaction failure rate, and it is a monotonically increasing function of the interaction success rate.
Accordingly, determining the self-learning index according to the interaction success rate of the current evaluation period may include: obtaining the interaction failure rate of the current evaluation period; and determining the self-learning index according to the interaction success rate and the interaction failure rate of the current evaluation period, such that the self-learning index is a monotonically increasing function of the interaction success rate. The self-learning index is then used as the optimization criterion.
The interaction success rate of the current evaluation period referred to here may be obtained by the method provided in the foregoing embodiments.
The optimization method varies with the interaction model. Those skilled in the art can choose a suitable optimization method according to the self-learning model underlying the interaction model, so the options are not enumerated here.
Specifically, the self-learning index can be determined by the following formula:
S = k′/(k′ + r)
where S is the self-learning index, k′ is the interaction success rate, and r is the interaction failure rate. When the interaction failure rate r of the current evaluation period is zero, the self-learning index S reaches its upper limit.
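The monotonicity and the upper limit stated for the index can be checked with a short derivation. The derivation assumes that k′ and r are non-negative and not both zero, which the source does not state explicitly.

```latex
\[
S(k', r) = \frac{k'}{k' + r}, \qquad
\frac{\partial S}{\partial k'}
  = \frac{(k' + r) - k'}{(k' + r)^{2}}
  = \frac{r}{(k' + r)^{2}} \ge 0 .
\]
\[
\text{For } r > 0 \text{ the derivative is strictly positive, and for } r = 0,\ k' > 0:
\quad S = \frac{k'}{k'} = 1 .
\]
```

Under these assumptions S lies between 0 and 1, increases with k′ whenever r is positive, and equals 1 when no interaction fails in the current evaluation period.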
The interaction failure rate of the current evaluation period can be obtained as follows: obtain a first number, which is the number of logs in the interaction log that fall within the current evaluation period, and a second number, which is the number of logs of failed interactions within the current evaluation period; then determine the interaction failure rate of the current evaluation period according to the first number and the second number.
Dividing the second number by the first number gives the interaction failure rate of the current evaluation period.
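Putting the failure rate and the index formula together, a minimal sketch might read as follows. The function names, the `(content, succeeded)` log format, and the errors raised for empty input or a zero denominator are assumptions added for the example.

```python
def interaction_failure_rate(period_logs):
    """r = second number / first number for the current evaluation period.

    period_logs: list of (content, succeeded) tuples (assumed format).
    """
    first_number = len(period_logs)                              # all logs in the period
    second_number = sum(1 for _, ok in period_logs if not ok)    # failed logs only
    if first_number == 0:
        raise ValueError("no logs in the current evaluation period")
    return second_number / first_number


def self_learning_index(k_prime, r):
    """S = k' / (k' + r), as given in the description."""
    if k_prime + r == 0:
        # Assumption: the source does not define S when both rates are zero.
        raise ValueError("index is undefined when k' and r are both zero")
    return k_prime / (k_prime + r)


# Hypothetical evaluation period: 10 logs, 2 of them failures.
logs = [("q%d" % i, i >= 2) for i in range(10)]
r = interaction_failure_rate(logs)   # 2 / 10
s = self_learning_index(0.75, r)     # 0.75 / (0.75 + 0.2)
```

Note that k′ is measured only on content that failed earlier, while r is measured on all logs of the period, so the two rates are not complements of each other and S is not simply equal to k′.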
Each log in the interaction log is labeled as either interaction success or interaction failure. From the interaction log, the failure logs of interactions that failed within the set duration before the current evaluation period are taken, and the interaction content corresponding to these failure logs is used as the sample set for evaluating the interaction success rate within the current evaluation period. Because these samples are selected during the actual operation of the interaction system, they can be used to evaluate the dynamic performance of the interaction system during interaction, and in turn the dynamic process by which the interaction system improves its own performance through self-learning, which benefits the self-learning process of the interaction system.
In some embodiments, the apparatus for evaluating the interaction performance of an interaction system includes a processor and a memory storing program instructions, and the processor is configured to execute, when executing the program instructions, the method for evaluating the interaction performance of an interaction system provided by the foregoing embodiments.
Fig. 4 is a schematic diagram of an apparatus for evaluating the interaction performance of an interaction system provided by an embodiment of the present disclosure. As shown in Fig. 4, the apparatus for evaluating the interaction performance of an interaction system includes:
a processor 41 and a memory 42, and may further include a communication interface 43 and a bus 44. The processor 41, the communication interface 43, and the memory 42 can communicate with one another through the bus 44. The communication interface 43 can be used for information transmission. The processor 41 can invoke logic instructions in the memory 42 to execute the method for evaluating the interaction performance of an interaction system provided by the foregoing embodiments.
In addition, the logic instructions in the memory 42 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium.
As a computer-readable storage medium, the memory 42 can be used to store software programs and computer-executable programs, such as the program instructions/modules corresponding to the methods in the embodiments of the present disclosure. By running the software programs, instructions, and modules stored in the memory 42, the processor 41 executes functional applications and data processing, that is, implements the methods in the foregoing method embodiments.
The memory 42 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created according to the use of the terminal device, and the like. In addition, the memory 42 may include high-speed random access memory and may also include non-volatile memory.
An embodiment of the present disclosure provides an interaction system that includes the apparatus for evaluating the interaction performance of an interaction system provided by the foregoing embodiments.
An embodiment of the present disclosure provides a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being configured to execute the method for evaluating the interaction performance of an interaction system provided by the foregoing embodiments.
An embodiment of the present disclosure provides a computer program product. The computer program product includes a computer program stored on a computer-readable storage medium, and the computer program includes program instructions that, when executed by a computer, cause the computer to execute the method for evaluating the interaction performance of an interaction system provided by the foregoing embodiments.
The above computer-readable storage medium may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
The technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods in the embodiments of the present disclosure. The storage medium may be a non-transitory storage medium, including a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other medium that can store program code, or it may be a transitory storage medium.
The above description and the accompanying drawings illustrate the embodiments of the present disclosure sufficiently to enable those skilled in the art to practice them. Other embodiments may include structural, logical, electrical, procedural, and other changes. The embodiments merely represent possible variations. Unless explicitly required, individual components and functions are optional, and the order of operations may vary. Parts and features of some embodiments may be included in, or substituted for, those of other embodiments. The terms used in the present disclosure are used only to describe the embodiments and not to limit the claims. As used in the description of the embodiments and in the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In addition, when used in the present disclosure, the term "comprise" and its variants "comprises" and/or "comprising" refer to the presence of the stated features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Without further limitation, an element qualified by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, or device that comprises the element. Herein, each embodiment may focus on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. For the methods, products, and the like disclosed in the embodiments, where they correspond to the method part disclosed in the embodiments, the relevant details can be found in the description of the method part.
Those skilled in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software may depend on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the embodiments of the present disclosure. Skilled persons will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the embodiments disclosed herein, the disclosed methods and products (including but not limited to apparatuses and devices) may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units may be only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of other forms. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to implement the embodiments. In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
The flowcharts and block diagrams in the accompanying drawings show the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to the embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. Each block in the block diagrams and/or flowcharts, and each combination of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.

Claims (19)

  1. A method for evaluating the interaction performance of an interaction system, comprising:
    obtaining the interaction state of each log in an interaction log of the interaction system, and labeling each log according to the interaction state;
    determining, according to the labeling result of each log, a self-learning index of the interaction system, so as to evaluate the interaction performance of the interaction system.
  2. The method according to claim 1, wherein determining the self-learning index of the interaction system according to the labeling result of each log comprises:
    obtaining, from the interaction log, first failure logs of interactions that failed within a set duration before a current evaluation period;
    determining the interaction success rate of the current evaluation period according to the interaction success rate, within the current evaluation period, of the interaction content corresponding to the first failure logs;
    determining the self-learning index according to the interaction success rate of the current evaluation period.
  3. The method according to claim 2, wherein the current evaluation period comprises a plurality of sub-data collection periods;
    determining the interaction success rate of the current evaluation period according to the interaction success rate, within the current evaluation period, of the interaction content corresponding to the first failure logs comprises:
    determining the interaction success rate of each sub-data collection period according to the first failure logs;
    obtaining the average of the interaction success rates of the plurality of sub-data collection periods;
    determining the average as the interaction success rate of the current evaluation period.
  4. The method according to claim 3, wherein the set duration is one sub-data collection period; determining the interaction success rate of each sub-data collection period according to the first failure logs comprises: when a current sub-data collection period is the first sub-data collection period of the current evaluation period, determining the interaction success rate, within the current sub-data collection period, of the interaction content corresponding to the first failure logs as the interaction success rate of the current sub-data collection period;
    or, determining the interaction success rate, within each sub-data collection period, of the interaction content corresponding to the first failure logs as the interaction success rate of that sub-data collection period.
  5. The method according to claim 4, wherein determining the interaction success rate of each sub-data collection period according to the first failure logs further comprises:
    when the current sub-data collection period is not the first sub-data collection period of the current evaluation period, obtaining, from the interaction log, second failure logs of interactions that failed within the sub-data collection period preceding the current sub-data collection period;
    determining the interaction success rate, within the current sub-data collection period, of the interaction content corresponding to the second failure logs as the interaction success rate of the current sub-data collection period.
  6. The method according to any one of claims 2 to 5, wherein determining the self-learning index according to the interaction success rate of the current evaluation period comprises:
    obtaining the interaction failure rate of the current evaluation period;
    determining the self-learning index according to the interaction success rate and the interaction failure rate of the current evaluation period, such that the self-learning index is a monotonically increasing function of the interaction success rate.
  7. The method according to claim 6, wherein determining the self-learning index according to the interaction success rate and the interaction failure rate of the current evaluation period comprises:
    S = k′/(k′ + r)
    where S is the self-learning index, k′ is the interaction success rate, and r is the interaction failure rate.
  8. The method according to claim 6, wherein obtaining the interaction failure rate of the current evaluation period comprises:
    obtaining a first number of logs in the interaction log that fall within the current evaluation period, and a second number of logs of failed interactions within the current evaluation period;
    determining the interaction failure rate of the current evaluation period according to the first number and the second number.
  9. The method according to any one of claims 1 to 5, wherein obtaining the interaction state of each log in the interaction log comprises:
    when a preset interaction policy determines that a log in the interaction log is a reply that satisfies the user, determining the interaction state of that log as interaction success;
    when the preset interaction policy determines that a log in the interaction log is a reply that does not satisfy the user, determining the interaction state of that log as interaction failure.
  10. An apparatus for evaluating the interaction performance of an interaction system, comprising a processor and a memory storing program instructions, the processor being configured to execute the following method when executing the program instructions:
    obtaining the interaction state of each log in an interaction log of the interaction system, and labeling each log according to the interaction state;
    determining, according to the labeling result of each log, a self-learning index of the interaction system, so as to evaluate the interaction performance of the interaction system.
  11. The apparatus according to claim 10, wherein, when determining the self-learning index of the interaction system, the processor specifically executes the following method:
    obtaining, from the interaction log, first failure logs of interactions that failed within a set duration before a current evaluation period;
    determining the interaction success rate of the current evaluation period according to the interaction success rate, within the current evaluation period, of the interaction content corresponding to the first failure logs;
    determining the self-learning index according to the interaction success rate of the current evaluation period.
  12. The apparatus according to claim 11, wherein the current evaluation period comprises a plurality of sub-data collection periods;
    when determining the interaction success rate of the current evaluation period according to the interaction success rate, within the current evaluation period, of the interaction content corresponding to the first failure logs, the processor specifically executes the following method:
    determining the interaction success rate of each sub-data collection period according to the first failure logs;
    obtaining the average of the interaction success rates of the plurality of sub-data collection periods;
    determining the average as the interaction success rate of the current evaluation period.
  13. The apparatus according to claim 12, wherein the set duration is one sub-data collection period; when determining the interaction success rate of each sub-data collection period according to the first failure logs, the processor specifically executes the following method: when a current sub-data collection period is the first sub-data collection period of the current evaluation period, determining the interaction success rate, within the current sub-data collection period, of the interaction content corresponding to the first failure logs as the interaction success rate of the current sub-data collection period;
    or, determining the interaction success rate, within each sub-data collection period, of the interaction content corresponding to the first failure logs as the interaction success rate of that sub-data collection period.
  14. The apparatus according to claim 13, wherein, when determining the interaction success rate of each sub-data collection period according to the first failure logs, the processor further executes the following method:
    when the current sub-data collection period is not the first sub-data collection period of the current evaluation period, obtaining, from the interaction log, second failure logs of interactions that failed within the sub-data collection period preceding the current sub-data collection period;
    determining the interaction success rate, within the current sub-data collection period, of the interaction content corresponding to the second failure logs as the interaction success rate of the current sub-data collection period.
  15. The apparatus according to any one of claims 11 to 14, wherein, when determining the self-learning index according to the interaction success rate of the current evaluation period, the processor specifically executes the following method:
    obtaining the interaction failure rate of the current evaluation period;
    determining the self-learning index according to the interaction success rate and the interaction failure rate of the current evaluation period, such that the self-learning index is a monotonically increasing function of the interaction success rate.
  16. The apparatus according to claim 15, wherein, when determining the self-learning index according to the interaction success rate and the interaction failure rate of the current evaluation period, the processor specifically applies the following formula:
    S = k′ / (k′ + r)
    where S is the self-learning index, k′ is the interaction success rate, and r is the interaction failure rate.
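
As a minimal sketch of the formula in claim 16, the function below computes S = k′ / (k′ + r). The function name, the input validation, and the error raised when both rates are zero are assumptions added for illustration; the claim only gives the formula.

```python
def self_learning_index(success_rate: float, failure_rate: float) -> float:
    """Compute S = k' / (k' + r) from claim 16.

    success_rate is k' and failure_rate is r. The checks below are
    assumptions; the claim does not say how degenerate inputs are handled.
    """
    if success_rate < 0 or failure_rate < 0:
        raise ValueError("rates must be non-negative")
    denominator = success_rate + failure_rate
    if denominator == 0:
        raise ValueError("index is undefined when both rates are zero")
    return success_rate / denominator
```

For a fixed r > 0, the derivative of S with respect to k′ is r / (k′ + r)², which is positive, so S increases monotonically with the interaction success rate as claim 15 requires.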
  17. The apparatus according to claim 15, wherein, when obtaining the interaction failure rate of the current evaluation period, the processor specifically performs the following method:
    obtaining a first number, being the number of logs in the interaction log that fall within the current evaluation period, and a second number, being the number of logs of failed interactions within the current evaluation period;
    determining the interaction failure rate of the current evaluation period according to the first number and the second number.
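
One plausible reading of claim 17 is that the failure rate is the second number divided by the first number. The claim only says the rate is determined "according to" the two counts, so the ratio is an assumption, as are the `(timestamp, success)` tuple layout and the half-open period boundaries.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple


def interaction_failure_rate(
    logs: Iterable[Tuple[datetime, bool]],
    period_start: datetime,
    period_end: datetime,
) -> Optional[float]:
    """Failure rate of the logs inside [period_start, period_end).

    Each log is a hypothetical (timestamp, success) pair.
    """
    in_period = [ok for ts, ok in logs if period_start <= ts < period_end]
    first_number = len(in_period)                         # all logs in period
    second_number = sum(1 for ok in in_period if not ok)  # failed logs
    if first_number == 0:
        return None  # assumption: undefined for an empty evaluation period
    return second_number / first_number
```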
  18. The apparatus according to any one of claims 10 to 14, wherein, when obtaining the interaction status of each log in the interaction log, the processor specifically performs the following method:
    in a case where a preset interaction policy determines that a log in the interaction log is a reply the user is satisfied with, determining the interaction status of that log as interaction success;
    in a case where the preset interaction policy determines that a log in the interaction log is a reply the user is not satisfied with, determining the interaction status of that log as interaction failure.
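
The labeling step in claim 18 can be sketched as a function that applies a pluggable policy to each log. The claims do not disclose what the preset interaction policy is, so the fallback-phrase heuristic below, the `reply` key, and the phrase list are all hypothetical placeholders.

```python
from enum import Enum
from typing import Callable, Iterable, List


class InteractionStatus(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


# Hypothetical fallback replies that signal the system did not understand.
FALLBACK_REPLIES = {"sorry, i did not understand", "please say that again"}


def reply_is_satisfactory(log: dict) -> bool:
    """Hypothetical stand-in for the preset interaction policy."""
    return log["reply"].strip().lower() not in FALLBACK_REPLIES


def label_logs(
    logs: Iterable[dict],
    policy: Callable[[dict], bool] = reply_is_satisfactory,
) -> List[InteractionStatus]:
    """Assign an interaction status to every log using the given policy."""
    return [
        InteractionStatus.SUCCESS if policy(log) else InteractionStatus.FAILURE
        for log in logs
    ]
```

Passing the policy as a parameter keeps the labeling logic independent of whichever satisfaction criterion is actually used.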
  19. A computer-readable storage medium storing computer-executable instructions which, when run, perform the method for evaluating interaction performance of an interaction system according to any one of claims 1 to 9.
PCT/CN2022/096513 2021-06-02 2022-06-01 Method and apparatus for evaluating interaction performance of interaction system WO2022253251A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110616138.4A CN113282475B (en) 2021-06-02 2021-06-02 Method and device for evaluating interactive performance of interactive system
CN202110616138.4 2021-06-02

Publications (1)

Publication Number Publication Date
WO2022253251A1 (en) 2022-12-08

Family

ID=77283290

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096513 WO2022253251A1 (en) 2021-06-02 2022-06-01 Method and apparatus for evaluating interaction performance of interaction system

Country Status (2)

Country Link
CN (1) CN113282475B (en)
WO (1) WO2022253251A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282475B (en) * 2021-06-02 2022-12-06 青岛海尔科技有限公司 Method and device for evaluating interactive performance of interactive system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009110220A (en) * 2007-10-30 2009-05-21 Hitachi Ltd Audit log collection/evaluation system, audit log collection/evaluation method, and collection/evaluation computer
CN105027197A (en) * 2013-03-15 2015-11-04 苹果公司 Training an at least partial voice command system
US9921574B1 (en) * 2016-03-03 2018-03-20 Sprint Communications Company L.P. Dynamic interactive robot dialogue creation incorporating disparate information sources and collective feedback analysis
CN109545185A (en) * 2018-11-12 2019-03-29 百度在线网络技术(北京)有限公司 Interactive system evaluation method, evaluation system, server and computer-readable medium
CN111666396A (en) * 2020-06-05 2020-09-15 北京百度网讯科技有限公司 User intention understanding satisfaction evaluation method, device, equipment and storage medium
CN113282475A (en) * 2021-06-02 2021-08-20 青岛海尔科技有限公司 Method and device for evaluating interactive performance of interactive system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8838559B1 (en) * 2011-02-24 2014-09-16 Cadence Design Systems, Inc. Data mining through property checks based upon string pattern determinations
CN105488185B (en) * 2015-12-01 2018-07-24 上海智臻智能网络科技股份有限公司 A kind of optimization method and device of knowledge base
CN108388926B (en) * 2018-03-15 2019-07-30 百度在线网络技术(北京)有限公司 The determination method and apparatus of interactive voice satisfaction
CN109857716B (en) * 2019-01-28 2023-06-27 平安科技(深圳)有限公司 System interaction log recording method and device, storage medium and server
CN111985751B (en) * 2019-05-23 2023-09-26 百度在线网络技术(北京)有限公司 Human-computer chat experience assessment system
CN110738239A (en) * 2019-09-20 2020-01-31 浙江大学 search engine user satisfaction evaluation method based on mouse interaction sequence region behavior joint modeling
CN111460292A (en) * 2020-03-30 2020-07-28 百度在线网络技术(北京)有限公司 Model evaluation method, apparatus, device, and medium
CN112416887B (en) * 2020-11-18 2024-01-30 脸萌有限公司 Information interaction method and device and electronic equipment

Also Published As

Publication number Publication date
CN113282475B (en) 2022-12-06
CN113282475A (en) 2021-08-20

Similar Documents

Publication Publication Date Title
KR102323333B1 (en) Application data processing method and apparatus, and storage medium
EP2830044B1 (en) Instruction processing method, apparatus, and system
CN105477854B (en) Applied to the handle control method of intelligent terminal, apparatus and system
CN105320417B (en) Page switching method and client
CN104699591A (en) Reappearing method and device of test scenes
CN103809735B (en) A kind of method and device of gesture identification
JP2016530657A (en) Application switching and input information adding method and apparatus
CN110874324A (en) Test result data storage method and device, terminal equipment and storage medium
CN105740326A (en) Thread state monitoring method and device for browser
WO2022253251A1 (en) Method and apparatus for evaluating interaction performance of interaction system
WO2020199937A1 (en) Method and device for processing information in game, storage medium and electronic device
CN110908837B (en) Application program exception handling method and device, electronic equipment and storage medium
WO2014008789A1 (en) Method and device for processing browser window
CN110471585A (en) Function of application icon methods of exhibiting, device and computer equipment
CN103353858B (en) A kind of automated testing method based on action touch simulation and device
CN108769175A (en) Remote real machine access control method, device, storage medium and electronic equipment
CN108256811A (en) Job information processing method, device, computer equipment and storage medium
CN115129572A (en) Performance test method, device, equipment and medium
CN105446848B (en) The test method and device of the data processing performance of electronic equipment
CN105549894A (en) Touch information processing method and apparatus, touch information acquisition method and apparatus and touch information processing system
CN104717175B (en) The processing method and system of virtual desktop
CN107992372A (en) A kind of chassis information exchange method, system, equipment and computer-readable storage medium
CN110248023B (en) Intelligent terminal control method, device, equipment and medium
CN105554134B (en) Information synchronization method and device
CN107147719A (en) A kind of hardware update method, master node, slave node and server cluster

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22815295

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE