WO2020224448A1

WO2020224448A1 - Interaction method and device, loudspeaker, electronic apparatus, and storage medium

Info

Publication number: WO2020224448A1
Application number: PCT/CN2020/086627
Authority: WO
Inventors: 冯伟国; 牛也; 李双江; 丁盘苹; 金灿灿
Original assignee: 阿里巴巴集团控股有限公司
Priority date: 2019-05-07
Filing date: 2020-04-24
Publication date: 2020-11-12
Also published as: CN111914983A; CN111914983B

Abstract

The present disclosure relates to an interaction method and device, a loudspeaker, an electronic apparatus, and a storage medium. The method comprises: acquiring the t-th round of user input information, wherein t is an integer greater than or equal to 1; determining a decision outcome of multiple decision-making models in the t-th round at least according to the t-th round of user input information and the (t-1)-th round of output information, wherein the (t-1)-th round of output information is determined according to a decision outcome of the multiple decision-making models in the (t-1)-th round; and determining the t-th round of output information according to the decision outcome of the multiple decision-making models in the t-th round. In the present disclosure, the previous round of decision-making is used to enable all decision-making models to make better decisions in the present round, and the respective decision-making models are integrated so as to reach a more accurate decision for each round of decision-making.

Description

Interactive method and device, sound box, electronic equipment and storage medium

This application claims priority to Chinese patent application No. 201910376459.4, filed on May 7, 2019 and entitled "Interaction method and device, speaker, electronic equipment and storage medium", the entire content of which is incorporated into this application by reference.

Technical field

The present disclosure relates to the field of computer technology, and in particular to an interactive method and device, speakers, electronic equipment, and storage media.

Background art

In the voice interaction scenario, the voice input by the user undergoes natural language understanding and dialogue decision-making, and then a response to the user or the service requested by the user is obtained. Engineers and researchers provide various types of dialogue decision models. Currently, in the dialogue decision-making system, different dialogue decision-making models exist separately. For example, the text entered by the user first passes through a rule-based dialogue decision model, and if there is a result, it is returned to the user, otherwise it continues to pass through the matching-based dialogue decision model, and if there is a result, it is returned to the user. Different dialogue decision models are completely separated and cannot help each other to make better dialogue decisions during the next user input.

How to integrate various dialogue decision-making models for more accurate dialogue decision-making is an urgent problem to be solved.

Summary of the invention

The present disclosure proposes an interactive technical solution.

According to an aspect of the present disclosure, there is provided an interaction method including:

Obtaining user input information of a t-th round, where t is an integer greater than or equal to 1;

Determining decision results of multiple decision models in the t-th round at least according to the user input information of the t-th round and output information of a (t-1)-th round, wherein the output information of the (t-1)-th round is determined according to decision results of the multiple decision models in the (t-1)-th round;

Determining output information of the t-th round according to the decision results of the multiple decision models in the t-th round.

In a possible implementation manner, the determining the decision results of multiple decision models in the t round at least according to the user input information of the t round and the output information of the t-1 round includes:

According to the user input information of the t-th round, the output information of the t-1 round, and the state information of the t-th round, the decision results of the multiple decision models in the t-th round are determined.

In a possible implementation manner, the multiple decision models include a recurrent neural network model;

The determining the decision results of multiple decision models in the t-th round at least according to the user input information of the t-th round and the output information of the (t-1)-th round includes:

Determining the decision results of the multiple decision models in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, the state information of the t-th round, and the hidden layer feature of the recurrent neural network model in the (t-1)-th round.

In a possible implementation manner, the state information of the t-th round includes at least one of the following: entity information related to the t-th round, time information to which the t-th round belongs, and round number information of the t-th round.

In a possible implementation manner, the determining the output information of the t round according to the decision results of the multiple decision models in the t round includes:

Voting according to the decision results of the multiple decision models in the t-th round to obtain a voting result;

According to the voting result, the output information of the t-th round is determined.

In a possible implementation manner, the decision result of the multiple decision models in the t round includes the decision category of each of the multiple decision models in the t round;

The voting according to the decision results of the multiple decision models in the t-th round to obtain the voting result includes:

Determine the decision categories of the decision models in the t-th round as the categories voted by the decision models;

The category with the highest number of votes is determined as the voting result.

In a possible implementation manner, the decision results of the multiple decision models in the t-th round include the candidate decision categories of each of the multiple decision models in the t-th round and the weights of the candidate decision categories;

The voting according to the decision results of the multiple decision models in the t-th round to obtain the voting result includes:

Determining the sum of the weights corresponding to each candidate decision category according to the weights of the candidate decision categories of each decision model in the t-th round;

Determining the candidate decision category with the largest sum of weights as the voting result.

In a possible implementation manner, the determining the output information of the t-th round according to the decision results of the multiple decision models in the t-th round includes:

Inputting the decision result of each of the multiple decision models in the t-th round into a preset deep learning model, and outputting the output information of the t-th round via the preset deep learning model.

In a possible implementation manner, the user input information is voice input by the user.

According to another aspect of the present disclosure, there is provided an interactive device, including:

The obtaining module is used to obtain the user input information of the t-th round, where t is an integer greater than or equal to 1;

The first determining module is configured to determine the decision results of multiple decision models in the t round at least according to the user input information in the t-th round and the output information in the t-1 round, wherein the t-1 The output information of the round is determined according to the decision results of the multiple decision models in the t-1 round;

The second determining module is configured to determine the output information of the t-th round according to the decision results of the multiple decision models in the t-th round.

In a possible implementation manner, the multiple decision models include a recurrent neural network model;

The first determining module is configured to: determine the decision results of the multiple decision models in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, the state information of the t-th round, and the hidden layer feature of the recurrent neural network model in the (t-1)-th round.

In a possible implementation manner, the second determining module is configured to: vote according to the decision results of the multiple decision models in the t-th round to obtain a voting result; and determine the output information of the t-th round according to the voting result.

In a possible implementation manner, the second determining module is configured to: input the decision result of each of the multiple decision models in the t-th round into a preset deep learning model, and output the output information of the t-th round via the preset deep learning model.

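According to another aspect of the present disclosure, there is provided a sound box, including:

An obtaining module, configured to obtain the user input information of the t-th round, where t is an integer greater than or equal to 1;

A first determining module, configured to determine the decision results of multiple decision models in the t-th round at least according to the user input information of the t-th round and the output information of the (t-1)-th round, wherein the output information of the (t-1)-th round is determined according to the decision results of the multiple decision models in the (t-1)-th round;

A second determining module, configured to determine the output information of the t-th round according to the decision results of the multiple decision models in the t-th round.

In possible implementation manners, the first determining module and the second determining module of the sound box are configured in the same way as the corresponding modules of the interactive device described above.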

According to another aspect of the present disclosure, there is provided an electronic device including:

A processor;

A memory for storing processor executable instructions;

Wherein, the processor is configured to call the instructions stored in the memory to execute the aforementioned interaction method.

According to another aspect of the present disclosure, there is provided a computer-readable storage medium having computer program instructions stored thereon, and when the computer program instructions are executed by a processor, the foregoing interaction method is implemented.

In the embodiments of the present disclosure, the user input information of the t-th round is obtained, the decision results of multiple decision models in the t-th round are determined at least according to the user input information of the t-th round and the output information of the (t-1)-th round, and the output information of the t-th round is determined according to the decision results of the multiple decision models in the t-th round. In this way, the decision made in the previous round can help all the decision models make better decisions in the current round, and the decision models are integrated in each round so that a more accurate decision is reached.

It should be understood that the above general description and the following detailed description are only exemplary and explanatory, rather than limiting the present disclosure.

According to the following detailed description of exemplary embodiments with reference to the accompanying drawings, other features and aspects of the present disclosure will become clear.

Description of the drawings

The drawings herein are incorporated into the specification and constitute a part of the specification. These drawings illustrate embodiments that conform to the disclosure and are used together with the specification to explain the technical solutions of the disclosure.

Fig. 1 shows a flowchart of an interaction method according to an embodiment of the present disclosure;

Fig. 2 shows a schematic diagram of an interaction method according to an embodiment of the present disclosure;

Fig. 3 shows a block diagram of an interactive device according to an embodiment of the present disclosure;

Fig. 4 shows a block diagram of a sound box according to an embodiment of the present disclosure;

Fig. 5 shows a block diagram of an electronic device 800 according to an embodiment of the present disclosure;

Fig. 6 shows a block diagram of an electronic device 1900 according to an embodiment of the present disclosure.

Detailed description

Various exemplary embodiments, features, and aspects of the present disclosure will be described in detail below with reference to the drawings. The same reference numerals in the drawings indicate elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, unless otherwise noted, the drawings are not necessarily drawn to scale.

The dedicated word "exemplary" here means "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" need not be construed as being superior or better than other embodiments.

The term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A exists alone, A and B exist at the same time, or B exists alone. In addition, the term "at least one" herein means any one of multiple items or any combination of at least two of them. For example, "including at least one of A, B, and C" can mean including any one or more elements selected from the set formed by A, B, and C.

In addition, in order to better illustrate the present disclosure, numerous specific details are given in the following specific embodiments. Those skilled in the art should understand that the present disclosure can also be implemented without some specific details. In some instances, the methods, means, elements, and circuits well-known to those skilled in the art have not been described in detail in order to highlight the gist of the present disclosure.

Fig. 1 shows a flowchart of an interaction method according to an embodiment of the present disclosure. The interaction method may be executed by an interactive device. For example, the interaction method can be executed by a terminal device, a server, or another processing device. The terminal device can be user equipment (UE), a mobile device, a user terminal, a terminal, a speaker, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. In some possible implementations, the interaction method can be implemented by a processor calling computer-readable instructions stored in a memory. As shown in Fig. 1, the interaction method includes steps S11 to S13.

In step S11, the user input information of the t-th round is obtained, where t is an integer greater than or equal to 1.

In the embodiments of the present disclosure, the user input information may be voice, text, or other information input by the user.

In a possible implementation manner, in a voice interaction application scenario, the user input information may be voice input by the user. Using ASR (Automatic Speech Recognition) technology, the speech input by the user can be converted into text; using an NLU (Natural Language Understanding) model, semantic vectors can be obtained from the converted text.

In step S12, the decision results of multiple decision models in the t-th round are determined at least according to the user input information of the t-th round and the output information of the (t-1)-th round, wherein the output information of the (t-1)-th round is determined according to the decision results of the multiple decision models in the (t-1)-th round.

In the embodiments of the present disclosure, the number of decision models is greater than or equal to two. For example, the multiple decision models include an LSTM (Long Short-Term Memory) model and a matching model.

In one possible implementation, the decision model is an end-to-end decision model. For example, the decision model may be an end-to-end dialogue decision model, an end-to-end decision model based on a gated recurrent unit (Gated Recurrent Unit, GRU), or an end-to-end decision model based on context matching.

In a possible implementation manner, determining the decision results of multiple decision models in the t-th round at least according to the user input information of the t-th round and the output information of the (t-1)-th round includes: determining the decision results of the multiple decision models in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, and the state information of the t-th round.

In this implementation, the state information of the t-th round may refer to the context state information of the t-th interaction.

In a possible implementation, the state information of the t-th round includes at least one of the following: entity information related to the t-th round, time information to which the t-th round belongs, and round number information of the t-th round.

In an example, the entity information related to the t-th round may include location information and person information related to the t-th round. In another example, in a shopping scenario, the entity information related to the t-th round may include entity information such as brand, product, and price. In another example, in a mobile phone recharge scenario, the entity information related to the t-th round may include entity information such as phone numbers, nicknames, and prices.

In an example, the time information to which the t-th round belongs may include information such as whether the time to which the t-th round belongs is a working day or a holiday.

In an example, the round number information of the t-th round may be information on the number of interactions on the current day. From the round number information of the t-th round, it can be determined how many interactions have taken place on that day.
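The state information described above can be pictured as a small record built once per round. The following is a minimal sketch only: the class name `RoundState`, its field names, and the use of weekends as a stand-in for holidays are assumptions made for illustration and do not come from the disclosure.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict


@dataclass
class RoundState:
    """Context state of one interaction round (all names are illustrative)."""
    entities: Dict[str, str] = field(default_factory=dict)  # e.g. brand, product, price
    is_holiday: bool = False   # time information: working day or holiday
    round_number: int = 1      # number of interactions so far on the current day


def build_state(entities: Dict[str, str], today: date, round_number: int) -> RoundState:
    # Assumption: weekends stand in for holidays; a real system would
    # consult a proper holiday calendar.
    return RoundState(
        entities=dict(entities),
        is_holiday=today.weekday() >= 5,
        round_number=round_number,
    )


# Shopping scenario: third interaction of the day, on a Tuesday.
state = build_state({"brand": "ExampleBrand", "price": "100"}, date(2019, 5, 7), 3)
print(state)
```

A record of this kind could then be encoded as a vector and passed to each decision model together with the user input and the previous output.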

In a possible implementation, the multiple decision models include a Recurrent Neural Network (RNN) model; determining the decision results of multiple decision models in the t-th round at least according to the user input information of the t-th round and the output information of the (t-1)-th round includes: determining the decision results of the multiple decision models in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, the state information of the t-th round, and the hidden layer feature of the recurrent neural network model in the (t-1)-th round.

For example, if the recurrent neural network model is an LSTM model, the LSTM model can determine its decision result in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, the state information of the t-th round, and the hidden layer feature of the LSTM model in the (t-1)-th round.
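How a hidden layer feature from round t-1 enters the computation of round t can be sketched with a plain tanh recurrent cell. This is deliberately not an LSTM: it is a simplified stand-in, and the weight matrices, vector sizes, and the helper `round_features` are hypothetical values chosen only to make the example run.

```python
import math
from typing import List


def recurrent_step(x: List[float], h_prev: List[float],
                   w_x: List[List[float]], w_h: List[List[float]]) -> List[float]:
    """One step of a plain recurrent cell: h_t = tanh(W_x x_t + W_h h_(t-1))."""
    h_t = []
    for i in range(len(h_prev)):
        s = sum(w_x[i][j] * x[j] for j in range(len(x)))
        s += sum(w_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(s))
    return h_t


def round_features(user_vec, prev_action_onehot, state_vec) -> List[float]:
    # Concatenate the three per-round inputs named in the text.
    return list(user_vec) + list(prev_action_onehot) + list(state_vec)


# Hypothetical weights: 2 hidden units, 4 input features.
W_X = [[0.5, -0.2, 0.1, 0.3], [0.0, 0.4, -0.3, 0.2]]
W_H = [[0.1, 0.0], [0.0, 0.1]]

h0 = [0.0, 0.0]
x1 = round_features([1.0], [0.0, 0.0], [1.0])  # round 1: no previous output yet
h1 = recurrent_step(x1, h0, W_X, W_H)
x2 = round_features([0.5], [1.0, 0.0], [2.0])  # round 2: previous output was action 0
h2 = recurrent_step(x2, h1, W_X, W_H)          # h1 carries round-1 context forward
print(h1, h2)
```

In this sketch the decision result of the recurrent model would be read off `h2` by a further output layer, which is omitted here.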

In step S13, the output information of the t round is determined according to the decision results of the multiple decision models in the t round.

In the embodiments of the present disclosure, the output information of the t-th round is determined according to the decision results of multiple decision models in the t-th round. This combines the information and the respective advantages of the multiple decision models, so that more accurate output information is obtained.

In the embodiment of the present disclosure, after the output information of the t-th round is obtained, the output information of the t-th round is output. For example, the output information of the t-th round is to reply to the user or return the requested service to the user. For example, if the user input information is a voice asking about the weather, the reply to the user is the weather query result.

In a possible implementation, determining the output information of the t-th round according to the decision results of the multiple decision models in the t-th round includes: voting according to the decision results of the multiple decision models in the t-th round to obtain a voting result; and determining the output information of the t-th round according to the voting result.

In one example, the decision results of the multiple decision models in the t-th round include the decision category of each of the multiple decision models in the t-th round; voting according to the decision results of the multiple decision models in the t-th round to obtain the voting result includes: determining the decision category of each decision model in the t-th round as the category voted for by that decision model; and determining the category with the highest number of votes as the voting result.

For example, there are three decision models: decision model 1, decision model 2, and decision model 3. The decision category of decision model 1 in the t-th round is category A, the decision category of decision model 2 in the t-th round is category A, and the decision category of decision model 3 in the t-th round is category C. The number of votes for category A is then 2 and the number of votes for category C is 1, so category A is determined as the voting result.
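The majority vote in this example can be sketched as follows. The function name `majority_vote` and the model labels are illustrative assumptions; the text does not say how ties are handled, so this sketch simply keeps the first category encountered among tied ones.

```python
from collections import Counter
from typing import Dict, Tuple


def majority_vote(decisions: Dict[str, str]) -> Tuple[str, Dict[str, int]]:
    """Each model casts one vote for its decision category; the most-voted category wins."""
    counts = Counter(decisions.values())
    # most_common(1) returns the category with the highest count; among equal
    # counts, the one encountered first is returned (an assumption, see above).
    category, _ = counts.most_common(1)[0]
    return category, dict(counts)


result, tally = majority_vote({"model_1": "A", "model_2": "A", "model_3": "C"})
print(result, tally)
```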

In another example, the decision results of the multiple decision models in the t-th round include the candidate decision categories of each of the multiple decision models in the t-th round and the weights of the candidate decision categories; voting according to the decision results of the multiple decision models in the t-th round to obtain the voting result includes: determining the sum of the weights corresponding to each candidate decision category according to the weights of the candidate decision categories of each decision model in the t-th round; and determining the candidate decision category with the largest sum of weights as the voting result.

For example, there are three decision models: decision model 1, decision model 2, and decision model 3. The decision result of decision model 1 in the t-th round is (A: 0.3, B: 0.4), that is, the candidate decision categories of decision model 1 in the t-th round are A and B, where the weight of candidate decision category A is 0.3 and the weight of candidate decision category B is 0.4. The decision result of decision model 2 in the t-th round is (A: 0.3, B: 0.4), that is, the candidate decision categories of decision model 2 in the t-th round are A and B, where the weight of candidate decision category A is 0.3 and the weight of candidate decision category B is 0.4. The decision result of decision model 3 in the t-th round is (A: 0.9, B: 0.1), that is, the candidate decision categories of decision model 3 in the t-th round are A and B, where the weight of candidate decision category A is 0.9 and the weight of candidate decision category B is 0.1. From the candidate decision categories and weights of decision model 1, decision model 2, and decision model 3 in the t-th round, the sum of the weights corresponding to candidate decision category A is 0.3 + 0.3 + 0.9 = 1.5, and the sum of the weights corresponding to candidate decision category B is 0.4 + 0.4 + 0.1 = 0.9, so candidate decision category A is determined as the voting result.
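The weighted vote in this example can be reproduced with a few lines of code. This is a sketch under the assumption that each decision result is represented as a mapping from candidate category to weight; the function name `weighted_vote` is illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def weighted_vote(results: List[Dict[str, float]]) -> Tuple[str, Dict[str, float]]:
    """Sum the weights per candidate category and pick the largest sum."""
    totals: Dict[str, float] = defaultdict(float)
    for candidates in results:
        for category, weight in candidates.items():
            totals[category] += weight
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


winner, totals = weighted_vote([
    {"A": 0.3, "B": 0.4},  # decision model 1
    {"A": 0.3, "B": 0.4},  # decision model 2
    {"A": 0.9, "B": 0.1},  # decision model 3
])
print(winner, totals)  # A wins with a sum of about 1.5 against about 0.9 for B
```

Note that a simple majority vote over each model's top category would pick B here (two models prefer B), so the two voting schemes can disagree on the same inputs.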

In a possible implementation manner, determining the output information of the t-th round according to the decision results of the multiple decision models in the t-th round includes: inputting the decision result of each of the multiple decision models in the t-th round into a preset deep learning model, and outputting the output information of the t-th round through the preset deep learning model.

In this implementation manner, the preset deep learning model may be a pre-trained deep learning model for determining output information according to the decision results of multiple decision models.

For example, there are three decision models: decision model 1, decision model 2, and decision model 3. The decision result of decision model 1 in the t-th round is (A: 0.3, B: 0.4), the decision result of decision model 2 in the t-th round is (A: 0.3, B: 0.4), and the decision result of decision model 3 in the t-th round is (A: 0.9, B: 0.1), where each pair gives a candidate decision category and its weight as in the previous example. The decision results of decision model 1, decision model 2, and decision model 3 in the t-th round are respectively input into the preset deep learning model, and the output information of the t-th round is output through the preset deep learning model.
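The disclosure does not describe the architecture of the preset deep learning model. Purely as an illustration of the data flow, the sketch below flattens the three decision results into one feature vector and passes it through a single linear layer followed by a softmax. The layer, the parameter values in `WEIGHTS` and `BIAS`, and all function names are hypothetical stand-ins for a trained model.

```python
import math
from typing import Dict, List, Tuple

CATEGORIES = ["A", "B"]


def to_features(results: List[Dict[str, float]]) -> List[float]:
    """Flatten per-model {category: weight} results into one fixed-order vector."""
    return [r.get(c, 0.0) for r in results for c in CATEGORIES]


def softmax(zs: List[float]) -> List[float]:
    m = max(zs)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(results, weights, bias) -> Tuple[str, List[float]]:
    x = to_features(results)
    logits = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    probs = softmax(logits)
    best = max(range(len(CATEGORIES)), key=probs.__getitem__)
    return CATEGORIES[best], probs


# Hypothetical parameters standing in for a trained model; here the
# third decision model is trusted twice as much as the other two.
WEIGHTS = [[1.0, 0.0, 1.0, 0.0, 2.0, 0.0],   # logit for category A
           [0.0, 1.0, 0.0, 1.0, 0.0, 2.0]]   # logit for category B
BIAS = [0.0, 0.0]

results = [{"A": 0.3, "B": 0.4}, {"A": 0.3, "B": 0.4}, {"A": 0.9, "B": 0.1}]
category, probs = fuse(results, WEIGHTS, BIAS)
print(category, probs)
```

Unlike fixed voting rules, a learned model of this kind can weight the decision models differently, which is the role the hand-picked `WEIGHTS` play in this sketch.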

In a possible implementation manner, the output information of the t-th round may be used to determine the decision result of each decision model in the (t+1)-th round. For example, in the (t+1)-th round, the decision result of each of the multiple decision models in the (t+1)-th round is determined according to the user input information of the (t+1)-th round and the output information of the t-th round. For another example, in the (t+1)-th round, the decision result of each of the multiple decision models in the (t+1)-th round is determined according to the user input information of the (t+1)-th round, the output information of the t-th round, and the state information of the (t+1)-th round.

Fig. 2 shows a schematic diagram of an interaction method of an embodiment of the present disclosure. In the example shown in Fig. 2, the decision models include an LSTM model and a Matching model. In the t-th round, the LSTM model determines its decision result in the t-th round (Action_t) according to the user input information of the t-th round, the output information of the (t-1)-th round (Action'_(t-1)), the state information of the t-th round (utterance_t), and the hidden layer feature of the LSTM model in the (t-1)-th round; the Matching model determines its decision result in the t-th round (Action_t) according to the user input information of the t-th round, the output information of the (t-1)-th round (Action'_(t-1)), and the state information of the t-th round (utterance_t). The decision result of the LSTM model in the t-th round (Action_t) and the decision result of the Matching model in the t-th round (Action_t) are passed through the Decision module to obtain the output information of the t-th round (Action'_t). Similarly, in the (t+1)-th round, the LSTM model determines its decision result in the (t+1)-th round (Action_(t+1)) according to the user input information of the (t+1)-th round, the output information of the t-th round (Action'_t), the state information of the (t+1)-th round (utterance_(t+1)), and the hidden layer feature of the LSTM model in the t-th round; the Matching model determines its decision result in the (t+1)-th round (Action_(t+1)) according to the user input information of the (t+1)-th round, the output information of the t-th round (Action'_t), and the state information of the (t+1)-th round (utterance_(t+1)). The decision result of the LSTM model in the (t+1)-th round (Action_(t+1)) and the decision result of the Matching model in the (t+1)-th round (Action_(t+1)) are passed through the Decision module to obtain the output information of the (t+1)-th round (Action'_(t+1)).
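The round-by-round flow of Fig. 2 can be sketched as a loop in which the fused output of one round is fed to every model in the next round. The two model classes below are toy stand-ins (a keyword matcher and a single decaying hidden value), not an actual Matching or LSTM model, and the category names, class names, and weighted-sum Decision module are assumptions made for illustration.

```python
from typing import Dict, List, Optional

DecisionResult = Dict[str, float]  # candidate category -> weight


class MatchingModel:
    """Toy stand-in for a matching model: keyword lookup."""

    def decide(self, user_input: str, prev_output: Optional[str], state: dict) -> DecisionResult:
        if "weather" in user_input:
            return {"report_weather": 0.8, "fallback": 0.2}
        # A follow-up such as "and tomorrow?" is resolved using the previous output.
        if prev_output == "report_weather" and "tomorrow" in user_input:
            return {"report_weather": 0.7, "fallback": 0.3}
        return {"fallback": 1.0}


class RecurrentModel:
    """Toy stand-in for an LSTM model: keeps one hidden value between rounds."""

    def __init__(self) -> None:
        self.hidden = 0.0  # plays the role of the hidden layer feature of round t-1

    def decide(self, user_input: str, prev_output: Optional[str], state: dict) -> DecisionResult:
        on_topic = "weather" in user_input or prev_output == "report_weather"
        self.hidden = 0.5 * self.hidden + 0.5 * (1.0 if on_topic else 0.0)
        return {"report_weather": self.hidden, "fallback": 1.0 - self.hidden}


def decision_module(results: List[DecisionResult]) -> str:
    """Fuse the per-model results by summing weights per category."""
    totals: Dict[str, float] = {}
    for result in results:
        for category, weight in result.items():
            totals[category] = totals.get(category, 0.0) + weight
    return max(totals, key=totals.get)


def run_dialogue(inputs: List[str]) -> List[str]:
    models = [RecurrentModel(), MatchingModel()]
    prev_output: Optional[str] = None  # no output exists before round 1
    outputs = []
    for t, user_input in enumerate(inputs, start=1):
        state = {"round_number": t}
        results = [m.decide(user_input, prev_output, state) for m in models]
        prev_output = decision_module(results)  # Action'_t, fed into round t+1
        outputs.append(prev_output)
    return outputs


outputs = run_dialogue(["what is the weather today", "and tomorrow?", "thanks"])
print(outputs)
```

In this sketch the second input, "and tomorrow?", is only resolved as a weather request because both models see the fused output of the first round; given as the first input of a dialogue, the same utterance falls through to the fallback category.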

It should be noted that although the interaction method is described above with two decision models, those skilled in the art can understand that the present disclosure should not be limited to this. The interaction method provided by the present disclosure may involve more decision models.

The interaction method provided by the embodiments of the present disclosure provides an end-to-end user interaction solution, which can reduce maintenance difficulty.

It can be understood that, without violating principle and logic, the various method embodiments mentioned in the present disclosure can be combined with each other to form combined embodiments. Due to space limitations, these combinations are not described in detail in this disclosure.

Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order and does not constitute any limitation on the implementation process. The specific execution order of the steps should be determined by their functions and possible internal logic.

In addition, the present disclosure also provides interactive devices, speakers, electronic equipment, computer-readable storage media, and programs, all of which can be used to implement any interaction method provided in the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding records in the method section, which are not repeated here.

Fig. 3 shows a block diagram of an interactive device according to an embodiment of the present disclosure. As shown in Fig. 3, the interactive device includes: an acquiring module 31, configured to acquire the user input information of the t-th round, where t is an integer greater than or equal to 1; a first determining module 32, configured to determine the decision results of multiple decision models in the t-th round at least according to the user input information of the t-th round and the output information of the (t-1)-th round, where the output information of the (t-1)-th round is determined according to the decision results of the multiple decision models in the (t-1)-th round; and a second determining module 33, configured to determine the output information of the t-th round according to the decision results of the multiple decision models in the t-th round.

In a possible implementation manner, the first determining module 32 is configured to: determine the decision results of the multiple decision models in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, and the state information of the t-th round.

In a possible implementation, the multiple decision models include a recurrent neural network model; the first determining module 32 is configured to: determine the decision results of the multiple decision models in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, the state information of the t-th round, and the hidden layer feature of the recurrent neural network model in the (t-1)-th round.

In a possible implementation manner, the second determining module 33 is configured to: vote according to the decision results of the multiple decision models in the t-th round to obtain a voting result; and determine the output information of the t-th round according to the voting result.

In a possible implementation manner, the decision results of the multiple decision models in the t-th round include the decision category of each of the multiple decision models in the t-th round; the second determining module 33 is configured to: determine the decision category of each decision model in the t-th round as the category voted for by that decision model; and determine the category with the highest number of votes as the voting result.

In a possible implementation, the decision results of the multiple decision models in the t-th round include the candidate decision categories of each of the multiple decision models in the t-th round and the weights of the candidate decision categories; the second determining module 33 is configured to: determine the sum of the weights corresponding to each candidate decision category according to the weights of the candidate decision categories of each decision model in the t-th round; and determine the candidate decision category with the largest sum of weights as the voting result.

In a possible implementation manner, the second determining module 33 is configured to: input the decision result of each of the multiple decision models in the t-th round into the preset deep learning model, and output the output information of the t-th round through the preset deep learning model.

Fig. 4 shows a block diagram of a sound box according to an embodiment of the present disclosure. As shown in Fig. 4, the sound box includes: an acquiring module 41, configured to acquire the user input information of the t-th round, where t is an integer greater than or equal to 1; a first determining module 42, configured to determine the decision results of multiple decision models in the t-th round at least according to the user input information of the t-th round and the output information of the (t-1)-th round, where the output information of the (t-1)-th round is determined according to the decision results of the multiple decision models in the (t-1)-th round; and a second determining module 43, configured to determine the output information of the t-th round according to the decision results of the multiple decision models in the t-th round.

In a possible implementation, the first determining module 42 is configured to determine the decision results of the multiple decision models in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, and the state information of the t-th round.
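One way to picture the three inputs is as a single record handed to every decision model. In the sketch below, the field names, the types, and the `build_round_input` helper are hypothetical; the choice of entity information, time information, and round number as state fields follows the list given in the claims.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class RoundState:
    """State information of round t (field names are assumptions)."""
    round_number: int
    timestamp: datetime
    entities: Tuple[str, ...] = ()


@dataclass(frozen=True)
class RoundInput:
    """Everything a decision model receives in round t."""
    user_input: str
    previous_output: Optional[str]  # None in round 1 (assumption)
    state: RoundState


def build_round_input(user_input: str,
                      previous_output: Optional[str],
                      round_number: int,
                      timestamp: datetime,
                      entities: Iterable[str] = ()) -> RoundInput:
    if round_number < 1:
        raise ValueError("t must be an integer greater than or equal to 1")
    state = RoundState(round_number, timestamp, tuple(entities))
    return RoundInput(user_input, previous_output, state)


example = build_round_input(
    "play something by Example Artist",
    previous_output=None,
    round_number=1,
    timestamp=datetime(2020, 4, 24, 9, 30),
    entities=["Example Artist"],
)
print(example.state.round_number, example.state.entities)
```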

In a possible implementation, the multiple decision models include a recurrent neural network model; the first determining module 42 is configured to: determine the decision results of the multiple decision models in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, the state information of the t-th round, and the hidden layer features of the recurrent neural network model in the (t-1)-th round.
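Carrying the hidden layer features from round t-1 into round t can be illustrated with a plain Elman-style recurrence, h_t = tanh(W_x x_t + W_h h_{t-1} + b). The text does not specify the recurrent architecture, so the cell type, the dimensions, the weights, and the input vectors below are assumptions; x_t stands for some numeric encoding of the round-t inputs.

```python
import math
from typing import List, Sequence


def rnn_step(x: Sequence[float],
             h_prev: Sequence[float],
             w_x: Sequence[Sequence[float]],
             w_h: Sequence[Sequence[float]],
             b: Sequence[float]) -> List[float]:
    """One recurrent step: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    return [
        math.tanh(
            sum(wx * xi for wx, xi in zip(w_x[j], x))
            + sum(wh * hi for wh, hi in zip(w_h[j], h_prev))
            + b[j]
        )
        for j in range(len(b))
    ]


# Placeholder parameters for a 2-unit hidden layer (assumed values).
W_X = [[0.5, -0.2], [0.1, 0.3]]
W_H = [[0.4, 0.0], [0.0, 0.4]]
B = [0.0, 0.0]

h0 = [0.0, 0.0]   # hidden features before round 1
x1 = [1.0, 0.0]   # hypothetical encoding of the round-1 inputs
x2 = [0.0, 1.0]   # hypothetical encoding of the round-2 inputs

h1 = rnn_step(x1, h0, W_X, W_H, B)          # round 1
h2 = rnn_step(x2, h1, W_X, W_H, B)          # round 2 uses round-1 features
h2_no_history = rnn_step(x2, h0, W_X, W_H, B)
print(h2 != h2_no_history)
```

The same round-2 input yields different hidden features depending on whether the round-1 features are carried over, which is why the earlier hidden layer features are passed in as an additional input.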

In a possible implementation manner, the second determining module 43 is configured to: vote according to the decision results of multiple decision models in the t round to obtain the voting result; and determine the output information of the t round according to the voting result.

In a possible implementation, the decision results of the multiple decision models in the t-th round include the decision category of each of the multiple decision models in the t-th round; the second determining module 43 is configured to: determine the decision category of each decision model in the t-th round as the category voted for by that decision model; and determine the category with the highest number of votes as the voting result.

In a possible implementation, the decision results of the multiple decision models in the t-th round include the candidate decision category of each of the multiple decision models in the t-th round and the weight of the candidate decision category; the second determining module 43 is configured to: determine the sum of the weights corresponding to each candidate decision category according to the weights of the candidate decision categories of each decision model in the t-th round; and determine the candidate decision category with the largest sum of weights as the voting result.

In a possible implementation, the second determining module 43 is configured to: input the decision result of each of the multiple decision models in the t-th round into a preset deep learning model, and output the output information of the t-th round through the preset deep learning model.

In some embodiments, the functions or modules contained in the device or speaker provided in the embodiments of the present disclosure can be used to execute the methods described in the above method embodiments. For specific implementations, refer to the description of the above method embodiments; for brevity, the details are not repeated here.

The embodiment of the present disclosure also provides a computer-readable storage medium on which computer program instructions are stored, and the computer program instructions implement the above method when executed by a processor. The computer-readable storage medium may be a non-volatile computer-readable storage medium.

An embodiment of the present disclosure also provides an electronic device, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the above method.

The electronic device can be provided as a terminal, server or other form of device.

FIG. 5 shows a block diagram of an electronic device 800 according to an embodiment of the present disclosure. For example, the electronic device 800 may be a mobile phone, a computer, a digital broadcasting terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and other terminals.

Referring to FIG. 5, the electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.

The processing component 802 generally controls the overall operations of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the foregoing method. In addition, the processing component 802 may include one or more modules to facilitate the interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate the interaction between the multimedia component 808 and the processing component 802.

The memory 804 is configured to store various types of data to support operations in the electronic device 800. Examples of these data include instructions for any application or method operating on the electronic device 800, contact data, phone book data, messages, pictures, videos, etc. The memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.

The power supply component 806 provides power for various components of the electronic device 800. The power supply component 806 may include a power management system, one or more power supplies, and other components associated with the generation, management, and distribution of power for the electronic device 800.

The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touch, sliding, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure related to the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.

The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC). When the electronic device 800 is in an operation mode, such as a call mode, a recording mode, and a voice recognition mode, the microphone is configured to receive external audio signals. The received audio signal may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, the audio component 810 further includes a speaker for outputting audio signals.

The I/O interface 812 provides an interface between the processing component 802 and a peripheral interface module. The peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include but are not limited to: home button, volume button, start button, and lock button.

The sensor component 814 includes one or more sensors for providing the electronic device 800 with state evaluations of various aspects. For example, the sensor component 814 can detect the on/off status of the electronic device 800 and the relative positioning of components, for example, the display and the keypad of the electronic device 800. The sensor component 814 can also detect a change in position of the electronic device 800 or of a component of the electronic device 800, the presence or absence of contact between the user and the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a temperature change of the electronic device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects when there is no physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the electronic device 800 can be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, to perform the above methods.

In an exemplary embodiment, there is also provided a non-volatile computer-readable storage medium, such as a memory 804 including computer program instructions, which can be executed by the processor 820 of the electronic device 800 to complete the foregoing method.

FIG. 6 shows a block diagram of an electronic device 1900 according to an embodiment of the present disclosure. For example, the electronic device 1900 may be provided as a server. Referring to FIG. 6, the electronic device 1900 includes a processing component 1922, which further includes one or more processors, and a memory resource represented by the memory 1932, for storing instructions executable by the processing component 1922, such as application programs. The application program stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions to perform the above-described methods.

The electronic device 1900 may also include a power supply component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

In an exemplary embodiment, there is also provided a non-volatile computer-readable storage medium, such as the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the electronic device 1900 to complete the foregoing method.

The present disclosure may be a system, method, and/or computer program product. The computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling a processor to implement various aspects of the present disclosure.

The computer-readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punch card or raised structures in a groove with instructions stored thereon, and any suitable combination of the above. The computer-readable storage medium used here is not to be interpreted as a transient signal per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (for example, light pulses through fiber optic cables), or electrical signals transmitted through wires.

The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network, and forwards the computer-readable program instructions for storage in the computer-readable storage medium in each computing/processing device.

The computer program instructions used to perform the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), is customized by using the state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to realize various aspects of the present disclosure.

Herein, various aspects of the present disclosure are described with reference to flowcharts and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams and combinations of blocks in the flowcharts and/or block diagrams can be implemented by computer-readable program instructions.

These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device to produce a machine, such that when these instructions are executed by the processor of the computer or other programmable data processing device, a device that implements the functions/actions specified in one or more blocks in the flowchart and/or block diagram is produced. It is also possible to store these computer-readable program instructions in a computer-readable storage medium. These instructions make computers, programmable data processing apparatuses, and/or other devices work in a specific manner, so that the computer-readable medium storing the instructions includes an article of manufacture, which includes instructions for implementing various aspects of the functions/actions specified in one or more blocks in the flowchart and/or block diagram.

It is also possible to load computer-readable program instructions onto a computer, other programmable data processing device, or other equipment, so that a series of operation steps are executed on the computer, other programmable data processing device, or other equipment to produce a computer-implemented process, so that the instructions executed on the computer, other programmable data processing apparatus, or other equipment realize the functions/actions specified in one or more blocks in the flowcharts and/or block diagrams.

The flowcharts and block diagrams in the drawings show the possible implementation architecture, functions, and operations of the system, method, and computer program product according to multiple embodiments of the present disclosure. In this regard, each block in the flowchart or block diagram may represent a module, program segment, or part of an instruction, and the module, program segment, or part of an instruction contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions marked in the blocks may also occur in a different order from the order marked in the drawings. For example, two consecutive blocks can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagram and/or flowchart, and the combination of the blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or can be implemented by a combination of dedicated hardware and computer instructions.

The embodiments of the present disclosure have been described above. The above description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and changes will be obvious to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The choice of terms used herein is intended to best explain the principles of the embodiments, their practical applications, or technical improvements over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims

An interaction method, characterized in that it comprises:

obtaining user input information of the t-th round, where t is an integer greater than or equal to 1;

determining, at least according to the user input information of the t-th round and the output information of the (t-1)-th round, the decision results of multiple decision models in the t-th round, wherein the output information of the (t-1)-th round is determined according to the decision results of the multiple decision models in the (t-1)-th round; and

determining the output information of the t-th round according to the decision results of the multiple decision models in the t-th round.
The method according to claim 1, wherein the decision results of the multiple decision models in the t-th round are determined according to the user input information of the t-th round, the output information of the (t-1)-th round, and the state information of the t-th round.
The method according to claim 1, wherein the multiple decision models include a recurrent neural network model;

the determining of the decision results of multiple decision models in the t-th round at least according to the user input information of the t-th round and the output information of the (t-1)-th round includes:

determining the decision results of the multiple decision models in the t-th round according to the user input information of the t-th round, the output information of the (t-1)-th round, the state information of the t-th round, and the hidden layer features of the recurrent neural network model in the (t-1)-th round.
The method according to claim 2 or 3, wherein the state information of the t-th round includes at least one of the following: entity information related to the t-th round, time information to which the t-th round belongs, and round number information of the t-th round.
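The method according to claim 1, wherein the determining of the output information of the t-th round according to the decision results of the multiple decision models in the t-th round includes:

voting according to the decision results of the multiple decision models in the t-th round to obtain a voting result; and

determining the output information of the t-th round according to the voting result.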
The method according to claim 5, wherein the decision results of the multiple decision models in the t-th round include the decision category of each of the multiple decision models in the t-th round;

the voting according to the decision results of the multiple decision models in the t-th round to obtain the voting result includes:

determining the decision category of each decision model in the t-th round as the category voted for by that decision model; and

determining the category with the highest number of votes as the voting result.
The method according to claim 6, wherein the decision results of the multiple decision models in the t-th round include a candidate decision category of each of the multiple decision models in the t-th round and the weight of the candidate decision category;

the voting according to the decision results of the multiple decision models in the t-th round to obtain the voting result includes:

determining the sum of the weights corresponding to each candidate decision category according to the weights of the candidate decision categories of each decision model in the t-th round; and

determining the candidate decision category with the largest sum of weights as the voting result.
The method according to claim 1, wherein the decision result of each of the multiple decision models in the t-th round is input into a preset deep learning model, and the output information of the t-th round is output through the preset deep learning model.
The method according to claim 1, wherein the user input information is voice input by the user.
An interaction device, characterized in that it comprises:

an obtaining module, configured to obtain user input information of the t-th round, where t is an integer greater than or equal to 1;

a first determining module, configured to determine the decision results of multiple decision models in the t-th round according to the user input information of the t-th round and the output information of the (t-1)-th round, wherein the output information of the (t-1)-th round is determined according to the decision results of the multiple decision models in the (t-1)-th round; and

a second determining module, configured to determine the output information of the t-th round according to the decision results of the multiple decision models in the t-th round.
A speaker, characterized in that it comprises:

an obtaining module, configured to obtain user input information of the t-th round, where t is an integer greater than or equal to 1;

a first determining module, configured to determine the decision results of multiple decision models in the t-th round according to the user input information of the t-th round and the output information of the (t-1)-th round, wherein the output information of the (t-1)-th round is determined according to the decision results of the multiple decision models in the (t-1)-th round; and

a second determining module, configured to determine the output information of the t-th round according to the decision results of the multiple decision models in the t-th round.
An electronic device, characterized in that it comprises:

a processor; and

a memory for storing instructions executable by the processor;

wherein the processor is configured to call the instructions stored in the memory to execute the method according to any one of claims 1 to 9.
A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions implement the method according to any one of claims 1 to 9 when executed by a processor.