CN111105793B

CN111105793B - Voice interaction method and device based on interaction engine cluster

Info

Publication number: CN111105793B
Application number: CN201911221172.0A
Authority: CN
Inventors: 原利鹏; 张伟萌; 戴帅湘
Original assignee: Hangzhou Suddenly Cognitive Technology Co ltd
Current assignee: Hangzhou Suddenly Cognitive Technology Co ltd
Priority date: 2019-12-03
Filing date: 2019-12-03
Publication date: 2022-09-06
Anticipated expiration: 2039-12-03
Also published as: CN111105793A

Abstract

The invention discloses a method and a device for voice interaction based on an interaction engine cluster. The method comprises the following steps: acquiring an instruction of a user; judging the type of the instruction; and, if the instruction is a cluster task working mode change instruction, the voice assistant system modifying its working mode into a cluster task working mode according to that instruction. In the cluster task working mode, the voice assistant system processes the task instruction of the user by using the interaction engine cluster, and simultaneously provides the user with at least two task execution results provided by at least two interaction engines included in the interaction engine cluster. With this method, the user can switch the voice assistant system to the cluster task working mode as needed, and the voice assistant system can intelligently execute tasks in a manner that better meets the user's needs without burdening the user with redundant information.

Description

Voice interaction method and device based on interaction engine cluster

Technical Field

The embodiment of the invention relates to the technical field of information processing, in particular to a method and a device for voice interaction based on an interaction engine cluster.

Background

A voice assistant system allows a user to request execution of a specific task through a voice instruction. For example, when the user makes an air ticket booking request, the voice assistant system can determine that the user intends to book an air ticket and then complete slot filling for the air ticket booking task, for example, the departure place is Beijing, the destination is Shanghai, and the date is October 1, 2019.

Voice assistant systems in the prior art can provide a certain association between tasks, or between the interaction engines that execute the tasks. For example, an air ticket booking task and a weather query task are associated in advance, or the air ticket booking interaction engine and the weather query interaction engine are associated in advance.

One application scenario for the above association is as follows: when the user requests another task, the voice assistant system may fill the slots of that task with the slot filling content of a task associated with it. For example, if the user asks "How is the weather there?" after booking an air ticket, the voice assistant system fills the city slot and the date slot of the weather query task with the content of the destination slot and the date slot of the associated air ticket booking task, and does not ask the user again "Which city's weather would you like to query?" and/or "Which day's weather would you like to query?". This brings some convenience to the user in voice interaction. However, in this scenario, if the user changes the destination of the air ticket booking from Shanghai to another city, the existing voice assistant system cannot infer that the user also wants to know the weather at the new destination. The user has to actively ask "How is the weather there?" again in order to learn the weather at the new destination, so the user experience is poor.

Another application scenario for the above association is as follows: after a certain task is executed, other tasks associated with it are executed automatically. For example, after the air ticket booking task is completed, the voice assistant system can automatically execute the weather query task and provide its execution result to the user, for example, "The weather at your destination, Shanghai, will be showery; please bring rain gear." In this scenario, the execution results of the other tasks are provided regardless of whether the user needs them, and the additional information may trouble the user. For example, in some cases the user does not need weather information and has to filter the information provided by the voice assistant system in order to find the result of the air ticket booking.

In addition, in the prior art, the association relationship between tasks or interaction engines is preset and fixed, which obviously cannot meet the personalized requirements of users.

In summary, how to provide a flexible association relationship between tasks or interaction engines so that the voice assistant system can intelligently execute tasks in a manner that better meets the needs of users and does not bring information complexity to users becomes a problem to be solved urgently.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides a method and a device for voice interaction based on an interaction engine cluster.

The invention provides a method for carrying out voice interaction based on an interaction engine cluster, which comprises the following steps:

step 201, acquiring a user instruction;

step 201a, judging the type of the instruction, and if the instruction is a cluster task working mode change instruction, executing step 211;

step 211, the voice assistant system modifies the working mode of the voice assistant system into a cluster task working mode according to the cluster task working mode changing instruction;

the voice assistant system processes the task instruction of the user by using the interaction engine cluster in the cluster task working mode, and simultaneously provides at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to the user.
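The instruction-type check and mode switch in steps 201, 201a and 211 can be sketched as follows. This is only an illustrative sketch: the trigger phrase, the `VoiceAssistant` class, the default mode and the return strings are assumptions, since the text does not specify how a mode change instruction is recognized.

```python
from enum import Enum


class WorkMode(Enum):
    SINGLE_TASK = "single_task"
    CLUSTER_TASK = "cluster_task"


# Hypothetical trigger phrase; the source does not say how a cluster task
# working mode change instruction is recognized.
CLUSTER_MODE_PHRASES = {"switch to cluster mode"}


class VoiceAssistant:
    def __init__(self) -> None:
        # Assumption: the system starts in the single task working mode.
        self.mode = WorkMode.SINGLE_TASK

    def classify(self, instruction: str) -> str:
        """Step 201a: judge the type of the instruction."""
        if instruction.strip().lower() in CLUSTER_MODE_PHRASES:
            return "mode_change"
        return "task"

    def handle(self, instruction: str) -> str:
        """Step 201 (acquire), then 201a (classify) and, if needed, 211 (switch)."""
        if self.classify(instruction) == "mode_change":
            self.mode = WorkMode.CLUSTER_TASK  # step 211
            return f"mode set to {self.mode.value}"
        return f"task handled in {self.mode.value} mode"


assistant = VoiceAssistant()
first_reply = assistant.handle("Switch to cluster mode")
second_reply = assistant.handle("Book an air ticket to Shanghai for tomorrow")
```

After the mode change instruction, later task instructions are handled in the cluster task working mode until the mode is changed again.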

The invention provides a device for voice interaction based on an interaction engine cluster, which comprises:

a user instruction acquisition unit for acquiring an instruction of a user;

the instruction classification unit is used for judging the type of a user instruction, and triggering the working mode control unit if the instruction is a cluster task working mode change instruction;

the working mode control unit is used for modifying the working mode of the voice assistant system into a cluster task working mode according to the cluster task working mode change instruction;

the device uses an interaction engine cluster to process a task instruction of a user in the cluster task working mode, and provides at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to the user simultaneously.

The invention also provides a computer device characterized in that it comprises a processor and a memory, in which a computer program is stored that is executable on the processor, which computer program, when executed by the processor, implements the method as described above.

The invention also provides a computer-readable storage medium, in which a computer program that is executable on a processor is stored, which computer program, when executed, implements a method as described above.

The invention also provides a voice assistant system which is characterized by comprising the device.

The invention also provides a terminal, which is characterized by comprising the device or the voice assistant system.

By the method and the device, the user can modify the working mode of the voice assistant system into the cluster task working mode according to the requirement of the user, and the voice assistant system can intelligently execute tasks in a mode of meeting the requirement of the user and not bringing information redundancy to the user.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a voice assistant system in one embodiment of the invention.

FIG. 2 is a method for voice interaction based on a cluster of interaction engines, in one embodiment of the invention.

FIG. 3 is a method for performing speech interaction based on an interaction engine cluster in another embodiment of the present invention.

FIG. 4 is a method for performing speech interaction based on an interaction engine cluster in another embodiment of the present invention.

FIG. 5 is a method for aggregation to form an interaction engine cluster in one embodiment of the invention.

FIG. 6 is a method for aggregating to form an interaction engine cluster in another embodiment of the invention.

FIG. 7 is a method of building a cluster of interaction engines, in one embodiment of the invention.

FIG. 8 is a method of building an interaction engine cluster in another embodiment of the invention.

FIG. 9 is an apparatus for performing voice interaction based on a cluster of interaction engines in one embodiment of the present invention.

FIG. 10 is a block diagram of an apparatus for speech interaction based on an interaction engine cluster according to another embodiment of the present invention.

FIG. 11 is a block diagram of an apparatus for performing voice interaction based on an interaction engine cluster according to another embodiment of the present invention.

FIG. 12 is an apparatus for aggregating to form an interaction engine cluster in one embodiment of the invention.

FIG. 13 is an apparatus for building an interaction engine cluster in one embodiment of the invention.

FIG. 14 is an apparatus for building an interaction engine cluster in another embodiment of the invention.

FIG. 15 is a method for building or aggregating to form an interaction engine cluster and performing voice interaction based on the interaction engine cluster, in an embodiment of the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings. The embodiments and specific features of the embodiments of the present invention are detailed descriptions of technical solutions of the embodiments of the present invention, and are not limitations on technical solutions of the embodiments of the present invention, and the technical features of the embodiments and the embodiments of the present invention may be combined with each other without conflict.

1. Voice assistant system

FIG. 1 illustrates a block diagram of a voice assistant system that may be implemented on a stand-alone device or across multiple devices. In some embodiments, some of the modules, units or functions of the voice assistant system belong to a server, and the rest of the modules, units or functions belong to a terminal, and the terminal can communicate with the server through one or more networks. In some embodiments, some of the modules of the voice assistant system, such as the processing module 102 and the interaction engine 112, may belong to both the server and the terminal, and the remaining modules belong to one of the server or the terminal.

The voice assistant system mainly comprises: a human-computer interaction interface 101, a processing module 102, a database 103 and the like. The processing module includes n interaction engines 112, where n is a positive integer greater than or equal to 1, and each interaction engine 112 may include a semantic understanding module 201, a dialog management and control module 202, a dialog generation module 203, and a command execution module 204. The processing module 102 is connected to the human-machine interface 101, and can receive data input by a user through the human-machine interface and output interactive data to the user through the human-machine interface, for example, dialog data, task execution process and result fed back to the user.

In some embodiments, the interaction engine 112 may include an interaction main engine and/or at least one interaction sub-engine, where the interaction main engine is the default engine of the voice assistant system. An interaction sub-engine on the server may be owned by the voice assistant system, or may be generated by local training on the terminal and uploaded through a proprietary interface. An interaction sub-engine on the terminal may be owned by the voice assistant system, may be generated by local training on the terminal and loaded into the voice assistant system, or may be downloaded from the server and loaded into the voice assistant system by the terminal.

In some embodiments, each interaction engine (interaction main engine or interaction sub-engine) is capable of performing at least one task, that is, each interaction engine may be associated with at least one task, and the tasks that different interaction sub-engines can perform may be the same or different. The interaction engine defines at least one slot for each task associated with it. The terminal can download one or more interaction sub-engines from the server according to user requirements and load them into the voice assistant system of the terminal. For example, the task executable by the weather interaction sub-engine is weather query, and the slots defined for the weather query task are city and date; the user can download the weather interaction sub-engine and interact with it to query a city's weather forecast. Likewise, the task executable by the music interaction sub-engine is music playing, and the slots defined for the music playing task include singer name and song name; the user can download the music interaction sub-engine and interact with it to play music as required.
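The relationship between interaction engines, tasks and slots described above can be pictured with a small data model. The class and field names (`Task`, `InteractionEngine`, `add_task`) are hypothetical and chosen only for illustration; the task and slot names come from the weather and music examples.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Task:
    name: str
    slots: Tuple[str, ...]  # slot names the engine defines for this task


@dataclass
class InteractionEngine:
    name: str
    tasks: Dict[str, Task] = field(default_factory=dict)

    def add_task(self, task: Task) -> None:
        # The text requires at least one slot per associated task.
        if not task.slots:
            raise ValueError("each task needs at least one slot")
        self.tasks[task.name] = task


weather = InteractionEngine("weather interaction sub-engine")
weather.add_task(Task("weather query", ("city", "date")))

music = InteractionEngine("music interaction sub-engine")
music.add_task(Task("music playing", ("singer name", "song name")))
```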

The slots that an interaction engine defines for an associated task include at least the basic slots and may also include expanded slots. For example, the B1 interaction engine and the B2 interaction engine are both associated with the weather query task. The slots defined by the B1 interaction engine for the weather query task are city and date, and the slots defined by the B2 interaction engine for the weather query task are city, date and time, where city and date are basic slots and time is an expanded slot.
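As a rough illustration of the basic and expanded slot distinction, the sketch below splits the slots an engine defines for a task into the two groups. The `BASIC_SLOTS` registry and the `split_slots` helper are assumptions made for this example; the text only states which slots are basic for the weather query task.

```python
from typing import Dict, Tuple

# Assumed registry of basic slots per task; the source only states that
# city and date are the basic slots of the weather query task.
BASIC_SLOTS: Dict[str, Tuple[str, ...]] = {"weather query": ("city", "date")}


def split_slots(
    task: str, engine_slots: Tuple[str, ...]
) -> Tuple[Tuple[str, ...], Tuple[str, ...]]:
    """Return (basic, expanded) for the slots an engine defines for a task."""
    basic = BASIC_SLOTS[task]
    missing = [s for s in basic if s not in engine_slots]
    if missing:
        raise ValueError(f"engine must define the basic slots: {missing}")
    expanded = tuple(s for s in engine_slots if s not in basic)
    return basic, expanded


b1_slots = split_slots("weather query", ("city", "date"))
b2_slots = split_slots("weather query", ("city", "date", "time"))
```

Here the B1 engine ends up with no expanded slots, while the B2 engine has the single expanded slot `time`.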

The user can interact with the voice assistant system in a voice or text mode, the voice assistant system determines the user intention (namely determines the task) according to the instruction of the user, determines the key knowledge data corresponding to each slot position associated with the user intention, and fills the key knowledge data into the corresponding slot position. The voice assistant system then performs the task based on the populated slot or slots.

In some embodiments, the interaction main engine only determines, based on the user instruction, an interaction sub-engine capable of processing that instruction, and does not itself perform a specific task. Therefore, in the above process the user intention may be determined by the interaction main engine of the voice assistant system, which then selects one or more interaction sub-engines to process the user instruction based on the determined user intention; slot filling and task execution are performed by the one or more interaction sub-engines selected by the interaction main engine.
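The division of labor between the interaction main engine and the interaction sub-engines can be sketched as a simple router. The keyword matching below is a stand-in for real semantic understanding and is purely an assumption, as are the names `INTENT_KEYWORDS`, `SUB_ENGINES`, `determine_intent` and `route`.

```python
from typing import Dict, Optional, Tuple

# Hypothetical keyword table standing in for real intent recognition.
INTENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "air ticket booking": ("ticket", "flight"),
    "weather query": ("weather",),
}

# Which sub-engine handles which intent (A1 and B1 as in the travel example).
SUB_ENGINES: Dict[str, str] = {
    "air ticket booking": "A1",
    "weather query": "B1",
}


def determine_intent(instruction: str) -> Optional[str]:
    """Main engine, part 1: determine the user intention (the task)."""
    text = instruction.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None


def route(instruction: str) -> Optional[Tuple[str, str]]:
    """Main engine, part 2: select the sub-engine; it executes nothing itself."""
    intent = determine_intent(instruction)
    if intent is None:
        return None
    return intent, SUB_ENGINES[intent]


routed = route("Book an air ticket to Shanghai for tomorrow")
```

Slot filling and task execution would then happen inside the selected sub-engine, which is outside the scope of this sketch.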

2. Interaction engine cluster

An interaction engine cluster includes at least two interaction engines. In some embodiments, the at least two interaction engines are at least two interaction sub-engines. The task associated with any one interaction engine in the cluster is different from the tasks associated with the other interaction engines in the cluster, and has at least one same or corresponding slot with the task associated with at least one other interaction engine in the cluster. Alternatively, where only some of the tasks associated with an interaction engine belong to the cluster, the task of any one interaction engine that belongs to the cluster is different from the tasks of the other interaction engines that belong to the cluster, and has at least one same or corresponding slot with the tasks of the other interaction engines that belong to the cluster.

Same slots are slots that have the same slot name and should be filled with the same key knowledge data in a specific session context. Corresponding slots are slots that have different slot names but should be filled with the same key knowledge data in a specific session context.

An example of an interaction engine cluster is a travel interaction engine cluster, which includes an A1 interaction engine and a B1 interaction engine. The A1 interaction engine can execute an air ticket booking task, whose slots include departure place, destination and date. The destination is a required slot and the departure place is an optional slot. For an optional slot, a default value is used when the user instruction does not contain the corresponding key knowledge data; for the departure place slot, the default is the user's current city, for example Beijing. Depending on the settings of the A1 interaction engine, the date slot can be either a required slot or an optional slot. When it is a required slot and the user instruction does not contain the corresponding key knowledge data, the A1 interaction engine can return a clarifying question such as "Which day would you like to book the ticket for?". The B1 interaction engine can execute a weather query task, whose slots include city and date. Assuming that a user booking an air ticket is mainly concerned with the weather at the destination on the day of travel, the date slot of the A1 interaction engine and the date slot of the B1 interaction engine are the same slot, and the destination slot of the A1 interaction engine and the city slot of the B1 interaction engine are corresponding slots. In this example, the relationship between the same or corresponding slots of the different interaction engines is one-to-one.

However, this assumption should not be regarded as a limitation of the present invention. The user may also be concerned with the weather at both the departure place and the destination on the day of travel. In that case, the date slot of the A1 interaction engine and the date slot of the B1 interaction engine are the same slot, the destination slot of the A1 interaction engine and the city slot of the B1 interaction engine are corresponding slots, and the departure place slot of the A1 interaction engine and the city slot of the B1 interaction engine are also corresponding slots, so that the relationship between the corresponding slots is many-to-one.
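The two slot relationships in the travel example can be written down as lists of slot links and classified mechanically. This is a sketch under assumptions: the `Link` representation and the helpers `link_kind` and `cardinality` are hypothetical, and a link is taken to already imply that both slots receive the same key knowledge data, so only the names are compared.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (slot of the A1 air ticket booking task, slot of the B1 weather query task)
Link = Tuple[str, str]


def link_kind(link: Link) -> str:
    """Same name -> same slot; different names -> corresponding slots."""
    a1_slot, b1_slot = link
    return "same" if a1_slot == b1_slot else "corresponding"


def cardinality(links: List[Link]) -> Dict[str, str]:
    """For each B1 slot, report whether one or several A1 slots feed it."""
    counts = Counter(b1_slot for _, b1_slot in links)
    return {
        b1_slot: "many-to-one" if n > 1 else "one-to-one"
        for b1_slot, n in counts.items()
    }


# User only cares about the weather at the destination.
destination_only = [("date", "date"), ("destination", "city")]

# User cares about the weather at both the departure place and the destination.
both_cities = destination_only + [("departure place", "city")]

kinds = [link_kind(link) for link in both_cities]
```

In the second configuration the `city` slot of the weather query task is fed by two slots of the air ticket booking task, which is what makes the relationship many-to-one.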

Interaction engines in the interaction engine cluster may be in an inactive, active, suspended, etc. state.

The terminal can download the attribute information of the interaction engine cluster from the server, construct the interaction engine cluster locally according to the downloaded attribute information of the interaction engine cluster, and synchronize the constructed attribute information of the interaction engine cluster to other terminals of the user.

The terminal can also aggregate locally to generate an interaction engine cluster according to the historical conversation records of the user and the voice assistant system, synchronize the attribute information of the generated interaction engine cluster to other terminals of the user, and/or upload the attribute information to a server for downloading by other users.

The attribute information of the interaction engine cluster includes at least one of: the cluster name of the interaction engine cluster, the names of the at least two interaction engines included in the interaction engine cluster, and the same or corresponding slots of the tasks that the at least two interaction engines can execute (or are associated with). Table 1 shows the attribute information of one interaction engine cluster.

TABLE 1 Attribute information for interaction Engine Cluster

The attribute information of the interaction engine cluster may further include the tasks associated with the interaction engines. For example, the attribute information of the travel interaction engine cluster may also be as shown in Table 2.

TABLE 2 Attribute information for interaction Engine Cluster

When an interaction engine has a plurality of associated tasks and only some of them belong to a certain interaction engine cluster, the attribute information indicates which one or more of those tasks belong to the interaction engine cluster. For example, the tasks associated with the A1 interaction engine include air ticket booking and aviation news, of which air ticket booking belongs to the travel interaction engine cluster; the tasks associated with the B1 interaction engine include weather query and alarm setting, of which weather query belongs to the travel interaction engine cluster. In this case, the attribute information of the travel interaction engine cluster is also as shown in Table 2.

When the interaction engine cluster is used, the interaction engines included in the cluster execute only those of their associated tasks that belong to the cluster; tasks that do not belong to the cluster are not executed. For example, in the above example, when the travel interaction engine cluster is used, the air ticket booking task associated with the A1 interaction engine and the weather query task associated with the B1 interaction engine are executed, and the aviation news task associated with the A1 interaction engine and the alarm setting task associated with the B1 interaction engine are not executed.
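One possible in-memory form of the cluster attribute information, together with the rule that only tasks belonging to the cluster are executed, is sketched below. The `ClusterAttributes` structure and its field names are hypothetical and are not taken from Table 1 or Table 2; only the engine, task and slot names come from the travel example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

SlotRef = Tuple[str, str]  # (engine name, slot name)


@dataclass
class ClusterAttributes:
    cluster_name: str
    # engine name -> tasks of that engine that belong to the cluster
    cluster_tasks: Dict[str, List[str]]
    # pairs of slots that are the same or corresponding
    linked_slots: List[Tuple[SlotRef, SlotRef]]


# All tasks each engine is associated with, as in the example above.
ENGINE_TASKS: Dict[str, List[str]] = {
    "A1": ["air ticket booking", "aviation news"],
    "B1": ["weather query", "alarm setting"],
}

travel = ClusterAttributes(
    cluster_name="travel interaction engine cluster",
    cluster_tasks={"A1": ["air ticket booking"], "B1": ["weather query"]},
    linked_slots=[
        (("A1", "date"), ("B1", "date")),
        (("A1", "destination"), ("B1", "city")),
    ],
)


def tasks_to_execute(cluster: ClusterAttributes) -> Dict[str, List[str]]:
    """Per engine, keep only the associated tasks that belong to the cluster."""
    return {
        engine: [task for task in ENGINE_TASKS[engine] if task in allowed]
        for engine, allowed in cluster.cluster_tasks.items()
    }


executed = tasks_to_execute(travel)
```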

When the interaction engine cluster is used, if any one interaction engine in the cluster (hereinafter referred to as the first interaction engine) is activated, the voice assistant system activates the other interaction engines in the cluster (hereinafter referred to as the at least one second interaction engine). The first interaction engine is activated, for example, when it receives a user instruction, or when the interaction main engine determines the user intention based on the user instruction, selects an interaction sub-engine capable of processing that intention, and sends the user instruction to that interaction sub-engine. The voice assistant system then instantly synchronizes the first slot and/or the second slot of the at least one second task associated with the at least one second interaction engine according to the key knowledge data filled in the first slot of the first task associated with the first interaction engine, where the at least one second task has the first slot and/or a second slot corresponding to the first slot. All or some of the interaction engines in the cluster obtain task execution results, and these results are provided to the user simultaneously. The first task associated with the first interaction engine and the at least one second task associated with the at least one second interaction engine belong to the interaction engine cluster.
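The activation rule in this paragraph can be sketched with a small state table. The `Cluster` class and its `activate` method are hypothetical; the sketch also assumes that every other engine is activated regardless of its previous state, which the text does not spell out.

```python
from enum import Enum
from typing import Dict, List


class EngineState(Enum):
    INACTIVE = "inactive"
    ACTIVE = "active"
    SUSPENDED = "suspended"


class Cluster:
    def __init__(self, name: str, engines: List[str]) -> None:
        if len(engines) < 2:
            raise ValueError("a cluster needs at least two interaction engines")
        self.name = name
        self.states: Dict[str, EngineState] = {
            engine: EngineState.INACTIVE for engine in engines
        }

    def activate(self, first_engine: str) -> List[str]:
        """Activate the first engine, then all other engines in the cluster.

        Returns the names of the second engines activated as a consequence.
        """
        if first_engine not in self.states:
            raise KeyError(first_engine)
        self.states[first_engine] = EngineState.ACTIVE
        second_engines = [e for e in self.states if e != first_engine]
        for engine in second_engines:
            # Assumption: activated regardless of the previous state.
            self.states[engine] = EngineState.ACTIVE
        return second_engines


travel_cluster = Cluster("travel", ["A1", "B1"])
activated = travel_cluster.activate("A1")
```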

Specifically, instant synchronization means the following. At the user's request, for example based on a new task instruction of the user, when the first slot of the first task associated with the first interaction engine in the interaction engine cluster changes from unfilled to filled with first key knowledge data, the first key knowledge data is instantly filled into the first slot and/or the second slot of the second task associated with the second interaction engine. And/or, at the user's request, for example based on a cluster task modification instruction of the user, when the key knowledge data filled in the first slot of the first task associated with the first interaction engine in the interaction engine cluster changes from the first key knowledge data to second key knowledge data, the key knowledge data filled in the first slot and/or the second slot of the second task associated with the second interaction engine is instantly modified from the first key knowledge data to the second key knowledge data.

Taking the slot correspondence into account, instant synchronization is specifically as follows. When the first slot of the first task executed by the first interaction engine in the interaction engine cluster changes from unfilled to filled with first key knowledge data based on a new task instruction of the user, the first key knowledge data is instantly filled into the first slot and/or the second slot of the second task associated with the second interaction engine; when the correspondence between the corresponding slots of the first interaction engine and the second interaction engine is many-to-one, the filling is an additional filling. And/or, when the key knowledge data filled in the first slot of the first task associated with the first interaction engine in the interaction engine cluster changes from the first key knowledge data to second key knowledge data based on a cluster task modification instruction of the user, one of two cases applies. If the correspondence between the corresponding slots of the first interaction engine and the second interaction engine is one-to-one, the key knowledge data filled in the first slot and/or the second slot of the second task associated with the second interaction engine is instantly modified from the first key knowledge data to the second key knowledge data, which is a complete modification. If the correspondence is many-to-one, only the first key knowledge data within the key knowledge data filled in the first slot and/or the second slot of the second task associated with the second interaction engine is instantly modified to the second key knowledge data, which is a partial modification.

For example, consider the case where the correspondence between the corresponding slots of the first interaction engine and the second interaction engine is one-to-one: the date slot of the A1 interaction engine and the date slot of the B1 interaction engine are the same slot, and the destination slot of the A1 interaction engine and the city slot of the B1 interaction engine are corresponding slots. Assume that the user's new task instruction is "Book an air ticket to Shanghai for tomorrow". The voice assistant system determines that the first interaction engine is the A1 interaction engine and sends the instruction to it. The A1 interaction engine fills the slots of the air ticket booking task based on the instruction: the departure place slot is filled with Beijing, the user's current location; the destination slot is filled with Shanghai; and the date slot is filled with August 24. The instant synchronization process then fills the first key knowledge data Shanghai into the city slot of the weather query task associated with the B1 interaction engine, and fills the first key knowledge data August 24 into the date slot of that task. The task execution results of the A1 interaction engine and the B1 interaction engine are then provided to the user simultaneously.

Based on the task execution results, the user issues the cluster task modification instruction "Let's go to Chengdu instead". The A1 interaction engine then modifies the key knowledge data filled in the destination slot of the air ticket booking task from the first key knowledge data Shanghai to the second key knowledge data Chengdu, and the instant synchronization process modifies the key knowledge data filled in the city slot of the weather query task associated with the B1 interaction engine from the first key knowledge data Shanghai to the second key knowledge data Chengdu; this is a complete modification. Subsequently, the task execution results of the A1 interaction engine and the B1 interaction engine are again provided to the user simultaneously.

When the correspondence between the slots of the first interaction engine and the second interaction engine is many-to-one, the date slot of the A1 interaction engine and the date slot of the B1 interaction engine are the same slot, the destination slot of the A1 interaction engine corresponds to the city slot of the B1 interaction engine, and the departure slot of the A1 interaction engine also corresponds to the city slot of the B1 interaction engine. Suppose the user's new task instruction is "book an air ticket to Shanghai for tomorrow". The voice assistant system determines that the first interaction engine is the A1 interaction engine and sends the instruction to it. The A1 interaction engine fills the slots of the air ticket booking task based on the instruction: the departure slot is filled with Beijing, the user's current location, the destination slot with Shanghai, and the date slot with August 24. The instant synchronization process fills the first key knowledge data Beijing into the city slot of the weather query task associated with the B1 interaction engine, additionally fills the first key knowledge data Shanghai into the same city slot, so that the key knowledge data in the city slot is Beijing and Shanghai, and fills the first key knowledge data August 24 into the date slot of the weather query task associated with the B1 interaction engine. The task execution results of the A1 interaction engine and the B1 interaction engine are then provided to the user simultaneously.
Based on the task execution result, the user issues the cluster task modification instruction "or go to Chengdu instead". The A1 interaction engine then modifies the key knowledge data filled in the destination slot of the air ticket booking task from the first key knowledge data Shanghai to the second key knowledge data Chengdu. The instant synchronization process modifies the first key knowledge data Shanghai filled in the city slot of the weather query task associated with the B1 interaction engine to the second key knowledge data Chengdu; that is, a partial modification is used, and after the modification the key knowledge data filled in the city slot is Beijing and Chengdu. Subsequently, the task execution results of the A1 interaction engine and the B1 interaction engine are again provided to the user simultaneously.
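
The many-to-one synchronization and the partial modification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `SLOT_MAP`, `sync_fill`, and `sync_modify`, and the choice of storing each target slot as a list, are hypothetical.

```python
# Hypothetical slot correspondence for the many-to-one case: both the departure
# and destination slots of the ticket task map to the city slot of the weather task.
SLOT_MAP = {"departure": "city", "destination": "city", "date": "date"}


def sync_fill(source_slots, slot_map):
    """Copy each filled source slot into its corresponding target slot.

    Each target slot is a list, so several source slots can share it.
    """
    target = {}
    for src_name, value in source_slots.items():
        tgt_name = slot_map.get(src_name)
        if tgt_name is None:
            continue  # no corresponding slot in the second task
        values = target.setdefault(tgt_name, [])
        if value not in values:
            values.append(value)
    return target


def sync_modify(target_slots, tgt_name, old_value, new_value):
    """Replace only old_value in the target slot (partial modification).

    When the slot holds a single value, this equals a complete modification.
    """
    target_slots[tgt_name] = [
        new_value if v == old_value else v for v in target_slots[tgt_name]
    ]
    return target_slots


ticket = {"departure": "Beijing", "destination": "Shanghai", "date": "August 24"}
weather = sync_fill(ticket, SLOT_MAP)
# The user says "or go to Chengdu instead": only the destination changes.
ticket["destination"] = "Chengdu"
weather = sync_modify(weather, SLOT_MAP["destination"], "Shanghai", "Chengdu")
```

After the modification, the city slot of the weather task in this sketch holds Beijing and Chengdu, matching the partial modification described in the text.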

3. Cluster task mode of operation

The voice assistant system has two working modes: a single task working mode and a cluster task working mode. In the single task working mode, the voice assistant system determines a first task, such as air ticket booking, based on a user instruction, then determines a first interaction engine, such as the A1 interaction engine, and executes the first task. In the cluster task working mode, the interaction engine clusters included in the voice assistant system are all available; when the first interaction engine belongs to at least one interaction engine cluster, the instruction is processed based on that at least one cluster. That is, the voice assistant system executes not only the first task but also the other tasks belonging to the interaction engine cluster, for example the weather query task in Table 2.

Fig. 2 shows a method for voice interaction based on an interaction engine cluster, which is used for a terminal with a voice assistant system, and comprises the following steps:

step 201, acquiring a user instruction;

step 202, determining a first task based on the instruction, determining a first interaction engine based on the first task, and sending the instruction to the first interaction engine, so that the first interaction engine fills a slot of the first task based on the instruction; wherein the first interaction engine is associated with the first task;

step 203, judging whether the voice assistant system is in a cluster task working mode; if so, executing steps 205a, 205b, 205c and 206, and if not, executing step 204;

step 204, receiving a first task execution result from the first interaction engine, and providing the first task execution result to a user;

step 205a, determining an interaction engine cluster to which the first interaction engine belongs, wherein the interaction engine cluster comprises the first interaction engine and at least one second interaction engine;

step 205b, activating said at least one second interaction engine;

step 205c, immediately synchronizing a first slot position of at least one second task associated with at least one second interaction engine in the interaction engine cluster and/or a second slot position corresponding to the first slot position based on the key knowledge data filled by the first interaction engine into the first slot position of the first task;

step 206, receiving a first task execution result from the first interaction engine, receiving at least one second task execution result from the at least one second interaction engine, and providing the first task execution result and the at least one second task execution result to the user at the same time.
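
The steps of Fig. 2 can be sketched as a single dispatch function. This is an illustrative sketch only: the `Engine` class, the `handle_instruction` function, and the one-to-one `slot_map` are hypothetical, and step 202 (slot filling by the first interaction engine) is assumed to have already happened.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    """Stand-in for an interaction engine; run() returns a task execution result."""
    name: str
    task: str
    slots: dict = field(default_factory=dict)
    active: bool = False

    def run(self):
        return f"{self.task}: {self.slots}"


def handle_instruction(first, clusters, slot_map, cluster_mode):
    """Steps 203-206 for an engine whose slots were already filled in step 202."""
    first.active = True  # sending the instruction activates the first engine
    if cluster_mode:  # step 203
        # step 205a: find a cluster containing the first engine
        cluster = next((c for c in clusters if first in c), None)
        if cluster is not None:
            results = [first.run()]
            for second in cluster:
                if second is first:
                    continue
                second.active = True  # step 205b
                for src, tgt in slot_map.items():  # step 205c
                    if src in first.slots:
                        second.slots[tgt] = first.slots[src]
                results.append(second.run())
            return results  # step 206: all results provided together
    return [first.run()]  # step 204


a1 = Engine("A1", "air ticket booking", {"destination": "Shanghai", "date": "August 24"})
b1 = Engine("B1", "weather query")
results = handle_instruction(a1, [[a1, b1]], {"destination": "city", "date": "date"}, True)
```

With `cluster_mode=False`, or when the first engine is in no cluster, the function falls through to the single-result branch corresponding to step 204.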

The above-described method may be performed by a voice assistant system or may be performed by an apparatus included in a voice assistant system.

The voice assistant system of the terminal comprises at least one interaction engine cluster, and specifically, the voice assistant system stores attribute information of at least one interaction engine cluster. The voice assistant system may aggregate locally to generate the interaction engine cluster, as in methods 300 and/or 400, and may also download the attribute information of the interaction engine cluster from the server and build locally to form the interaction engine cluster, as in methods 500 and/or 600, as described below.

In step 202 the instruction is sent to the first interaction engine, i.e. the first interaction engine is activated.

In step 205b, the voice assistant system may activate the at least one second interaction engine by sending an activation message to the at least one second interaction engine.

A first task associated with a first interaction engine and at least one second task associated with at least one second interaction engine belong to the interaction engine cluster.

Preferably, if the first interaction engine does not belong to any interaction engine cluster in step 205a, step 204 is executed.

Preferably, in step 205a, the interaction engine cluster to which the first interaction engine belongs is determined by a search. For example, the name of the first interaction engine is searched for in the attribute information of the interaction engine clusters stored in the terminal or the voice assistant system. If the name is found, the corresponding interaction engine cluster is the interaction engine cluster to which the first interaction engine belongs, and step 205b is executed; if the name is not found, the first interaction engine does not belong to any interaction engine cluster, and step 204 is executed.

Preferably, in step 205a, the interaction engine cluster to which both the first interaction engine and the first task belong is determined. For example, if the user issues the instruction "book an air ticket to Shanghai for tomorrow", the first interaction engine is the A1 interaction engine and the first task is air ticket booking. Step 205a then determines the travel interaction engine cluster, to which the A1 interaction engine and the air ticket booking task belong, and does not select other interaction engine clusters, such as one to which the A1 interaction engine and an aviation news task belong.

Preferably, in step 205a, the interaction engine cluster to which the first interaction engine belongs is determined by finding the interaction engine cluster to which the first task belongs, for example, by searching for the first task in the attribute information of the interaction engine clusters stored in the terminal or the voice assistant system.
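
The search variants of step 205a described above can be sketched as a lookup over stored attribute information. The layout of `CLUSTER_ATTRIBUTES` is an assumption made for this example, as are the cluster name "aviation", the engine "C1", and the task "flight status"; the source does not specify how attribute information is stored.

```python
# Hypothetical attribute information: cluster name -> list of (engine, task) pairs.
CLUSTER_ATTRIBUTES = {
    "travel": [("A1", "air ticket booking"), ("B1", "weather query")],
    "aviation": [("A1", "aviation news"), ("C1", "flight status")],
}


def find_clusters(engine, task=None, attributes=CLUSTER_ATTRIBUTES):
    """Return names of clusters containing the engine (and the task, if given)."""
    matches = []
    for cluster_name, members in attributes.items():
        for member_engine, member_task in members:
            if member_engine == engine and task in (None, member_task):
                matches.append(cluster_name)
                break
    return matches


by_engine = find_clusters("A1")
by_engine_and_task = find_clusters("A1", "air ticket booking")
```

Searching by engine name alone returns every cluster that contains the engine, while adding the task narrows the result to the travel cluster, as in the air ticket example. An empty result corresponds to falling back to step 204.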

In some embodiments, one or more of the at least two interaction engines included in the interaction engine cluster are master interaction engines, and the remaining interaction engines are slave interaction engines. When the voice assistant system is in the cluster task working mode, if the first interaction engine is a master interaction engine of the interaction engine cluster to which it belongs, then when the first interaction engine is activated the other interaction engines in the cluster are also activated and the slots are instantly synchronized between the interaction engines; if the first interaction engine is not a master interaction engine of that cluster, only the first interaction engine is activated. For example, in an interaction engine cluster including the A1 interaction engine and the B1 interaction engine, the A1 interaction engine is the master interaction engine. When the user's instruction is "book an air ticket to Shanghai for tomorrow", the first interaction engine is the A1 interaction engine, which is the master interaction engine, so both the A1 and B1 interaction engines are activated and both the air ticket booking task and the weather query task are executed. When the user's instruction is "what is the weather like today", the first interaction engine is the B1 interaction engine, which is not a master interaction engine, so only the B1 interaction engine is activated; the A1 interaction engine is not activated and the air ticket booking task is not executed. Notably, the first interaction engine may be a master interaction engine in one interaction engine cluster and a slave interaction engine in another.

Step 205a further comprises: judging whether the first interaction engine is a master interaction engine of the interaction engine cluster; if so, executing step 205b and the subsequent steps, and otherwise executing step 204.
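
The master/slave check can be sketched as a small function that decides which engines are activated. The function name `engines_to_activate` and its parameters are hypothetical and serve only to illustrate the rule stated above.

```python
def engines_to_activate(first_engine, cluster_members, masters, cluster_mode):
    """Return the engines activated for an instruction routed to first_engine."""
    if cluster_mode and first_engine in masters:
        # Master engine: every engine in the cluster is activated.
        return list(cluster_members)
    # Slave engine, or single task working mode: only the first engine.
    return [first_engine]


travel_cluster = ["A1", "B1"]
travel_masters = {"A1"}
ticket_case = engines_to_activate("A1", travel_cluster, travel_masters, True)
weather_case = engines_to_activate("B1", travel_cluster, travel_masters, True)
```

In this sketch the ticket instruction activates both engines, while the weather instruction activates only B1, mirroring the example in the text.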

In some embodiments, after step 204 or step 206 is completed, the method returns to step 201 to continue receiving the user's instructions.

In some embodiments, the method shown in fig. 1 further comprises, before performing step 202, step 201a: judging the type of the instruction, and if the instruction is a new task instruction, executing step 202 and the subsequent steps.

In step 201a, the type of the instruction is judged, and if the instruction is a single task modification instruction, step 207 is executed: sending the single task modification instruction to the first interaction engine, so that the first interaction engine modifies the key knowledge data filled in the slot of the first task based on the instruction, obtaining a first task modification execution result from the first interaction engine, providing the first task modification execution result to the user, and returning to step 201.

In step 201a, the type of the instruction is determined, and if the instruction is a cluster task modification instruction, step 208 and step 209 are executed.

Step 208: determining a third interaction engine corresponding to the cluster task modification instruction, and sending the instruction to the third interaction engine, so that the third interaction engine modifies a third slot of a third task associated with the third interaction engine based on the instruction; performing instant synchronization on a third slot position of at least one fourth task associated with a fourth interaction engine in the interaction engine cluster and/or a fourth slot position corresponding to the third slot position based on the modified key knowledge data filled in the third slot position of the third task;

step 209, receiving the third task execution result from the third interaction engine, receiving at least one fourth task execution result from the at least one fourth interaction engine, providing the third task execution result and the at least one fourth task execution result to the user at the same time, and returning to step 201.

The terms first task, second task, third task, and fourth task merely label tasks and do not imply that the tasks are necessarily the same or different. Specifically, the third interaction engine may be the first interaction engine, and correspondingly the fourth interaction engine may be the at least one second interaction engine; in this case the first task is the same as the third task, and the second task is the same as the fourth task. The third interaction engine may also be a second interaction engine, and correspondingly the fourth interaction engine may be another second interaction engine or the first interaction engine; in this case the third task is the same as the second task, and the fourth task is the same as the first task or another second task.

The third task associated with the third interaction engine and the at least one fourth task associated with the at least one fourth interaction engine belong to the interaction engine cluster.

In some embodiments, the task execution result given by an interaction engine may be a final-state task execution result; in other embodiments, it may be a transient task execution result, in which case the execution state of the corresponding task is marked as incomplete. After receiving the user's task confirmation instruction, the interaction engine may convert the transient task execution result into a final-state task execution result, for example by communicating with a corresponding server. In steps 204 and 206, the task execution result given by at least one interaction engine included in the interaction engine cluster is a transient task execution result; in step 204 the at least one interaction engine is the first interaction engine, and in step 206 it may be the first interaction engine and/or a second interaction engine. As mentioned above, after step 204 or step 206 the method returns to step 201 and continues to receive the user's instructions. In step 201a the type of the instruction is judged, and if the instruction is a task confirmation instruction, step 210 is executed: sending the task confirmation instruction to the at least one interaction engine to trigger it to convert the transient task execution result into a final-state task execution result, receiving the final-state task execution result from the at least one interaction engine, providing it to the user, and returning to step 201.
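
The transition from a transient to a final-state task execution result in step 210 can be sketched as follows. The `TaskResult` class, the `confirm` function, and the `complete_on_server` callable are hypothetical; the callable stands in for whatever communication the interaction engine has with its server, which the source does not detail.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    text: str
    state: str = "transient"  # the task is marked incomplete until confirmed


def confirm(result, complete_on_server):
    """Convert a transient result into a final-state one (step 210).

    complete_on_server is a hypothetical callable standing in for the
    engine's communication with its server.
    """
    if result.state == "transient":
        result.text = complete_on_server(result.text)
        result.state = "final"
    return result


pending = TaskResult("Flight CA232 from Beijing to Shanghai, 500 yuan")
done = confirm(pending, lambda text: text + " (booked)")
```

A result that is already in the final state is returned unchanged, so a repeated confirmation instruction does not trigger a second server call in this sketch.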

For example, suppose the user's first instruction is "book an air ticket to Shanghai for tomorrow". In step 201a it is determined to be a new task instruction, so step 202 is executed. In step 202, the voice assistant system determines that the first interaction engine is the A1 interaction engine and sends the instruction to it. The A1 interaction engine fills the slots of the air ticket booking task based on the instruction: the departure slot is filled with Beijing, the user's current location, the destination slot with Shanghai, and the date slot with August 24. If the voice assistant system is in the cluster task working mode, or is in the cluster task working mode and the first interaction engine is a master interaction engine of the travel interaction engine cluster to which it belongs, then in step 205b the B1 interaction engine in the travel interaction engine cluster is also activated. In this example, assuming the one-to-one slot correspondence described above, the voice assistant system fills the first key knowledge data Shanghai into the city slot of the weather query task associated with the B1 interaction engine, and fills the first key knowledge data August 24 into the date slot of that task.
The method then proceeds to step 206, and the task execution results provided to the user are the first task execution result of the A1 interaction engine, "Found flight CA232 from Beijing to Shanghai tomorrow, priced at 500 yuan. Would you like to book it?", and the second task execution result of the B1 interaction engine, "cloudy, temperature 28-34 degrees". Here the task execution result of the A1 interaction engine is a transient task execution result. The method returns to step 201 and continues to receive user instructions. Three different user instructions received after returning to step 201 are described below as examples:

First case: if the instruction received after returning to step 201 is "confirm booking", the A1 interaction engine communicates with a server, such as an Air China server, to complete the booking of the air ticket, and converts the transient task execution result into a final-state task execution result.

Second case: if the instruction received after returning to step 201 is "or go to Chengdu instead", then in step 201a it is determined to be a cluster task modification instruction, and step 208 is executed. The third interaction engine corresponding to the instruction is the A1 interaction engine; that is, the third interaction engine is the first interaction engine and the third task is the same as the first task. The third slot (the destination slot) of the third task (air ticket booking) is modified based on the instruction, changing its key knowledge data from Shanghai to Chengdu. The fourth slot (the city slot) of the at least one fourth task (weather query) associated with the at least one fourth interaction engine (the B1 interaction engine) in the interaction engine cluster is instantly synchronized based on the modified key knowledge data (Chengdu) filled in the third slot of the third task; that is, the key knowledge data of the city slot is instantly synchronized to Chengdu. In step 209, the third interaction engine and the at least one fourth interaction engine each execute their associated tasks and return task execution results to the voice assistant system, which provides the third task execution result, "Found flight CA233 from Beijing to Chengdu tomorrow, priced at 400 yuan. Would you like to book it?", and the fourth task execution result, "Chengdu tomorrow, temperature 22-26 degrees", to the user simultaneously.

Third case: if the instruction received after returning to step 201 is "what is the weather like in Chengdu", then in step 201a it is determined to be a cluster task modification instruction, and step 208 is executed. The third interaction engine corresponding to the instruction is the B1 interaction engine; that is, the third interaction engine is a second interaction engine and the third task is the same as the second task. The third slot (the city slot) of the third task (weather query) is modified based on the instruction, changing its key knowledge data from Shanghai to Chengdu. The fourth slot (the destination slot) of the at least one fourth task (air ticket booking) associated with the at least one fourth interaction engine (the A1 interaction engine) in the interaction engine cluster is instantly synchronized based on the modified key knowledge data (Chengdu) filled in the third slot of the third task; that is, the key knowledge data of the destination slot is instantly synchronized to Chengdu. In step 209, the third interaction engine and the at least one fourth interaction engine each execute their associated tasks and return task execution results to the voice assistant system, which provides the third task execution result, "Chengdu tomorrow, temperature 22-26 degrees", and the fourth task execution result, "Found flight CA233 from Beijing to Chengdu tomorrow, priced at 400 yuan. Would you like to book it?", to the user simultaneously.

Similarly, suppose the user's first instruction is "book an air ticket to Shanghai for tomorrow". In step 201a it is determined to be a new task instruction and step 202 is executed; in step 202 the first interaction engine is determined to be the A1 interaction engine, and the subsequent steps continue. If the voice assistant system is not in the cluster task working mode and/or the A1 interaction engine is not a master interaction engine, the method proceeds to step 204, and the task execution result provided to the user is the first task execution result of the A1 interaction engine, "Found flight CA232 from Beijing to Shanghai tomorrow, priced at 500 yuan. Would you like to book it?". The method then also returns to step 201 and continues to receive user instructions. The method is described below using the example of two different user instructions received after returning to step 201:

Second case: if the instruction received after returning to step 201 is "or go to Chengdu instead", then in step 201a it is determined to be a single task modification instruction, and step 207 is executed: the first interaction engine (the A1 interaction engine) modifies the slot (destination) of the first task (air ticket booking) based on the instruction, executes the first task, provides the task execution result "Found flight CA233 from Beijing to Chengdu tomorrow, priced at 400 yuan. Would you like to book it?" to the user, and returns to step 201.

If the user instruction is incomplete, so that the first task and/or second task associated with the first and/or second interaction engine in the interaction engine cluster cannot be executed directly (for example, the user did not specify the date of the ticket when giving the instruction), the method further includes, after step 205c, step 205d: judging whether a cluster task clarification question is received from the first interaction engine; if so, executing step 205e, otherwise executing step 206;

step 205e: providing the cluster task clarification question to the user, and returning to step 201;

in step 201a, if the instruction is a response to the cluster task clarification question, executing step 205f;

step 205f: sending the instruction to the first interaction engine, so that the first interaction engine fills the slot of the first task according to the response to the clarification question, and continuing with step 205c.

Similarly, when the user instruction is incomplete, so that the first task associated with the first interaction engine cannot be executed, and it is determined in step 203 that the voice assistant system is not in the cluster task working mode, step 204a is executed: judging whether a single task clarification question is received from the first interaction engine; if so, executing step 204b, otherwise executing step 204;

step 204b: providing the single task clarification question to the user, and returning to step 201;

in step 201a, if the instruction is a response to the single task clarification question, executing step 204c;

step 204c: sending the instruction to the first interaction engine, so that the first interaction engine fills the slot of the first task according to the response to the clarification question, and continuing with step 204a.
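
The clarification steps (205d to 205f and 204a to 204c) depend on the first interaction engine noticing that a slot is still unfilled. The source does not say how the engine decides this; the sketch below assumes a fixed list of required slots, and the names `REQUIRED_SLOTS` and `next_clarification` are hypothetical.

```python
# Assumed required slots for the air ticket booking task.
REQUIRED_SLOTS = ("departure", "destination", "date")


def next_clarification(slots, required=REQUIRED_SLOTS):
    """Return a clarification question for the first unfilled slot, or None."""
    for name in required:
        if not slots.get(name):
            return f"Please provide the {name}."
    return None


slots = {"departure": "Beijing", "destination": "Shanghai"}
question = next_clarification(slots)  # the date was not specified
slots["date"] = "August 24"  # the user's response fills the slot
remaining = next_clarification(slots)
```

When no question remains, the method would continue to step 206 in the cluster task working mode or to step 204 in the single task working mode.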

If, in step 205a, the first interaction engine does not belong to any interaction engine cluster, or it is determined that the first interaction engine is not a master interaction engine of the interaction engine cluster, step 204a is executed: judging whether a single task clarification question is received from the first interaction engine; if so, executing step 204b, otherwise executing step 204.

In step 205b, when the first interaction engine, or the first interaction engine together with the first task, belongs to multiple interaction engine clusters, the instruction is processed using the multiple interaction engine clusters, that is, by each of the interaction engine clusters in turn.

Alternatively, in step 205b, when the first interaction engine, or the first interaction engine together with the first task, belongs to multiple interaction engine clusters, one of the interaction engine clusters is selected according to a third preset rule, or all of the multiple interaction engine clusters are used.

In certain embodiments, the third preset rule is at least one of: the most frequently used interaction engine cluster, the most recently used interaction engine cluster, the interaction engine cluster with the highest user score, and so on. The third preset rule may be the same as or different from the second preset rule mentioned below.
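
Selecting one cluster under the third preset rule can be sketched as a maximum over a per-cluster statistic. The `ClusterStats` fields, the rule names in `RULES`, and the sample numbers are assumptions for illustration; the source lists the rules but not how usage or scores are recorded.

```python
from dataclasses import dataclass


@dataclass
class ClusterStats:
    name: str
    use_count: int     # how often the cluster has been used
    last_used: float   # timestamp of the most recent use
    user_score: float  # score given by the user


# Hypothetical encodings of the three example rules.
RULES = {
    "most_frequent": lambda c: c.use_count,
    "most_recent": lambda c: c.last_used,
    "highest_score": lambda c: c.user_score,
}


def select_cluster(candidates, rule):
    """Pick the candidate cluster that ranks highest under the given rule."""
    return max(candidates, key=RULES[rule])


candidates = [
    ClusterStats("travel", use_count=12, last_used=100.0, user_score=4.1),
    ClusterStats("aviation", use_count=3, last_used=250.0, user_score=4.8),
]
chosen = select_cluster(candidates, "most_frequent")
```

Different rules can select different clusters from the same candidates, which is why the rule itself is a configurable preset.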

FIG. 3 shows a method for voice interaction based on an interaction engine cluster, which is used for a terminal with a voice assistant system, and comprises the following steps:

step 201, acquiring a user instruction;

The task instruction of the user is processed by using the interaction engine cluster. Specifically, when a first interaction engine is activated based on the task instruction, at least one second interaction engine in the interaction engine cluster to which the first interaction engine belongs is activated, and at least one first slot of at least one second task associated with the at least one second interaction engine, and/or a second slot corresponding to the first slot, is instantly synchronized based on the key knowledge data filled in the at least one first slot of the first task associated with the first interaction engine.

At least two task execution results provided by at least two interaction engines included in the interaction engine cluster are provided to the user simultaneously. Specifically, the first task execution result is received from the first interaction engine, at least one second task execution result is received from the at least one second interaction engine, and the first task execution result and the at least one second task execution result are provided to the user at the same time.

By the method, the user can modify the working mode of the voice assistant system into the cluster task working mode according to the requirement of the user, so that the execution of the associated task is more in line with the requirement of the user, and the redundant information is prevented from being sent to the user.

Preferably, after step 211, return to step 201.

Preferably, after step 211, the method further comprises:

step 212, judging whether the previous instruction of the work mode change instruction is a new task instruction, if so, executing step 213 a;

step 213 a: processing the last instruction of the user by using an interaction engine cluster, and simultaneously providing at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to the user.

Step 213a specifically includes the following steps:

step 215, determining a first task based on the new task instruction, and determining a first interaction engine based on the first task, wherein the first interaction engine is associated with the first task; sending the new task instruction to the first interaction engine, so that the first interaction engine fills a slot position of a first task based on the new task instruction;

step 205b, activating said at least one second interaction engine;

step 205c, immediately synchronizing a first slot position of at least one second task associated with at least one second interaction engine in the interaction engine cluster and/or a second slot position corresponding to the first slot position based on key knowledge data filled by the first interaction engine to the first slot position of the first task;

step 206, receiving the first task execution result from the first interaction engine, receiving at least one second task execution result from the at least one second interaction engine, and providing the first task execution result and the at least one second task execution result to the user at the same time.

The preferred embodiments described above in relation to the method shown in fig. 2 are equally applicable to the method shown in fig. 3 and are not described again here.

The above method is described below with a specific example. After issuing the instruction "book an air ticket to Shanghai for tomorrow", the user gives the instruction "switch the working mode to the cluster task working mode". In step 211, the voice assistant system changes its working mode to the cluster task working mode, obtains the previous instruction "book an air ticket to Shanghai for tomorrow", determines that it is a new task instruction, and then executes step 215. In step 215, the first interaction engine is determined to be the A1 interaction engine and the previous instruction is sent to it; steps 205a to 206 are then performed, so that the first task execution result and the at least one second task execution result are provided to the user simultaneously.

By the method, the user can modify the working mode of the voice assistant system at any time, and can automatically execute the last new task instruction according to the cluster voice interaction mode after modifying the working mode of the voice assistant system, so that the new task instruction does not need to be sent again under the condition that the user sends the new task instruction and then modifies the working mode, and the voice conversation experience of the user is improved.

Preferably, when step 213a is executed, step 214 is executed: judging whether the voice assistant system has already provided a task execution result for the new task instruction to the user; if so, executing step 205a and the subsequent steps; if not, executing step 215 and the subsequent steps. That is, if a task execution result has already been provided for the new task instruction, step 215 may be skipped: the first task and the first interaction engine need not be determined again, the new task instruction need not be sent to the first interaction engine again, and the first interaction engine need not fill the slots of the first task again. Note that even if a task execution result for the new task instruction has already been provided, in step 206 the voice assistant system still obtains the first task execution result from the first interaction engine again and provides it to the user together with the second task execution result. This accounts for the case where the user, having given the new task instruction and received a result that does not meet expectations, immediately changes the working mode without carefully listening to or viewing the first task execution result. Providing the previously given first task execution result together with the second task execution result after the mode change better meets the user's needs, spares the user from having to go back and find the first task execution result given before the mode change, and gives a smoother experience.

Preferably, when it is determined in step 212 that the instruction preceding the working mode change instruction is a new task instruction, step 213 is executed: judging whether the receiving time of the new task instruction meets a preset condition, and if so, executing step 213a.

Preferably, if the receiving time of the new task instruction is determined not to meet the preset condition, the method returns to step 201.

The preset condition is that the difference between the receiving time of the new task instruction and the receiving time of the cluster task working mode change instruction is smaller than a preset value.

The above method avoids unnecessarily executing the previous new task instruction in the cluster voice interaction mode. For example, the user issues a new task instruction at ten in the morning and obtains a task execution result, and then issues a working mode change instruction at three in the afternoon. In this case the user intends only to change the working mode of the voice assistant system, and the new task instruction from ten in the morning need not be executed in the cluster voice interaction mode. The above method identifies this situation accurately.
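
The preset condition of step 213 amounts to a time-window check. In the sketch below, the five-minute value of `PRESET_WINDOW` and the function name `should_rerun` are assumptions; the source only states that the difference must be smaller than a preset value.

```python
from datetime import datetime, timedelta

# Assumed threshold; the source only says "a preset value".
PRESET_WINDOW = timedelta(minutes=5)


def should_rerun(new_task_time, mode_change_time, window=PRESET_WINDOW):
    """Step 213: rerun the previous new task instruction only if it is recent."""
    return timedelta(0) <= mode_change_time - new_task_time < window


morning = datetime(2019, 8, 23, 10, 0)
afternoon = datetime(2019, 8, 23, 15, 0)
stale = should_rerun(morning, afternoon)  # five hours apart
fresh = should_rerun(morning, morning + timedelta(minutes=1))
```

In the morning and afternoon example from the text the check fails, so only the working mode is changed and the method returns to step 201.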

In step 201a, the voice assistant system may determine whether the instruction is a cluster task working mode change instruction based on keywords included in the user instruction.

Step 211 further includes: sending a synchronization message to other terminals of the user, wherein the synchronization message is used for instructing the voice assistant system on the other terminals of the user to modify its working mode into the cluster task working mode.

4. Cluster name

FIG. 4 shows a method for voice interaction based on an interaction engine cluster, comprising the following steps:

step 201, acquiring a user instruction;

step 201a, judging the type of the instruction, and if the instruction comprises an interaction engine cluster enabling instruction, executing step 216; the interaction engine cluster enabling instruction carries a cluster name;

step 216, enabling the interaction engine cluster corresponding to the cluster name by the voice assistant system;

step 220, the voice assistant system processes the new task instruction based on the interaction engine cluster corresponding to the cluster name carried in the interaction engine cluster enabling instruction according to the new task instruction carried in the instruction;

step 206, simultaneously providing at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to a user.

After step 216 and before step 220, the method further includes step 217: judging whether the instruction includes a new task instruction; if so, executing step 220; otherwise, returning to step 201.

In step 217, if the instruction includes a new task instruction, further performing step 218, determining a first task based on the new task instruction, determining a first interaction engine associated with the first task based on the first task, and determining whether the first interaction engine belongs to the interaction engine cluster, if so, performing step 220, otherwise, performing step 221;

step 221 is: prompting the user that the new task instruction is not matched with the cluster name, and returning to step 201.

In step 217, if the instruction includes a new task instruction, further performing step 218, determining a first task based on the new task instruction, determining a first interaction engine associated with the first task based on the first task, determining whether the first interaction engine belongs to the interaction engine cluster, if so, performing step 219, otherwise, performing step 221;

step 219: judging whether the first interaction engine is a dominant interaction engine of the interaction engine cluster, if so, executing step 220, otherwise, executing step 222;

step 222 is: prompting the user that the interaction engine associated with the new task instruction is a non-dominant interaction engine, and returning to step 201.
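The branching in steps 217 to 222 can be sketched as follows. This is only an illustration under stated assumptions: the `Cluster` class, the `next_step` function, and the `check_dominant` flag (which switches between the embodiment without step 219 and the one with it) are hypothetical names that do not appear in the description.

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    name: str
    engines: set                                 # names of member interaction engines
    dominant: set = field(default_factory=set)   # names of dominant interaction engines

def next_step(cluster, first_engine, check_dominant=True):
    """Return the step that follows step 217.

    first_engine is the engine associated with the first task, or None
    when the instruction carries no new task instruction."""
    if first_engine is None:
        return 201          # step 217: nothing to process, wait for the next instruction
    if first_engine not in cluster.engines:
        return 221          # step 218 failed: task does not match the cluster name
    if check_dominant and first_engine not in cluster.dominant:
        return 222          # step 219 failed: engine is non-dominant
    return 220              # process the new task instruction with the cluster
```

Steps 221 and 222 both return to step 201 after prompting the user, so a caller would loop back to instruction acquisition in those cases.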

Preferably, step 220 includes the following steps.

Step 220a, sending a newly-built task instruction carried in the instruction to a first interaction engine in the interaction engine cluster, so that the first interaction engine fills a slot of a first task based on the instruction; the first interaction engine is associated with a first task, and the first task is determined based on the new task instruction;

step 205b, activating at least one second interaction engine of said interaction engine cluster; the at least one second interaction engine is another interaction engine in the interaction engine cluster than the first interaction engine;

step 205c, immediately synchronizing the first slot position of the at least one second task associated with the at least one second interaction engine in the interaction engine cluster and/or the second slot position corresponding to the first slot position based on the key knowledge data filled by the first interaction engine to the first slot position of the first task.

Step 206 specifically comprises: receiving the first task execution result from the first interaction engine, receiving at least one second task execution result from the at least one second interaction engine, and simultaneously providing the first task execution result and the at least one second task execution result to a user.
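The slot synchronization of step 205c can be sketched in Python. The function `synchronize_slots`, the dictionary representation of slots, and the `correspondence` mapping are assumptions for illustration; the description does not prescribe how same or corresponding slots are stored.

```python
def synchronize_slots(filled, second_task_slots, correspondence):
    """Copy key knowledge data from the first task to a second task.

    filled: slot name -> key knowledge data filled by the first interaction engine.
    second_task_slots: set of slot names defined for the second task.
    correspondence: first-task slot name -> corresponding second-task slot name."""
    synced = {}
    for slot, value in filled.items():
        if slot in second_task_slots:
            synced[slot] = value                      # same slot in both tasks
        elif correspondence.get(slot) in second_task_slots:
            synced[correspondence[slot]] = value      # corresponding slot
    return synced

# Example: air ticket reservation (first task) feeding a weather query (second task).
ticket = {"departure": "Beijing", "destination": "Shanghai", "date": "August 25"}
weather = synchronize_slots(ticket, {"city", "date"}, {"destination": "city"})
```

Slots of the first task with no same or corresponding slot in the second task, such as the departure place here, are simply not propagated.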

Preferably, step 220 includes the steps of:

step 220a (as mentioned above, it is not described herein again);

step 205b (as mentioned above, the details are not repeated herein);

step 205c (as mentioned above, the details are not repeated herein);

step 205d: determining whether a clarification challenge is received from the first interaction engine, if so, performing step 205e, otherwise, performing step 206;

step 205f: sending the instruction given in response to the clarification challenge to the first interaction engine, so that the first interaction engine fills the slot of the first task according to that instruction; after step 205f, executing step 205c.

Step 206 (as mentioned above, it is not described herein again).

In some embodiments, in step 218, it is determined whether the first interaction engine and the first task belong to the interaction engine cluster.

In step 201a, if the user instruction is a new task instruction, step 202a is executed: judging whether the instruction preceding the new task instruction is an interaction engine cluster enabling instruction; if so, executing step 220 and its subsequent steps, or executing step 218 and its subsequent steps; otherwise, executing step 202. It should be noted that "the user instruction is a new task instruction" means that the user instruction includes only the new task instruction and does not include an interaction engine cluster enabling instruction. Conversely, if the user instruction includes an interaction engine cluster enabling instruction, then even if it also includes a new task instruction, it is not classified as a new task instruction for step 202a; instead, step 216 is executed according to the interaction engine cluster enabling instruction. That is, the interaction engine cluster enabling instruction has a higher judgment priority than the new task instruction.
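The judgment priority described above amounts to checking for the enabling instruction first. A minimal sketch, assuming that earlier processing has already detected which instruction types are present (the function name and the boolean inputs are hypothetical):

```python
def classify_instruction(has_cluster_enable: bool, has_new_task: bool) -> str:
    """Return the branch taken in step 201a for the given instruction contents."""
    if has_cluster_enable:
        # Checked first: wins even when a new task instruction is also present.
        return "step 216"
    if has_new_task:
        return "step 202a"
    return "other"
```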

The above describes two modes of using an interaction engine cluster: the cluster task working mode and the cluster name. In the former, all interaction engine clusters in the voice assistant system of the terminal are available, and which interaction engine cluster or clusters are actually used depends on the task instruction given by the user. In the latter, only the interaction engine cluster corresponding to the cluster name can be used. These two modes provide users with a more convenient and richer experience of using interaction engine clusters.

5. Local aggregation to form an interaction engine cluster

Referring to FIG. 5, a method 300 for aggregating interaction engine clusters in one embodiment of the present invention is shown for a terminal with a voice assistant system, comprising the following steps:

step 301: selecting at least two historical tasks to form a historical task set;

step 302: aiming at each historical task in the historical task set, acquiring an interaction engine associated with each historical task in a voice assistant system of the terminal, and forming an interaction engine set by the interaction engine associated with each historical task in the historical task set; wherein each interaction engine is associated with at least one task, the interaction engine defining at least one slot for each task associated therewith;

step 303, judging whether the slots defined by at least two interaction engines in the interaction engine set for their respective associated historical tasks include the same or corresponding slots, and if so, aggregating the at least two interaction engines to form an interaction engine cluster so that the voice assistant system can use the interaction engine cluster.

Preferably, in step 301, at least two historical tasks are selected according to the first preset rule and the historical dialog record.

Preferably, in step 301, selecting at least two historical tasks according to a first preset rule and a historical dialog record specifically comprises: selecting at least two historical tasks executed within a preset time in the historical dialog record; and/or selecting at least two historical tasks executed within a preset number of dialog turns in the historical dialog record; and/or selecting at least two historical tasks executed consecutively in the historical dialog record. At least two historical tasks are executed consecutively when no other historical task, that is, no task outside the at least two historical tasks, is executed between them.
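As an illustration of the first selection rule (tasks executed within a preset time), the following Python sketch filters a historical dialog record by a time window. The function name, the `(timestamp, task)` record format, and the choice to anchor the window at an explicit `window_start` are assumptions; the description does not fix how the window is positioned.

```python
from datetime import datetime, timedelta

def select_tasks_within(records, window_start, preset_time):
    """records: (timestamp, task) pairs from the historical dialog record."""
    window_end = window_start + preset_time
    selected = []
    for timestamp, task in records:
        if window_start <= timestamp <= window_end and task not in selected:
            selected.append(task)
    # Step 301 needs at least two historical tasks to form a set.
    return selected if len(selected) >= 2 else []
```

The other two rules (a preset number of dialog turns, or consecutive execution) could be sketched the same way by replacing the time comparison with a turn-index comparison or an adjacency check.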

In step 303, after the at least two interaction engines are aggregated to form an interaction engine cluster, one or more interaction engines in the interaction engine cluster are determined as the dominant interaction engine according to the order of the at least two historical tasks.

Referring to FIG. 6, a method 400 for aggregating to form an interaction engine cluster for a terminal with a voice assistant system in one embodiment of the invention is shown, comprising the steps of:

step 401, acquiring a historical dialogue record of a user and a voice assistant system, and judging whether the following situations exist in the historical dialogue record: interrupting a conversation related to a first task, entering the conversation related to at least one second task, and recovering the conversation related to the first task after the execution of the at least one second task is finished; if yes, forming a historical task set, wherein the historical task set comprises the first task and at least one second task;

step 402: aiming at each historical task in the historical task set, acquiring an interaction engine associated with each historical task in a voice assistant system of the terminal, wherein the interaction engine associated with each historical task in the historical task set forms an interaction engine set; wherein each interaction engine is associated with at least one task, the interaction engine defining at least one slot for each task associated therewith;

step 403, judging whether the slots defined by at least two interaction engines in the interaction engine set for their respective associated historical tasks have the same or corresponding slots, and if so, aggregating the at least two interaction engines to form an interaction engine cluster so that the voice assistant system can use the interaction engine cluster.

Preferably, in step 401, judging whether the following situation exists in the historical dialog record (a dialog related to a first task is interrupted, a dialog related to at least one second task is entered, and the dialog related to the first task is resumed after the at least one second task is executed) specifically comprises: analyzing the historical dialog record to obtain the corresponding historical task execution timeline and historical task execution states; and judging, based on the historical task execution timeline and the historical task execution states, whether a historical task closed loop exists. A historical task closed loop is a historical task sequence in which the first task is both the head node and the tail node, at least one second task is an intermediate node, the execution state of the head node is incomplete, and the execution state of at least one intermediate node is complete. In the historical task sequence, the task execution state of the tail node may be complete or incomplete, which is not limited in the present invention.
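The closed-loop judgment can be sketched in Python as a scan over the execution timeline. The function name `find_closed_loop`, the `(task, state)` tuple format, and the state strings are assumptions for illustration; only the closed-loop definition itself comes from the description.

```python
def find_closed_loop(timeline):
    """timeline: ordered (task, state) pairs, state is 'complete' or 'incomplete'.

    Returns (first_task, [second tasks]) for the first closed loop found,
    or None if the timeline contains no closed loop."""
    for i, (head, head_state) in enumerate(timeline):
        if head_state != "incomplete":
            continue                      # the head node must be incomplete
        for j in range(i + 1, len(timeline)):
            if timeline[j][0] == head:    # the first task is resumed: tail node
                middle = timeline[i + 1:j]
                if any(state == "complete" for _, state in middle):
                    return head, [task for task, _ in middle]
                break                     # resumed without a completed intermediate task
    return None

# Toy timeline: T1 is interrupted by T2 and then resumed.
loop = find_closed_loop([("T1", "incomplete"), ("T2", "complete"), ("T1", "complete")])
```

The returned first task and second tasks together form the historical task set of step 401. The state of the tail node is deliberately not checked, since it may be complete or incomplete.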

In step 403, after aggregating the at least two interaction engines to form an interaction engine cluster, the interaction engine associated with the first task is determined as the dominant interaction engine.

The following applies to the above-described methods 300 and 400.

Preferably, in steps 303 and 403, based on an instruction of the user in the historical dialog record, it is determined whether at least two interaction engines in the set of interaction engines have the same or corresponding slots in the slots defined for the corresponding historical tasks.

Preferably, in steps 303 and 403, based on an instruction of a user in the historical dialog record, determining key knowledge data of at least one slot defined by an interaction engine associated with a historical task corresponding to the instruction for the historical task, and based on the slot and the key knowledge data, determining whether at least two interaction engines in the interaction engine set have the same or corresponding slots in the slots defined by the interaction engines for the associated historical tasks.

Preferably, in steps 303 and 403, the at least two interaction engines are aggregated to form an interaction engine cluster, specifically: recording the names of the at least two interaction engines in the attribute information of the interaction engine cluster, and recording the same or corresponding slot position in the slot positions defined by the at least two interaction engines in the interaction engine set aiming at the associated historical tasks in the attribute information of the interaction engine cluster.
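The aggregation described for steps 303 and 403 can be sketched as follows. This is a simplified illustration, not the claimed implementation: it assumes that two slots are the same or corresponding exactly when their key knowledge data are equal, and the function names and the dictionary layout of the attribute information are hypothetical. It also merges all matching engines into a single cluster rather than separating unrelated groups.

```python
from itertools import combinations

def shared_slots(filled_a, filled_b):
    """Return (slot_a, slot_b) pairs whose key knowledge data are equal."""
    return [(sa, sb) for sa, va in filled_a.items()
            for sb, vb in filled_b.items() if va == vb]

def aggregate(engine_slots):
    """engine_slots: engine name -> {slot name: key knowledge data}.

    Returns attribute information for the cluster, or None when no two
    engines have a same or corresponding slot."""
    members, links = set(), []
    for (ea, fa), (eb, fb) in combinations(engine_slots.items(), 2):
        pairs = shared_slots(fa, fb)
        if pairs:
            members.update((ea, eb))
            links.extend((ea, sa, eb, sb) for sa, sb in pairs)
    if len(members) < 2:
        return None
    return {"engines": sorted(members), "shared_slots": links}

attribute_info = aggregate({
    "A1": {"departure": "Beijing", "destination": "Shanghai", "date": "August 25"},
    "B1": {"city": "Shanghai", "date": "August 25"},
    "C1": {"singer name": "Zhou Jielun", "song name": "Sunny Day"},
})
```

Here the A1 and B1 engines are aggregated because the destination and city slots hold the same value and both define a date slot, while C1 is left out. Matching on values alone can produce false positives, so a real system would likely also compare slot types.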

In steps 303 and 403, the voice assistant system using the interaction engine cluster specifically comprises: when a first interaction engine in the interaction engine cluster is activated, activating at least one second interaction engine in the interaction engine cluster; based on the key knowledge data filled into at least one first slot of the first task associated with the first interaction engine, immediately synchronizing the at least one first slot, and/or a second slot corresponding to the first slot, of at least one second task associated with the at least one second interaction engine, wherein the second task associated with the second interaction engine has the at least one first slot and/or the second slot corresponding to the first slot; receiving the first task execution result from the first interaction engine and at least one second task execution result from the at least one second interaction engine; and providing the first task execution result and the at least one second task execution result to the user at the same time.

In steps 303 and 403, the voice assistant system using the interaction engine cluster specifically comprises: using the interaction engine cluster when the voice assistant system is in the cluster task working mode and a first interaction engine determined based on a user instruction belongs to the interaction engine cluster.

In steps 303 and 403, the voice assistant system using the interaction engine cluster specifically comprises: using the interaction engine cluster when the voice assistant system is in the cluster task working mode and a first interaction engine and a first task determined based on a user instruction belong to the interaction engine cluster.

In steps 303 and 403, the voice assistant system using the interaction engine cluster specifically comprises: using the interaction engine cluster when the received instruction of the user includes the cluster name of the interaction engine cluster.

In steps 303 and 403, the voice assistant system uses the interaction engine cluster, which specifically includes some or all of the aforementioned steps 201 to 222.

The above-described methods 300 and 400 are described below with reference to specific examples. The method 300 is described in detail first. For example, the historical dialog between the user and the voice assistant system from 10 a.m. to 11 a.m. on August 24 is recorded as follows:

The user: Book an air ticket to Shanghai for tomorrow.

The voice assistant system: A ticket for flight CA232, departing from Beijing for Shanghai at two o'clock tomorrow afternoon, has been booked for you.

The user: How is the weather in Shanghai tomorrow?

The voice assistant system: Shanghai will be cloudy turning sunny tomorrow.

The user: Play "Sunny Day" by Zhou Jielun.

The voice assistant system: "Sunny Day" by Zhou Jielun has been found for you and is about to play.

The method is further explained by taking as an example the case in which step 301 selects at least two historical tasks executed within a preset time in the historical dialog record. Assuming that the preset time is 1 hour, three historical tasks are selected based on the historical dialog record: air ticket reservation, weather query, and song playing. As previously described, the interaction engine associated with the air ticket reservation task is assumed to be the A1 interaction engine, the interaction engine associated with the weather query task is assumed to be the B1 interaction engine, and the interaction engine associated with the song playing task is assumed to be the C1 interaction engine. In step 302, the formed interaction engine set includes the A1 interaction engine, the B1 interaction engine, and the C1 interaction engine. The slots defined by the A1 interaction engine for air ticket reservation include a departure place, a destination, and a date; the slots defined by the B1 interaction engine for the weather query task include a city and a date; and the slots defined by the C1 interaction engine for song playing include a singer name and a song name.

Then, in step 303, the key knowledge data of the slots are determined from the user's instructions in the historical dialog record. Based on the user's instruction to book an air ticket to Shanghai for tomorrow, it is determined that, among the slots defined by the A1 interaction engine for the corresponding historical task (air ticket reservation), the key knowledge data of the departure place slot is the user's current city, namely Beijing, the key knowledge data of the destination slot is Shanghai, and the key knowledge data of the date slot is August 25. Based on the user's instruction asking about tomorrow's weather in Shanghai, it is determined that, among the slots defined by the B1 interaction engine for weather query, the key knowledge data of the city slot is Shanghai and the key knowledge data of the date slot is August 25. Based on the user's instruction to play "Sunny Day" by Zhou Jielun, it is determined that, among the slots defined by the C1 interaction engine for song playing, the key knowledge data of the singer name slot is Zhou Jielun and the key knowledge data of the song name slot is "Sunny Day".

On this basis, it is judged whether the slots defined by at least two interaction engines in the interaction engine set for their associated historical tasks include the same or corresponding slots. In this specific example, it can be determined that the destination slot defined by the A1 interaction engine for air ticket reservation and the city slot defined by the B1 interaction engine for the weather query task are corresponding slots, and that the date slot defined by the A1 interaction engine for air ticket reservation and the date slot defined by the B1 interaction engine for the weather query task are the same slot, whereas the C1 interaction engine has no slot that is the same as or corresponds to a slot of the A1 or B1 interaction engine. The A1 interaction engine and the B1 interaction engine are therefore aggregated to form an interaction engine cluster for the voice assistant system to use in subsequent voice interaction with the user.

The method 400 is described in detail below. For example, the user's historical dialog with the voice assistant system is as follows:

The user: Help me set an alarm for 7 a.m. tomorrow.

The voice assistant system: An alarm has been set for you for 7 a.m. tomorrow.

The user: Book an air ticket to Shanghai for tomorrow.

The voice assistant system: Found a ticket for flight CA232, departing from Beijing for Shanghai at two o'clock tomorrow afternoon.

The user: Play "Sunny Day" by Zhou Jielun.

The user: How is the weather in Shanghai tomorrow?

The voice assistant system: Shanghai will be cloudy turning sunny tomorrow.

The user: Confirm the reservation.

The voice assistant system: A ticket for flight CA232, departing from Beijing for Shanghai at two o'clock tomorrow afternoon, has been booked for you.

The user: Find a nearby Bank of China.

The voice assistant system: There is one Bank of China branch within 1 kilometer; the address is No. 88 Construction Road, and the telephone number is 12345678.

The historical task execution timeline obtained by analyzing the historical dialog record is: alarm setting, air ticket reservation, song playing, weather query, air ticket reservation, map query; and the corresponding historical task execution states are: complete, incomplete, complete, complete, complete, complete. Based on the historical task execution timeline and the historical task execution states, it is determined that a historical task closed loop exists in the historical dialog record. The head node and the tail node of the closed loop are the first task, namely air ticket reservation; the closed loop has two intermediate nodes, whose corresponding second tasks are song playing and weather query; the execution state of the head node is incomplete, and the execution state of the intermediate nodes is complete. The historical task set generated at this point includes: air ticket reservation, song playing, and weather query. In this example, the historical task set generated by the method 400 is the same as that generated by the method 300 in the previous example, so steps 402 and 403 are executed similarly to steps 302 and 303 and are not described again here.

In certain embodiments, prior to step 301 or 401, the following steps are performed:

step 201: receiving an instruction of a user;

step 201a, determining the type of the instruction, and if the instruction is an interactive engine cluster aggregation instruction, performing step 301 and/or step 401 and step 403.

In the embodiment in which steps 301-303 and steps 401-403 are both performed, the order of performing them is not limited; that is, either steps 301-303 or steps 401-403 may be performed first.

After step 303 or 403, step 304 or 404 is performed: providing a first interface for the user so that the user names the interaction engine cluster through the first interface, and the voice assistant system records the cluster name input by the user in the attribute information of the interaction engine cluster.

A corresponding usage level is set for the interaction engine cluster. In step 303 or step 403, after the at least two interaction engines are aggregated to form an interaction engine cluster, the usage level of the interaction engine cluster is marked as the lowest and recorded in the attribute information of the cluster.

In some embodiments, in step 301 or 401, a plurality of historical task sets are generated according to the historical dialog record, statistics are performed on the generated historical task sets, and the usage level of each interaction engine cluster is marked according to the frequency of the corresponding historical task set in the statistical result. For example, three historical task sets are generated: (air ticket reservation, song playing, weather query), (restaurant vacant table query, taxi hailing), and (air ticket reservation, song playing, weather query). The statistical result is then that (air ticket reservation, song playing, weather query) occurs twice and (restaurant vacant table query, taxi hailing) occurs once, so the voice assistant system marks a higher usage level for the interaction engine cluster generated from the task set (air ticket reservation, song playing, weather query) and a lower usage level for the interaction engine cluster generated from the task set (restaurant vacant table query, taxi hailing).
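The frequency-based marking can be sketched with `collections.Counter`. The function name `usage_levels` and the convention that level 1 is the lowest are assumptions; the description only states that more frequent task sets receive a higher usage level.

```python
from collections import Counter

def usage_levels(task_sets):
    """Map each distinct historical task set to a usage level (1 = lowest)."""
    # frozenset makes a task set hashable and independent of task order.
    counts = Counter(frozenset(tasks) for tasks in task_sets)
    distinct_counts = sorted(set(counts.values()))
    return {tasks: distinct_counts.index(n) + 1 for tasks, n in counts.items()}
```

Task sets that occur equally often receive the same level, and the result is keyed by `frozenset`, so a lookup needs the same wrapping, for example `levels[frozenset(tasks)]`.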

In certain embodiments, after step 303 or 403, step 305 or 405 is performed: providing a second interface for the user, so that the user modifies the usage level of the interaction engine cluster through the second interface, and recording the modified usage level in the attribute information of the interaction engine cluster. The user can thus modify, as needed, the usage level that the voice assistant system marked for the interaction engine cluster.

In some embodiments, in step 302 or 402, for each historical task of the historical task set, an interaction engine associated with the historical task in the voice assistant system of the terminal is obtained, specifically, an interaction engine executing the historical task is obtained.

In some embodiments, in step 302 or 402, for each historical task of the historical task set, an interaction engine associated with the historical task in the voice assistant system of the terminal is obtained, specifically, at least one interaction engine capable of executing the historical task in the interaction engines included in the voice assistant system of the terminal is obtained, and according to a second preset rule, one interaction engine is selected from the at least one interaction engine to serve as the interaction engine associated with the historical task.

In some embodiments, the second preset rule is to select according to at least one of the following: the most frequently used interaction engine, the most recently used interaction engine, the interaction engine with the highest user score, the interaction engine whose task execution results the user has adopted most often, and so on.
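A minimal Python sketch of applying one such rule to the candidate interaction engines follows. The `EngineStats` fields, the rule names in `RULES`, and `select_engine` are all hypothetical; the description names the criteria but not how the statistics are stored.

```python
from dataclasses import dataclass

@dataclass
class EngineStats:
    name: str
    use_count: int        # how many times the engine has been used
    last_used: float      # timestamp of the most recent use
    user_score: float     # score given by the user
    adopted_count: int    # how often the user adopted its results

# Each rule maps an engine to the value that should be maximized.
RULES = {
    "most_used": lambda e: e.use_count,
    "most_recent": lambda e: e.last_used,
    "highest_score": lambda e: e.user_score,
    "most_adopted": lambda e: e.adopted_count,
}

def select_engine(candidates, rule):
    """Pick the interaction engine associated with the historical task."""
    return max(candidates, key=RULES[rule])
```

Combining several rules, as "at least one of" allows, could be done by ranking on a tuple of these values, although the description does not say how ties or combinations are resolved.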

In certain embodiments, the step 206 further comprises: updating the usage level of the interaction engine cluster based on the usage of the interaction engine cluster. The use of the interaction engine cluster refers to the use of the interaction engine cluster triggered by a user sending an instruction including a cluster name corresponding to the interaction engine cluster, or the use of the interaction engine cluster triggered by the user sending the instruction when the voice assistant system is in a cluster task working mode.

In some embodiments, after step 303 or 403, step 306 or 406 is executed to upload the interaction engine cluster to a server for downloading by other users, and specifically, to upload the attribute information of the interaction engine cluster.

In some embodiments, after step 303 or 403, step 307 or 407 is executed to provide a third interface to the user, so that the user sets the attribute of the interaction engine in the interaction engine cluster through the third interface, for example, a master interaction engine and a slave interaction engine of the interaction engine cluster are set, and the attribute of the interaction engine is recorded in the attribute information of the interaction engine cluster.

In some embodiments, after step 303 or 403, step 308 or 408 is executed to send a synchronization message to another terminal of the user, where the synchronization message carries the interaction engine cluster obtained by aggregation, and specifically, the synchronization message carries the attribute information of the interaction engine cluster.

The present invention does not limit the execution sequence of steps 304-308, and the five steps can be executed in any order, and likewise, the present invention does not limit the execution sequence of steps 404-408, and the five steps can be executed in any order.

6. Building an interaction engine cluster

As mentioned above, the voice assistant system of the terminal can also acquire or download the attribute information of the interaction engine cluster from the server, and construct the interaction engine cluster locally.

Referring to FIG. 7, a method 500 for constructing an interaction engine cluster for a terminal with a voice assistant system is shown in one embodiment of the present invention, the method comprising the steps of:

step 501, receiving attribute information of a first interaction engine cluster sent by a server, wherein the attribute information of the interaction engine cluster comprises names of at least two interaction engines included in the interaction engine cluster;

step 502, judging whether the voice assistant system of the terminal includes all the interaction engines listed in the attribute information of the first interaction engine cluster, and if the voice assistant system does not include a fifth interaction engine among those interaction engines, executing step 503;

step 503, acquiring a task associated with the fifth interaction engine;

step 504, judging whether the voice assistant system of the terminal comprises a sixth interaction engine associated with the task, if so, executing step 506;

step 506, replacing a fifth interaction engine in the attribute information of the first interaction engine cluster with a sixth interaction engine to form attribute information of a second interaction engine cluster;

and step 508, constructing an interaction engine cluster based on the second interaction engine cluster attribute information so that the voice assistant system can use the interaction engine cluster.

Preferably, the attribute information of the first interaction engine cluster in step 501 may be obtained by locally aggregating, by another user, the interaction engine cluster at the terminal thereof according to the foregoing method 300 and/or 400 and uploading the attribute information of the interaction engine cluster to the server, or may be the attribute information of the interaction engine cluster preset by the server.

When there are multiple tasks associated with the fifth interaction engine, in step 503, the task belonging to the first interaction engine cluster associated with the fifth interaction engine is acquired, and the acquiring process may be completed based on the attribute information (e.g., table 2) of the first interaction engine cluster.

Preferably, before step 501, the following steps are performed:

Step 201: receiving an instruction of a user;

step 201a, judging the type of the instruction, and if the instruction is a first interaction engine cluster downloading instruction, executing step 501.

Preferably, the voice assistant system using the interaction engine cluster specifically comprises: when a first interaction engine in the interaction engine cluster is activated, activating at least one second interaction engine in the interaction engine cluster; based on the key knowledge data filled into at least one first slot of the first task associated with the first interaction engine, immediately synchronizing the at least one first slot, and/or a second slot corresponding to the first slot, of at least one second task associated with the at least one second interaction engine, wherein the second task associated with the second interaction engine has the at least one first slot and/or the second slot corresponding to the first slot; receiving the first task execution result from the first interaction engine and at least one second task execution result from the at least one second interaction engine; and providing the first task execution result and the at least one second task execution result to the user at the same time.

Preferably, the interaction engine cluster downloading instruction includes a cluster name of the interaction engine cluster.

Preferably, in step 502, if the voice assistant system of the terminal includes all the interaction engines included in the attribute information of the first interaction engine cluster, step 509 is executed.

Preferably, in step 504, if the voice assistant system of the terminal does not include the sixth interaction engine associated with the above task, step 507 and step 509 are performed.

Preferably, in step 504, if the voice assistant system of the terminal includes a sixth interaction engine associated with the task, step 505 is executed. Step 505 determines whether the slots defined by the sixth interaction engine for the task include all slots defined by the fifth interaction engine for the task, and if so, step 506 is executed.

Preferably, in step 505, if the slot defined by the sixth interaction engine for the task does not include all slots defined by the fifth interaction engine for the task, step 507 and step 509 are executed;

step 507, downloading the fifth interaction engine from the server;

step 509, constructing an interaction engine cluster based on the attribute information of the first interaction engine cluster, so that the voice assistant system can use the interaction engine cluster.

Preferably, step 508 further comprises: and sending a synchronization message to other terminals of the user, wherein the synchronization message carries the constructed interaction engine cluster, and specifically, the synchronization message carries the attribute information of the second interaction engine cluster.

Step 509 further includes sending a synchronization message to the other terminal of the user, where the synchronization message carries the constructed interaction engine cluster, and specifically, the synchronization message carries the attribute information of the first interaction engine cluster.

Preferably, the synchronization message carries download indication information in addition to the first interaction engine cluster attribute information, where the download indication information includes a name of a fifth interaction engine and indicates the other terminal to download the fifth interaction engine.

Based on this, after receiving the synchronization message, the other terminals determine whether the synchronization message carries the download indication information. If it does, the other terminals download the fifth interaction engine according to the download indication information and construct an interaction engine cluster. If it does not, the other terminals directly construct the interaction engine cluster according to the interaction engine cluster attribute information carried in the synchronization message, such as the first interaction engine cluster attribute information or the second interaction engine cluster attribute information. This is because the voice assistant systems of the different terminals of the user include the same interaction engines before the interaction engine cluster is downloaded; since the terminal executing the method 500 has already performed the determinations in steps 502, 504, and 505, the other terminals do not need to repeat them, and their operation can be simplified to checking whether the synchronization message carries the download indication information.
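The handling of the synchronization message at the other terminals can be sketched as follows. This is only an illustrative sketch: the message layout (a dict with `cluster_attrs` and an optional `download_indication` list) and the `download` and `build_cluster` callables are assumptions made for the example and are not defined by the method.

```python
def handle_sync_message(message, download, build_cluster):
    """Act on a synchronization message received from another terminal.

    message: hypothetical dict {"cluster_attrs": ..., "download_indication": [...]}
    download: hypothetical callable that fetches one interaction engine by name
    build_cluster: hypothetical callable that constructs a cluster from attribute info
    """
    # The sending terminal already performed the checks of steps 502, 504 and 505,
    # so the receiver only checks whether download indication information is present.
    for engine_name in message.get("download_indication", []):
        download(engine_name)
    return build_cluster(message["cluster_attrs"])


downloaded = []
cluster = handle_sync_message(
    {
        "cluster_attrs": [("A1", "air ticket reservation"), ("B1", "weather query")],
        "download_indication": ["B1"],
    },
    download=downloaded.append,
    build_cluster=lambda attrs: [name for name, _task in attrs],
)
```

When the message carries no download indication, the same function simply builds the cluster from the carried attribute information.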

Step 507 specifically comprises: downloading the fifth interaction engine from the server immediately; or adding the fifth interaction engine to the downloading waiting list, periodically determining whether the terminal is connected to a Wi-Fi network, and, if the terminal is connected to a Wi-Fi network, performing the downloads according to the downloading waiting list.

After the step 507 and before the step 509, step 507a is executed to determine whether the fifth interaction engine is completely downloaded, and if so, step 509 is executed.

In step 507a, the fifth interaction engine is determined to be completely downloaded when the downloading waiting list is empty.
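The deferred variant of step 507, together with the completion check of step 507a, can be sketched as follows. The class name `DeferredDownloader`, the `fetch` and `on_wifi` callables, and the idea that `poll` is invoked by some periodic timer are all assumptions made for illustration.

```python
class DeferredDownloader:
    """Hypothetical helper holding the downloading waiting list of step 507."""

    def __init__(self, fetch, on_wifi):
        self.fetch = fetch        # assumed callable: downloads one engine by name
        self.on_wifi = on_wifi    # assumed callable: True if connected to Wi-Fi
        self.waiting = []         # the downloading waiting list

    def request(self, engine_name, immediately=False):
        # Either download right away or queue the engine for later.
        if immediately:
            self.fetch(engine_name)
        else:
            self.waiting.append(engine_name)

    def poll(self):
        """Meant to be called periodically; returns True once the list is empty."""
        if self.on_wifi():
            while self.waiting:
                self.fetch(self.waiting.pop(0))
        # Step 507a: the download is complete when the waiting list is empty.
        return not self.waiting


fetched = []
network = {"wifi": False}
downloader = DeferredDownloader(fetched.append, lambda: network["wifi"])
downloader.request("B1")
done_without_wifi = downloader.poll()   # still waiting, nothing fetched
network["wifi"] = True
done_with_wifi = downloader.poll()      # list drained, step 509 may proceed
```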

The above method is described below with reference to a specific example. Assume that the first interaction engine cluster attribute information received in step 501 is as shown in table 2. In step 502, assume that the voice assistant system of the terminal includes only the A1 interaction engine and does not include the B1 interaction engine. The method then proceeds to step 503 and obtains the task associated with the fifth interaction engine, i.e., the B1 interaction engine; in this example, the B1 interaction engine is assumed to be associated with only one task, namely weather query. In step 504, it is determined whether the voice assistant system of the terminal includes a sixth interaction engine associated with the task; in this example, the voice assistant system of the terminal is assumed to include the B2 interaction engine associated with weather query, so step 505 is performed. The slots defined by the B1 interaction engine for the weather query task are a city and a date, and the slots defined by the B2 interaction engine for the weather query task are a city, a date and a time, so in step 505 it is determined that the slots defined by the sixth interaction engine for the task include all slots defined by the fifth interaction engine for the task. The method then proceeds to step 506, in which the sixth interaction engine, i.e., the B2 interaction engine, replaces the fifth interaction engine, i.e., the B1 interaction engine, in the attribute information of the first interaction engine cluster to form second interaction engine cluster attribute information. In step 508, an interaction engine cluster is constructed based on the second interaction engine cluster attribute information, and the constructed interaction engine cluster includes the A1 interaction engine and the B2 interaction engine.

By the method, after the attribute information of the interaction engine cluster is downloaded from the server, the attribute information of the interaction engine cluster can be localized according to the interaction engine locally included in the voice assistant system of the terminal, so that unnecessary downloading of the interaction engine capable of executing the same task is avoided.
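The localization performed in steps 502 to 507 can be sketched as follows. The data shapes are assumptions made for illustration: the cluster attribute information is modeled as a list of (engine name, task) pairs, and each engine catalog maps an engine name to the slots it defines per task. The function name `localize_cluster` is likewise hypothetical.

```python
def localize_cluster(cluster_attrs, local_engines, server_engines):
    """Return (localized attribute info, engines that must be downloaded).

    cluster_attrs: list of (engine name, task) pairs (assumed shape)
    local_engines: engines on the terminal, name -> {task: set of slot names}
    server_engines: engines on the server, same shape (used for the slot check)
    """
    localized, to_download = [], []
    for engine_name, task in cluster_attrs:
        if engine_name in local_engines:                  # step 502
            localized.append((engine_name, task))
            continue
        required = server_engines[engine_name][task]      # slots of the "fifth" engine
        # Steps 504 and 505: look for a local "sixth" engine that handles the
        # task and defines at least all the slots of the fifth engine.
        substitute = next(
            (name for name, tasks in local_engines.items()
             if task in tasks and required <= tasks[task]),
            None,
        )
        if substitute is not None:                        # step 506
            localized.append((substitute, task))
        else:                                             # step 507
            to_download.append(engine_name)
            localized.append((engine_name, task))
    return localized, to_download


attrs = [("A1", "air ticket reservation"), ("B1", "weather query")]
local = {
    "A1": {"air ticket reservation": {"departure", "destination", "date"}},
    "B2": {"weather query": {"city", "date", "time"}},
}
server = {"B1": {"weather query": {"city", "date"}}}
localized, to_download = localize_cluster(attrs, local, server)
```

With the inputs of the example above, B2 replaces B1 and nothing needs to be downloaded.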

Referring to FIG. 8, a method 600 for constructing an interaction engine cluster for a terminal with a voice assistant system is shown in one embodiment of the present invention, the method comprising the steps of:

Step 601, receiving first interaction engine cluster configuration information sent by a server, wherein the first interaction engine cluster configuration information includes tasks associated with interaction engines included in a first interaction engine cluster;

step 602, for each task included in the configuration information of the first interaction engine cluster, selecting an interaction engine associated with the task from a voice assistant system of the terminal, and putting the interaction engine into an interaction engine set;

step 603, after the interaction engine set includes the interaction engine associated with each task, an interaction engine cluster is constructed based on the interaction engines in the interaction engine set, so that the voice assistant system can use the interaction engine cluster.

Preferably, before step 601, the method further comprises the following steps:

Step 201: receiving an instruction of a user;

step 201a, judging the type of the instruction, and if the instruction is a second interaction engine cluster downloading instruction, executing step 601.

Preferably, the voice assistant system uses the interaction engine cluster as follows: when a first interaction engine in the interaction engine cluster is activated, at least one second interaction engine in the interaction engine cluster is activated; based on the key knowledge data filled into at least one first slot of the first task associated with the first interaction engine, the at least one first slot and/or a second slot of at least one second task associated with the at least one second interaction engine is instantly synchronized, where the second task associated with the second interaction engine has the at least one first slot and/or a second slot corresponding to the first slot; the first task execution result is received from the first interaction engine, at least one second task execution result is received from the at least one second interaction engine, and the first task execution result and the at least one second task execution result are provided to the user at the same time.

Preferably, for each interaction engine in the interaction engine cluster, the interaction engine cluster configuration information includes a task associated therewith. The server generates the interaction engine cluster configuration information after receiving an interaction engine cluster (specifically, interaction engine cluster attribute information) uploaded by the terminal, and the generation process of the interaction engine cluster configuration information is to replace an interaction engine name in the interaction engine cluster attribute information with a task associated with the interaction engine, for example, the interaction engine cluster attribute information shown in table 1, or delete the interaction engine name in the interaction engine cluster attribute information, for example, the interaction engine cluster attribute information shown in table 2. Preferably, the generating process of the configuration information of the interaction engine cluster is suitable for a case that the interaction engine included in the interaction engine cluster is associated with only one task.

Preferably, for each interaction engine in the interaction engine cluster, the interaction engine cluster configuration information includes those tasks, among the tasks associated with the interaction engine, that belong to the interaction engine cluster. In this case, the interaction engine cluster configuration information is generated by deleting the names of the interaction engines from the attribute information of the interaction engine cluster. Preferably, this generation process is applicable to the case in which at least one interaction engine included in the interaction engine cluster is associated with a plurality of tasks.

Preferably, when receiving the attribute information of the interaction engine cluster uploaded by the terminal, the server determines whether the attribute information contains a task associated with the interaction engine, if so, deletes the name of the interaction engine in the attribute information of the interaction engine cluster, generates configuration information of the interaction engine cluster, and if not, replaces the name of the interaction engine in the attribute information of the interaction engine cluster with the task associated with the interaction engine, and generates the configuration information of the interaction engine cluster.
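The server-side generation of configuration information can be sketched as follows. The entry layout (a dict with an `engine` name and an optional `tasks` list) and the `engine_tasks` lookup table are assumptions made for illustration; the actual layout of the attribute information is given by the tables referenced above.

```python
def to_configuration(cluster_attrs, engine_tasks):
    """Turn cluster attribute information into configuration information (a task list).

    cluster_attrs: list of dicts {"engine": name, "tasks": [...]} where "tasks"
                   may be absent (assumed layout)
    engine_tasks: hypothetical server-side lookup, engine name -> associated tasks
    """
    config = []
    for entry in cluster_attrs:
        if entry.get("tasks"):
            # Tasks are already listed: drop the engine name and keep the tasks.
            config.extend(entry["tasks"])
        else:
            # Only the name is present: replace it with the engine's associated task.
            config.extend(engine_tasks[entry["engine"]])
    return config


registry = {"A1": ["air ticket reservation"], "B1": ["weather query"]}
names_only = [{"engine": "A1"}, {"engine": "B1"}]
names_with_tasks = [
    {"engine": "A1", "tasks": ["air ticket reservation"]},
    {"engine": "B1", "tasks": ["weather query"]},
]
config_a = to_configuration(names_only, registry)
config_b = to_configuration(names_with_tasks, registry)
```

Both inputs yield the same task list, so a terminal receiving the configuration information never sees which engines the uploading terminal used.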

Preferably, in step 602, an interaction engine associated with the task is selected from a voice assistant system of the terminal and placed into an interaction engine set, specifically:

step 602a, judging whether the voice assistant system of the terminal comprises an interaction engine associated with the task, if so, executing step 602b; otherwise, step 602c and its subsequent steps are executed;

step 602b, putting the interaction engine into an interaction engine set;

step 602c, obtaining a first selectable interaction engine list of the task, providing the first selectable interaction engine list to the user, and returning to step 201;

in step 201a, if the instruction is a first download instruction, step 602d is executed; the first downloading instruction carries an interaction engine selected by a user for the task from the first selectable interaction engine list;

step 602d, downloading the interaction engines indicated in the first downloading instruction, and putting the interaction engines into an interaction engine set.

The obtaining of the first optional interaction engine list of the task in step 602c may specifically comprise sending a first optional interaction engine list acquisition request to the server, where the acquisition request carries the task, and receiving the first optional interaction engine list returned by the server.

The downloading of the interaction engine indicated in the first download instruction in step 602d specifically comprises: immediately downloading the interaction engine indicated in the first download instruction from the server; or adding the interaction engine indicated in the first download instruction to the downloading waiting list, periodically determining whether the terminal is connected to a Wi-Fi network, and, if the terminal is connected to a Wi-Fi network, performing the downloads according to the downloading waiting list.

Preferably, in step 603, it is determined whether the number of interaction engines included in the interaction engine set is equal to the number of interaction engines included in the first interaction engine configuration set; if so, it may be determined that the interaction engine set includes the interaction engine associated with each task.

Preferably, the first selectable interaction engine list of the task obtained from the server in step 602c specifically includes: the names of the selectable interaction engines.

Preferably, the first selectable interaction engine list of the task obtained from the server in step 602c further includes: the recommendation levels of the selectable interaction engines.

Preferably, if a plurality of interaction engines are associated with the task in the voice assistant system of the terminal, one of the interaction engines is selected according to a fourth preset rule and is placed into the interaction engine set.

In certain embodiments, the fourth preset rule is at least one of: selecting the interaction engine that is used most often, the interaction engine that was used most recently, the interaction engine that has the highest user score, the interaction engine whose task execution results the user has adopted most often, and so on.
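Selecting one of several local interaction engines under such a rule can be sketched as follows. The per-engine statistics fields (`use_count`, `last_used`, `user_score`, `adopted_count`) and the rule names are hypothetical; the method only names the criteria, not how they are stored.

```python
# Each rule maps to the statistic that is maximized (all field names are assumed).
RULES = {
    "most_used": lambda e: e["use_count"],
    "most_recent": lambda e: e["last_used"],       # e.g. a numeric timestamp
    "top_rated": lambda e: e["user_score"],
    "most_adopted": lambda e: e["adopted_count"],
}


def pick_engine(candidates, rule="most_used"):
    """Return the name of the candidate engine that ranks highest under the rule."""
    return max(candidates, key=RULES[rule])["name"]


weather_engines = [
    {"name": "B1", "use_count": 12, "last_used": 1000, "user_score": 4.1, "adopted_count": 3},
    {"name": "B2", "use_count": 7, "last_used": 2000, "user_score": 4.6, "adopted_count": 5},
]
by_usage = pick_engine(weather_engines, "most_used")
by_score = pick_engine(weather_engines, "top_rated")
```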

If at least one of the interaction engines included in the interaction engine cluster uploaded by other users defines an extension slot for the task associated with that interaction engine, the interaction engine cluster configuration information further includes, for the task, a second optional interaction engine list, and the slots defined for the task by each interaction engine in the second optional interaction engine list include the slots defined for the task by the at least one interaction engine. Specifically, after the server generates the interaction engine cluster configuration information according to the foregoing process, it determines, based on the interaction engine cluster attribute information and for each interaction engine in the interaction engine cluster, whether the slots defined by the interaction engine for the associated task include an extended slot of the task, and if so, a second optional interaction engine list of the task is added to the interaction engine cluster configuration information. Preferably, the associated task belongs to the interaction engine cluster.
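How the server might derive the second optional interaction engine list can be sketched as follows. The function name, the notion of a per-task set of basic slots, and the `registry` of known engines are assumptions made for illustration.

```python
def second_optional_list(task, engine_slots, basic_slots, registry):
    """Return the second optional engine list for a task, or None if not needed.

    engine_slots: slots the uploaded cluster's engine defines for the task
    basic_slots: the task's basic slots (assumed to be known to the server)
    registry: hypothetical server catalog, engine name -> {task: set of slots}
    """
    if not (engine_slots - basic_slots):
        return None  # the engine defines no extension slot for this task
    # Keep every engine whose slots for the task cover the uploaded engine's slots.
    return sorted(
        name for name, tasks in registry.items()
        if task in tasks and engine_slots <= tasks[task]
    )


registry = {
    "B1": {"weather query": {"city", "date"}},
    "B2": {"weather query": {"city", "date", "time"}},
    "B3": {"weather query": {"city", "date", "time", "region"}},
}
basic = {"city", "date"}
from_b2 = second_optional_list("weather query", {"city", "date", "time"}, basic, registry)
from_b1 = second_optional_list("weather query", {"city", "date"}, basic, registry)
```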

Preferably, in step 602, an interaction engine associated with the task is selected from a voice assistant system of the terminal and placed in an interaction engine set, which specifically includes:

step 602a, if the configuration information of the interaction engine cluster includes a second selectable interaction engine list of the task, determining whether the voice assistant system of the terminal includes an interaction engine in the second selectable interaction engine list, and if the voice assistant system includes an interaction engine in the second selectable interaction engine list, executing step 602b; if not, executing step 602e and the subsequent steps; if the interaction engine cluster configuration information does not include the second optional interaction engine list of the task, judging whether a voice assistant system of the terminal includes an interaction engine associated with the task, if so, executing step 602b; otherwise, step 602c and its subsequent steps are performed.

Step 602b, putting the interaction engines into an interaction engine set.

Step 602e and its subsequent steps are:

step 602e, providing the second selectable interaction engine list to the user, and returning to step 201;

in step 201a, if the instruction is a second download instruction, step 602f is executed; the second downloading instruction carries an interaction engine selected by the user for the task from the second selectable interaction engine list;

step 602f, downloading the interaction engine indicated in the second downloading instruction, and putting the interaction engine into an interaction engine set.

Step 602c and its subsequent steps are:

step 602c, acquiring a first optional interaction engine list of the task, providing the first optional interaction engine list to a user, and returning to step 201;

in step 201a, if the instruction is a first download instruction, step 602d is executed; the first downloading instruction carries an interaction engine selected by a user for the task from the first optional interaction engine list;

Preferably, the first selectable interaction engine list and the second selectable interaction engine list may each include a preset number of interaction engines. The interaction engines included in the first and second selectable interaction engine lists are selected based on user scores, download counts, and the like.

Preferably, the step 603 further includes: and sending a synchronization message to other terminals of the user, wherein the synchronization message carries the constructed interaction engine cluster, and specifically, the synchronization message carries the attribute information of the constructed interaction engine cluster.

Preferably, if the interaction engine is downloaded in the process of constructing the interaction engine cluster, the synchronization message further carries download indication information, and the download indication information includes a name of the interaction engine downloaded in the process of constructing the interaction engine cluster.

Based on this, after receiving the synchronization message, if the synchronization message carries the download indication information, the other terminals download the interaction engine according to the download indication and construct an interaction engine cluster, and if the synchronization message does not carry the download indication, the other terminals directly construct the interaction engine cluster according to the attribute information of the interaction engine cluster. This is because the voice assistant systems of different terminals of the user include the same interaction engine before downloading the interaction engine cluster, and since the terminal performing the above method 600 has already performed the determination in step 602, other terminals do not need to perform the determination in step 602 any more.

The above method is described below with reference to specific examples. Assume that the attribute information of the first interaction engine cluster uploaded by other terminals is as shown in table 2, and that the server generates configuration information of the first interaction engine cluster based on the attribute information of the first interaction engine cluster, the configuration information including an air ticket reservation task and a weather query task. In step 601, the terminal downloads the configuration information of the first interaction engine cluster. Assume that the voice assistant system of the terminal includes only the A1 interaction engine associated with the air ticket reservation task and does not include an interaction engine associated with the weather query task. In step 602, the A1 interaction engine is added to the interaction engine set for the air ticket reservation task. For the weather query task, the method proceeds to step 602c: the terminal obtains a first optional interaction engine list of the task, which includes the B1 interaction engine and the B2 interaction engine, and provides the list to the user. The user gives a first download instruction instructing the terminal to download the B1 interaction engine, and the method proceeds to step 602d, in which the terminal downloads the B1 interaction engine and puts it into the interaction engine set. In step 603, an interaction engine cluster is constructed based on the interaction engines in the interaction engine set.

Now assume instead that the interaction engine associated with weather query in the first interaction engine cluster attribute information uploaded by other terminals is the B2 interaction engine. Since the slots defined by the B2 interaction engine for the weather query task include not only two basic slots, namely a city slot and a date slot, but also one extended slot, namely a time slot, the first interaction engine cluster configuration information generated by the server includes a second selectable interaction engine list of the weather query task. The list includes the B2 interaction engine and the B3 interaction engine, where the slots defined by the B3 interaction engine for the weather query task include the two basic slots, namely the city slot and the date slot, and also two extended slots, namely a time slot and a region slot. Then, in step 602a, the A1 interaction engine is added to the interaction engine set for the air ticket reservation task; for the weather query task, the method proceeds to step 602e, the terminal provides the second selectable interaction engine list to the user and downloads the interaction engine according to the second download instruction given by the user, and an interaction engine cluster is then constructed in step 603.

Through the method, the terminal downloads the configuration information of the interaction engine cluster from the server, and based on the configuration information of the interaction engine cluster, the interaction engine locally included in the voice assistant system is preferentially selected to construct the interaction engine cluster, so that unnecessary downloading of a plurality of interaction engines capable of executing the same task is avoided.
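The per-task selection of step 602, including the branch taken when a second selectable interaction engine list is present, can be sketched as follows. The function name and data shapes are assumptions made for illustration, and the sketch simply takes the first matching local engine where the method would apply the fourth preset rule.

```python
def select_engines(config_tasks, local_engines, second_lists=None):
    """Return (engine set, pending) for the tasks in the configuration information.

    config_tasks: list of tasks from the configuration information
    local_engines: engines on the terminal, name -> set of associated tasks
    second_lists: optional mapping task -> second selectable engine list
    pending maps each unresolved task to the step that asks the user
    ("602c" for the first list, "602e" for the second list).
    """
    second_lists = second_lists or {}
    engine_set, pending = {}, {}
    for task in config_tasks:
        if task in second_lists:
            # Only local engines that appear in the second list qualify.
            matches = [n for n in second_lists[task] if n in local_engines]
            fallback = "602e"
        else:
            matches = [n for n, tasks in local_engines.items() if task in tasks]
            fallback = "602c"
        if matches:
            engine_set[task] = matches[0]     # step 602b (first match, for brevity)
        else:
            pending[task] = fallback          # user must choose and download
    return engine_set, pending


tasks = ["air ticket reservation", "weather query"]
local = {"A1": {"air ticket reservation"}}
first_case = select_engines(tasks, local)
second_case = select_engines(tasks, local, {"weather query": ["B2", "B3"]})
# Step 603 can proceed once every task has an engine.
ready = len(first_case[0]) == len(tasks)
```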

In the foregoing embodiments, the type of the instruction is determined in step 201a. Specifically, the voice assistant system may determine the type of the instruction based on factors such as the keywords included in the instruction of the user, the dialog context, the task execution result provided to the user, and the working mode of the voice assistant system.

The method for performing voice interaction based on the interaction engine cluster, the method for forming the interaction engine cluster by aggregation and the method for constructing the interaction engine cluster can be partially or completely combined. Referring to fig. 15, a method for forming an interaction engine cluster and performing voice interaction based on the interaction engine cluster of the present invention is shown, and in other embodiments, all or part of the steps in the foregoing method embodiments may also be included in the method for forming an interaction engine cluster and performing voice interaction based on the interaction engine cluster. The present invention is not limited thereto.

The present invention further provides a device for performing voice interaction based on an interaction engine cluster, referring to fig. 9, the device includes:

a user instruction acquisition unit for acquiring an instruction of a user;

the task execution control unit is used for determining a first task based on the instruction, determining a first interaction engine based on the first task, and sending the instruction to the first interaction engine so that the first interaction engine fills a slot of the first task based on the instruction; wherein the first interaction engine is associated with the first task;

the working mode control unit is used for judging whether the voice assistant system is in a cluster task working mode, if so, triggering the task execution control unit, and if not, triggering the task execution result receiving unit;

the task execution control unit is further configured to determine, in response to a trigger of the working mode control unit, an interaction engine cluster to which the first interaction engine belongs, where the interaction engine cluster includes the first interaction engine and at least one second interaction engine, activate the at least one second interaction engine, and perform instant synchronization on a first slot position of at least one second task associated with the at least one second interaction engine in the interaction engine cluster and/or a second slot position corresponding to the first slot position based on key knowledge data filled by the first interaction engine into the first slot position of the first task;

the task execution result receiving unit is configured to receive a first task execution result from the first interaction engine and provide the first task execution result to a user, or receive the first task execution result from the first interaction engine, receive at least one second task execution result from the at least one second interaction engine, and provide the first task execution result and the at least one second task execution result to the user at the same time.
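The cooperation of the task execution control unit, the working mode control unit and the task execution result receiving unit can be sketched as follows. The `Engine` class, the `process` function and the slot correspondence table (for example, that a destination slot corresponds to a city slot) are assumptions made for illustration only.

```python
class Engine:
    """Hypothetical stand-in for an interaction engine bound to one task."""

    def __init__(self, name, task, slot_names):
        self.name = name
        self.task = task
        self.slots = dict.fromkeys(slot_names)   # slot name -> value or None

    def fill(self, values):
        # Fill only the slots this engine defines for its task.
        for slot, value in values.items():
            if slot in self.slots:
                self.slots[slot] = value

    def execute(self):
        return self.task, dict(self.slots)


def process(values, first, cluster, cluster_mode, correspondence=None):
    """Fill the first task, then synchronize slots to the other cluster engines."""
    correspondence = correspondence or {}
    first.fill(values)
    results = [first.execute()]
    if not cluster_mode:
        return results                            # single-task behaviour
    for second in cluster:
        if second is first:
            continue
        synced = {}
        for slot, value in first.slots.items():
            if value is None:
                continue
            # Same-named slot (first slot) or a corresponding slot (second slot).
            target = slot if slot in second.slots else correspondence.get(slot)
            if target in second.slots:
                synced[target] = value
        second.fill(synced)
        results.append(second.execute())
    return results


flight = Engine("A1", "air ticket booking", ["departure", "destination", "date"])
weather = Engine("B1", "weather query", ["city", "date"])
results = process(
    {"departure": "Beijing", "destination": "Shanghai", "date": "2019-10-01"},
    flight, [flight, weather], cluster_mode=True,
    correspondence={"destination": "city"},
)
```

Both results are returned together, which mirrors providing the first and second task execution results to the user at the same time.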

Preferably, the apparatus further includes a storage unit, configured to store at least one interaction engine cluster, and in particular, store attribute information of the at least one interaction engine cluster.

Preferably, if the task execution control unit determines that the first interaction engine does not belong to any interaction engine cluster, the task execution result receiving unit is triggered to receive a first task execution result from the first interaction engine.

Preferably, the task execution control unit is further configured to determine the first interaction engine and an interaction engine cluster to which the first task belongs.

Preferably, the task execution control unit is further configured to determine, in response to the trigger of the working mode control unit, an interaction engine cluster to which the first interaction engine belongs, determine whether the first interaction engine is a dominant interaction engine of the interaction engine cluster, if so, activate the at least one second interaction engine, and perform instant synchronization on the first slot position and/or the second slot position of the at least one second task associated with the at least one second interaction engine in the interaction engine cluster based on the key knowledge data filled by the first interaction engine to the first slot position of the first task; otherwise, triggering the task execution result receiving unit.

The task execution result receiving unit is further configured to trigger the user instruction obtaining unit after the first task execution result is provided to the user, or after the first task execution result and the at least one second task execution result are simultaneously provided to the user.

The user instruction acquisition unit is also used for responding to the trigger of the task execution result receiving unit and continuously receiving the instruction of the user.

Preferably, the device further comprises an instruction classification unit, configured to determine the type of the user instruction obtained by the user instruction obtaining unit, and trigger the task execution control unit if the instruction is a new task instruction. The task execution control unit is further configured to determine, in response to a trigger of the instruction classifying unit when the instruction is a new task instruction, a first task based on the instruction, determine a first interaction engine based on the first task, and send the instruction to the first interaction engine, so that the first interaction engine fills a slot of the first task based on the instruction.

The instruction classification unit is further configured to: if the instruction is a single-task modification instruction, trigger the task execution control unit. The task execution control unit is further configured to send the instruction to the first interaction engine in response to a trigger of the instruction classification unit when the instruction is a single-task modification instruction, so that the first interaction engine modifies the key knowledge data filled in the slot of the first task based on the instruction.

The task execution result receiving unit is further configured to obtain a first task modification execution result from the first interaction engine, provide the first task modification execution result to a user, and trigger the user instruction obtaining unit.

The instruction classification unit is further configured to: if the instruction is a cluster task modification instruction, trigger the task execution control unit. The task execution control unit is further configured to determine, in response to a trigger of the instruction classification unit when the instruction is a cluster task modification instruction, a third interaction engine corresponding to the cluster task modification instruction, and send the instruction to the third interaction engine, so that the third interaction engine modifies a third slot of a third task associated with the third interaction engine based on the instruction; and to instantly synchronize the third slot of at least one fourth task associated with a fourth interaction engine in the interaction engine cluster and/or a fourth slot corresponding to the third slot, based on the modified key knowledge data filled in the third slot of the third task.

The task execution result receiving unit is further configured to receive the third task execution result from the third interaction engine, receive at least one fourth task execution result from the at least one fourth interaction engine, provide the third task execution result and the at least one fourth task execution result to the user at the same time, and trigger the user instruction obtaining unit.

The task execution result receiving unit receives at least one task execution result from at least one interaction engine, which may be a first interaction engine and/or a second interaction engine, as a transient task execution result. And the task execution result receiving unit triggers the user instruction acquisition unit after providing the task execution result for the user. And after the user instruction acquisition unit acquires the instruction of the user again, triggering the instruction classification unit.

The instruction classification unit is further configured to: if the instruction is a task confirmation instruction, trigger the task execution control unit. The task execution control unit is further configured to send the task confirmation instruction to the at least one interaction engine in response to a trigger of the instruction classification unit when the instruction is the task confirmation instruction, so as to trigger the at least one interaction engine to convert the transient task execution result into a final task execution result.

The task execution result receiving unit is further configured to receive a final state task execution result from the at least one interaction engine, provide the final state task execution result to a user, and trigger the user instruction obtaining unit.
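As an illustration of the transient and final state results described above, the following Python sketch models the state change triggered by a task confirmation instruction. The names `ResultState`, `TaskResult` and `confirm_all` are assumptions made for this example only.

```python
from enum import Enum


class ResultState(Enum):
    TRANSIENT = "transient"
    FINAL = "final"


class TaskResult:
    def __init__(self, engine, payload):
        self.engine = engine
        self.payload = payload
        self.state = ResultState.TRANSIENT  # results start out as transient

    def confirm(self):
        """Convert this result into a final state result."""
        self.state = ResultState.FINAL
        return self


def confirm_all(results):
    """Forward a confirmation to every engine that produced a transient result."""
    return [r.confirm() for r in results]


results = [TaskResult("flight", "CA1501"), TaskResult("hotel", "Room 1203")]
final_results = confirm_all(results)
```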

The task execution control unit is further configured to determine, after the instant synchronization, whether a cluster task clarification question is received from the first interaction engine; if so, to provide the cluster task clarification question to the user and trigger the user instruction acquisition unit; otherwise, to trigger the task execution result receiving unit to receive a first task execution result from the first interaction engine and at least one second task execution result from the at least one second interaction engine.

And the user instruction acquisition unit is also used for responding to the trigger of the task execution control unit and continuously receiving the user instruction.

The instruction classification unit is further configured to: if the instruction is an instruction for the clarification question response of the cluster task, triggering the task execution control unit; the task execution control unit is further configured to send, in response to a trigger of the instruction classification unit when the instruction is an instruction for the cluster task clarification question response, the instruction to a first interaction engine, so that the first interaction engine fills a slot of the first task according to the instruction for the cluster task clarification question response, and instantaneously synchronizes a first slot and/or a second slot of at least one second task associated with at least one second interaction engine in the interaction engine cluster based on key knowledge data filled by the first interaction engine into the first slot of the first task.

And the task execution control unit is further configured to determine whether a single task clarification question is received from the first interaction engine when the voice assistant system is determined not to be in the cluster task working mode, provide the single task clarification question to a user if the single task clarification question is received, and trigger the user instruction acquisition unit, and otherwise trigger the task execution result receiving unit to receive a first task execution result from the first interaction engine.

The instruction classification unit is further configured to: if the instruction is the instruction for the single-task clarification question response, triggering the task execution control unit; the task execution control unit is further configured to send, in response to a trigger of the instruction classification unit when the instruction is an instruction for the single-task clarification question response, the instruction to a first interaction engine, so that the first interaction engine fills a slot of the first task according to the instruction for the single-task clarification question response.

Preferably, the task execution control unit is further configured to, if it is determined that the first interaction engine does not belong to any interaction engine cluster, or that the first interaction engine is not a dominant interaction engine of the interaction engine cluster, determine whether a single-task clarification question is received from the first interaction engine; if so, to provide the single-task clarification question to the user and trigger the user instruction acquisition unit; otherwise, to trigger the task execution result receiving unit to receive a first task execution result from the first interaction engine.

The instruction classification unit is further configured to: if the instruction is the instruction for the single-task clarification question response, triggering the task execution control unit; the task execution control unit is further configured to send, in response to a trigger of the instruction classification unit when an instruction is an instruction for the single-task clarification question response, the instruction to a first interaction engine, so that the first interaction engine fills a slot of the first task according to the instruction for the single-task clarification question response.

The task execution control unit is further configured to, when the first interaction engine, or the first interaction engine together with the first task, belongs to a plurality of interaction engine clusters, select one of the plurality of interaction engine clusters for use, or use the plurality of interaction engine clusters, according to a third preset rule.
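The third preset rule is not detailed in this passage. Purely as an assumed example, the following Python sketch picks the candidate cluster with the highest usage level, or returns all candidates when `use_all` is set; `select_clusters`, `use_all` and the dictionary keys are hypothetical.

```python
def select_clusters(candidates, use_all=False):
    """candidates: list of dicts with 'name' and 'usage_level' keys (assumed format)."""
    if not candidates:
        return []
    if use_all:
        return list(candidates)
    # Assumed rule: prefer the cluster with the highest usage level.
    return [max(candidates, key=lambda c: c["usage_level"])]


candidates = [
    {"name": "travel", "usage_level": 3},
    {"name": "business_trip", "usage_level": 1},
]
chosen = select_clusters(candidates)
```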

The present invention further provides a device for performing voice interaction based on an interaction engine cluster, referring to fig. 10, the device includes:

a user instruction acquisition unit for acquiring an instruction of a user;

The device also comprises a task execution control unit and a task execution result receiving unit;

the instruction classification unit is also used for judging whether the previous instruction of the working mode changing instruction is a newly-built task instruction or not, and if so, triggering a task execution control unit;

the task execution control unit is used for processing the previous instruction of the user by using the interaction engine cluster;

the task execution result receiving unit is used for simultaneously providing at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to a user.

The task execution control unit processes the previous instruction of the user by using an interaction engine cluster. Specifically, in response to the trigger of the instruction classification unit when the previous instruction is a new task instruction, the task execution control unit determines a first task based on the new task instruction, and determines a first interaction engine based on the first task, wherein the first interaction engine is associated with the first task; sends the new task instruction to the first interaction engine, so that the first interaction engine fills a slot of the first task based on the new task instruction; determines the interaction engine cluster to which the first interaction engine belongs, wherein the interaction engine cluster comprises the first interaction engine and at least one second interaction engine; activates the at least one second interaction engine; and instantly synchronizes a first slot of at least one second task associated with the at least one second interaction engine in the interaction engine cluster and/or a second slot corresponding to the first slot, based on the key knowledge data filled by the first interaction engine into the first slot of the first task.

The task execution result receiving unit is configured to receive the first task execution result from the first interaction engine, receive at least one second task execution result from the at least one second interaction engine, and provide the first task execution result and the at least one second task execution result to a user at the same time.
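The processing sequence above (determine the first engine, fill its slots, locate the cluster, activate the second engines, synchronize) can be sketched in Python as follows. This is only an illustrative sketch: `handle_new_task` and the formats of `task_to_engine`, `clusters` and `slot_map` are assumptions, and instruction parsing is taken as already done.

```python
def handle_new_task(task, slot_values, task_to_engine, clusters, slot_map):
    """Return the per-engine slot state after filling and synchronizing.

    task:           name of the first task determined from the instruction
    slot_values:    key knowledge data extracted from the instruction
    task_to_engine: task name -> engine name
    clusters:       list of sets of engine names
    slot_map:       (first engine, slot) -> list of (second engine, slot)
    """
    first = task_to_engine[task]
    state = {first: dict(slot_values)}            # first engine fills its slots
    cluster = next((c for c in clusters if first in c), {first})
    for second in sorted(cluster - {first}):      # activate the second engines
        state[second] = {}
    for slot, value in slot_values.items():       # instant synchronization
        for second, second_slot in slot_map.get((first, slot), []):
            if second in state:
                state[second][second_slot] = value
    return state


state = handle_new_task(
    "book_flight",
    {"destination": "Shanghai", "date": "2019-10-01"},
    task_to_engine={"book_flight": "flight"},
    clusters=[{"flight", "hotel", "weather"}],
    slot_map={
        ("flight", "destination"): [("hotel", "city"), ("weather", "city")],
        ("flight", "date"): [("hotel", "check_in")],
    },
)
```

Each engine in `state` would then execute its own task, and the results would be provided to the user together.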

Preferably, the apparatus further includes a storage unit, configured to store at least one interaction engine cluster, and in particular, store attribute information of at least one interaction engine cluster.

The same name elements of the device in this embodiment as those of the device shown in fig. 9 also have the function of the elements of fig. 9, and the preferred embodiment for any one element of the device shown in fig. 9 also applies to the same elements of the device in this embodiment. For example, the task execution control unit in this embodiment has the same function as that of the task execution control unit described in fig. 9, and the preferred embodiment of the task execution control unit shown in fig. 9 is also applicable to the task execution control unit in this embodiment, and is also applicable to other units with the same name, which is not described herein again.

Preferably, the task execution control unit is configured to process the previous instruction of the user by using an interaction engine cluster, specifically: in response to a trigger of the instruction classification unit when the previous instruction is a new task instruction, receiving a first message from the task execution result receiving unit, where the first message indicates whether a task execution result has been provided to the user for the new task instruction; if so, determining the interaction engine cluster to which the first interaction engine belongs, activating at least one second interaction engine, and instantly synchronizing a first slot and/or a second slot of at least one second task associated with the at least one second interaction engine in the interaction engine cluster based on the key knowledge data filled by the first interaction engine into the first slot of the first task; otherwise, determining a first task based on the new task instruction, determining a first interaction engine based on the first task, sending the new task instruction to the first interaction engine so that the first interaction engine fills a slot of the first task based on the new task instruction, determining the interaction engine cluster to which the first interaction engine belongs, activating the at least one second interaction engine, and instantly synchronizing a first slot and/or a second slot of at least one second task associated with the at least one second interaction engine in the interaction engine cluster based on the key knowledge data filled by the first interaction engine into the first slot of the first task;

the task execution result receiving unit is further configured to send the first message to the task execution control unit.

Preferably, the task execution control unit is configured to respond to a trigger of the instruction classification unit when the previous instruction is a new task instruction, determine whether a receiving time of the new task instruction meets a preset condition, and if so, use the interaction engine cluster to process the previous instruction of the user.

And the task execution control unit is also used for triggering the user instruction acquisition unit if the receiving time of the newly-built task instruction is judged not to accord with the preset condition.

The device also comprises a synchronization unit which is used for sending a synchronization message to other terminals of the user and indicating the voice assistant system on the other terminals of the user to change the working mode of the voice assistant system into a cluster task working mode.

The present invention further provides a device for performing voice interaction based on an interaction engine cluster, referring to fig. 11, the device includes:

the user instruction acquisition unit is used for acquiring an instruction of a user, wherein the instruction carries a new task instruction;

the instruction classification unit is used for judging the type of a user instruction, and triggering the task execution control unit if the instruction comprises an interaction engine cluster enabling instruction;

the task execution control unit is used for enabling the interaction engine cluster corresponding to the cluster name, and processing the new task instruction based on the interaction engine cluster corresponding to the cluster name carried in the interaction engine cluster enabling instruction according to the new task instruction carried in the instruction;

and the task execution result receiving unit is used for simultaneously providing at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to a user.

The units of the apparatus in this embodiment, which are named the same as those of the apparatus in the two previous embodiments (the embodiments corresponding to fig. 9-10), also have the functions of the units in the two previous embodiments, and the preferred embodiment of any one unit of the apparatus in the two previous embodiments is also applicable to the same units of the apparatus in this embodiment. For example, the task execution control unit in this embodiment has the same function as the task execution control units in the two embodiments, and the preferred embodiments of the task execution control units in the two embodiments are also applicable to the task execution control unit in this embodiment, and other units with the same name are also applicable, and are not described herein again.

Preferably, the instruction classifying unit is further configured to determine whether the instruction includes a new task instruction, if so, trigger the task execution control unit, and otherwise, trigger the user instruction obtaining unit.

And the task execution control unit is also used for responding to the trigger of the instruction classification unit when the instruction comprises a new task instruction, and processing the new task instruction based on the interaction engine cluster corresponding to the cluster name according to the new task instruction carried in the instruction.

The task execution control unit is further configured to, in response to triggering of the instruction classification unit when the instruction includes a new task instruction, determine a first task based on the new task instruction, determine a first interaction engine associated with the first task based on the first task, and determine whether the first interaction engine belongs to the interaction engine cluster, if so, process the new task instruction based on the interaction engine cluster corresponding to the cluster name according to the new task instruction carried in the instruction, otherwise, prompt the user that the new task instruction is not matched with the cluster name, and trigger the user instruction acquisition unit.

The task execution control unit is further configured to, in response to a trigger of the instruction classification unit when the instruction includes a new task instruction, determine a first task based on the new task instruction, determine a first interaction engine associated with the first task based on the first task, and determine whether the first interaction engine belongs to the interaction engine cluster. If the first interaction engine belongs to the interaction engine cluster, the task execution control unit further judges whether the first interaction engine is a dominant interaction engine of the interaction engine cluster; if it is the dominant interaction engine, the new task instruction carried in the instruction is processed based on the interaction engine cluster corresponding to the cluster name; if it is not the dominant interaction engine, the user is prompted that the interaction engine associated with the new task instruction is not a dominant interaction engine, and the user instruction acquisition unit is triggered. If the first interaction engine does not belong to the interaction engine cluster, the user is prompted that the new task instruction does not match the cluster name, and the user instruction acquisition unit is triggered.
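The membership and dominance checks described above amount to a three-way decision, sketched below in Python. The function name `check_cluster_request`, the return labels and the cluster dictionary layout are hypothetical and serve only to illustrate the branching.

```python
def check_cluster_request(first_engine, cluster):
    """Decide how to handle a new task instruction sent with a cluster name.

    cluster (assumed format): {"engines": set of names, "dominant": set of names}
    """
    if first_engine not in cluster["engines"]:
        return "mismatch"       # prompt: instruction does not match the cluster name
    if first_engine not in cluster["dominant"]:
        return "not_dominant"   # prompt: associated engine is not a dominant engine
    return "process"            # handle the instruction with the cluster


travel = {"engines": {"flight", "hotel"}, "dominant": {"flight"}}
decisions = [check_cluster_request(e, travel) for e in ("flight", "hotel", "music")]
```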

The task execution control unit is configured to process, according to a new task instruction carried in the instruction, the new task instruction based on the interaction engine cluster corresponding to the cluster name, and specifically includes: sending the newly-built task instruction to the first interaction engine, so that the first interaction engine fills a slot position of a first task based on the instruction; the first interaction engine is associated with a first task, and the first task is determined based on the new task instruction; activating at least one second interaction engine of the cluster of interaction engines; the at least one second interaction engine is another interaction engine in the interaction engine cluster than the first interaction engine; instantly synchronizing a first slot position of at least one second task associated with at least one second interaction engine in the interaction engine cluster and/or a second slot position corresponding to the first slot position based on key knowledge data filled by the first interaction engine to the first slot position of the first task.

And the task execution result receiving unit is used for receiving the first task execution result from the first interaction engine, receiving at least one second task execution result from the at least one second interaction engine, and simultaneously providing the first task execution result and the at least one second task execution result to a user.

And the task execution control unit is also used for judging whether a clarification question is received from the first interaction engine after the instant synchronization, if so, providing the clarification question for a user, and triggering the user instruction acquisition unit.

The instruction classification unit is further configured to: if the instruction is the instruction aiming at the clarification question response, triggering the task execution control unit; the task execution control unit is further configured to send, in response to a trigger of the instruction classification unit when the instruction is an instruction for the clarification question response, the instruction to a first interaction engine, so that the first interaction engine fills a slot of the first task according to the instruction for the clarification question response, and performs instant synchronization on a first slot and/or a second slot of at least one second task associated with at least one second interaction engine in the interaction engine cluster based on key knowledge data filled by the first interaction engine into the first slot of the first task.

The instruction classification unit is further configured to, when the user instruction is a new task instruction, determine whether a previous instruction of the new task instruction is an interaction engine cluster enabling instruction, and if so, trigger the task execution control unit.

The task execution control unit is configured to, in response to the trigger of the instruction classification unit when the previous instruction is an interaction engine cluster enabling instruction, process the new task instruction in the manner described above.

In another embodiment, the apparatus for performing voice interaction based on the interaction engine cluster of the present invention comprises all or part of the units of the apparatus in the above three embodiments.

The invention also provides a device for aggregating to form an interaction engine cluster, which comprises the following units with reference to fig. 12:

the historical task set generating unit is used for selecting at least two historical tasks to form a historical task set;

the interactive engine set generating unit is used for acquiring an interactive engine associated with each historical task in a voice assistant system of the terminal aiming at each historical task in the historical task set, and the interactive engine associated with each historical task in the historical task set forms an interactive engine set; wherein each interaction engine is associated with at least one task, the interaction engine defining at least one slot for each task associated therewith;

and the aggregation unit is used for judging whether the slots defined by at least two interaction engines in the interaction engine set aiming at the respective associated historical tasks have the same or corresponding slots, and if so, aggregating the at least two interaction engines to form an interaction engine cluster so as to facilitate the voice assistant system to use the interaction engine cluster.
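The aggregation step can be sketched in Python as a pairwise comparison of the slots each engine defines for its historical task. The functions `shared_slots` and `aggregate`, and the `corresponding` set used to declare non-identical but corresponding slots, are assumptions introduced for this illustration.

```python
def shared_slots(slots_a, slots_b, corresponding=frozenset()):
    """Return pairs of slots that are identical or declared corresponding."""
    pairs = [(s, s) for s in sorted(set(slots_a) & set(slots_b))]
    pairs += [
        (a, b)
        for a in sorted(slots_a)
        for b in sorted(slots_b)
        if (a, b) in corresponding or (b, a) in corresponding
    ]
    return pairs


def aggregate(engine_slots, corresponding=frozenset()):
    """engine_slots: engine name -> set of slot names for its historical task.

    Returns cluster attribute information, or None if no engines share slots.
    """
    names = sorted(engine_slots)
    cluster, shared = set(), []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pairs = shared_slots(engine_slots[a], engine_slots[b], corresponding)
            if pairs:
                cluster |= {a, b}
                shared += [((a, sa), (b, sb)) for sa, sb in pairs]
    if not cluster:
        return None
    return {"engines": sorted(cluster), "shared_slots": shared}


attrs = aggregate(
    {
        "flight": {"destination", "date"},
        "hotel": {"city", "date"},
        "music": {"artist"},
    },
    corresponding=frozenset({("destination", "city")}),
)
```

Here the music engine shares no slot with the others and is therefore left out of the resulting cluster.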

Preferably, the historical task set generating unit is configured to select at least two historical tasks according to a first preset rule and a historical dialogue record.

Preferably, the historical task set generating unit is configured to select at least two historical tasks executed within a preset time in the historical conversation record; and/or selecting at least two historical tasks executed in a preset number of conversation turns in the historical conversation record; and/or selecting at least two historical tasks executed adjacently in the historical conversation record. The at least two history tasks executed adjacently mean that other history tasks are not executed between the at least two history tasks.
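As one possible reading of the "preset time" criterion, the following Python sketch groups consecutive historical tasks whose execution times fall within a fixed window measured from the first task of the group. The function `task_sets_within` and the 30-minute window are illustrative assumptions.

```python
from datetime import datetime, timedelta


def task_sets_within(history, window):
    """history: list of (task name, execution time), sorted by time.

    Returns every run of at least two tasks executed within `window`
    of the first task in the run.
    """
    sets, current = [], []
    for name, when in history:
        if current and when - current[0][1] > window:
            if len(current) >= 2:
                sets.append([n for n, _ in current])
            current = []
        current.append((name, when))
    if len(current) >= 2:
        sets.append([n for n, _ in current])
    return sets


history = [
    ("book_flight", datetime(2019, 10, 1, 10, 0)),
    ("book_hotel", datetime(2019, 10, 1, 10, 5)),
    ("play_music", datetime(2019, 10, 1, 15, 0)),
]
task_sets = task_sets_within(history, timedelta(minutes=30))
```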

Preferably, the aggregating unit is further configured to determine one or more interaction engines in the interaction engine cluster as a dominant interaction engine according to the sequence of the at least two historical tasks after aggregating the at least two interaction engines to form one interaction engine cluster.

The invention also provides a device for forming the interaction engine cluster by aggregation, which comprises the following units:

the historical task set generating unit acquires a historical dialogue record of a user and the voice assistant system and judges whether the following situations exist in the historical dialogue record: interrupting a conversation related to a first task, entering the conversation related to at least one second task, and recovering the conversation related to the first task after the execution of the at least one second task is finished; if yes, forming a historical task set, wherein the historical task set comprises the first task and at least one second task;

Preferably, the historical task set generating unit determines whether the historical dialogue record contains the following situation: a dialogue related to a first task is interrupted, a dialogue related to at least one second task is entered, and the dialogue related to the first task is resumed after execution of the at least one second task is finished. Specifically: the historical dialogue record is analyzed to obtain the corresponding historical task execution timeline and historical task execution states; whether a historical task closed loop exists is then judged based on the historical task execution timeline and the historical task execution states. A historical task closed loop is a historical task sequence in which the first task is both the head node and the tail node, at least one second task is an intermediate node, the execution state of the head node is incomplete, and the execution state of at least one intermediate node is complete.
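The closed-loop test can be sketched in Python over a simplified timeline. The representation of the timeline as a list of (task, state) pairs and the function `find_closed_loop` are assumptions made for illustration; a real dialogue record would first have to be analyzed into this form.

```python
def find_closed_loop(timeline):
    """timeline: list of (task name, execution state) in execution order,
    where state is "incomplete" or "complete".

    Returns [first task, second task, ...] for the first closed loop found,
    or None if there is none.
    """
    for i, (head, head_state) in enumerate(timeline):
        if head_state != "incomplete":
            continue  # the head node must have been interrupted
        for j in range(i + 2, len(timeline)):
            if timeline[j][0] != head:
                continue
            middle = timeline[i + 1:j]
            if all(t != head for t, _ in middle) and any(
                s == "complete" for _, s in middle
            ):
                return [head] + [t for t, _ in middle]
            break  # the first recurrence of the head did not close a loop
    return None


loop = find_closed_loop([
    ("book_flight", "incomplete"),
    ("check_weather", "complete"),
    ("book_flight", "complete"),
])
```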

Preferably, the aggregating unit is further configured to determine the interaction engine associated with the first task as a dominant interaction engine after the at least two interaction engines are aggregated to form one interaction engine cluster.

The following applies to the apparatus for aggregating to form an interaction engine cluster in the above two embodiments.

The aggregation unit is configured to determine, based on an instruction of the user in the historical dialog record, whether the slot positions defined by the at least two interaction engines in the interaction engine set for the corresponding historical tasks have the same or corresponding slot positions.

The aggregation unit is configured to determine, based on an instruction of a user in the historical dialog record, key knowledge data of at least one slot defined by an interaction engine associated with a historical task corresponding to the instruction for the historical task, and determine, based on the slot and the key knowledge data, whether the slots defined by at least two interaction engines in the interaction engine set for the associated historical task have the same or corresponding slot.

The aggregation unit aggregates the at least two interaction engines to form an interaction engine cluster, which specifically comprises: recording the names of the at least two interaction engines in the attribute information of the interaction engine cluster, and recording the same or corresponding slot positions in the slot positions defined by the at least two interaction engines in the interaction engine set aiming at the associated historical tasks in the attribute information of the interaction engine cluster.

The voice assistant system uses the interaction engine cluster, specifically: when a first interaction engine in the interaction engine cluster is activated, activating at least one second interaction engine in the interaction engine cluster; instantly synchronizing at least one first slot and/or second slot of at least one second task associated with the at least one second interaction engine in the interaction engine cluster based on the key knowledge data filled into the at least one first slot of the first task associated with the first interaction engine, where the second task associated with the second interaction engine has the at least one first slot and/or a second slot corresponding to the first slot; receiving the first task execution result from the first interaction engine and at least one second task execution result from the at least one second interaction engine; and providing the first task execution result and the at least one second task execution result to the user at the same time.

The voice assistant system uses the interaction engine cluster, specifically: and when the voice assistant system is in a cluster task working mode and a first interaction engine determined based on a user instruction belongs to the interaction engine cluster, using the interaction engine cluster.

The voice assistant system uses the interaction engine cluster, specifically: and when the voice assistant system is in a cluster task working mode and a first interaction engine and a first task determined based on a user instruction belong to the interaction engine cluster, using the interaction engine cluster.

The voice assistant system uses the interaction engine cluster, specifically: and when the received instruction of the user comprises the cluster name of the interaction engine cluster, using the interaction engine cluster.

The apparatus for aggregating to form an interaction engine cluster in the two embodiments above may include all or part of the units of the apparatus for performing voice interaction based on the interaction engine cluster in the three embodiments described earlier, and details are not repeated here.

The units with the same name in the devices for aggregating to form the interaction engine cluster in the two embodiments can have the functions in the two embodiments at the same time.

The apparatus for aggregating to form an interaction engine cluster further comprises:

a user instruction acquisition unit for acquiring an instruction of a user;

and the instruction classification unit is used for judging the type of the user instruction, and triggering the historical task set generation unit if the instruction is an interactive engine cluster aggregation instruction.

The aggregation unit is further configured to provide a first interface for a user, so that the user names the interaction engine cluster through the first interface, and records a cluster name input by the user in attribute information of the interaction engine cluster.

The apparatus for forming an interaction engine cluster by aggregation further includes a storage unit, configured to store the interaction engine cluster formed by aggregation, and specifically, to store attribute information of the interaction engine cluster formed by aggregation.

The aggregation unit is further configured to mark the usage level of the interaction engine cluster as the lowest after aggregating the at least two interaction engines to form an interaction engine cluster, and record the usage level of the interaction engine cluster in the attribute information thereof.

The apparatus for aggregating to form an interaction engine cluster further comprises: the statistical unit is used for performing statistics on a plurality of historical task sets generated based on historical conversation records;

and the aggregation unit is further used for marking the use level of the interaction engine cluster according to the frequency of the historical task set in the statistical result.
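The specification does not state how frequency maps to usage level. As an assumed example only, the Python sketch below counts how often each historical task set occurs and assigns level 0 (lowest) upward according to hypothetical thresholds.

```python
from collections import Counter


def usage_levels(task_sets, thresholds=(2, 5)):
    """Map each historical task set to a usage level: 0 (lowest), 1, 2, ...

    A set's level is the number of thresholds its frequency reaches.
    The threshold values are illustrative assumptions.
    """
    counts = Counter(frozenset(s) for s in task_sets)
    return {
        key: sum(count >= t for t in thresholds)
        for key, count in counts.items()
    }


levels = usage_levels(
    [["book_flight", "book_hotel"]] * 3 + [["play_music", "set_alarm"]]
)
```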

The aggregation unit is further configured to provide a second interface for a user, so that the user modifies the usage level of the interaction engine cluster through the second interface, and records the modified usage level of the interaction engine cluster in the attribute information of the interaction engine cluster.

The interaction engine set generation unit acquires an interaction engine associated with each historical task in a voice assistant system of the terminal aiming at each historical task in the historical task set, specifically, acquires the interaction engine executing the historical task.

The interaction engine set generating unit acquires, for each historical task of the historical task set, an interaction engine associated with the historical task in a voice assistant system of the terminal, and specifically, acquires at least one interaction engine capable of executing the historical task from the interaction engines included in the voice assistant system of the terminal, and selects one interaction engine from the at least one interaction engine as the interaction engine associated with the historical task according to a second preset rule.

The aggregation unit is further configured to update the usage level of the interaction engine cluster based on the usage of the interaction engine cluster by the task execution control unit.

The apparatus for aggregating to form an interaction engine cluster further comprises: and the uploading unit is used for uploading the interaction engine cluster to a server for downloading and using by other users, and specifically, uploading the attribute information of the interaction engine cluster.

The aggregation unit is further configured to provide a third interface for a user, so that the user sets an attribute of an interaction engine in the interaction engine cluster through the third interface, for example, sets a dominant interaction engine and a subordinate interaction engine of the interaction engine cluster, and to record the attribute of the interaction engine in the attribute information of the interaction engine cluster.

The apparatus for aggregating to form an interaction engine cluster further comprises: and a synchronization unit, configured to send a synchronization message to other terminals of the user, where the synchronization message carries the interaction engine cluster obtained by aggregation, and specifically, the synchronization message carries attribute information of the interaction engine cluster.

The invention also provides a device for constructing an interaction engine cluster, which comprises the following units with reference to fig. 13:

a first acquisition unit, configured to receive attribute information of a first interaction engine cluster sent by a server, wherein the attribute information of the interaction engine cluster comprises the names of at least two interaction engines included in the interaction engine cluster;

the construction unit is used for judging whether a voice assistant system of the terminal comprises all interaction engines included in the interaction engine cluster or not, and if the voice assistant system does not comprise a fifth interaction engine in all the interaction engines, acquiring a task associated with the fifth interaction engine; judging whether the voice assistant system of the terminal comprises a sixth interaction engine associated with the task, if so, replacing a fifth interaction engine in the attribute information of the first interaction engine cluster by the sixth interaction engine to form attribute information of a second interaction engine cluster, and constructing the interaction engine cluster based on the attribute information of the second interaction engine cluster so as to facilitate the voice assistant system to use the interaction engine cluster.

The apparatus for constructing an interaction engine cluster in this embodiment may include all or part of the apparatuses for aggregating to form an interaction engine cluster and performing voice interaction based on the interaction engine cluster in the foregoing five embodiments, and details are not repeated here.

Preferably, the apparatus further comprises:

a user instruction acquisition unit for acquiring an instruction of a user;

and an instruction classification unit, configured to determine the type of the user instruction and to trigger the first acquisition unit if the instruction is a first interaction engine cluster download instruction.

Preferably, the constructing unit is further configured to construct an interaction engine cluster based on the attribute information of the first interaction engine cluster if the voice assistant system includes all interaction engines included in the interaction engine cluster, so that the voice assistant system uses the interaction engine cluster.

Preferably, the constructing unit is further configured to download the fifth interaction engine from the server if the voice assistant system of the terminal does not include a sixth interaction engine associated with the task, and construct an interaction engine cluster based on the attribute information of the first interaction engine cluster, so that the voice assistant system uses the interaction engine cluster.

Preferably, the constructing unit is further configured to, if the voice assistant system of the terminal includes a sixth interaction engine associated with the task, determine whether a slot defined by the sixth interaction engine for the task includes all slots defined by the fifth interaction engine for the task, if so, replace the fifth interaction engine in the attribute information of the first interaction engine cluster with the sixth interaction engine to form attribute information of a second interaction engine cluster, and construct the interaction engine cluster based on the attribute information of the second interaction engine cluster, so that the voice assistant system uses the interaction engine cluster.

Preferably, the apparatus further includes a synchronization unit, configured to send a synchronization message to other terminals of the user, where the synchronization message carries the interaction engine cluster obtained by the building, and specifically, the synchronization message carries the attribute information of the second interaction engine cluster.

Preferably, the building unit is further configured to download the fifth interaction engine from the server if the slot defined by the sixth interaction engine for the task does not include all slots defined by the fifth interaction engine for the task, and build an interaction engine cluster based on the attribute information of the first interaction engine cluster after the fifth interaction engine is downloaded, so that the voice assistant system uses the interaction engine cluster.
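The replace-or-download decision described in the preceding paragraphs can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the `Engine` record, the `resolve_cluster` function, and the `installed` and `catalog` dictionaries are hypothetical names, and the slot comparison is assumed to be a plain set-inclusion test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    """Hypothetical record for one interaction engine."""
    name: str
    task: str
    slots: frozenset


def resolve_cluster(cluster_engines, installed, catalog):
    """Return (engine names for the cluster, engine names to download).

    cluster_engines: names listed in the first cluster's attribute information.
    installed: dict name -> Engine for engines already on the terminal.
    catalog: dict name -> Engine for engines available on the server (assumed).
    """
    resolved, to_download = [], []
    for name in cluster_engines:
        if name in installed:
            resolved.append(name)
            continue
        missing = catalog[name]  # plays the role of the "fifth" engine
        # Look for a local "sixth" engine that serves the same task and
        # defines every slot that the missing engine defines.
        substitute = next(
            (e for e in installed.values()
             if e.task == missing.task and e.slots >= missing.slots),
            None,
        )
        if substitute is not None:
            resolved.append(substitute.name)  # second-cluster attributes
        else:
            resolved.append(name)             # keep first-cluster attributes
            to_download.append(name)
    return resolved, to_download
```

When `to_download` is empty and at least one engine was substituted, the returned list corresponds to the attribute information of the second interaction engine cluster; otherwise the cluster can only be constructed once the listed engines have been downloaded.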

Preferably, the synchronization unit is further configured to send a synchronization message to another terminal of the user, where the synchronization message carries the interaction engine cluster obtained by the constructing, and specifically, the synchronization message carries the attribute information of the first interaction engine cluster.

Preferably, the synchronization message carries download indication information in addition to the first interaction engine cluster attribute information, where the download indication information includes a name of a fifth interaction engine and indicates the other terminals to download the fifth interaction engine.
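One possible shape for such a synchronization message is sketched below. The JSON encoding and the field names `cluster_attributes` and `download_indication` are assumptions made for illustration; the description does not specify a wire format.

```python
import json


def build_sync_message(cluster_attributes, engines_to_download=()):
    """Serialize a synchronization message for the user's other terminals.

    cluster_attributes: attribute information of the interaction engine cluster.
    engines_to_download: names of engines the other terminals should download
    (hypothetical representation of the download indication information).
    """
    message = {"cluster_attributes": cluster_attributes}
    if engines_to_download:
        message["download_indication"] = {"engines": list(engines_to_download)}
    return json.dumps(message, sort_keys=True)
```

A receiving terminal could parse the message, download any engines named in the download indication, and then construct the cluster from the carried attribute information.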

Preferably, the constructing unit downloads the fifth interaction engine from the server specifically by: immediately downloading the fifth interaction engine from the server; or adding the fifth interaction engine to a download waiting list, periodically determining whether the terminal is connected to a Wi-Fi network, and, if the terminal is connected to a Wi-Fi network, executing the download according to the download waiting list.

Preferably, the constructing unit determines whether the download of the fifth interaction engine is complete, and if so, constructs an interaction engine cluster based on the attribute information of the first interaction engine cluster.

Preferably, the constructing unit determines that the download of the fifth interaction engine is complete by determining that the download waiting list is empty.
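The deferred download option and the empty-list completion check can be illustrated with the sketch below. The class name `DeferredDownloader` and the two injected callables (a network check and a download routine) are hypothetical; how the terminal actually detects Wi-Fi or fetches an engine is outside the scope of this sketch.

```python
class DeferredDownloader:
    """Hypothetical download waiting list that is drained only on Wi-Fi."""

    def __init__(self, is_on_wifi, download):
        self._is_on_wifi = is_on_wifi  # callable returning bool (assumed)
        self._download = download      # callable taking an engine name (assumed)
        self.waiting = []

    def enqueue(self, engine_name):
        """Add an engine to the waiting list, ignoring duplicates."""
        if engine_name not in self.waiting:
            self.waiting.append(engine_name)

    def poll(self):
        """Run once per period; return True when the waiting list is empty."""
        if self._is_on_wifi():
            while self.waiting:
                # Remove an entry only after its download call has returned.
                self._download(self.waiting[0])
                self.waiting.pop(0)
        return not self.waiting
```

A caller would invoke `poll()` on a timer and construct the interaction engine cluster the first time it returns `True`.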

The invention also provides a device for constructing an interaction engine cluster, which comprises the following units with reference to fig. 14:

a second obtaining unit, configured to receive first interaction engine cluster configuration information sent by a server, where the first interaction engine cluster configuration information includes a task associated with an interaction engine included in a first interaction engine cluster;

a constructing unit, configured to, for each task included in the first interaction engine cluster configuration information, select an interaction engine associated with the task from a voice assistant system of the terminal, and place the interaction engine into an interaction engine set, where after the interaction engine set includes the interaction engine associated with each task, the interaction engine cluster is constructed based on the interaction engines in the interaction engine set, so that the voice assistant system uses the interaction engine cluster.

The apparatus for constructing an interaction engine cluster in this embodiment may include all or part of the apparatuses for aggregating to form an interaction engine cluster, the apparatus for performing voice interaction based on the interaction engine cluster, and the apparatus for constructing an interaction engine cluster in the foregoing six embodiments, and details are not repeated here.

Preferably, the apparatus further comprises:

a user instruction acquisition unit for acquiring an instruction of a user;

and an instruction classification unit, configured to determine the type of the user instruction and to trigger the second acquisition unit if the instruction is a second interaction engine cluster download instruction.

Preferably, the construction unit is configured to: for each task included in the first interaction engine cluster configuration information, determine whether the voice assistant system of the terminal includes an interaction engine associated with the task, and if so, put the interaction engine into an interaction engine set; otherwise, acquire a first optional interaction engine list of the task, provide the first optional interaction engine list to the user, and trigger the instruction classification unit;

the instruction classifying unit is used for triggering the constructing unit if the instruction is a first downloading instruction; the first downloading instruction carries an interaction engine selected by a user for the task from the first selectable interaction engine list;

the constructing unit is used for responding to the trigger of the instruction classifying unit when the instruction is a first downloading instruction, downloading the interaction engines indicated in the first downloading instruction and putting the interaction engines into the interaction engine set.

The constructing unit obtains the first optional interaction engine list of the task, and specifically may send a first optional interaction engine list obtaining request to the server, where the request carries the task, and receives the first optional interaction engine list returned by the server.

The constructing unit downloads the interaction engine indicated in the first downloading instruction specifically by: immediately downloading the interaction engine indicated in the first downloading instruction from the server; or adding that interaction engine to the download waiting list, periodically determining whether the terminal is connected to a Wi-Fi network, and, if the terminal is connected to a Wi-Fi network, executing the download according to the download waiting list.

The construction unit determines whether the number of interaction engines included in the interaction engine set is equal to the number of interaction engines indicated by the first interaction engine cluster configuration information, and if so, may determine that the interaction engine set includes the interaction engine associated with each task.

The building unit is further configured to select one of the interaction engines according to a fourth preset rule and place the selected interaction engine into the interaction engine set if a plurality of interaction engines are associated with the task in the voice assistant system of the terminal.
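The per-task selection and the count-based completeness check can be sketched as follows. The function names are hypothetical, each task is assumed to appear only once in the configuration information, and `min` (alphabetically first name) merely stands in for the fourth preset rule, which is not defined in this passage.

```python
def build_engine_set(tasks, installed_by_task, choose=min):
    """Pick one installed engine per task.

    tasks: tasks listed in the cluster configuration information.
    installed_by_task: dict task -> list of locally installed engine names.
    choose: stand-in for the "fourth preset rule" (assumption).
    Returns (engine_set, unresolved_tasks).
    """
    engine_set, unresolved = {}, []
    for task in tasks:
        candidates = installed_by_task.get(task, [])
        if candidates:
            engine_set[task] = choose(candidates)
        else:
            # No local engine: the user must pick one from an optional list.
            unresolved.append(task)
    return engine_set, unresolved


def is_complete(engine_set, tasks):
    """The set is complete when it holds one engine per configured task."""
    return len(engine_set) == len(tasks)
```

Tasks returned in `unresolved_tasks` correspond to the cases in which an optional interaction engine list is presented to the user and an engine is downloaded afterwards.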

Preferably, the construction unit is further configured to: for each task included in the first interaction engine cluster configuration information, if the interaction engine cluster configuration information includes a second selectable interaction engine list of the task, judging whether a voice assistant system of the terminal includes an interaction engine in the second selectable interaction engine list, and if the voice assistant system includes an interaction engine in the second selectable interaction engine list, putting the interaction engine into an interaction engine set; if the interaction engines in the second selectable interaction engine list are not included, the second selectable interaction engine list is provided for the user, and the instruction classification unit is triggered; if the interaction engine cluster configuration information does not comprise a second optional interaction engine list of the task, judging whether a voice assistant system of the terminal comprises an interaction engine associated with the task, if so, putting the interaction engine into an interaction engine set, otherwise, providing the first optional interaction engine list for a user, and triggering the instruction classification unit;

the instruction classification unit is used for triggering the construction unit if the instruction is a first downloading instruction or a second downloading instruction; the first downloading instruction carries an interaction engine selected by a user for the task from the first selectable interaction engine list; the second downloading instruction carries an interaction engine selected by the user for the task from the second selectable interaction engine list;

the constructing unit is used for responding to the trigger of the instruction classifying unit when the instruction is a first downloading instruction, downloading the interaction engines indicated in the first downloading instruction and putting the interaction engines into an interaction engine set; and in response to the trigger of the instruction classification unit when the instruction is a second downloading instruction, downloading the interaction engines indicated in the second downloading instruction, and putting the interaction engines into an interaction engine set.
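The branching between the first and second selectable interaction engine lists can be summarized in a small decision function. This is a sketch under stated assumptions: `plan_task` and its return values are hypothetical, and when several installed engines qualify the first one found is used, whereas the description leaves that choice to a preset rule.

```python
def plan_task(task, installed, second_list=None):
    """Decide how to obtain an engine for one task.

    installed: dict engine name -> task for engines on the terminal.
    second_list: the task's second selectable engine list from the cluster
    configuration information, or None if the configuration has none.
    Returns ("use", engine_name) or ("ask_user", "first" | "second").
    """
    if second_list is not None:
        for name in second_list:
            if name in installed:
                return ("use", name)
        # None of the listed engines is installed: offer the second list.
        return ("ask_user", "second")
    for name, engine_task in installed.items():
        if engine_task == task:
            return ("use", name)
    # No local engine for the task: offer the first list from the server.
    return ("ask_user", "first")
```

An `("ask_user", ...)` result corresponds to triggering the instruction classification unit and waiting for a first or second downloading instruction from the user.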

The device further comprises a synchronization unit, configured to send a synchronization message to other terminals of the user, where the synchronization message carries the interaction engine cluster obtained by the construction; specifically, the synchronization message carries attribute information of the interaction engine cluster obtained by the construction.

The seven devices in the seven embodiments can be freely combined, and any one of the seven devices can comprise all or part of the units in the remaining six devices.

The invention also provides a computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program is executable on a processor, and when executed implements the method as described above.

The present invention also provides a terminal characterized by comprising the apparatus as described above or the voice assistant system as described above.

The voice assistant system or the terminal may include all or part of the units of the apparatus for aggregating to form the interaction engine cluster, the apparatus for performing voice interaction based on the interaction engine cluster, and the apparatus for constructing the interaction engine cluster in the foregoing seven embodiments.

Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. The computer-readable storage medium may include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), a flash memory, an erasable programmable read-only memory (EPROM), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

Computer program code for carrying out operations of the present invention may be written in one or more programming languages, or a combination thereof.

The above description is only an example for the convenience of understanding the present invention, and is not intended to limit the scope of the present invention. In the specific implementation, a person skilled in the art may change, add, or reduce the components of the apparatus according to the actual situation, and may change, add, reduce, or change the order of the steps of the method according to the actual situation without affecting the functions implemented by the method.

While embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents, and all changes that come within the meaning and range of equivalency of the claims are to be embraced within their scope.

Claims

1. A method for voice interaction based on an interaction engine cluster, which is used for a terminal with a voice assistant system, and is characterized in that the method comprises the following steps:

step 201, acquiring a user instruction;

the voice assistant system processes a task instruction of a user by using an interaction engine cluster in the cluster task working mode, and simultaneously provides at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to the user;

after step 211, the method further comprises:

step 213a, processing the previous instruction of the user by using the interaction engine cluster, and simultaneously providing at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to the user.

2. The method of claim 1 wherein in step 212, if the previous instruction is a new task instruction, then performing step 213: judging whether the receiving time of the new task instruction meets a preset condition, if so, executing the step 213 a;

and the preset condition is that the difference value between the receiving time of the newly-built task instruction and the receiving time of the working mode changing instruction is smaller than a preset value.

3. The method according to claim 1, wherein step 213a specifically comprises:

step 205b, activating said at least one second interaction engine;

4. The method of claim 1, wherein a synchronization message is sent to the other terminal of the user for instructing a voice assistant system on the other terminal of the user to modify its operating mode to a cluster task operating mode.

5. An apparatus for performing voice interaction based on an interaction engine cluster, the apparatus comprising:

a user instruction acquisition unit for acquiring an instruction of a user;

the device uses an interaction engine cluster to process a task instruction of a user in the cluster task working mode, and simultaneously provides at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to the user;

the task execution result receiving unit is configured to provide at least two task execution results provided by at least two interaction engines included in the interaction engine cluster to a user at the same time.

6. The apparatus according to claim 5, wherein the task execution control unit is configured to determine whether the receiving time of the new task instruction meets a preset condition in response to the trigger of the instruction classification unit when the previous instruction is the new task instruction, and if so, use the interaction engine cluster to process the previous instruction of the user;

7. The apparatus according to claim 5, wherein the task execution control unit is configured to use an interaction engine cluster to process the previous instruction of the user, specifically: determining a first task based on the new task instruction, determining a first interaction engine based on the first task, sending the new task instruction to the first interaction engine, so that the first interaction engine fills a slot of the first task based on the new task instruction, determining an interaction engine cluster to which the first interaction engine belongs, activating the at least one second interaction engine, and performing instant synchronization on a first slot and/or a second slot of at least one second task associated with the at least one second interaction engine in the interaction engine cluster based on key knowledge data filled by the first interaction engine to the first slot of the first task;

8. The apparatus of claim 5, further comprising a synchronization unit configured to send a synchronization message to the other terminal of the user for instructing a voice assistant system on the other terminal of the user to modify its operating mode to a cluster task operating mode.

9. A computer arrangement, characterized in that the computer arrangement comprises a processor and a memory, in which a computer program is stored which is executable on the processor, which computer program, when being executed by the processor, carries out the method according to any one of claims 1 to 4.

10. A computer-readable storage medium, in which a computer program that is executable on a processor is stored, which computer program, when being executed, carries out the method according to any one of claims 1 to 4.

11. A voice assistant system comprising an apparatus as claimed in any one of claims 5 to 8.

12. A terminal comprising an apparatus as claimed in any one of claims 5 to 8, or comprising a voice assistant system as claimed in claim 11.