
CN115914466B - Voice interaction method and device based on voice stream robot and storage medium

Info

Publication number: CN115914466B
Application number: CN202310017158.9A
Authority: CN
Inventors: 刘威; 余文虎; 黄明星; 周晓波; 沈鹏
Original assignee: Beijing Shuidi Technology Group Co ltd
Current assignee: Beijing Shuidi Technology Group Co ltd
Priority date: 2023-01-06
Filing date: 2023-01-06
Publication date: 2023-06-23
Anticipated expiration: 2043-01-06
Also published as: CN115914466A

Abstract

The application discloses a voice interaction method and device based on a voice stream robot, a storage medium and computer equipment, wherein the method comprises the following steps: receiving a target voice stream of a user through a first component of the voice stream robot, and forwarding the target voice stream to a target second component of the voice stream robot; sending interaction notification information to a corresponding third component through the target second component, and distributing a target virtual voice stream robot for the user in a line multiplexing connection pool based on the interaction notification information through the third component; and determining target response voice of the target voice stream by the target virtual voice stream robot so as to perform voice interaction with the user. According to the method and the device, when the target virtual voice stream robot is allocated for the user, the line multiplexing connection pool can be fully utilized, so that on one hand, the user connection efficiency can be greatly improved, and on the other hand, the user experience can be effectively improved.

Description

Voice interaction method and device based on voice stream robot and storage medium

Technical Field

The application relates to the technical field of the internet, and in particular to a voice interaction method and device based on a voice stream robot, a storage medium, and computer equipment.

Background

With the continuous development of intelligent voice customer service, voice stream robots are also developing continuously. A voice stream robot combines several intelligent interaction technologies, such as voice recognition and semantic recognition, so that it can accurately understand the user's intention and respond accordingly. The emergence of the voice stream robot is therefore significant for realizing intelligent voice customer service.

However, in the prior art, connecting a user to the voice stream robot usually takes several hundred milliseconds or more, so connection efficiency is low and the user experience is poor.

Disclosure of Invention

In view of this, the present application provides a voice interaction method and device based on a voice stream robot, a storage medium, and a computer device. When a target virtual voice stream robot is allocated to a user, the line multiplexing connection pool can be fully utilized, and the user connection time is reduced from several hundred milliseconds to several milliseconds, so that, on the one hand, user connection efficiency can be greatly improved and, on the other hand, the user experience can be effectively improved.

According to one aspect of the present application, there is provided a voice interaction method based on a voice streaming robot, including:

receiving a target voice stream of a user through a first component of the voice stream robot, and forwarding the target voice stream to a target second component of the voice stream robot;

sending interaction notification information to a corresponding third component through the target second component, and distributing a target virtual voice stream robot for the user in a line multiplexing connection pool based on the interaction notification information through the third component;

and determining target response voice of the target voice stream by the target virtual voice stream robot so as to perform voice interaction with the user.

According to another aspect of the present application, there is provided a voice interaction device based on a voice streaming robot, including:

the voice stream receiving module is used for receiving a target voice stream of a user through a first component of the voice stream robot and forwarding the target voice stream to a target second component of the voice stream robot;

the robot allocation module is used for sending interaction notification information to a corresponding third component through the target second component, and allocating a target virtual voice stream robot to the user in the line multiplexing connection pool based on the interaction notification information through the third component;

and the response voice determining module is used for determining target response voice of the target voice stream through the target virtual voice stream robot so as to perform voice interaction with the user.

According to still another aspect of the present application, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described voice interaction method based on a voice streaming robot.

According to still another aspect of the present application, there is provided a computer device including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, the processor implementing the voice interaction method based on the voice stream robot when executing the program.

By means of the above technical solution, in the voice interaction method and device based on the voice stream robot, the storage medium, and the computer device provided by the application, the target voice stream of the user can be forwarded to the first component of the voice stream robot through a preset protocol after the user is connected. After receiving the target voice stream, the first component forwards it to the target second component. The target second component then generates interaction notification information and sends it to the corresponding third component, notifying the third component that a user needs access. After receiving the interaction notification information, the third component finds a target virtual voice stream robot to serve the user in the line multiplexing connection pool. Once the target virtual voice stream robot has been determined, the target voice stream in the target second component can be sent to it, so that the target virtual voice stream robot determines the target response voice corresponding to the target voice stream, and the target response voice is then used for voice interaction with the user. Because the line multiplexing connection pool is fully utilized when the target virtual voice stream robot is allocated to the user, the user connection time is reduced from hundreds of milliseconds to several milliseconds, which greatly improves user connection efficiency and effectively improves the user experience.

The foregoing is only an overview of the technical solution of the present application. So that the technical means of the application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the application are more readily apparent, specific embodiments of the application are set out below.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:

Fig. 1 shows a schematic flow chart of a voice interaction method based on a voice stream robot according to an embodiment of the present application;

Fig. 2 shows a schematic structural diagram of a voice stream robot according to an embodiment of the present application;

Fig. 3 shows a schematic flow chart of another voice interaction method based on a voice stream robot according to an embodiment of the present application;

Fig. 4 shows a schematic structural diagram of a voice interaction device based on a voice stream robot according to an embodiment of the present application.

Detailed Description

The present application will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other.

In this embodiment, a voice interaction method based on a voice streaming robot is provided, as shown in fig. 1, and the method includes:

step 101, receiving a target voice stream of a user through a first component of the voice stream robot, and forwarding the target voice stream to a target second component of the voice stream robot;

The voice interaction method provided by the embodiment of the application may be implemented based on a voice stream robot. As shown in fig. 2, the voice stream robot may include a first component, a plurality of second components, and a plurality of third components, where each second component corresponds to one third component. A third-party platform may be responsible for answering the user's incoming call or placing an outbound call to the user. When a user is connected to the third-party platform, the third-party platform can forward the target voice stream of the user to the first component of the voice stream robot through a preset protocol. The first component, which may be an OpenSIPS component, receives and forwards the target voice stream, and the preset protocol may be SIP+RTP. After receiving the target voice stream, the first component forwards it to the target second component. Here, the second component may be a FreeSWITCH component.

Step 102, sending interaction notification information to a corresponding third component through the target second component, and distributing a target virtual voice stream robot to the user in a line multiplexing connection pool through the third component based on the interaction notification information;

In this embodiment, after the target second component receives the target voice stream sent by the first component, it may generate interaction notification information and send it to the corresponding third component, notifying the third component that a user needs access. After receiving the interaction notification information, the third component may find a target virtual voice stream robot to serve the user in the line multiplexing connection pool. Here, the third component may be an ESL component. The line multiplexing connection pool may hold a plurality of lines and a plurality of virtual voice stream robots. Whenever a new user accesses, one virtual voice stream robot is selected from the pool as the target virtual voice stream robot, one line is allocated to it, and that line is removed from the pool. After the user hangs up, the target virtual voice stream robot is reset and its line is returned to the pool, so that lines and virtual voice stream robots are reused efficiently.

And step 103, determining target response voice of the target voice stream by the target virtual voice stream robot so as to perform voice interaction with the user.

In this embodiment, after the target virtual voice stream robot is determined from the line multiplexing connection pool, the target voice stream in the target second component may be sent to the target virtual voice stream robot, so that the target virtual voice stream robot determines the target response voice corresponding to the target voice stream. The target response voice can then be used for voice interaction with the user.

By applying the technical solution of this embodiment, the target voice stream of the user can be forwarded to the first component of the voice stream robot through a preset protocol after the user is connected. After receiving the target voice stream, the first component forwards it to the target second component. The target second component then generates interaction notification information and sends it to the corresponding third component, notifying the third component that a user needs access. After receiving the interaction notification information, the third component finds a target virtual voice stream robot to serve the user in the line multiplexing connection pool. Once the target virtual voice stream robot has been determined, the target voice stream in the target second component can be sent to it, so that the target virtual voice stream robot determines the target response voice corresponding to the target voice stream, and the target response voice is then used for voice interaction with the user. Because the line multiplexing connection pool is fully utilized when the target virtual voice stream robot is allocated to the user, the user connection time is reduced from hundreds of milliseconds to several milliseconds, which greatly improves user connection efficiency and effectively improves the user experience.

Further, as a refinement and extension of the foregoing embodiment, in order to fully describe a specific implementation procedure of the embodiment, another voice interaction method based on a voice stream robot is provided, as shown in fig. 3, where the method includes:

step 201, counting the number of user processes corresponding to each second component, and taking the second component with the minimum number of user processes as the target second component;

In this embodiment, the voice stream robot includes a plurality of second components, each corresponding to one third component. To cope with many users accessing at the same time, the voice streams of different users are distributed across different second components, and the third component corresponding to each second component determines the target virtual voice stream robot, which greatly relieves the pressure caused by many simultaneous accesses. When a user accesses, the second component that will serve the user is determined first: the number of user processes corresponding to each second component is counted, and the second component with the smallest number of user processes is taken as the target second component for the user.
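The selection rule of step 201 can be sketched as follows. This is only a minimal illustration in Python, not the implementation described by the application: the names `SecondComponent`, `active_users`, and `pick_target_component` are hypothetical, and breaking ties by list order is an assumption the source does not specify.

```python
from dataclasses import dataclass


@dataclass
class SecondComponent:
    """Hypothetical stand-in for one second component."""
    name: str
    active_users: int = 0  # number of user processes currently handled


def pick_target_component(components: list[SecondComponent]) -> SecondComponent:
    """Return the component with the fewest user processes.

    Ties go to the earliest component in the list (an assumption).
    """
    if not components:
        raise ValueError("no second components available")
    return min(components, key=lambda c: c.active_users)


components = [
    SecondComponent("second-1", active_users=12),
    SecondComponent("second-2", active_users=4),
    SecondComponent("second-3", active_users=9),
]
target = pick_target_component(components)
target.active_users += 1  # the new user now counts against this component
print(target.name)
```

A real deployment would need the per-component counts to be updated atomically as users connect and hang up; the sketch ignores concurrency.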

Step 202, receiving a target voice stream of a user through a first component of the voice stream robot, and forwarding the target voice stream to a target second component of the voice stream robot;

In this embodiment, the third-party platform may be responsible for answering the user's incoming call or placing an outbound call to the user. When a user is connected to the third-party platform, the third-party platform can forward the target voice stream of the user to the first component of the voice stream robot through a preset protocol. The first component, which may be an OpenSIPS component, receives and forwards the target voice stream, and the preset protocol may be SIP+RTP. After receiving the target voice stream, the first component forwards it to the target second component. Here, the second component may be a FreeSWITCH component.

Step 203, sending interaction notification information to a corresponding third component through the target second component, and determining an idle line and an idle virtual voice stream robot from the line multiplexing connection pool through the third component based on the interaction notification information;

In this embodiment, after the target second component receives the target voice stream, it may generate the interaction notification information and send it to the third component. After receiving the interaction notification information, the third component may determine which of the lines and virtual voice stream robots currently in the line multiplexing connection pool are idle. An idle line is a line that is not currently in use, and an idle virtual voice stream robot is a virtual voice stream robot that is not currently assigned to a user.

Step 204, determining a target virtual voice stream robot from the idle virtual voice stream robots, allocating any idle line to the target virtual voice stream robot, and removing that idle line from the line multiplexing connection pool;

In this embodiment, any one of the idle virtual voice stream robots may be taken as the target virtual voice stream robot, and one of the idle lines may be allocated to it. The selected idle line is then removed from the line multiplexing connection pool, so that it cannot be selected again when a new user subsequently accesses.
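The allocation in steps 203 and 204, together with the later return of the line after hang-up, can be modelled roughly as below. This is a toy sketch under stated assumptions: `LineMultiplexingPool`, `Session`, `acquire`, and `release` are hypothetical names, the first-in-first-out choice stands in for the source's "any" idle robot and line, and thread safety is not addressed.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Session:
    """A robot/line pair handed to one user (hypothetical type)."""
    robot_id: str
    line_id: str


@dataclass
class LineMultiplexingPool:
    """Toy model of the line multiplexing connection pool."""
    idle_lines: deque = field(default_factory=deque)
    idle_robots: deque = field(default_factory=deque)

    def acquire(self) -> Session:
        """Pick an idle robot and an idle line and take both out of the pool."""
        if not self.idle_robots or not self.idle_lines:
            raise RuntimeError("no idle robot or line available")
        return Session(self.idle_robots.popleft(), self.idle_lines.popleft())

    def release(self, session: Session) -> None:
        """After hang-up, put the line and the (reset) robot back."""
        self.idle_lines.append(session.line_id)
        self.idle_robots.append(session.robot_id)


pool = LineMultiplexingPool(
    idle_lines=deque(["line-1", "line-2"]),
    idle_robots=deque(["robot-1", "robot-2"]),
)
session = pool.acquire()
remaining = len(pool.idle_lines)  # one line is now out of the pool
pool.release(session)
print(session, remaining, len(pool.idle_lines))
```

Because lines and robots are created ahead of time and only handed out here, serving a new user costs a queue operation rather than a fresh connection setup, which is the effect the application attributes to the pool.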

Step 205, sending, by the third component, a bridging instruction to the target second component, so that the target voice stream is forwarded from the target second component to the target virtual voice stream robot;

in this embodiment, after the target virtual voice stream robot is determined, the third component may send a bridging instruction to the target second component, so that the target voice stream of the user may be sent to the target virtual voice stream robot.

Step 206, sending the target voice stream to a voice recognition unit through the target virtual voice stream robot so that the voice recognition unit converts the target voice stream into a target text;

In this embodiment, after the target virtual voice stream robot receives the target voice stream, it may send the target voice stream to a voice recognition unit, where an ASR (automatic speech recognition) component converts the target voice stream into target text. The input of the ASR component is speech and its output is text. The voice recognition unit may be a unit independent of the virtual voice stream robot.

Step 207, sending the target text returned by the voice recognition unit to a robot engine, so that the robot engine determines a target response script according to the target text and returns the target response script to the target virtual voice stream robot;

In this embodiment, after the voice recognition unit converts the target voice stream into the target text, the target text may be sent to the robot engine. The robot engine may determine the target response script based on the received target text. For example, if the target text is "I want to know the specific content of insurance A", the target response script is the specific content of insurance A as queried by the robot engine. After the robot engine determines the target response script, it may return the target response script to the target virtual voice stream robot. The target response script is in text form.

Step 208, sending the target response script to the voice recognition unit through the target virtual voice stream robot, so that the voice recognition unit converts the target response script into target response voice for voice interaction with the user.

In this embodiment, after the target virtual voice stream robot receives the target response script sent by the robot engine, it may forward the target response script to the voice recognition unit, where a TTS (text-to-speech) component converts the target response script into target response voice. The input of the TTS component is text and its output is speech. The target response voice can then be used for voice interaction with the user.
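Steps 206 to 208 amount to a speech-to-text, reply lookup, text-to-speech round trip coordinated by the virtual voice stream robot. The sketch below shows only that control flow. Everything in it is hypothetical: `respond` and the three `fake_*` functions are placeholders for the ASR component, the robot engine, and the TTS component, and the byte/string conversions merely simulate audio.

```python
from typing import Callable

# Hypothetical interfaces for the three services the robot talks to.
Asr = Callable[[bytes], str]    # voice stream -> target text
Engine = Callable[[str], str]   # target text -> reply text
Tts = Callable[[str], bytes]    # reply text -> response voice


def respond(voice: bytes, asr: Asr, engine: Engine, tts: Tts) -> bytes:
    """Run one turn: recognize, pick a reply, synthesize."""
    text = asr(voice)       # step 206
    reply = engine(text)    # step 207
    return tts(reply)       # step 208


def fake_asr(voice: bytes) -> str:
    # Stand-in: pretend the audio bytes are already UTF-8 text.
    return voice.decode("utf-8")


def fake_engine(text: str) -> str:
    # Stand-in: a single keyword rule instead of a real dialogue engine.
    if "insurance A" in text:
        return "Here are the details of insurance A."
    return "Sorry, could you repeat that?"


def fake_tts(text: str) -> bytes:
    # Stand-in: pretend UTF-8 bytes are synthesized audio.
    return text.encode("utf-8")


audio_out = respond(b"I want to know about insurance A",
                    fake_asr, fake_engine, fake_tts)
print(audio_out)
```

Passing the three services in as callables keeps the robot itself independent of any particular recognition or synthesis backend, which matches the source's remark that the voice recognition unit may be independent of the virtual voice stream robot.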

In an embodiment of the present application, optionally, after "sending the target text returned by the voice recognition unit to a robot engine" in step 207, the method further includes: identifying a target intention corresponding to the target text; when the target intention matches a first preset intention, transferring the user to a human customer service agent; and when the target intention matches a second preset intention, returning a target ending script.

In this embodiment, after the target text is sent to the robot engine, the robot engine may determine the target intention of the user from the target text. It may then be determined whether the target intention belongs to the first preset intention or the second preset intention. The first preset intention may be a purchase intention of the user, and the second preset intention may be a non-purchase intention of the user. If the target intention of the user belongs to the first preset intention, then, to serve the user better and further improve the user experience, the user can be transferred to a human customer service agent, who provides the next stage of service so that the user is converted directly into a stable customer. If the target intention of the user does not belong to the first preset intention but belongs to the second preset intention, the target ending script can be returned to the user directly; that is, the conversation is ended politely.
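The branching on preset intentions can be written as a small routing function. The sketch is illustrative only: the `Intent` and `Action` enums and the `route` function are hypothetical, intent recognition itself is not shown, and the `CONTINUE` fallback for intents that match neither preset is an assumption, since the source does not say what happens in that case.

```python
from enum import Enum


class Intent(Enum):
    PURCHASE = "purchase"        # example of the first preset intention
    NO_PURCHASE = "no_purchase"  # example of the second preset intention
    OTHER = "other"              # anything else (not covered by the source)


class Action(Enum):
    TRANSFER_TO_HUMAN = "transfer_to_human"
    PLAY_ENDING = "play_ending"
    CONTINUE = "continue"        # assumed default: keep the dialogue going


def route(intent: Intent) -> Action:
    """Map a recognized intent to the next action of the robot."""
    if intent is Intent.PURCHASE:
        return Action.TRANSFER_TO_HUMAN
    if intent is Intent.NO_PURCHASE:
        return Action.PLAY_ENDING
    return Action.CONTINUE


for intent in Intent:
    print(intent.name, "->", route(intent).name)
```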

In an embodiment of the present application, optionally, after "returning a target ending script", the method further includes: feeding back target ending voice corresponding to the target ending script to the user through the target virtual voice stream robot, and calling a preset hang-up interface after playback is finished to hang up the user; and returning the idle line to the line multiplexing connection pool.

In this embodiment, after the robot engine determines the target ending script, the target virtual voice stream robot may send the target ending script to the voice recognition unit, where the TTS component converts it into target ending voice. The target ending voice is then fed back to the target virtual voice stream robot, which forwards it to the third component; the third component forwards it to the second component, the second component to the first component, the first component to the third-party platform, and the third-party platform finally forwards it to the user's client. When it is detected that the client has finished playing the target ending voice, the preset hang-up interface can be called to hang up the connection with the user directly. After the user is hung up, the line and the target virtual voice stream robot that served the user return to the idle state. The line can then be returned to the line multiplexing connection pool, and the target virtual voice stream robot can be added back to the queue of idle virtual voice stream robots, which effectively improves the utilization of lines and virtual voice stream robots.

In an embodiment of the present application, optionally, before "receiving, by the first component of the voice stream robot, the target voice stream of the user", the method further includes: acquiring historical user access data within a preset time, determining the number of user accesses at each moment based on the historical user access data, and sorting the numbers of user accesses at each moment; and determining the number of virtual voice stream robots in the line multiplexing connection pool according to the sorted numbers of user accesses.

In this embodiment, to ensure as far as possible that each user can be given a virtual voice stream robot directly during peak access periods, the number of virtual voice stream robots in the line multiplexing connection pool can be determined from historical user access data. First, historical user access data within a preset time is obtained, and the number of users accessing simultaneously at each moment within the preset time is counted. For example, if 13 users accessed at 12:25, the number of user accesses at that moment is 13. The number of virtual voice stream robots in the line multiplexing connection pool can then be determined from the largest value after sorting. For example, if the largest number of simultaneous accesses in the historical user access data within the preset time is 25, at 10:38, the number of virtual voice stream robots in the line multiplexing connection pool can be set to 25 directly. In addition, because many users do not access at every moment, to avoid wasting resources an average can be taken over a preset number of the top-ranked simultaneous access counts, and the number of virtual voice stream robots in the line multiplexing connection pool can be determined from that average.
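Both sizing rules, the peak value and the average of the top-ranked values, fit in a few lines. The sketch below is an illustration under assumptions: `concurrent_counts` and `pool_size` are hypothetical names, access records are simplified to one "HH:MM" string per access, and rounding the average up is a choice the source does not specify.

```python
import math
from collections import Counter
from typing import Optional


def concurrent_counts(access_minutes: list[str]) -> list[int]:
    """Count accesses per time slot and sort the counts in descending order."""
    return sorted(Counter(access_minutes).values(), reverse=True)


def pool_size(counts_desc: list[int], top_k: Optional[int] = None) -> int:
    """Size the pool from sorted counts.

    top_k=None uses the historical peak; otherwise the mean of the
    top_k largest counts, rounded up (rounding is an assumption).
    """
    if not counts_desc:
        return 0
    if top_k is None:
        return counts_desc[0]
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    top = counts_desc[:top_k]
    return math.ceil(sum(top) / len(top))


# Toy history matching the figures in the text: 13 accesses at 12:25,
# 25 at 10:38, plus a quiet slot added for illustration.
history = ["12:25"] * 13 + ["10:38"] * 25 + ["09:00"] * 4
counts = concurrent_counts(history)
print(counts, pool_size(counts), pool_size(counts, top_k=2))
```

Sizing by the peak guarantees a robot for every user seen in the history at the cost of idle capacity, while the top-k average trades some of that headroom for lower resource use.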

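Further, as a specific implementation of the method of fig. 1, an embodiment of the present application provides a voice interaction device based on a voice stream robot. As shown in fig. 4, the device includes a voice stream receiving module, a robot allocation module, and a response voice determining module, whose functions correspond to the device aspect described above.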

Optionally, the voice stream robot includes a plurality of second components, each of the second components corresponds to a third component; the apparatus further comprises:

the statistics module is used for counting the number of user processes corresponding to each second component before the target voice stream is forwarded to the target second component of the voice stream robot, and taking the second component with the smallest number of user processes as the target second component;

correspondingly, the device further comprises:

and the instruction sending module is used for sending a bridging instruction to the target second component through the third component before the target response voice of the target voice stream is determined through the target virtual voice stream robot, so that the target voice stream is forwarded from the target second component to the target virtual voice stream robot.

Optionally, the line multiplexing connection pool includes a plurality of lines and a plurality of virtual voice stream robots; the robot allocation module is used for: determining an idle line and an idle virtual voice stream robot from the line multiplexing connection pool based on the interaction notification information through the third component; and determining a target virtual voice stream robot from the idle virtual voice stream robots, allocating any idle line to the target virtual voice stream robot, and removing that idle line from the line multiplexing connection pool.

Optionally, the response voice determining module is used for:

sending the target voice stream to a voice recognition unit through the target virtual voice stream robot, so that the voice recognition unit converts the target voice stream into target text; sending the target text returned by the voice recognition unit to a robot engine, so that the robot engine determines a target response script according to the target text and returns the target response script to the target virtual voice stream robot; and sending the target response script to the voice recognition unit through the target virtual voice stream robot, so that the voice recognition unit converts the target response script into target response voice.

Optionally, the apparatus further comprises:

the intention recognition module is used for recognizing the target intention corresponding to the target text after the target text returned by the voice recognition unit is sent to the robot engine;

the first matching module is used for switching the user to the manual customer service when the target intention is matched with a first preset intention;

and the second matching module is used for returning a target ending script when the target intention matches a second preset intention.

Optionally, the apparatus further comprises:

the interface calling module is used for feeding back target ending voice corresponding to the target ending script to the user through the target virtual voice stream robot after the target ending script is returned, and calling a preset hang-up interface after playback is finished to hang up the user;

and the access module is used for returning the idle line to the line multiplexing connection pool.

Optionally, the apparatus further comprises:

the sorting module is used for acquiring historical user access data within a preset time before the target voice stream of the user is received through the first component of the voice stream robot, determining the number of user accesses at each moment based on the historical user access data, and sorting the numbers of user accesses at each moment;

and the number determining module is used for determining the number of virtual voice stream robots in the line multiplexing connection pool according to the sorted numbers of user accesses.

It should be noted that, other corresponding descriptions of each functional unit related to the voice interaction device based on the voice stream robot provided in the embodiment of the present application may refer to corresponding descriptions in the methods of fig. 1 to 3, which are not repeated herein.

Based on the above-mentioned method shown in fig. 1 to 3, correspondingly, the embodiment of the application further provides a storage medium, on which a computer program is stored, which when executed by a processor, implements the above-mentioned voice interaction method based on a voice stream robot shown in fig. 1 to 3.

Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) and which includes several instructions for causing a computer device (such as a personal computer, a server, or a network device) to perform the methods described in the various implementation scenarios of the present application.

Based on the method shown in fig. 1 to 3 and the virtual device embodiment shown in fig. 4, in order to achieve the above object, the embodiment of the present application further provides a computer device, which may specifically be a personal computer, a server, a network device, etc., where the computer device includes a storage medium and a processor; a storage medium storing a computer program; and a processor for executing a computer program to implement the voice interaction method based on the voice stream robot as shown in fig. 1 to 3.

Optionally, the computer device may also include a user interface, a network interface, a camera, radio frequency (RF) circuitry, sensors, audio circuitry, a Wi-Fi module, and the like. The user interface may include a display screen (Display), an input unit such as a keyboard (Keyboard), etc., and optionally the user interface may also include a USB interface, a card reader interface, etc. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Bluetooth interface or a Wi-Fi interface), etc.

It will be appreciated by those skilled in the art that the computer device structure provided in the present embodiment does not constitute a limitation on the computer device, which may include more or fewer components, combine certain components, or have a different arrangement of components.

The storage medium may also include an operating system and a network communication module. The operating system is a program that manages and saves the hardware and software resources of the computer device, and supports the running of information processing programs and other software and/or programs. The network communication module is used for communication among the components within the storage medium, and for communication with other hardware and software in the physical device.

From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus a necessary general-purpose hardware platform, or may be implemented by hardware. After the user is connected, the target voice stream of the user can be forwarded to the first component of the voice stream robot through a preset protocol. After receiving the target voice stream, the first component may further forward it to the target second component. The target second component can then generate interaction notification information and send it to the corresponding third component, to notify the third component that a user needs to be connected. After receiving the interaction notification information, the third component may find, in the line multiplexing connection pool, a target virtual voice stream robot to serve the user. After the target virtual voice stream robot is determined from the line multiplexing connection pool, the target voice stream in the target second component can be sent to the target virtual voice stream robot, so that the target virtual voice stream robot can determine the target response voice corresponding to the target voice stream, and voice interaction with the user can be performed using the target response voice. According to the embodiments of the present application, the line multiplexing connection pool can be fully utilized when the target virtual voice stream robot is allocated to the user, and the user connection time is reduced from hundreds of milliseconds to milliseconds, which on the one hand greatly improves user connection efficiency and on the other hand effectively improves the user experience.
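The allocation step performed by the third component can be illustrated with a minimal sketch of a line multiplexing connection pool. This is not the implementation of the present application: the class and function names, the use of plain strings for lines and robots, the lock, and the return of the robot to the pool on release are all assumptions made for illustration.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class LineMultiplexingPool:
    """Holds pre-established idle lines and idle virtual voice stream robots."""
    idle_lines: list
    idle_robots: list
    _lock: threading.Lock = field(
        default_factory=threading.Lock, repr=False, compare=False
    )

    def allocate(self):
        """Remove one idle robot and one idle line from the pool, or return None."""
        with self._lock:
            if not self.idle_lines or not self.idle_robots:
                return None
            return self.idle_robots.pop(), self.idle_lines.pop()

    def release(self, robot, line):
        """Put the line (and, by assumption, the robot) back after hang-up."""
        with self._lock:
            self.idle_robots.append(robot)
            self.idle_lines.append(line)


def handle_notification(pool, user_id):
    """Hypothetical third-component step: allocate and describe the bridge."""
    allocation = pool.allocate()
    if allocation is None:
        return None  # no idle line or robot is available for this user
    robot, line = allocation
    return {"user": user_id, "robot": robot, "line": line}


pool = LineMultiplexingPool(["line-1", "line-2"], ["robot-1", "robot-2"])
bridge = handle_notification(pool, "user-a")
print(bridge)  # {'user': 'user-a', 'robot': 'robot-2', 'line': 'line-2'}
pool.release(bridge["robot"], bridge["line"])
```

Because the lines and robots already exist when the user arrives, allocation reduces to two list operations under a lock, which is consistent with the reduction in connection time described above.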

Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application. Those skilled in the art will also appreciate that the modules of an apparatus in an implementation scenario may be distributed in the apparatus as described for that implementation scenario, or may be changed accordingly and located in one or more apparatuses different from that of the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.

The foregoing serial numbers of the present application are for description only and do not represent the superiority or inferiority of the implementation scenarios. The foregoing discloses only a few specific implementation scenarios of the present application; however, the present application is not limited thereto, and any variation that a person skilled in the art can conceive of shall fall within the protection scope of the present application.

Claims

1. A voice interaction method based on a voice stream robot is characterized by comprising the following steps:

sending interaction notification information to a corresponding third component through the target second component, and allocating a target virtual voice stream robot to the user in a line multiplexing connection pool based on the interaction notification information through the third component, wherein the line multiplexing connection pool comprises a plurality of lines and a plurality of virtual voice stream robots;

2. The method of claim 1, wherein the voice stream robot comprises a plurality of second components, each of the second components corresponding to a third component; before forwarding the target voice stream to the target second component of the voice stream robot, the method further comprises:

counting the number of users being processed by each second component, and taking the second component with the smallest number of users being processed as the target second component;

accordingly, before the target response voice of the target voice stream is determined by the target virtual voice stream robot, the method further includes:

and sending a bridging instruction to the target second component through the third component so as to enable the target voice stream to be forwarded from the target second component to the target virtual voice stream robot.

3. The method of claim 1, wherein the allocating, by the third component, a target virtual voice stream robot to the user in a line multiplexing connection pool based on the interaction notification information comprises:

determining idle lines and idle virtual voice stream robots from the line multiplexing connection pool based on the interaction notification information through the third component;

and determining a target virtual voice stream robot from the idle virtual voice stream robots, allocating any one of the idle lines to the target virtual voice stream robot, and removing that idle line from the line multiplexing connection pool.

4. The method of claim 3, wherein the determining, by the target virtual voice stream robot, a target reply voice for the target voice stream comprises:

transmitting the target voice stream to a voice recognition unit through the target virtual voice stream robot, so that the voice recognition unit converts the target voice stream into a target text;

sending the target text returned by the voice recognition unit to a robot engine, so that the robot engine determines a target response script according to the target text and returns the target response script to the target virtual voice stream robot;

sending the target response script to the voice recognition unit through the target virtual voice stream robot, so that the voice recognition unit converts the target response script into the target response voice.

5. The method of claim 4, wherein after the sending the target text returned by the voice recognition unit to a robot engine, the method further comprises:

identifying a target intention corresponding to the target text;

when the target intention matches a first preset intention, transferring the user to a human customer service agent;

and when the target intention matches a second preset intention, returning a target ending script.

6. The method of claim 5, wherein after the returning the target ending script, the method further comprises:

feeding back a target ending voice corresponding to the target ending script to the user through the target virtual voice stream robot, and calling a preset hang-up interface after playback is finished to hang up the call with the user;

and re-adding the idle line to the line multiplexing connection pool.

7. The method of claim 1, wherein prior to receiving the user's target voice stream by the first component of the voice stream robot, the method further comprises:

acquiring historical user access data within a preset time, determining the number of user accesses at the same moment based on the historical user access data, and sorting the numbers of user accesses at the same moment;

and determining the number of virtual voice stream robots in the line multiplexing connection pool according to the sorted numbers of user accesses at the same moment.

8. A voice interaction device based on a voice stream robot, comprising:

the robot distribution module is used for sending interaction notification information to a corresponding third component through the target second component, and allocating a target virtual voice stream robot to the user in a line multiplexing connection pool based on the interaction notification information through the third component, wherein the line multiplexing connection pool comprises a plurality of lines and a plurality of virtual voice stream robots;

9. A storage medium having stored thereon a computer program, which when executed by a processor, implements the method of any of claims 1 to 7.

10. A computer device comprising a storage medium, a processor and a computer program stored on the storage medium and executable on the processor, characterized in that the processor implements the method of any one of claims 1 to 7 when executing the computer program.