CN111862966A - Intelligent voice interaction method and related device - Google Patents

Intelligent voice interaction method and related device

Info

Publication number
CN111862966A
Authority
CN
China
Prior art keywords
process engine
engine interface
calling
voice
information
Prior art date
Legal status
Pending
Application number
CN201910779585.4A
Other languages
Chinese (zh)
Inventor
李宽
熊彬
权圣
Current Assignee
Mashang Xiaofei Finance Co Ltd
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Xiaofei Finance Co Ltd
Priority date
Filing date
Publication date
Application filed by Mashang Xiaofei Finance Co Ltd
Priority to CN201910779585.4A
Publication of CN111862966A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/26: Speech to text systems

Abstract

The application discloses an intelligent voice interaction method and a related device. The intelligent voice interaction method comprises the following steps: calling a first process engine interface to acquire, through the first process engine interface, voice information input by a user; calling the first process engine interface to recognize the voice information through the first process engine and obtain identification information of the voice information; calling a second process engine interface to acquire, through the second process engine interface, the response event corresponding to the identification information; and executing the response event corresponding to the identification information. By this scheme, the maintainability and reusability of script files can be improved.

Description

Intelligent voice interaction method and related device
Technical Field
The present application relates to the field of intelligent voice technologies, and in particular, to an intelligent voice interaction method and a related device.
Background
Intelligent chat robots are widely used across industries, particularly the service industry, and have given rise to a variety of commercial and consumer products, including intelligent customer service, smart speakers, and entertainment products. As a more advanced form of the chat robot, the intelligent voice robot is increasingly favored by the industry for its more natural and convenient mode of voice interaction.
Since intelligent voice interaction mostly involves multi-turn conversation, script files written in programming languages such as Lua and Python are needed to manage the business flow of a dialog scenario. However, for complex dialog scenarios such as telecom-operator customer service and e-commerce customer service, the business flow is relatively complicated, which makes the script files large, bloated, and hard to maintain. In addition, because a single script file can hardly manage several different dialog scenarios, multiple script files must be set up to cope with them, which greatly reduces reusability and hinders maintenance and management. In view of this, improving the maintainability and reusability of script files has become an urgent problem.
Disclosure of Invention
The main technical problem addressed by the present application is to provide an intelligent voice interaction method and a related device that can improve the maintainability and reusability of script files.
In order to solve the above problem, a first aspect of the present application provides an intelligent voice interaction method, including calling a first process engine interface, and acquiring voice information input by a user through the first process engine interface; calling a first process engine interface, and identifying the voice information through a first process engine to obtain identification information of the voice information; calling a second process engine interface, and acquiring a corresponding response event of the identification information through the second process engine interface; and executing the response event corresponding to the identification information.
In order to solve the above problems, a second aspect of the present application provides an intelligent voice interaction apparatus, including an obtaining module, a recognition module, a matching module, and an execution module, where the obtaining module is configured to call a first process engine interface, and obtain voice information input by a user through the first process engine interface; the recognition module is used for calling a first process engine interface, recognizing the voice information through the first process engine and obtaining recognition information of the voice information; the matching module is used for calling a second process engine interface and acquiring a corresponding response event of the identification information through the second process engine interface; the execution module is used for executing the corresponding response event of the identification information.
In order to solve the above problem, a third aspect of the present application provides an intelligent voice interaction device, comprising a memory and a processor coupled to each other; the processor is configured to execute the program instructions stored in the memory to implement the intelligent voice interaction method of the first aspect.
In order to solve the above problem, a fourth aspect of the present application provides a storage device, which stores program instructions capable of being executed by a processor, where the program instructions are used to implement the intelligent voice interaction method of the first aspect.
According to the above scheme, a first process engine interface is called to acquire, through it, the voice information input by a user; the first process engine interface is called so that the first process engine recognizes the voice information and obtains its identification information; a second process engine interface is then called to acquire, through it, the response event corresponding to the identification information; and that response event is executed. Recognizing the user's voice input and determining the corresponding response event are thus delegated to the respective process engines, so the script file no longer needs to manage the business flow of the dialog scenario; it is responsible only for information input/output and for calling the various interfaces. The volume of the script file is therefore greatly reduced and its maintainability improved. Moreover, because the script file handles only information input/output and interface calls, its reusability across different dialog scenarios is greatly increased.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of an intelligent voice interaction method of the present application;
FIG. 2 is a schematic flow chart diagram illustrating another embodiment of an intelligent voice interaction method of the present application;
FIG. 3 is a schematic flow chart diagram illustrating a method for intelligent voice interaction in accordance with another embodiment of the present application;
FIG. 4 is a block diagram of an embodiment of an intelligent voice interaction system based on the intelligent voice interaction method of the present application;
FIG. 5 is a block diagram of an embodiment of an intelligent voice interaction apparatus;
FIG. 6 is a block diagram of an embodiment of the intelligent voice interaction device;
FIG. 7 is a block diagram of an embodiment of a memory device according to the present application.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between related objects, indicating that three relationships may exist; for example, "A and/or B" may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. In addition, the character "/" herein generally indicates that the related objects before and after it are in an "or" relationship. Further, the term "plurality" herein means two or more.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating an embodiment of an intelligent voice interaction method according to the present application. Specifically, the method may include the steps of:
step S11: and calling a first process engine interface, and acquiring the voice information input by the user through the first process engine interface.
In this embodiment, the script file calls the first process engine interface, so that the first process engine acquires the voice information input by the user through the first process engine interface. In an implementation scenario, the first process engine includes an output interface and an input interface, where the input interface is used to obtain voice information input by a user, and the output interface, that is, the first process engine interface in this embodiment, is used to output the voice information to a script file communicatively connected to the first process engine, so that the script file obtains the voice information input by the user through the first process engine interface.
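The split between the input interface (user speech in) and the output interface (the first process engine interface, handing speech to the script file) can be sketched in Python, the scripting language the patent itself mentions. This is an illustrative sketch only; the class and method names are hypothetical, not taken from the patent.

```python
# Illustrative sketch: engine and method names are hypothetical.
class FirstProcessEngine:
    """Stands in for the speech-side process engine."""

    def __init__(self):
        self._buffer = []

    def push_audio(self, audio):
        # Input interface: voice information input by the user arrives here.
        self._buffer.append(audio)

    def get_voice_information(self):
        # Output interface (the "first process engine interface"): hands the
        # voice information to the script file.
        return self._buffer.pop(0) if self._buffer else None


def script_acquire_voice(engine):
    # The script file only calls the interface; it holds no business logic.
    return engine.get_voice_information()


engine = FirstProcessEngine()
engine.push_audio("raw-pcm-frames")
voice = script_acquire_voice(engine)
```

The key design point illustrated here is that the script file never touches the audio pipeline itself; it only pulls whatever the engine exposes through the interface.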
In an implementation scenario, if a specific dialog scenario involves only a few business flows, the first process engine may be omitted and the key information input by the user may be obtained directly through a preset interface, which is not specifically limited in this embodiment.
In an implementation scenario, in order to guide the user through the voice interaction and gradually understand the user's intention, a fourth process engine interface may be called before step S11 to obtain a welcome dialog and play it. A dialog is the spoken script used at each stage of the voice interaction. Taking telecom-operator customer service as an example, when the customer service call is connected, a welcome dialog is generally broadcast, such as "Dear customer, hello; for mobile services please … …; for broadband services please … …; for manual service please … …; to end the call, please hang up", in order to guide the user to state the user's needs.
Step S12: and calling a first process engine interface, and identifying the voice information through the first process engine to obtain the identification information of the voice information.
In this embodiment, the script file may further call the first process engine interface so that the first process engine passes, through that interface, the identification information obtained by recognizing the voice information to the script file. In one implementation scenario, the script file calls the first process engine interface so that the first process engine recognizes the voice information immediately after acquiring it and obtains the identification information. In another implementation scenario, the script file first acquires the voice information input by the user through the first process engine interface, then calls the interface again so that the first process engine recognizes the voice information and obtains the identification information, thereby improving the robustness of the system.
The identification information includes, but is not limited to: the text information obtained by transcribing the voice information input by the user, the user intention represented by that voice information, and the like, which is not specifically limited in this embodiment.
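Either form of identification information (a transcription, an intent label, or both) can be modeled as a small structure returned through the first process engine interface. The sketch below assumes recognition returns both; the keyword matching is purely illustrative and not how a real ASR or NLU engine works.

```python
def recognize(voice_information):
    # Toy stand-in for the first process engine's recognition step: it
    # returns the transcribed text plus a coarse intent label.
    text = voice_information.strip().lower()
    if "call charge" in text or "balance" in text:
        intent = "call_charge"
    elif "hang up" in text:
        intent = "hang_up"
    else:
        intent = "unknown"
    return {"text": text, "intent": intent}


info = recognize("I want to check the call charge")
```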
Step S13: and calling a second process engine interface, and acquiring a corresponding response event of the identification information through the second process engine interface.
In this embodiment, the script file further calls a second process engine interface so that the second process engine determines, according to the identification information passed in by the script file, the response event associated with that identification information and returns it to the script file through the second process engine interface. For example, in a telecom-operator customer service scenario, after the first process engine acquires the voice information "I want to check the call charge" input by a user, it recognizes that the user's intention is "call charge", i.e., the identification information is "call charge". The script file passes this identification information to the second process engine, which may then request from a server the specific call-charge information corresponding to the user's telephone number, such as the balance and the current month's consumption details, and finally return the call-charge information to the script file through the second process engine interface. Other voice interaction scenarios can be deduced by analogy and are not enumerated here.
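The call-charge example above can be sketched as a second process engine that queries a back-end server and wraps the result as a response event. Everything here (class names, the billing server and its `query_balance` method) is a hypothetical illustration, not an interface defined by the patent.

```python
class SecondProcessEngine:
    """Toy dialog manager: maps identification information to a response event."""

    def __init__(self, billing_server):
        self._server = billing_server  # hypothetical back end

    def get_response_event(self, identification_info):
        # The "second process engine interface": identification info in,
        # corresponding response event out.
        if identification_info == "call_charge":
            balance = self._server.query_balance()
            return {"type": "play", "payload": f"Your balance is {balance} yuan"}
        if identification_info == "hang_up":
            return {"type": "hangup"}
        return {"type": "play", "payload": "Sorry, could you repeat that?"}


class FakeBillingServer:
    def query_balance(self):
        return 42.5


event = SecondProcessEngine(FakeBillingServer()).get_response_event("call_charge")
```

Note that the script file never sees the billing server; only the second process engine talks to it, which is what keeps the script thin.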
Step S14: and executing the response event corresponding to the identification information.
Depending on the specific business flow, the corresponding response event may be a dialog related to the identification information, or another action such as hanging up. Continuing the operator customer service example above, the script file calls the second process engine interface and obtains "call-charge information" as the response event corresponding to the identification information, so the call-charge information can be played for the user to listen to. Alternatively, if the response event corresponding to the identification information "no other problem, I want to hang up" is "hang up", the voice interaction may be ended directly; this embodiment is not limited in this respect.
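Dispatching on the returned response event (play a dialog, or hang up) is the only decision the script file makes in step S14. A minimal sketch with hypothetical callback parameters:

```python
def execute_response_event(event, play, hang_up):
    # The script file merely dispatches on the event type chosen by the
    # second process engine; the business logic lives in the engine.
    if event["type"] == "play":
        play(event["payload"])
        return "played"
    if event["type"] == "hangup":
        hang_up()
        return "hung_up"
    raise ValueError(f"unknown response event: {event['type']}")


played = []
result = execute_response_event(
    {"type": "play", "payload": "Your balance is 42.5 yuan"},
    play=played.append,
    hang_up=lambda: None,
)
```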
According to the above scheme, a first process engine interface is called to acquire the voice information input by a user; the first process engine interface is called so that the first process engine recognizes the voice information and obtains its identification information; a second process engine interface is then called to acquire the response event corresponding to the identification information; and that response event is executed. Recognizing the user's voice input and determining the corresponding response event are thus delegated to the respective process engines, so the script file no longer needs to manage the business flow of the dialog scenario; it is responsible only for information input/output and for calling the various interfaces. The volume of the script file is therefore greatly reduced and its maintainability improved. Moreover, because the script file handles only information input/output and interface calls, its reusability across different dialog scenarios is greatly increased.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating an intelligent voice interaction method according to another embodiment of the present application. Fig. 2 is a schematic flowchart of an embodiment of the intelligent voice interaction method shown in fig. 1. The method comprises the following steps:
Step S21: and calling a first process engine interface, and acquiring the voice information input by the user through the first process engine interface.
Please refer to step S11.
Step S22: and calling a first process engine interface, and identifying the voice information through the first process engine to obtain text information converted from the voice information.
In this embodiment, the script file calls a first process engine interface to enable the first process engine to identify the acquired voice information, so that the voice information is converted into text information. In one implementation scenario, the script file may also invoke the first process engine interface to cause the first process engine to recognize the voice information, convert the voice information to text information, and further recognize the user's intent.
Please refer to step S12, which is not described herein again.
Step S23: and calling a second process engine interface, and acquiring the response dialog corresponding to the text information through the second process engine interface.
In this embodiment, the script file may further call the second process engine interface so that the second process engine determines the response dialog corresponding to the text information and returns it to the script file through the second process engine interface. Still taking operator customer service as an example: when the customer service call is connected, a welcome dialog is generally broadcast, such as "Dear customer, hello; for mobile services please … …; for broadband services please … …; for manual service please … …; to end the call, please hang up"; when no agent is currently available, "all agents are busy, please wait"; and when the conversation ends, "thank you for calling, please hang up". Broadcasting such dialogs guides the user toward solving the problem step by step, or provides the user with information.
Please refer to step S13 above.
Step S24: and calling a third process engine interface, and playing the corresponding response dialogs of the text information through the third process engine.
The script file calls a third process engine interface so that the third process engine plays the response dialog corresponding to the text information. In this embodiment, after the script file calls the second process engine interface to obtain the response dialog corresponding to the text information, it passes the response dialog through the third process engine interface so that the third process engine plays it.
In one implementation scenario, the script file may also call the third process engine interface so that the third process engine directly plays the audio corresponding to the response dialog. For example, if a specific voice interaction scenario involves only a few business flows, the dialogs used in those flows can be recorded as audio in advance and stored in the intelligent voice interaction apparatus; in step S23, the second process engine interface may be called to obtain the ID (identity document) of the response dialog corresponding to the text information, and in step S24 the audio is retrieved from the apparatus's memory according to that ID and played. Alternatively, if the scenario involves many business flows, the dialogs can be recorded as audio in advance and stored on a server communicatively connected to the apparatus; in step S23 the second process engine interface may be called to obtain the ID of the response dialog, and in step S24 the corresponding audio is requested from the server according to the ID and played once downloaded. This embodiment is not specifically limited in this respect.
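The ID-based retrieval just described (check local storage first, otherwise fetch from the server and cache the result) can be sketched as follows; the function and parameter names are illustrative assumptions, not from the patent.

```python
def fetch_response_audio(dialog_id, local_cache, download_from_server):
    # Pre-recorded response dialogs are keyed by ID: look in the local
    # store first, fall back to the server, and cache the download.
    audio = local_cache.get(dialog_id)
    if audio is None:
        audio = download_from_server(dialog_id)
        local_cache[dialog_id] = audio
    return audio


downloads = []

def fake_server(dialog_id):
    # Records each server round-trip so caching behavior is visible.
    downloads.append(dialog_id)
    return f"audio-bytes-for-{dialog_id}"


cache = {}
first = fetch_response_audio("welcome", cache, fake_server)
second = fetch_response_audio("welcome", cache, fake_server)  # served from cache
```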
Unlike the preceding embodiments, in this embodiment the script file calls the first process engine interface so that the first process engine recognizes the voice information input by the user and converts it into text information; it then calls the second process engine interface to acquire the response dialog corresponding to the text information, and calls the third process engine interface to play that response dialog. In this way, the response dialog is obtained by calling the various process engine interfaces according to the user's voice input, thereby realizing voice interaction with the user.
Referring to fig. 3, fig. 3 is a schematic flowchart illustrating an intelligent voice interaction method according to another embodiment of the present application. Fig. 3 is a schematic flowchart of another embodiment of the intelligent voice interaction method shown in fig. 1. Specifically, the method may include the steps of:
step S31: and calling a first process engine interface, and acquiring the voice information input by the user through the first process engine interface.
Please refer to step S21.
Step S32: and calling a first process engine interface, identifying the voice information through the first process engine, and converting the voice information into text information.
Please refer to step S22.
Step S33: and calling a second process engine interface, and acquiring a corresponding response dialog of the text message through the second process engine interface.
Please refer to step S23.
Step S34: and calling a third process engine interface, and playing the corresponding response dialogs of the text information through the third process engine.
Please refer to step S24.
Step S35: step S31 and subsequent steps are re-executed.
After the script file calls the third process engine interface to play, through the third process engine, the response dialog corresponding to the text information, the step of calling the first process engine interface to acquire the voice information input by the user, together with the subsequent steps, can be executed again; thus, after the response dialog is played, the user's feedback on it is acquired and a new round of voice interaction begins. In an implementation scenario, when the intelligent voice interaction supports user interruption, the first process engine interface may also be called to acquire the user's voice input while step S34 is being executed; this embodiment is not limited in this respect.
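Steps S31 to S35 together form a loop: acquire speech, recognize it, fetch the matching response, execute it, and start over until the user hangs up. A minimal sketch with hypothetical callbacks standing in for the process engine interfaces:

```python
def interaction_loop(get_voice, recognize, get_event, execute, max_turns=100):
    # Each turn: S31 acquire voice, S32 recognize it, S33 fetch the
    # response event, S34 execute it, S35 go back to S31.
    turns = 0
    for _ in range(max_turns):
        voice = get_voice()
        if voice is None:  # caller gone
            break
        turns += 1
        event = get_event(recognize(voice))
        if execute(event) == "hung_up":
            break
    return turns


utterances = iter(["check my balance", "I want to hang up"])
turns = interaction_loop(
    get_voice=lambda: next(utterances, None),
    recognize=lambda v: "hang_up" if "hang up" in v else "query",
    get_event=lambda intent: {"type": "hangup" if intent == "hang_up" else "play"},
    execute=lambda e: "hung_up" if e["type"] == "hangup" else "played",
)
```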
Different from any of the above embodiments, in this embodiment, after the current round of voice interaction is finished, the voice information input by the user is continuously acquired, so that the next round of voice interaction is started, and the user can acquire the desired information to be understood in the process of multiple rounds of voice interaction.
Referring to fig. 4, fig. 4 is a schematic diagram of a framework of an embodiment of an intelligent voice interaction system based on the intelligent voice interaction method of the present application. As shown in fig. 4, the intelligent voice interaction system in this embodiment may be built on FreeSwitch, or on other softswitch software such as Asterisk; the relevant technical standards for FreeSwitch and Asterisk are prior art in the field and are not described here. The intelligent voice interaction system in this embodiment may include the script file, the first process engine, the second process engine, and the third process engine described in the above embodiments. The script file may be written in a programming language such as Lua or Python, which this embodiment does not limit. The first process engine is an Automatic Speech Recognition (ASR) system, mainly used to receive and recognize the voice information input by the user and convert it into text information; the second process engine is mainly used to obtain the corresponding response dialog according to the text information; and the third process engine is a speech synthesis (Text-To-Speech, TTS) system, mainly used to play the corresponding response dialog. In an implementation scenario, the intelligent voice interaction system may further include a fourth process engine, mainly used to generate the welcome dialog for the interaction scenario when the voice interaction starts. This embodiment is not specifically limited in this respect.
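Under the division of labor just described (first engine for ASR, second for dialog management, third for TTS, fourth for the welcome dialog), the script file reduces to thin glue. A hypothetical sketch; the real system sits on top of FreeSwitch, which is not modeled here, and all names are illustrative.

```python
class ScriptFile:
    """Thin glue: only information I/O and interface calls, no business flow."""

    def __init__(self, asr, dialog, tts, welcome=None):
        self.asr = asr          # first process engine interface (speech -> text)
        self.dialog = dialog    # second process engine interface (text -> reply)
        self.tts = tts          # third process engine interface (play a dialog)
        self.welcome = welcome  # fourth process engine interface (welcome dialog)

    def run_turn(self):
        if self.welcome is not None:
            self.tts(self.welcome())  # greet once, when the call is connected
            self.welcome = None
        text = self.asr()
        self.tts(self.dialog(text))


spoken = []
script = ScriptFile(
    asr=lambda: "check my balance",
    dialog=lambda text: f"reply to: {text}",
    tts=spoken.append,
    welcome=lambda: "Dear customer, hello",
)
script.run_turn()
```

Swapping dialog scenarios then means swapping the engines behind the interfaces, not rewriting the script, which is the reusability claim the patent makes.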
Referring to fig. 4, the script file includes a first flow engine interface calling module, a second flow engine interface calling module, and a third flow engine interface calling module, where the first flow engine interface calling module is connected to the first flow engine, the second flow engine interface calling module is connected to the second flow engine, and the third flow engine interface calling module is connected to the third flow engine. The connections referred to in this embodiment are communication connections so that the modules may communicate related information between each other. In addition, the script file also comprises a voice input module and a voice output module. Specifically, when the intelligent voice interaction system works, the interaction between the modules of the script file and the interaction between each module and the process engine may include:
when voice interaction is triggered, for example when a customer service call is connected, a fourth process engine (not shown) generates a welcome dialog; the fourth process engine interface calling module (not shown) of the script file passes the welcome dialog through the voice output module to the third process engine interface calling module, which calls the third process engine to play the welcome dialog;
the first process engine acquires the voice information input by the user; the script file acquires this voice information through the first process engine interface calling module, then obtains the first process engine's recognition information for it through the same module, and passes the recognition information via the voice input module to the second process engine interface calling module and on to the second process engine;
the second process engine obtains the response event corresponding to the recognition information. When the response event is, for example, hanging up or requiring key-press confirmation from the user, it is passed to the other operation modules for execution; when the response event is a response dialog, the dialog is passed through the second process engine interface calling module to the voice output module, then to the third process engine interface calling module, and on to the third process engine, which plays it.
After the third process engine finishes playing the response dialog, or, when the intelligent voice interaction system supports user interruption, while the third process engine is still playing it, the script file re-executes the step of calling the first process engine interface to acquire the voice information input by the user and the subsequent steps.
Thus, unlike the prior art, the script file is responsible only for information input/output and for calling the various interfaces. This greatly improves reusability across different dialog scenarios, greatly reduces the size of the script file, and improves maintainability, which in turn accelerates deployment of the service, lowers the maintenance threshold, and enhances system stability.
Referring to fig. 5, fig. 5 is a schematic diagram of an embodiment of an intelligent voice interaction apparatus 50 of the present application. Specifically, the intelligent voice interaction apparatus 50 includes an obtaining module 51, a recognition module 52, a matching module 53, and an execution module 54. The obtaining module 51 is configured to call a first process engine interface and acquire, through it, the voice information input by a user; the recognition module 52 is configured to call the first process engine interface and recognize the voice information through the first process engine to obtain its identification information; the matching module 53 is configured to call a second process engine interface and acquire, through it, the response event corresponding to the identification information; and the execution module 54 is configured to execute that response event.
According to the above scheme, the script file calls the first process engine interface to acquire the voice information input by a user, calls the first process engine interface so that the first process engine recognizes the voice information and obtains its identification information, then calls the second process engine interface to acquire the response event corresponding to the identification information, and executes that response event. Recognizing the user's voice input and determining the corresponding response event are thus delegated to the respective process engines, so the script file no longer needs to manage the business flow of the dialog scenario; it is responsible only for information input/output and for calling the various interfaces. The volume of the script file is therefore greatly reduced and its maintainability improved. Moreover, because the script file handles only information input/output and interface calls, its reusability across different dialog scenarios is greatly increased.
In some embodiments, the recognition module 52 is configured to call the first process engine interface to recognize the voice information through the first process engine and convert it into text information, the matching module 53 is configured to call the second process engine interface to obtain, through the second process engine interface, the response dialog corresponding to the text information, and the execution module 54 is configured to call a third process engine interface to play, through the third process engine, the response dialog corresponding to the text information.
In some embodiments, the intelligent voice interaction device 50 further includes a loop control module. After the execution module 54 calls the third process engine interface to play the response dialog corresponding to the text information through the third process engine, the loop control module controls the obtaining module 51, the recognition module 52, the matching module 53 and the execution module 54 to re-execute the step of calling the first process engine interface to obtain the voice information input by the user through the first process engine interface, together with the subsequent steps.
In some embodiments, the obtaining module 51 is configured to invoke an automatic speech recognition process engine interface to obtain speech information through the automatic speech recognition process engine interface, and the recognition module 52 is configured to invoke the automatic speech recognition process engine interface to recognize the speech information through the automatic speech recognition process engine to obtain recognition information of the speech information.
In some embodiments, the execution module 54 is configured to call a speech synthesis system to play, through the speech synthesis system, the response dialog corresponding to the recognition information.
In some embodiments, the execution module 54 is further configured to hang up the voice interaction.
In some embodiments, the obtaining module 51 is further configured to invoke a fourth process engine interface to obtain the welcome word through the fourth process engine interface, and play the welcome word.
Different from the prior art, the script file calls the first process engine interface to recognize the voice information input by the user through the first process engine interface and convert it into text information, then calls the second process engine interface to obtain, through the second process engine interface, the response dialog corresponding to the text information, and calls the third process engine interface to play that response dialog through the third process engine. In this way, the response dialog can be obtained by calling the various process engine interfaces according to the voice information input by the user, thereby realizing voice interaction with the user.
In addition, after the current round of voice interaction ends, the voice information input by the user continues to be acquired, so that the next round of voice interaction begins; in this way, the user can obtain the desired information over multiple rounds of voice interaction.
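The multi-round interaction just described, including the welcome word played before the first round and the hang-up event that ends the call, can be sketched as a simple loop. All of the callable names here (get_voice, recognize, match, play) are hypothetical stand-ins for the process engine interfaces; the patent does not specify a concrete API.

```python
def interact(get_voice, recognize, match, play):
    """Minimal sketch of the script-file loop: welcome word, then rounds
    of (obtain voice -> recognize -> match response -> play) until a
    hang-up response event is matched."""
    play("Welcome!")                 # fourth engine interface: welcome word
    transcript = []
    while True:
        text = recognize(get_voice())    # first engine interface
        event = match(text)              # second engine interface
        if event == "hangup":            # response event that ends the call
            break
        play(event)                      # third engine interface: play dialog
        transcript.append((text, event))
    return transcript


# Scripted stand-ins so the loop can run without real engines:
inputs = iter(["hello", "bye"])
played = []
log = interact(
    get_voice=lambda: next(inputs),
    recognize=lambda audio: audio.upper(),
    match=lambda t: "hangup" if t == "BYE" else "You said " + t,
    play=played.append,
)
```

The loop itself never inspects the conversation scenario; which utterance triggers a hang-up, and what each response dialog says, is decided entirely inside the matching engine.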
Referring to fig. 6, fig. 6 is a schematic diagram of a framework of an embodiment of an intelligent voice interaction device 60 according to the present application. The intelligent voice interaction device 60 comprises a memory 61 and a processor 62 coupled to each other, and the processor 62 is configured to execute program instructions stored in the memory 61 to implement the steps in any of the above-described embodiments of the intelligent voice interaction method.
In particular, the processor 62 is configured to control itself and the memory 61 to implement the steps in any of the above-described embodiments of the intelligent voice interaction method. The processor 62 may also be referred to as a CPU (Central Processing Unit). The processor 62 may be an integrated circuit chip having signal processing capabilities. The processor 62 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like. In addition, the processor 62 may be implemented jointly by a plurality of integrated circuit chips.
According to this scheme, the script file is only responsible for inputting/outputting information and calling the various interfaces. This greatly improves reusability across different conversation scenarios, greatly reduces the size of the script file, and improves its maintainability, which in turn facilitates rapid deployment of services, lowers the maintenance threshold, and enhances system stability.
Referring to fig. 7, fig. 7 is a schematic diagram of a memory device 70 according to an embodiment of the present application. The memory device 70 stores program instructions 71 capable of being executed by the processor, the program instructions 71 being used for implementing the intelligent voice interaction method in any of the above embodiments.
According to this scheme, the script file is only responsible for inputting/outputting information and calling the various interfaces. This greatly improves reusability across different conversation scenarios, greatly reduces the size of the script file, and improves its maintainability, which in turn facilitates rapid deployment of services, lowers the maintenance threshold, and enhances system stability.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a module or a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (10)

1. An intelligent voice interaction method, comprising:
calling a first process engine interface, and acquiring voice information input by a user through the first process engine interface;
calling the first process engine interface, and identifying the voice information through a first process engine to obtain identification information of the voice information;
calling a second process engine interface, and acquiring a response event corresponding to the identification information through the second process engine interface;
and executing the response event corresponding to the identification information.
2. The intelligent voice interaction method according to claim 1, wherein the step of calling the first process engine interface to recognize the voice information through the first process engine to obtain the identification information of the voice information comprises:
calling the first process engine interface, recognizing the voice information through the first process engine, and converting the voice information into text information;
the step of calling a second process engine interface and acquiring the corresponding response event of the identification information through the second process engine interface comprises the following steps:
calling the second process engine interface, and acquiring a response dialect corresponding to the text information through the second process engine interface;
the step of executing the response event corresponding to the identification information comprises:
calling a third process engine interface, and playing the response dialog corresponding to the text information through the third process engine.
3. The intelligent voice interaction method according to claim 2, wherein after the step of calling a third process engine interface and playing the response dialog corresponding to the text information through the third process engine, the method further comprises:
re-executing the step of calling the first process engine interface and acquiring voice information input by the user through the first process engine interface, and the subsequent steps.
4. The intelligent voice interaction method according to claim 1 or 2, wherein the step of calling a first process engine interface, and acquiring the voice information input by the user through the first process engine interface comprises:
calling an automatic voice recognition process engine interface, and acquiring the voice information through the automatic voice recognition process engine interface;
the step of calling the first process engine interface, recognizing the voice information through the first process engine, and obtaining the recognition information of the voice information comprises:
and calling the automatic voice recognition process engine interface, and recognizing the voice information through the automatic voice recognition process engine to obtain the recognition information of the voice information.
5. The intelligent voice interaction method according to claim 1 or 2, wherein the step of executing the response event corresponding to the identification information comprises:
and calling a voice synthesis system, and playing the response dialogs corresponding to the identification information through the voice synthesis system.
6. The intelligent voice interaction method according to claim 1, wherein the step of executing the response event corresponding to the identification information comprises:
and hanging up the voice interaction.
7. The intelligent voice interaction method according to claim 1, wherein before the step of calling the first process engine interface and acquiring the voice information input by the user through the first process engine interface, the method further comprises:
calling a fourth process engine interface, acquiring a welcome word through the fourth process engine interface, and playing the welcome word.
8. An intelligent voice interaction device, comprising:
the acquisition module is used for calling a first process engine interface and acquiring voice information input by a user through the first process engine interface;
the recognition module is used for calling the first process engine interface, recognizing the voice information through the first process engine, and obtaining the recognition information of the voice information;
the matching module is used for calling a second process engine interface and acquiring, through the second process engine interface, the response event corresponding to the identification information;
and the execution module is used for executing the response event corresponding to the identification information.
9. An intelligent voice interaction device, comprising a memory and a processor coupled to each other;
the processor is configured to execute the program instructions stored by the memory to implement the method of any of claims 1 to 7.
10. A storage device storing program instructions executable by a processor to perform the method of any one of claims 1 to 7.
CN201910779585.4A 2019-08-22 2019-08-22 Intelligent voice interaction method and related device Pending CN111862966A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910779585.4A CN111862966A (en) 2019-08-22 2019-08-22 Intelligent voice interaction method and related device


Publications (1)

Publication Number Publication Date
CN111862966A true CN111862966A (en) 2020-10-30

Family

ID=72970592

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910779585.4A Pending CN111862966A (en) 2019-08-22 2019-08-22 Intelligent voice interaction method and related device

Country Status (1)

Country Link
CN (1) CN111862966A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112562643A (en) * 2020-11-09 2021-03-26 深圳桔子智能科技发展有限公司 Voice interaction method, control device and storage medium
CN112767943A (en) * 2021-02-26 2021-05-07 湖北亿咖通科技有限公司 Voice interaction system

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1362703A (en) * 2001-01-05 2002-08-07 甦活全球网路股份有限公司 On-line interactive voice system and its implementation method
US20050059453A1 (en) * 2003-09-11 2005-03-17 Jamal Benbrahim Gaming apparatus software employing a script file
CN101546309A (en) * 2008-03-26 2009-09-30 国际商业机器公司 Method and equipment for constructing indexes to resource content in computer network
CN103299362A (en) * 2010-11-16 2013-09-11 沃科莱特有限公司 Cooperative voice dialog and business logic interpreters for a voice-enabled software application
CN103678354A (en) * 2012-09-11 2014-03-26 中国移动通信集团公司 Local relation type database node scheduling method and device based on cloud computing platform
CN107004410A (en) * 2014-10-01 2017-08-01 西布雷恩公司 Voice and connecting platform
CN108320729A (en) * 2018-01-29 2018-07-24 珠海金山网络游戏科技有限公司 A kind of method and apparatus of efficient debugging game music audio
CN108595731A (en) * 2018-01-23 2018-09-28 盛科网络(苏州)有限公司 The design method and device of dynamic entry in a kind of Ethernet chip
CN108959937A (en) * 2018-06-29 2018-12-07 北京奇虎科技有限公司 Plug-in unit processing method, device and equipment
CN109716430A (en) * 2016-09-29 2019-05-03 微软技术许可有限责任公司 It is conversated interaction using super robot
CN109767753A (en) * 2019-03-29 2019-05-17 北京赢和博雅文化发展有限公司 Star robot interactive approach and system
CN110032361A (en) * 2018-01-11 2019-07-19 腾讯科技(深圳)有限公司 Test analogy method, device, electronic equipment and computer readable storage medium
CN111508477A (en) * 2019-08-02 2020-08-07 马上消费金融股份有限公司 Voice broadcasting method, device, equipment and storage device



Similar Documents

Publication Publication Date Title
US10255918B2 (en) Command and control of devices and applications by voice using a communication base system
US10055190B2 (en) Attribute-based audio channel arbitration
US9548066B2 (en) Voice application architecture
US9715873B2 (en) Method for adding realism to synthetic speech
US11762629B2 (en) System and method for providing a response to a user query using a visual assistant
US11310362B2 (en) Voice call diversion to alternate communication method
US9077802B2 (en) Automated response system
CN111862966A (en) Intelligent voice interaction method and related device
CN109348048B (en) Call message leaving method, terminal and device with storage function
US11404057B2 (en) Adaptive interactive voice response system
US20060077967A1 (en) Method to manage media resources providing services to be used by an application requesting a particular set of services
US20210074296A1 (en) Transcription generation technique selection
US11431767B2 (en) Changing a communication session
CN110534084A (en) Intelligent voice control method and system based on FreeWITCH
JP2013257428A (en) Speech recognition device
CN114584656B (en) Streaming voice response method and device and voice call robot thereof
CN114598773B (en) Intelligent response system and method
KR102426290B1 (en) Mobile device with automatic call response function, method for automatic call response by mobile device and computer program for the same
US20200227051A1 (en) Service control method, service control apparatus and device
CN114390144A (en) Intelligent processing method, device and control system for voice incoming call
CN117834778A (en) IVVR-based telephone call replacing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201030