WO2022012413A1 - Three-party call terminal for a mobile man-machine collaborative calling robot - Google Patents

Three-party call terminal for a mobile man-machine collaborative calling robot

Info

Publication number
WO2022012413A1
Authority
WO
WIPO (PCT)
Prior art keywords
call
processing module
end processing
voice
module
Prior art date
Application number
PCT/CN2021/105295
Other languages
English (en)
French (fr)
Inventor
司马华鹏
Original Assignee
南京硅基智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京硅基智能科技有限公司
Priority to US17/612,673 (US11516346B2)
Priority to EP21794449.5A (EP3968619B1)
Publication of WO2022012413A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/51 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/527 Centralised call answering arrangements not requiring operator intervention
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/51 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • H04M3/5166 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing in combination with interactive voice response systems or voice portals, e.g. as front-ends
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00 Manipulators not otherwise provided for
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00 Manipulators not otherwise provided for
    • B25J11/0005 Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1815 Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/38 Displays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/39 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech synthesis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2203/00 Aspects of automatic or semi-automatic exchanges
    • H04M2203/50 Aspects of automatic or semi-automatic exchanges related to audio conference
    • H04M2203/5018 Initiating a conference during a two-party conversation, i.e. three-party service or three-way call
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/51 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • H04M3/5183 Call or contact centers with computer-telephony arrangements

Definitions

  • The present disclosure relates to the field of artificial intelligence and, more particularly, to a three-party call terminal for a mobile man-machine collaborative calling robot.
  • Smart household appliances are steadily entering people's lives.
  • Many brands of voice-interactive smart devices have been launched. Users can interact with these devices by issuing voice commands to listen to songs, tell the time, chat, play games, find companionship, query information, and control other devices.
  • Such smart devices are mainly used in family life, leisure and entertainment, or children's education, and are rarely, and only with difficulty, applied in enterprise settings.
  • Current voice-interactive robots, especially telephone man-machine collaborative calling robot systems, comprise two parts: a man-machine collaborative calling robot system based on artificial intelligence technology and a dialogue system, and a voice communication system based on a communication network and VoIP technology.
  • The two are tightly coupled, so the system is extremely complex, and development, deployment, and maintenance are difficult and costly. Replacing either component is difficult and inflexible.
  • A telephone robot built this way is too complex and bulky to move once deployed. It runs on large-scale cloud servers, so there is no physical entity that ordinary people can readily recognize, and it cannot give people an intuitive and friendly impression.
  • The purpose of the present disclosure is to provide a three-party call terminal for a mobile man-machine collaborative calling robot that is decoupled from the communication system, is easy to deploy and to switch, provides mobility, can be conveniently placed in various settings, and connects easily to personal mobile phones or call terminals.
  • The present disclosure provides a three-party call terminal for a mobile man-machine collaborative calling robot, including:
  • a first voice interface configured to be connected to a back-end processing module and to transmit call audio between a call object and the back-end processing module, wherein the back-end processing module is configured to interact with the call object according to preset rules;
  • a CODEC1 module configured to encode and/or decode the call audio between the call object and the back-end processing module;
  • a second voice interface configured to connect to a human agent and to transmit call audio between the call object and the human agent, the second voice interface being further configured to transmit the call audio between the call object and the back-end processing module to the human agent;
  • a CODEC2 module configured to encode and/or decode the call audio between the call object and the human agent;
  • a call control module configured to process control signals and to automatically dial, answer, and hang up calls;
  • a data processing submodule configured to process voice data and to exchange data with the back-end processing module; and
  • a networking submodule configured to establish a network connection with the back-end processing module.
  • The three-party call terminal further includes a display module configured to display, to the human agent and the call object, the call record between the call object and the back-end processing module or other call-related information.
  • The three-party call terminal further includes a button submodule for inputting control instructions.
  • The three-party call terminal is arranged inside an audio device that includes a speaker and a microphone, and the second voice interface of the three-party call terminal is connected to the speaker and the microphone of the audio device.
  • The present disclosure further provides a communication system including the three-party call terminal according to the first aspect, a back-end processing module, a human agent, and at least one communication terminal, wherein the human agent is connected to the three-party call terminal through the communication terminal.
  • The back-end processing module is configured to process the voice data sent by the three-party call terminal and to generate response voice and text that are sent back to the three-party call terminal.
  • The back-end processing module includes a dialogue management submodule, a speech recognition submodule, an intent recognition submodule, and a speech synthesis submodule;
  • the dialogue management submodule controls the flow and logic of the dialogue and generates the response text;
  • the speech recognition submodule recognizes the received voice of the call object and converts it into text;
  • the intent recognition submodule recognizes the intention of the call object from the recognized text;
  • the speech synthesis submodule synthesizes the response text into speech and sends it to the three-party call terminal.
  • The present disclosure also provides a calling method applied to the communication system according to the second aspect, the method including:
  • transmitting the response voice to the communication terminal through the three-party call terminal; transmitting the response voice to the call object through the communication terminal; and transmitting the response voice and/or the response text to the human agent through the three-party call terminal;
  • wherein the response voice and the response text are generated by the back-end processing module according to preset rules and the voice of the call object.
  • Before the voice of the call object is acquired through the communication terminal, the method further includes:
  • the three-party call terminal synchronizes the dialogue script written according to the business logic and the call object data to the back-end processing module;
  • after receiving the dialogue script and the call object data, the back-end processing module opens a session between the communication terminal and the back-end processing module;
  • the back-end processing module sends the voice and text of the opening or welcome remarks to the communication terminal through the three-party call terminal, so as to start the call between the back-end processing module and the call object.
  • At the three-party call terminal, the method further includes:
  • the three-party call terminal loads the task list according to the administrator's operation or pre-scheduled tasks;
  • the three-party call terminal retrieves the corresponding dialogue script, where one script represents a complete set of business processes;
  • the three-party call terminal queries the call object data from the call object database.
  • Generating the robot response voice and text after the back-end processing module processes the voice of the call object includes:
  • the back-end processing module analyzes the intention of the call object from the voice of the call object;
  • the back-end processing module generates the text of the response sentence according to the intention of the call object and the strategies and rules built into the dialogue script;
  • the back-end processing module determines the robot response voice from the text of the response sentence.
  • Analyzing the intention of the call object from the voice of the call object includes:
  • the back-end processing module converts the voice of the call object into text;
  • the back-end processing module performs word segmentation on the text and obtains the word segmentation result;
  • the back-end processing module analyzes the intention of the call object according to the word segmentation result.
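The text-to-intent steps above (word segmentation followed by intent analysis) can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the greedy longest-match segmenter, the `VOCABULARY` list, and the `INTENT_KEYWORDS` table are all hypothetical stand-ins for whatever segmentation model and script rules a real back-end would use.

```python
import re
from typing import Dict, List

# Hypothetical multi-word terms and intent keywords; a real dialogue script
# would supply its own.
VOCABULARY: List[str] = ["pay back", "next week", "not interested", "call later", "how much"]
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "promise_to_pay": ["pay back"],
    "refuse": ["not interested"],
    "ask_amount": ["how much"],
    "reschedule": ["call later", "next week"],
}


def segment(text: str, vocabulary: List[str]) -> List[str]:
    """Greedy longest-match segmentation: prefer known multi-word terms,
    fall back to single words."""
    words = re.findall(r"[a-z']+", text.lower())
    max_len = max(len(term.split()) for term in vocabulary)
    tokens: List[str] = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += n
                break
    return tokens


def recognize_intent(tokens: List[str]) -> str:
    """Return the first intent whose keyword appears among the tokens."""
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in tokens for keyword in keywords):
            return intent
    return "unknown"


tokens = segment("I will pay back next week.", VOCABULARY)
intent = recognize_intent(tokens)
```

A keyword table keeps the example short; the disclosure only states that intent is derived from the segmentation result, so a statistical classifier would fit the same interface.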
  • The method further includes:
  • the back-end processing module transmits the call record to the three-party call terminal, which saves it in the database of the three-party call terminal.
  • The method further includes:
  • the human agent sends a manual intervention instruction to the three-party call terminal;
  • in response to the manual intervention instruction, the three-party call terminal cuts off the connection with the back-end processing module and switches to manual intervention mode.
  • The present disclosure uses the three-party call terminal to decouple the communication system (human agent and communication terminal) from the man-machine collaborative calling robot system (back-end processing module). This reduces system complexity, simplifies deployment, allows flexible switching, and can greatly reduce the development, deployment, and maintenance costs of a telephone man-machine collaborative calling robot system. It also gives the robot mobility: the three-party call terminal can be conveniently placed in various settings and gives the robot an intuitive, tangible physical form, making it friendlier. The terminal connects easily to personal mobile phones or call terminals and provides Bluetooth and audio-port access, giving it a wide range of applications.
  • FIG. 1 is a module block diagram of the three-party call terminal of the present disclosure.
  • FIG. 2 is a schematic structural diagram of the three-party communication system of the present disclosure.
  • The present disclosure provides a three-party communication system including a three-party call terminal, a back-end processing module, a human agent, and at least one communication terminal.
  • When in use, the back-end processing module, the human agent, and the communication terminal are each connected to the three-party call terminal, which transmits voice data among them.
  • The communication terminal is the terminal device used by the call object.
  • The human agent is the terminal device that monitors the voice data exchanged between the back-end processing module and the communication terminal; through it, a person can intervene manually, replacing the back-end processing module and communicating directly with the communication terminal.
  • The present disclosure can adopt a three-party call terminal as shown in FIG. 1.
  • The three-party call terminal includes a first voice interface, a CODEC1 module, a second voice interface, a CODEC2 module, a call control module, a data processing submodule, and a networking submodule.
  • The three-party call terminal transmits voice data among the back-end processing module, the human agent, and the communication terminal.
  • The operation process of the communication terminal is as follows:
  • M4 Receive the robot response voice or the human agent's voice transmitted by the three-party call terminal through the second voice interface.
  • The modules of the three-party call terminal function as follows:
  • the first voice interface transmits the call audio between the call object and the back-end processing module;
  • the CODEC1 module encodes and decodes the voice and audio between the call object and the back-end processing module;
  • the second voice interface transmits the call audio between the human agent and the call object;
  • the CODEC2 module encodes and decodes the voice and audio between the human agent and the call object;
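The disclosure does not name the codec used by CODEC1 or CODEC2. As one illustration of what a telephony codec stage does, the sketch below applies mu-law companding, the compression curve behind 8-bit telephone audio, to samples in the range -1.0 to 1.0. It uses the continuous companding formula with uniform 8-bit quantization and is an assumption for illustration only; it is not a bit-exact G.711 implementation, and the function names are hypothetical.

```python
import math

MU = 255  # companding parameter commonly used for 8-bit mu-law audio


def mulaw_encode(sample: float) -> int:
    """Compress a sample in [-1.0, 1.0] to an 8-bit code in 0..255."""
    sample = max(-1.0, min(1.0, sample))
    compressed = math.copysign(
        math.log1p(MU * abs(sample)) / math.log1p(MU), sample
    )
    # Map [-1, 1] onto the 256 available codes.
    return int(round((compressed + 1.0) / 2.0 * 255))


def mulaw_decode(code: int) -> float:
    """Expand an 8-bit code back to an approximate sample in [-1.0, 1.0]."""
    compressed = code / 255 * 2.0 - 1.0
    return math.copysign(
        math.expm1(abs(compressed) * math.log1p(MU)) / MU, compressed
    )


frame = [0.0, 0.25, -0.5, 0.9]
encoded = [mulaw_encode(s) for s in frame]
decoded = [mulaw_decode(c) for c in encoded]
```

The logarithmic curve spends more codes on quiet samples than on loud ones, which suits speech; the round trip is lossy, so `decoded` only approximates `frame`.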
  • The first voice interface and the second voice interface can each be a Bluetooth terminal or an audio port.
  • The human agent receives the call audio through the second voice interface in order to decide whether to intervene manually.
  • The second voice interface sends the voice of the human agent to the three-party call terminal;
  • One or more ordinary microphones or MEMS microphones can be provided on the audio port, or a far-field microphone array can be used, to receive the voice of the call object sent by the communication terminal through the voice connection module. The speaker plays the robot's response voice; if necessary, the voice of the call object can also be played through the speaker, provided this does not cause howling or reverberation.
  • The networking submodule connects to the back-end processing module through a wired link, Wi-Fi, or a 4G/5G network and sends and receives voice and other data;
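The disclosure does not specify how voice and other data are framed on the link between the networking submodule and the back-end processing module. The sketch below shows one common approach, a length-prefixed frame, so that voice and text can share a single byte stream. The header layout (`!BI`: one type byte and a four-byte payload length) and the frame type constants are hypothetical.

```python
import struct
from typing import List, Tuple

# Hypothetical wire format: 1-byte frame type + 4-byte big-endian payload length.
HEADER = struct.Struct("!BI")
FRAME_VOICE = 1
FRAME_TEXT = 2


def pack_frame(frame_type: int, payload: bytes) -> bytes:
    """Prefix the payload with its type and length."""
    return HEADER.pack(frame_type, len(payload)) + payload


def unpack_frames(buffer: bytes) -> Tuple[List[Tuple[int, bytes]], bytes]:
    """Extract all complete frames; return them with the unconsumed remainder."""
    frames: List[Tuple[int, bytes]] = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        frame_type, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((frame_type, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


stream = pack_frame(FRAME_VOICE, b"\x00\x01") + pack_frame(FRAME_TEXT, "hello".encode())
frames, remainder = unpack_frames(stream + b"\x01")
```

Returning the remainder lets a receiver keep partial data until the next network read completes the frame, which matters on Wi-Fi or cellular links where reads rarely align with frame boundaries.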
  • The data processing submodule schedules and controls the other modules in the three-party call terminal: it processes voice data and sends it to the back-end processing module, processes voice data from the back-end processing module and sends it to the communication terminal, controls the display screen to show the call, and receives user instructions from the touch screen;
  • The call control module controls batch calls between the communication system and the three-party call terminal.
  • The call control module includes a call object database, a system database, a task management module, and a communication controller submodule;
  • the call object database stores data related to the call object;
  • the system database stores call records and other data related to the call process;
  • the task management module manages call tasks;
  • the communication controller submodule schedules the other modules, obtains or stores data, and controls the communication system to carry out batch calls;
  • The call control module also includes a script editor and a script database.
  • The script editor is used by the script author to create and modify dialogue scripts;
  • the script database stores the dialogue scripts that the script author produces with the script editor.
  • The workflow of the call control module is as follows:
  • one dialogue script represents a complete set of business processes, including its dialogue rules, all possible response sentence texts, and dialogue- and business-related data such as the rules for intention evaluation, as well as audio if it was recorded by a sound engineer;
  • the required data is queried from the call object database, such as phone number, name, and gender, as well as other business-related data such as the amount owed;
  • The above processes can be executed in batches or concurrently, provided that there are multiple communication terminals and three-party call terminals and that the back-end processing module supports concurrent tasks.
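The batch workflow above (load tasks, retrieve the script, query the call object data, run calls concurrently when several terminals are available) can be sketched with `asyncio`. Everything here is a hypothetical stand-in: `CallTask`, the in-memory `CALL_OBJECTS` and `SCRIPTS` dictionaries, and the placeholder data are assumptions, and the actual dialing is replaced by a no-op.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CallTask:
    callee_id: str
    script_name: str


# Hypothetical in-memory stand-ins for the call object database and the
# script database; the entries are placeholder data.
CALL_OBJECTS: Dict[str, Dict[str, str]] = {
    "c1": {"name": "Alice"},
    "c2": {"name": "Bob"},
    "c3": {"name": "Carol"},
}
SCRIPTS: Dict[str, str] = {"reminder": "Hello {name}, this is a payment reminder."}


async def run_call(task: CallTask, terminals: asyncio.Semaphore) -> str:
    """Run one call task once a terminal is free; return the opening remark."""
    async with terminals:  # each running call occupies one terminal
        callee = CALL_OBJECTS[task.callee_id]
        opening = SCRIPTS[task.script_name].format(name=callee["name"])
        await asyncio.sleep(0)  # placeholder for dialing and the dialogue
        return opening


async def run_batch(tasks: List[CallTask], terminal_count: int) -> List[str]:
    """Run all tasks, with at most terminal_count calls in flight at once."""
    terminals = asyncio.Semaphore(terminal_count)
    return await asyncio.gather(*(run_call(t, terminals) for t in tasks))


tasks = [CallTask("c1", "reminder"), CallTask("c2", "reminder"), CallTask("c3", "reminder")]
openings = asyncio.run(run_batch(tasks, terminal_count=2))
```

The semaphore models the condition stated in the text: concurrency is bounded by how many communication terminals and three-party call terminals exist.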
  • The three-party call terminal also includes a display screen and a button submodule.
  • The display screen can show the call records or other call-related information of the man-machine collaborative calling robot system and the call object; a touch screen can also provide the button function, allowing users to input control instructions by touch;
  • the three-party call terminal can also be provided with a wireless communication system such as Bluetooth to communicate with the voice connection module;
  • the three-party call terminal can also be provided with an audio circuit, so that audio input and output can be performed directly in digital form;
  • the three-party call terminal can also be provided with a power amplification module for amplifying the sound signal from the voice connection module;
  • the three-party call terminal can also be provided with a voice noise reduction module to perform noise reduction on the received audio signal;
  • the three-party call terminal can also be equipped with an AD/DA conversion chip, which converts the received voice of the call object into digital signals for transmission and converts the received robot voice into analog signals for playback through the speaker;
  • the three-party call terminal can also be provided with a control interface, including buttons, knobs, and the like, for external control.
  • The operation process of the three-party call terminal is as follows:
  • the voice of the call object is transmitted to the back-end processing module through the communication terminal and the three-party call terminal;
  • the back-end processing module processes the voice of the call object and generates the robot response voice and text;
  • the robot response voice is transmitted to the call object through the three-party call terminal and the communication terminal;
  • the robot response text is transmitted to the three-party call terminal and displayed;
  • the human agent can track the call at any time through the three-party call terminal and, if necessary, take over the call manually, which realizes man-machine collaborative calling.
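The audio flow described above, together with the manual takeover, can be sketched as a small routing table. This is an illustrative model only: the `Mode` and `Party` names are hypothetical, and the choice to drop the human agent's audio while the robot is still handling the call is an assumption that the disclosure does not state.

```python
from enum import Enum
from typing import List


class Mode(Enum):
    ROBOT = "robot"    # back-end processing module handles the call
    MANUAL = "manual"  # human agent has taken over


class Party(Enum):
    CALLEE = "call object"
    BACKEND = "back-end processing module"
    AGENT = "human agent"


class ThreePartyRouter:
    """Decides where audio from each party is forwarded."""

    def __init__(self) -> None:
        self.mode = Mode.ROBOT

    def intervene(self) -> None:
        """Manual intervention: disconnect the back end, switch to manual mode."""
        self.mode = Mode.MANUAL

    def route(self, source: Party) -> List[Party]:
        if source is Party.CALLEE:
            if self.mode is Mode.ROBOT:
                # The agent hears the call so they can decide whether to intervene.
                return [Party.BACKEND, Party.AGENT]
            return [Party.AGENT]
        if source is Party.BACKEND:
            if self.mode is Mode.ROBOT:
                return [Party.CALLEE, Party.AGENT]
            return []  # back end is disconnected after intervention
        # source is the human agent (assumption: silent until they intervene)
        return [Party.CALLEE] if self.mode is Mode.MANUAL else []


router = ThreePartyRouter()
before = router.route(Party.CALLEE)
router.intervene()
after = router.route(Party.CALLEE)
```

Keeping the routing decision in one place means the switch to manual mode is a single state change rather than a reconfiguration of each voice interface.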
  • The back-end processing module performs intent recognition on the voice data sent by the three-party call terminal, generates a reply voice according to the recognized intent, and sends it back to the three-party call terminal.
  • The back-end processing module includes a dialogue management submodule, a speech recognition submodule, an intent recognition submodule, a speech synthesis submodule, a word segmentation submodule, a voice separation submodule, a voiceprint recognition submodule, and a session management submodule. The back-end processing module is deployed on a cloud server and communicates with the three-party call terminal through a wired or wireless network.
  • The dialogue management submodule controls the flow and logic of the dialogue and generates the response text;
  • the speech recognition submodule recognizes the received voice of the call object and converts it into text;
  • the intent recognition submodule identifies the intention of the call object from the recognized text;
  • the speech synthesis submodule synthesizes the response text into speech and sends it to the three-party call terminal.
  • The dialogue system of the back-end processing module runs as follows:
  • after the call control module connects the call with the call object through the communication terminal (by actively dialing or passively answering), it synchronizes the dialogue script written according to the business logic and the call object data to the session management submodule and the dialogue management submodule of the back-end processing module;
  • the session management submodule opens a new session;
  • the session management submodule sends an instruction to the three-party call terminal to put it into answering mode;
  • the session management submodule sends the voice and text of the opening or welcome remarks to the three-party call terminal;
  • the three-party call terminal sends the voice to the call object through the voice connection module and the communication system, which starts the call between the robot and the call object;
  • the three-party call terminal receives the voice of the call object and sends it over the network to the speech recognition submodule of the back-end processing module;
  • the speech recognition submodule converts the voice of the call object into text and sends it to the intent recognition submodule;
  • the intent recognition submodule first calls the word segmentation submodule to segment the text, then identifies the intention of the call object from the segmentation result in combination with the dialogue script, and sends it to the dialogue management submodule;
  • alternatively, the intent recognition submodule can obtain the intention directly from the voice of the call object;
  • the dialogue management submodule generates the text of the response sentence according to the strategies and rules built into the dialogue script and sends it to the speech synthesis submodule;
  • the speech synthesis submodule converts the text into the robot response voice; optionally, the robot response voice can be recorded in advance by a sound engineer and retrieved according to the response sentence;
  • the session management submodule closes the session and transmits the call record to the call control module, which saves it to the system database for later query and analysis.
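The session lifecycle above (open a session, send the opening remarks, answer each turn from the script's rules, then close and hand over the call record) can be sketched as a small class. This is a minimal illustration under assumptions: the `Session` class, the intent-to-reply `rules` dictionary, and the `"fallback"` key are hypothetical, and speech recognition and synthesis are left out so that only the text flow is shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Session:
    callee: str
    opening: str
    rules: Dict[str, str]  # hypothetical script format: intent -> response text
    record: List[Tuple[str, str]] = field(default_factory=list)
    is_open: bool = True

    def start(self) -> str:
        """Send the opening remarks and log them in the call record."""
        self.record.append(("robot", self.opening))
        return self.opening

    def respond(self, callee_text: str, intent: str) -> str:
        """Log the callee's turn and pick the reply for the recognized intent."""
        if not self.is_open:
            raise RuntimeError("session is closed")
        self.record.append(("callee", callee_text))
        reply = self.rules.get(intent, self.rules["fallback"])
        self.record.append(("robot", reply))
        return reply

    def close(self) -> List[Tuple[str, str]]:
        """Close the session and return the call record for storage."""
        self.is_open = False
        return self.record


session = Session(
    callee="c1",
    opening="Hello, this is a payment reminder.",
    rules={
        "ask_amount": "The amount owed is shown on your statement.",
        "fallback": "Sorry, could you repeat that?",
    },
)
session.start()
reply = session.respond("How much is it?", "ask_amount")
call_record = session.close()
```

In the flow described in the text, the returned record would go to the call control module, which saves it to the system database.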
  • The present disclosure decouples the communication system from the man-machine collaborative calling robot system, which reduces system complexity, simplifies deployment, allows flexible switching, and can greatly reduce the development, deployment, and maintenance costs of a telephone man-machine collaborative calling robot system;
  • it gives the robot mobility: the three-party call terminal can be conveniently placed in various settings and gives the robot an intuitive, tangible physical form, making it friendlier, and it connects easily to personal mobile phones or call terminals;
  • the three-party call terminal in the present disclosure retains the functions of a traditional telephone customer-service man-machine collaborative calling robot system: it provides a screen display on which settings, call records, and switching can be handled easily, making it more convenient to use, and it supports external devices such as headsets, so that the human agent can track the call at any time and intervene.
  • In existing systems, deployment is still similar to that of a call center.
  • The corresponding telephone robot system is deployed in advance, and the human agent or other users must work at the place where the system is deployed in order to realize man-machine collaboration.
  • The three-party call terminal in the present disclosure is more convenient to deploy.
  • The three-party call terminal moves the communication system of the prior art onto the user's own communication terminal, such as a mobile phone or a fixed telephone, thereby decoupling the communication system, the man-machine collaborative calling system, and the back-end calling robot system from one another.
  • The following example illustrates the deployment of the three-party call terminal in this implementation:
  • An independent salesperson, as a user of the three-party call terminal, connects the terminal directly to his or her own call terminal (such as a mobile phone) by wired or wireless means, which completes the deployment without any further operations. The deployment efficiency and convenience of the three-party call terminal in this implementation are therefore significantly improved over the prior art.
  • The salesperson in the above example can connect the three-party call terminal to his or her mobile phone through an audio cable or a Bluetooth connection.
  • A headset then delivers the audio to the salesperson (alternatively, if the three-party call terminal is equipped with a speaker and microphone, the audio can be played directly).
  • The salesperson calls a prospective customer through the mobile phone (alternatively, the three-party call terminal can automatically call a prospective customer).
  • The three-party call terminal uploads the voice of the prospective customer to the back-end processing module deployed in the cloud.
  • After the telephone robot in the back-end processing module generates a corresponding response according to the rules, the three-party call terminal returns the response voice to the prospective customer, which realizes the interaction between the back-end processing module and the prospective customer.
  • The salesperson can monitor the interaction between the back-end processing module and the prospective customer at any time through the headset and, when manual intervention is required, interact directly with the prospective customer through the three-party call terminal.
  • With the three-way call terminal of this implementation, no human agent station needs to be set up in advance: the user can access the three-way call terminal at any time through a communication terminal he or she carries, and the human agent can choose a suitable type of communication terminal according to actual needs, without being limited to a fixed communication terminal.
  • The call is initiated by the user's own communication terminal, so sensitive information such as the customer's phone number does not need to be uploaded to the telephone robot, which avoids the possibility of information leaking through the telephone robot during use.
  • In another implementation, the three-way call terminal can be deployed directly inside the user's own communication terminal (an audio device), such as a mobile phone or a Bluetooth headset.
  • Such audio devices include a speaker and a microphone, and the three-way call terminal is placed inside the device.
  • The second voice interface of the three-way call terminal is connected to the speaker and microphone of the audio device, so that the audio device itself supports three-way calls.
  • The user of the audio device can then take the human agent role through the device, sending voice data to the three-way call terminal through the microphone and receiving voice data from it through the speaker, which makes the three-way call terminal as a whole mobile.
  • The following example illustrates how the three-way call terminal is deployed in this implementation:
  • A salesperson can directly use an audio device (such as a mobile phone) in which the three-way call terminal is deployed, which removes the step of connecting the terminal to his or her own call terminal and effectively improves deployment efficiency and convenience.
  • Because the audio device is highly mobile, the three-way call terminal can be used anywhere; its flexibility in terms of place of use is significantly better than in the prior art.
  • In actual use, the salesperson in the above example enables the three-way calling function on the mobile phone.
  • Specifically, the salesperson can tap the three-way calling application on the mobile phone to start the three-way call terminal and enable the three-way calling function.
  • For example, once the terminal is started, the mobile phone uses its network connection to connect the three-way call terminal to the back-end processing module in the cloud, and enables forwarding of the voice data exchanged between the mobile phone and other communication terminals to the three-way call terminal.
  • The salesperson calls a prospective customer from the mobile phone; the phone receives the customer's voice data and passes it to the three-way call terminal, which uploads it to the back-end processing module.
  • After the telephone robot in the back-end processing module generates a response according to its rules, it sends the response data back to the three-way call terminal, which forwards it to the prospective customer through the salesperson's mobile phone, realizing the interaction between the back-end processing module and the prospective customer.
  • The salesperson can monitor this interaction at any time through the speaker of the audio device and, when manual intervention is required, interact with the prospective customer through the three-way call terminal.
  • In another example, a salesperson uses a three-way call terminal that is integrated into a Bluetooth headset; wearing the headset is enough to use the three-way calling function. In actual use, the salesperson wears the headset and enables the function through a trigger button on the headset or an application installed on the mobile phone. The salesperson then calls a prospective customer from the mobile phone; the phone receives the customer's voice data and sends it over Bluetooth to the headset, where it is passed to the three-way call terminal and uploaded to the back-end processing module.
  • After the telephone robot in the back-end processing module generates a response according to its rules, it sends the response data back to the three-way call terminal, which sends it to the prospective customer through the microphone in the Bluetooth headset, realizing the interaction between the back-end processing module and the prospective customer.
  • During this process, the salesperson can monitor the interaction between the back-end processing module and the prospective customer at any time through the headset speaker and, when manual intervention is required, interact with the prospective customer through the three-way call terminal.
  • As the above shows, the three-way call terminal in this implementation is more tightly integrated with the communication terminal; the user can carry it and use it in any setting, without being limited to a fixed one.
  • The present disclosure provides on-screen display through a display module.
  • Specifically, the three-way call terminal shows the call party's conversation content and the robot's responses through the display module.
  • When the human agent's display is separate from that of the three-way call terminal, the display module pushes this content to the human agent, so that the call party's conversation content and the robot's responses are shown on the human agent's display.
  • In one implementation, the human agent and the call party watch the display of the three-way call terminal at the same time.
  • In that case, the display module (equivalent to the display screen) of the three-way call terminal directly shows the call party's conversation content and the robot's responses.
  • The call party and the human agent can thus follow the interaction between the call party and the telephone robot at the same time, which keeps the interaction data synchronized between them and lets the human agent step into the call in time to solve problems the telephone robot cannot.
  • In another implementation, the human agent connects to the three-way call terminal through his or her own communication terminal (see the first implementation of the deployment mode of the three-way call terminal above).
  • If the communication terminal used by the human agent has its own display, the agent can browse the call party's conversation content and the robot's responses on that display. In this way, even when using his or her own communication terminal, the agent can follow the interaction between the call party and the robot at any time.
  • The back-end processing module can further process the call party's conversation content and the robot's responses, so that the first content displayed on the three-way call terminal differs from the second content displayed on the human agent's display.
  • For example, if the human agent wants the detailed conversation content of the call party and the robot's responses, the back-end processing module processes both into the second content and sends it to the human agent through the display module, to be shown on the agent's own display.
  • If the call party only wants to browse the robot's responses in order to get the needed answers quickly, the back-end processing module processes the robot's responses into the first content and shows it on the three-way call terminal through the display module. This targeted display meets the different needs of the call party and the human agent.
  • The present disclosure can be used to implement a human-machine collaborative calling telephone robot; the call system operates as follows:
  • The three-way call terminal device is connected to the communication terminal device, via Bluetooth or a 3.5 mm audio jack;
  • The human agent connects to the three-way call terminal, via Bluetooth or a 3.5 mm audio jack;
  • The three-way call terminal connects to the back-end processing module, via Ethernet, WIFI, 4G, or 5G;
  • The session management sub-module opens the session, instructs the three-way call terminal to enter waiting-for-call mode, and sends it the necessary data, such as the opening speech/text and the call party's data;
  • The three-way call terminal detects the call-connected signal, sends the opening speech to the call party through the communication terminal, and shows the related text on the display;
  • The three-way call terminal receives the call party's speech and sends it to the back-end processing module;
  • The back-end processing module calls the speech recognition sub-module to convert the speech into text and then identifies the call party's intent through the intent recognition sub-module. After the dialogue management module has made its decision, the robot's reply text and speech are generated.
  • Optionally, the intent recognition sub-module can also recognize the call party's intent directly from the call party's speech;
  • The robot's reply speech is processed by the data processing sub-module in the three-way call terminal and played, and at the same time sent to the call party through the communication terminal;
  • The human agent can listen to the robot's speech and the call party's speech through the second voice interface, or follow the call by reading the text on the screen;
  • When the human agent considers it necessary and presses the intervention button on the three-way call terminal, the data processing sub-module switches the session to manual intervention mode (with no effect on the communication system itself) and cuts off the connection to the back-end processing module;
  • The human agent speaks directly into the second voice interface of the three-way call terminal; the speech is sent to the call party through the communication terminal, so the agent talks to the call party directly and the handover is seamless.
  • The robot's voice has been matched to the agent's voice by the speech synthesis sub-module, so it imitates the agent's voice reasonably well;
  • The conversation between the human agent and the call party can also be recognized as text by the speech recognition sub-module of the back-end processing module and shown on the display;
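
The mode changes in the operation process above can be pictured as a small state machine. The following is only a sketch under stated assumptions: the class, method, and mode names are hypothetical and do not come from the patent, and real terminal firmware would also handle audio, networking, and errors.

```python
from enum import Enum, auto


class Mode(Enum):
    IDLE = auto()
    WAITING = auto()   # waiting-for-call mode set by the session manager
    ROBOT = auto()     # the back-end robot talks to the call party
    MANUAL = auto()    # the human agent has taken over


class TerminalSession:
    """Hypothetical model of the terminal's call modes (all names are assumptions)."""

    def __init__(self):
        self.mode = Mode.IDLE
        self.backend_connected = False

    def connect_backend(self):
        self.backend_connected = True

    def enter_waiting(self):
        if not self.backend_connected:
            raise RuntimeError("back end must be connected before a session opens")
        self.mode = Mode.WAITING

    def on_call_connected(self):
        if self.mode is not Mode.WAITING:
            raise RuntimeError("terminal is not waiting for a call")
        self.mode = Mode.ROBOT

    def press_intervene(self):
        if self.mode is not Mode.ROBOT:
            raise RuntimeError("intervention only applies while the robot is talking")
        self.backend_connected = False  # cut the link to the back end
        self.mode = Mode.MANUAL

    def speaker_to_call_party(self):
        """Return who currently answers the call party, or None outside a call."""
        return {Mode.ROBOT: "robot", Mode.MANUAL: "agent"}.get(self.mode)
```

Keeping the mode inside the terminal mirrors the point made in the text that manual intervention has no effect on the communication system itself: only the terminal's internal routing changes.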

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

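The present invention relates to the field of artificial intelligence and discloses a three-way call terminal for a mobile human-machine collaborative calling robot. The key points of the technical solution are: a first voice interface for transmitting call audio between the call party and a back-end processing module; a CODEC1 module for encoding and decoding the call audio between the call party and the back-end processing module; a second voice interface for transmitting call audio between a human agent and the call party; a CODEC2 module for encoding and decoding the call audio between the human agent and the call party; a call control module for processing control signals and for automatically dialing, answering, and hanging up calls; a data processing sub-module for processing voice data and exchanging data with the back-end processing module; and a networking sub-module for connecting to the back-end processing module. The terminal can be decoupled from the communication system, is easy to deploy and to switch, provides mobility, and can be conveniently placed in a variety of settings.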

Description

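Three-way call terminal for a mobile human-machine collaborative calling robot
The present disclosure claims priority to Chinese patent application No. 202010669451.X, entitled "Three-way call terminal for a mobile human-machine collaborative calling robot", filed with the China Patent Office on July 13, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of artificial intelligence and, more specifically, to a three-way call terminal for a mobile human-machine collaborative calling robot.
Background
With the rapid progress of computer technology, communication technology, the Internet, and artificial intelligence, all kinds of smart home appliances keep entering people's lives, for example smart TVs, smart refrigerators, smart air conditioners, smart speakers, smart watches, smart bracelets, and smart glasses. Voice-interaction smart devices of many brands are already on the market in large numbers; users interact with them by issuing voice commands to listen to music, check the time, chat, play games, keep company, look up information, control devices, and so on. At present, however, smart devices are mainly used in home life, leisure and entertainment, or children's education, and they are rarely, and only with difficulty, applied at the enterprise level.
With the vigorous development of artificial intelligence and communication technology, telephone robots are now widely used across industries, greatly reducing the labor cost of call centers and improving efficiency. However, current voice interaction robots, and telephone human-machine collaborative calling robot systems in particular, comprise both a human-machine collaborative calling robot system built mainly on artificial intelligence and dialogue systems, and a voice communication system built mainly on communication networks and VoIP. The two are tightly bound, which makes the overall system extremely complex; development, deployment, and maintenance are difficult and costly. Replacing any one component is very difficult and inflexible. Telephone robots developed on this basis are overly complex and bulky, and hard to move once deployed. Such a robot runs on large-scale cloud servers and has no physical entity that an ordinary person can readily recognize, so it cannot give people an intuitive, friendly impression.
Realizing a mobile human-machine collaborative calling robot requires a telephone three-way call terminal that supports three-way calls and is easy to use. Traditional human-machine collaborative calling robots generally use a desktop computer as the call terminal, which is complicated to operate and inconvenient to move, while newer mobile telephone terminals do not support three-way calls and therefore cannot implement a human-machine collaborative calling robot.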
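Summary
The purpose of the present disclosure is to provide a three-way call terminal for a mobile human-machine collaborative calling robot that can be decoupled from the communication system, is easy to deploy and to switch, provides mobility, can be conveniently placed in a variety of settings, and connects easily to a personal mobile phone or call terminal.
The above technical purpose of the present disclosure is achieved through the following technical solutions:
In a first aspect, the present disclosure provides a three-way call terminal for a mobile human-machine collaborative calling robot, comprising:
a first voice interface, configured to connect to a back-end processing module and to transmit call audio between a call party and the back-end processing module, wherein the back-end processing module is configured to interact with the call party according to preset rules;
a CODEC1 module, configured to encode and/or decode the call audio between the call party and the back-end processing module;
a second voice interface, configured to connect to a human agent and to transmit call audio between the call party and the human agent; the second voice interface is further configured to transmit the call audio between the call party and the back-end processing module to the human agent;
a CODEC2 module, configured to encode and/or decode the call audio between the call party and the human agent;
a call control module, configured to process control signals and to automatically dial, answer, and hang up calls;
a data processing sub-module, configured to process voice data and to exchange data with the back-end processing module;
a networking sub-module, configured to establish a network connection with the back-end processing module.
In a preferred technical solution of the present disclosure, the three-way call terminal further comprises a display module configured to display, to the human agent and the call party, the call record of, or call-related information about, the call between the call party and the back-end processing module.
In a preferred technical solution of the present disclosure, the three-way call terminal further comprises a key sub-module used to input control instructions.
In a preferred technical solution of the present disclosure, the three-way call terminal is arranged inside an audio device that includes a speaker and a microphone, and the second voice interface of the three-way call terminal is connected to the speaker and the microphone of the audio device.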
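In a second aspect, the present disclosure further provides a call system comprising: the three-way call terminal according to the first aspect, a back-end processing module, a human agent, and at least one communication terminal, wherein the human agent is connected to the three-way call terminal through the communication terminal.
In a preferred technical solution of the present disclosure, the back-end processing module is used to process the voice data sent by the three-way call terminal and to generate response speech and text, which are sent back to the three-way call terminal.
In a preferred technical solution of the present disclosure, the back-end processing module comprises a dialogue management sub-module, a speech recognition sub-module, an intent recognition sub-module, and a speech synthesis sub-module;
the dialogue management sub-module is used to control the flow and logic of the dialogue and to generate the response text;
the speech recognition sub-module is used to recognize the received speech of the call party and convert it into text;
the intent recognition sub-module is used to identify the call party's intent from the recognized text;
the speech synthesis sub-module is used to synthesize the response text into speech and send it to the three-way call terminal.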
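In a third aspect, the present disclosure further provides a call method applied to the call system according to the second aspect, the method comprising:
acquiring the call party's speech through the communication terminal, and transmitting it through the three-way call terminal to the back-end processing module and the human agent;
transmitting response speech through the three-way call terminal to the communication terminal, which transmits it to the call party; and transmitting the response speech and/or response text through the three-way call terminal to the human agent;
wherein the response speech and the response text are generated by the back-end processing module according to the preset rules and the call party's speech.
In a preferred technical solution of the present disclosure, before the call party's speech is acquired through the communication terminal, the method further comprises:
the three-way call terminal synchronizing the call script written according to the business logic, together with the call party's data, to the back-end processing module;
the back-end processing module, after receiving the call script and the call party's data, opening a session between the communication terminal and the back-end processing module;
the back-end processing module sending an instruction to the three-way call terminal so that the terminal enters answering mode;
the back-end processing module sending the speech and text of the opening remarks/welcome message to the call terminal through the three-way call terminal, to start the call between the back-end processing module and the call party.
In a preferred technical solution of the present disclosure, before the three-way call terminal synchronizes the call script written according to the business logic and the call party's data to the back-end processing module, the method further comprises:
the three-way call terminal loading a task list according to an administrator's operation or a pre-scheduled task;
the three-way call terminal retrieving the corresponding call script according to the task list, the call script representing a complete business process;
the three-way call terminal querying the call party's data from the call party database.
In a preferred technical solution of the present disclosure, generating the robot's response speech and text after the back-end processing module processes the call party's speech comprises:
the back-end processing module analyzing the call party's intent from the call party's speech;
the back-end processing module generating the response sentence text according to the call party's intent and the strategies and rules built into the call script;
the back-end processing module determining the robot's response speech according to the response sentence text.
In a preferred technical solution of the present disclosure, the back-end processing module analyzing the call party's intent from the call party's speech comprises:
the back-end processing module converting the call party's speech into text;
the back-end processing module segmenting the text into words to obtain a segmentation result;
the back-end processing module analyzing the call party's intent from the segmentation result.
In a preferred technical solution of the present disclosure, the method further comprises:
displaying, through the three-way call terminal, the call record of, or call-related information about, the call between the back-end processing module and the call party.
In a preferred technical solution of the present disclosure, the method further comprises:
upon recognizing that the dialogue between the communication terminal and the back-end processing module has ended, closing, by the back-end processing module, the session between the communication terminal and the back-end processing module;
the back-end processing module transmitting the call record to the three-way call terminal, where it is saved in the terminal's database.
In a preferred technical solution of the present disclosure, the method further comprises:
the human agent sending a manual intervention instruction to the three-way call terminal;
the three-way call terminal, in response to the manual intervention instruction, cutting off the connection to the back-end processing module and switching to manual intervention mode.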
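In summary, the present disclosure uses the three-way call terminal to decouple the communication system (the human agent and the communication terminal) from the human-machine collaborative calling robot system (the back-end processing module). This reduces system complexity, makes the system easy to deploy and flexible to switch, and can greatly reduce the development, deployment, and maintenance costs of a telephone human-machine collaborative calling robot system. It gives the robot mobility: the three-way call terminal can be conveniently placed in a variety of settings and can also give the robot an intuitive, tangible physical form that makes it more approachable. It connects easily to a personal mobile phone or call terminal, and it offers both Bluetooth and audio-port access, giving it a wide range of applications.
Brief Description of the Drawings
Fig. 1 is a module block diagram of the three-way call terminal of the present invention;
Fig. 2 is a schematic structural diagram of the three-way call system of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings.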
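As shown in Fig. 2, the present disclosure provides a three-way call system comprising: a three-way call terminal, a back-end processing module, a human agent, and at least one communication terminal. The system shown in Fig. 2 includes one communication terminal. In use, the back-end processing module, the human agent, and the communication terminal are each connected to the three-way call terminal, which carries voice data among them. Here, the communication terminal is the terminal device used by the call party, and the human agent is the terminal device that monitors the dialogue between the back-end processing module and the communication terminal: it monitors the voice data exchanged between them and allows a person to intervene and talk to the communication terminal directly in place of the back-end processing module.
The present disclosure may use the three-way call terminal shown in Fig. 1, which comprises: a first voice interface, a CODEC1 module, a second voice interface, a CODEC2 module, a call control module, a data processing sub-module, and a networking sub-module. The three-way call terminal carries voice data among the back-end processing module, the human agent, and the communication terminal.
The operation of the three-way call terminal, the back-end processing module, and the communication terminal is described in detail below.
The communication terminal operates as follows:
M1. connect to the three-way call terminal;
M2. after the session is opened, receive the call party's call audio;
M3. feed the call audio into the three-way call terminal through the second voice interface, so that the terminal forwards it to the back-end processing module and the human agent;
M4. receive, through the second voice interface, the robot's response speech or the human agent's speech forwarded by the three-way call terminal.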
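The modules of the three-way call terminal are as follows:
First voice interface: transmits the call audio between the call party and the back-end processing module;
CODEC1 module: encodes and decodes the call audio between the call party and the back-end processing module;
Second voice interface: transmits the call audio between the human agent and the call party;
CODEC2 module: encodes and decodes the call audio between the human agent and the call party;
The first and second voice interfaces may be Bluetooth or audio ports. During a call, the human agent receives the call audio through the second voice interface and judges whether to intervene; when intervention is needed, the agent's speech is also sent to the three-way call terminal through the second voice interface;
The audio port may be fitted with one or more ordinary or MEMS microphones, or a far-field microphone array, to pick up the call party's speech sent by the communication terminal through the voice connection module. It may also be fitted with a speaker or other playback device to play the robot's response speech; if necessary, and provided that it causes no echo, howling, or reverberation, the call party's speech can also be played through the speaker.
Networking sub-module: connects to the back-end processing module over a wired link, WIFI, or a 4G/5G network, and sends/receives speech and other data;
Data processing sub-module: schedules and controls the other modules of the three-way call terminal; processes voice data and sends it to the back-end processing module; processes voice data from the back-end processing module and sends it to the communication terminal; controls the display to show the text record of the call; and receives user instructions from the touch screen;
Call control module: controls batch calls between the communication system and the three-way call terminal.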
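The call control module comprises: a call party database, a system database, a task management module, and a communication controller sub-module;
Call party database: stores data related to the call parties;
System database: stores call records and other data related to the call process;
Task management module: manages call tasks;
Communication controller sub-module: schedules the other modules, retrieves or stores data, and controls the communication system to carry out batch calls;
In addition, the call control module includes a script editor and a script database: script authors use the editor to create and modify call scripts, and the script database stores the scripts created with the editor.
The call control module works as follows:
load the task list through the task management module, according to an administrator's operation or a pre-scheduled task;
retrieve from the script database the call script required by the task, prepared in advance by script authors; one script represents a complete business process and includes dialogue and business data such as its dialogue rules, the text of all possible response sentences, and the rules for rating the call party's intent; if a voice artist's recordings are used, it also includes the recorded audio;
query the required data from the call party database, for example phone number, name, and gender, as well as other business-related data such as the amount owed;
synchronize the call script and the call party's data to the back-end processing module over the network;
control the communication terminal to connect to the call party (by dialing out or by answering an incoming call);
wait for the call to end, then receive the call record from the back-end processing module and store it in the database;
as needed, the above process can be run in batches; it can also be run concurrently, provided that there are multiple communication terminals and three-way call terminals and that the back-end processing module supports concurrent tasks.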
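
The call control workflow above can be sketched as a short task loop. This is an illustrative sketch only: `Task`, `CallController`, `FakeBackend`, and `FakePhone` are hypothetical names introduced here, the databases are plain dictionaries, and the stand-in back end returns a canned call record instead of talking to a real service.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    script_id: str
    party_id: str


class FakeBackend:
    """Stand-in for the cloud back end; a real one would sit behind a network API."""

    def sync(self, script, party):
        self.script, self.party = script, party

    def wait_for_record(self):
        return {"party": self.party["name"], "script": self.script["name"]}


class FakePhone:
    """Stand-in for the communication terminal that places or answers the call."""

    def __init__(self):
        self.dialed = []

    def connect(self, number):
        self.dialed.append(number)


@dataclass
class CallController:
    scripts: dict                                 # script database
    parties: dict                                 # call party database
    records: list = field(default_factory=list)   # system database

    def run_task(self, task, backend, phone):
        script = self.scripts[task.script_id]     # retrieve the call script
        party = self.parties[task.party_id]       # query the call party's data
        backend.sync(script, party)               # sync both to the back end
        phone.connect(party["phone"])             # dial out (or answer)
        record = backend.wait_for_record()        # wait for the call to end
        self.records.append(record)               # store the call record
        return record

    def run_batch(self, tasks, backend, phone):
        return [self.run_task(t, backend, phone) for t in tasks]
```

A batch here is simply sequential; running tasks concurrently would, as the text notes, need several communication terminals and three-way call terminals and a back end that supports concurrent tasks.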
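In addition, the three-way call terminal includes a display and a key sub-module. The display can show the call record between the human-machine collaborative calling robot system and the call party, or other call-related information; a touch screen can be used instead, which also serves as the keys and lets the user enter control instructions by touch;
The three-way call terminal may also be provided with a wireless communication system such as Bluetooth, for communicating with the voice connection module;
The three-way call terminal may also be provided with an audio circuit, so that audio input and output can be carried out directly in digital form;
The three-way call terminal may also be provided with a power amplifier module, for amplifying the sound signal from the voice connection module;
The three-way call terminal may also be provided with a voice noise reduction module, for denoising the received audio signal;
The three-way call terminal may also be provided with an AD/DA conversion chip, for converting the received speech of the call party into a digital signal for transmission and converting the received robot speech into an analog signal for playback through the speaker;
The three-way call terminal may also be provided with control interfaces, including buttons and knobs, for external control.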
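The three-way call terminal operates as follows:
A1. connect the three-way call terminal to the communication terminal;
A2. connect the three-way call terminal to the back-end processing module;
A3. open a session and connect to the call party;
A4. the call party's speech is transmitted to the back-end processing module via the communication terminal and the three-way call terminal;
A5. the back-end processing module processes the call party's speech and generates the robot's response speech and text;
A6. the robot's response speech is transmitted to the call party via the three-way call terminal and the communication terminal;
A7. the robot's response text is transmitted to the three-way call terminal and displayed;
A8. the human agent follows the call at any time through the three-way call terminal and can take over the call when necessary, realizing the human-machine collaborative calling function.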
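
The audio paths in steps A4 to A8 amount to a routing rule inside the terminal. The sketch below is one possible reading, not the patent's implementation: `route_frame` and the endpoint names are hypothetical, and dropping the agent's audio while the robot is talking is an assumption, since the text does not say what happens to it.

```python
def route_frame(source, frame, manual_mode):
    """Return (destination, frame) pairs for one audio frame.

    Endpoint names are illustrative labels, not identifiers from the patent.
    """
    if source == "call_party":
        # The agent always hears the call party; the back end only in robot mode.
        dests = ["human_agent"] if manual_mode else ["backend", "human_agent"]
    elif source == "backend":
        # Robot replies go to the call party and are monitored by the agent.
        dests = [] if manual_mode else ["call_party", "human_agent"]
    elif source == "human_agent":
        # Assumption: the agent's voice is forwarded only after taking over.
        dests = ["call_party"] if manual_mode else []
    else:
        raise ValueError(f"unknown source: {source}")
    return [(dest, frame) for dest in dests]
```

For example, `route_frame("call_party", b"...", manual_mode=False)` fans the frame out to both the back end and the human agent, which is what lets the agent follow the call before deciding to intervene.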
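The back-end processing module performs intent recognition on the voice data sent by the three-way call terminal and, based on the recognized intent, generates reply speech that is sent back to the terminal. It comprises a dialogue management sub-module, a speech recognition sub-module, an intent recognition sub-module, a speech synthesis sub-module, a word segmentation sub-module, a voice separation sub-module, a voiceprint recognition sub-module, and a session management sub-module. The back-end processing module is deployed on cloud servers and communicates with the three-way call terminal over a wired or wireless network.
The dialogue management sub-module is used to control the flow and logic of the dialogue and to generate the response text;
The speech recognition sub-module is used to recognize the received speech of the call party and convert it into text;
The intent recognition sub-module is used to identify the call party's intent from the recognized text;
The speech synthesis sub-module is used to synthesize the response text into speech and send it to the three-way call terminal.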
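The dialogue system of the back-end processing module runs as follows:
S1. after the call control module connects to the call party's phone through the communication terminal (by dialing out or by answering an incoming call), it synchronizes the call script written according to the business logic, together with the call party's data, to the session management sub-module and the dialogue management sub-module of the back-end processing module;
S2. the session management sub-module opens a new session;
S3. the session management sub-module sends an instruction to the three-way call terminal to put it into answering mode;
S4. the session management sub-module sends the speech and text of the opening remarks/welcome message to the three-way call terminal;
S5. the three-way call terminal sends the speech to the call party through the voice connection module and the communication system, starting the call between the robot and the call party;
S6. the three-way call terminal receives the call party's speech and sends it over the network to the speech recognition sub-module of the back-end processing module;
S7. the speech recognition sub-module converts the call party's speech into text and sends it to the intent recognition sub-module;
S8. the intent recognition sub-module first calls the word segmentation sub-module, then identifies the call party's intent from the segmentation result in combination with the call script, and sends it to the dialogue management sub-module;
S9. optionally, the intent recognition sub-module can also obtain the intent directly from the call party's speech;
S10. the dialogue management sub-module generates the response sentence text according to the strategies and rules built into the call script and sends it to the speech synthesis sub-module;
S11. the speech synthesis sub-module converts the text into the robot's response speech; optionally, the response speech can be recorded in advance by a voice artist and retrieved according to the response sentence;
S12. the response sentence text and speech are sent together to the three-way call terminal, which plays the speech and sends it to the call party through the voice connection module and the communication system;
S13. this cycle repeats until the dialogue ends;
S14. the session management sub-module closes the session and transmits the call record to the call control module, which saves it in the system database for later query and analysis.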
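
Steps S6 to S12 form a pipeline: speech recognition, word segmentation, intent recognition, dialogue management, and speech synthesis. The sketch below only shows how the stages chain together. Every function is a deliberately trivial placeholder (the "audio" is UTF-8 text, intent matching is keyword lookup), and the script layout with `intents` and `replies` keys is an assumption made for illustration; the patent does not specify the models or data formats.

```python
def recognize_speech(audio: bytes) -> str:
    # Placeholder for the speech recognition sub-module: audio is pretend UTF-8 text.
    return audio.decode("utf-8")


def segment(text: str) -> list:
    # Placeholder for the word segmentation sub-module: lowercase, strip basic
    # punctuation, and split on whitespace.
    for mark in ",.?!":
        text = text.replace(mark, " ")
    return text.lower().split()


def recognize_intent(tokens: list, script: dict) -> str:
    # Placeholder for the intent recognition sub-module: first intent whose
    # keyword appears among the tokens, else "unknown".
    for intent, keywords in script["intents"].items():
        if any(word in tokens for word in keywords):
            return intent
    return "unknown"


def manage_dialog(intent: str, script: dict) -> str:
    # Placeholder for the dialogue management sub-module: look up the reply text
    # that the call script associates with the intent.
    return script["replies"].get(intent, script["replies"]["unknown"])


def synthesize(text: str) -> bytes:
    # Placeholder for the speech synthesis sub-module (or a recording lookup).
    return text.encode("utf-8")


def handle_utterance(audio: bytes, script: dict):
    """Run one turn and return (reply_text, reply_audio)."""
    tokens = segment(recognize_speech(audio))
    reply_text = manage_dialog(recognize_intent(tokens, script), script)
    return reply_text, synthesize(reply_text)
```

Returning the text and the audio together matches step S12, where both are sent to the three-way call terminal so that the reply can be played and displayed.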
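The present disclosure decouples the communication system from the human-machine collaborative calling robot system. This reduces system complexity, makes the system easy to deploy and flexible to switch, and can greatly reduce the development, deployment, and maintenance costs of a telephone human-machine collaborative calling robot system. It gives the robot mobility: the three-way call terminal can be conveniently placed in a variety of settings and can also give the robot an intuitive, tangible physical form that makes it more approachable. It connects easily to a personal mobile phone or call terminal, and it offers both Bluetooth and audio-port access, giving it a wide range of applications.
At the same time, the three-way call terminal of the present disclosure retains the functions of a traditional telephone customer-service human-machine collaborative calling robot system: it provides a screen display, through which settings, call-record retrieval, and switching are easy, making it more convenient to use; and it supports external audio devices such as headsets, so that the call can be followed at any time and the human agent can intervene.
Specifically, in prior-art telephone robots the communication system, the human-machine co-calling system, and the back-end call robot system are all bound to one another, so they are still deployed in a call-center-like manner: the telephone robot system is deployed in advance at the place of use, and human agents or other users must work at the site where the system is deployed in order to use human-machine co-calling. Compared with this prior art, the three-way call terminal of the present disclosure is more convenient to deploy.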
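In one implementation, the three-way call terminal hands the communication system of the prior art over to the user's own communication terminal, such as a mobile phone or a fixed telephone, thereby decoupling the communication system from the human-machine co-calling system and the back-end call robot system. The following example illustrates how the three-way call terminal is deployed in this implementation:
In one example, an independent salesperson, as a user of the three-way call terminal, connects it directly to his or her own call terminal (such as a mobile phone) by wired or wireless means; this completes the deployment, and no other operation is needed. The deployment efficiency and convenience of the three-way call terminal in this implementation are therefore significantly better than in the prior art.
In actual use, the salesperson in the above example connects the three-way call terminal to his or her own mobile phone through an audio cable or Bluetooth, and the three-way call terminal in turn delivers audio to the salesperson through a headset connected to it (if the terminal is equipped with a speaker/microphone, the audio can also be played directly). After the salesperson calls a prospective customer from the mobile phone (alternatively, the three-way call terminal can place the call automatically), the three-way call terminal uploads the prospective customer's speech to the back-end processing module deployed in the cloud. After the telephone robot in the back-end processing module generates a response according to its rules, the three-way call terminal returns the response speech to the prospective customer, which realizes the interaction between the back-end processing module and the prospective customer. During this process, the salesperson can monitor the interaction at any time through the headset and, when manual intervention is required, talk to the prospective customer directly through the three-way call terminal.
As the above shows, with the three-way call terminal of this implementation no human agent station needs to be set up in advance: the user can access the three-way call terminal at any time through a communication terminal he or she carries, and the human agent can choose a suitable type of communication terminal according to actual needs, without being limited to a fixed communication terminal.
In addition, with the three-way call terminal of this implementation the call is initiated by the user's own communication terminal, so sensitive information such as the customer's phone number does not need to be uploaded to the telephone robot, which avoids the possibility of information leaking through the telephone robot during use.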
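In another implementation, the three-way call terminal can be deployed directly inside the user's own communication terminal (an audio device), such as a mobile phone or a Bluetooth headset. Such audio devices include a speaker and a microphone, and the second voice interface of the three-way call terminal is connected to the speaker and microphone of the audio device, so that the audio device itself supports three-way calls. The user of the audio device can then take the human agent role through the device, sending voice data to the three-way call terminal through the microphone and receiving voice data from it through the speaker, which makes the three-way call terminal as a whole mobile. The following example illustrates how the three-way call terminal is deployed in this implementation:
In one example, a salesperson, as a user of the three-way call terminal, directly uses an audio device (such as a mobile phone) in which the three-way call terminal is deployed, which removes the step of connecting the terminal to his or her own call terminal and effectively improves deployment efficiency and convenience. Moreover, because the audio device is highly mobile, the three-way call terminal can be used anywhere; its flexibility in terms of place of use is significantly better than in the prior art.
In actual use, the salesperson in the above example enables the three-way calling function on the mobile phone. Specifically, the salesperson can tap the three-way calling application on the mobile phone to start the three-way call terminal and enable the function. For example, once the terminal is started, the mobile phone uses its network connection to connect the three-way call terminal to the back-end processing module in the cloud, and enables forwarding of the voice data exchanged between the mobile phone and other communication terminals to the three-way call terminal. The salesperson calls a prospective customer from the mobile phone; the phone receives the customer's voice data and passes it to the three-way call terminal, which uploads it to the back-end processing module. After the telephone robot in the back-end processing module generates a response according to its rules, it sends the response data back to the three-way call terminal, which forwards it to the prospective customer through the salesperson's mobile phone, realizing the interaction between the back-end processing module and the prospective customer. During this process, the salesperson can monitor the interaction at any time through the speaker of the audio device and, when manual intervention is required, interact with the prospective customer through the three-way call terminal.
In another example, a salesperson uses a three-way call terminal that is integrated into a Bluetooth headset; wearing the headset is enough to use the three-way calling function. In actual use, the salesperson wears the headset and enables the function through a trigger button on the headset or an application installed on the mobile phone. The salesperson then calls a prospective customer from the mobile phone; the phone receives the customer's voice data and sends it over Bluetooth to the headset, where it is passed to the three-way call terminal and uploaded to the back-end processing module. After the telephone robot in the back-end processing module generates a response according to its rules, it sends the response data back to the three-way call terminal, which sends it to the prospective customer through the microphone in the Bluetooth headset, realizing the interaction between the back-end processing module and the prospective customer. During this process, the salesperson can monitor the interaction at any time through the headset speaker and, when manual intervention is required, interact with the prospective customer through the three-way call terminal.
As the above shows, the three-way call terminal in this implementation is more tightly integrated with the communication terminal; the user can carry it and use it in any setting, without being limited to a fixed one.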
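The present disclosure provides on-screen display through a display module. Specifically, the three-way call terminal shows the call party's conversation content and the robot's responses through the display module; when the human agent's display is separate from that of the three-way call terminal, the display module pushes this content to the human agent, so that it is shown on the human agent's display.
Specifically, in one implementation, the human agent and the call party watch the display of the three-way call terminal at the same time. In that case, the display module (equivalent to the display screen) of the three-way call terminal directly shows the call party's conversation content and the robot's responses. The call party and the human agent can thus follow the interaction between the call party and the telephone robot at the same time, which keeps the interaction data synchronized between them and lets the human agent step into the call in time to solve problems the telephone robot cannot.
In another implementation, the human agent connects to the three-way call terminal through his or her own communication terminal (see the first implementation of the deployment mode of the three-way call terminal above). If the communication terminal used by the human agent has its own display, the agent can browse the call party's conversation content and the robot's responses on that display. In this way, even when using his or her own communication terminal, the agent can follow the interaction between the call party and the robot at any time. Furthermore, the back-end processing module can further process the call party's conversation content and the robot's responses, so that the first content displayed on the three-way call terminal differs from the second content displayed on the human agent's display. For example, if the human agent wants the detailed conversation content of the call party and the robot's responses, the back-end processing module processes both into the second content and sends it to the human agent through the display module, to be shown on the agent's own display. If the call party only wants to browse the robot's responses in order to get the needed answers quickly, the back-end processing module processes the robot's responses into the first content and shows it on the three-way call terminal through the display module. This targeted display meets the different needs of the call party and the human agent.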
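
The split between the first content and the second content can be illustrated with two small filters over a shared transcript. This is a sketch under assumptions: the transcript is taken to be a list of `(speaker, text)` pairs with hypothetical speaker labels `"party"` and `"robot"`, and the function names are invented here.

```python
def second_content(transcript):
    """Full view for the human agent: every turn, labelled by speaker."""
    return [f"{speaker}: {text}" for speaker, text in transcript]


def first_content(transcript):
    """Reduced view for the call party: only the robot's responses."""
    return [text for speaker, text in transcript if speaker == "robot"]
```

Both views are derived from the same transcript, so the two displays stay consistent even though they show different amounts of detail.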
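The present disclosure can be used to implement a human-machine collaborative calling telephone robot; the call system operates as follows:
D1. the three-way call terminal device is connected to the communication terminal device, via Bluetooth or a 3.5 mm audio jack;
D2. the human agent connects to the three-way call terminal, via Bluetooth or a 3.5 mm audio jack;
D3. the three-way call terminal connects to the back-end processing module, via Ethernet, WIFI, 4G, or 5G;
D4. the communication terminal is controlled to dial or answer the call party's call;
D5. the session management sub-module opens the session, instructs the three-way call terminal to enter waiting-for-call mode, and sends it the necessary data, such as the opening speech/text and the call party's data;
D6. the three-way call terminal detects the call-connected signal, sends the opening speech to the call party through the communication terminal, and shows the related text on the display;
D7. the three-way call terminal receives the call party's speech and sends it to the back-end processing module;
D8. the back-end processing module calls the speech recognition sub-module to convert the speech into text, identifies the call party's intent through the intent recognition sub-module, and, after the dialogue management module has made its decision, generates the robot's reply text and speech. Optionally, the intent recognition sub-module can also recognize the call party's intent directly from the call party's speech;
D9. the robot's reply text and speech are sent to the three-way call terminal through the networking sub-module;
D10. the robot's reply speech is processed by the data processing sub-module in the three-way call terminal and played, and at the same time sent to the call party through the communication terminal;
D12. the reply text is processed by the data processing sub-module in the three-way call terminal and shown on the display as a call record;
D13. the human agent can listen to the robot's speech and the call party's speech through the second voice interface, or follow the call by reading the text on the screen;
D14. when the human agent considers it necessary, he or she presses the intervention button on the three-way call terminal to start manual intervention;
D15. the data processing sub-module in the three-way call terminal switches the session to manual intervention mode (with no effect on the communication system itself) and cuts off the connection to the back-end processing module;
D16. the human agent speaks directly into the second voice interface of the three-way call terminal; the speech is sent to the call party through the communication terminal, so the agent talks to the call party directly and the handover is seamless. The robot's voice has been matched to the agent's voice by the speech synthesis sub-module, so it imitates the agent's voice reasonably well;
D17. the conversation between the human agent and the call party can also be recognized as text by the speech recognition sub-module of the back-end processing module and shown on the display;
D18. when the session is closed, the call record and related data are saved to the database.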
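The above are only preferred embodiments of the present invention; the scope of protection of the present invention is not limited to the above embodiments, and all technical solutions falling under the idea of the present invention belong to its scope of protection. It should be noted that, for those of ordinary skill in the art, improvements and refinements made without departing from the principle of the present invention should also be regarded as within the scope of protection of the present invention.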

Claims (15)

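  1. A three-way call terminal for a mobile human-machine collaborative calling robot, comprising:
    a first voice interface, configured to connect to a back-end processing module and to transmit call audio between a call party and the back-end processing module, wherein the back-end processing module is configured to interact with the call party according to preset rules;
    a CODEC1 module, configured to encode and/or decode the call audio between the call party and the back-end processing module;
    a second voice interface, configured to connect to a human agent and to transmit call audio between the call party and the human agent; the second voice interface is further configured to transmit the call audio between the call party and the back-end processing module to the human agent;
    a CODEC2 module, configured to encode and/or decode the call audio between the call party and the human agent;
    a call control module, configured to process control signals and to automatically dial, answer, and hang up calls;
    a data processing sub-module, configured to process voice data and to exchange data with the back-end processing module;
    a networking sub-module, configured to establish a network connection with the back-end processing module.
  2. The three-way call terminal for a mobile human-machine collaborative calling robot according to claim 1, wherein the three-way call terminal further comprises a display module configured to display, to the human agent and the call party, the call record of, or call-related information about, the call between the call party and the back-end processing module.
  3. The three-way call terminal for a mobile human-machine collaborative calling robot according to claim 1, wherein the three-way call terminal further comprises a key sub-module used to input control instructions.
  4. The three-way call terminal for a mobile human-machine collaborative calling robot according to any one of claims 1-3, wherein the three-way call terminal is arranged inside an audio device that includes a speaker and a microphone, and the second voice interface of the three-way call terminal is connected to the speaker and the microphone of the audio device.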
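  5. A call system, comprising: the three-way call terminal according to any one of claims 1-4, a back-end processing module, a human agent, and at least one communication terminal, wherein the human agent is connected to the three-way call terminal through the communication terminal.
  6. The call system according to claim 5, wherein the back-end processing module is used to process the voice data sent by the three-way call terminal and to generate response speech and text, which are sent back to the three-way call terminal.
  7. The call system according to claim 5 or 6, wherein the back-end processing module comprises a dialogue management sub-module, a speech recognition sub-module, an intent recognition sub-module, and a speech synthesis sub-module;
    the dialogue management sub-module is used to control the flow and logic of the dialogue and to generate the response text;
    the speech recognition sub-module is used to recognize the received speech of the call party and convert it into text;
    the intent recognition sub-module is used to identify the call party's intent from the recognized text;
    the speech synthesis sub-module is used to synthesize the response text into speech and send it to the three-way call terminal.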
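  8. A call method, applied to the call system according to any one of claims 5-7, wherein the method comprises:
    acquiring the call party's speech through the communication terminal, and transmitting it through the three-way call terminal to the back-end processing module and the human agent;
    transmitting response speech through the three-way call terminal to the communication terminal, so that the communication terminal transmits the response speech to the call party;
    transmitting the response speech and/or response text through the three-way call terminal to the human agent;
    wherein the response speech and the response text are generated by the back-end processing module according to the preset rules and the call party's speech.
  9. The call method according to claim 8, wherein, before the call party's speech is acquired through the communication terminal, the method further comprises:
    the three-way call terminal synchronizing the call script written according to the business logic, together with the call party's data, to the back-end processing module;
    the back-end processing module, after receiving the call script and the call party's data, opening a session between the communication terminal and the back-end processing module;
    the back-end processing module sending an instruction to the three-way call terminal so that the terminal enters answering mode;
    the back-end processing module sending the speech and text of the opening remarks/welcome message to the call terminal through the three-way call terminal, to start the call between the back-end processing module and the call party.
  10. The call method according to claim 9, wherein, before the three-way call terminal synchronizes the call script written according to the business logic and the call party's data to the back-end processing module, the method further comprises:
    the three-way call terminal loading a task list according to an administrator's operation or a pre-scheduled task;
    the three-way call terminal retrieving the corresponding call script according to the task list, the call script representing a complete business process;
    the three-way call terminal querying the call party's data from the call party database.
  11. The call method according to claim 9, wherein generating the robot's response speech and text after the back-end processing module processes the call party's speech comprises:
    the back-end processing module analyzing the call party's intent from the call party's speech;
    the back-end processing module generating the response sentence text according to the call party's intent and the strategies and rules built into the call script;
    the back-end processing module determining the robot's response speech according to the response sentence text.
  12. The call method according to claim 11, wherein the back-end processing module analyzing the call party's intent from the call party's speech comprises:
    the back-end processing module converting the call party's speech into text;
    the back-end processing module segmenting the text into words to obtain a segmentation result;
    the back-end processing module analyzing the call party's intent from the segmentation result.
  13. The call method according to claim 8, wherein the method further comprises:
    displaying, through the three-way call terminal, the call record of, or call-related information about, the call between the back-end processing module and the call party.
  14. The call method according to claim 8, wherein the method further comprises:
    upon recognizing that the dialogue between the communication terminal and the back-end processing module has ended, closing, by the back-end processing module, the session between the communication terminal and the back-end processing module;
    the back-end processing module transmitting the call record to the three-way call terminal, where it is saved in the terminal's database.
  15. The call method according to claim 8, wherein the method further comprises:
    the human agent sending a manual intervention instruction to the three-way call terminal;
    the three-way call terminal, in response to the manual intervention instruction, cutting off the connection to the back-end processing module and switching to manual intervention mode.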
PCT/CN2021/105295 2020-07-13 2021-07-08 一种用于移动式人机协作呼叫机器人的三方通话终端 WO2022012413A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US17/612,673 US11516346B2 (en) 2020-07-13 2021-07-08 Three-way calling terminal for mobile human-machine coordination calling robot
EP21794449.5A EP3968619B1 (en) 2020-07-13 2021-07-08 Three-party call terminal for use in mobile man-machine collaborative calling robot

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010669451.X 2020-07-13
CN202010669451.XA CN111787169B (zh) 2020-07-13 2020-07-13 一种用于移动式人机协作呼叫机器人的三方通话终端

Publications (1)

Publication Number Publication Date
WO2022012413A1 true WO2022012413A1 (zh) 2022-01-20

Family

ID=72768082

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/105295 WO2022012413A1 (zh) 2020-07-13 2021-07-08 一种用于移动式人机协作呼叫机器人的三方通话终端

Country Status (4)

Country Link
US (1) US11516346B2 (zh)
EP (1) EP3968619B1 (zh)
CN (1) CN111787169B (zh)
WO (1) WO2022012413A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111787169B (zh) * 2020-07-13 2021-06-15 南京硅基智能科技有限公司 一种用于移动式人机协作呼叫机器人的三方通话终端
CN117544719A (zh) * 2023-11-09 2024-02-09 深圳市恩泰世科技有限公司 一种自动拨号系统及方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180020093A1 (en) * 2016-07-15 2018-01-18 Circle River, Inc. Automated call answering based on artificial intelligence
CN108965620A (zh) * 2018-08-24 2018-12-07 杭州数心网络科技有限公司 一种人工智能呼叫中心系统
CN110035187A (zh) * 2019-04-16 2019-07-19 浙江百应科技有限公司 一种在电话中实现ai和人工坐席无缝切换的方法
CN110166643A (zh) * 2019-06-18 2019-08-23 深圳市一号互联科技有限公司 人机耦合的坐席控制方法、系统及语音机器人
CN111787169A (zh) * 2020-07-13 2020-10-16 南京硅基智能科技有限公司 一种用于移动式人机协作呼叫机器人的三方通话终端

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7185054B1 (en) * 1993-10-01 2007-02-27 Collaboration Properties, Inc. Participant display and selection in video conference calls
US5855003A (en) * 1996-10-11 1998-12-29 Motorola, Inc. Method and apparatus for establishing a link in a wireless communication system
US6690776B1 (en) * 1999-04-12 2004-02-10 Conexant Systems, Inc. Communication on hold notifier
US6704567B1 (en) * 2000-09-18 2004-03-09 International Business Machines Corporation Wireless communications device and method
US6690933B1 (en) * 2000-09-18 2004-02-10 International Business Machines Corporation Sharing of wirelines using a network node device
US7283519B2 (en) * 2001-04-13 2007-10-16 Esn, Llc Distributed edge switching system for voice-over-packet multiservice network
US7333798B2 (en) * 2002-08-08 2008-02-19 Value Added Communications, Inc. Telecommunication call management and monitoring system
US9432237B2 (en) * 2011-02-16 2016-08-30 Clearone, Inc. VOIP device, VOIP conferencing system, and related method
CN103971686B (zh) * 2013-01-30 2015-06-10 腾讯科技(深圳)有限公司 自动语音识别方法和系统
US9307084B1 (en) * 2013-04-11 2016-04-05 Noble Systems Corporation Protecting sensitive information provided by a party to a contact center
US9602571B2 (en) * 2013-10-29 2017-03-21 International Business Machines Corporation Codec selection and usage for improved VoIP call quality
WO2015145219A1 (en) * 2014-03-28 2015-10-01 Navaratnam Ratnakumar Systems for remote service of customers using virtual and physical mannequins
US10040201B2 (en) * 2015-08-31 2018-08-07 Avaya Inc. Service robot communication systems and system self-configuration
WO2017173141A1 (en) * 2016-03-31 2017-10-05 JIBO, Inc. Persistent companion device configuration and deployment platform
US9876909B1 (en) * 2016-07-01 2018-01-23 At&T Intellectual Property I, L.P. System and method for analytics with automated whisper mode
CN106550156A (zh) * 2017-01-23 2017-03-29 苏州咖啦魔哆信息技术有限公司 一种基于语音识别的人工智能客服系统及其实现方法
US20180240162A1 (en) * 2017-02-22 2018-08-23 Koopid, Inc. Conversational commerce platform
US10850395B2 (en) * 2017-05-19 2020-12-01 Stc.Unm System and methods for multiple-place swarm foraging with dynamic depots
US9930088B1 (en) * 2017-06-22 2018-03-27 Global Tel*Link Corporation Utilizing VoIP codec negotiation during a controlled environment call
US10694038B2 (en) * 2017-06-23 2020-06-23 Replicant Solutions, Inc. System and method for managing calls of an automated call management system
US10645228B2 (en) * 2017-06-26 2020-05-05 Apple Inc. Adaptability in EVS codec to improve power efficiency
KR102338618B1 (ko) * 2017-07-25 2021-12-10 삼성에스디에스 주식회사 휴먼 에이전트에 의하여 보조 되는 무인 대화 서비스 제공 방법
DK179822B1 (da) * 2018-06-01 2019-07-12 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10791222B2 (en) * 2018-06-21 2020-09-29 Wells Fargo Bank, N.A. Voice captcha and real-time monitoring for contact centers
US11196863B2 (en) * 2018-10-24 2021-12-07 Verint Americas Inc. Method and system for virtual assistant conversations
CN111326141A (zh) * 2018-12-13 2020-06-23 南京硅基智能科技有限公司 一种处理获取人声数据的方法
CN109819124B (zh) * 2019-01-23 2021-03-23 广州市聚星源科技有限公司 一种ivr智能服务及其实现方法
US11012559B2 (en) * 2019-02-14 2021-05-18 Rochester Institute Of Technology Method and system to enhance communication between multiple parties
CN110191242A (zh) * 2019-05-21 2019-08-30 辽宁聆智科技有限公司 人工智能与人工客服相结合的基于电话网络的交互系统
CN110505354A (zh) * 2019-07-08 2019-11-26 中国平安人寿保险股份有限公司 基于人工智能的外呼方法、外呼装置、计算机设备及存储介质
US11587561B2 (en) * 2019-10-25 2023-02-21 Mary Lee Weir Communication system and method of extracting emotion data during translations
CN111294471B (zh) * 2020-02-06 2022-03-22 广州市讯飞樽鸿信息技术有限公司 一种智能电话应答方法和系统
CN111246031B (zh) * 2020-02-27 2021-04-06 大连即时智能科技有限公司 人机协同的电话客服方法及系统

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180020093A1 (en) * 2016-07-15 2018-01-18 Circle River, Inc. Automated call answering based on artificial intelligence
CN108965620A (zh) * 2018-08-24 2018-12-07 杭州数心网络科技有限公司 一种人工智能呼叫中心系统
CN110035187A (zh) * 2019-04-16 2019-07-19 浙江百应科技有限公司 一种在电话中实现ai和人工坐席无缝切换的方法
CN110166643A (zh) * 2019-06-18 2019-08-23 深圳市一号互联科技有限公司 人机耦合的坐席控制方法、系统及语音机器人
CN111787169A (zh) * 2020-07-13 2020-10-16 南京硅基智能科技有限公司 一种用于移动式人机协作呼叫机器人的三方通话终端

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3968619A4

Also Published As

Publication number Publication date
US11516346B2 (en) 2022-11-29
US20220210275A1 (en) 2022-06-30
EP3968619B1 (en) 2024-09-04
CN111787169A (zh) 2020-10-16
EP3968619A1 (en) 2022-03-16
CN111787169B (zh) 2021-06-15
EP3968619A4 (en) 2022-12-21

Similar Documents

Publication Publication Date Title
JP3651508B2 (ja) 情報処理装置および情報処理方法
CN107134286A (zh) 基于语音交互的无线音频播放方法、音乐播放器及存储介质
WO2022012413A1 (zh) 一种用于移动式人机协作呼叫机器人的三方通话终端
JP2008099330A (ja) 情報処理装置、携帯電話機
CN107613132A (zh) 语音接听方法与移动终端装置
US12095951B2 (en) Systems and methods for providing headset voice control to employees in quick-service restaurants
CN206819732U (zh) 智能音乐播放器
CN103312912A (zh) 一种混音系统及方法
TW202022560A (zh) 用於聊天機器人與人類通話的可編程智能代理機
CN109862178A (zh) 一种可穿戴设备及其语音控制通信方法
US20220286538A1 (en) Earphone device and communication method
CN110473550A (zh) 语音通信方法、装置及存储介质
CN113194203A (zh) 一种用于听障人士的沟通系统、接听拨打方法及通讯系统
CN111835923B (zh) 一种基于人工智能的移动式语音交互对话系统
CN111428515A (zh) 一种同声传译的设备及方法
CN101136954B (zh) 一种全球呼电话及其控制装置和方法
CN111775165A (zh) 一种实现移动式智能客服机器人的系统、机器人终端以及后端处理模块
CN110351690A (zh) 一种智能语音系统及其语音处理方法
CN216930267U (zh) 一种具有交互系统的耳机充电仓及音频套件
KR102546532B1 (ko) 발화 영상 제공 방법 및 이를 수행하기 위한 컴퓨팅 장치
JP2008252511A (ja) コールセンタシステム、その受付方法およびコールセンタシステム用プログラム
US20200098363A1 (en) Electronic device
CN111046678A (zh) 一种基于网络的可扩展连接实现同声翻译的装置和方法
JP2023105607A (ja) プログラム、情報処理装置及び情報処理方法
CN109391477A (zh) 一种多种音频信号同时通话系统

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021794449

Country of ref document: EP

Effective date: 20211104

NENP Non-entry into the national phase

Ref country code: DE