CN111462726A - Outbound response method, device, equipment and medium - Google Patents


Info

Publication number
CN111462726A
Authority
CN
China
Prior art keywords
response
content
sub
responded
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010235873.6A
Other languages
Chinese (zh)
Other versions
CN111462726B (en)
Inventor
张晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202010235873.6A priority Critical patent/CN111462726B/en
Publication of CN111462726A publication Critical patent/CN111462726A/en
Application granted granted Critical
Publication of CN111462726B publication Critical patent/CN111462726B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1815 Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The embodiment of the invention discloses an outbound response method, a device, equipment and a medium, wherein the method comprises the following steps: when an outbound response instruction is triggered, acquiring voice data to be responded corresponding to the outbound response instruction; performing semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded; and determining a target response strategy corresponding to the voice data to be responded according to the target intention, and responding according to the target response strategy. The outbound response method provided by the embodiment of the invention realizes the automatic completion of the outbound flow and improves the outbound efficiency by performing intention recognition on the voice data to be responded and responding according to the recognition result.

Description

Outbound response method, device, equipment and medium
Technical Field
The embodiment of the invention relates to the technical field of communication, in particular to an outbound response method, device, equipment and medium.
Background
With the rapid development of communication technology, outbound call services are widely used in various fields: in the education and training industry, outbound calls can deliver course information to clients quickly and effectively; in the financial industry, outbound calls can be used in scenarios such as call receiving, payment reminders and banking business calls. A traditional outbound system places calls through human agents, which often incurs a large labor cost, and its outbound efficiency is unstable.
Disclosure of Invention
The embodiment of the invention provides an outbound response method, device, equipment and medium, which are used for automatically completing an outbound flow and improving outbound efficiency.
In a first aspect, an embodiment of the present invention provides an outbound response method, including:
when an outbound response instruction is triggered, acquiring voice data to be responded corresponding to the outbound response instruction;
performing semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded;
and determining a target response strategy corresponding to the voice data to be responded according to the target intention, and responding according to the target response strategy.
In a second aspect, an embodiment of the present invention further provides an outbound response apparatus, including:
a to-be-responded voice acquisition module, configured to acquire voice data to be responded corresponding to an outbound response instruction when the outbound response instruction is triggered;
the target intention determining module is used for carrying out semantic understanding on the voice data to be responded to and obtaining a target intention corresponding to the voice data to be responded;
and the outbound response module is used for determining a target response strategy corresponding to the voice data to be responded according to the target intention and responding according to the target response strategy.
In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
a storage device for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the outbound response method provided by any embodiment of the present invention.
In a fourth aspect, the embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the outbound response method provided in any embodiment of the present invention.
The embodiment of the invention acquires the voice data to be responded corresponding to the outbound response instruction when the outbound response instruction is triggered; performs semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded; and determines a target response strategy corresponding to the voice data to be responded according to the target intention and responds according to the target response strategy, so that the outbound flow is completed automatically and the outbound efficiency is improved.
Drawings
Fig. 1 is a flowchart of an outbound response method according to a first embodiment of the present invention;
Fig. 2 is a schematic diagram of an outbound flow according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an outbound response device according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of an outbound response method according to a first embodiment of the present invention. The present embodiment is applicable to the case where an outbound call is made. The method may be performed by an outbound response device, which may be implemented in software and/or hardware and may, for example, be configured in a computer device. As shown in Fig. 1, the method includes:
S110, when the outbound response instruction is triggered, acquiring the voice data to be responded corresponding to the outbound response instruction.
In this embodiment, the outbound response instruction may be triggered by voice information input by the user. Optionally, during the outbound call, the outbound response instruction is triggered when the user inputs voice information, and that voice information is the voice data to be responded corresponding to the outbound response instruction. Illustratively, in the exchange "System: Excuse me, are you user A? User: Yes", the voice information input by the user, "Yes", triggers the outbound response instruction and is taken as the voice data to be responded.
S120, performing semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded.
In this embodiment, after the voice data to be responded is acquired, the user intention corresponding to the voice data to be responded is identified as the target intention. Optionally, to identify the user intention, the voice data to be responded is converted into text information, semantic understanding is performed on the text information, and the target intention corresponding to the voice data to be responded is determined according to the semantic understanding result.
In an embodiment of the present invention, the semantic understanding of the voice data to be responded to obtain a target intention corresponding to the voice data to be responded includes: performing text conversion on the voice data to be responded to obtain text information corresponding to the voice data to be responded; and inputting the text information into a pre-trained intention recognition model to obtain the target intention output by the intention recognition model. Optionally, the method for performing text conversion on the voice data to be responded is not limited here, as long as the voice data to be responded can be converted into text information. After the text information corresponding to the voice data to be responded is obtained, intention recognition is performed through a pre-trained intention recognition model. In this embodiment, different outbound flows may be constructed according to the outbound purpose, and different training samples are used to train a corresponding intention recognition model for each outbound flow. That is, the intention recognition model may be obtained according to the flow identifier corresponding to the current outbound call. Training a corresponding intention recognition model for each outbound flow makes the recognition results better fit that flow, so the intention recognition results are more accurate.
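The two-stage recognition described above (text conversion, then a per-flow intention recognition model) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the keyword classifier stands in for a trained model, `speech_to_text` is a placeholder that treats the audio bytes as UTF-8 text, and the flow identifier `payment_reminder` and the intent labels are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a trained text classifier: text in, intent label out.
IntentModel = Callable[[str], str]


def confirm_identity_model(text: str) -> str:
    """Toy keyword classifier standing in for a pre-trained intention model."""
    normalized = text.strip().lower()
    if normalized in {"yes", "yeah", "speaking"}:
        return "affirm"
    if normalized in {"no", "wrong number"}:
        return "deny"
    return "unknown"


# One model per outbound flow, keyed by a hypothetical flow identifier.
MODELS_BY_FLOW: Dict[str, IntentModel] = {
    "payment_reminder": confirm_identity_model,
}


def speech_to_text(audio: bytes) -> str:
    """Placeholder for speech recognition; the source does not limit the method.
    For demonstration only, the 'audio' is assumed to be UTF-8 encoded text."""
    return audio.decode("utf-8")


def recognize_intent(flow_id: str, audio: bytes) -> str:
    """Convert speech to text, then classify it with the model for this flow."""
    text = speech_to_text(audio)
    model = MODELS_BY_FLOW[flow_id]
    return model(text)
```

Keying the model registry by flow identifier mirrors the statement that the model is obtained according to the flow identifier of the current outbound call.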
S130, determining a target response strategy corresponding to the voice data to be responded according to the target intention, and responding according to the target response strategy.
In this embodiment, after the target intention of the user is determined, a target response policy of the voice data to be responded is determined according to the target intention and the current outbound flow. Optionally, the outbound flow may include multiple steps, and the target response policy corresponding to the target intention may be determined according to a preset flow response logic and the step corresponding to the voice data to be responded.
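One simple way to picture the preset flow response logic is a lookup keyed by the current step and the recognized intention. The sketch below assumes such a table; the step ids, intent labels, policy ids, and the fallback behaviour are all hypothetical and are not taken from the source.

```python
from typing import Dict, Tuple

# (current step id, recognized intent) -> response policy id.
# Every identifier in this table is hypothetical.
FLOW_RESPONSE_LOGIC: Dict[Tuple[str, str], str] = {
    ("confirm_identity", "affirm"): "state_purpose",
    ("confirm_identity", "deny"): "apologize_and_end",
}

# Assumed fallback when no rule matches; the source does not specify one.
FALLBACK_POLICY = "ask_again"


def select_policy(step_id: str, intent: str) -> str:
    """Pick the target response policy for the current step and intent."""
    return FLOW_RESPONSE_LOGIC.get((step_id, intent), FALLBACK_POLICY)
```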
In an embodiment of the present invention, the responding according to the target response policy includes: acquiring at least one response sub-content contained in the target response strategy and a content type corresponding to the response sub-content; and generating target response voice information according to the response sub-content and the content type corresponding to the response sub-content, and playing the target response voice information. Optionally, a plurality of response sub-contents may be predefined, and the response content is formed by splicing a plurality of response sub-contents, which improves the reusability of the response sub-contents. For each response sub-content, a corresponding content type can be set to identify how the response sub-content is stored. For example, if the response sub-content is stored as audio, its content type may be set to a voice type, and if the response sub-content is stored as text, its content type may be set to a text type. In this embodiment, after the target response policy is determined, the response sub-content identifier included in the target response policy is obtained, and the response sub-content and its corresponding content type are obtained according to the response sub-content identifier.
On the basis of the above scheme, the generating target response voice information according to the response sub-content and the content type corresponding to the response sub-content includes: for each response sub-content, generating sub-response voice information corresponding to the response sub-content according to the content type corresponding to the response sub-content, and combining the sub-response voice information to generate target response voice information for playing. In this embodiment, when the content types corresponding to the response sub-contents differ, the manner of generating the corresponding sub-response voice information also differs. After the response sub-contents contained in the target response strategy and their content types are acquired, sub-response voice information is generated for each response sub-content according to its content type, the sub-response voice information is spliced to obtain the target response voice information, and the target response voice information is played to complete the response to the voice data to be responded.
Illustratively, assume that the target response policy includes response sub-content 1, "hello", and response sub-content 2, "if it is temporarily inconvenient for you to answer, we will call again later, please keep your line open". Sub-response voice information 1, "hello", is generated for response sub-content 1, and sub-response voice information 2, "if it is temporarily inconvenient for you to answer, we will call again later, please keep your line open", is generated for response sub-content 2. The two are then spliced to obtain the target response voice information, "hello, if it is temporarily inconvenient for you to answer, we will call again later, please keep your line open", which is played.
In an embodiment of the present invention, the generating sub-response speech information corresponding to the response sub-content according to the content type corresponding to the response sub-content includes: and calling a set path to acquire the voice information corresponding to the response sub-content, and taking the voice information as the sub-response voice information corresponding to the response sub-content. Optionally, the content type corresponding to the response sub-content includes a voice type, which indicates that the response sub-content is stored in an audio form. It can be understood that, when the content type corresponding to the response sub-content is the voice type, the path corresponding to the response sub-content is directly called to obtain the pre-stored voice information, and the obtained voice information is used as the sub-response voice information corresponding to the response sub-content.
In an embodiment of the present invention, the generating sub-response speech information corresponding to the response sub-content according to the content type corresponding to the response sub-content includes: and acquiring text information corresponding to the response sub-content, performing voice synthesis on the text information to obtain voice information corresponding to the text information, and taking the voice information as sub-response voice information corresponding to the response sub-content. Optionally, the content type corresponding to the response sub-content may further include a text type indicating that the response sub-content is stored in a text form. When the content type corresponding to the response sub-content is a text type, voice synthesis needs to be performed on the response sub-content in the text form, and voice information obtained through voice synthesis is used as sub-response voice information corresponding to the response sub-content.
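The type-dependent generation and splicing described above can be sketched as below. This is an illustration only: `SubContent`, the type labels `"voice"` and `"text"`, and the injected `load_recording` and `synthesize` callables are hypothetical names, and the fake loader and synthesizer in the demo return marker bytes rather than real audio. Joining byte strings directly is only meaningful for headerless audio in one common format (for example raw PCM with the same sample rate), which is assumed here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class SubContent:
    content_type: str  # "voice" (pre-recorded audio) or "text" (to be synthesized)
    value: str         # storage path for "voice", literal text for "text"


def build_response_audio(
    parts: List[SubContent],
    load_recording: Callable[[str], bytes],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Generate one audio chunk per sub-content, then splice them in order."""
    chunks = []
    for part in parts:
        if part.content_type == "voice":
            # Voice type: fetch the pre-stored audio from its set path.
            chunks.append(load_recording(part.value))
        elif part.content_type == "text":
            # Text type: run speech synthesis on the stored text.
            chunks.append(synthesize(part.value))
        else:
            raise ValueError(f"unsupported content type: {part.content_type}")
    return b"".join(chunks)


# Demo with fake back ends (marker bytes instead of real audio).
RECORDINGS = {"greetings/hello.pcm": b"<hello>"}


def fake_tts(text: str) -> bytes:
    return f"<tts:{text}>".encode("utf-8")


audio = build_response_audio(
    [
        SubContent("voice", "greetings/hello.pcm"),
        SubContent("text", "we will call again later"),
    ],
    load_recording=RECORDINGS.__getitem__,
    synthesize=fake_tts,
)
```

Passing the loader and synthesizer in as parameters keeps the splicing logic independent of any particular storage or text-to-speech back end.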
The embodiment of the invention acquires the voice data to be responded corresponding to the outbound response instruction when the outbound response instruction is triggered; performs semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded; and determines a target response strategy corresponding to the voice data to be responded according to the target intention and responds according to the target response strategy, so that the outbound flow is completed automatically and the outbound efficiency is improved.
On the basis of the above scheme, the method further comprises the following steps: acquiring the unanswered time of a user, generating timeout response information when the unanswered time is greater than a set timeout threshold, and outputting the timeout response information. Optionally, during the outbound call, when it is the user's turn to respond, the user's unanswered time may be monitored in real time. When the unanswered time exceeds the preset timeout threshold, timeout response information is generated according to a set timeout policy and played to prompt the user to respond. The timeout response information may repeat the question that the user has not yet answered, prompt the user to respond, or move the call to another response link. Generating and outputting timeout response information when the user does not answer for a long time ensures the timeliness of the outbound call and improves outbound efficiency.
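The timeout check above reduces to a strict comparison between the unanswered time and the threshold. The sketch below assumes times are measured in seconds and that the timeout policy is simply a fixed prompt string; both are illustrative assumptions, and the function name is hypothetical.

```python
from typing import Optional


def timeout_response(
    unanswered_seconds: float,
    threshold_seconds: float,
    prompt: str,
) -> Optional[str]:
    """Return the timeout prompt once the user has been silent for longer
    than the threshold; otherwise return None and keep waiting."""
    if unanswered_seconds > threshold_seconds:
        return prompt
    return None
```

The comparison is strict because the text specifies that the unanswered time must be greater than the threshold.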
Example two
The present embodiment provides a preferred example based on the above embodiments. The outbound response method provided by this embodiment can be executed by an outbound system. Optionally, the outbound system includes five modules: speech recognition, a flow engine, semantic understanding, a script engine, and speech synthesis. The speech recognition may use any general speech recognition technique.
The flow engine can completely define the entire outbound flow. In this embodiment, the flow engine includes the concepts of flows, links, and steps. Multiple flows may be created in the flow engine. One flow (Process) represents an intelligent outbound policy, and the outbound call proceeds according to the content of that policy. One flow comprises a plurality of links, and one link (Section) comprises a plurality of steps, where one step (Step) is one or more rounds of interaction between the client and the outbound system. A timeout mechanism is set for the flow in consideration of the timeliness of the outbound call. Illustratively, the flow may be defined using an XML file. A flow is defined with Process, which has a type attribute whose value is the service type of the flow, an id that defines the unique flow identifier, and a Name that defines the name of the flow. The Timeout property of the flow defines the global timeout information of the entire flow. For example, when the client does not speak for a certain time, one timeout is counted; the Count attribute defines the number of timeouts, and the step-ref attribute defines the step to jump to when the timeout occurs. A link is defined with Section, and the Start attribute of a Section marks the starting link of each communication session. The id attribute is the unique identifier of the link within a flow. The Name attribute is the name of the link. The Timeout attribute is timeout information specific to this link; it is optional, and if defined it overrides the global timeout information, otherwise the global timeout information is used.
A step is defined with step. Its start attribute marks the starting step within the starting link, its id attribute is the unique identifier of the step within the link, its name attribute is the name of the step, and its Driver attribute is the drive type of the step. directDriver means direct drive, and its value is the step to jump to directly; engineDriver means the step is driven by the engine, in which case the engineStack attribute needs to be defined. engineStack is a description of the semantic understanding engine. It contains one or more engine attributes, and the value of each engine attribute is the id of a classifier.
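To make the flow, link, and step vocabulary concrete, the sketch below parses a small flow definition with Python's standard `xml.etree.ElementTree`. The XML layout is an assumption: the source names Process, Section, the step definition, Timeout, Count, step-ref, Driver, directDriver, engineDriver, and engineStack, but does not show the actual file, so the element nesting, the `seconds` attribute, the `directDriver` attribute carrying the jump target, and all ids are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical flow file assembled from the names used in the description.
FLOW_XML = """
<Process type="payment_reminder" id="flow-001" Name="Payment reminder">
  <Timeout seconds="5" Count="2" step-ref="goodbye"/>
  <Section id="opening" Name="Opening" Start="true">
    <Step id="confirm_identity" name="Confirm identity" start="true"
          Driver="engineDriver">
      <engineStack>
        <engine>identity_classifier</engine>
      </engineStack>
    </Step>
    <Step id="goodbye" name="Goodbye" Driver="directDriver"
          directDriver="hang_up"/>
  </Section>
</Process>
"""


def load_flow(xml_text: str) -> dict:
    """Parse the flow into plain dictionaries."""
    root = ET.fromstring(xml_text)
    global_timeout = root.find("Timeout")
    flow = {
        "id": root.get("id"),
        "type": root.get("type"),
        "name": root.get("Name"),
        "timeout": dict(global_timeout.attrib) if global_timeout is not None else None,
        "sections": [],
    }
    for section in root.findall("Section"):
        local_timeout = section.find("Timeout")
        steps = []
        for step in section.findall("Step"):
            steps.append({
                "id": step.get("id"),
                "driver": step.get("Driver"),
                # Jump target for direct drive; None for engine-driven steps.
                "direct_target": step.get("directDriver"),
                # Classifier ids listed in engineStack; empty for direct drive.
                "engines": [e.text for e in step.findall("engineStack/engine")],
            })
        flow["sections"].append({
            "id": section.get("id"),
            "start": section.get("Start") == "true",
            # A link-level Timeout overrides the global one when present.
            "timeout": dict(local_timeout.attrib)
            if local_timeout is not None else flow["timeout"],
            "steps": steps,
        })
    return flow


flow = load_flow(FLOW_XML)
```

The fallback on the link's `timeout` entry reflects the rule that a link uses the global timeout information unless it defines its own.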
The semantic understanding engine is formed by combining a plurality of machine learning models and deep learning models. In this embodiment, the intention expressed in the client statement is understood through a semantic understanding policy designed autonomously. Wherein, the semantic understanding engine is composed of a plurality of model groups. Different model groups can be set for different links in the process engine, and each model group comprises a plurality of machine learning or deep learning models. It should be noted that the machine learning model or the deep learning model for semantic understanding must be a text classification model.
The script engine stores scripts through a pre-designed script storage structure and script assembly strategy. Considering that current text-to-speech technology is not yet mature and synthesized speech still differs from a real human voice, in this embodiment script segments can be defined in the script engine, and a complete script is composed of multiple script segments. A script segment may be of multiple types, such as "text" and "sound recording", and the speech synthesis engine selects how the speech is produced based on the type. Defining multiple script segments improves their reusability. For example, the script engine may be defined as follows: Call-scripts is the complete script information for a certain scene; its type value is the scene name, and the script information value defined in the flow engine refers to this type. Call-scripts contains a plurality of call-script entries. In a call-script, Id is the unique identifier within the scripts of a scene, Name is the name, and Type is the default type of the script. The script content is defined in segments. Segments contains a plurality of segment entries, each segment is one piece of the script, and when the type of a segment is Text, the segment holds the text content.
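The segment-based storage described above might look like the following when written as XML and read with `xml.etree.ElementTree`. The file layout is an assumption: the element names follow the call-scripts, call-script, segments, and segment terms in the description, while the attribute spellings, the `recording` type label, the ids, and the sample content are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical script file; one scene with one script made of two segments.
SCRIPTS_XML = """
<call-scripts type="payment_reminder">
  <call-script id="busy_callback" name="Callback when busy" type="text">
    <segments>
      <segment type="recording">greetings/hello.wav</segment>
      <segment>If now is not a convenient time, we will call again later.</segment>
    </segments>
  </call-script>
</call-scripts>
"""


def load_segments(xml_text: str, script_id: str) -> List[Tuple[str, str]]:
    """Return (type, content) pairs for one script, in playback order.
    A segment without its own type inherits the script's default type."""
    root = ET.fromstring(xml_text)
    for script in root.findall("call-script"):
        if script.get("id") == script_id:
            default_type = script.get("type", "text")
            return [
                (seg.get("type", default_type), (seg.text or "").strip())
                for seg in script.findall("segments/segment")
            ]
    raise KeyError(script_id)
```

Each returned pair carries the type needed to decide between playing a stored recording and synthesizing text.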
Fig. 2 is a schematic diagram of an outbound flow according to a second embodiment of the present invention. As shown in Fig. 2, when an outbound call is performed, voice data enters the system. First, the voice information is converted into text information by the speech recognition module; then the processing logic of the current flow is obtained from the flow engine, and the corresponding semantic understanding engine is called; the result is returned to the flow engine, which determines the next step and outputs it to the script engine, where the corresponding script is assembled; finally, the voice that responds to the client is synthesized by the speech synthesis engine.
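The order of operations in Fig. 2 can be summarized as one function that handles a single conversational turn. This is a sketch under assumptions: each module is represented by an injected callable with a hypothetical signature, and the demo wires in trivial lambdas rather than real speech or language components.

```python
from typing import Callable, Tuple


def handle_turn(
    audio_in: bytes,
    current_step: str,
    recognize_speech: Callable[[bytes], str],
    understand: Callable[[str, str], str],
    next_step_and_script: Callable[[str, str], Tuple[str, str]],
    assemble_script: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Run one turn: recognition, understanding, flow decision, script, synthesis."""
    text = recognize_speech(audio_in)                 # speech recognition
    intent = understand(current_step, text)           # semantic understanding
    next_step, script_id = next_step_and_script(current_step, intent)  # flow engine
    reply_text = assemble_script(script_id)           # script assembly
    return next_step, synthesize(reply_text)          # speech synthesis


# Demo with trivial stand-ins for every module.
next_step, audio_out = handle_turn(
    b"yes",
    "confirm_identity",
    recognize_speech=lambda audio: audio.decode("utf-8"),
    understand=lambda step, text: "affirm" if text == "yes" else "unknown",
    next_step_and_script=lambda step, intent: ("state_purpose", "purpose_script"),
    assemble_script=lambda script_id: f"[{script_id}]",
    synthesize=lambda text: text.encode("utf-8"),
)
```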
The embodiment of the invention uses flexible configuration for the outbound flow, script storage and the like. The outbound logic is determined by the outbound flow designed in the flow engine; the semantic understanding engine performs semantic understanding on the client's voice information; and after the user's intention is determined, voice information is synthesized for the response using the script storage structure and script assembly strategy pre-stored in the script engine. This builds an intelligent outbound system with strong recognition capability and improves outbound work efficiency.
Example three
Fig. 3 is a schematic structural diagram of an outbound response device according to a third embodiment of the present invention. The outbound device may be implemented in software and/or hardware, for example, the outbound device may be configured in a computer device. As shown in fig. 3, the apparatus includes a to-be-answered speech acquisition module 310, a target intention determination module 320, and an outbound response module 330, wherein:
a to-be-responded voice acquiring module 310, configured to acquire to-be-responded voice data corresponding to an outbound response instruction when the outbound response instruction is triggered;
a target intention determining module 320, configured to perform semantic understanding on the voice data to be responded, and obtain a target intention corresponding to the voice data to be responded;
and the outbound response module 330 is configured to determine a target response policy corresponding to the voice data to be responded according to the target intention, and respond according to the target response policy.
In the embodiment of the invention, when an outbound response instruction is triggered, the to-be-responded voice data corresponding to the outbound response instruction is acquired by the to-be-responded voice acquisition module; the target intention determining module carries out semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded; and the outbound response module determines a target response strategy corresponding to the voice data to be responded according to the target intention and responds according to the target response strategy, so that the outbound flow is automatically completed, and the outbound efficiency is improved.
Optionally, on the basis of the above scheme, the outbound response module 330 is specifically configured to:
acquiring at least one response sub-content contained in the target response strategy and a content type corresponding to the response sub-content;
and generating target response voice information according to the response sub-content and the content type corresponding to the response sub-content, and playing the target response voice information.
Optionally, on the basis of the above scheme, the outbound response module 330 is specifically configured to:
and aiming at each response sub-content, generating sub-response voice information corresponding to the response sub-content according to the content type corresponding to the response sub-content, and combining the sub-response voice information to generate target response voice information for playing.
Optionally, on the basis of the above scheme, the content type includes a voice type, and the outbound response module 330 is specifically configured to:
and calling a set path to acquire the voice information corresponding to the response sub-content, and taking the voice information as the sub-response voice information corresponding to the response sub-content.
Optionally, on the basis of the above scheme, the content type includes a text type, and the outbound response module 330 is specifically configured to:
and acquiring text information corresponding to the response sub-content, performing voice synthesis on the text information to obtain voice information corresponding to the text information, and taking the voice information as sub-response voice information corresponding to the response sub-content.
Optionally, on the basis of the foregoing scheme, the target intention determining module 320 is specifically configured to:
performing text conversion on the voice data to be responded to obtain text information corresponding to the voice data to be responded;
and inputting the text information into a pre-trained intention recognition model to obtain the target intention output by the intention recognition model.
Optionally, on the basis of the foregoing scheme, the apparatus further includes a timeout response module, configured to:
acquiring the unanswered time of a user, generating overtime response information when the unanswered time is greater than a set overtime threshold, and outputting the overtime response information.
The outbound response device provided by the embodiment of the invention can execute the outbound response method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
FIG. 4 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention. FIG. 4 illustrates a block diagram of an exemplary computer device 412 suitable for use in implementing embodiments of the present invention. The computer device 412 shown in FIG. 4 is only one example and should not impose any limitations on the functionality or scope of use of embodiments of the present invention.
As shown in FIG. 4, computer device 412 is in the form of a general purpose computing device. Components of computer device 412 may include, but are not limited to: one or more processors 416, a system memory 428, and a bus 418 that couples the various system components (including the system memory 428 and the processors 416).
Bus 418 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Computer device 412 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 412 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 428 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 430 and/or cache memory 432. The computer device 412 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 434 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 418 by one or more data media interfaces. The system memory 428 can include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 440 having a set (at least one) of program modules 442 may be stored, for instance, in memory 428. Such program modules 442 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these, or some combination of them, may include an implementation of a network environment. The program modules 442 generally perform the functions and/or methodologies of the described embodiments of the invention.
The computer device 412 may also communicate with one or more external devices 414 (e.g., keyboard, pointing device, display 424, etc.), and may also communicate with one or more devices that enable a user to interact with the computer device 412, and/or with any devices (e.g., network card, modem, etc.) that enable the computer device 412 to communicate with one or more other computing devices.
The processor 416 executes various functional applications and data processing by executing programs stored in the system memory 428, for example, implementing the outbound response method provided by an embodiment of the present invention, the method comprising:
when an outbound response instruction is triggered, acquiring voice data to be responded corresponding to the outbound response instruction;
performing semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded;
and determining a target response strategy corresponding to the voice data to be responded according to the target intention, and responding according to the target response strategy.
Of course, those skilled in the art can understand that the processor can also implement the technical solution of the outbound response method provided by any embodiment of the present invention.
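The three method steps listed above can be sketched end to end as follows. This is a hedged illustration only: `understand` and `respond` are hypothetical callables, the strategy table keyed by intention name is an assumed data layout, and falling back to an "unknown" strategy for unmapped intentions is an assumption that the source does not describe.

```python
from typing import Callable, Dict, List


def handle_outbound_response(
    voice_data: bytes,
    understand: Callable[[bytes], str],
    strategies: Dict[str, List[str]],
    respond: Callable[[List[str]], None],
    fallback_intent: str = "unknown",
) -> str:
    """Run one response turn: understand the voice data, pick a strategy, respond."""
    # Step 1 (acquiring the voice data) is assumed to have produced voice_data.
    # Step 2: semantic understanding yields the target intention.
    intent = understand(voice_data)
    # Step 3: look up the target response strategy; the fallback is an assumption.
    strategy = strategies.get(intent, strategies[fallback_intent])
    respond(strategy)
    return intent
```

Here each strategy is simply a list of sub-content identifiers; a fuller system could store typed sub-contents instead.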
Example five
The fifth embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the outbound response method provided by an embodiment of the present invention, where the method includes:
when an outbound response instruction is triggered, acquiring voice data to be responded corresponding to the outbound response instruction;
performing semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded;
and determining a target response strategy corresponding to the voice data to be responded according to the target intention, and responding according to the target response strategy.
Of course, the computer program stored on the computer-readable storage medium provided by the embodiment of the present invention is not limited to the method operations described above, and may also perform the relevant operations of the outbound response method provided by any embodiment of the present invention.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
It is to be noted that the foregoing only illustrates the preferred embodiments of the present invention and the technical principles employed. Those skilled in the art will understand that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made by those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the spirit of the present invention; the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. An outbound response method, comprising:
when an outbound response instruction is triggered, acquiring voice data to be responded corresponding to the outbound response instruction;
performing semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded;
and determining a target response strategy corresponding to the voice data to be responded according to the target intention, and responding according to the target response strategy.
2. The method of claim 1, wherein the responding according to the target response strategy comprises:
acquiring at least one response sub-content contained in the target response strategy and a content type corresponding to the response sub-content;
and generating target response voice information according to the response sub-content and the content type corresponding to the response sub-content, and playing the target response voice information.
3. The method according to claim 2, wherein the generating the target response voice information according to the response sub-content and the content type corresponding to the response sub-content comprises:
for each response sub-content, generating sub-response voice information corresponding to the response sub-content according to its content type, and combining the sub-response voice information to generate the target response voice information for playing.
4. The method according to claim 3, wherein the content type includes a voice type, and the generating sub-response voice information corresponding to the response sub-content according to the content type corresponding to the response sub-content includes:
acquiring, from a set path, the voice information corresponding to the response sub-content, and taking that voice information as the sub-response voice information corresponding to the response sub-content.
5. The method according to claim 3, wherein the content type includes a text type, and the generating of the sub-response speech information corresponding to the response sub-content according to the content type corresponding to the response sub-content includes:
acquiring the text information corresponding to the response sub-content, performing voice synthesis on the text information to obtain the corresponding voice information, and taking that voice information as the sub-response voice information corresponding to the response sub-content.
6. The method according to claim 1, wherein the semantically understanding the voice data to be responded to and obtaining the target intention corresponding to the voice data to be responded to comprises:
performing text conversion on the voice data to be responded to obtain text information corresponding to the voice data to be responded;
and inputting the text information into a pre-trained intention recognition model to obtain the target intention output by the intention recognition model.
7. The method of claim 1, further comprising:
acquiring the length of time for which the user has not answered, generating timeout response information when that time is greater than a set timeout threshold, and outputting the timeout response information.
8. An outbound response device, comprising:
a to-be-responded voice acquisition module, configured to acquire, when an outbound response instruction is triggered, the voice data to be responded corresponding to the outbound response instruction;
a target intention determining module, configured to perform semantic understanding on the voice data to be responded to obtain a target intention corresponding to the voice data to be responded;
and an outbound response module, configured to determine a target response strategy corresponding to the voice data to be responded according to the target intention, and to respond according to the target response strategy.
9. A computer device, the device comprising:
one or more processors;
a storage device for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the outbound response method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the outbound response method according to any one of claims 1 to 7.
CN202010235873.6A 2020-03-30 2020-03-30 Method, device, equipment and medium for answering out call Active CN111462726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010235873.6A CN111462726B (en) 2020-03-30 2020-03-30 Method, device, equipment and medium for answering out call

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010235873.6A CN111462726B (en) 2020-03-30 2020-03-30 Method, device, equipment and medium for answering out call

Publications (2)

Publication Number Publication Date
CN111462726A true CN111462726A (en) 2020-07-28
CN111462726B CN111462726B (en) 2023-08-22

Family

ID=71683363

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010235873.6A Active CN111462726B (en) 2020-03-30 2020-03-30 Method, device, equipment and medium for answering out call

Country Status (1)

Country Link
CN (1) CN111462726B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050096912A1 (en) * 2003-10-30 2005-05-05 Sherif Yacoub System and method for interactive voice response enhanced out-calling
WO2009030556A1 (en) * 2007-09-07 2009-03-12 Sony Ericsson Mobile Communications Ab Dynamically assembling voice messages in a wireless communication device
CN103458056A (en) * 2013-09-24 2013-12-18 贵阳世纪恒通科技有限公司 Speech intention judging method based on automatic classification technology for automatic outbound system
WO2018019116A1 (en) * 2016-07-28 2018-02-01 上海未来伙伴机器人有限公司 Natural language-based man-machine interaction method and system
CN108777751A (en) * 2018-06-07 2018-11-09 上海航动科技有限公司 A kind of call center system and its voice interactive method, device and equipment
CN109256133A (en) * 2018-11-21 2019-01-22 上海玮舟微电子科技有限公司 A kind of voice interactive method, device, equipment and storage medium
CN109672794A (en) * 2018-12-04 2019-04-23 天津深思维科技有限公司 A kind of outer paging system of intelligent sound
CN110809095A (en) * 2019-10-25 2020-02-18 大唐网络有限公司 Method and device for voice call-out
CN110931012A (en) * 2019-10-12 2020-03-27 深圳壹账通智能科技有限公司 Reply message generation method and device, computer equipment and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113556430A (en) * 2021-07-22 2021-10-26 深圳追一科技有限公司 Outbound system and outbound method
CN113556430B (en) * 2021-07-22 2024-02-20 深圳追一科技有限公司 Outbound system and outbound method
CN114500757A (en) * 2022-01-07 2022-05-13 马上消费金融股份有限公司 Voice interaction method and device, computer equipment and storage medium
CN115051873A (en) * 2022-07-27 2022-09-13 深信服科技股份有限公司 Network attack result detection method and device and computer readable storage medium
CN115051873B (en) * 2022-07-27 2024-02-23 深信服科技股份有限公司 Network attack result detection method, device and computer readable storage medium

Also Published As

Publication number Publication date
CN111462726B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
KR102297394B1 (en) Automated assistant invocation of appropriate agent
US20190066679A1 (en) Music recommending method and apparatus, device and storage medium
CN111462726B (en) Method, device, equipment and medium for answering out call
CN108922564B (en) Emotion recognition method and device, computer equipment and storage medium
KR20190097267A (en) Create and send call requests to use third party agents
JP2020503620A (en) Context-Aware Human-Computer Dialogue
US9361589B2 (en) System and a method for providing a dialog with a user
US20190294638A1 (en) Dialog method, dialog system, dialog apparatus and program
CN110704594A (en) Task type dialogue interaction processing method and device based on artificial intelligence
CN110381221B (en) Call processing method, device, system, equipment and computer storage medium
CN112735374B (en) Automatic voice interaction method and device
EP2879062A2 (en) A system and a method for providing a dialog with a user
KR20200011483A (en) Customizable interactive interactive application
US20070156406A1 (en) Voice user interface authoring tool
CN109065019B (en) Intelligent robot-oriented story data processing method and system
CN112069830A (en) Intelligent conversation method and device
CN116016779A (en) Voice call translation assisting method, system, computer equipment and storage medium
CN114466106A (en) Test data generation method, device, equipment and medium of outbound system
CN106209583A (en) A kind of message input method, device and user terminal thereof
US20230169273A1 (en) Systems and methods for natural language processing using a plurality of natural language models
CN115906808A (en) Method, device, medium and computing device for confirming speech response degree
CN112163078B (en) Intelligent response method, device, server and storage medium
US20220319516A1 (en) Conversation method, conversation system, conversation apparatus, and program
CN113744712A (en) Intelligent outbound voice splicing method, device, equipment, medium and program product
US10559298B2 (en) Discussion model generation system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221013

Address after: 25 Financial Street, Xicheng District, Beijing 100033

Applicant after: CHINA CONSTRUCTION BANK Corp.

Address before: 25 Financial Street, Xicheng District, Beijing 100033

Applicant before: CHINA CONSTRUCTION BANK Corp.

Applicant before: Jianxin Financial Science and Technology Co.,Ltd.

GR01 Patent grant