CN115249481A - Emotion recognition-based collection method and system, computer equipment and storage medium - Google Patents

Emotion recognition-based collection method and system, computer equipment and storage medium

Info

Publication number
CN115249481A
CN115249481A (Application CN202210860323.2A)
Authority
CN
China
Prior art keywords
collection
target
preset
voice data
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210860323.2A
Other languages
Chinese (zh)
Inventor
陈思妮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN202210860323.2A priority Critical patent/CN115249481A/en
Publication of CN115249481A publication Critical patent/CN115249481A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/14 Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/51 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • H04M3/5166 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing, in combination with interactive voice response systems or voice portals, e.g. as front-ends

Abstract

Embodiments of the present application provide an emotion recognition-based collection method and system, a computer device, and a storage medium, belonging to the technical field of artificial intelligence. The method includes: establishing a first call connection between a target collection object and a first service object within a preset first overdue time interval to acquire first call voice data; acquiring first voice data of the target collection object from the first call voice data; performing risk analysis according to historical collection data, a preset-behavior keyword library, and a preset-behavior voiceprint feature database, and performing emotion recognition on the first voice data when a first risk judgment result indicates that the target collection object exhibits no preset behavior; and acquiring first voice text data, matching the first voice text data against collection strategies, and having a second service object perform collection processing on the target collection object according to the resulting collection strategy information. According to the embodiments of the present application, the user's dissatisfaction can be recognized and handled in time, so that the collection task is completed more effectively.

Description

Emotion recognition-based collection method and system, computer equipment and storage medium
Technical Field
The present application relates to the technical field of artificial intelligence, in particular to an emotion recognition-based collection method and system, a computer device, and a storage medium.
Background
At present, users are typically contacted for collection either by an AI outbound-call device or by collection agents. However, a collection agent is susceptible to subjective emotional factors when communicating with a user, and the words spoken can easily leave the user dissatisfied. Therefore, how to recognize a user's dissatisfaction in time during communication, so as to complete the collection task more effectively, has become a technical problem that urgently needs to be solved.
Disclosure of Invention
The main purpose of the embodiments of the present application is to provide an emotion recognition-based collection method and system, a computer device, and a storage medium that can recognize and handle a user's dissatisfaction in time, so that the collection task is completed more effectively.
In order to achieve the above object, a first aspect of the embodiments of the present application provides an emotion recognition-based collection method, the method including:
establishing a first call connection between a target collection object and a first service object within a preset first overdue time interval to acquire first call voice data of the target collection object and the first service object;
acquiring first voice data of the target collection object from the first call voice data;
acquiring historical collection data of the target collection object;
performing risk analysis on the first voice data according to the historical collection data, a preset-behavior keyword library, and a preset-behavior voiceprint feature database to obtain a first risk judgment result;
when the first risk judgment result indicates that the target collection object exhibits no preset behavior, performing emotion recognition on the first voice data according to a preset emotion recognition model to obtain a first emotion recognition result;
acquiring first voice text data corresponding to the first voice data, and inputting the first emotion recognition result and the first voice text data into a preset collection strategy model for matching to obtain collection strategy information;
and sending the collection strategy information to a second service object, the second service object performing collection processing on the target collection object according to the collection strategy information.
In some embodiments, after performing risk analysis on the first voice data according to the historical collection data of the target collection object, the preset-behavior keyword library, and the preset-behavior voiceprint feature database to obtain the first risk judgment result, the method further includes:
when the first risk judgment result indicates that the target collection object exhibits a preset behavior, acquiring first voice text data corresponding to the first voice data, and inputting the first voice text data into a preset script recommendation model for script recommendation to obtain a target recommended script;
and sending the target recommended script to a third service object.
In some embodiments, sending the collection strategy information to the second service object, the second service object performing collection processing on the target collection object according to the collection strategy information, includes:
when collection completion information for the target collection object is not acquired within the first overdue time interval, sending the collection strategy information to the second service object;
establishing, according to the collection strategy information, a second call connection between the target collection object and the second service object within a preset second overdue time interval to acquire second call voice data of the target collection object and the second service object, where the second overdue time interval represents a preset time interval after the first overdue time interval ends;
acquiring second voice data of the target collection object from the second call voice data;
performing risk analysis on the second voice data according to the historical collection data, the preset-behavior keyword library, and the preset-behavior voiceprint feature database to obtain a second risk judgment result;
when the second risk judgment result indicates that the target collection object exhibits no preset behavior, performing emotion recognition on the second voice data according to the emotion recognition model to obtain a second emotion recognition result;
acquiring second voice text data corresponding to the second voice data, and inputting the second emotion recognition result and the second voice text data into a preset collection script model for script matching to obtain a target collection script;
and performing collection on the target collection object according to the target collection script.
In some embodiments, after performing risk analysis on the second voice data according to the historical collection data, the preset-behavior keyword library, and the preset-behavior voiceprint feature database to obtain the second risk judgment result, the method further includes:
acquiring third voice data of the second service object from the second call voice data;
performing emotional state analysis on the third voice data according to a preset emotional-state keyword library and an emotional-state voiceprint feature database to obtain a collector emotion judgment result;
acquiring second voice text data corresponding to the second voice data, and inputting the second voice text data into a preset script recommendation model for script recommendation to obtain a target recommended script;
and when the collector emotion judgment result indicates that the second service object is in a preset emotional state, sending the target recommended script to a third service object, the third service object performing collection on the target collection object according to the target recommended script, and the second service object being handled according to a preset management specification.
In some embodiments, after performing emotional state analysis on the third voice data according to the preset emotional-state keyword library and the emotional-state voiceprint feature database to obtain the collector emotion judgment result, the method further includes:
when the collector emotion judgment result indicates that the second service object is not in the preset emotional state, the second service object performing collection on the target collection object according to the target recommended script.
In some embodiments, performing risk analysis on the first voice data according to the historical collection data, the preset-behavior keyword library, and the preset-behavior voiceprint feature database to obtain the first risk judgment result includes:
acquiring first voice text data corresponding to the first voice data;
acquiring the preset-behavior keyword library and the preset-behavior voiceprint feature database, where the preset-behavior keyword library includes a plurality of preset behavior-tendency keywords;
performing keyword matching between the first voice text data and the preset behavior-tendency keywords to obtain a preset-behavior keyword matching result;
acquiring a target voiceprint feature corresponding to the first voice data;
comparing the target voiceprint feature against the preset-behavior voiceprint feature database to obtain a voiceprint comparison result;
and performing risk analysis on the first voice data according to the historical collection data, the preset-behavior keyword matching result, and the voiceprint comparison result to obtain the first risk judgment result.
In some embodiments, after acquiring the first voice text data corresponding to the first voice data and inputting the first emotion recognition result and the first voice text data into the preset collection strategy model for matching to obtain the collection strategy information, the method further includes:
acquiring text features of the first voice text data and voice features of the first voice data;
performing intention recognition on the target collection object according to the text features and the voice features to obtain an intention recognition result;
and marking the target collection object according to the intention recognition result to obtain a marking result, where the marking result represents key intention information of the target collection object in the first voice data.
In order to achieve the above object, a second aspect of the embodiments of the present application provides an emotion recognition-based collection system, the system including:
a call connection module for establishing a first call connection between a target collection object and a first service object within a preset first overdue time interval to acquire first call voice data of the target collection object and the first service object;
a first voice data acquisition module for acquiring first voice data of the target collection object from the first call voice data;
a second voice data acquisition module for acquiring historical collection data of the target collection object;
a risk analysis module for performing risk analysis on the first voice data according to the historical collection data, a preset-behavior keyword library, and a preset-behavior voiceprint feature database to obtain a first risk judgment result;
an emotion recognition module for performing emotion recognition on the first voice data according to a preset emotion recognition model to obtain a first emotion recognition result when the first risk judgment result indicates that the target collection object exhibits no preset behavior;
a matching module for acquiring first voice text data corresponding to the first voice data and inputting the first emotion recognition result and the first voice text data into a preset collection strategy model for matching to obtain collection strategy information;
and a collection module for sending the collection strategy information to a second service object, the second service object performing collection processing on the target collection object according to the collection strategy information.
To achieve the above object, a third aspect of embodiments of the present application provides a computer device, including:
at least one memory;
at least one processor;
at least one computer program;
the computer program is stored in the memory and the at least one computer program is executed by the processor to implement the method of any of the embodiments of the first aspect described above.
To achieve the above object, a fourth aspect of the embodiments of the present application further provides a storage medium, the storage medium being a computer-readable storage medium storing a computer program, the computer program causing a computer to perform the method of any embodiment of the first aspect described above.
According to the emotion recognition-based collection method and system, the computer device, and the storage medium, a first call connection between a target collection object and a first service object is first established within a preset first overdue time interval to acquire first call voice data of the target collection object and the first service object. In order to recognize the user's dissatisfaction in a timely and accurate way, the first voice data of the target collection object is extracted from the first call voice data and the historical collection data of the target collection object is acquired; risk analysis is then performed on the first voice data according to the historical collection data, a preset-behavior keyword library, and a preset-behavior voiceprint feature database to obtain a first risk judgment result. When the first risk judgment result indicates that the target collection object exhibits no preset behavior, emotion recognition is performed on the first voice data according to a preset emotion recognition model to obtain a first emotion recognition result. In order to complete the collection task more efficiently, first voice text data corresponding to the first voice data is acquired, and the first emotion recognition result and the first voice text data are input into a preset collection strategy model for matching to obtain collection strategy information. The collection strategy information is sent to the second service object so that the second service object performs collection processing on the target collection object according to the collection strategy information. According to the embodiments of the present application, the user's dissatisfaction can be recognized in time and handled flexibly, so that the collection task is completed more effectively.
Drawings
Fig. 1 is a first flowchart of an emotion recognition-based collection method provided by an embodiment of the present application;
Fig. 2 is a second flowchart of an emotion recognition-based collection method provided by an embodiment of the present application;
Fig. 3 is a flowchart of a specific implementation of step S170 in Fig. 1;
Fig. 4 is a third flowchart of an emotion recognition-based collection method provided by an embodiment of the present application;
Fig. 5 is a flowchart of a specific implementation of step S140 in Fig. 1;
Fig. 6 is a fourth flowchart of an emotion recognition-based collection method provided by an embodiment of the present application;
Fig. 7 is a block diagram of an emotion recognition-based collection system provided by an embodiment of the present application;
Fig. 8 is a hardware structure diagram of a computer device provided by an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that although the functional modules are partitioned in the system schematic and a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the module partitioning or the flowchart order. The terms "first", "second", and the like in the description, the claims, and the drawings are used to distinguish similar elements and are not necessarily intended to describe a particular sequence or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
First, several terms referred to in the present application are explained:
artificial Intelligence (AI): is a new technical science for researching and developing theories, methods, technologies and application systems for simulating, extending and expanding human intelligence; artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produces a new intelligent machine that can react in a manner similar to human intelligence, and research in this field includes robotics, language recognition, image recognition, natural language processing, and expert systems, among others. The artificial intelligence can simulate the information process of human consciousness and thinking. Artificial intelligence is also a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results.
Deep Neural Network-Hidden Markov Model (DNN-HMM): an acoustic model in which a Deep Neural Network (DNN) replaces the Gaussian Mixture Model (GMM) for modeling the observation probability of the input speech signal. The GMM is a generative model trained with unsupervised learning, whereas the DNN is a discriminative model trained with supervised learning.
At present, users are typically contacted for collection either by an AI outbound-call device or by collection agents. However, since the AI outbound-call device usually replies with a uniform script template, its responses can easily fall short of the user's expectations and leave the user dissatisfied, and a collection agent is susceptible to subjective emotional factors when communicating with the user, so that the words spoken can easily leave the user dissatisfied. Therefore, how to recognize a user's dissatisfaction in time during communication and accurately route the user to the appropriate customer service staff or self-service flow, so as to complete the collection task more effectively, has become a technical problem that urgently needs to be solved.
Based on this, the embodiments of the present application provide an emotion recognition-based collection method and system, a computer device, and a storage medium that can recognize and handle a user's dissatisfaction in time, so that the collection task is completed more effectively.
The emotion recognition-based collection method and system, computer device, and storage medium provided in the embodiments of the present application are specifically described with reference to the following embodiments. First, the emotion recognition-based collection method in the embodiments of the present application is described.
The embodiments of the present application may acquire and process relevant data based on artificial intelligence technology. Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
The embodiments of the present application provide an emotion recognition-based collection method, which relates to the technical field of artificial intelligence, in particular to data mining. The emotion recognition-based collection method provided by the embodiments of the present application may be applied to a terminal, a server, or software running in a terminal or server. In some embodiments, the terminal may be a smartphone, tablet, laptop, desktop computer, smart watch, or the like; the server may be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, Content Delivery Network (CDN), big data, and artificial intelligence platforms; the software may be an application that implements the emotion recognition-based collection method, but is not limited to the above forms.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Referring to Fig. 1, Fig. 1 is a flowchart of an alternative embodiment of the emotion recognition-based collection method. In some embodiments of the present application, the emotion recognition-based collection method specifically includes, but is not limited to, steps S110 to S170, which are described in detail below with reference to Fig. 1.
Step S110, establishing a first call connection between a target collection object and a first service object within a preset first overdue time interval to acquire first call voice data of the target collection object and the first service object;
Step S120, acquiring first voice data of the target collection object from the first call voice data;
Step S130, acquiring historical collection data of the target collection object;
Step S140, performing risk analysis on the first voice data according to the historical collection data, a preset-behavior keyword library, and a preset-behavior voiceprint feature database to obtain a first risk judgment result;
Step S150, when the first risk judgment result indicates that the target collection object exhibits no preset behavior, performing emotion recognition on the first voice data according to a preset emotion recognition model to obtain a first emotion recognition result;
Step S160, acquiring first voice text data corresponding to the first voice data, and inputting the first emotion recognition result and the first voice text data into a preset collection strategy model for matching to obtain collection strategy information;
Step S170, sending the collection strategy information to a second service object, the second service object performing collection processing on the target collection object according to the collection strategy information.
In step S110 of some embodiments, because the user has not paid by the repayment deadline T, collection needs to be carried out, so a first call connection between the target collection object and the first service object is established within a preset first overdue time interval to acquire first call voice data of the target collection object and the first service object. If n1 denotes the length of the first overdue time interval, the interval may be set to (T, T + n1). Specifically, within the first overdue time interval, a call connection between the first service object and the target collection object is established to acquire the first call voice data, where the first call voice data represents the interactive audio stream between the target collection object and the first service object.
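For concreteness, a minimal sketch of the overdue-window arithmetic follows. The function and variable names are illustrative, not taken from the patent; n1 and n2 are the window lengths described in this and later embodiments, and the behavior beyond T + n2 is an assumption.

```python
from datetime import datetime, timedelta

def collection_stage(due_date: datetime, now: datetime,
                     n1_days: int = 7, n2_days: int = 21) -> str:
    """Map the current time to a collection stage (illustrative sketch).

    (T, T + n1]      -> first overdue interval: first service object calls
    (T + n1, T + n2] -> second overdue interval: escalate to second service object
    beyond T + n2    -> outside the windows modeled here
    """
    if now <= due_date:
        return "not_overdue"
    if now <= due_date + timedelta(days=n1_days):
        return "first_overdue_interval"
    if now <= due_date + timedelta(days=n2_days):
        return "second_overdue_interval"
    return "beyond_modeled_windows"

# Example: due date T, checked 10 days later -> second overdue interval
print(collection_stage(datetime(2022, 7, 1), datetime(2022, 7, 11)))
```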
It should be noted that, in order to reduce labor cost and improve collection efficiency, the first service object may be an AI outbound-call device that performs the collection notification.
In step S120 of some embodiments, in order to judge the user's emotion in time, the first voice data of the target collection object is extracted from the first call voice data, where the first voice data represents the speech uttered by the target collection object within the first call voice data.
In step S130 of some embodiments, in order to accurately identify whether the target collection object exhibits a preset behavior, the historical collection data of the target collection object is acquired. The historical collection data includes historical call voice data, historical preset-behavior keyword statistics, historical preset-behavior occurrence statistics, collection record data, and the like. The history may cover at least one collection process. The historical preset-behavior keyword statistics represent the statistics of keywords with which the target collection object exhibited a preset behavior during historical collection; the historical call voice data represents the interactive audio streams of historical communication between the target collection object and the first service object; the historical preset-behavior occurrence statistics represent the number of times the target collection object actually exhibited a preset behavior during historical collection; and the collection record data represents records such as the times and frequency of collection performed on the target collection object. For example, in a bank's credit collection process, the historical collection data may be obtained from the bank's cloud server.
The preset behavior includes at least one behavior intention of the target collection object, such as an explicit expression of dissatisfaction, use of offensive language, or an outburst of intense emotion in the voice. Data statistics are compiled according to the preset-behavior keywords corresponding to the different behavior intentions to form the historical collection data.
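As a sketch of what one target collection object's historical collection data record might hold, the field names and types below are illustrative assumptions; the patent only enumerates the kinds of statistics involved.

```python
from dataclasses import dataclass, field

@dataclass
class HistoricalCollectionData:
    """Illustrative container for one target collection object's history."""
    call_audio_refs: list[str] = field(default_factory=list)              # historical call voice data (e.g. storage keys)
    behavior_keyword_counts: dict[str, int] = field(default_factory=dict) # keyword -> times observed
    preset_behavior_occurrences: int = 0                                  # times a preset behavior actually occurred
    collection_log: list[tuple[str, str]] = field(default_factory=list)   # (timestamp, channel) per collection attempt

history = HistoricalCollectionData(
    behavior_keyword_counts={"complaint": 3, "threat": 0},
    preset_behavior_occurrences=1,
    collection_log=[("2022-06-15T10:00", "AI_outbound")],
)
```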
In step S140 of some embodiments, in order to obtain an accurate risk judgment for the target collection object, risk analysis is performed on the first voice data according to the historical collection data, the preset-behavior keyword library, and the preset-behavior voiceprint feature database to obtain the first risk judgment result. The preset-behavior keyword library stores a plurality of preset behavior-tendency keywords for different service types; with this library, it is judged whether the text data corresponding to the first voice data includes keywords expressing a preset behavior intention, yielding a keyword judgment result. The preset-behavior voiceprint feature database stores voiceprint features that can express preset behaviors; with this database, it is judged whether the voiceprint features in the first voice data reach the voiceprint threshold of a preset behavior, yielding a voiceprint judgment result. Specifically, the current state of the target collection object is risk-analyzed by combining the keyword judgment result, the voiceprint judgment result, and the historical collection data, and the corresponding emotion level is determined; this is the first risk judgment result.
It should be noted that the emotion levels corresponding to the risk judgment result may be divided into a first level, a second level, and a third level. When the target collection object's emotion level is at the second level, the object's current emotion is likely stable and normal, i.e., there may be no risk of a preset behavior; when the emotion level is at the first or third level, the object's current emotion may be low or intense, and there may be a risk of a preset behavior occurring.
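A minimal sketch of how the keyword judgment, voiceprint judgment, and history might be fused into the three-level result follows; the scoring rule and thresholds are assumptions, since the patent does not prescribe a formula.

```python
def first_risk_judgment(keyword_hits: int, voiceprint_over_threshold: bool,
                        historical_occurrences: int) -> tuple[int, bool]:
    """Combine the keyword, voiceprint, and history signals (illustrative rule).

    Returns (emotion_level, has_preset_behavior_risk); level 2 is stable,
    levels 1 and 3 indicate low and intense emotion respectively.
    """
    score = (keyword_hits
             + (2 if voiceprint_over_threshold else 0)
             + min(historical_occurrences, 2))
    if score == 0:
        return 2, False              # second level: stable, no preset-behavior risk
    level = 3 if voiceprint_over_threshold else 1
    return level, True               # first/third level: preset-behavior risk

level, risky = first_risk_judgment(keyword_hits=1,
                                   voiceprint_over_threshold=True,
                                   historical_occurrences=0)
```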
In step S150 of some embodiments, when the first risk judgment result indicates that the target collection object exhibits no preset behavior, in order to judge and handle the user's emotion accurately and in time, emotion recognition is performed on the first voice data according to the preset emotion recognition model to obtain the first emotion recognition result. Specifically, audio feature extraction is performed on the first voice data to obtain a first audio feature vector. The first audio feature vector is then input into the preset emotion recognition model for emotion recognition, which yields the current emotion category of the target collection object, i.e., the first emotion recognition result. The emotion categories include calm, intense, low, and the like, where intense emotion includes anger and impatience.
It should be noted that the preset emotion recognition model may adopt any one of a DNN-HMM acoustic model, an N-gram language model, or an emotion analysis model based on a weighted finite-state machine; the present application does not limit the choice of emotion recognition model.
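A sketch of the feature-extraction-plus-classification step follows; mean MFCC features and a generic trained classifier are stand-ins, since the patent names DNN-HMM, N-gram, and WFST-based model families but fixes no implementation.

```python
import numpy as np
import librosa  # assumed available; any MFCC extractor would do

EMOTIONS = ["calm", "intense", "low"]

def extract_audio_features(wav_path: str) -> np.ndarray:
    """Turn an utterance into a fixed-length audio feature vector (mean MFCCs)."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape (13, frames)
    return mfcc.mean(axis=1)                             # shape (13,)

def recognize_emotion(feature_vector: np.ndarray, model) -> str:
    """`model` is any trained classifier exposing predict(); it stands in for
    the DNN-HMM / N-gram / WFST-based models the application mentions."""
    idx = int(model.predict(feature_vector.reshape(1, -1))[0])
    return EMOTIONS[idx]
```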
In step S160 of some embodiments, in order to effectively improve completion of the collection task and keep the call between the target collection object and the service object smooth, so as to avoid triggering the target object's dissatisfaction, the first voice text data corresponding to the first voice data is acquired, and the first emotion recognition result and the first voice text data are input into the preset collection strategy model for matching, so as to obtain the collection strategy information from the collection strategy model. The collection strategy information includes suggested collection frequency information, recommended collection script information, and the like.
In step S170 of some embodiments, in order to improve completion of the collection task, the collection strategy information is sent to the second service object, and the second service object performs collection processing on the target collection object according to the collection strategy information. Specifically, the second service object may contact the target collection object at the suggested collection frequency and follow the recommended collection script, so as to improve the success rate of the collection task. It should be noted that the second service object may be an AI outbound-call device or a collection agent.
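To show the shape of the strategy-matching inputs and outputs, here is an illustrative rule-table stand-in for the trained collection strategy model; all keys and values are assumptions, not taken from the patent.

```python
# Illustrative strategy matching keyed on (emotion, simple text intent).
STRATEGY_TABLE = {
    ("calm", "willing_to_pay"):   {"call_frequency": "every 3 days", "script": "gentle_reminder"},
    ("low", "hardship"):          {"call_frequency": "weekly",       "script": "installment_offer"},
    ("intense", "disputes_debt"): {"call_frequency": "pause",        "script": "deescalation"},
}

def match_collection_strategy(emotion: str, text_intent: str) -> dict:
    """Fall back to a default strategy when no rule matches."""
    return STRATEGY_TABLE.get((emotion, text_intent),
                              {"call_frequency": "weekly", "script": "standard"})

strategy_info = match_collection_strategy("low", "hardship")
# strategy_info is then sent to the second service object (AI device or agent).
```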
In each embodiment of the present application, when data related to the user's identity or characteristics, such as user information, user behavior data, user history data, and user location information, is processed, the user's permission or consent is obtained, and the collection, use, and processing of the data comply with the relevant laws, regulations, and standards of the relevant countries and regions. In addition, when an embodiment of the present application needs to acquire a user's sensitive personal information, the user's separate permission or consent is obtained through a pop-up window or a jump to a confirmation page; only after the user's separate permission or consent is explicitly obtained is the user-related data necessary for the normal operation of the embodiment acquired.
Referring to Fig. 2, Fig. 2 is a flowchart of another alternative embodiment of the emotion recognition-based collection method of the present application. In some embodiments, after step S140, the emotion recognition-based collection method specifically includes, but is not limited to, step S210 and step S220, which are described in detail below with reference to Fig. 2.
Step S210, when the first risk judgment result indicates that the target collection object exhibits a preset behavior, acquiring first voice text data corresponding to the first voice data, and inputting the first voice text data into a preset script recommendation model for script recommendation to obtain a target recommended script;
Step S220, sending the target recommended script to a third service object.
In steps S210 and S220 of some embodiments, when the first risk judgment result indicates that the target collection object exhibits a preset behavior, in order to effectively secure completion of the collection task, the first voice text data corresponding to the first voice data is acquired and input into the preset script recommendation model for script recommendation, a target recommended script is obtained, and the target recommended script is sent to the third service object. Specifically, when the first risk judgment result indicates that the target collection object exhibits a preset behavior, i.e., the object's voice carries at least one behavior intention such as an explicit expression of dissatisfaction, offensive language, or an outburst of intense emotion, the first voice text data is input into the preset script recommendation model for script recommendation to obtain the target recommended script. By judging in real time whether the target collection object is at risk of a preset behavior, the embodiments of the present application can recognize the user's dissatisfaction in time and handle it flexibly, thereby more effectively ensuring completion of the collection task.
Referring to Fig. 3, Fig. 3 is a flowchart of a specific implementation of step S170 provided by an embodiment of the present application. In some embodiments, step S170 specifically includes, but is not limited to, steps S310 to S370, which are described in detail below with reference to Fig. 3.
Step S310, when collection completion information for the target collection object is not acquired within the first overdue time interval, sending the collection strategy information to the second service object;
Step S320, establishing, according to the collection strategy information, a second call connection between the target collection object and the second service object within a preset second overdue time interval to acquire second call voice data of the target collection object and the second service object, where the second overdue time interval represents a preset time interval after the first overdue time interval ends;
Step S330, acquiring second voice data of the target collection object from the second call voice data;
Step S340, performing risk analysis on the second voice data according to the historical collection data, the preset-behavior keyword library, and the preset-behavior voiceprint feature database to obtain a second risk judgment result;
Step S350, when the second risk judgment result indicates that the target collection object exhibits no preset behavior, performing emotion recognition on the second voice data according to the emotion recognition model to obtain a second emotion recognition result;
Step S360, acquiring second voice text data corresponding to the second voice data, and inputting the second emotion recognition result and the second voice text data into a preset collection script model for script matching to obtain a target collection script;
Step S370, performing collection on the target collection object according to the target collection script.
In steps S310 and S320 of some embodiments, when the collection completion information for the target collection object is not obtained within the first overdue time interval, i.e., the collection task for the target collection object was not completed within the first overdue time interval, the collection strategy information is sent to the second service object so that the second service object performs collection processing on the target collection object according to it. The second service object then establishes, according to the collection strategy information, a second call connection with the target collection object within the preset second overdue time interval to acquire the second call voice data. The second overdue time interval represents a preset time interval after the first overdue time interval ends; if n2 denotes its end offset, the interval may be set to (T + n1, T + n2). Specifically, within the second overdue time interval, a call connection between the second service object and the target collection object is established to acquire the second call voice data, which represents the interactive audio stream between the target collection object and the second service object.
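A minimal sketch of this escalation decision follows; the field names, return values, and scheduling granularity are illustrative assumptions.

```python
def escalate_if_uncollected(case: dict, strategy_info: dict,
                            now_in_second_window: bool) -> str:
    """If the first-window collection did not complete, hand the case and its
    matched strategy to the second service object and schedule the second call."""
    if case.get("collection_completed"):
        return "closed"
    case["assigned_to"] = "second_service_object"
    case["strategy"] = strategy_info
    if now_in_second_window:
        return "place_second_call"      # establish the second call connection
    return "wait_for_second_window"

state = escalate_if_uncollected({"collection_completed": False},
                                strategy_info={"script": "installment_offer"},
                                now_in_second_window=True)
```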
In steps S330 and S340 of some embodiments, in order to judge the user's emotion in time, the second voice data of the target collection object is extracted from the second call voice data; it represents the speech uttered by the target collection object within the second call voice data. Risk analysis is performed on the second voice data according to the historical collection data, the preset-behavior keyword library, and the preset-behavior voiceprint feature database to obtain the second risk judgment result. The specific descriptions of the historical collection data, the preset-behavior keyword library, and the preset-behavior voiceprint feature database are the same as in the above embodiments and are not repeated here.
In steps S350 to S370 of some embodiments, when the second risk judgment result indicates that the target collection object exhibits no preset behavior, in order to judge and handle the user's emotion accurately and in time, emotion recognition is performed on the second voice data according to the emotion recognition model to obtain the second emotion recognition result. Specifically, audio feature extraction is performed on the second voice data to obtain a second audio feature vector, which is then input into the preset emotion recognition model for emotion recognition, yielding the current emotion category of the target collection object, i.e., the second emotion recognition result. In order to effectively improve completion of the collection task and keep the call between the target collection object and the service object smooth, so as to avoid triggering the target object's dissatisfaction, the second voice text data corresponding to the second voice data is acquired, the second emotion recognition result and the second voice text data are input into the preset collection script model for script matching to obtain the target collection script, and collection is performed on the target collection object according to the target collection script, so as to improve the success rate of the collection task.
Referring to Fig. 4, Fig. 4 is a flowchart of another alternative embodiment of the emotion recognition-based collection method of the present application. In some embodiments, after step S340, the emotion recognition-based collection method specifically includes, but is not limited to, steps S410 to S440, which are described in detail below with reference to Fig. 4.
Step S410, acquiring third voice data of the second service object from the second call voice data;
Step S420, performing emotional state analysis on the third voice data according to a preset emotional-state keyword library and an emotional-state voiceprint feature database to obtain a collector emotion judgment result;
Step S430, acquiring second voice text data corresponding to the second voice data, and inputting the second voice text data into a preset script recommendation model for script recommendation to obtain a target recommended script;
Step S440, when the collector emotion judgment result indicates that the second service object is in a preset emotional state, sending the target recommended script to a third service object, the third service object performing collection on the target collection object according to the target recommended script, and handling the second service object according to a preset management specification.
In steps S410 and S420 of some embodiments, in order to keep the call between the target collection object and the service object smooth, the third voice data of the second service object in the second call voice data is acquired in real time, and emotional state analysis is performed on it according to the preset emotional-state keyword library and the emotional-state voiceprint feature database to obtain the collector emotion judgment result. The emotional-state keyword library stores keywords for different emotional states, where the emotional states include calm, intense, low, and the like, and intense emotion includes anger, impatience, and the like. With this library, it is judged whether the text data corresponding to the third voice data contains keywords of an emotional state, yielding a keyword emotion judgment result. The emotional-state voiceprint feature database stores voiceprint features that can express emotional states; with it, the emotional state matched by the voiceprint features in the third voice data is judged, yielding a voiceprint emotion judgment result. The collector emotion judgment result is then obtained from the keyword emotion judgment result and the voiceprint emotion judgment result.
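As a sketch of fusing the keyword emotion judgment and the voiceprint emotion judgment into the collector emotion judgment result, the precedence rule giving "intense" priority is an assumption.

```python
from typing import Optional

def collector_emotion_judgment(keyword_emotion: Optional[str],
                               voiceprint_emotion: Optional[str]) -> str:
    """keyword_emotion: state matched from the emotional-state keyword library;
    voiceprint_emotion: state matched from the voiceprint feature database."""
    if "intense" in (keyword_emotion, voiceprint_emotion):
        return "intense"                 # either channel flagging intense wins
    return keyword_emotion or voiceprint_emotion or "calm"

result = collector_emotion_judgment(keyword_emotion=None, voiceprint_emotion="intense")
# "intense" -> the second service object is in the preset emotional state,
# so the target recommended script is routed to the third service object.
```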
In steps S430 and S440 of some embodiments, the second voice text data corresponding to the second voice data is acquired and input into the preset script recommendation model for script recommendation to obtain the target recommended script. When the collector emotion judgment result indicates that the second service object is in the preset emotional state, the target recommended script is sent to the third service object. The preset emotional state may be set to an intense emotional state; that is, when the second service object's current intense emotional state could provoke the user's dissatisfaction, the target recommended script is sent to the third service object, and the third service object performs collection on the target collection object according to the target recommended script. To ensure that the second service object avoids an intense emotional state in later collection work, the second service object may also be handled according to a preset management specification, which can be adjusted to actual rules and requirements, for example by deducting performance points or coaching the second service object on work attitude.
It should be noted that, in order to keep the call between the target collection object and the service object smooth, after the first call connection is established, fourth voice data of the first service object in the first call voice data may likewise be acquired in real time, and emotional state analysis may be performed on it according to the preset emotional-state keyword library and the emotional-state voiceprint feature database to obtain a collector emotion judgment result. When that result indicates that the first service object is in the preset emotional state, the obtained target recommended script is sent to the third service object, the third service object performs collection on the target collection object according to the target recommended script, and the first service object is handled according to the preset management specification.
In some embodiments, after step S430, the emotion recognition-based collection method further includes:
when the collector emotion judgment result indicates that the second service object is not in the preset emotional state, the second service object performing collection on the target collection object according to the target recommended script.
Specifically, when the collector emotion judgment result indicates that the second service object is not in the preset emotional state, i.e., the current second service object can still serve the target collection object well, the second service object performs collection on the target collection object according to the target recommended script. According to the embodiments of the present application, when the second service object is judged to be in the preset emotional state, switching to the third service object allows collection of the target collection object to continue in time, avoiding triggering the user's dissatisfaction and thereby more effectively ensuring completion of the collection task.
Referring to Fig. 5, Fig. 5 is a flowchart of step S140 provided by an embodiment of the present application. In some embodiments, step S140 specifically includes, but is not limited to, steps S510 to S560, which are described in detail below with reference to Fig. 5.
Step S510, acquiring first voice text data corresponding to the first voice data;
Step S520, acquiring the preset-behavior keyword library and the preset-behavior voiceprint feature database, where the preset-behavior keyword library includes a plurality of preset behavior-tendency keywords;
Step S530, performing keyword matching between the first voice text data and the preset behavior-tendency keywords to obtain a preset-behavior keyword matching result;
Step S540, acquiring a target voiceprint feature corresponding to the first voice data;
Step S550, comparing the target voiceprint feature against the preset-behavior voiceprint feature database to obtain a voiceprint comparison result;
Step S560, performing risk analysis on the first voice data according to the historical collection data, the preset-behavior keyword matching result, and the voiceprint comparison result to obtain the first risk judgment result.
In steps S510 to S560 of some embodiments, in order to obtain an accurate risk judgment for the target collection object, the first voice text data corresponding to the first voice data is acquired, and the preset-behavior keyword library and the preset-behavior voiceprint feature database are acquired, where the preset-behavior keyword library includes a plurality of preset behavior-tendency keywords. Keyword matching is performed between the first voice text data and the preset behavior-tendency keywords to obtain the preset-behavior keyword matching result. The target voiceprint feature corresponding to the first voice data is then acquired and compared against the preset-behavior voiceprint feature database to obtain the voiceprint comparison result. Finally, risk analysis is performed on the first voice data according to the historical collection data, the preset-behavior keyword matching result, and the voiceprint comparison result, and the corresponding emotion level, i.e., the first risk judgment result, is determined.
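A sketch of the two matching operations follows; substring keyword matching and a cosine-similarity voiceprint comparison with a 0.8 threshold are illustrative choices, since the patent only requires matching against the library and database.

```python
import numpy as np

def keyword_match(voice_text: str, tendency_keywords: set[str]) -> list[str]:
    """Preset-behavior keyword matching on the transcribed voice text."""
    return [kw for kw in tendency_keywords if kw in voice_text]

def voiceprint_compare(target: np.ndarray, database: list[np.ndarray],
                       threshold: float = 0.8) -> bool:
    """Cosine-similarity comparison of the target voiceprint feature against
    the preset-behavior voiceprint feature database."""
    for ref in database:
        sim = float(np.dot(target, ref) /
                    (np.linalg.norm(target) * np.linalg.norm(ref)))
        if sim >= threshold:
            return True
    return False

hits = keyword_match("I refuse to pay and will complain",
                     {"refuse to pay", "complain"})
matched = voiceprint_compare(np.random.rand(128), [np.random.rand(128)])
```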
Referring to Fig. 6, Fig. 6 is a flowchart of another alternative embodiment of the emotion recognition-based collection method provided by an embodiment of the present application. In some embodiments, after step S160, the emotion recognition-based collection method specifically includes, but is not limited to, steps S610 to S630, which are described in detail below with reference to Fig. 6.
Step S610, acquiring text features of the first voice text data and voice features of the first voice data;
Step S620, performing intention recognition on the target collection object according to the text features and the voice features to obtain an intention recognition result;
Step S630, marking the target collection object according to the intention recognition result to obtain a marking result, where the marking result represents key intention information of the target collection object in the first voice data.
In steps S610 to S630 of some embodiments, in order to accurately record the intention information of the target object in the hastening process, the text feature of the first speech text data and the speech feature of the first speech data are acquired, and the intention recognition result is obtained by performing intention recognition on the target object according to the text feature and the speech feature. And finally, marking the target collection object according to the intention recognition result to obtain a marking result, wherein the marking result is used for representing key intention information of the target collection object in the first voice data.
It should be noted that, during the second call connection, it is likewise necessary to acquire the text feature corresponding to the second voice text data and the voice feature of the second voice data, and to perform intention recognition on the target collection object according to these features to obtain an intention recognition result. Finally, the target collection object is marked according to the intention recognition result to obtain a marking result, where the marking result is used for representing key intention information of the target collection object in the second voice data. The emotion recognition-based collection method can be applied to emotion early warning in the collection process, so that a user's dissatisfied emotions are recognized in time and handled flexibly, allowing the collection task to be completed more effectively.
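As an illustration of steps S610 to S630, the following Python sketch fuses a text feature of the voice text data with a simple prosodic voice feature and classifies the result into an assumed intention label set. The feature definitions, the label set, and the use of scikit-learn's LogisticRegression are choices of this sketch only; the embodiments do not name a concrete classifier.

import numpy as np
from sklearn.linear_model import LogisticRegression

INTENT_LABELS = ["willing_to_repay", "needs_extension", "disputes_debt"]  # assumed label set

def extract_text_feature(text: str) -> np.ndarray:
    # Placeholder text feature; a sentence embedding would typically be used instead.
    return np.array([len(text), text.count("repay"), text.count("cannot")], dtype=float)

def extract_voice_feature(samples: np.ndarray) -> np.ndarray:
    # Placeholder prosodic features: short-time energy and a zero-crossing-rate proxy.
    samples = samples.astype(float)
    energy = float(np.mean(samples ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.sign(samples)))) / 2)
    return np.array([energy, zcr])

def recognize_intention(clf: LogisticRegression, text: str, samples: np.ndarray) -> str:
    """Predict the key intention information from the fused text and voice features."""
    fused = np.concatenate([extract_text_feature(text), extract_voice_feature(samples)])
    return INTENT_LABELS[int(clf.predict(fused.reshape(1, -1))[0])]

The returned label would then be attached to the target collection object as the marking result of step S630.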
Referring to fig. 7, fig. 7 is a block diagram of an emotion recognition-based collection system according to some embodiments of the present application. In some embodiments, the emotion recognition-based collection system includes a call connection module 710, a first voice data acquisition module 720, a second voice data acquisition module 730, a risk analysis module 740, an emotion recognition module 750, a matching module 760, and a collection module 770.
The call connection module 710 is configured to establish a first call connection between the target collection object and the first service object according to a preset first overdue time interval, so as to obtain first call voice data of the target collection object and the first service object;
the first voice data acquisition module 720 is configured to acquire first voice data of the target collection object in the first call voice data;
the second voice data acquisition module 730 is configured to acquire historical collection data of the target collection object;
the risk analysis module 740 is configured to perform risk analysis on the first voice data according to the historical collection data, the preset behavior keyword library, and the preset behavior voiceprint feature database to obtain a first risk judgment result;
the emotion recognition module 750 is configured to, when the first risk judgment result indicates that the target collection object has no preset behavior, perform emotion recognition on the first voice data according to a preset emotion recognition model to obtain a first emotion recognition result (an illustrative sketch of this step follows the module list);
the matching module 760 is configured to acquire first voice text data corresponding to the first voice data, and to input the first emotion recognition result and the first voice text data into a preset collection policy model for matching to obtain collection policy information;
and the collection module 770 is configured to send the collection policy information to the second service object, which performs collection processing on the target collection object according to the collection policy information.
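As flagged in the description of the emotion recognition module 750 above, the following Python sketch illustrates one conventional realization of a preset emotion recognition model, using mean-pooled MFCC features and a support vector classifier. The label set and the librosa/scikit-learn choices are assumptions of this sketch; the embodiments do not prescribe a concrete model.

import numpy as np
import librosa
from sklearn.svm import SVC

EMOTION_LABELS = ["calm", "anxious", "angry"]  # assumed label set

def mfcc_feature(samples: np.ndarray, sr: int = 16000) -> np.ndarray:
    """Mean-pooled MFCCs as a fixed-length embedding of one utterance."""
    mfcc = librosa.feature.mfcc(y=samples, sr=sr, n_mfcc=13)
    return mfcc.mean(axis=1)

def train_emotion_model(utterances: list, labels: list) -> SVC:
    """Fit the classifier on labelled utterances (integer indices into EMOTION_LABELS)."""
    X = np.stack([mfcc_feature(u) for u in utterances])
    clf = SVC()
    clf.fit(X, labels)
    return clf

def recognize_emotion(clf: SVC, samples: np.ndarray) -> str:
    """Produce the first emotion recognition result for the target object's voice data."""
    return EMOTION_LABELS[int(clf.predict(mfcc_feature(samples).reshape(1, -1))[0])]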
It should be noted that the emotion recognition-based collection system in the embodiment of the present application is used to implement, and corresponds to, the emotion recognition-based collection method described above; for the specific processing procedure, reference is made to that method, which is not repeated herein.
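To show how modules 710 to 770 might be chained, the following self-contained Python sketch wires stub implementations into the first-call flow. Every function body here is a stand-in, since the embodiments do not disclose concrete component implementations; the names and return values are assumptions of this sketch.

def get_target_voice(call_audio: bytes) -> bytes:
    # Module 720 stand-in: real code would isolate the target collection object's speech.
    return call_audio

def risk_analysis(voice: bytes, history: dict) -> bool:
    # Module 740 stand-in: returns True when a preset behavior is suspected.
    return history.get("flagged", False)

def recognize_emotion(voice: bytes) -> str:
    # Module 750 stand-in for the preset emotion recognition model.
    return "calm"

def transcribe(voice: bytes) -> str:
    # Speech-to-text stand-in producing the first voice text data.
    return ""

def match_policy(emotion: str, text: str) -> str:
    # Module 760 stand-in for the preset collection policy model.
    return "soft_reminder" if emotion == "calm" else "defer_and_soothe"

def handle_first_call(call_audio: bytes, history: dict) -> str:
    """First-call flow: risk analysis, then emotion-aware policy matching (modules 740-770)."""
    voice = get_target_voice(call_audio)
    if risk_analysis(voice, history):
        return "escalate_to_third_service_object"  # preset-behavior branch
    emotion = recognize_emotion(voice)
    text = transcribe(voice)
    return match_policy(emotion, text)  # policy information sent to the second service object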
An embodiment of the present application further provides a computer device, including:
at least one memory;
at least one processor;
at least one computer program;
The computer program is stored in the memory, and the processor executes the at least one computer program to implement the emotion recognition-based collection method of the embodiments of the present application described above. The computer device may be any intelligent terminal, including a mobile phone, a tablet computer, a personal digital assistant (PDA), a vehicle-mounted computer, and the like.
The computer device according to the embodiment of the present application is described in detail below with reference to fig. 8.
Referring to fig. 8, fig. 8 illustrates the hardware configuration of a computer device according to another embodiment. The computer device includes:
the processor 810, which may be implemented by a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is configured to execute a related program to implement the technical solution provided in the embodiments of the present application;
the memory 820, which may be implemented in the form of a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 820 may store an operating system and other application programs; when the technical solution provided in the embodiments of the present application is implemented by software or firmware, the relevant program code is stored in the memory 820 and called by the processor 810 to execute the emotion recognition-based collection method of the embodiments of the present application;
an input/output interface 830 for implementing information input and output;
the communication interface 840, which is configured to implement communication interaction between this device and other devices, either in a wired manner (e.g., USB, network cable) or in a wireless manner (e.g., mobile network, Wi-Fi, Bluetooth);
a bus 850 that transfers information between the various components of the device (e.g., the processor 810, the memory 820, the input/output interface 830, and the communication interface 840);
wherein processor 810, memory 820, input/output interface 830, and communication interface 840 are communicatively coupled to each other within the device via bus 850.
The embodiment of the application also provides a storage medium, which is a computer-readable storage medium storing computer-executable instructions for causing a computer to execute the emotion recognition-based collection method of the above embodiments.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described in the embodiments of the present application are for more clearly illustrating the technical solutions of the embodiments of the present application, and do not constitute limitations on the technical solutions provided in the embodiments of the present application, and it is obvious to those skilled in the art that the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems with the evolution of technologies and the emergence of new application scenarios.
It will be appreciated by those skilled in the art that the embodiments shown in the figures are not intended to limit the embodiments of the present application and may include more or fewer steps than those shown, or some of the steps may be combined, or different steps may be included.
The above-described embodiments of the apparatus are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may also be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, and functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Moreover, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" for describing an association relationship of associated objects, indicating that there may be three relationships, e.g., "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product, which is stored in a storage medium and includes multiple instructions for causing an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing programs, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and the scope of the claims of the embodiments of the present application is not limited thereto. Any modifications, equivalents and improvements that may occur to those skilled in the art without departing from the scope and spirit of the embodiments of the present application are intended to be within the scope of the claims of the embodiments of the present application.

Claims (10)

1. An emotion recognition-based collection method, the method comprising:
establishing a first call connection between a target collection object and a first service object according to a preset first overdue time interval, so as to obtain first call voice data of the target collection object and the first service object;
acquiring first voice data of the target collection object in the first call voice data;
acquiring historical collection data of the target collection object;
performing risk analysis on the first voice data according to the historical collection data, a preset behavior keyword library and a preset behavior voiceprint feature database to obtain a first risk judgment result;
when the first risk judgment result shows that the target collection object has no preset behavior, performing emotion recognition on the first voice data according to a preset emotion recognition model to obtain a first emotion recognition result;
acquiring first voice text data corresponding to the first voice data, and inputting the first emotion recognition result and the first voice text data into a preset collection policy model for matching to obtain collection policy information;
and sending the collection policy information to a second service object, wherein the second service object performs collection processing on the target collection object according to the collection policy information.
2. The method according to claim 1, wherein after the performing risk analysis on the first voice data according to the historical collection data of the target collection object, the preset behavior keyword library and the preset behavior voiceprint feature database to obtain a first risk judgment result, the method further comprises:
when the first risk judgment result shows that the target collection object has a preset behavior, acquiring first voice text data corresponding to the first voice data, and inputting the first voice text data into a preset script recommendation model for script recommendation to obtain a target recommended script;
and sending the target recommended script to a third service object.
3. The method according to claim 1, wherein the sending the collection policy information to a second service object, and the second service object performing collection processing on the target collection object according to the collection policy information, comprises:
when collection completion information of the target collection object is not acquired within the first overdue time interval, sending the collection policy information to the second service object;
establishing a second call connection between the target collection object and the second service object at a preset second overdue time interval according to the collection policy information, so as to acquire second call voice data of the target collection object and the second service object, wherein the second overdue time interval represents a preset time interval after the first overdue time interval ends;
acquiring second voice data of the target collection object in the second call voice data;
performing risk analysis on the second voice data according to the historical collection data, the preset behavior keyword library and the preset behavior voiceprint feature database to obtain a second risk judgment result;
when the second risk judgment result shows that the target collection object has no preset behavior, performing emotion recognition on the second voice data according to the emotion recognition model to obtain a second emotion recognition result;
acquiring second voice text data corresponding to the second voice data, and inputting the second emotion recognition result and the second voice text data into a preset collection script model for script matching to obtain a target collection script;
and performing collection on the target collection object according to the target collection script.
4. The method according to claim 3, wherein after the performing risk analysis on the second voice data according to the historical collection data, the preset behavior keyword library and the preset behavior voiceprint feature database to obtain a second risk judgment result, the method further comprises:
acquiring third voice data of the second service object in the second call voice data;
performing emotional state analysis on the third voice data according to a preset emotional state keyword library and an emotional state voiceprint feature database to obtain a collection emotion judgment result;
acquiring second voice text data corresponding to the second voice data, and inputting the second voice text data into a preset script recommendation model for script recommendation to obtain a target recommended script;
and when the collection emotion judgment result shows that the second service object is in a preset emotional state, sending the target recommended script to a third service object, wherein the third service object performs collection on the target collection object according to the target recommended script, and the second service object is handled according to a preset management specification.
5. The method according to claim 4, wherein after the acquiring second voice text data corresponding to the second voice data and inputting the second voice text data into the preset script recommendation model for script recommendation to obtain a target recommended script, the method further comprises:
and when the collection emotion judgment result shows that the second service object is not in the preset emotional state, performing, by the second service object, collection on the target collection object according to the target recommended script.
6. The method according to any one of claims 1 to 5, wherein performing risk analysis on the first voice data according to the historical collection data, a preset behavior keyword library and a preset behavior voiceprint feature database to obtain a first risk judgment result comprises:
acquiring first voice text data corresponding to the first voice data;
acquiring a preset behavior keyword library and a preset behavior voiceprint feature database, wherein the preset behavior keyword library comprises a plurality of preset behavior tendency keywords;
performing keyword matching processing on the first voice text data and the preset behavior tendency keywords to obtain a preset behavior keyword matching result;
acquiring a target voiceprint feature corresponding to the first voice data;
comparing the target voiceprint feature with the preset behavior voiceprint feature database to obtain a voiceprint comparison result;
and performing risk analysis on the first voice data according to the historical collection data, the preset behavior keyword matching result and the voiceprint comparison result to obtain the first risk judgment result.
7. The method according to any one of claims 1 to 5, wherein after the acquiring first voice text data corresponding to the first voice data and inputting the first emotion recognition result and the first voice text data into the preset collection policy model for matching to obtain collection policy information, the method further comprises:
acquiring a text feature of the first voice text data and a voice feature of the first voice data;
performing intention recognition on the target collection object according to the text feature and the voice feature to obtain an intention recognition result;
and marking the target collection object according to the intention recognition result to obtain a marking result, wherein the marking result is used for representing key intention information of the target collection object in the first voice data.
8. An emotion recognition-based collection system, the system comprising:
the call connection module is used for establishing a first call connection between a target collection object and a first service object according to a preset first overdue time interval so as to acquire first call voice data of the target collection object and the first service object;
the first voice data acquisition module is used for acquiring first voice data of the target collection object in the first call voice data;
the second voice data acquisition module is used for acquiring historical collection data of the target collection object;
the risk analysis module is used for performing risk analysis on the first voice data according to the historical collection data, a preset behavior keyword library and a preset behavior voiceprint feature database to obtain a first risk judgment result;
the emotion recognition module is used for carrying out emotion recognition on the first voice data according to a preset emotion recognition model to obtain a first emotion recognition result when the first risk judgment result shows that the target collection object has no preset behavior;
the matching module is used for acquiring first voice text data corresponding to the first voice data, inputting the first emotion recognition result and the first voice text data into a preset collection policy model for matching to obtain collection policy information;
and the collection module is used for sending the collection policy information to a second service object, wherein the second service object performs collection processing on the target collection object according to the collection policy information.
9. A computer device, comprising:
at least one memory;
at least one processor;
at least one computer program;
the computer program is stored in the memory, and the at least one computer program is executed by the processor to implement:
the method of any one of claims 1 to 7.
10. A storage medium that is a computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for causing a computer to execute:
the method of any one of claims 1 to 7.
CN202210860323.2A 2022-07-21 2022-07-21 Emotion recognition-based collection method and system, computer equipment and storage medium Pending CN115249481A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210860323.2A CN115249481A (en) 2022-07-21 2022-07-21 Emotion recognition-based collection method and system, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210860323.2A CN115249481A (en) 2022-07-21 2022-07-21 Emotion recognition-based collection method and system, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115249481A true CN115249481A (en) 2022-10-28

Family

ID=83699595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210860323.2A Pending CN115249481A (en) 2022-07-21 2022-07-21 Emotion recognition-based collection method and system, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115249481A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116153330A (en) * 2023-04-04 2023-05-23 杭州度言软件有限公司 Intelligent telephone voice robot control method
CN116153330B (en) * 2023-04-04 2023-06-23 杭州度言软件有限公司 Intelligent telephone voice robot control method
CN117319559A (en) * 2023-11-24 2023-12-29 杭州度言软件有限公司 Method and system for prompting receipt based on intelligent voice robot
CN117319559B (en) * 2023-11-24 2024-02-02 杭州度言软件有限公司 Method and system for prompting receipt based on intelligent voice robot

Similar Documents

Publication Publication Date Title
CN109587360B (en) Electronic device, method for coping with tactical recommendation, and computer-readable storage medium
CN110910901B (en) Emotion recognition method and device, electronic equipment and readable storage medium
CN115249481A (en) Emotion recognition-based collection method and system, computer equipment and storage medium
US20200007638A1 (en) Method and apparatus for pushing information
CN109960723B (en) Interaction system and method for psychological robot
CN110020009B (en) Online question and answer method, device and system
CN109767787B (en) Emotion recognition method, device and readable storage medium
CN111259132A (en) Method and device for recommending dialect, computer equipment and storage medium
CN103593562A (en) Methods and systems for compliance confirmation and incentives
CN113360622B (en) User dialogue information processing method and device and computer equipment
US11816609B2 (en) Intelligent task completion detection at a computing device
CN111260102A (en) User satisfaction prediction method and device, electronic equipment and storage medium
CN110610705A (en) Voice interaction prompter based on artificial intelligence
CA3147634A1 (en) Method and apparatus for analyzing sales conversation based on voice recognition
KR20220071059A (en) Method for evaluation of emotion based on emotion analysis model and device using the same
CN114783421A (en) Intelligent recommendation method and device, equipment and medium
CN105869631B (en) The method and apparatus of voice prediction
CN116580704A (en) Training method of voice recognition model, voice recognition method, equipment and medium
CN109389493A (en) Customized test question mesh input method, system and equipment based on speech recognition
CN115271932A (en) Outbound risk identification method and device
CN110765242A (en) Method, device and system for providing customer service information
CN114969295A (en) Dialog interaction data processing method, device and equipment based on artificial intelligence
CN115564529A (en) Voice navigation control method and device, computer terminal and storage medium
CN115526662A (en) Article information pushing method and device, electronic equipment and storage medium
CN115292460A (en) Topic recommendation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination