CN116886819A - Multi-dimensional telephone traffic data monitoring method, device and storage medium - Google Patents

Info

Publication number
CN116886819A
Authority
CN
China
Prior art keywords
seat
voice
client
emotion
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310990355.9A
Other languages
Chinese (zh)
Other versions
CN116886819B (en)
Inventor
漆振飞
江梅
侯本辉
刘畅
左云杰
李祥
颜文达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan Power Grid Co Ltd
Original Assignee
Yunnan Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan Power Grid Co Ltd
Priority to CN202310990355.9A
Publication of CN116886819A
Application granted
Publication of CN116886819B
Current legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/22 Arrangements for supervision, monitoring or testing
    • H04M3/2281 Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls

Abstract

The application relates to the technical field of telephone traffic monitoring and addresses the complex hardware configuration and high monitoring cost of existing monitoring methods. It provides a multi-dimensional telephone traffic data monitoring method comprising the following steps: acquiring multidimensional monitoring data during a call between a call center seat (agent) and a customer, wherein the multidimensional monitoring data comprises customer voice data, seat voice data and seat face data; dividing the customer voice data, in time order, into a plurality of client voice segments of a preset duration; calculating a positive difference between the client prosody value of the first client voice segment and the client prosody value of each subsequent client voice segment; and judging whether the positive difference is smaller than a preset prosody threshold. According to the application, suspected abnormal calls are first screened out according to prosody changes in the customer's voice; the seat's emotion is then judged from the seat's voice emotion features and facial expression features, and the suspected abnormal calls are screened a second time, which improves the accuracy and reliability of telephone traffic monitoring at low monitoring cost.

Description

Multi-dimensional telephone traffic data monitoring method, device and storage medium
Technical Field
The present application relates to the field of traffic monitoring technologies, and in particular, to a method, an apparatus, and a storage medium for monitoring multidimensional traffic data.
Background
As market competition intensifies, enterprises generally need to supervise the service process of their telephone operators in order to understand and guarantee service quality and to maintain the corporate image. The traditional approach either records the entire conversation between service personnel and customers through a recording quality inspection system, after which quality inspectors periodically and randomly sample the recordings for review, or relies on manual monitoring, in which quality inspectors listen to the entire call.
However, periodic random sampling of recordings is an after-the-fact quality check: customer needs cannot be perceived in real time, so customers' questions cannot be answered promptly and the operators' service quality cannot be ensured at the first opportunity. Manual monitoring requires a large investment of manpower and a complex monitoring system configuration, so the monitoring cost is high and the needs of modern enterprises cannot be met.
Disclosure of Invention
In view of the defects of the prior art, the application provides a multi-dimensional telephone traffic data monitoring method, device and storage medium. They solve the technical problem that existing monitoring methods require a large investment of manpower and a complex monitoring system configuration, which makes monitoring costly, and they achieve the goals of improving monitoring accuracy and reducing monitoring cost.
In order to solve the above technical problems, the application provides the following technical scheme: a multi-dimensional traffic data monitoring method comprising the following steps:
S1, acquiring multidimensional monitoring data during a call between a call center seat (agent) and a customer, wherein the multidimensional monitoring data comprises customer voice data, seat voice data and seat face data;
S2, dividing the client voice data, in time order, into a plurality of consecutive non-overlapping client voice segments of a preset duration;
S3, calculating a positive difference Z between the client prosody value of the first client voice segment and the client prosody value of each subsequent client voice segment;
S4, judging whether the positive difference Z is smaller than a preset prosody threshold Wt; if the positive difference Z is smaller than the prosody threshold Wt, regarding the call as normal and returning to step S2 to monitor the next client voice segment;
if the positive difference Z is greater than or equal to the prosody threshold Wt, marking the client voice segment as suspected abnormal and executing the next step;
S5, extracting, from the seat voice data and the seat face data respectively, the seat voice emotion features and seat facial expression features of the time period corresponding to the client voice segment marked as suspected abnormal, and performing weighted fusion of the seat voice emotion features and the seat facial expression features to obtain a seat emotion value Q;
S6, judging whether the seat emotion value Q is smaller than a preset seat emotion threshold Xt; if the seat emotion value Q is smaller than the seat emotion threshold Xt, regarding the call as normal and returning to step S2 to monitor the next client voice segment;
if the seat emotion value Q is greater than or equal to the seat emotion threshold Xt, regarding the call as abnormal and marking it as abnormal;
S7, performing exception handling on the call marked as abnormal according to an exception handling rule.
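The control flow of steps S3 to S6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation claimed by the application: the function name `screen_call`, the callback `seat_emotion_at` and all numeric values are hypothetical, and the prosody values are assumed to have been quantified per segment already.

```python
from typing import Callable, List, Sequence


def screen_call(
    prosody_values: Sequence[float],
    wt: float,
    xt: float,
    seat_emotion_at: Callable[[int], float],
) -> List[int]:
    """Return the indices of client voice segments judged abnormal.

    prosody_values[0] is the baseline (first segment); wt and xt stand in
    for the prosody threshold Wt and the seat emotion threshold Xt.
    """
    abnormal = []
    baseline = prosody_values[0]
    for i in range(1, len(prosody_values)):
        z = abs(prosody_values[i] - baseline)  # S3: positive difference Z
        if z < wt:                             # S4: normal, keep monitoring
            continue
        q = seat_emotion_at(i)                 # S5: computed only for suspected segments
        if q >= xt:                            # S6: second screening
            abnormal.append(i)                 # S7 would then handle these
    return abnormal


# Hypothetical data: segments 2 and 3 show a prosody jump, but only
# segment 3 coincides with a high seat emotion value.
values = [1.0, 1.1, 2.0, 2.5]
seat_q = {2: 0.3, 3: 0.9}
print(screen_call(values, wt=0.5, xt=0.6, seat_emotion_at=lambda i: seat_q[i]))
```

Because the seat emotion value is requested only for segments that pass the first threshold, the more expensive voice and face analysis runs on a small share of the call.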
Further, in step S3, calculating the positive difference Z between the client prosody value of the first client voice segment and the client prosody value of each subsequent client voice segment specifically comprises:
S31, filtering and denoising each client voice segment to obtain processed client voice segments;
S32, extracting the prosody features from each processed client voice segment and quantifying them into client prosody values $Y_1, Y_2, \ldots, Y_i, \ldots$;
S33, calculating the positive difference Z between the client prosody value of the first client voice segment and the client prosody value of each subsequent client voice segment.
Further, in step S5, extracting from the seat voice data and the seat face data the seat voice emotion features and seat facial expression features of the time period corresponding to the client voice segment marked as suspected abnormal, and performing weighted fusion of them to obtain the seat emotion value Q, specifically comprises:
S51, acquiring the time period T corresponding to the client voice segment marked as suspected abnormal;
S52, extracting, from the seat voice data and the seat face data respectively, the seat voice emotion features and the seat facial expression features in the time period T and the adjacent preceding and following time periods T±1;
S53, performing weighted fusion of the seat voice emotion features and the seat facial expression features to obtain the seat emotion value Q.
Further, before step S7, the method further comprises:
S8, extracting customer semantic information from the customer voice data of the call marked as abnormal and identifying the customer's appeal;
S9, retrieving a response script from the question bank according to the customer's appeal and sending it to the terminal equipment used by the seat personnel.
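Step S9 amounts to a lookup keyed by the identified appeal. The sketch below is only an illustration: the appeal labels, the script texts and the name `find_response_script` are hypothetical, and the application does not specify how the question bank is organized.

```python
from typing import Dict, Optional

# Hypothetical question bank: appeal label -> response script.
QUESTION_BANK: Dict[str, str] = {
    "billing_dispute": "Apologize, confirm the account number, and review the latest bill with the customer.",
    "power_outage": "Confirm the address, report the current repair status, and give an estimated restoration time.",
}


def find_response_script(appeal: str, bank: Dict[str, str] = QUESTION_BANK) -> Optional[str]:
    """Return the script for an identified appeal, or None if the bank has no entry."""
    return bank.get(appeal)
```

Returning `None` for an unknown appeal leaves it to the caller to decide on a fallback, for example escalating to a quality inspector.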
Further, in step S6, the seat emotion threshold Xt is a critical value obtained by taking voice data and facial emotion data of seat personnel, recorded on two channels under various emotional states, as a sample set and inputting them into a multidimensional emotion judgment model for training and learning.
Further, in step S7, the exception handling rule means that an abnormality level is marked according to the seat emotion value Q, and different abnormality levels have different handling mechanisms.
The application also provides the following technical scheme: a device for implementing the multi-dimensional traffic data monitoring method, comprising:
a monitoring data acquisition module, used for acquiring multidimensional monitoring data during a call between a call center seat and a customer, wherein the multidimensional monitoring data comprises customer voice data, seat voice data and seat face data;
a voice segmentation module, used for dividing the client voice data, in time order, into a plurality of consecutive non-overlapping client voice segments of a preset duration;
a first calculation module, used for calculating the positive difference Z between the client prosody value of the first client voice segment and the client prosody value of each subsequent client voice segment;
a first judging module, used for judging whether the positive difference Z is smaller than the preset prosody threshold Wt;
a second calculation module, used for extracting, from the seat voice data and the seat face data respectively, the seat voice emotion features and seat facial expression features of the time period corresponding to the client voice segment marked as suspected abnormal, and performing weighted fusion of them to obtain the seat emotion value Q;
a second judging module, used for judging whether the seat emotion value Q is smaller than the preset seat emotion threshold Xt;
and an exception handling module, used for performing exception handling on the call marked as abnormal according to the exception handling rule.
Further, the device further comprises:
a voice information recognition module, used for extracting customer semantic information from the customer voice data of the call marked as abnormal and identifying the customer's appeal;
and a script generation module, used for retrieving a response script from the question bank according to the customer's appeal and sending it to the terminal equipment used by the seat personnel.
The application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the multi-dimensional telephone traffic data monitoring method.
By means of the above technical scheme, the application provides a multi-dimensional telephone traffic data monitoring method, device and storage medium with at least the following beneficial effects:
1. The application first preliminarily screens out the seat ID and the time period of a suspected abnormal call according to abrupt changes in the prosody of the customer's voice. It then extracts seat voice emotion features and facial expression features from the voice data and face data of the seat personnel in the two time periods adjacent to the suspected abnormal period, fuses them by weighting, judges the seat's emotion against the preset seat emotion threshold, and screens the suspected abnormal call a second time according to the seat's emotion. This improves the accuracy and reliability of abnormal-call screening, achieves accurate monitoring of the seat personnel's service quality, and greatly reduces the monitoring cost.
2. The application analyzes voice emotion and semantic information from both the customer's and the seat's perspective, based on the causal relationship between changes in the customer's emotion and the call context. On the one hand, when an abnormal call is found, the corresponding handling mechanism can be triggered at the first opportunity, improving customer satisfaction; on the other hand, customer appeals can be perceived in real time, making it easier for the enterprise to solve the customer's problem promptly. The application thus supports the enterprise's development and has high social value and application prospects.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flowchart of a multi-dimensional traffic data monitoring method according to a first embodiment of the present application;
FIG. 2 is a sub-flowchart of a multi-dimensional traffic data monitoring method according to a first embodiment of the present application;
FIG. 3 is a sub-flowchart of a multi-dimensional traffic data monitoring method according to a first embodiment of the present application;
FIG. 4 is a schematic block diagram of a multi-dimensional traffic data monitoring device according to a first embodiment of the present application;
FIG. 5 is a flowchart of a multi-dimensional traffic data monitoring method according to a second embodiment of the present application;
FIG. 6 is a schematic block diagram of a multi-dimensional traffic data monitoring device according to a second embodiment of the present application.
In the figures: 10, monitoring data acquisition module; 20, voice segmentation module; 30, first calculation module; 40, first judging module; 50, second calculation module; 60, second judging module; 70, semantic information recognition module; 80, third judging module.
Detailed Description
To make the above objects, features and advantages of the present application clearer, the embodiments are described in detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the application.
Scene overview
During calls between power grid call center seats and customers, the operators' service process usually has to be supervised and managed in order to ensure service quality and improve customer satisfaction. Existing monitoring methods, however, rely on quality inspectors sampling recordings afterwards for review or listening to entire calls, which requires a large investment of manpower, a complex monitoring system configuration and a high monitoring cost. Voice emotion recognition technology therefore offers a new approach to monitoring telephone traffic data.
Specifically, whether a call is abnormal can be judged by analyzing the seat's voice emotion features and comparing them with an emotion threshold. In practice, however, seat personnel are professionally trained and control their speech and emotions through professional discipline, so abnormal calls are difficult to find by analyzing seat voice emotion features alone; monitoring accuracy is low and customer appeals cannot be perceived in real time. In addition, customers are usually calm at the start of a call with the seat and become angry or irritated only when they cannot obtain a satisfactory answer.
Based on this, the application first judges from the prosodic features of the acquired customer voice data whether the call is suspected to be abnormal. When it is, the suspected abnormal call is screened a second time by combining the semantic information in the customer voice data with the seat voice data and the seat face data, which improves the accuracy and reliability of abnormal-call screening and achieves accurate monitoring of the seat personnel's service quality. When a call is found to be abnormal, exception handling can be carried out in time and customer appeals can be perceived and resolved in real time, which improves customer satisfaction and maintains the corporate image. The method has the advantages of low monitoring cost and high monitoring accuracy.
Example 1
Referring to FIG. 1, a multi-dimensional traffic data monitoring method includes the following steps:
S1, acquiring multidimensional monitoring data during a call between a call center seat and a customer, wherein the multidimensional monitoring data comprises customer voice data, seat voice data and seat face data.
Specifically, to ensure the operators' service quality, the voice data of the seat and the customer and the face data of the seat during the call are acquired in real time through two-channel recording equipment. Alternatively, a bidirectional data mirror port for the voice media stream can be set on the egress switch of the call center's core machine room: the seat voice data sent by the seat terminal and the client voice data sent by the client terminal use different port numbers as the source port number and the destination port number, so that the client voice data and the seat voice data can be told apart. There are many ways to acquire and identify the client voice data, the seat voice data and the seat face data, and no particular limitation is imposed here.
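The port-based identification described above can be sketched as follows. This is a hedged illustration only: the `MediaPacket` type, the function `classify_stream` and the port numbers in the usage line are hypothetical, and parsing real mirrored traffic is outside the scope of the sketch.

```python
from typing import NamedTuple


class MediaPacket(NamedTuple):
    """Hypothetical, already-parsed record of one mirrored media packet."""
    src_port: int
    dst_port: int
    payload: bytes


def classify_stream(packet: MediaPacket, seat_port: int, client_port: int) -> str:
    """Label a packet as seat or client voice by the port it was sent from."""
    if packet.src_port == seat_port and packet.dst_port == client_port:
        return "seat"
    if packet.src_port == client_port and packet.dst_port == seat_port:
        return "client"
    return "unknown"


# Assumed example ports; a real deployment would read them from call signaling.
print(classify_stream(MediaPacket(40000, 40002, b""), seat_port=40000, client_port=40002))
```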
S2, dividing the client voice data into a plurality of continuous non-overlapping client voice fragments according to the preset time length in time sequence.
When the length of the client voice data recorded in real time reaches the preset duration, the data is segmented, yielding a plurality of consecutive non-overlapping client voice segments. This makes it easier for the monitoring system to analyze the client voice segments and improves monitoring accuracy.
In this embodiment, the preset duration is set to 100 ms, and the client voice data is divided into several consecutive non-overlapping client voice segments in the time order of the call. Since each segment spans a very short time, the emotion within each short client voice segment is generally considered to be stable.
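As a rough illustration of the segmentation in step S2, the sketch below splits a buffer of audio samples into consecutive 100 ms segments. The function name `split_into_segments` and the 8 kHz sample rate are assumptions made for the example; the application does not state a sample rate.

```python
from typing import List, Sequence


def split_into_segments(
    samples: Sequence[float], sample_rate: int, segment_ms: int = 100
) -> List[Sequence[float]]:
    """Split samples into consecutive, non-overlapping segments of segment_ms.

    A trailing remainder shorter than one segment is dropped here; in a
    streaming setup it would simply wait for more audio to arrive.
    """
    seg_len = sample_rate * segment_ms // 1000
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")
    full = len(samples) // seg_len
    return [samples[k * seg_len:(k + 1) * seg_len] for k in range(full)]


# 8 kHz is an assumed telephony sample rate; 100 ms then gives 800 samples.
audio = [0.0] * 2000
segments = split_into_segments(audio, sample_rate=8000)
print(len(segments), len(segments[0]))
```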
S3, calculating a positive difference Z between the client prosody value of the first client voice segment and the client prosody value of each subsequent client voice segment.
As shown in FIG. 2, the specific steps of calculating the positive difference Z between the client prosody value of the first client voice segment and the client prosody value of each subsequent client voice segment include:
S31, filtering and denoising each client voice segment to obtain processed client voice segments;
S32, extracting the client prosody features from each processed client voice segment and quantifying them into client prosody values;
The client prosody features are extracted from each denoised client voice segment and quantified into client prosody values $Y_1, Y_2, \ldots, Y_i, \ldots$, where $Y_i$ represents the client prosody value of the i-th client voice segment. Each short client voice segment contains semantic information and prosody information. The prosody information is the voice feature information, carried by the vibration-related part of the voice signal, that describes changes in pitch, duration, speaking rate and stress; when a person's emotion changes, the pitch, rate and stress of their speech change abruptly.
S33, calculating the positive difference Z between the client prosody value of the first client voice segment and the client prosody value of each subsequent client voice segment.
The positive difference Z between the client prosody value of each client voice segment after the first and the client prosody value of the first segment is calculated according to the following formula:
$$Z_i = |Y_i - Y_1|$$
In the above, $Z_i$ represents the positive difference between the client prosody value of the i-th client voice segment and the client prosody value of the first client voice segment.
It should be noted that statistical analysis indicates that people usually speak in an even tone and are emotionally calm at the start of a voice call; this embodiment therefore uses the client prosody value of the first client voice segment as the baseline reference value.
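The positive difference against the first-segment baseline can be computed in a few lines. The sketch below assumes the prosody values have already been quantified as floats; the function name `positive_differences` and the sample numbers are hypothetical.

```python
from typing import List, Sequence


def positive_differences(prosody_values: Sequence[float]) -> List[float]:
    """Return |Y_i - Y_1| for every segment after the first."""
    if not prosody_values:
        return []
    baseline = prosody_values[0]  # Y_1, the calm start-of-call reference
    return [abs(y - baseline) for y in prosody_values[1:]]


print(positive_differences([5.0, 5.5, 3.0, 9.0]))  # [0.5, 2.0, 4.0]
```

Taking the absolute value means that a drop in the prosody value is flagged in the same way as a rise.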
S4, judging whether the positive difference Z is smaller than the preset prosody threshold Wt;
if the positive difference Z is smaller than the prosody threshold Wt, the call is regarded as normal and the method returns to step S2 to monitor the next client voice segment, thereby monitoring the call content in real time;
if the positive difference Z is greater than or equal to the prosody threshold Wt, the client voice segment is marked as suspected abnormal and the next step is executed to determine whether the abrupt change in the client's call prosody is caused by a change in the call environment or by a change in emotion. Identifying the real cause of the change improves monitoring accuracy, reduces the misjudgment rate, and enhances the credibility of seat personnel while improving customer satisfaction.
It should be noted that the prosody threshold for the client here is a critical value of average client prosody, obtained by extracting a certain amount of client voice data from the call center's client voice database as a sample set, extracting the prosody information of the sample set, and performing statistical analysis.
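The application does not say which statistic yields the threshold Wt. Purely as an assumption for illustration, the sketch below takes Wt as the mean of historical positive differences plus k population standard deviations, then flags segments with Z greater than or equal to Wt. The names `estimate_prosody_threshold` and `flag_suspected`, and the default k = 2, are hypothetical.

```python
import statistics
from typing import List, Sequence


def estimate_prosody_threshold(sample_differences: Sequence[float], k: float = 2.0) -> float:
    """Assumed rule (not from the application): Wt = mean + k * population std."""
    return statistics.mean(sample_differences) + k * statistics.pstdev(sample_differences)


def flag_suspected(z_values: Sequence[float], wt: float) -> List[int]:
    """Return the positions in z_values where Z >= Wt (suspected abnormality)."""
    return [i for i, z in enumerate(z_values) if z >= wt]


wt = estimate_prosody_threshold([1.0, 1.0, 1.0, 1.0])  # zero spread, so Wt equals the mean
print(wt, flag_suspected([0.5, 1.0, 2.0], wt))
```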
S5, extracting, from the seat voice data and the seat face data respectively, the seat voice emotion features and seat facial expression features of the time period corresponding to the client voice segment marked as suspected abnormal, and performing weighted fusion of them to obtain the seat emotion value Q.
As shown in FIG. 3, the specific steps of determining the seat emotion value Q from the seat voice data and the seat face data include:
S51, acquiring the time period T corresponding to the client voice segment marked as suspected abnormal;
S52, extracting, from the seat voice data and the seat face data respectively, the seat voice emotion features and the seat facial expression features in the time period T and the adjacent preceding and following time periods T±1;
Changes in human emotion generally have a close causal relationship with factors such as the conversational context and the other party's emotion, so the seat voice data in the two time periods adjacent to the suspected abnormal client voice segment need to be analyzed. However, seat personnel are professionally trained and can control their own speech very well, whereas a seat's facial expression tends to change when the customer's emotion changes, and facial expressions are difficult to control. The seat's voice emotion features and facial expression features therefore need to be fused to improve the accuracy of judging the emotions of the customer and the seat.
Based on this, in this embodiment, according to the time period T in which the client voice segment marked as suspected abnormal occurs, the seat voice segments in the two adjacent time periods T±1 before and after the time period T are extracted from the seat voice data, and the seat facial expression features in the time period T and the two adjacent time periods T±1 are extracted from the seat face data.
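The period selection in this embodiment can be sketched as index arithmetic over segments. The sketch follows the embodiment's description, with voice taken from the neighbouring periods T±1 and face from T and T±1. The function name `adjacent_periods` is hypothetical, and dropping indices that fall outside the call is an assumption about boundary handling that the application does not address.

```python
from typing import List, Tuple


def adjacent_periods(t: int, total_segments: int) -> Tuple[List[int], List[int]]:
    """Return (voice_periods, face_periods) as segment indices.

    Voice uses the neighbours T-1 and T+1; face uses T-1, T and T+1.
    Indices outside the call (before the start or past the end) are dropped.
    """
    neighbours = [i for i in (t - 1, t + 1) if 0 <= i < total_segments]
    face = sorted(neighbours + [t])
    return neighbours, face


print(adjacent_periods(3, 5))  # ([2, 4], [2, 3, 4])
print(adjacent_periods(0, 5))  # ([1], [0, 1])
```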
S53, performing weighted fusion of the seat voice emotion features and the seat facial expression features to obtain the seat emotion value Q.
Facial expression features, voice emotion features and semantic features are the main factors for judging human emotion. Facial expression features, which include changes in the eyebrows, mouth, eyes and other parts of the face, are the main basis for judging a person's emotional state; that is, the different emotion factors carry different weights in predicting human emotion.
In this embodiment, the seat emotion value Q is calculated with a pre-trained convolutional neural network model fused with a principal component analysis algorithm. The basic idea is to determine the weights of the seat voice emotion features and the seat facial expression features with the principal component analysis algorithm, then fuse the seat voice emotion features and the seat facial expression features into a feature vector through layer-by-layer convolution, and finally calculate the seat emotion value Q from the fused feature vector. The calculation formula of the seat emotion value Q is as follows:
$$Q_{ID} = \omega_e E_x(T \pm 1) + \omega_m M_x(T, T \pm 1)$$
In the above, $Q_{ID}$ represents the emotion value of the seat whose identifier is ID, $\omega_e$ represents the weight of the seat voice emotion features, $E_x(T \pm 1)$ represents the voice emotion feature parameter of the seat in the time periods T±1, $\omega_m$ represents the weight of the seat facial expression features, and $M_x(T, T \pm 1)$ represents the facial expression feature parameter of the seat in the time periods T and T±1.
It should be noted that the convolutional neural network structure fused with the principal component analysis algorithm is trained and optimized using collected facial emotion data and voice data of seat personnel, paired in a one-to-one mapping, as training samples. This yields a trained model capable of judging human emotion from multidimensional traffic data. The seat emotion threshold in the following step S6 is likewise a critical value obtained by analyzing utterances and facial expression features of seat personnel collected in advance under various emotional states.
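The weighted sum in the formula for Q can be illustrated with scalar scores. The sketch below is a deliberate simplification and not the model of this embodiment: the weights are passed in as fixed numbers rather than derived by principal component analysis, and the feature parameters E and M are taken as plain means of hypothetical per-period scores instead of outputs of a convolutional network.

```python
from typing import Sequence


def seat_emotion_value(
    voice_scores: Sequence[float],
    face_scores: Sequence[float],
    w_voice: float,
    w_face: float,
) -> float:
    """Q = w_voice * E + w_face * M, with E and M taken as per-period means.

    voice_scores: assumed scores for the periods T-1 and T+1.
    face_scores:  assumed scores for the periods T-1, T and T+1.
    """
    e = sum(voice_scores) / len(voice_scores)
    m = sum(face_scores) / len(face_scores)
    return w_voice * e + w_face * m


q = seat_emotion_value([0.5, 0.5], [1.0, 0.5, 0.0], w_voice=0.5, w_face=0.5)
print(q)  # 0.5
```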
S6, judging whether the seat emotion value Q is smaller than the preset seat emotion threshold Xt; if the seat emotion value Q is smaller than the seat emotion threshold Xt, the call is regarded as normal and the method returns to step S2 to monitor the next client voice segment;
if the seat emotion value Q is greater than or equal to the seat emotion threshold Xt, the call is regarded as abnormal and marked as abnormal.
S7, performing exception handling on the call marked as abnormal according to the exception handling rule.
When an abnormal call is detected, the monitoring system marks an abnormality level according to the exception handling rule, that is, according to the seat emotion value Q, and triggers the handling mechanism corresponding to that level.
In this embodiment, when the seat emotion value Q exceeds the seat emotion threshold Xt by more than one time the threshold, the call is marked as a first-level abnormality and the first-level handling mechanism is triggered: a team leader or quality inspector takes over the call, and the team leader or another seat continues communicating with the customer, so that the problem is solved in time and the company's image is maintained. When the seat emotion value Q is larger than the seat emotion threshold Xt but exceeds it by no more than one time the threshold, the call is marked as a second-level abnormality and the second-level handling mechanism is triggered: early warning information is sent to the terminal equipment used by the seat personnel, prompting them to control their emotions so that the customer's emotion is eased at the first opportunity and the customer's problem is solved.
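The grading rule can be sketched as a small function. Two readings are assumptions made for this example rather than statements of the application: exceeding the threshold by more than one time is read as Q > 2·Xt, and Q equal to Xt is graded as second level because step S6 already treats Q ≥ Xt as abnormal. The name `grade_abnormality` and the level labels are hypothetical.

```python
def grade_abnormality(q: float, xt: float) -> str:
    """Map a seat emotion value Q to a handling level.

    Assumed reading: first level when Q > 2 * Xt, second level when
    Xt <= Q <= 2 * Xt, normal otherwise.
    """
    if q < xt:
        return "normal"
    if q > 2 * xt:
        return "level_1"  # team leader or quality inspector takes over the call
    return "level_2"      # early warning sent to the seat's terminal


print(grade_abnormality(2.5, 1.0), grade_abnormality(1.5, 1.0), grade_abnormality(0.5, 1.0))
```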
Referring to fig. 4, the present embodiment further provides an apparatus for implementing the multi-dimensional traffic data monitoring method, including:
the monitoring data acquisition module 10 is used for acquiring multidimensional monitoring data in the process of communicating between a call center seat personnel and a customer, wherein the multidimensional monitoring data comprises customer voice data, seat voice data and seat face data;
a voice segmentation module 20 for dividing the client voice data, in time order, into a plurality of consecutive non-overlapping client voice segments of a preset duration of 100 ms; because each segment spans a very short time, the customer's emotion within each segment is generally considered stable;
the first calculation module 30 is configured to calculate a positive difference value Z between the prosody value of the client in the first client voice segment and the prosody value of the client in each subsequent client voice segment; statistical analysis indicates that people are usually calm at the initial stage of a voice call, so the prosody value of the client in the first client voice segment is used as the baseline reference value;
the first judging module 40 is configured to judge whether the positive difference value Z is smaller than a preset prosody threshold Wt; if the positive difference Z is smaller than the prosody threshold Wt, the call is considered normal and the process returns to step S2 to monitor the next client voice segment, thereby realizing real-time monitoring of call content; if the positive difference Z is greater than or equal to the prosody threshold Wt, the client voice segment is marked as suspected abnormality and the next step is executed to judge whether the abrupt change in the client's call prosody is caused by a change in the call environment or by a change in emotion; identifying the real cause of the abrupt change improves monitoring accuracy, reduces the misjudgment rate, and makes the monitoring results more credible to seat personnel while improving customer satisfaction;
the second calculation module 50 is configured to extract, from the seat voice data and the seat face data, a seat voice emotion feature and a seat facial expression feature of a time period corresponding to a customer voice segment marked as suspected abnormality, and perform weighted fusion on the seat voice emotion feature and the seat facial expression feature to obtain a seat emotion value Q;
the second judging module 60 is configured to judge whether the seat emotion value Q is smaller than a preset seat emotion threshold value Xt, and if the seat emotion value Q is smaller than the seat emotion threshold value Xt, consider that the call is normal and return to step S2 to monitor the next customer voice segment; if the seat emotion value Q is more than or equal to the seat emotion threshold value Xt, the call is considered as abnormal and marked as abnormal;
the exception handling module 70 is configured to perform exception handling on a call marked as exception according to an exception handling rule, where the exception handling rule refers to marking an exception level according to a seat emotion value Q, and different exception levels have different handling mechanisms.
In this embodiment, the seat ID of a suspected abnormal call and the time period T in which the suspected abnormality occurs are first screened out according to an abrupt change in the customer's voice prosody. The seat voice emotion features and seat facial expression features are then extracted from the voice data and face data of the seat personnel in the two time periods T±1 immediately before and after the time period T and are weighted and fused, and the seat's emotion is judged against the preset seat emotion threshold. The suspected abnormal call is thus screened a second time according to the seat's emotion, which improves the accuracy and reliability of abnormal-call screening, achieves accurate monitoring of the seat personnel's service quality, and greatly reduces the monitoring cost.
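A minimal sketch of the window selection and weighted fusion follows. Several parts are assumptions rather than details taken from the description: the per-segment inputs `voice_scores` and `face_scores` are hypothetical emotion scores already produced by some feature extractor, the window includes T itself together with T-1 and T+1 (as in claim 3), scores are averaged over the window, and the equal default weights are placeholders.

```python
from statistics import fmean
from typing import List, Sequence


def adjacent_window(t: int, n_segments: int) -> List[int]:
    """Indices of segment T and its neighbours T-1 and T+1, clipped to the call."""
    return [i for i in (t - 1, t, t + 1) if 0 <= i < n_segments]


def seat_emotion_value(
    voice_scores: Sequence[float],
    face_scores: Sequence[float],
    t: int,
    w_voice: float = 0.5,  # assumed weight, not given in the text
    w_face: float = 0.5,   # assumed weight, not given in the text
) -> float:
    """Weighted fusion of seat voice and facial scores around segment T."""
    idx = adjacent_window(t, len(voice_scores))
    voice = fmean(voice_scores[i] for i in idx)  # averaging is an assumption
    face = fmean(face_scores[i] for i in idx)
    return w_voice * voice + w_face * face
```

The resulting value plays the role of Q and would then be compared with the seat emotion threshold Xt.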
In addition, in combination with the multi-dimensional traffic data monitoring method in the above embodiment, the embodiment of the application can be implemented by providing a computer readable storage medium. The computer readable storage medium has stored thereon computer program instructions which, when executed by a processor, implement the steps of any of the multi-dimensional traffic data monitoring methods of the above embodiments.
Example two
The implementation provided in this embodiment builds on the first embodiment. Where it solves the same technical problems through the same method steps, device, and storage medium and achieves the same beneficial effects, the same parts may be referred to each other and are not described again in detail here.
Referring to figs. 5-6, which show a specific implementation of the second embodiment of the present application: in this embodiment, semantic information is extracted from the customer voice data and the customer's real-time appeal is identified from it; an answer is retrieved from a question library according to that appeal to generate an answering script, which is sent to the terminal equipment used by the seat personnel to guide them in answering the customer; and a corresponding processing mechanism is triggered according to the emotion value of the seat personnel, thereby ensuring traffic service quality, improving customer satisfaction, and maintaining the enterprise's image.
Referring to fig. 5, a multi-dimensional traffic data monitoring method includes the following steps:
S1, acquiring multidimensional monitoring data in the process of communicating between a call center seat person and a customer, wherein the multidimensional monitoring data comprises customer voice data, seat voice data and seat face data.
S2, dividing the client voice data into a plurality of continuous non-overlapping client voice fragments according to the preset time length in time sequence.
S3, calculating a positive difference value Z between the prosody value of the client in the first section of client voice segment and the prosody value of the client in the subsequent section of client voice segment.
S4, judging whether the positive difference Z is smaller than a preset prosody threshold Wt;
if the positive difference Z is less than the prosody threshold Wt, the conversation is considered normal, and the step S2 is returned to monitor the voice fragment of the next customer, so that the real-time monitoring of the conversation content is realized;
if the positive difference Z is greater than or equal to the prosody threshold Wt, the client voice segment is marked as suspected abnormality and the next step is executed to judge whether the abrupt change in the client's call prosody is caused by a change in the call environment or by a change in emotion; identifying the real cause of the abrupt change improves monitoring accuracy, reduces the misjudgment rate, and makes the monitoring results more credible to seat personnel while improving customer satisfaction.
It should be noted that, the prosody threshold related to the client herein is a critical value of average client prosody obtained by extracting a certain amount of client voice data from the call center client voice database as a sample set, extracting prosody information of the sample set, and performing statistical analysis.
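Steps S2 to S4 can be sketched as follows. This is only an illustration under stated assumptions: the voice data is taken to be a flat sequence of samples, `prosody_value` is a hypothetical stand-in (mean absolute amplitude) for whatever prosodic quantification is actually used, the "positive difference" Z is read as an absolute difference, and the trailing partial segment is dropped.

```python
from typing import List, Sequence


def split_segments(samples: Sequence[float], sample_rate: int,
                   segment_ms: int = 100) -> List[Sequence[float]]:
    """S2: cut the samples into consecutive, non-overlapping segments."""
    size = sample_rate * segment_ms // 1000
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def prosody_value(segment: Sequence[float]) -> float:
    """Hypothetical stand-in for a prosody value: mean absolute amplitude."""
    return sum(abs(x) for x in segment) / len(segment)


def suspected_segments(samples: Sequence[float], sample_rate: int,
                       wt: float, segment_ms: int = 100) -> List[int]:
    """S3-S4: indices of segments whose difference Z from the first segment is >= Wt."""
    segments = split_segments(samples, sample_rate, segment_ms)
    if not segments:
        return []
    baseline = prosody_value(segments[0])  # first segment is the reference
    flagged = []
    for i, segment in enumerate(segments[1:], start=1):
        z = abs(prosody_value(segment) - baseline)  # assumption: absolute difference
        if z >= wt:
            flagged.append(i)  # suspected abnormality, passed on to S5
    return flagged
```

Only the flagged indices would go on to the seat-side check, which is what keeps the second stage from running on every segment.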
S5, extracting seat voice emotion characteristics and seat facial expression characteristics of a time period corresponding to the customer voice fragment marked as suspected abnormality from the seat voice data and the seat facial data respectively, and carrying out weighted fusion on the seat voice emotion characteristics and the seat facial expression characteristics to obtain a seat emotion value Q;
S6, judging whether the seat emotion value Q is smaller than a preset seat emotion threshold value Xt, if the seat emotion value Q is smaller than the seat emotion threshold value Xt, judging that the call is normal, and returning to the step S2 to monitor the next customer voice segment;
if the seat emotion value Q is more than or equal to the seat emotion threshold value Xt, the call is considered as abnormal and marked as abnormal.
S8, extracting customer semantic information from the customer voice data of the call marked as abnormal and identifying the customer appeal.
Semantic information allows the customer's emotion and appeal to be judged quickly and accurately. Therefore, to improve the reliability of traffic data monitoring, customer semantic information is extracted from the customer voice data of the call marked as abnormal, and the customer appeal is identified from that semantic information.
S9, extracting an answering script from the question library according to the customer appeal and sending the answering script to terminal equipment used by seat personnel.
Answers are retrieved from the enterprise question library according to the customer appeal to generate a corresponding answering script, which is sent to the terminal equipment used by the seat personnel to prompt them to answer accordingly. Customer satisfaction with the answering scripts is statistically analyzed at a preset period to help the enterprise optimize and update the answering scripts in the question library, which effectively improves customer retention and maintains the enterprise's image.
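Steps S8 and S9 can be illustrated with a small sketch. Everything concrete in it is invented for illustration: it assumes a text transcript of the customer's speech is already available, it uses simple keyword matching in place of real semantic analysis, and the entries of `QUESTION_LIBRARY` and `KEYWORDS` are hypothetical examples rather than content from an actual enterprise question library.

```python
from typing import Optional

# Hypothetical question library: customer appeal -> answering script.
QUESTION_LIBRARY = {
    "billing": "Apologize, confirm the account number, and explain the bill item by item.",
    "outage": "Apologize, confirm the address, and give the estimated restoration time.",
}

# Hypothetical keyword lists used to recognize each appeal.
KEYWORDS = {
    "billing": ("bill", "charge"),
    "outage": ("outage", "power cut", "blackout"),
}


def identify_appeal(transcript: str) -> Optional[str]:
    """S8 (simplified): pick the first appeal whose keywords occur in the transcript."""
    text = transcript.lower()
    for appeal, words in KEYWORDS.items():
        if any(word in text for word in words):
            return appeal
    return None


def answering_script(transcript: str) -> Optional[str]:
    """S9 (simplified): look up the script for the recognized appeal, if any."""
    appeal = identify_appeal(transcript)
    return QUESTION_LIBRARY.get(appeal) if appeal else None
```

When no appeal is recognized the function returns `None`, in which case nothing would be pushed to the seat's terminal equipment.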
S7, carrying out exception handling on the call marked as exception according to the exception handling rule.
When an abnormal call is detected, the monitoring system marks the abnormality level according to the exception handling rule, that is, according to the seat emotion value Q, and triggers the corresponding processing mechanism for that level.
In this embodiment, when the seat emotion value Q exceeds the seat emotion threshold Xt by more than one time, the call is marked as a first-level abnormality and a first-level processing mechanism is triggered: a lead or quality inspector intercepts the call, and the lead or another senior seat continues communicating with the customer, so that the problem is solved in time and the company's image is maintained. When the seat emotion value Q is larger than the seat emotion threshold Xt but exceeds it by no more than one time, the call is marked as a second-level abnormality and a second-level processing mechanism is triggered: early warning information is sent to the terminal equipment used by the seat personnel, prompting them to control their emotions, and at the same time the answering script corresponding to the customer appeal is sent to that terminal equipment to help the seat serve the customer better, so as to calm the customer as soon as possible and solve the customer's problem.
Referring to fig. 6, the present embodiment further provides an apparatus for implementing the multi-dimensional traffic data monitoring method, including:
the monitoring data acquisition module 10 is used for acquiring multidimensional monitoring data in the process of communicating between a call center seat personnel and a customer, wherein the multidimensional monitoring data comprises customer voice data, seat voice data and seat face data;
a voice segmentation module 20 for dividing the client voice data, in time order, into a plurality of consecutive non-overlapping client voice segments of a preset duration of 100 ms; because each segment spans a very short time, the customer's emotion within each segment is generally considered stable;
the first calculation module 30 is configured to calculate a positive difference value Z between the prosody value of the client in the first client voice segment and the prosody value of the client in each subsequent client voice segment; statistical analysis indicates that people are usually calm at the initial stage of a voice call, so the prosody value of the client in the first client voice segment is used as the baseline reference value;
the first judging module 40 is configured to judge whether the positive difference value Z is smaller than a preset prosody threshold Wt; if the positive difference Z is smaller than the prosody threshold Wt, the call is considered normal and the process returns to step S2 to monitor the next client voice segment, thereby realizing real-time monitoring of call content; if the positive difference Z is greater than or equal to the prosody threshold Wt, the client voice segment is marked as suspected abnormality and the next step is executed to judge whether the abrupt change in the client's call prosody is caused by a change in the call environment or by a change in emotion; identifying the real cause of the abrupt change improves monitoring accuracy, reduces the misjudgment rate, and makes the monitoring results more credible to seat personnel while improving customer satisfaction;
the second calculation module 50 is configured to extract, from the seat voice data and the seat face data, a seat voice emotion feature and a seat facial expression feature of a time period corresponding to a customer voice segment marked as suspected abnormality, and perform weighted fusion on the seat voice emotion feature and the seat facial expression feature to obtain a seat emotion value Q;
the second judging module 60 is configured to judge whether the seat emotion value Q is smaller than a preset seat emotion threshold value Xt, and if the seat emotion value Q is smaller than the seat emotion threshold value Xt, consider that the call is normal and return to step S2 to monitor the next customer voice segment; if the seat emotion value Q is more than or equal to the seat emotion threshold value Xt, the call is considered as abnormal and marked as abnormal;
the semantic information recognition module 80 is configured to extract customer semantic information from the customer voice data of a call marked as abnormal, and to recognize the customer's real-time appeal from the semantic information;
the answering-script generation module 90 is configured to retrieve an answer from the enterprise's question library according to the customer's real-time appeal to generate an answering script, and to send the answering script to the terminal equipment used by the seat personnel;
the exception handling module 70 is configured to perform exception handling on a call marked as exception according to an exception handling rule, where the exception handling rule refers to marking an exception level according to a seat emotion value Q, and different exception levels have different handling mechanisms.
In this embodiment, the seat ID of a suspected abnormal call and the time period in which the suspected abnormality occurs are first screened out according to an abrupt change in the customer's voice prosody. Voice information is then extracted from the suspected abnormal customer voice segment to identify whether it contains sensitive words, and the customer's emotion is judged accordingly. Next, based on the causal relationship between changes in the customer's emotion and the call context, the seat voice emotion features and seat facial expression features are extracted from the voice data and face data of the seat personnel in the two time periods immediately before and after the suspected abnormal time period and are weighted and fused, and the seat's emotion is judged against the preset seat emotion threshold. Because emotion and semantics are analyzed from both the customer's and the seat's perspective, the accuracy of traffic data monitoring is improved, customer appeals are perceived in real time, and the corresponding processing mechanism is triggered immediately to solve the customer's problem and improve customer satisfaction, which gives the scheme high social value and good application prospects.
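The overall control flow of the second embodiment can be sketched as a single loop. This is a hedged outline, not the patented implementation: `monitor_call` and its parameters are hypothetical names, the prosody values are assumed to be precomputed per segment, the seat emotion computation and the script lookup are passed in as callables, and the "positive difference" is again read as an absolute difference.

```python
from typing import Callable, Dict, List, Optional, Sequence


def monitor_call(
    prosody_values: Sequence[float],
    seat_emotion: Callable[[int], float],
    find_script: Callable[[int], Optional[str]],
    wt: float,
    xt: float,
) -> List[Dict[str, object]]:
    """Run the two-stage screening over one call and collect abnormal segments."""
    events: List[Dict[str, object]] = []
    if not prosody_values:
        return events
    baseline = prosody_values[0]          # first segment is the reference
    for i, y in enumerate(prosody_values[1:], start=1):
        if abs(y - baseline) < wt:        # S4: prosody stable, call normal
            continue
        q = seat_emotion(i)               # S5: computed only for suspected segments
        if q < xt:                        # S6: seat is calm, treat as a false alarm
            continue
        # S8-S9: attach the answering script; the event is then handled in S7
        events.append({"segment": i, "Q": q, "script": find_script(i)})
    return events
```

Passing the two later stages in as callables keeps the sketch independent of any particular emotion model or question library, and makes it visible that the seat-side analysis runs only on segments that the customer-side prosody check has already flagged.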
Corresponding to the multi-dimensional traffic data monitoring method provided in the foregoing embodiment, the present embodiment further provides a computer readable storage medium, and since the storage medium provided in the present embodiment corresponds to the multi-dimensional traffic data monitoring method provided in the foregoing embodiment, implementation of the multi-dimensional traffic data monitoring method described in the foregoing embodiment is also applicable to the storage medium provided in the present embodiment, and will not be described in detail in the present embodiment.
According to the present application, suspected abnormal calls are first screened out according to changes in the customer's voice prosody; the seat's emotion is then judged from the seat voice emotion features and facial expression features, and the suspected abnormal calls are screened a second time according to the seat's emotion. Monitoring from both the seat's and the customer's perspective improves the accuracy and reliability of traffic monitoring at a low monitoring cost.
In this specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be referred to each other. Because each of the above embodiments is substantially similar to the method embodiment, its description is relatively brief; for relevant points, refer to the description of the method embodiment.
The present application has been described in detail above, and specific examples have been used to explain its principles and implementations; the description of the foregoing embodiments is intended only to help in understanding the method of the application and its core concepts. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific implementations and the scope of application. In view of the above, this description should not be construed as limiting the present application.

Claims (9)

1. A method for monitoring multi-dimensional traffic data, comprising the steps of:
S1, acquiring multidimensional monitoring data in the process of communicating a call center seat person with a customer, wherein the multidimensional monitoring data comprises customer voice data, seat voice data and seat face data;
S2, dividing the client voice data into a plurality of continuous non-overlapping client voice fragments according to a preset time length in a time sequence;
S3, calculating a positive difference Z between the prosody value of the client in the first section of client voice segment and the prosody value of the client in the subsequent client voice segment;
S4, judging whether the positive difference Z is smaller than a preset prosody threshold Wt;
if the positive difference Z is less than the prosody threshold Wt, the conversation is considered normal, and the step S2 is returned to monitor the next customer voice segment;
if the positive difference Z is more than or equal to the prosody threshold Wt, marking the client voice fragment as suspected abnormality and executing the next step;
S5, extracting seat voice emotion characteristics and seat facial expression characteristics of a time period corresponding to the customer voice fragment marked as suspected abnormality from the seat voice data and the seat facial data respectively, and carrying out weighted fusion on the seat voice emotion characteristics and the seat facial expression characteristics to obtain a seat emotion value Q;
S6, judging whether the seat emotion value Q is smaller than a preset seat emotion threshold value Xt, if the seat emotion value Q is smaller than the seat emotion threshold value Xt, judging that the call is normal, and returning to the step S2 to monitor the next customer voice segment;
if the seat emotion value Q is more than or equal to the seat emotion threshold value Xt, the call is considered as abnormal and marked as abnormal;
S7, carrying out exception handling on the call marked as exception according to the exception handling rule.
2. The method for monitoring multi-dimensional traffic data according to claim 1, wherein in step S3, a positive difference Z between the prosody value of the client in the first segment of the client speech and the prosody value of the client in the subsequent segment of the client speech is calculated, specifically comprising:
S31, filtering and denoising each section of client voice segment to obtain a processed client voice segment;
S32, extracting the prosody features from each processed client voice segment and quantifying the prosody features into client prosody values Y_1, Y_2, ..., Y_i, ...;
S33, calculating a positive difference value Z between the prosody value of the client in the first section of client voice segment and the prosody value of the client in the subsequent section of client voice.
3. The method for monitoring multi-dimensional traffic data according to claim 1, wherein in step S5, a seat voice emotion feature and a seat facial expression feature of a time period corresponding to a customer voice segment marked as suspected abnormality are extracted from seat voice data and seat facial data, respectively, and the seat voice emotion feature and the seat facial expression feature are weighted and fused to obtain a seat emotion value Q, and specifically comprising:
S51, acquiring a time period T corresponding to a customer voice fragment marked as suspected abnormality;
S52, extracting the seat voice emotion characteristics and the seat facial expression characteristics in the time period T and the adjacent time periods T±1 before and after it from the seat voice data and the seat facial data respectively;
S53, carrying out weighted fusion on the seat voice emotion characteristics and the seat facial expression characteristics to obtain a seat emotion value Q.
4. The multi-dimensional traffic data monitoring method according to claim 1, further comprising, prior to step S7:
S8, extracting customer semantic information from customer voice data marked as abnormal call and identifying customer appeal;
S9, extracting an answering script from the question library according to the customer appeal and sending the answering script to terminal equipment used by seat personnel.
5. The method for monitoring multi-dimensional traffic data according to claim 1, wherein in step S6, the seat emotion threshold Xt is a critical value obtained by taking the voice data and facial emotion data of seat personnel, recorded in two channels in various emotional states, as a sample set and inputting the sample set into a multi-dimensional emotion judgment model for training and learning.
6. The multi-dimensional traffic data monitoring method according to claim 1, wherein in step S7, the anomaly handling rule refers to marking anomaly levels according to a seat emotion value Q, and different anomaly levels have different handling mechanisms.
7. An apparatus for implementing the multi-dimensional traffic data monitoring method of any of claims 1-6, comprising:
the monitoring data acquisition module (10) is used for acquiring multidimensional monitoring data in the process of communicating between a call center seat person and a client, wherein the multidimensional monitoring data comprises client voice data, seat voice data and seat face data;
the voice segmentation module (20) is used for sequentially segmenting the client voice data into a plurality of continuous non-overlapping client voice fragments according to preset duration;
a first calculation module (30), wherein the first calculation module (30) is used for calculating a positive difference value Z between the prosody value of the client in the first section of client voice segment and the prosody value of the client in the subsequent section of client voice;
the first judging module (40), the said first judging module (40) is used for judging whether the positive difference Z is smaller than the prosody threshold Wt that presets;
the second computing module (50) is used for respectively extracting seat voice emotion characteristics and seat facial expression characteristics of a time period corresponding to the customer voice fragments marked as suspected abnormalities from the seat voice data and the seat facial data, and carrying out weighted fusion on the seat voice emotion characteristics and the seat facial expression characteristics to obtain a seat emotion value Q;
the second judging module (60), the second judging module (60) is used for judging whether the seat emotion value Q is smaller than a preset seat emotion threshold value Xt;
and the exception handling module (70) is used for carrying out exception handling on calls marked as exception according to the exception handling rules.
8. The multi-dimensional traffic data monitoring device of claim 7, further comprising:
the semantic information recognition module (80) is used for extracting client semantic information from client voice data marked as abnormal calls and recognizing client appeal;
and the answering-script generation module (90) is used for extracting an answering script from the question library according to the customer appeal and sending the answering script to terminal equipment used by seat personnel.
9. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the steps of the multi-dimensional traffic data monitoring method according to any of claims 1 to 6.
CN202310990355.9A 2023-08-07 2023-08-07 Multi-dimensional telephone traffic data monitoring method, device and storage medium Active CN116886819B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310990355.9A CN116886819B (en) 2023-08-07 2023-08-07 Multi-dimensional telephone traffic data monitoring method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310990355.9A CN116886819B (en) 2023-08-07 2023-08-07 Multi-dimensional telephone traffic data monitoring method, device and storage medium

Publications (2)

Publication Number Publication Date
CN116886819A true CN116886819A (en) 2023-10-13
CN116886819B CN116886819B (en) 2024-02-02

Family

ID=88260482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310990355.9A Active CN116886819B (en) 2023-08-07 2023-08-07 Multi-dimensional telephone traffic data monitoring method, device and storage medium

Country Status (1)

Country Link
CN (1) CN116886819B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107204195A (en) * 2017-05-19 2017-09-26 四川新网银行股份有限公司 A kind of intelligent quality detecting method analyzed based on mood
CN107293309A (en) * 2017-05-19 2017-10-24 四川新网银行股份有限公司 A kind of method that lifting public sentiment monitoring efficiency is analyzed based on customer anger
CN109587360A (en) * 2018-11-12 2019-04-05 平安科技(深圳)有限公司 Electronic device should talk with art recommended method and computer readable storage medium
WO2021051504A1 (en) * 2019-09-18 2021-03-25 平安科技(深圳)有限公司 Method for identifying abnormal call party, device, computer apparatus, and storage medium
CN115101053A (en) * 2022-06-23 2022-09-23 平安银行股份有限公司 Emotion recognition-based conversation processing method and device, terminal and storage medium
CN116189713A (en) * 2021-11-29 2023-05-30 上海畅跃信息技术有限公司 Outbound management method and device based on voice recognition

Also Published As

Publication number Publication date
CN116886819B (en) 2024-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant