Disclosure of Invention
The embodiment of the invention aims to provide a voice intelligent customer service character quality inspection method, which aims to solve the problem that the service evaluation of the existing intelligent system on customer service is not completely reliable.
The embodiment of the invention is realized in such a way that the voice intelligent customer service character quality inspection method comprises the following steps:
acquiring voice interaction records and customer service information;
calling corresponding customer service voiceprint information from a preset voiceprint database according to the customer service information;
extracting corresponding customer service voice information from the voice interaction record according to the customer service voiceprint information;
carrying out voice recognition on the customer service voice information to obtain a voice recognition result, wherein the voice recognition result at least comprises a meaning recognition result and a tone recognition result;
and evaluating the customer service according to the voice recognition result to generate an evaluation result.
Preferably, the step of performing speech recognition on the customer service speech information to obtain a speech recognition result specifically includes:
carrying out voice recognition on the customer service voice information to obtain a meaning recognition result, wherein the meaning recognition result at least comprises a character recognition time axis;
analyzing the customer service voice information, and performing stress identification on the meaning recognition result according to the analysis result and the character recognition time axis;
and performing tone recognition according to the character recognition result after the accent identification to obtain a tone recognition result.
Preferably, the step of analyzing the customer service voice information and performing accent marking on the meaning recognition result according to the analysis result and the character recognition time axis specifically includes:
sentence division is carried out on characters contained in the meaning identification result to obtain a sentence division identification result;
performing voice interception on the customer service voice information corresponding to each character in the sentence recognition result according to a character recognition time axis to obtain single character voice information;
counting the time length and loudness of the single character voice information corresponding to each character to obtain a statistical result;
and performing stress identification on the meaning identification result according to the statistical result.
Preferably, the method further comprises the step of performing interrupt recognition on the customer service voice information, and the step specifically comprises the following steps:
extracting client voice information from the voice interaction record, and performing voice recognition to obtain a client voice recognition result;
and counting the coincidence rate of the voice recognition result and the client voice recognition result, and generating an interrupt analysis report.
Preferably, after the step of obtaining the voice interaction record, the method further includes performing noise reduction processing on the voice interaction record.
Preferably, after the step of obtaining the voice interaction record, the method further includes identifying a language type used in the voice interaction record.
Preferably, when the corresponding customer service voice information cannot be extracted from the voice interaction record according to the customer service voiceprint information, the quality inspection is carried out on the voice interaction record manually.
Another objective of an embodiment of the present invention is to provide a voice intelligent customer service text quality inspection system, which includes:
the data acquisition module is used for acquiring voice interaction records and customer service information;
the voiceprint calling module is used for calling corresponding customer service voiceprint information from a preset voiceprint database according to the customer service information;
the voice extraction module is used for extracting corresponding customer service voice information from the voice interaction record according to the customer service voiceprint information;
the voice recognition module is used for carrying out voice recognition on the customer service voice information to obtain a voice recognition result, and the voice recognition result at least comprises a meaning recognition result and a tone recognition result;
and the quality inspection evaluation module is used for evaluating the customer service according to the voice recognition result to generate an evaluation result.
Preferably, the speech recognition module includes:
the semantic recognition unit is used for carrying out voice recognition on the customer service voice information to obtain a meaning recognition result, and the meaning recognition result at least comprises a character recognition time axis;
the stress identification unit is used for analyzing the customer service voice information and performing stress identification on the meaning identification result according to the analysis result and the character identification time axis;
and the tone recognition unit is used for carrying out tone recognition according to the character recognition result after the accent identification to obtain a tone recognition result.
Preferably, the accent identification unit includes:
a sentence dividing subunit, configured to divide the characters included in the meaning identification result into sentences to obtain a sentence division identification result;
the voice intercepting subunit is used for carrying out voice intercepting on the customer service voice information corresponding to each character in the sentence identification result according to the character identification time axis to obtain single character voice information;
the voice statistics subunit is used for counting the time length and loudness of the single word voice information corresponding to each character to obtain a statistical result;
and the identification subunit is used for performing stress identification on the meaning identification result according to the statistical result.
The invention has the beneficial effects that:
1. according to the voice intelligent customer service character quality inspection method, the voice of the customer service in the voice interaction record is independently extracted according to the voiceprint, semantic analysis and tone analysis are carried out on the voice, and the tone of the speech of the customer service is judged by using the distribution of loudness and stress, so that the voice service provided by the customer service is comprehensively analyzed, the behavior that the customer service avoids quality inspection by changing the tone is avoided, and the accuracy of service quality evaluation is improved;
2. corresponding customer service voice information is extracted from the voice interaction record according to the customer service voiceprint information; the customer service voiceprint information is compared with the voice in the voice interaction record to judge whether the corresponding customer service voiceprint exists in the current voice interaction record; if it does not exist, the interaction between the customer service and the customer is determined to be abnormal, and quality inspection is performed manually to further judge whether another customer service took over the interaction; otherwise, the voice uttered by the customer service is extracted from the voice interaction record according to the customer service voiceprint information to obtain the customer service voice information;
3. voice recognition is performed on the customer service voice information to obtain a meaning recognition result; after semantic recognition, the information contained in the customer service voice information can be displayed as text, but the tone contained in the customer service voice information is lost; the time of each recognized character is therefore recorded during semantic recognition to obtain a character recognition time axis, which records the time in the customer service voice information corresponding to each character;
4. the customer service voice information is read, and the times in the customer service voice information corresponding to the beginning and the end of each character are determined according to the character recognition time axis, so that the customer service voice between those times is intercepted; each intercepted segment corresponds to one character, and single character voice information is finally obtained.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that the terms "first," "second," and the like, as used herein, may describe various elements, but these elements are not limited by these terms unless otherwise specified. These terms are only used to distinguish one element from another. For example, a first xx script may be referred to as a second xx script, and similarly, a second xx script may be referred to as a first xx script, without departing from the scope of the present application.
Manual inspection of customer service is cumbersome and time-consuming, and it cannot check every service provided by human customer service one by one. Some enterprises therefore use an intelligent system to inspect the services one by one. Such a system judges mainly and directly from the speaking tone of the customer service and the semantics of what the customer says, so it cannot identify the case in which the customer service uses a calm tone to make insinuating remarks toward the customer; that is, the service evaluation of the existing intelligent system on customer service is not completely reliable.
In the invention, the voice of the customer service in the voice interaction record is independently extracted according to the voiceprint, semantic analysis and tone analysis are carried out on the voice, and the tone of the speech of the customer service is judged by utilizing the distribution of loudness and stress, so that the voice service provided by the customer service is comprehensively analyzed, the behavior that the customer service avoids quality inspection by changing the tone is avoided, and the accuracy of service quality evaluation is improved.
As shown in fig. 1, which is a flowchart of a voice intelligent customer service text quality inspection method provided by the embodiment of the present invention, the method includes:
S100, acquiring voice interaction records and customer service information.
In this step, the voice interaction record and the customer service information are acquired. In the system, each customer service has its own information, such as name, gender and age. Most importantly, the customer service information needs to contain voice information of the customer service; the voiceprint information of the customer service is obtained by extracting features from this voice information, and the voiceprint information makes it possible to judge which voices in the voice interaction record were uttered by the customer service. The voice interaction record is recorded while the customer service interacts with the customer; for example, the whole conversation is recorded while the customer service communicates with the customer, which yields the voice interaction record. When quality inspection is needed, this information is called from the background. After this step, the method further includes performing noise reduction processing on the voice interaction record and identifying the language type used in the voice interaction record.
S200, calling corresponding customer service voiceprint information from a preset voiceprint database according to the customer service information.
In this step, a preset voiceprint database is queried according to the customer service information, which includes information such as the name of the customer service; the voiceprint information of the customer service is then called from the voiceprint database according to this information. After this step, the voiceprint information can be used to process the voice interaction record.
S300, extracting corresponding customer service voice information from the voice interaction record according to the customer service voiceprint information.
In this step, corresponding customer service voice information is extracted from the voice interaction record according to the customer service voiceprint information. The customer service voiceprint information is compared with the voice in the voice interaction record to judge whether the corresponding customer service voiceprint exists in the current voice interaction record. When it does not exist, the interaction between the customer service and the customer is determined to be abnormal, and quality inspection is performed manually to further judge whether another customer service took over the interaction. Otherwise, the voice uttered by the customer service is separately extracted from the voice interaction record according to the customer service voiceprint information, yielding the customer service voice information.
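The extraction and the manual-inspection fallback described in this step can be illustrated with a minimal sketch. It assumes that the voice interaction record has already been cut into segments and that each segment has been reduced to a fixed-length feature vector; the cosine-similarity comparison, the 0.8 threshold, and the names `cosine_similarity` and `extract_agent_segments` are hypothetical and are not specified by the embodiment.

```python
import math
from typing import List, Sequence, Tuple

# Assumed layout: (start_s, end_s, feature_vector) for each segment of the record.
Segment = Tuple[float, float, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def extract_agent_segments(
    segments: List[Segment],
    agent_voiceprint: Sequence[float],
    threshold: float = 0.8,  # hypothetical value, to be tuned in practice
) -> Tuple[List[Tuple[float, float]], bool]:
    """Return the time spans matching the voiceprint and a manual-review flag."""
    matched = [
        (start, end)
        for start, end, features in segments
        if cosine_similarity(features, agent_voiceprint) >= threshold
    ]
    # No matching segment means the record is treated as abnormal.
    needs_manual_review = not matched
    return matched, needs_manual_review


segments = [
    (0.0, 2.5, [0.9, 0.1, 0.0]),
    (2.5, 5.0, [0.0, 1.0, 0.2]),
    (5.0, 7.0, [1.0, 0.2, 0.1]),
]
agent_voiceprint = [1.0, 0.1, 0.0]
agent_segments, needs_manual_review = extract_agent_segments(segments, agent_voiceprint)
```

With these toy vectors, the first and third segments match the voiceprint, so the record is not sent to manual quality inspection.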
S400, carrying out voice recognition on the customer service voice information to obtain a voice recognition result, wherein the voice recognition result at least comprises a meaning recognition result and a tone recognition result.
In this step, voice recognition is performed on the customer service voice information. In this process, semantic recognition is performed first, converting the customer service voice information into text information; tone recognition is then performed according to the text information and the customer service voice information, so as to determine the tone in which the customer service speaks.
S500, evaluating the customer service according to the voice recognition result to generate an evaluation result.
In the step, the customer service is evaluated according to the voice recognition result, and the specific meaning of the speaking of the customer service is further determined by comprehensively judging the semantics and the tone so as to evaluate the customer service and finally generate an evaluation result.
As shown in fig. 2, as a preferred embodiment of the present invention, the step of performing speech recognition on the customer service speech information to obtain a speech recognition result specifically includes:
S401, carrying out voice recognition on the customer service voice information to obtain a meaning recognition result, wherein the meaning recognition result at least comprises a character recognition time axis.
In this step, voice recognition is performed on the customer service voice information to obtain a meaning recognition result. After semantic recognition, the information contained in the customer service voice information can be displayed as text, but the tone contained in the customer service voice information is lost. The time of each recognized character is therefore recorded during semantic recognition to obtain a character recognition time axis, which records the time in the customer service voice information corresponding to each character.
S402, analyzing the customer service voice information, and performing accent marking on the meaning recognition result according to the analysis result and the character recognition time axis.
In this step, the customer service voice information is analyzed, and each character in the meaning recognition result is marked at its corresponding time position in the customer service voice information based on the character recognition time axis. For example, suppose the semantic recognition result contains the two-character greeting "hello", whose first character appears at the 31st second and whose second character ("good") appears at the 32nd second. The customer service voice information at the 31st and 32nd seconds is then analyzed to determine whether an accent exists in the spoken "hello", and any character on which an accent appears is marked.
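One possible in-memory form of the character recognition time axis is sketched below: each recognized character is stored together with the start and end time of its pronunciation. The `TimedChar` class, the `char_at` helper, and the sub-second boundaries are hypothetical; the characters 你 and 好 are assumed stand-ins for the two-character greeting in the example, placed at the 31st and 32nd seconds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TimedChar:
    """One entry of the character recognition time axis (hypothetical layout)."""
    char: str
    start_s: float  # time at which the character begins in the agent audio
    end_s: float    # time at which the character ends in the agent audio


def char_at(timeline: List[TimedChar], t: float) -> Optional[TimedChar]:
    """Return the character being pronounced at time t, or None during silence."""
    for item in timeline:
        if item.start_s <= t < item.end_s:
            return item
    return None


# Illustrative values only: the greeting spoken around seconds 31 and 32.
timeline = [TimedChar("你", 31.0, 31.4), TimedChar("好", 32.0, 32.5)]
text = "".join(item.char for item in timeline)
```

Keeping the text and the timing in one structure lets later steps move from any marked character back to the exact stretch of audio that produced it.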
S403, performing tone recognition according to the character recognition result after the accent marking to obtain a tone recognition result.
In this step, the voice corresponding to the marked characters is subjected to tone recognition to determine the tone used by the customer service, so as to obtain a tone recognition result.
As shown in fig. 3, as a preferred embodiment of the present invention, the step of analyzing the customer service voice information and performing accent marking on the meaning recognition result according to the analysis result and the character recognition time axis specifically includes:
S4021, dividing the characters contained in the meaning recognition result into sentences to obtain a sentence division recognition result.
In this step, the characters contained in the meaning recognition result are first divided into sentences. Because the characters obtained by semantic recognition may form one continuous run, they need to be divided into sentences according to their content to obtain a sentence division recognition result; subsequent processing is then performed sentence by sentence.
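The embodiment divides sentences according to the content of the characters and does not specify an algorithm. As a simpler stand-in, the following sketch splits the character sequence wherever the silence between two consecutive characters reaches a threshold; this pause-based rule, the 0.6-second value, and the name `split_by_pause` are assumptions made only for illustration.

```python
from typing import List, Tuple

# Assumed layout of a time-axis entry: (character, start_s, end_s).
TimedChar = Tuple[str, float, float]


def split_by_pause(
    timeline: List[TimedChar],
    min_pause_s: float = 0.6,  # hypothetical threshold
) -> List[List[TimedChar]]:
    """Group consecutive characters into sentences, breaking at long pauses."""
    sentences: List[List[TimedChar]] = []
    current: List[TimedChar] = []
    for item in timeline:
        # Gap between the end of the previous character and the start of this one.
        if current and item[1] - current[-1][2] >= min_pause_s:
            sentences.append(current)
            current = []
        current.append(item)
    if current:
        sentences.append(current)
    return sentences


timeline = [("a", 0.0, 0.2), ("b", 0.25, 0.5), ("c", 1.5, 1.8), ("d", 1.85, 2.0)]
sentences = split_by_pause(timeline)
sentence_texts = ["".join(ch for ch, _, _ in s) for s in sentences]
```

A content-based splitter, as described in the embodiment, could replace `split_by_pause` without changing the sentence-by-sentence processing that follows.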
S4022, voice interception is carried out on the customer service voice information corresponding to each character in the sentence recognition result according to the character recognition time axis, and single character voice information is obtained.
In this step, the customer service voice information is read, and then the time in the customer service voice information corresponding to the head end and the tail end of each character can be determined according to the character recognition time axis, so that the customer service voice between the time is intercepted, and the section of voice corresponds to the character, and finally the single character voice information is obtained.
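The interception of single character voice information can be sketched as slicing an array of audio samples by the start and end time of each character. The function name `slice_char_audio`, the tuple layout of the time axis, and the toy sample rate are hypothetical; a real recording would be decoded into samples first.

```python
from typing import List, Sequence, Tuple

# Assumed layout of a time-axis entry: (character, start_s, end_s).
TimedChar = Tuple[str, float, float]


def slice_char_audio(
    samples: Sequence[float],
    sample_rate: int,
    timeline: List[TimedChar],
) -> List[Tuple[str, List[float]]]:
    """Cut out the samples between the start and end time of each character."""
    clips: List[Tuple[str, List[float]]] = []
    for char, start_s, end_s in timeline:
        # Convert times to sample indices and clamp them to the recording.
        start = max(0, int(round(start_s * sample_rate)))
        end = min(len(samples), int(round(end_s * sample_rate)))
        clips.append((char, list(samples[start:end])))
    return clips


sample_rate = 10                         # toy rate: 10 samples per second
samples = [i / 100 for i in range(30)]   # 3 seconds of placeholder audio
timeline = [("a", 0.0, 0.5), ("b", 1.0, 1.3)]
clips = slice_char_audio(samples, sample_rate, timeline)
```

Each clip keeps its character alongside the samples, so the statistics in the next step can be attributed to the right character.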
S4023, counting the time length and loudness of the single character voice information corresponding to each character to obtain a statistical result.
In the step, the time length and the loudness of the single character voice information corresponding to each character are counted, a person has logic stress in the speaking process, and sometimes the pronunciation time is prolonged or shortened according to the tone, and the time length and the loudness are counted, so that the tone of the customer service is comprehensively judged.
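The statistics of this step can be sketched as follows. The embodiment does not define how loudness is measured; the sketch assumes root-mean-square (RMS) amplitude of the samples as the loudness figure, and the names `rms` and `char_stats` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple, Union


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude, used here as an assumed loudness measure."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def char_stats(
    clips: List[Tuple[str, Sequence[float]]],
    sample_rate: int,
) -> List[Dict[str, Union[str, float]]]:
    """Compute duration and loudness for each single character voice clip."""
    stats: List[Dict[str, Union[str, float]]] = []
    for char, samples in clips:
        stats.append({
            "char": char,
            "duration_s": len(samples) / sample_rate,
            "loudness_rms": rms(samples),
        })
    return stats


sample_rate = 8000
clips = [
    ("a", [0.1, -0.1] * 800),    # 0.2 s, quiet
    ("b", [0.5, -0.5] * 2000),   # 0.5 s, loud
]
stats = char_stats(clips, sample_rate)
```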
S4024, performing stress identification on the meaning identification result according to the statistical result.
In this step, the statistical results are combined to obtain the pronunciation duration and loudness of each character. Characters whose duration exceeds a preset duration are determined to be accent characters, and characters whose loudness exceeds a preset loudness are likewise determined to be accent characters. After all characters have been analyzed, the tone of the customer service is judged according to the positions of the accents, so as to evaluate the service quality of the customer service.
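The threshold rule of this step can be sketched as below. The preset duration of 0.35 seconds, the preset loudness of 0.3, the dictionary keys, and the name `mark_accents` are hypothetical values chosen for illustration; the embodiment only states that preset limits exist.

```python
from typing import Dict, List, Union

CharStat = Dict[str, Union[str, float, bool]]


def mark_accents(
    stats: List[CharStat],
    preset_duration_s: float = 0.35,  # hypothetical preset duration
    preset_loudness: float = 0.3,     # hypothetical preset loudness
) -> List[CharStat]:
    """Mark a character as accented if it exceeds either preset limit."""
    marked: List[CharStat] = []
    for s in stats:
        is_accent = (
            s["duration_s"] > preset_duration_s
            or s["loudness_rms"] > preset_loudness
        )
        marked.append({**s, "accent": is_accent})
    return marked


stats = [
    {"char": "a", "duration_s": 0.20, "loudness_rms": 0.10},
    {"char": "b", "duration_s": 0.50, "loudness_rms": 0.20},  # long
    {"char": "c", "duration_s": 0.20, "loudness_rms": 0.45},  # loud
]
marked = mark_accents(stats)
accent_positions = [i for i, m in enumerate(marked) if m["accent"]]
```

The list of accent positions is the input that the tone judgment works from; how positions map to a particular tone is left to the tone recognition step.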
As shown in fig. 4, as a preferred embodiment of the present invention, the step of performing interrupt recognition on the customer service voice information specifically includes:
S601, extracting the voice information of the client from the voice interaction record, and performing voice recognition to obtain a client voice recognition result.
In this step, the voice information of the client is extracted, and voice recognition is used to judge whether the client evaluated the service quality of the customer service during the interaction; a client voice recognition result is finally generated.
S602, the coincidence rate of the voice recognition result and the client voice recognition result is counted, and an interrupt analysis report is generated.
In this step, the coincidence rate of the voice recognition result and the client voice recognition result is counted. During customer service, the client may be interrupted while speaking, which greatly affects service quality. To address this problem, the time during which the customer service and the client speak simultaneously in the voice interaction record is counted, so as to determine whether the client was interrupted. Finally, an interrupt analysis report is generated and taken into account in the quality evaluation.
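Counting the time of simultaneous speech can be sketched as an interval-overlap computation. The sketch assumes that the speaking spans of the customer service and of the client are available as (start, end) pairs in seconds and that spans within one list do not overlap each other; expressing the coincidence rate as overlap time divided by total client speaking time is also an assumption, as are the function names.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start_s, end_s) of one stretch of speech


def overlap_seconds(agent: List[Span], client: List[Span]) -> float:
    """Total time during which both parties are speaking at once."""
    total = 0.0
    for a_start, a_end in agent:
        for c_start, c_end in client:
            # Length of the intersection of the two spans, or 0 if disjoint.
            total += max(0.0, min(a_end, c_end) - max(a_start, c_start))
    return total


def coincidence_rate(agent: List[Span], client: List[Span]) -> float:
    """Overlap time as a fraction of the client's total speaking time."""
    client_total = sum(end - start for start, end in client)
    if client_total == 0:
        return 0.0
    return overlap_seconds(agent, client) / client_total


agent_spans = [(0.0, 4.0), (9.0, 12.0)]
client_spans = [(3.0, 9.5), (12.0, 14.0)]
overlap = overlap_seconds(agent_spans, client_spans)
rate = coincidence_rate(agent_spans, client_spans)
```

In this toy record the two parties overlap for 1.5 seconds out of 8.5 seconds of client speech; a report could list each overlapping span alongside the overall rate.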
As shown in fig. 5, the voice intelligent customer service text quality inspection system provided in the embodiment of the present invention includes:
The data acquisition module 100 is configured to acquire the voice interaction record and the customer service information.
In the system, the data acquisition module 100 acquires the voice interaction record and the customer service information. Each customer service in the system has its own information, from which, by means of the voiceprint information, it can be judged which voices in the voice interaction record were uttered by the customer service; the voice interaction record is recorded while the customer service interacts with the customer.
The voiceprint calling module 200 is configured to call corresponding customer service voiceprint information from a preset voiceprint database according to the customer service information.
In the system, the voiceprint calling module 200 queries a preset voiceprint database according to the customer service information, wherein the customer service information includes information such as the name of the customer service, and then calls the voiceprint information of the customer service from the voiceprint database according to the information, and after the step, the voiceprint information can be used for processing the voice interaction record.
The voice extraction module 300 is configured to extract corresponding customer service voice information from the voice interaction record according to the customer service voiceprint information.
In the system, a voice extraction module 300 extracts corresponding customer service voice information from a voice interaction record according to customer service voiceprint information, compares the customer service voiceprint information with the customer service voice information to judge whether the corresponding customer service voiceprint information exists in the current voice interaction record, and if the corresponding customer service voiceprint information does not exist in the voice interaction record, the interaction between the customer service and the customer is determined to be abnormal, and quality inspection is directly performed manually to further judge whether the interaction is replaced by other customer services.
The speech recognition module 400 is configured to perform speech recognition on the customer service speech information to obtain a speech recognition result, where the speech recognition result at least includes a meaning recognition result and a tone recognition result.
In the system, the speech recognition module 400 performs speech recognition on the customer service speech information, and in the process, firstly performs semantic recognition, namely, the customer service speech information is subjected to the semantic recognition, the customer service speech information is converted into character information, and then the tone recognition is further performed according to the character information and the customer service speech information, so that the tone of the customer service speaking is determined.
The quality inspection evaluation module 500 is configured to evaluate the customer service according to the voice recognition result to generate an evaluation result.
In the system, the quality inspection evaluation module 500 evaluates the customer service according to the speech recognition result, and further determines the specific meaning of the speaking of the customer service by comprehensively judging the semantics and the tone so as to evaluate the customer service and finally generate an evaluation result.
As shown in fig. 6, as a preferred embodiment of the present invention, the voice recognition module includes:
The semantic recognition unit 401 is configured to perform speech recognition on the customer service speech information to obtain a meaning recognition result, where the meaning recognition result at least includes a character recognition time axis.
In this module, the semantic recognition unit 401 performs speech recognition on the customer service speech information to obtain a meaning recognition result, after the semantic recognition, the information included in the customer service speech information can be displayed in terms of characters, and when performing the semantic recognition, it is also ensured that the time for character recognition is recorded, so as to obtain a character recognition time axis, in which the time in the customer service speech information corresponding to each character is recorded.
The accent marking unit 402 is configured to analyze the customer service voice information, and perform accent marking on the meaning recognition result according to the analysis result and the character recognition time axis.
In this module, the accent marking unit 402 analyzes the customer service voice information, and marks each character in the meaning recognition result at its corresponding time position in the customer service voice information based on the character recognition time axis.
The tone recognition unit 403 is configured to perform tone recognition according to the character recognition result after the accent marking, to obtain a tone recognition result.
In this module, the tone recognition unit 403 performs tone recognition on the voice corresponding to the marked characters to determine the tone used by the customer service at that position, thereby obtaining a tone recognition result.
As shown in fig. 7, as a preferred embodiment of the present invention, the accent identification unit includes:
The sentence segmentation subunit 4021 is configured to divide the characters included in the meaning recognition result into sentences to obtain a sentence division recognition result.
In this unit, the sentence segmentation subunit 4021 divides the characters included in the meaning recognition result into sentences. Because the characters obtained by semantic recognition may form one continuous run, they need to be divided into sentences according to their content to obtain a sentence division recognition result; subsequent processing is then performed sentence by sentence.
The voice intercepting subunit 4022 is configured to perform voice interception on the customer service voice information corresponding to each character in the sentence division recognition result according to the character recognition time axis, to obtain single character voice information.
In this unit, the voice intercepting subunit 4022 reads the customer service voice information, and then determines the times in the customer service voice information corresponding to the beginning and the end of each character according to the character recognition time axis, so as to intercept the customer service voice between those times; each intercepted segment corresponds to one character, and the single character voice information is finally obtained.
The speech statistics subunit 4023 is configured to count the time length and loudness of the speech information of the individual character corresponding to each character, so as to obtain a statistical result.
In this unit, the speech statistics subunit 4023 counts the time length and loudness of the speech information of each word corresponding to each character, and the person may have logical stress during the speaking process, and may sometimes lengthen or shorten the pronunciation time according to the tone, and thus, the tone of the customer service is comprehensively determined by counting the time length and loudness.
The identification subunit 4024 is configured to perform accent identification on the meaning recognition result according to the statistical result.
In this unit, the identification subunit 4024 combines the statistical results to obtain the pronunciation duration and loudness of each character. Characters whose duration exceeds a preset duration are determined to be accent characters, and characters whose loudness exceeds a preset loudness are likewise determined to be accent characters. After all characters have been analyzed, the tone of the customer service is judged according to the positions of the accents, thereby evaluating the service quality.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in sequence as indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least a portion of the steps in various embodiments may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performance of the sub-steps or stages is not necessarily sequential, but may be performed in turn or alternately with other steps or at least a portion of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several embodiments of the present invention, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the present invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.