CN114203200A - Voice quality inspection method and device, computer equipment and storage medium

Info

Publication number: CN114203200A
Application number: CN202111360877.8A
Authority: CN
Inventors: 张奇; 施进; 范大章
Original assignee: Nanjing Suning Software Technology Co ltd
Current assignee: Nanjing Suning Software Technology Co ltd
Priority date: 2021-11-17
Filing date: 2021-11-17
Publication date: 2022-03-18
Also published as: CA3182191A1

Abstract

The application relates to a voice quality inspection method, a voice quality inspection device, computer equipment and a storage medium. The voice quality inspection method comprises the following steps: converting voice data into text data, and performing sentence breaking processing on the text data to obtain text segments; configuring a keyword type according to the task type, retrieving keywords corresponding to the keyword type, and updating a keyword library; selecting a keyword text from the keyword library, and comparing the keyword text with the text fragment to obtain matching information of the keyword text; comparing the number of the selected keyword texts with the matching information of the keyword texts to obtain the matching coefficient of the keyword types; and comparing the matching coefficient of the keyword type with a preset matching threshold value to obtain a quality inspection result of the task type, so that the defects of a manual quality inspection mode can be avoided, the enterprise cost is reduced, the voice quality inspection efficiency is improved, and customer service performance assessment and customer service satisfaction systematization are realized.

Description

Voice quality inspection method and device, computer equipment and storage medium

Technical Field

The present invention relates to the field of information processing technologies, and in particular, to a voice quality inspection method, apparatus, computer device, and storage medium.

Background

With the development of information technology, customers usually communicate with enterprise customer service through voice to perform operations such as business consultation and opinion complaint. In order to improve the voice service quality and the customer satisfaction, enterprises need to evaluate and assess the performance of voice customer service. At present, voice communication records generated between customer service and customers in a time period are mainly sorted and quality-checked in a manual mode. However, with the increase of the traffic volume, the manual mode has the defects of complicated operation steps, low execution efficiency, poor quality inspection, high labor cost, high subjectivity and the like, and the problems of low accuracy of voice quality inspection results, unreasonable voice customer service performance assessment results and the like occur.

Disclosure of Invention

In view of the above, it is desirable to provide a voice quality inspection method, apparatus, computer device and storage medium that address the poor results of manual voice quality inspection.

In one aspect, a voice quality inspection method is provided, where the voice quality inspection method includes:

converting voice data into text data, and performing sentence breaking processing on the text data to obtain text segments;

configuring a keyword type according to the task type, retrieving keywords corresponding to the keyword type, and updating a keyword library;

selecting a keyword text from the keyword library, and comparing the keyword text with the text fragment to obtain matching information of the keyword text;

comparing the number of the selected keyword texts with the matching information of the keyword texts to obtain the matching coefficient of the keyword types;

and comparing the matching coefficient of the keyword type with a preset matching threshold value to obtain a quality inspection result of the task type.

In one embodiment, the step of configuring a keyword type according to a task type, retrieving a keyword corresponding to the keyword type, and updating a keyword library includes:

configuring a keyword type according to a task type, wherein the keyword type comprises: standard keywords, forbidden keywords and emotion keywords;

retrieving the meaning and application scene of each keyword type to obtain the corresponding keyword;

and updating a keyword library according to the keywords.

In one embodiment, the step of selecting a keyword text from the keyword library, comparing the keyword text with the text fragment, and obtaining matching information of the keyword text comprises:

according to the keyword library, at least one keyword text is respectively selected for each keyword type to form a single configuration of each keyword type;

performing content matching and traversal on the single configuration from the first clause of the text segment, and recording the serial number of the clause in the text segment to acquire the matching position of the keyword text if the clause is in matching relation with the keyword text in the single configuration;

recording the number of keyword texts of which the clauses and the single configuration form a matching relation, counting the number of the keyword texts only once if the same keyword texts in the single configuration repeatedly appear in the same clause, and acquiring the matching times of the keyword texts according to the number of the keyword texts;

and acquiring matching information of the keyword text according to the matching position of the keyword text and the matching times of the keyword text.

In one embodiment, the step of comparing the number of the selected keyword texts with the matching information of the keyword texts to obtain the matching coefficients of the keyword types includes:

sorting the matching times according to the matching information of the keyword text, and acquiring the maximum matching times according to the highest-value matching times in the sorting;

dividing the maximum matching times by the number of the selected keyword texts to obtain a matching coefficient of a single configuration;

setting sampling weight for each single configuration, and acquiring a single configuration sampling coefficient through the sampling weight and the single configuration matching coefficient, wherein the mathematical expression of a single configuration sampling coefficient sp is as follows:

sp = w * p

wherein sp is the single configuration sampling coefficient, w is the single configuration sampling weight, and p is the single configuration matching coefficient;

comparing the single configuration sampling coefficient with a preset single configuration matching threshold value to obtain a matching coefficient S of the keyword type, wherein the mathematical expression of the matching coefficient S of the keyword type is as follows:

wherein, S is the keyword type matching coefficient, sp is a single configuration sampling coefficient, t is a single configuration matching threshold, max (·) represents taking the maximum value, and d (·) is a differential operator.

In one embodiment, the step of comparing the matching coefficient according to the keyword type with a preset matching threshold to obtain the quality inspection result of the task type includes:

and comparing the matching coefficients of the keyword types with preset matching threshold values respectively to obtain quality inspection results.

In one embodiment, the step of comparing the matching coefficient according to the keyword type with a preset matching threshold to obtain the quality inspection result of the task type further includes:

setting sampling weight for each keyword type, and acquiring sampling coefficient according to the matching coefficient of the sampling weight and the keyword type;

and comparing the sampling coefficient with a preset matching threshold value to obtain a quality inspection result.

In one embodiment, the step of converting voice data into text data, performing sentence break processing on the text data, and acquiring text segments includes:

separating the mute content and the voice content in the voice data, and acquiring a time separation label;

performing sentence breaking on the text data according to the time separation label, and adding punctuation marks at the end of the sentence according to the emotionally colored vocabulary at the end of the sentence;

and performing word number query on the text without the punctuation marks in the text data, and adding the punctuation marks when the text without the punctuation marks exceeds a preset word number threshold.

In another aspect, a voice quality inspection apparatus is provided, including:

the voice conversion text module is used for converting voice data into text data, and performing sentence breaking processing on the text data to obtain text segments;

the task parameter configuration module is used for configuring a keyword type according to the task type, retrieving keywords corresponding to the keyword type and updating a keyword library;

the matching information acquisition module is used for selecting a keyword text from the keyword library, comparing the keyword text with the text fragment and acquiring matching information of the keyword text;

the matching coefficient acquisition module is used for comparing the number of the selected keyword texts with the matching information of the keyword texts to acquire the matching coefficients of the keyword types;

and the quality inspection result acquisition module is used for comparing the matching coefficient of the keyword type with a preset matching threshold value to acquire the quality inspection result of the task type.

In another aspect, a voice quality inspection device is provided. The voice quality inspection device includes a quality inspection result obtaining module, and the quality inspection result obtaining module includes:

and the first acquisition unit is used for comparing the matching coefficients of the keyword types with preset matching threshold values respectively to acquire a quality inspection result.

In another aspect, a voice quality inspection device is provided. The voice quality inspection device includes a quality inspection result obtaining module, and the quality inspection result obtaining module further includes:

the first acquisition unit is used for setting sampling weight for each keyword type and acquiring sampling coefficient according to the matching coefficient of the sampling weight and the keyword type;

and the second acquisition unit is used for comparing the sampling coefficient with a preset matching threshold value to acquire a quality inspection result.

In another aspect, a computer device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the following steps when executing the computer program:

In yet another aspect, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, performs the steps of:

According to the voice quality inspection method, the device, the computer equipment and the storage medium, the keyword type is set according to the task type, the keyword text is selected to be compared with the customer service voice text, the matching information and the matching coefficient are obtained to judge the customer service voice quality inspection result, the defects existing in the manual quality inspection mode can be avoided, the enterprise cost is reduced, the voice quality inspection efficiency is improved, and customer service performance assessment and customer service satisfaction systematization are realized.

Drawings

FIG. 1 is a flow chart illustrating an exemplary implementation of a voice quality inspection method;

FIG. 2 is a diagram of an exemplary voice quality inspection system;

FIG. 3 is a flowchart illustrating a voice quality inspection method according to an embodiment;

FIG. 4 is a flowchart illustrating the step of obtaining text segments in one embodiment;

FIG. 5 is a flowchart illustrating the steps of updating a keyword library in one embodiment;

FIG. 6 is a flowchart illustrating the step of obtaining a matching location in one embodiment;

FIG. 7 is a flowchart illustrating the step of obtaining matching coefficients in one embodiment;

FIG. 8 is a flowchart illustrating steps of obtaining quality inspection results according to an embodiment;

FIG. 9 is a flowchart illustrating a step of obtaining quality inspection results according to another embodiment;

FIG. 10 is a block diagram of a quality inspection result obtaining module according to an embodiment;

FIG. 11 is a block diagram showing the structure of a voice quality inspection apparatus according to an embodiment;

FIG. 12 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

The application flow of the voice quality inspection method provided by the application is shown in fig. 1. For example, the voice quality inspection method provided by the application can be applied to the detection of customer service voice service quality, the voice data 100 is converted into the text data 101, and the text data 101 is matched with the text data 102 to obtain the quality inspection result 103, so that the defects of a manual quality inspection mode can be avoided, the enterprise cost is reduced, the voice quality inspection efficiency is improved, and the customer service performance assessment and the customer service satisfaction systematization are realized.

The voice quality inspection method provided by the application can be applied to the application environment shown in fig. 2, in which a terminal 200 communicates with a server 201 via a network. For example, the voice quality inspection method provided by the application includes: converting voice data into text data and performing sentence breaking processing; configuring keyword types according to task types and updating a keyword library; selecting keyword texts from the keyword library and comparing them with the text segments to obtain matching information of the keyword texts; comparing the matching information with the number of the selected keyword texts to obtain matching coefficients of the keyword types; and comparing the matching coefficients with a preset matching threshold value to obtain quality inspection results of the task types. The terminal 200 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, portable wearable device, or sub-server, and the server 201 may be implemented by an independent server, a server cluster formed by a plurality of servers, or a cloud computing platform.

In one embodiment, as shown in fig. 3, there is provided a voice quality inspection method, including the steps of:

s1: converting voice data into text data, and performing sentence breaking processing on the text data to obtain text segments;

s2: configuring a keyword type according to the task type, retrieving keywords corresponding to the keyword type, and updating a keyword library;

s3: selecting a keyword text from the keyword library, and comparing the keyword text with the text fragment to obtain matching information of the keyword text;

s4: comparing the number of the selected keyword texts with the matching information of the keyword texts to obtain the matching coefficient of the keyword types;

s5: and comparing the matching coefficient of the keyword type with a preset matching threshold value to obtain a quality inspection result of the task type.

Through the above steps, the problems that arise when voice communication records generated between customer service and customers are sorted and inspected manually, such as complicated operation steps, low execution efficiency, poor inspection quality, high labor cost and high subjectivity, can be solved. The voice data is converted into text data, the text data is compared with keyword texts in a keyword library, and matching rules are set to obtain the voice quality inspection result. In this way, the defects of the manual quality inspection mode can be avoided, the enterprise cost is reduced, the voice quality inspection efficiency is improved, and customer service performance assessment and customer service satisfaction systematization are realized.

Because the voice data includes a two-way voice record generated by voice communication between a customer and the enterprise customer service over a period of time, detecting the quality of the customer service voice service requires distinguishing the voice content of the customer from that of the customer service. The voice content may also contain noise, so text obtained by converting the voice data directly into a single text cannot be inspected as it is. In step S1, for example, the voice data can be converted into text data and the text data can be processed by sentence breaking to obtain text segments. For example, after the two-way voice record is obtained, the voice data is preprocessed to remove noise, interference and the like, and a method such as voice activity detection is used to separate the mute content from the voice content in the voice data to obtain a time separation label. The preprocessed voice data is recognized through a voice recognition algorithm such as the Viterbi algorithm to obtain text data. Role separation is performed on the text data through automatic voice recognition techniques, such as an automatic rapid recognition technique, to obtain customer text data and customer service text data. Sentence breaking is performed on the customer text data and the customer service text data according to the time separation label, and punctuation marks are added according to the emotionally colored vocabulary in the customer text data and the customer service text data. Finally, a word number query is performed on the text without punctuation marks in the customer text data and the customer service text data, and punctuation marks are added when the text without punctuation marks exceeds a preset word number threshold value, to obtain the text segments.

When detecting the voice customer service quality, the keyword library needs to be updated so that newer or more appropriate keywords can be selected. In step S2, for example, one or more task types may be set, the keyword types are configured according to the task types, the commonly used vocabulary corresponding to the meaning and application scenario of each keyword type is retrieved, and the vocabulary is combined to update the keyword library. For example, different keyword types are configured for different task types, such as business consultation, marketing outbound calls and complaint suggestions. When the keyword library is updated, new keywords may be added to the keyword library, and keywords in the keyword library that are inappropriate, infrequently used or misclassified may be deleted.

In order to detect whether the content of the voice communication between the customer service and the customer meets the requirement, keyword matching is performed on the text data, and in step S3, it is exemplarily illustrated that a keyword text may be selected from the keyword library and compared with the text segment to obtain matching information of the keyword text. For example, according to different task types and different keyword types, one or more keyword texts are selected from a keyword library, content matching and traversal are performed on the keyword texts from the first clause of the text segment, if the clause and the keyword texts have a matching relationship, the serial number of the clause in the text segment is recorded, the matching position is obtained, and the matching times of the keyword texts are recorded.

After the matching information of the keyword texts is obtained, the number of matches between each clause and the single keyword text in the text segment needs to be counted, and in step S4, it is exemplarily illustrated that the maximum number of keyword texts having a matching relationship in each clause may be divided by the number of the selected keyword texts according to the matching information, so as to obtain the matching coefficients of the keyword texts.
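
The division described for step S4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `matching_coefficient` is hypothetical, a keyword text is assumed to "match" a clause when it occurs in the clause as a substring, and the keyword texts in a single configuration are assumed to be distinct. The application does not specify these details.

```python
from typing import List


def matching_coefficient(clauses: List[str], single_configuration: List[str]) -> float:
    """Largest number of distinct keyword texts matched in any one clause,
    divided by the number of selected keyword texts.

    Assumption: matching is a plain substring test.
    """
    selected = len(single_configuration)
    if selected == 0:
        return 0.0
    # For each clause, count how many selected keyword texts it contains.
    per_clause = (
        sum(1 for keyword in single_configuration if keyword in clause)
        for clause in clauses
    )
    max_matches = max(per_clause, default=0)
    return max_matches / selected


clauses = ["hello sir", "the price and function are listed"]
config = ["price", "function", "hello", "madam"]
p = matching_coefficient(clauses, config)
```

In this example the second clause matches two of the four selected keyword texts, so the coefficient is 2 / 4.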

After the matching coefficient of the keyword text is obtained, it must be determined whether the voice quality inspection is qualified. In step S5, for example, the matching coefficient may be compared with a preset matching threshold to obtain the quality inspection result of the task type. For example, a sampling weight is set for each keyword type according to the task type, where the sampling weight represents the importance of each keyword type in different task types. For the business consultation task type, an enterprise may raise the weight of the standard keywords; for the complaint suggestion task type, an enterprise may raise the weights of the forbidden keywords and the emotion keywords. The matching coefficient and the sampling weight are then used in the comparison with the preset matching threshold to obtain the quality inspection result.
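
A hedged Python sketch of the weighted comparison in step S5 follows. It assumes that the sampling coefficient of a keyword type is the product of its sampling weight and its matching coefficient, and that each keyword type has its own threshold; the function name `sampling_results` and the dictionary layout are hypothetical. How reaching the threshold maps to "qualified" or "unqualified" (for example, for forbidden keywords) is not stated here, so the sketch only reports whether the threshold is reached.

```python
from typing import Dict, Tuple


def sampling_results(
    coefficients: Dict[str, float],
    weights: Dict[str, float],
    thresholds: Dict[str, float],
) -> Dict[str, Tuple[float, bool]]:
    """For each keyword type, return (sampling coefficient, threshold reached).

    Assumption: sampling coefficient = sampling weight * matching coefficient.
    """
    results = {}
    for keyword_type, coefficient in coefficients.items():
        sampled = weights[keyword_type] * coefficient
        results[keyword_type] = (sampled, sampled >= thresholds[keyword_type])
    return results


results = sampling_results(
    coefficients={"standard": 0.5, "forbidden": 0.25},
    weights={"standard": 1.0, "forbidden": 2.0},
    thresholds={"standard": 0.6, "forbidden": 0.5},
)
```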

Before quality inspection is performed on the voice communication content between the customer and the customer service, the voice data needs to be converted into text data. As shown in fig. 4, step S1 of converting the voice data into text data and performing sentence breaking processing on the text data to obtain text segments includes:

s11: separating the mute content and the voice content in the voice data, and acquiring a time separation label;

s12: performing sentence breaking on the text data according to the time separation label, and adding punctuation marks at the end of the sentence according to the emotionally colored vocabulary at the end of the sentence;

s13: and performing word number query on the text without the punctuation marks in the text data, and adding the punctuation marks when the text without the punctuation marks exceeds a preset word number threshold.

Through the steps, the voice communication content between the customer service and the client in a period of time can be divided into the mute content and the voice content, the different voice contents of the client and the customer service are distinguished, the noise possibly existing in the voice content is eliminated, punctuation marks are added to the converted text data for sentence breaking, and the applicability of the text data is improved.

As shown in fig. 4, in step S11, for example, to separate the mute content from the voice content in the voice data, the voice data may be detected by a voice activity detection method. When the detected mute duration exceeds a mute threshold, for example 3 seconds or 5 seconds, the time point is recorded to obtain a time separation label, and the mute content is separated from the voice content by using the time separation label. Different mute thresholds can be set for different speech speeds, so that the separation of the mute content and the voice content is more reasonable, the respective pause habits of different customers and customer service agents during voice communication are taken into account, and the separated voice content is better suited to the subsequent voice quality inspection steps.
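
The silence-threshold rule of step S11 can be illustrated with a short Python sketch. It assumes that a voice activity detector (not shown) has already produced speech intervals in seconds, sorted by start time; the function name `time_separation_labels`, the interval format, and the choice of recording the time at which each long silence begins are illustrative assumptions rather than details from the application.

```python
from typing import List, Tuple


def time_separation_labels(
    speech_intervals: List[Tuple[float, float]],
    mute_threshold: float = 3.0,
) -> List[float]:
    """Return the time points at which a silence longer than
    mute_threshold seconds begins.

    speech_intervals holds (start, end) pairs sorted by start time.
    """
    labels = []
    # Compare the end of each speech interval with the start of the next one.
    for (_, prev_end), (next_start, _) in zip(speech_intervals, speech_intervals[1:]):
        if next_start - prev_end > mute_threshold:
            labels.append(prev_end)
    return labels


intervals = [(0.0, 2.5), (3.0, 6.0), (10.5, 12.0)]
labels = time_separation_labels(intervals, mute_threshold=3.0)
```

Passing a larger `mute_threshold` for slower speakers corresponds to setting different mute thresholds for different speech speeds.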

As shown in fig. 4, in step S12, for example, when a word with an exclamatory emotional color, such as "o", "bar" or "chan", is retrieved at a time separation label, an exclamation mark is added at the end of the sentence; when a word with a questioning emotional color, such as "Do" or "what", is retrieved at a time separation label, a question mark is added at the end of the sentence. In this way, appropriate punctuation marks are added to the text according to the emotional color of the customer or the customer service, which facilitates the subsequent voice quality inspection steps.
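
The rule in step S12 can be sketched in Python as a lookup on the last word of a sentence. The word sets below simply reuse the example tokens quoted in this paragraph; the function name `punctuate_by_final_word`, the whitespace-joined word list, and the case-insensitive comparison are assumptions made for illustration.

```python
from typing import List

# Example tokens taken from the description; a real word list would be larger.
EXCLAMATION_WORDS = {"o", "bar", "chan"}
QUESTION_WORDS = {"do", "what"}


def punctuate_by_final_word(words: List[str]) -> str:
    """Join the words of one sentence and append '!' or '?' when the
    final word belongs to one of the emotionally colored word sets."""
    text = " ".join(words)
    last = words[-1].lower() if words else ""
    if last in EXCLAMATION_WORDS:
        return text + "!"
    if last in QUESTION_WORDS:
        return text + "?"
    # No emotionally colored final word: leave the sentence unpunctuated.
    return text


question = punctuate_by_final_word(["you", "said", "what"])
```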

As shown in fig. 4, in step S13, for example, a word number query is performed on the text without punctuation marks in the text data. For each segment in the text data, if the number of words exceeds the preset word number threshold, for example 20 words or 30 words, a period is added to break the sentence; otherwise, a comma is added to break the sentence. A period is added directly after the last segment in the text data.
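
A minimal Python sketch of the word-count rule in step S13 is given below. It assumes that segments are strings, that words are separated by whitespace (the original text may count characters instead), and that a segment already ending in a punctuation mark is left unchanged; the function name `punctuate_by_length` is hypothetical.

```python
from typing import List


def punctuate_by_length(segments: List[str], word_threshold: int = 20) -> List[str]:
    """Add a period to segments longer than word_threshold words and to the
    last segment; add a comma to the other unpunctuated segments."""
    result = []
    for index, segment in enumerate(segments):
        if segment and segment[-1] in ".,!?":
            # Already punctuated, for example by the emotional-color rule.
            result.append(segment)
            continue
        is_last = index == len(segments) - 1
        if is_last or len(segment.split()) > word_threshold:
            result.append(segment + ".")
        else:
            result.append(segment + ",")
    return result


punctuated = punctuate_by_length(
    ["hello there", "how are you?", "a b c d", "bye"], word_threshold=3
)
```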

Before selecting the keyword text from the keyword library, the keyword library needs to be updated so as to select a suitable keyword text, as shown in fig. 5, in some embodiments, the voice quality inspection method further includes:

s21: configuring a keyword type according to a task type, wherein the keyword type comprises: standard keywords, forbidden keywords and emotion keywords;

s22: retrieving the meaning and application scene of each keyword type to obtain the corresponding keyword;

s23: and updating a keyword library according to the keywords.

Through the steps, different task types are analyzed, different keyword types can be configured according to the task types, after the meanings of the keyword types are further analyzed, the corresponding common vocabulary can be inquired in combination with the application scene, the common vocabulary is combined to form a keyword library, the keyword library is updated, the appropriate keyword texts can be selected from the keywords conveniently in the follow-up process, and the applicability of the selected keyword texts is improved.

As shown in fig. 5, in step S21, the keyword types configured according to the task type include standard keywords, forbidden keywords and emotion keywords. One or more single configurations can be set for each keyword type, and one or more keyword texts can be set in each single configuration, so as to meet the different consideration ranges and quality inspection requirements of enterprises for different task types.

As shown in fig. 5, in step S22, for example, after each keyword type is configured, the meaning and application scenario of the keyword type are parsed, and the corresponding keywords are retrieved from the internet. For example, for the marketing outbound task type, the following keywords can be retrieved for the standard keywords: "hello", "Mr.", "Ms.", "product", "price", etc., so as to provide diverse and rich selectable keywords under different task types.

As shown in fig. 5, in step S23, for example, after the keywords are retrieved, a manual review is performed before keywords are added to or deleted from the keyword library. For example, when the keywords "product" and "price" are retrieved, they are added to the keyword library after passing the manual review. The keyword library is also queried, and if forbidden-language vocabulary such as "illegal", "violation", "negative" or "slow" is found, it is deleted from the keyword library after passing the manual review. This keeps the keyword library up to date and supports the subsequent selection of keyword texts.
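
The add-and-delete update of step S23 can be sketched in Python with a dictionary of sets. The data layout, the function name `update_library`, and the idea that manual review has already produced the approved additions and deletions are assumptions made for illustration; the application does not describe how the keyword library is stored.

```python
from typing import Dict, Iterable, Set

KeywordLibrary = Dict[str, Set[str]]


def update_library(
    library: KeywordLibrary,
    keyword_type: str,
    approved_additions: Iterable[str] = (),
    approved_deletions: Iterable[str] = (),
) -> KeywordLibrary:
    """Apply manually reviewed additions and deletions to one keyword type."""
    entries = library.setdefault(keyword_type, set())
    entries.update(approved_additions)
    entries.difference_update(approved_deletions)
    return library


library: KeywordLibrary = {"standard": {"hello", "slow"}}
update_library(
    library,
    "standard",
    approved_additions=["product", "price"],
    approved_deletions=["slow"],
)
```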

After updating the keyword library, it is necessary to select a specific keyword text to form each single configuration of each keyword type, and compare each single configuration with a text fragment, as shown in fig. 6, in some implementation processes, a method for obtaining keyword text matching information is provided:

s31: according to the keyword library, for each keyword type, at least one keyword text is respectively selected to form a single configuration of each keyword type;

s32: performing content matching and traversal on the single configuration from the first clause of the text segment, and recording the serial number of the clause in the text segment to acquire the matching position of the keyword text if the clause is in matching relation with the keyword text in the single configuration;

s33: recording the number of the keyword texts of which the clauses and the single configuration form a matching relation, if the same keyword text in the single configuration repeatedly appears in the same clause for multiple times, counting the number of the keyword texts only once, and acquiring the matching times of the keyword texts according to the number of the keyword texts;

s34: and acquiring matching information of the keyword text according to the matching position of the keyword text and the matching times of the keyword text.

Through the steps, the keyword text matching rules in the voice quality inspection process can be set in detail and accurately, the matching positions and the matching times of each single configuration and the text segments are obtained and recorded, the convenience of inquiring the keyword matching text information is improved, the matching condition of each keyword text is analyzed in the voice quality inspection process, and meanwhile the accuracy of voice quality inspection is improved.

As shown in fig. 6, in step S31, for example, at least one keyword text is selected for each keyword type to form a single configuration of that keyword type. For example, when the task type is marketing outbound, three keyword types are configured: standard keywords, forbidden keywords and emotion keywords. Three single configurations are set for the standard keywords, two single configurations are set for the forbidden keywords, and two single configurations are set for the emotion keywords. The keywords selected in standard keyword single configuration 1 are: "hello", "Mr.", "Ms."; the keywords selected in standard keyword single configuration 2 are: "function", "price"; the keywords selected in standard keyword single configuration 3 are: "the answer is solved"; the keywords selected in forbidden keyword single configuration 1 are: "fraud"; the keywords selected in forbidden keyword single configuration 2 are: "not clear"; and the keywords selected in emotion keyword single configuration 1 are: "thank you" and "angry". In this way, enterprises can set different single configurations for different keyword types, and the selected keyword texts can be diverse and rich.

As shown in fig. 6, step S32 is exemplarily illustrated as follows: the keyword texts in a single configuration are regarded as a whole, and the clauses of the text segment are traversed from the first clause and matched against the single configuration; if a clause has a matching relationship with the keyword texts in the single configuration, the sequence number of that clause in the text segment is recorded as the matching position of the keyword texts. For example, when the nth clause in the text segment is "how you are saying, i are considered with you at all, and are irrelevant to me" and the keyword texts selected in one single configuration of the banned keywords are "random" and "junk", the nth clause is considered to have a matching relationship with the keyword texts in that single configuration; since the sequence number of the clause in the text segment is n, the matching position of the keyword texts in the single configuration is n, where n is a non-negative integer.

As shown in fig. 6, step S33 is exemplarily illustrated as follows: when a clause has a matching relationship with the keyword texts in a single configuration, the number of keyword texts in the single configuration that match the clause is recorded, and a keyword text that appears in the clause several times is counted only once. For example, suppose the single configuration consists of N keyword texts, M of which have a matching relationship with the content of the clause, and Q of those M keyword texts each appear in the clause K times. Although the Q keyword texts appear K times, each of them is counted only once, so the number of keyword texts matching the single configuration is M, and the matching times of the keyword texts of the single configuration is therefore M, where N, M, Q, and K are non-negative integers and N ≥ M ≥ Q.
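The counting rule of step S33 can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function name `matching_times` is hypothetical, and a "matching relationship" is assumed here to mean plain substring containment.

```python
def matching_times(clause, single_config):
    """Count the keyword texts of a single configuration that occur in a clause.

    Each keyword text is counted at most once, however often it repeats.
    Assumption: a keyword text "matches" when it is a substring of the clause.
    """
    return sum(1 for keyword in set(single_config) if keyword in clause)


# "price" occurs twice in the clause but contributes only 1 to the count.
clause = "the price is fair and the price includes every function"
times = matching_times(clause, ["function", "price", "discount"])
```

Here N = 3 keyword texts are configured and M = 2 of them match, so the matching times is 2 even though one keyword repeats.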

As shown in fig. 6, step S34 is exemplarily illustrated as follows: after the matching position and the matching times of a single configuration are obtained, they are combined. For example, if one clause has a matching relationship with the single configuration, its sequence number in the text segment is n, and the matching times obtained for it is M, the matching information of the single configuration for that clause may be represented as [n, M]. If another clause also has a matching relationship with the single configuration, its sequence number in the text segment is m, and the matching times obtained for it is Q, the matching information for that clause may be represented as [m, Q]. If the single configuration has a matching relationship with X clauses, its matching information is represented as a matrix of X rows and Y columns, that is, in the form X × Y, where X is the number of clauses that have a matching relationship with the single configuration and the Y columns hold the matching position and the matching times obtained for each of the X clauses; n, m, M, Q, X, and Y are non-negative integers.
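Steps S32 to S34 taken together can be sketched as a single pass over the clauses. The sketch below rests on assumptions not stated in the text: the name `matching_info` is hypothetical, positions are numbered from 1, and matching is plain substring containment.

```python
def matching_info(clauses, single_config):
    """Return one [position, times] row per clause that matches the configuration.

    position: sequence number of the clause in the text segment (1-based, assumed).
    times:    number of distinct keyword texts found in that clause.
    """
    rows = []
    for position, clause in enumerate(clauses, start=1):
        times = sum(1 for keyword in set(single_config) if keyword in clause)
        if times > 0:  # only clauses with a matching relationship are recorded
            rows.append([position, times])
    return rows


clauses = [
    "hello sir",
    "how can I help",
    "the price and function are listed",
    "the price is fixed",
]
info = matching_info(clauses, ["function", "price"])
```

The result has X = 2 rows and Y = 2 columns, one row for each matching clause.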

After the matching information of a single configuration is obtained, the single configuration may have a matching relationship with several clauses, so there may be several sets of matching positions and matching times. Before the quality inspection result is obtained, this matching information needs to be processed further. As shown in fig. 7, in some implementations, an obtaining method is provided:

s41: sorting the matching times according to the matching information of the keyword text, and acquiring the maximum matching times according to the highest-value matching times in the sorting;

s42: dividing the maximum matching times by the number of the selected keyword texts to obtain a matching coefficient of a single configuration;

s43: setting sampling weight for each single configuration, and acquiring a single configuration sampling coefficient through the sampling weight and the single configuration matching coefficient, wherein the mathematical expression of a single configuration sampling coefficient sp is as follows:

sp＝w*p

s44: comparing the single configuration sampling coefficient with a preset single configuration matching threshold value to obtain a matching coefficient of a keyword type, wherein a mathematical expression of the keyword type matching coefficient S is as follows:

Through the steps, the specific information of each matching position and the matching times when each single configuration has a matching relation with different clauses in the text segment can be analyzed, and meanwhile, the proper matching times are selected for calculation in the subsequent quality inspection result acquisition step, so that the accuracy of the voice quality inspection result is improved.

As shown in fig. 7, step S41 is exemplarily illustrated as follows: the sets of matching times in the matching information are sorted in ascending or descending order, and the largest matching times among them is taken. For example, when the matching information of a single configuration is {[m, M], [n, N], [q, Q]} with M > N > Q, the maximum matching times is M, where m, n, q, M, N, and Q are non-negative integers.
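Selecting the maximum matching times in step S41 can be sketched as below. The helper name `max_matching_times` and the choice to return 0 when no clause matched are assumptions made for this illustration.

```python
def max_matching_times(info):
    """Return the largest matching times among [position, times] rows.

    Returns 0 when the single configuration matched no clause (an assumption;
    the text does not say how an empty result is handled).
    """
    return max((times for _position, times in info), default=0)


largest = max_matching_times([[3, 2], [4, 1], [7, 3]])
```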

In step S42, it is exemplarily illustrated that after the maximum matching times is obtained, it is divided by the number of keyword texts selected in the current single configuration to obtain the matching coefficient. For example, when the number of keyword texts selected in the single configuration is N and the maximum matching times is M, the matching coefficient of the single configuration is p = M/N, where N and M are non-negative integers.

In some embodiments, for step S42, when the keyword type is the banned keyword, after the maximum matching times is obtained, it is subtracted from the number of keyword texts selected in the current single configuration, and the difference is divided by the number of keyword texts selected in the current single configuration to obtain the matching coefficient. For example, when the number of keyword texts selected in the single configuration is N and the maximum matching times is M, the matching coefficient of the single configuration is p = (N - M)/N, where N and M are non-negative integers.
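The two variants of the single configuration matching coefficient can be sketched in one function. The name `matching_coefficient`, the `banned` flag, and the explicit check against N = 0 are assumptions added for this sketch.

```python
def matching_coefficient(max_times, num_keywords, banned=False):
    """Compute p for a single configuration.

    Regular keyword types: p = M / N.
    Banned keyword types:  p = (N - M) / N, so fewer matches give a higher p.
    """
    if num_keywords <= 0:
        raise ValueError("a single configuration needs at least one keyword text")
    if banned:
        return (num_keywords - max_times) / num_keywords
    return max_times / num_keywords


p_standard = matching_coefficient(2, 4)             # M = 2, N = 4
p_banned = matching_coefficient(1, 4, banned=True)  # M = 1, N = 4
```

With the banned variant, a call in which no banned keyword text appears yields p = 1, so a "greater than threshold" test can be used uniformly for all keyword types.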

In step S43, it is exemplarily explained that a sampling weight is set for each single configuration of a keyword type, and the matching coefficient of each single configuration is multiplied by its sampling weight and then accumulated to obtain the matching coefficient of the keyword type. For example, if the keyword type has i single configurations whose matching coefficients are represented as {p1, p2, ..., pi}, and the sampling weights set for the i single configurations are represented as {w1, w2, ..., wi}, then the calculation formula of the single configuration sampling coefficient sp is as follows:

In step S44, the threshold of the single configuration is set as t; if sp is greater than t, the keyword type formed by the single configurations is determined to be qualified, and the calculation formula of the matching coefficient S of the keyword type is:

where k and i are positive integers, max(·) denotes taking the maximum value, and d(·) is the differential operator.

In other implementations, for step S43 and step S44, the method for calculating the matching coefficient of the keyword types further includes:

for a keyword type with i single configurations, the matching coefficients of the i single configurations are represented as {p1, p2, ..., pi} and the sampling weights set for them are represented as {w1, w2, ..., wi}, so the sampling coefficients of the i single configurations are sp = {sp1, sp2, ..., spi} = {p1 × w1, p2 × w2, ..., pi × wi}; the thresholds of the i single configurations are set as t = {t1, t2, ..., ti}; if the sampling coefficient of the k-th single configuration is greater than the matching threshold of the k-th single configuration, the k-th single configuration is considered qualified; when all i single configurations are qualified, the quality inspection of the keyword type formed by the i single configurations is considered qualified, and the matching coefficient S of the keyword type is calculated as:

After the matching coefficient of each keyword type is obtained, the matching coefficient needs to be compared with a preset matching threshold of each keyword type to obtain a quality inspection result, as shown in fig. 8, in some implementation processes, an obtaining method is provided:

s51: and comparing the matching coefficients of the keyword types with preset matching threshold values respectively to obtain quality inspection results.

In step S51, it is exemplarily illustrated that after the matching coefficients of the keyword types are obtained, the matching coefficients of the keyword types are directly compared with the preset matching thresholds of the keyword types, and the quality inspection result is obtained according to the comparison result, for example, there are three keyword types, which are respectively the standard keyword, the forbidden keyword, and the emotion keyword, the matching coefficients of the three keyword types are respectively S1, S2, and S3, the matching thresholds of the preset three keyword types are respectively T1, T2, and T3, and when S1> T1, S2> T2, and S3> T3, the three keyword types are all considered to be qualified, and the quality inspection result is qualified.

In some implementations, there is also provided an acquisition method:

s52: setting sampling weight for each keyword type, and acquiring a keyword type sampling coefficient according to the matching coefficient of the sampling weight and the keyword type, wherein the mathematical expression of the keyword type sampling coefficient SP is as follows:

SP＝S*W

wherein, SP is the sampling coefficient of the keyword type, S is the matching coefficient of the keyword type, and W is the sampling weight of the keyword type;

s53: and comparing the sampling coefficient with a preset matching threshold value to obtain a quality inspection result.

In step S52, it is exemplarily illustrated that after the matching coefficients of the keyword types are obtained, a sampling weight is preset for each keyword type, and the matching coefficient of each keyword type is multiplied by the sampling weight of that keyword type and the products are accumulated to obtain the sampling coefficient. For example, if there are three keyword types, namely the standard keyword, the banned keyword, and the emotion keyword, whose matching coefficients are S1, S2, and S3 and whose sampling weights are W1, W2, and W3 respectively, the sampling coefficient SP is calculated as: SP = S1*W1 + S2*W2 + S3*W3.

Through the above steps, when an enterprise attaches different degrees of importance to different task types, the quality inspection result of each task type can be evaluated comprehensively, so that a more accurate and more reasonable voice quality inspection result is obtained.

In step S53, it is exemplarily explained that after the sampling coefficient is obtained, it is compared with a preset matching threshold to obtain a quality inspection result. For example, there are three keyword types, namely the standard keyword, the banned keyword, and the emotion keyword; the matching coefficients of the three keyword types are S = {S1, S2, S3} and their sampling weights are W = {W1, W2, W3}, so the sampling coefficient SP is calculated as SP = S1*W1 + S2*W2 + S3*W3; the preset matching threshold is T, and if SP > T, the quality inspection result is considered qualified.
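Steps S52 and S53 amount to a weighted sum followed by one comparison. The sketch below uses hypothetical function names and made-up numeric values, chosen only so that the arithmetic is easy to follow.

```python
def sampling_coefficient(S, W):
    """SP = S1*W1 + S2*W2 + ... for matching coefficients S and sampling weights W."""
    if len(S) != len(W):
        raise ValueError("S and W must have one entry per keyword type")
    return sum(s * w for s, w in zip(S, W))


def is_qualified(SP, T):
    """The quality inspection result is qualified when SP exceeds the threshold T."""
    return SP > T


# Illustrative values for the standard, banned, and emotion keyword types.
S = [0.5, 1.0, 0.75]
W = [0.5, 0.25, 0.25]
SP = sampling_coefficient(S, W)  # 0.25 + 0.25 + 0.1875
result = is_qualified(SP, T=0.6)
```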

In some implementations, as shown in fig. 9, for step S52, the method of obtaining sampling coefficients further includes:

after the matching coefficients of the keyword types are obtained, a sampling weight is preset for each keyword type, and the matching coefficient of each keyword type is multiplied by the sampling weight of that keyword type to obtain the sampling coefficients, which are kept in vector form, one per keyword type; for example, if there are three keyword types, namely the standard keyword, the banned keyword, and the emotion keyword, whose matching coefficients are {S1, S2, S3} and whose sampling weights are {W1, W2, W3}, the sampling coefficient SP is calculated as: SP = {SP1, SP2, SP3} = {S1 × W1, S2 × W2, S3 × W3}.

In some implementations, for step S53, the step of obtaining the quality inspection result further includes:

after a sampling coefficient in vector form is obtained, a matching threshold in vector form is preset, and the sampling coefficient in vector form is compared with the matching threshold in vector form to obtain a quality inspection result; for example, there are three keyword types, namely the standard keyword, the banned keyword, and the emotion keyword, whose matching coefficients are {S1, S2, S3} and whose sampling weights are {W1, W2, W3}, so the sampling coefficient SP is calculated as SP = {SP1, SP2, SP3} = {S1 × W1, S2 × W2, S3 × W3}; the preset matching threshold is {T1, T2, T3}, and if SP1 > T1, SP2 > T2, and SP3 > T3, the quality inspection result is considered qualified.
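The vector-form comparison can be sketched as an element-wise check that also reports which keyword types failed. The function name `vector_quality_check`, the dictionary keys, and the numeric values are assumptions for illustration only.

```python
def vector_quality_check(S, W, T):
    """Compare SPk = Sk * Wk with its own threshold Tk for every keyword type.

    S, W, T are dicts keyed by keyword type. Returns (SP, failed), where failed
    lists the keyword types whose sampling coefficient does not exceed its
    threshold. The inspection is qualified only when failed is empty.
    """
    SP = {ktype: S[ktype] * W[ktype] for ktype in S}
    failed = [ktype for ktype in SP if not SP[ktype] > T[ktype]]
    return SP, failed


S = {"standard": 1.0, "banned": 0.25, "emotion": 1.0}
W = {"standard": 0.5, "banned": 0.5, "emotion": 0.5}
T = {"standard": 0.25, "banned": 0.25, "emotion": 0.25}
SP, failed = vector_quality_check(S, W, T)
```

With these values the banned keyword type fails on its own threshold, although the summed coefficient 1.125 would exceed a single overall threshold of 1.0. The vector form therefore prevents a weak keyword type from being masked by strong ones.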

It should be understood that although the steps in the flow charts of figs. 2-9 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in figs. 2-9 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 10, there is provided a voice quality inspection apparatus, the voice quality inspection apparatus including a quality inspection result obtaining module, the quality inspection result obtaining module including:

and the second acquisition unit compares the sampling coefficient with a preset matching threshold value to acquire a quality inspection result.

Through the quality inspection result acquisition module, enterprises can analyze and evaluate the importance degree of different keyword types, set sampling weight for each keyword type according to different requirements, and then preset a matching threshold value to comprehensively evaluate the matching result of each keyword type.

In the first obtaining unit, for example, after the matching coefficient of each keyword type is obtained, a sampling weight is preset for each keyword type, and the matching coefficient of each keyword type is multiplied by the sampling weight of that keyword type and the products are accumulated to obtain the sampling coefficient. For example, if there are three keyword types, namely the standard keyword, the banned keyword, and the emotion keyword, whose matching coefficients are S1, S2, and S3 and whose sampling weights are W1, W2, and W3 respectively, the sampling coefficient SP is calculated as: SP = S1*W1 + S2*W2 + S3*W3.

In the second obtaining unit, it is exemplarily explained that after the sampling coefficient is obtained, it is compared with a preset matching threshold to obtain a quality inspection result. For example, there are three keyword types, namely the standard keyword, the banned keyword, and the emotion keyword; the matching coefficients of the three keyword types are S = {S1, S2, S3} and their sampling weights are W = {W1, W2, W3}, so the sampling coefficient SP is calculated as SP = S1*W1 + S2*W2 + S3*W3; the preset matching threshold is T, and if SP > T, the quality inspection result is considered qualified.

In some implementations, the first obtaining unit further includes: after the matching coefficient of each keyword type is obtained, presetting a sampling weight for each keyword type, multiplying the matching coefficient of each keyword type by the sampling weight of that keyword type to obtain the sampling coefficients, and keeping them in vector form, one per keyword type. For example, if there are three keyword types, namely the standard keyword, the banned keyword, and the emotion keyword, whose matching coefficients are S = {S1, S2, S3} and whose sampling weights are W = {W1, W2, W3}, the sampling coefficient SP is calculated as: SP = {S1 × W1, S2 × W2, S3 × W3}.

In some implementations, the second obtaining unit further includes: after a sampling coefficient in vector form is obtained, presetting a matching threshold in vector form, and comparing the sampling coefficient in vector form with the matching threshold in vector form to obtain a quality inspection result. For example, there are three keyword types, namely the standard keyword, the banned keyword, and the emotion keyword, whose matching coefficients are S = {S1, S2, S3} and whose sampling weights are W = {W1, W2, W3}, so the sampling coefficient is calculated as SP = {SP1, SP2, SP3} = {S1 × W1, S2 × W2, S3 × W3}; the preset matching threshold is T = {T1, T2, T3}, and if SP1 > T1, SP2 > T2, and SP3 > T3, the quality inspection result is considered qualified.

As shown in fig. 11, a voice quality inspection apparatus further includes a voice-to-text conversion module, a task parameter configuration module, a matching information acquisition module, a matching coefficient acquisition module, and a quality inspection report acquisition module. Voice data is converted into text data, and the text data is subjected to sentence-breaking processing to obtain text segments. Keyword types are then configured according to the task type, keywords corresponding to the keyword types are retrieved, and the keyword library is updated. Keyword texts are selected from the keyword library and compared with the text segments to obtain matching information of the keyword texts. The number of selected keyword texts is then compared with the matching information of the keyword texts to obtain the matching coefficients of the keyword types. Finally, the matching coefficients of the keyword types are compared with a preset matching threshold to obtain the quality inspection result of the task type.

By the device, the problems of complex operation steps, low execution efficiency, poor quality inspection quality, high labor cost, high subjectivity and the like existing in the process of arranging and quality inspection of voice communication records generated between customer service and customers in a manual mode can be solved, the voice data are converted into the text data, the text data are compared with keyword texts in a keyword library, matching rules are set, voice quality inspection results are obtained, the defects existing in the manual quality inspection mode can be avoided, the enterprise cost is reduced, the voice quality inspection efficiency is improved, and customer service performance assessment and customer service satisfaction systematization are achieved.

In some embodiments, the step of the task parameter configuration module comprises:

and updating a keyword library according to the keywords.

In some embodiments, the quality inspection result obtaining module comprises:

In some embodiments, after the matching coefficient of each single configuration is obtained, a sampling weight is set for each single configuration, the matching coefficient of each single configuration is multiplied by the sampling weight of each single configuration and then accumulated to obtain the matching coefficient of each keyword type, then the sampling weight is set for each keyword type, the matching coefficient of each keyword type is multiplied by the sampling weight of each keyword type and then accumulated to obtain a sampling coefficient, then the obtained sampling coefficient is compared with a preset matching threshold, if the sampling coefficient is greater than the matching threshold, the voice quality inspection result is considered to be qualified, and if not, the voice quality inspection result is not qualified.
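The two-level weighting described in this embodiment can be sketched end to end. Everything named here (`weighted_sum`, `inspect_call`, the dictionary layout, and the numbers) is hypothetical and serves only to make the flow of values concrete.

```python
def weighted_sum(values, weights):
    """Accumulate value * weight over paired entries."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


def inspect_call(type_configs, type_weights, threshold):
    """Return (S, SP, qualified) for one call.

    type_configs: {keyword type: (single-config coefficients p, their weights w)}
    type_weights: {keyword type: sampling weight W of that keyword type}
    """
    # Level 1: matching coefficient S of each keyword type from its single configurations.
    S = {ktype: weighted_sum(p, w) for ktype, (p, w) in type_configs.items()}
    # Level 2: overall sampling coefficient SP from the keyword-type coefficients.
    SP = sum(S[ktype] * type_weights[ktype] for ktype in S)
    return S, SP, SP > threshold


type_configs = {
    "standard": ([1.0, 0.5], [0.5, 0.5]),
    "banned": ([1.0], [1.0]),
    "emotion": ([0.5], [1.0]),
}
type_weights = {"standard": 0.5, "banned": 0.25, "emotion": 0.25}
S, SP, qualified = inspect_call(type_configs, type_weights, threshold=0.7)
```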

In some embodiments, after the matching coefficient of each single configuration of the standard keyword type is obtained, it is compared with the preset threshold of the corresponding single configuration of the standard keyword type; if the coefficient of every single configuration of the standard keyword type is greater than its preset threshold, the standard keyword type is qualified. The matching coefficient of each single configuration of the banned keyword type is then obtained and compared with the preset threshold of the corresponding single configuration of the banned keyword type; if the coefficient of every single configuration of the banned keyword type is greater than its preset threshold, the banned keyword type is qualified. Finally, the matching coefficient of each single configuration of the emotion keyword type is obtained and compared with the preset threshold of the corresponding single configuration of the emotion keyword type; if the coefficient of every single configuration of the emotion keyword type is greater than its preset threshold, the emotion keyword type is qualified. If the standard keyword type, the banned keyword type, and the emotion keyword type are all qualified, the voice quality inspection result is considered qualified; if at least one of them is unqualified, the voice quality inspection result is considered unqualified.
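This all-must-pass rule can be sketched with two nested checks. The function names, the dictionary layout, and the sample coefficients and thresholds are assumptions for illustration.

```python
def keyword_type_qualified(coefficients, thresholds):
    """A keyword type is qualified when every single configuration exceeds its threshold."""
    if len(coefficients) != len(thresholds):
        raise ValueError("one threshold is required per single configuration")
    return all(p > t for p, t in zip(coefficients, thresholds))


def inspection_qualified(types):
    """types: {keyword type: (coefficients, thresholds)}; all types must be qualified."""
    return all(keyword_type_qualified(p, t) for p, t in types.values())


passing = inspection_qualified({
    "standard": ([0.75, 1.0], [0.5, 0.5]),
    "banned": ([1.0], [0.9]),
    "emotion": ([0.5], [0.25]),
})
failing = inspection_qualified({
    "standard": ([0.75, 1.0], [0.5, 0.5]),
    "banned": ([0.5], [0.9]),  # a banned keyword text was matched too often
    "emotion": ([0.5], [0.25]),
})
```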

For the specific limitation of the voice quality inspection device, reference may be made to the above limitation of the voice quality inspection method, and details are not described herein again. All or part of each module in the voice quality inspection device can be realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 12. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used for storing voice quality inspection data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement a voice quality inspection method.

Those skilled in the art will appreciate that the architecture shown in fig. 12 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments express only several embodiments of the present application, and their description is relatively specific and detailed, but they are not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A voice quality inspection method is characterized by comprising the following steps:

2. The voice quality inspection method according to claim 1, wherein the step of configuring a keyword type according to the task type, retrieving a keyword corresponding to the keyword type, and updating the keyword library comprises:

acquiring corresponding keywords according to the meanings and application scenes of the types of the keywords;

and updating a keyword library according to the keywords.

3. The voice quality inspection method according to claim 1, wherein the step of selecting a keyword text from the keyword library, comparing the keyword text with the text fragment, and obtaining matching information of the keyword text comprises:

4. The voice quality inspection method according to claim 1 or 3, wherein the step of comparing the number of the selected keyword texts with the matching information of the keyword texts to obtain the matching coefficients of the keyword types comprises:

sp＝w*p

comparing the single configuration sampling coefficient with a preset single configuration matching threshold value to obtain a matching coefficient of a keyword type, wherein a mathematical expression of the keyword type matching coefficient S is as follows:

5. The voice quality inspection method according to claim 1, wherein the step of comparing the matching coefficient of the keyword type with a preset matching threshold to obtain the quality inspection result of the task type comprises:

6. The voice quality inspection method according to claim 1 or 5, wherein the step of comparing the matching coefficient of the keyword type with a preset matching threshold value to obtain the quality inspection result of the task type further comprises:

setting sampling weight for each keyword type, and acquiring a keyword type sampling coefficient according to the matching coefficient of the sampling weight and the keyword type, wherein the mathematical expression of the keyword type sampling coefficient SP is as follows:

SP＝S*W

7. The voice quality inspection method according to claim 1, wherein the step of converting voice data into text data, performing sentence-breaking processing on the text data, and obtaining text segments comprises:

performing sentence breaking on the text data according to the time separation label, and adding punctuation marks at the end of the sentence according to the color vocabulary at the end of the sentence;

8. A voice quality inspection apparatus, comprising:

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the voice quality inspection method according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, carries out the steps of the voice quality inspection method according to any one of claims 1 to 7.