CN111445928A

CN111445928A - Voice quality inspection method, device, equipment and storage medium

Info

Publication number: CN111445928A
Application number: CN202010249540.9A
Authority: CN
Inventors: 张超
Original assignee: WeBank Co Ltd
Current assignee: WeBank Co Ltd
Priority date: 2020-03-31
Filing date: 2020-03-31
Publication date: 2020-07-24

Abstract

The application discloses a voice quality inspection method, device, equipment and storage medium. The method comprises the following steps: receiving voice data to be quality-tested, and performing text conversion processing on the voice data to be quality-tested to obtain text data to be quality-tested; identifying whether each preset quality inspection item exists in the text data to be quality-tested to obtain a target recognition result; and evaluating the voice data to be quality-tested according to the target recognition result. The application solves the technical problems in the prior art that voice quality inspection is performed manually, quality inspection efficiency is low, and quality inspection is incomplete.

Description

Voice quality inspection method, device, equipment and storage medium

Technical Field

The present application relates to the field of artificial intelligence technology for financial technology (Fintech), and in particular, to a voice quality inspection method, apparatus, device, and storage medium.

Background

With the continuous development of financial technology, especially Internet-based finance, more and more technologies are being applied in the financial field. The financial industry in turn places higher requirements on these technologies, for example on voice quality inspection.

Currently, enterprises conduct voice calls through customer service telephone lines and the like, for example financial collection by telephone. To evaluate the quality of telephone customer service and of collection calls, quality inspection is often performed on audio files such as the recordings of those calls. Quality inspection here means extracting the customer service voice content, determining whether that content contains the corresponding quality inspection items, and then assigning a quality inspection score. In the existing process, extraction and scoring are done manually. Manual spot checks have limited coverage, consume a large amount of human resources, and make comprehensive inspection difficult.

Disclosure of Invention

The application mainly aims to provide a voice quality inspection method, a voice quality inspection device, voice quality inspection equipment and a storage medium, and aims to solve the technical problems that voice quality inspection is performed in a manual mode in the prior art, quality inspection efficiency is low, and quality inspection is incomplete.

In order to achieve the above object, the present application provides a voice quality inspection method, including:

receiving voice data to be quality-tested, and performing text conversion processing on the voice data to be quality-tested to obtain text data to be quality-tested;

identifying whether each preset quality inspection item exists in the text data to be inspected to obtain a target identification result;

and evaluating the voice data to be quality-checked according to the target recognition result.

Optionally, each preset quality inspection item is composed of each preset quality inspection element;

the step of identifying whether the text data to be quality-tested has each preset quality-testing item or not to obtain a target identification result comprises the following steps:

identifying whether the text data to be quality-tested has each preset quality-testing element or not through a first identification submodel in a preset quality-testing model to obtain an initial identification result;

acquiring initial text data corresponding to the initial identification result, inputting the initial text data into a second identification submodel in the preset quality inspection model, and identifying the strength of each preset quality inspection element of the initial text data through the second identification submodel to obtain the target identification result;

the first recognition submodel is obtained by performing iterative training on a first preset prediction model to be trained based on a first preset text set with each preset quality inspection element label through executing a first preset training process, and the second recognition submodel is obtained by performing iterative training on a second preset prediction model to be trained based on a second preset text set with each strength label of each preset quality inspection element through executing a second preset training process.

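Optionally, the step of identifying whether the text data to be quality-tested has each preset quality-testing item or not to obtain a target identification result comprises the following steps:

performing matching processing of a preset first pattern on each first text at each preset determined position in the text data to be quality-tested, to obtain a first identifier result indicating whether each first text has a preset quality testing item;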

positioning and obtaining each second text associated with each preset quality inspection element from the text data to be inspected, and inputting each second text into a preset classification model to obtain a predicted quality inspection element label of each second text so as to obtain a second identifier result;

and setting the first identifier result and the second identifier result as the target recognition result.

Optionally, the step of setting the first identifier result and the second identifier result as the target recognition result includes:

taking the first text and the second text as candidate texts, and performing matching processing of a preset second pattern to determine whether a misjudgment of a preset quality inspection element exists, so as to obtain a misjudgment identification result;

and carrying out misjudgment clearing processing on the first identifier result and the second identifier result based on the misjudgment identification result to obtain a target identification result.

Optionally, before the step of obtaining, by positioning, each second text associated with each preset quality inspection element from the text data to be quality inspected, and inputting each second text into a preset classification model, the method includes:

acquiring preset classified text data, extracting a feature matrix from the preset classified text data based on a preset convolutional layer and a preset pooling layer in a third preset to-be-trained prediction model, and performing prediction processing on the feature matrix according to a preset activation layer of the third preset to-be-trained prediction model to obtain a predicted quality inspection element label set;

and comparing the predicted quality inspection element label set with an actual label set in the preset classification text data to obtain a comparison result, so as to adjust the parameters of the preset convolution layer and the preset pooling layer until the similarity between the predicted quality inspection element label set and the actual label set is greater than a preset similarity threshold value, so as to obtain the preset classification model.

Optionally, the step of evaluating the voice data to be quality-checked according to the target recognition result comprises the following steps:

obtaining the scoring weight ratio of each preset quality testing item, and obtaining the score of the voice data to be quality tested according to the scoring weight ratio and the target identification result;

and forming a scoring report based on the scoring of the voice data to be quality tested, and reporting the scoring report.

Optionally, the step of receiving the voice data to be quality-tested, and performing text conversion processing on the voice data to be quality-tested to obtain text data to be quality-tested includes:

receiving voice data to be quality-checked, and performing text conversion processing on the voice data to be quality-checked to obtain first processed data;

and carrying out preset preprocessing on the first processing data to obtain text data to be inspected.

The application also provides a voice quality inspection device, the voice quality inspection device includes:

the receiving module is used for receiving voice data to be subjected to quality inspection and performing text conversion processing on the voice data to be subjected to quality inspection to obtain text data to be subjected to quality inspection;

the identification module is used for identifying whether the text data to be subjected to quality inspection has each preset quality inspection item or not to obtain a target identification result;

and the evaluation module is used for evaluating the voice data to be subjected to quality inspection according to the target identification result.

Optionally, the identification module comprises:

the identification unit is used for identifying whether the text data to be quality-tested has each preset quality-testing element or not through a first identification submodel in a preset quality-testing model to obtain an initial identification result;

the first acquisition unit is used for acquiring initial text data corresponding to the initial identification result, inputting the initial text data into a second identification submodel in the preset quality inspection model, and identifying the strength of each preset quality inspection element of the initial text data through the second identification submodel to obtain the target identification result;

the identification module further comprises:

the matching unit is used for performing matching processing of a preset first pattern on each first text at each preset determined position in the text data to be quality tested, to obtain a first identifier result indicating whether each first text has a preset quality testing item;

the positioning unit is used for positioning and obtaining each second text related to each preset quality inspection element from the text data to be quality inspected, inputting each second text into a preset classification model, and obtaining a predicted quality inspection element label of each second text so as to obtain a second identifier result;

a setting unit, configured to set the first identifier result and the second identifier result as the target identification result.

Optionally, the setting unit includes:

the matching subunit is used for performing matching processing of a preset second mode by taking the first text and the second text as candidate texts so as to determine whether the false judgment of the preset quality inspection element exists or not and obtain a false judgment identification result;

and the clearing subunit is used for carrying out misjudgment clearing processing on the first identifier result and the second identifier result based on the misjudgment identification result to obtain a target identification result.

Optionally, the voice quality inspection apparatus further includes:

the acquisition module is used for acquiring preset classified text data, extracting a feature matrix from the preset classified text data based on a preset convolution layer and a preset pooling layer in a third preset to-be-trained prediction model, and performing prediction processing on the feature matrix according to a preset activation layer of the third preset to-be-trained prediction model to obtain a predicted quality inspection element label set;

and the comparison module is used for comparing the predicted quality inspection element label set with an actual label set in the preset classification text data to obtain a comparison result so as to adjust the parameters of the preset convolution layer and the preset pooling layer until the similarity between the predicted quality inspection element label set and the actual label set is greater than a preset similarity threshold value to obtain the preset classification model.

Optionally, the evaluation module comprises:

the second acquisition unit is used for acquiring the scoring weight ratio of each preset quality inspection item and acquiring the score of the voice data to be quality inspected according to the scoring weight ratio and the target identification result;

and the reporting module is used for forming a scoring report based on the scoring of the voice data to be quality tested and reporting the scoring report.

Optionally, the receiving module includes:

the receiving unit is used for receiving voice data to be subjected to quality inspection and performing text conversion processing on the voice data to be subjected to quality inspection to obtain first processed data;

and the preprocessing unit is used for carrying out preset preprocessing on the first processing data to obtain text data to be inspected.

The application also provides voice quality inspection equipment. The voice quality inspection equipment is a physical device and includes: a memory, a processor, and a program of the voice quality inspection method that is stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the voice quality inspection method.

The present application also provides a storage medium having a program stored thereon for implementing the voice quality inspection method, wherein the program implements the steps of the voice quality inspection method when executed by a processor.

In the application, voice data to be quality-tested is received and subjected to text conversion processing to obtain text data to be quality-tested; whether each preset quality inspection item exists in the text data to be quality-tested is identified to obtain a target recognition result; and the voice data to be quality-tested is evaluated according to the target recognition result. Because the preset quality inspection items are identified automatically and the voice data is evaluated automatically from the target recognition result, rather than by manual inspection, resources are saved and quality inspection efficiency is improved. Because quality inspection is performed on the voice data once it is received, all voice data to be quality-tested can be inspected, which improves the comprehensiveness of quality inspection. This solves the technical problems in the prior art that voice quality inspection is performed manually, quality inspection efficiency is low, and quality inspection is incomplete.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Those skilled in the art can obtain other drawings from these drawings without inventive effort.

FIG. 1 is a flowchart illustrating a voice quality inspection method according to a first embodiment of the present application;

FIG. 2 is a schematic flowchart of the detailed steps of identifying whether each preset quality inspection item exists in the text data to be quality inspected to obtain a target identification result, in the first embodiment of the voice quality inspection method according to the present application;

FIG. 3 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present application.

The objectives, features, and advantages of the present application will be further described with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In a first embodiment of the voice quality inspection method, referring to fig. 1, the voice quality inspection method includes:

step S10, receiving voice data to be quality-tested, and performing text conversion processing on the voice data to be quality-tested to obtain text data to be quality-tested;

step S20, identifying whether the text data to be quality tested has each preset quality testing item to obtain a target identification result;

and step S30, evaluating the voice data to be inspected according to the target recognition result.
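
As a minimal sketch of how steps S10 to S30 chain together, the following Python fragment passes the output of each step to the next. The function names `transcribe`, `recognize_items` and `evaluate`, the keyword table, and the toy stand-ins are hypothetical placeholders chosen for illustration; the application does not prescribe them.

```python
from typing import Callable, Dict

def inspect_voice(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    recognize_items: Callable[[str], Dict[str, bool]],
    evaluate: Callable[[Dict[str, bool]], float],
) -> float:
    """Run steps S10, S20 and S30 in order and return the evaluation."""
    text = transcribe(audio)          # S10: voice data -> text data
    result = recognize_items(text)    # S20: text -> target recognition result
    return evaluate(result)           # S30: recognition result -> evaluation

# Toy stand-ins (assumptions): a real system would use an ASR model and
# trained recognizers instead of these functions.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")

ITEM_KEYWORDS = {"opening remarks": "hello", "closing remarks": "goodbye"}

def keyword_recognizer(text: str) -> Dict[str, bool]:
    lowered = text.lower()
    return {item: kw in lowered for item, kw in ITEM_KEYWORDS.items()}

def fraction_present(result: Dict[str, bool]) -> float:
    return sum(result.values()) / len(result)

score = inspect_voice(
    b"Hello, this is xx bank.", fake_transcribe, keyword_recognizer, fraction_present
)
```

Passing the three stages in as callables keeps the sketch independent of any particular speech or text model.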

The method comprises the following specific steps:

In this embodiment, it should be noted that the voice quality inspection method is applied to a voice quality inspection system. The system belongs to the voice quality inspection equipment and is communicatively connected to a collection-call customer service platform, which records the call content of each collection agent to obtain recorded content. The voice data to be quality-tested is obtained from the recorded content. Specifically, because the recorded content also contains the speech of the person being collected from, the recorded content is cut to remove that person's speech (which is determined by a voice identification technology), in order to reduce the amount of data to be processed. The remaining recording of the collection agent is taken as the voice data to be quality-tested and sent to the voice quality inspection system. After receiving the voice data to be quality-tested, the voice quality inspection system performs text conversion processing on it to obtain the text data to be quality-tested.
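
The cutting step described above can be illustrated as follows, assuming the recording has already been split into segments that carry a speaker label. The `Segment` type and the labels `"agent"` and `"customer"` are hypothetical; the application does not specify how the speakers are told apart.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Segment:
    speaker: str   # hypothetical label: "agent" or "customer"
    start: float   # start time in seconds
    end: float     # end time in seconds

def keep_agent_segments(segments: List[Segment], agent_label: str = "agent") -> List[Segment]:
    """Drop every segment not spoken by the agent, reducing the data to process."""
    return [s for s in segments if s.speaker == agent_label]

recording = [
    Segment("agent", 0.0, 4.2),
    Segment("customer", 4.2, 6.0),
    Segment("agent", 6.0, 11.5),
]
to_inspect = keep_agent_segments(recording)
kept_seconds = sum(s.end - s.start for s in to_inspect)
```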

The step of receiving voice data to be quality-checked and performing text conversion processing on the voice data to be quality-checked to obtain text data to be quality-checked comprises:

step S11, receiving voice data to be quality tested, and performing text conversion processing on the voice data to be quality tested to obtain first processed data;

In this embodiment, after the voice data to be quality-tested is obtained, text conversion processing is performed on it by a preset speech conversion model, for example a preset ASR (automatic speech recognition) model, so as to obtain the first processed data.

And step S12, performing preset preprocessing on the first processed data to obtain text data to be inspected.

After the first processed data is obtained, preset preprocessing needs to be performed on it, because the text conversion may introduce errors and the speaker's wording may be colloquial. Specifically, preprocessing such as word segmentation, error correction, and rewriting is performed on the first processed data by preset techniques such as NLP (Natural Language Processing), so as to obtain the text data to be quality-tested.
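
A minimal sketch of such preprocessing is shown below. The correction table, the filler-word list, and the regular-expression tokenizer are illustrative assumptions; for Chinese transcripts a dedicated word segmenter would take the place of the tokenizer.

```python
import re
from typing import Dict, List

# Hypothetical tables; a production system would learn or curate these.
CORRECTIONS: Dict[str, str] = {"re-payment": "repayment", "over due": "overdue"}
FILLERS = {"um", "uh", "er"}

def preprocess(raw: str) -> List[str]:
    text = raw.lower()
    for wrong, right in CORRECTIONS.items():      # error correction
        text = text.replace(wrong, right)
    tokens = re.findall(r"[a-z0-9']+", text)      # word segmentation
    return [t for t in tokens if t not in FILLERS]  # rewriting: drop filler words

tokens = preprocess("Um, your re-payment is over due")
```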

After the text data to be quality-checked is obtained, whether it has each preset quality inspection item is identified to obtain a target recognition result. A quality inspection item is a category of scripted speech, such as business knowledge, pressure points, communication skills, opening remarks, and closing remarks. For example, an opening remark may be "Hello! This is a staff member of xx bank calling. May I ask, is this Mr./Ms. XXX?". Identifying whether the text data to be quality-checked has each preset quality inspection item means checking for all of the preset quality inspection items, and each preset quality inspection item comprises a plurality of preset quality inspection elements. For example, the quality inspection item "pressure point" comprises quality inspection elements such as: informing the number of days overdue; informing that overdue interest and the corresponding penalty are charged; informing that the overdue status is reported to the central bank credit report; informing the severity of the credit report; informing about the bank mailing letters to the customer's employer or home for overdue accounts; and informing that the credit record must be settled before services can be handled. The quality inspection item "business knowledge" comprises quality inspection elements such as repaying in the application, repaying, and checking whether the repayment succeeded.

Therefore, identifying whether the text data to be quality-checked has each preset quality inspection item means identifying whether it has each preset quality inspection element in each preset quality inspection item, and this identification yields the target recognition result.
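
The relationship between quality inspection items and their elements can be pictured as a two-level mapping. The sketch below uses abbreviated, hypothetical element names and assumes the set of elements found in a call is already known.

```python
from typing import Dict, List, Set

# Hypothetical, abbreviated configuration of items and their elements.
QUALITY_ITEMS: Dict[str, List[str]] = {
    "pressure point": [
        "inform overdue days",
        "inform overdue interest and penalty",
        "inform credit reporting",
    ],
    "business knowledge": [
        "repay in the app",
        "check whether repayment succeeded",
    ],
}

def element_flags(found: Set[str]) -> Dict[str, Dict[str, bool]]:
    """For every item, record which of its elements were found in the text."""
    return {
        item: {element: element in found for element in elements}
        for item, elements in QUALITY_ITEMS.items()
    }

flags = element_flags({"inform overdue days", "repay in the app"})
```

How element flags roll up into a per-item result (for example, requiring all elements or only some) is left open here, since the text above does not fix a rule.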

Each preset quality inspection item consists of preset quality inspection elements;

In this embodiment, a first way of obtaining the target recognition result (way one) is provided: the step of identifying whether the text data to be quality-tested has each preset quality-testing item or not to obtain a target identification result comprises the following steps:

step S21, identifying whether the text data to be quality-tested has each preset quality-testing element through a first identification submodel in a preset quality-testing model to obtain an initial identification result;

step S22, acquiring initial text data corresponding to the initial recognition result, inputting the initial text data into a second recognition submodel in the preset quality inspection model, and recognizing the strength of each preset quality inspection element of the initial text data through the second recognition submodel to obtain the target recognition result;

In this embodiment, the preset quality inspection model comprises a first recognition submodel and a second recognition submodel. The first recognition submodel identifies whether the text data to be inspected has each preset quality inspection element, giving an initial recognition result. The second recognition submodel identifies the strength label of each preset quality inspection element corresponding to the initial recognition result, giving the target recognition result. Because this embodiment additionally recognizes the strength label of each preset quality inspection element, and does so only on the initial text data, the depth of recognition is increased while the amount of data to be recognized is reduced, which improves both the accuracy and the efficiency of recognition.

In this embodiment, the first recognition submodel identifies whether the text data to be quality-checked has each preset quality-check element, to obtain the initial recognition result. The first recognition submodel is obtained by executing a first preset training process, in which a first preset prediction model to be trained is iteratively trained on a first preset text set labeled with each preset quality-check element, so that the text data to be quality-checked can be recognized accurately. Specifically, during the first preset training process, first feature extraction is performed on the first preset text set through a first upsampling layer in the first preset prediction model to be trained, to obtain a first extracted feature. The first extracted feature is then predicted through an activation layer in the first preset prediction model to be trained, to obtain a first prediction result. The first prediction result is compared with a first actual result pre-stored in the first preset text set to obtain their similarity, and the parameters of the first upsampling layer are adjusted continuously according to that similarity until it is greater than a first preset recognition similarity, thereby obtaining the first recognition submodel.

In this embodiment, initial text data corresponding to the initial recognition result is obtained and input into the second recognition submodel in the preset quality inspection model, and the second recognition submodel recognizes the strength of each preset quality inspection element of the initial text data (determined by characteristics such as speech rate and tone), so as to obtain the target recognition result. The second recognition submodel is obtained by executing a second preset training process, in which a second preset prediction model to be trained is iteratively trained on a second preset text set labeled with each strength of each preset quality inspection element. Specifically, during the second preset training process, second feature extraction is performed on the second preset text set through a second upsampling layer in the second preset prediction model to be trained, to obtain a second extracted feature. The second extracted feature is then predicted through an activation layer in the second preset prediction model to be trained, to obtain a second prediction result. The second prediction result is compared with a second actual result pre-stored in the second preset text set to obtain their similarity, and the parameters of the second upsampling layer are adjusted continuously according to that similarity until it is greater than a second preset recognition similarity, thereby obtaining the second recognition submodel.
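
The stopping rule shared by both training processes (adjust parameters until the similarity between predictions and stored labels exceeds a preset value) can be sketched with a deliberately simple stand-in model. A single-layer perceptron is used here purely for illustration; it is not the network described in the application, and the feature encoding, threshold, and epoch limit are assumptions.

```python
from typing import List, Tuple

Sample = Tuple[List[int], int]  # (feature vector, label)

def predict(weights: List[float], bias: float, x: List[int]) -> int:
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_until_similar(samples: List[Sample], threshold: float = 0.99,
                        max_epochs: int = 100, lr: float = 1.0):
    """Adjust parameters until prediction/label agreement exceeds the threshold."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    similarity = 0.0
    for _ in range(max_epochs):
        for x, y in samples:
            error = y - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
        # "Similarity" here is the share of samples predicted correctly.
        similarity = sum(predict(weights, bias, x) == y for x, y in samples) / len(samples)
        if similarity > threshold:
            break
    return weights, bias, similarity

# Toy features (assumed): [mentions "overdue", mentions "hello"]; label 1 = element present.
data = [([1, 0], 1), ([0, 1], 0), ([1, 1], 1), ([0, 0], 0)]
weights, bias, similarity = train_until_similar(data)
```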

In the embodiment, the preset quality inspection model is trained in advance, so that the target recognition result can be automatically obtained by directly inputting the text data to be inspected, and a foundation is laid for efficiently obtaining the voice quality inspection evaluation.
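
Way one amounts to a two-stage pipeline in which the second stage runs only on text that the first stage matched. The sketch below uses keyword-based stand-ins for both submodels; `toy_detector`, `toy_strength`, and the strength labels are hypothetical, and a real second stage would rely on a trained model rather than keywords.

```python
from typing import Callable, Dict, List, Optional

def two_stage_recognition(
    sentences: List[str],
    detect_element: Callable[[str], Optional[str]],
    rate_strength: Callable[[str], str],
) -> Dict[str, str]:
    """Stage 1 finds elements; stage 2 rates strength only for matched text."""
    target: Dict[str, str] = {}
    for sentence in sentences:
        element = detect_element(sentence)          # first recognition submodel
        if element is None:
            continue                                # unmatched text skips stage 2
        target[element] = rate_strength(sentence)   # second recognition submodel
    return target

def toy_detector(sentence: str) -> Optional[str]:
    return "inform overdue days" if "overdue" in sentence.lower() else None

def toy_strength(sentence: str) -> str:
    return "strong" if "must" in sentence.lower() else "weak"

result = two_stage_recognition(
    ["Hello, this is xx bank.", "Your loan is 10 days overdue and you must repay today."],
    toy_detector,
    toy_strength,
)
```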

Each preset quality inspection item consists of each preset quality inspection element;

Referring to FIG. 2, this embodiment provides another way of obtaining the target recognition result (way two): the step of identifying whether the text data to be quality-tested has each preset quality-testing item or not to obtain a target identification result comprises the following steps:

step S23, performing matching processing of a preset first pattern on each first text at each preset determined position in the text data to be quality tested, to obtain a first identifier result indicating whether each first text has a preset quality testing item;

In this way of obtaining the target recognition result, matching processing of a preset first pattern is first performed on each first text at each preset determined position in the text data to be quality checked (for example, the first text at the start position or the first text at the end position), to find a quality check item or a quality check element within a quality check item. For example, for the quality check element "xx bank staff" of the opening-remarks quality check item, a pattern such as r"(here)(xx bank staff|xx bank)(call you)" may be used as the preset first pattern. It should be noted that the preset first pattern generalizes to some extent and can cover multiple ways of expression, for example a sentence such as "this is xx bank calling you". It should also be noted that, because the preset first pattern is matched first on the first texts at the preset determined positions, a fast initial quality inspection of the voice data is obtained, which helps determine the scoring level or improves the efficiency of quality inspection (if the start position or the end position lacks the corresponding quality inspection item, the service level of the call is poor).
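
Position-restricted pattern matching of this kind can be sketched with Python's `re` module. The English pattern and the two-sentence window below are illustrative assumptions, not the pattern used in the application.

```python
import re
from typing import List

# Hypothetical English counterpart of an opening-remarks pattern.
OPENING_PATTERN = re.compile(
    r"(this is|here is)\s+(an?\s+)?(xx bank staff|xx bank)", re.IGNORECASE
)

def has_opening(sentences: List[str], window: int = 2) -> bool:
    """Check only the first `window` sentences, i.e. the start position."""
    return any(OPENING_PATTERN.search(s) for s in sentences[:window])

call = ["Hello! This is xx bank calling you.", "Your bill is overdue."]
late_greeting = ["Please repay.", "Thank you.", "This is xx bank."]

opening_found = has_opening(call)
opening_found_late = has_opening(late_greeting)
```

Restricting the search to the start (or end) of the transcript is what makes this check cheap: only a few sentences are scanned rather than the whole call.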

Step S24, positioning and obtaining each second text related to each preset quality inspection element from the text data to be quality inspected, inputting each second text into a preset classification model, and obtaining a predicted quality inspection element label of each second text to obtain a second identifier result;

Each second text associated with each preset quality inspection element is located in the text data to be inspected, specifically by word matching. This is generally used to find the quality inspection element to be detected (for example, "please repay the full amount before noon tomorrow") within a long text and to determine its start and end positions. In the text data to be quality-tested, several sentences are often run together with no punctuation between them, for example "please repay the full amount before noon tomorrow or it will be reported to the credit system". Therefore, all second texts in which the quality inspection element may be located need to be found by preset fuzzy matching. After each second text is obtained, it is input into a preset classification model to obtain the predicted quality inspection element label of that second text, and thus the second identifier result. That is, the preset classification model performs the accurate classification that yields the second identifier result.
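
One way to illustrate the fuzzy locating step is a sliding word window scored with `difflib.SequenceMatcher` from the Python standard library. The template sentence, the window size, and the 0.8 threshold are assumptions; the classification model that would consume the candidates is omitted.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

Hit = Tuple[float, int, int, str]  # (similarity, start word, end word, text)

def locate_candidates(text: str, template: str, min_ratio: float = 0.8) -> List[Hit]:
    """Slide a window over unpunctuated text and keep spans similar to the template."""
    words = text.split()
    n = len(template.split())
    hits: List[Hit] = []
    for start in range(max(len(words) - n + 1, 1)):
        window = " ".join(words[start:start + n])
        ratio = SequenceMatcher(None, window.lower(), template.lower()).ratio()
        if ratio >= min_ratio:
            hits.append((ratio, start, start + n, window))
    return hits

template = "please repay the full amount before noon tomorrow"
transcript = ("hello please repay the full amount before noon tomorrow "
              "or it will be reported to the credit system")
hits = locate_candidates(transcript, template)
best = max(hits)  # highest-similarity candidate span
```

Every span above the threshold is kept, so overlapping candidates are expected; the downstream classifier decides which of them actually carries the element.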

After the first identifier result and the second identifier result are obtained, they are combined and set as the target recognition result.

In this embodiment, after the target recognition result is obtained, the voice data to be quality-tested is evaluated according to it. For example, if the target recognition result indicates that the text data to be quality-tested has all the preset quality testing items, the voice data to be quality-tested receives full marks. Specifically, the step of evaluating the voice data to be quality-tested according to the target recognition result comprises:

step S31, obtaining the score weight ratio of each preset quality inspection item, and obtaining the score of the voice data to be quality inspected according to the score weight ratio and the target identification result;

In this embodiment, after the target recognition result is obtained, the preset quality inspection items that are present in the target recognition result and those that are absent from it are determined. The score of the voice data to be quality inspected is then obtained from the score weight ratios of these preset quality inspection items and the total score.
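The weighted scoring can be sketched as follows. The application does not give an explicit formula, so the sketch assumes one plausible reading: the score equals the total score multiplied by the share of weight carried by the items that were detected. The item names, the weights, and the function name `score_call` are hypothetical.

```python
def score_call(weights, detected, total_score=100.0):
    """Score one call from per-item weights and detection flags.

    weights:  {item_name: score weight}  (assumed non-negative)
    detected: {item_name: bool} taken from the target recognition result
    """
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    # Only items found in the transcript contribute their weight.
    earned = sum(w for item, w in weights.items() if detected.get(item, False))
    return total_score * earned / weight_sum


weights = {"opening": 20, "identity_check": 30, "communication_skill": 50}
full = score_call(weights, {"opening": True, "identity_check": True,
                            "communication_skill": True})
partial = score_call(weights, {"opening": True, "identity_check": True,
                               "communication_skill": False})
```

Under this assumption a call containing every preset quality inspection item receives the total score, which agrees with the full-marks example given earlier in this embodiment.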

And step S32, forming a score report based on the score of the voice data to be quality tested, and reporting the score report.

After the score of the voice data to be quality inspected is obtained, a score report is formed based on that score and reported, so as to facilitate subsequent inquiry or spot checks. During a spot check, the score in the score report is compared with the manually assigned score to determine whether the score in the score report is accurate.

Further, based on the first embodiment of the present application, in another embodiment of the present application, the step of setting the first recognizer result and the second recognizer result as the target recognition result includes:

step A1, taking the first text and the second text as candidate texts, and performing matching processing of a preset second mode to determine whether a false judgment of a preset quality inspection element exists or not to obtain a false judgment recognition result;

In this embodiment, after the first identifier result and the second identifier result are obtained, the first text and the second text are further used as candidate texts and subjected to matching processing in a preset second mode in order to avoid possible misjudgment. This determines whether a preset quality inspection element has been misjudged and yields a misjudgment recognition result. Specifically, in a recording of a collection call, the customer service agent or the collector often makes negative statements such as "if the full amount is repaid before tonight, it will not be reported to the central bank credit reporting system". Such a negative statement belongs neither to the preset "communication skill" quality inspection item nor to the preset repayment plan (a quality inspection element). However, matching in the preset first mode and text positioning alone may recognize it incorrectly, so the candidate text needs to be predicted again. Specifically, it is determined whether a preset negative statement exists in the candidate text. If a preset negative statement exists, it is determined that the candidate text does not belong to the preset "communication skill" quality inspection item, or does not belong to the preset repayment plan quality inspection element.

Step A2, carrying out misjudgment clearing processing on the first identifier result and the second identifier result based on the misjudgment identification result to obtain a target identification result.

After the misjudgment recognition result is obtained, misjudgment clearing processing is performed on the first identifier result and the second identifier result based on the misjudgment recognition result: if a candidate text does not belong to the preset "communication skill" quality inspection item, it is deleted, and the target recognition result is thereby obtained.
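Steps A1 and A2 can be illustrated with a minimal sketch. It assumes that the preset second mode is a set of regular expressions for negative statements and that each sub-result is a dictionary with `text` and `label` keys; the patterns, the data layout, and the function names are illustrative assumptions rather than details given in the application.

```python
import re

# Hypothetical "preset negative statement" patterns (the second matching mode).
NEGATION_PATTERNS = [
    re.compile(r"\bwill not\b"),
    re.compile(r"\bwon't\b"),
    re.compile(r"\bnot be reported\b"),
]


def is_misjudged(candidate_text):
    """Step A1: a candidate containing a negative statement is a misjudgment."""
    lowered = candidate_text.lower()
    return any(p.search(lowered) for p in NEGATION_PATTERNS)


def clear_misjudgments(first_results, second_results):
    """Step A2: merge both sub-results and delete the misjudged candidates."""
    merged = list(first_results) + list(second_results)
    return [r for r in merged if not is_misjudged(r["text"])]


first = [{"text": "please repay the full amount before noon tomorrow",
          "label": "repayment_plan"}]
second = [{"text": "if you repay tonight it will not be reported",
           "label": "communication_skill"}]
target = clear_misjudgments(first, second)
```

Here the second candidate is removed because it contains a negative statement, so only the genuine repayment request remains in the target recognition result.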

In this embodiment, the first text and the second text are used as candidate texts to perform matching processing of a preset second pattern, so as to determine whether a false judgment of a preset quality inspection element exists, and obtain a false judgment recognition result; and carrying out misjudgment clearing processing on the first identifier result and the second identifier result based on the misjudgment identification result to obtain a target identification result. In this embodiment, the misjudgment is accurately identified, and the identification accuracy is improved.

Further, based on the first embodiment and the second embodiment in the present application, before the step of obtaining each second text associated with each preset quality inspection element from the text data to be quality inspected through positioning and inputting each second text into a preset classification model, the method includes:

step B1, acquiring preset classified text data, extracting a feature matrix from the preset classified text data based on a preset convolution layer and a preset pooling layer in a third preset to-be-trained prediction model, and performing prediction processing on the feature matrix according to a preset activation layer of the third preset to-be-trained prediction model to obtain a predicted quality inspection element label set;

In this embodiment, the preset classification model is obtained accurately so that the predicted quality inspection element label of each second text can be obtained accurately. Specifically, preset classified text data (including an actual label set for the quality inspection elements) is acquired; a feature matrix is extracted from the training sub-data of the preset classified text data based on the preset convolution layer and the preset pooling layer in the third preset to-be-trained prediction model; and the feature matrix is subjected to prediction processing by the preset activation layer of that model to obtain a predicted quality inspection element label set, which covers each quality inspection element label.
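The forward pass of step B1 can be sketched in plain Python. The application names only a convolution layer, a pooling layer, and an activation layer, so the sketch assumes a text-CNN-style arrangement: one-dimensional convolution over word embeddings, max pooling per kernel, and a sigmoid activation per label. All shapes, weights, and function names are hypothetical.

```python
import math


def conv1d(embeddings, kernel):
    """Slide a (width x dim) kernel over a (length x dim) embedding sequence."""
    width = len(kernel)
    if len(embeddings) < width:
        raise ValueError("sequence is shorter than the kernel width")
    feature_map = []
    for i in range(len(embeddings) - width + 1):
        total = 0.0
        for k in range(width):
            total += sum(e * w for e, w in zip(embeddings[i + k], kernel[k]))
        feature_map.append(total)
    return feature_map


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(embeddings, kernels, out_weights, out_bias, threshold=0.5):
    """Convolution, then max pooling, then one sigmoid unit per label."""
    pooled = [max(conv1d(embeddings, k)) for k in kernels]  # pooling layer
    probs = [sigmoid(sum(p * w for p, w in zip(pooled, row)) + b)
             for row, b in zip(out_weights, out_bias)]      # activation layer
    return [p >= threshold for p in probs], probs


# Three words with 2-dimensional embeddings, one kernel of width 2.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = [[[1.0, 0.0], [0.0, 1.0]]]
labels, probs = predict_labels(embeddings, kernels,
                               out_weights=[[1.0], [-1.0]],
                               out_bias=[-1.0, 0.0])
```

A production model would normally be built with a deep learning framework and learned embeddings; the pure-Python form is used here only to make the data flow between the three layers explicit.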

Step B2, comparing the predicted quality inspection element label set with an actual label set in the preset classification text data to obtain a comparison result, so as to adjust the parameters of the preset convolution layer and the preset pooling layer until the similarity between the predicted quality inspection element label set and the actual label set is greater than a preset similarity threshold value, so as to obtain the preset classification model.

The predicted quality inspection element label set is compared with the actual label set in the preset classified text data to obtain a comparison result. According to the comparison result, the parameters (including matrix weights) of the preset convolution layer and the preset pooling layer are adjusted in turn until the similarity between the predicted quality inspection element label set and the actual label set is greater than the preset similarity threshold, at which point a target classification model is obtained. The target classification model is then verified based on the verification sub-data of the preset classified text data, and the preset classification model is obtained when the verification passes.
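The stopping rule of step B2 can be sketched independently of any particular model. The sketch assumes that "similarity" means the fraction of predicted labels equal to the actual labels, and it takes the model as two caller-supplied callables, `predict` and `adjust`, which are placeholders for the real forward pass and parameter update; none of these names come from the application.

```python
def label_similarity(predicted, actual):
    """Fraction of positions where the predicted label equals the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label sets must be non-empty and of equal length")
    return sum(1 for p, a in zip(predicted, actual) if p == a) / len(actual)


def train_until_similar(predict, adjust, texts, labels,
                        threshold=0.9, max_rounds=100):
    """Adjust parameters until similarity exceeds the preset threshold."""
    for round_no in range(1, max_rounds + 1):
        predicted = [predict(t) for t in texts]
        similarity = label_similarity(predicted, labels)
        if similarity > threshold:
            return round_no, similarity
        adjust(predicted, labels)  # hypothetical parameter update
    raise RuntimeError("similarity threshold not reached")


def passes_validation(predict, val_texts, val_labels, threshold=0.9):
    """Verify the trained model on held-out verification sub-data."""
    predicted = [predict(t) for t in val_texts]
    return label_similarity(predicted, val_labels) > threshold


# Toy stand-in for a model: a lookup table corrected one entry per round.
memory = {}
texts, labels = ["a", "b", "c"], ["x", "y", "z"]


def predict(text):
    return memory.get(text, "?")


def adjust(predicted, actual):
    for t, p, a in zip(texts, predicted, actual):
        if p != a:
            memory[t] = a
            break


rounds, similarity = train_until_similar(predict, adjust, texts, labels)
```

The `max_rounds` cap is an added safeguard so that the loop terminates even if the threshold is never reached.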

In this embodiment, preset classified text data is acquired; a feature matrix is extracted from it based on the preset convolution layer and the preset pooling layer in the third preset to-be-trained prediction model; and the feature matrix is subjected to prediction processing by the preset activation layer of that model to obtain a predicted quality inspection element label set. The predicted quality inspection element label set is then compared with the actual label set in the preset classified text data to obtain a comparison result, according to which the parameters of the preset convolution layer and the preset pooling layer are adjusted until the similarity between the two label sets is greater than the preset similarity threshold, yielding the preset classification model. The preset classification model is thus obtained accurately, which lays a foundation for accurate voice quality inspection.

Referring to fig. 3, fig. 3 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present application.

As shown in fig. 3, the voice quality inspection apparatus may include: a processor 1001, such as a CPU, a memory 1005, and a communication bus 1002. The communication bus 1002 is used for realizing connection communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a memory device separate from the processor 1001 described above.

Optionally, the voice quality inspection device may further include a rectangular user interface, a network interface, a camera, RF (radio frequency) circuits, a sensor, an audio circuit, a Wi-Fi module, and the like. The rectangular user interface may comprise a display screen and an input sub-module such as a keyboard, and may optionally also comprise a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface).

Those skilled in the art will appreciate that the configuration shown in fig. 3 does not limit the voice quality inspection device, which may include more or fewer components than shown, combine some components, or arrange the components differently.

As shown in fig. 3, a memory 1005, which is a storage medium, may include an operating system, a network communication module, and a voice quality inspection program. The operating system is a program for managing and controlling hardware and software resources of the voice quality inspection device, and supports the operation of the voice quality inspection program and other software and/or programs. The network communication module is used for realizing communication among the components in the memory 1005 and with other hardware and software in the voice quality inspection system.

In the voice quality inspection apparatus shown in fig. 3, the processor 1001 is configured to execute the voice quality inspection program stored in the memory 1005, and implement the steps of the voice quality inspection method described in any one of the above.

The specific implementation of the voice quality inspection device of the present application is substantially the same as the embodiments of the voice quality inspection method, and is not described herein again.

the identification module comprises:

the identification module further comprises:

Optionally, the setting unit includes:

Optionally, the voice quality inspection apparatus further includes:

the evaluation module comprises:

Optionally, the receiving module includes:

The embodiment of the present application provides a storage medium, and the storage medium stores one or more programs, and the one or more programs are further executable by one or more processors for implementing the steps of the voice quality inspection method described in any one of the above.

The specific implementation of the storage medium of the present application is substantially the same as the embodiments of the voice quality inspection method, and is not described herein again.

The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.

Claims

1. A voice quality inspection method is characterized by comprising the following steps:

2. The voice quality inspection method according to claim 1, wherein each of the predetermined quality inspection items is constituted by a respective predetermined quality inspection element;

3. The voice quality inspection method according to claim 1, wherein each of the predetermined quality inspection items is constituted by a respective predetermined quality inspection element;

4. The method for voice quality inspection according to claim 3, wherein the step of setting the first recognizer result and the second recognizer result as the target recognition result comprises:

5. The voice quality inspection method according to claim 3, wherein before the step of locating each second text associated with each preset quality inspection element from the text data to be inspected and inputting each second text into a preset classification model, the method comprises:

6. The voice quality inspection method according to any one of claims 1 to 5, wherein each of the predetermined quality inspection items is constituted by a respective predetermined quality inspection element;

7. The voice quality inspection method according to claim 1, wherein the step of receiving the voice data to be inspected, performing text conversion processing on the voice data to be inspected to obtain text data to be inspected comprises:

8. A voice quality inspection apparatus, characterized in that the voice quality inspection apparatus comprises:

9. A voice quality inspection apparatus, characterized by comprising: a memory, a processor, and a program stored on the memory for implementing the voice quality inspection method,

the memory is used for storing a program for realizing the voice quality inspection method;

the processor is used for executing the program for implementing the voice quality testing method so as to implement the steps of the voice quality testing method according to any one of claims 1 to 7.

10. A storage medium having stored thereon a program for implementing a voice quality inspection method, the program being executed by a processor to implement the steps of the voice quality inspection method according to any one of claims 1 to 7.