CN110364183A - Method, apparatus, computer device and storage medium for voice quality inspection - Google Patents
- Publication number: CN110364183A (application CN201910616721.8A)
- Authority: CN (China)
- Prior art keywords: keywords, audio, audio data, detected, keyword
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/60—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02087—Noise filtering the noise being separate speech, e.g. cocktail party
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Quality & Reliability (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Artificial Intelligence (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application relates to the technical field of business process optimization and provides a method, apparatus, computer device and storage medium for voice quality inspection. The method includes: detecting customer audio data according to a preset first keyword set, and detecting agent audio data according to a preset second keyword set, where the second keyword set includes a script keyword set and a violation keyword set; and, when the number of times a must-read keyword in the first keyword set occurs in the customer audio data is not equal to a count threshold, or no script keyword in the script keyword set is present in the agent audio data, or a violation keyword in the violation keyword set is present in the agent audio data, determining that the detection result of the audio to be detected is a failure and generating a re-recording prompt. With this method, the audio to be detected at each recording node can be inspected in real time, which improves the efficiency of monitoring the business service process.
Description
Technical field
This application relates to the technical field of business process optimization, and in particular to a method, apparatus, computer device and storage medium for voice quality inspection.
Background
With the development of the service industry, more and more enterprises need to monitor the business service process when serving customers. Traditionally, monitoring the business service process involves recording audio and video of the service process as it happens, obtaining a business service video once the service ends, and then having staff in the back office repeatedly listen to the conversation in that video and inspect its quality. When the inspection finds a problem in some part of the dialogue, the agent and the customer are notified to re-record it.
However, with this traditional approach, the dialogue problems in each stage are found and re-recorded only during the final quality inspection, so monitoring efficiency is low.
Summary of the invention
In view of the above technical problem, it is necessary to provide a voice quality inspection method, apparatus, computer device and storage medium that can improve monitoring efficiency.
A voice quality inspection method, the method comprising:
Obtaining, in real time during video recording, the video to be detected for each recording node and the count threshold corresponding to the video to be detected, and extracting the audio to be detected for each recording node from the video to be detected;
Segmenting the audio to be detected into multiple audio fragments according to a preset voice segmentation algorithm, and merging the audio fragments that belong to the same speaker according to a preset voice clustering algorithm, to obtain agent audio data and customer audio data;
Detecting the customer audio data according to a preset first keyword set, and detecting the agent audio data according to a preset second keyword set, where the second keyword set includes a script keyword set and a violation keyword set;
When the number of times a must-read keyword in the first keyword set occurs in the customer audio data is not equal to the count threshold, or no script keyword in the script keyword set is present in the agent audio data, or a violation keyword in the violation keyword set is present in the agent audio data, determining that the detection result of the audio to be detected is a failure, and generating a re-recording prompt.
In one embodiment, detecting the customer audio data according to the preset first keyword set includes:
Obtaining multiple must-read keywords from the preset first keyword set;
Converting the customer audio data into customer text data;
Traversing the customer text data for each must-read keyword, and counting the number of times each must-read keyword occurs in the customer text data;
Obtaining, from the number of times each must-read keyword occurs in the customer text data, the number of times each must-read keyword occurs in the customer audio data.
In one embodiment, obtaining in real time the count threshold corresponding to the video to be detected includes:
Obtaining in real time the dialog template of the recording node corresponding to the video to be detected;
Counting, according to the first keyword set, the number of times each must-read keyword occurs in the dialog template;
Obtaining the count threshold from the number of times each must-read keyword occurs in the dialog template.
In one embodiment, detecting the agent audio data according to the preset second keyword set, where the second keyword set includes a script keyword set and a violation keyword set, includes:
Converting the agent audio data into agent text data;
Obtaining the script template of the recording node corresponding to the video to be detected, and extracting the corresponding script information from the agent text data according to the script template;
Obtaining script keywords from the second keyword set, and matching the script information against the script keywords;
Obtaining violation keywords from the second keyword set, and traversing the agent text data for the violation keywords.
In one embodiment, after detecting the customer audio data according to the preset first keyword set and detecting the agent audio data according to the preset second keyword set, the method further includes:
When the number of times the must-read keywords in the first keyword set occur in the customer audio data reaches the count threshold, and a script keyword in the script keyword set is present in the agent audio data, and no violation keyword in the violation keyword set is present in the agent audio data, determining that the detection result of the audio to be detected is a pass.
In one embodiment, segmenting the audio to be detected into multiple audio fragments according to the preset voice segmentation algorithm, and merging the audio fragments that belong to the same speaker according to the preset voice clustering algorithm to obtain the agent audio data and the customer audio data, includes:
Filtering the audio to be detected to remove the noise and ambient sound in it;
Segmenting the filtered audio to be detected into multiple audio fragments according to the preset voice segmentation algorithm;
Merging the audio fragments that belong to the same speaker according to the preset voice clustering algorithm, to obtain the agent audio data and the customer audio data.
A voice quality inspection apparatus, the apparatus comprising:
An obtaining module, configured to obtain, in real time during video recording, the video to be detected for each recording node and the count threshold corresponding to the video to be detected, and to extract the audio to be detected for each recording node from the video to be detected;
An extraction module, configured to segment the audio to be detected into multiple audio fragments according to a preset voice segmentation algorithm, and to merge the audio fragments that belong to the same speaker according to a preset voice clustering algorithm, to obtain agent audio data and customer audio data;
A detection module, configured to detect the customer audio data according to a preset first keyword set, and to detect the agent audio data according to a preset second keyword set, where the second keyword set includes a script keyword set and a violation keyword set;
A processing module, configured to determine that the detection result of the audio to be detected is a failure and to generate a re-recording prompt when the number of times a must-read keyword in the first keyword set occurs in the customer audio data is not equal to the count threshold, or no script keyword in the script keyword set is present in the agent audio data, or a violation keyword in the violation keyword set is present in the agent audio data.
In one embodiment, the detection module is further configured to obtain multiple must-read keywords from the preset first keyword set, convert the customer audio data into customer text data, traverse the customer text data for each must-read keyword, count the number of times each must-read keyword occurs in the customer text data, and determine from those counts the number of times each must-read keyword occurs in the customer audio data.
A computer device, including a memory and a processor, the memory storing a computer program, where the processor performs the following steps when executing the computer program:
Obtaining, in real time during video recording, the video to be detected for each recording node and the count threshold corresponding to the video to be detected, and extracting the audio to be detected for each recording node from the video to be detected;
Segmenting the audio to be detected into multiple audio fragments according to a preset voice segmentation algorithm, and merging the audio fragments that belong to the same speaker according to a preset voice clustering algorithm, to obtain agent audio data and customer audio data;
Detecting the customer audio data according to a preset first keyword set, and detecting the agent audio data according to a preset second keyword set, where the second keyword set includes a script keyword set and a violation keyword set;
When the number of times a must-read keyword in the first keyword set occurs in the customer audio data is not equal to the count threshold, or no script keyword in the script keyword set is present in the agent audio data, or a violation keyword in the violation keyword set is present in the agent audio data, determining that the detection result of the audio to be detected is a failure, and generating a re-recording prompt.
A computer-readable storage medium storing a computer program, where the following steps are performed when the computer program is executed by a processor:
Obtaining, in real time during video recording, the video to be detected for each recording node and the count threshold corresponding to the video to be detected, and extracting the audio to be detected for each recording node from the video to be detected;
Segmenting the audio to be detected into multiple audio fragments according to a preset voice segmentation algorithm, and merging the audio fragments that belong to the same speaker according to a preset voice clustering algorithm, to obtain agent audio data and customer audio data;
Detecting the customer audio data according to a preset first keyword set, and detecting the agent audio data according to a preset second keyword set, where the second keyword set includes a script keyword set and a violation keyword set;
When the number of times a must-read keyword in the first keyword set occurs in the customer audio data is not equal to the count threshold, or no script keyword in the script keyword set is present in the agent audio data, or a violation keyword in the violation keyword set is present in the agent audio data, determining that the detection result of the audio to be detected is a failure, and generating a re-recording prompt.
The above voice quality inspection method, apparatus, computer device and storage medium detect the customer audio data according to the preset first keyword set and detect the agent audio data according to the preset second keyword set, so that the customer audio data and the agent audio data are detected separately. The detection result of the audio to be detected is determined from these separate results, and a re-recording prompt is generated when that result is a failure. In this way, the audio to be detected at each recording node is inspected in real time during video recording, so the dialogue in each stage of the recording is monitored promptly, which improves the efficiency of monitoring the business service process.
Brief description of the drawings
Fig. 1 is a flow diagram of the voice quality inspection method in one embodiment;
Fig. 2 is a sub-flow diagram of step S106 in Fig. 1 in one embodiment;
Fig. 3 is a sub-flow diagram of step S102 in Fig. 1 in one embodiment;
Fig. 4 is a sub-flow diagram of step S106 in Fig. 1 in one embodiment;
Fig. 5 is a flow diagram of the voice quality inspection method in another embodiment;
Fig. 6 is a sub-flow diagram of step S104 in Fig. 1 in one embodiment;
Fig. 7 is a structural block diagram of the voice quality inspection apparatus in one embodiment;
Fig. 8 is an internal structure diagram of the computer device in one embodiment.
Detailed description
To make the objects, technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and do not limit it.
In one embodiment, as shown in Fig. 1, a voice quality inspection method is provided, comprising the following steps:
S102: obtain, in real time during video recording, the video to be detected for each recording node and the count threshold corresponding to the video to be detected, and extract the audio to be detected for each recording node from the video to be detected.
The video to be detected is the video data of each recording node that the terminal captures during video recording and sends to the server. Video recording consists of multiple recording stages, and each recording stage has a corresponding recording node. After obtaining the video to be detected, the server can separate the audio from the images in the video and extract the audio to be detected for each recording node. The count threshold corresponding to the video to be detected is the threshold on the number of times the must-read keywords corresponding to that video must occur. A must-read keyword is a word that the customer must say in the recording stage corresponding to the recording node, and it is used to detect the customer audio data.
S104: segment the audio to be detected into multiple audio fragments according to a preset voice segmentation algorithm, and merge the audio fragments that belong to the same speaker according to a preset voice clustering algorithm, to obtain agent audio data and customer audio data.
Because the audio to be detected may contain noise and ambient sound, it is filtered before analysis to remove them. The audio to be detected contains both agent audio data and customer audio data, so the server needs to separate the two before detection. To separate them, the server can process the audio with a voice segmentation algorithm and a voice clustering algorithm in a segment-then-cluster manner: it first uses the voice segmentation algorithm to split the audio to be detected into multiple audio fragments, and then uses the voice clustering algorithm to merge the fragments that belong to the same speaker, obtaining the agent audio data and the customer audio data.
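The segment-then-cluster idea in S104 can be illustrated with a toy sketch. The source does not name the segmentation or clustering algorithm, so everything here is an assumption: each frame is a hypothetical `(energy, pitch_hz)` pair, segmentation splits on low-energy frames, and clustering is a simple two-centroid grouping on mean pitch. A real system would use proper acoustic features and a dedicated speaker diarization method.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical per-frame features: (energy, pitch in Hz). Not from the source.
Frame = Tuple[float, float]

def split_on_silence(frames: List[Frame], silence_energy: float = 0.1) -> List[List[Frame]]:
    """Segmentation step: cut the stream wherever the energy drops below a threshold."""
    segments, current = [], []
    for frame in frames:
        if frame[0] < silence_energy:
            if current:
                segments.append(current)
                current = []
        else:
            current.append(frame)
    if current:
        segments.append(current)
    return segments

def cluster_two_speakers(segments: List[List[Frame]], iterations: int = 10) -> List[int]:
    """Clustering step: assign each segment to speaker 0 or 1 by its mean pitch."""
    if not segments:
        return []
    pitches = [mean(p for _, p in seg) for seg in segments]
    centroids = [min(pitches), max(pitches)]
    labels = [0] * len(pitches)
    for _ in range(iterations):
        labels = [0 if abs(p - centroids[0]) <= abs(p - centroids[1]) else 1
                  for p in pitches]
        for k in (0, 1):
            members = [p for p, lab in zip(pitches, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    return labels

def merge_by_speaker(segments: List[List[Frame]], labels: List[int]) -> Dict[int, List[Frame]]:
    """Merge all fragments that were assigned to the same speaker."""
    merged: Dict[int, List[Frame]] = {0: [], 1: []}
    for seg, label in zip(segments, labels):
        merged[label].extend(seg)
    return merged

frames = [(0.8, 120.0), (0.9, 125.0), (0.0, 0.0),
          (0.7, 210.0), (0.8, 220.0), (0.02, 0.0), (0.9, 118.0)]
segments = split_on_silence(frames)
labels = cluster_two_speakers(segments)
merged = merge_by_speaker(segments, labels)
```

The sketch only separates two speakers; deciding which merged stream is the agent and which is the customer is left open, as the passage above does not say how that mapping is made.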
S106: detect the customer audio data according to a preset first keyword set, and detect the agent audio data according to a preset second keyword set, where the second keyword set includes a script keyword set and a violation keyword set.
The preset first keyword set contains multiple must-read keywords. A must-read keyword is a word that the customer must say in the recording stage corresponding to the recording node. A violation keyword is a word that the agent must not say in that recording stage, and a script keyword is a word that the agent must say in that recording stage. The server detects the customer audio data according to the preset first keyword set, counts the number of times the must-read keywords occur in the customer audio data, and determines the detection result of the customer audio data by comparing these counts with the count threshold corresponding to the video to be detected. By detecting the agent audio data, the server can determine whether the agent said the script keywords and whether the agent avoided the violation keywords, and then determines the detection result of the agent audio data from what the agent actually said.
S108: when the number of times a must-read keyword in the first keyword set occurs in the customer audio data is not equal to the count threshold, or no script keyword in the script keyword set is present in the agent audio data, or a violation keyword in the violation keyword set is present in the agent audio data, determine that the detection result of the audio to be detected is a failure, and generate a re-recording prompt.
When the number of times a must-read keyword in the first keyword set occurs in the customer audio data is not equal to the count threshold, the detection result of the customer audio data is a failure. When no script keyword in the script keyword set is present in the agent audio data, or a violation keyword in the violation keyword set is present in the agent audio data, the detection result of the agent audio data is a failure. When the detection result of either the customer audio data or the agent audio data is a failure, the detection result of the audio to be detected is a failure, and the server generates a re-recording prompt. The prompt can tell the customer and the agent why the recording failed, so that they avoid repeating the same mistake when re-recording on site.
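The decision in S108 can be sketched as a small function. This is a minimal illustration under assumptions, not the patented implementation: the names `InspectionResult` and `inspect_node` are hypothetical, the count threshold is modeled as one expected count per must-read keyword, and the script check fails only when no script keyword at all was found, which is one reading of the condition above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class InspectionResult:
    passed: bool
    reasons: List[str] = field(default_factory=list)  # text for the re-recording prompt

def inspect_node(customer_counts: Dict[str, int],
                 thresholds: Dict[str, int],
                 script_hits: List[str],
                 violation_hits: List[str]) -> InspectionResult:
    """Combine the customer and agent checks for one recording node."""
    reasons: List[str] = []
    # Customer check: every must-read keyword must occur exactly the expected number of times.
    for keyword, expected in thresholds.items():
        actual = customer_counts.get(keyword, 0)
        if actual != expected:
            reasons.append(f"customer said '{keyword}' {actual} time(s), expected {expected}")
    # Agent check 1: at least one script keyword must be present (assumed reading).
    if not script_hits:
        reasons.append("agent did not say any script keyword")
    # Agent check 2: no violation keyword may be present.
    for keyword in violation_hits:
        reasons.append(f"agent said violation keyword '{keyword}'")
    return InspectionResult(passed=not reasons, reasons=reasons)

failed = inspect_node({"I agree": 1}, {"I agree": 2}, ["risk disclosure"], [])
ok = inspect_node({"I agree": 2}, {"I agree": 2}, ["risk disclosure"], [])
```

Collecting the reasons rather than returning a bare boolean mirrors the point that the re-recording prompt should say why the recording failed.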
The above voice quality inspection method detects the customer audio data according to the preset first keyword set and detects the agent audio data according to the preset second keyword set, so that the customer audio data and the agent audio data are detected separately. The detection result of the audio to be detected is determined from these separate results, and a re-recording prompt is generated when that result is a failure. In this way, the audio to be detected at each recording node is inspected in real time during video recording, so the dialogue in each stage of the recording is monitored promptly, which improves the efficiency of monitoring the business service process.
In one embodiment, as shown in Fig. 2, S106 includes:
S202: obtain multiple must-read keywords from the preset first keyword set;
S204: convert the customer audio data into customer text data;
S206: traverse the customer text data for each must-read keyword, and count the number of times each must-read keyword occurs in the customer text data;
S208: obtain, from the number of times each must-read keyword occurs in the customer text data, the number of times each must-read keyword occurs in the customer audio data.
A must-read keyword is a word that the customer must say in the recording stage corresponding to the recording node. The preset first keyword set contains multiple must-read keywords, and the server obtains them from that set. To detect the customer audio data against the must-read keywords, the server first converts the customer audio data into customer text data, then traverses the customer text data for each must-read keyword and counts the number of times each one occurs. Finally, it takes these counts as the number of times each must-read keyword occurs in the customer audio data.
In the recording stage corresponding to each recording node, the agent asks the customer questions and the customer answers by saying the must-read keywords, so the customer audio data can be detected against the must-read keywords. Because the number of questions the agent asks differs from one recording stage to another, the number of times the customer says the must-read keywords differs as well. The server therefore determines how many times the customer should say the must-read keywords in the recording stage corresponding to the recording node, that is, the count threshold corresponding to the video to be detected. It then compares the count threshold with the number of times each must-read keyword occurs in the customer audio data to determine the detection result of the customer audio data. The detection result of the customer audio data is a pass only when the number of times each must-read keyword occurs in the customer audio data equals the count threshold. The count threshold can be determined from the dialog template of the recording node.
In the above embodiment, the number of times each must-read keyword occurs in the customer audio data is obtained from the number of times it occurs in the customer text data, so the server can determine the detection result of the customer audio data from these counts, which realizes the detection of the customer audio data.
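Steps S206 and S208 amount to substring counting over the transcript. The sketch below assumes speech-to-text (S204) has already produced a string; the function names, the case-insensitive matching, and the sample English phrases are illustrative assumptions rather than details from the source.

```python
from typing import Dict, Iterable

def count_keywords(text: str, keywords: Iterable[str]) -> Dict[str, int]:
    """Count non-overlapping, case-insensitive occurrences of each keyword in the text."""
    normalized = text.lower()
    return {kw: normalized.count(kw.lower()) for kw in keywords}

def customer_passes(counts: Dict[str, int], thresholds: Dict[str, int]) -> bool:
    """Pass only when every must-read keyword occurs exactly as often as its threshold."""
    return all(counts.get(kw, 0) == expected for kw, expected in thresholds.items())

# Hypothetical transcript produced by a speech-to-text step that is not shown here.
customer_text = "Yes, I understand. I agree to the terms, and yes, I confirm."
must_read = ["I agree", "I confirm"]
counts = count_keywords(customer_text, must_read)
```

Plain substring counting suits unsegmented text such as Chinese; for languages with word boundaries, a tokenized or regex-based match may be a better fit.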
In one embodiment, as shown in Fig. 3, S102 includes:
S302: obtain in real time the dialog template of the recording node corresponding to the video to be detected;
S304: count, according to the first keyword set, the number of times each must-read keyword occurs in the dialog template;
S306: obtain the count threshold from the number of times each must-read keyword occurs in the dialog template.
Using the node identifier carried by the recording node, the server can obtain the dialog template corresponding to the recording node from a preset dialog template database in real time. It obtains multiple must-read keywords from the first keyword set, traverses the dialog template for each must-read keyword, and counts the number of times each one occurs in the template. The number of times each must-read keyword occurs in the dialog template is the number of times the customer should say it in the recording stage corresponding to the recording node, that is, the count threshold.
In the above embodiment, the dialog template of the recording node corresponding to the video to be detected is obtained in real time, the number of times each must-read keyword occurs in the dialog template is counted according to the first keyword set, and the count threshold is obtained from those counts, so the server can detect the customer audio data against the count threshold.
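Deriving the count threshold from a dialog template (S302 to S306) can be sketched as a lookup followed by the same kind of counting. The in-memory `DIALOG_TEMPLATES` dictionary stands in for the preset dialog template database, and the node identifier `node-01` and the template text are made-up examples.

```python
from typing import Dict, Iterable

# Hypothetical stand-in for the preset dialog template database, keyed by node identifier.
DIALOG_TEMPLATES: Dict[str, str] = {
    "node-01": (
        "Agent: Do you agree to the terms?\n"
        "Customer: I agree.\n"
        "Agent: Please confirm your identity.\n"
        "Customer: I confirm.\n"
        "Agent: Do you agree to the fee?\n"
        "Customer: I agree."
    ),
}

def thresholds_for_node(node_id: str, must_read: Iterable[str]) -> Dict[str, int]:
    """Count each must-read keyword in the node's dialog template to get its threshold."""
    template = DIALOG_TEMPLATES[node_id].lower()
    return {kw: template.count(kw.lower()) for kw in must_read}

thresholds = thresholds_for_node("node-01", ["I agree", "I confirm"])
```

Because the threshold is computed per keyword, a customer transcript can be compared keyword by keyword against the same dictionary shape.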
In one of the embodiments, as shown in Fig. 4, S106 includes:
S402: convert the salesperson audio data into salesperson text data;
S404: obtain the script template of the recording node corresponding to the video to be detected, and extract the corresponding script information from the salesperson text data according to the script template;
S406: obtain script keywords from the second keyword set, and match the script information against the script keywords;
S408: obtain violation keywords from the second keyword set, and traverse the salesperson text data for the violation keywords.
When detecting the salesperson audio data, the server needs to convert the salesperson audio data into salesperson text data, obtain the script template of the recording node corresponding to the video to be detected, extract the corresponding script information from the salesperson text data according to the script template, and obtain script keywords from the second keyword set. A script keyword is a word that the salesperson must mention in the recording stage corresponding to the recording node. By detecting the salesperson audio data, the server determines whether the salesperson mentions the script keywords; when the salesperson does, the first detection result of the salesperson audio data is determined to be a pass.
In addition to detecting the salesperson audio data with script keywords, the server also needs to detect it with violation keywords, which can be obtained from the second keyword set. A violation keyword is a word that the salesperson must not mention in the recording stage corresponding to the recording node. By detecting the salesperson audio data, the server determines whether the salesperson has avoided the violation keywords; when the salesperson has not mentioned any violation keyword, the second detection result of the salesperson audio data is determined to be a pass. Only when both the first detection result and the second detection result are a pass can the detection result of the salesperson audio data be determined to be a pass.
The above embodiment detects the salesperson audio data according to the script keywords and the violation keywords, thereby realizing the detection of the salesperson audio data.
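The two checks on the salesperson text can be sketched as below. This is a simplified illustration under stated assumptions: it matches keywords directly against the whole salesperson text rather than against script information extracted through a script template, it requires every script keyword to appear (the text does not say whether one or all are required), and the function name `check_salesperson_text` is hypothetical.

```python
from typing import Iterable


def check_salesperson_text(
    text: str,
    script_keywords: Iterable[str],
    violation_keywords: Iterable[str],
) -> bool:
    """Return True only if both detection results are a pass."""
    # First detection result: the script keywords are mentioned
    # (assumption: all of them must appear).
    first_ok = all(kw in text for kw in script_keywords)
    # Second detection result: no violation keyword is mentioned.
    second_ok = not any(kw in text for kw in violation_keywords)
    return first_ok and second_ok


ok = check_salesperson_text(
    "Please read the risk disclosure before signing.",
    script_keywords=["risk disclosure"],
    violation_keywords=["guaranteed return"],
)
print(ok)
```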
In one of the embodiments, as shown in Fig. 5, after S106, the method further includes:
S502: when the number of times the must-read keywords in the first keyword set occur in the client audio data reaches the frequency threshold, a script keyword in the script keyword set is present in the salesperson audio data, and no violation keyword in the violation keyword set is present in the salesperson audio data, determine that the detection result of the audio to be detected is a pass.
When the number of times the must-read keywords in the first keyword set occur in the client audio data reaches the frequency threshold, the server can determine that the detection result of the client audio data is a pass. When a script keyword in the script keyword set is present in the salesperson audio data and no violation keyword in the violation keyword set is present in it, the server can determine that the detection result of the salesperson audio data is a pass. When the detection results of both the client audio data and the salesperson audio data are a pass, the server can determine that the detection result of the audio to be detected is a pass.
The above embodiment determines the detection result of the audio to be detected from the detection results of the client audio data and the salesperson audio data.
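The pass condition of S502 can be expressed as a short sketch. The names `client_passes` and `audio_passes` are hypothetical, and reading "reaches the frequency threshold" as "at least the expected count for every must-read keyword" is an assumption, since the text does not define it more precisely.

```python
from typing import Dict


def client_passes(counts: Dict[str, int], thresholds: Dict[str, int]) -> bool:
    """Client audio passes when every must-read keyword reaches its threshold.

    Assumption: "reaches" is interpreted as count >= threshold.
    """
    return all(counts.get(kw, 0) >= need for kw, need in thresholds.items())


def audio_passes(client_ok: bool, script_found: bool, violation_found: bool) -> bool:
    """The audio to be detected passes only if both parties' results are a pass."""
    salesperson_ok = script_found and not violation_found
    return client_ok and salesperson_ok


client_ok = client_passes({"I confirm": 2}, {"I confirm": 2})
print(audio_passes(client_ok, script_found=True, violation_found=False))
```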
In one of the embodiments, as shown in Fig. 6, S104 includes:
S602: filter the audio to be detected to remove noise and ambient sound from it;
S604: segment the filtered audio to be detected into multiple audio fragments according to a preset speech segmentation algorithm;
S606: merge the audio fragments that belong to the same speaker among the multiple audio fragments according to a preset speech clustering algorithm, obtaining salesperson audio data and client audio data.
Because the audio to be detected may contain noise and ambient sound, when processing it the server first needs to filter the audio to be detected to remove the noise and ambient sound, and then process the filtered audio with the speech segmentation algorithm and the speech clustering algorithm to obtain the salesperson audio data and the client audio data. Here, the speech segmentation algorithm refers to speaker change-point detection, i.e. locating the points in the speech data at which the speaker identity changes. Common speech segmentation algorithms are usually based on sliding-window change-point detection with a Gaussian model: the distance between adjacent speech windows is computed, and a threshold or penalty factor is used to decide whether the two speech segments come from the same speaker. The threshold or penalty factor can be obtained from collected training-set data. With the speech segmentation algorithm, the audio to be detected can be split into multiple audio fragments, each of which contains the audio data of only one person.
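One common way to score whether two adjacent windows come from the same speaker is the ΔBIC test with Gaussian models. The sketch below is an illustration of that general technique, not the patent's algorithm: it assumes one scalar feature per frame (real systems use multi-dimensional features such as MFCCs), and the function name `delta_bic`, the default `penalty`, and the variance floor are assumptions.

```python
import math
from statistics import pvariance
from typing import Sequence


def delta_bic(
    left: Sequence[float],
    right: Sequence[float],
    penalty: float = 1.0,
    floor: float = 1e-8,
) -> float:
    """ΔBIC of modelling two adjacent windows with two 1-D Gaussians instead of one.

    A positive value suggests a speaker change between the windows.
    """
    n1, n2 = len(left), len(right)
    n = n1 + n2
    # Floor the variances so that log() stays finite for constant windows.
    var_all = max(pvariance(list(left) + list(right)), floor)
    var_left = max(pvariance(list(left)), floor)
    var_right = max(pvariance(list(right)), floor)
    gain = 0.5 * (n * math.log(var_all) - n1 * math.log(var_left) - n2 * math.log(var_right))
    # A 1-D Gaussian has two parameters (mean, variance), so the extra
    # model costs penalty * 0.5 * 2 * log(n) = penalty * log(n).
    return gain - penalty * math.log(n)


speaker_a = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1]
speaker_a_again = [0.05, -0.05, 0.1, -0.1, 0.0, 0.0, -0.1, 0.1]
speaker_b = [5.0, 5.1, 4.9, 5.05, 4.95, 5.0, 5.1, 4.9]
print(delta_bic(speaker_a, speaker_b) > 0)        # change point likely
print(delta_bic(speaker_a, speaker_a_again) > 0)  # same speaker likely
```

The `penalty` argument plays the role of the penalty factor mentioned above, which in practice would be tuned on training data.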
The speech clustering algorithm builds on the result of the speech segmentation algorithm and merges the audio fragments that belong to the same speaker. Common speech clustering algorithms fall into two categories: top-down clustering and bottom-up clustering. Each audio fragment obtained after segmentation is treated as one class, and the two closest classes are then merged repeatedly according to the BIC (Bayesian Information Criterion) distance until merging speech fragments no longer increases the BIC value, which yields two classes of audio data. After obtaining the two classes of audio data, the server can analyze them further: it extracts the voiceprint features of the two classes and matches them against the salesperson voiceprint features in a preset salesperson information database, thereby determining which of the two classes is the salesperson audio data; the other is the client audio data.
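The final role assignment can be sketched as a similarity comparison between each cluster's voiceprint and the enrolled salesperson voiceprint. This is a hedged illustration: the text does not say how voiceprints are compared, so cosine similarity is an assumption, as are the function names `cosine` and `assign_roles` and the toy three-dimensional vectors.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_roles(
    cluster_prints: Sequence[Sequence[float]],
    enrolled_salesperson: Sequence[float],
) -> Dict[str, int]:
    """Given exactly two cluster voiceprints, label the closer one as the salesperson."""
    scores = [cosine(v, enrolled_salesperson) for v in cluster_prints]
    salesperson = 0 if scores[0] >= scores[1] else 1
    return {"salesperson": salesperson, "client": 1 - salesperson}


# Hypothetical voiceprint vectors for the two clusters and the enrolled salesperson.
roles = assign_roles([[0.9, 0.1, 0.0], [0.1, 0.8, 0.3]], [1.0, 0.0, 0.1])
print(roles)
```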
The above embodiment filters the audio to be detected to remove noise and ambient sound, segments the filtered audio into multiple audio fragments with the speech segmentation algorithm, and clusters those fragments into salesperson audio data and client audio data with the speech clustering algorithm, thereby extracting the salesperson audio data and the client audio data.
It should be understood that although the steps in the flowcharts of Figs. 1-6 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order the arrows indicate. Unless expressly stated herein, the execution of these steps is not strictly limited in order, and they may be executed in other orders. Moreover, at least some of the steps in Figs. 1-6 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed sequentially either; they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 7, a voice quality inspection apparatus is provided, comprising an obtaining module 702, an extraction module 704, a detection module 706, and a processing module 708, wherein:
The obtaining module 702 is configured to obtain, in real time during video recording, the video to be detected of each recording node and the frequency threshold corresponding to the video to be detected, and to extract the audio to be detected of each recording node from the video to be detected;
The extraction module 704 is configured to segment the audio to be detected into multiple audio fragments according to a preset speech segmentation algorithm, and to merge the audio fragments that belong to the same speaker among the multiple audio fragments according to a preset speech clustering algorithm, obtaining salesperson audio data and client audio data;
The detection module 706 is configured to detect the client audio data according to a preset first keyword set, and to detect the salesperson audio data according to a preset second keyword set, the second keyword set including a script keyword set and a violation keyword set;
Processing module 708 must read what keyword occurred in client audio data for working as in the first set of keywords
There is no art keyword or industry in words art set of keywords not equal in frequency threshold value or business personnel's audio data for number
There are when the violation keyword in violation set of keywords, determine that the testing result of audio to be detected is not in business person's audio data
By detection, amended record prompt is generated.The device of above-mentioned voice quality inspection, according to default first set of keywords to client audio data
It is detected, and business personnel's audio data is detected according to default second set of keywords, realized to client audio number
Accordingly and business personnel's audio data detects respectively, the testing result of audio to be detected is determined according to testing result, when to be checked
The testing result of acoustic frequency is to generate amended record prompt when not passing through detection.In this way, real during video record
When to it is each record node audio to be detected carry out quality inspection, realize in time to pair in the links during video record
Words are monitored, and improve the efficiency being monitored to business service process.
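The per-node failure handling of the processing module can be sketched as a generator that emits a re-recording prompt for each recording node that fails. Everything named here is hypothetical: the `Detection` record, the `rerecord_prompts` function, the injected `detect` callable standing in for modules 702 to 706, and the prompt wording. The `count_ok` flag abstracts the comparison against the frequency threshold.

```python
from typing import Callable, Iterable, Iterator, NamedTuple


class Detection(NamedTuple):
    count_ok: bool              # must-read keyword count matches the frequency threshold
    script_keyword_found: bool  # a script keyword is present in the salesperson audio
    violation_found: bool       # a violation keyword is present in the salesperson audio


def rerecord_prompts(
    node_ids: Iterable[str],
    detect: Callable[[str], Detection],
) -> Iterator[str]:
    """Yield a re-recording prompt for every recording node that fails detection."""
    for node_id in node_ids:
        result = detect(node_id)
        failed = (
            not result.count_ok
            or not result.script_keyword_found
            or result.violation_found
        )
        if failed:
            yield f"Node {node_id} did not pass detection; please re-record it."


# Hypothetical detection outcomes for three recording nodes.
outcomes = {
    "n1": Detection(True, True, False),
    "n2": Detection(True, True, True),
    "n3": Detection(False, True, False),
}
for prompt in rerecord_prompts(["n1", "n2", "n3"], outcomes.__getitem__):
    print(prompt)
```

Because the generator handles one node at a time, a prompt can be raised as soon as a node finishes recording rather than after the whole video is complete.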
In one of the embodiments, the detection module is further configured to obtain multiple must-read keywords from the preset first keyword set, convert the client audio data into client text data, traverse the client text data for each must-read keyword, count the number of times each must-read keyword occurs in the client text data, and from those counts obtain the number of times each must-read keyword occurs in the client audio data.
In one of the embodiments, the obtaining module is further configured to obtain, in real time, the dialog template of the recording node corresponding to the video to be detected, count the number of times each must-read keyword occurs in the dialog template according to the first keyword set, and obtain the frequency threshold from those counts.
In one of the embodiments, the detection module is further configured to convert the salesperson audio data into salesperson text data, obtain the script template of the recording node corresponding to the video to be detected, extract the corresponding script information from the salesperson text data according to the script template, obtain script keywords from the second keyword set and match the script information against them, and obtain violation keywords from the second keyword set and traverse the salesperson text data for them.
In one of the embodiments, the detection module is further configured to determine that the detection result of the audio to be detected is a pass when the number of times the must-read keywords in the first keyword set occur in the client audio data reaches the frequency threshold, a script keyword in the script keyword set is present in the salesperson audio data, and no violation keyword in the violation keyword set is present in the salesperson audio data.
In one of the embodiments, the extraction module is further configured to filter the audio to be detected to remove noise and ambient sound from it, segment the filtered audio into multiple audio fragments according to the preset speech segmentation algorithm, and merge the audio fragments that belong to the same speaker according to the preset speech clustering algorithm, obtaining salesperson audio data and client audio data.
For specific limitations on the voice quality inspection apparatus, refer to the limitations on the voice quality inspection method above, which are not repeated here. Each module in the above apparatus may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, the processor of the computer equipment in hardware form, or stored in the memory of the computer equipment in software form, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer equipment is provided. The computer equipment may be a server, and its internal structure may be as shown in Fig. 8. The computer equipment includes a processor, a memory, a network interface, and a database connected by a system bus. The processor provides computing and control capability. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database stores must-read keyword data, violation keyword data, and dialog template data. The network interface is used to communicate with an external terminal over a network connection. When executed by the processor, the computer program implements a voice quality inspection method.
Those skilled in the art will understand that the structure shown in Fig. 8 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer equipment to which the solution is applied. A specific computer equipment may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer equipment is provided, including a memory and a processor. The memory stores a computer program, and the processor implements the following steps when executing the computer program:
Obtain, in real time during video recording, the video to be detected of each recording node and the frequency threshold corresponding to the video to be detected, and extract the audio to be detected of each recording node from the video to be detected;
Segment the audio to be detected into multiple audio fragments according to a preset speech segmentation algorithm, and merge the audio fragments that belong to the same speaker among the multiple audio fragments according to a preset speech clustering algorithm, obtaining salesperson audio data and client audio data;
Detect the client audio data according to a preset first keyword set, and detect the salesperson audio data according to a preset second keyword set, the second keyword set including a script keyword set and a violation keyword set;
When the number of times the must-read keywords in the first keyword set occur in the client audio data is not equal to the frequency threshold, or no script keyword in the script keyword set is present in the salesperson audio data, or a violation keyword in the violation keyword set is present in the salesperson audio data, determine that the audio to be detected does not pass detection, and generate a re-recording prompt.
The above computer equipment for voice quality inspection detects the client audio data according to the preset first keyword set and the salesperson audio data according to the preset second keyword set, so that the two are detected separately. The detection result of the audio to be detected is determined from these results, and a re-recording prompt is generated when the audio to be detected does not pass detection. In this way, the audio to be detected of each recording node undergoes quality inspection in real time during video recording, so the dialogue in each stage of the recording is monitored promptly, which improves the efficiency of monitoring the business service process.
In one embodiment, the processor further implements the following steps when executing the computer program:
Obtain multiple must-read keywords from the preset first keyword set;
Convert the client audio data into client text data;
Traverse the client text data for each must-read keyword, and count the number of times each must-read keyword occurs in the client text data;
From the number of times each must-read keyword occurs in the client text data, obtain the number of times each must-read keyword occurs in the client audio data.
In one embodiment, the processor further implements the following steps when executing the computer program:
Obtain, in real time, the dialog template of the recording node corresponding to the video to be detected;
According to the first keyword set, count the number of times each must-read keyword occurs in the dialog template;
Obtain the frequency threshold from the number of times each must-read keyword occurs in the dialog template.
In one embodiment, the processor further implements the following steps when executing the computer program:
Convert the salesperson audio data into salesperson text data;
Obtain the script template of the recording node corresponding to the video to be detected, and extract the corresponding script information from the salesperson text data according to the script template;
Obtain script keywords from the second keyword set, and match the script information against the script keywords;
Obtain violation keywords from the second keyword set, and traverse the salesperson text data for the violation keywords.
In one embodiment, the processor further implements the following steps when executing the computer program:
When the number of times the must-read keywords in the first keyword set occur in the client audio data reaches the frequency threshold, a script keyword in the script keyword set is present in the salesperson audio data, and no violation keyword in the violation keyword set is present in the salesperson audio data, determine that the detection result of the audio to be detected is a pass.
In one embodiment, the processor further implements the following steps when executing the computer program:
Filter the audio to be detected to remove noise and ambient sound from it;
Segment the filtered audio to be detected into multiple audio fragments according to the preset speech segmentation algorithm;
Merge the audio fragments that belong to the same speaker among the multiple audio fragments according to the preset speech clustering algorithm, obtaining salesperson audio data and client audio data.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. The computer program implements the following steps when executed by a processor:
Obtain, in real time during video recording, the video to be detected of each recording node and the frequency threshold corresponding to the video to be detected, and extract the audio to be detected of each recording node from the video to be detected;
Segment the audio to be detected into multiple audio fragments according to a preset speech segmentation algorithm, and merge the audio fragments that belong to the same speaker among the multiple audio fragments according to a preset speech clustering algorithm, obtaining salesperson audio data and client audio data;
Detect the client audio data according to a preset first keyword set, and detect the salesperson audio data according to a preset second keyword set, the second keyword set including a script keyword set and a violation keyword set;
When the number of times the must-read keywords in the first keyword set occur in the client audio data is not equal to the frequency threshold, or no script keyword in the script keyword set is present in the salesperson audio data, or a violation keyword in the violation keyword set is present in the salesperson audio data, determine that the audio to be detected does not pass detection, and generate a re-recording prompt.
The above storage medium for voice quality inspection detects the client audio data according to the preset first keyword set and the salesperson audio data according to the preset second keyword set, so that the two are detected separately. The detection result of the audio to be detected is determined from these results, and a re-recording prompt is generated when the audio to be detected does not pass detection. In this way, the audio to be detected of each recording node undergoes quality inspection in real time during video recording, so the dialogue in each stage of the recording is monitored promptly, which improves the efficiency of monitoring the business service process.
In one embodiment, the computer program further implements the following steps when executed by the processor:
Obtain multiple must-read keywords from the preset first keyword set;
Convert the client audio data into client text data;
Traverse the client text data for each must-read keyword, and count the number of times each must-read keyword occurs in the client text data;
From the number of times each must-read keyword occurs in the client text data, obtain the number of times each must-read keyword occurs in the client audio data.
In one embodiment, the computer program further implements the following steps when executed by the processor:
Obtain, in real time, the dialog template of the recording node corresponding to the video to be detected;
According to the first keyword set, count the number of times each must-read keyword occurs in the dialog template;
Obtain the frequency threshold from the number of times each must-read keyword occurs in the dialog template.
In one embodiment, the computer program further implements the following steps when executed by the processor:
Convert the salesperson audio data into salesperson text data;
Obtain the script template of the recording node corresponding to the video to be detected, and extract the corresponding script information from the salesperson text data according to the script template;
Obtain script keywords from the second keyword set, and match the script information against the script keywords;
Obtain violation keywords from the second keyword set, and traverse the salesperson text data for the violation keywords.
In one embodiment, the computer program further implements the following steps when executed by the processor:
When the number of times the must-read keywords in the first keyword set occur in the client audio data reaches the frequency threshold, a script keyword in the script keyword set is present in the salesperson audio data, and no violation keyword in the violation keyword set is present in the salesperson audio data, determine that the detection result of the audio to be detected is a pass.
In one embodiment, the computer program further implements the following steps when executed by the processor:
Filter the audio to be detected to remove noise and ambient sound from it;
Segment the filtered audio to be detected into multiple audio fragments according to the preset speech segmentation algorithm;
Merge the audio fragments that belong to the same speaker among the multiple audio fragments according to the preset speech clustering algorithm, obtaining salesperson audio data and client audio data.
Those of ordinary skill in the art will understand that all or part of the processes in the above method embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments is described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, and all of these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A voice quality inspection method, the method comprising:
Obtaining, in real time during video recording, the video to be detected of each recording node and the frequency threshold corresponding to the video to be detected, and extracting the audio to be detected of each recording node from the video to be detected;
Segmenting the audio to be detected into multiple audio fragments according to a preset speech segmentation algorithm, and merging the audio fragments that belong to the same speaker among the multiple audio fragments according to a preset speech clustering algorithm, obtaining salesperson audio data and client audio data;
Detecting the client audio data according to a preset first keyword set, and detecting the salesperson audio data according to a preset second keyword set, the second keyword set including a script keyword set and a violation keyword set;
When the number of times the must-read keywords in the first keyword set occur in the client audio data is not equal to the frequency threshold, or no script keyword in the script keyword set is present in the salesperson audio data, or a violation keyword in the violation keyword set is present in the salesperson audio data, determining that the audio to be detected does not pass detection, and generating a re-recording prompt.
2. The method according to claim 1, wherein detecting the client audio data according to the preset first keyword set comprises:
Obtaining multiple must-read keywords from the preset first keyword set;
Converting the client audio data into client text data;
Traversing the client text data for each must-read keyword, and counting the number of times each must-read keyword occurs in the client text data;
Obtaining, from the number of times each must-read keyword occurs in the client text data, the number of times each must-read keyword occurs in the client audio data.
3. The method according to claim 1, wherein obtaining in real time the frequency threshold corresponding to the video to be detected comprises:
Obtaining, in real time, the dialog template of the recording node corresponding to the video to be detected;
Counting, according to the first keyword set, the number of times each must-read keyword occurs in the dialog template;
Obtaining the frequency threshold from the number of times each must-read keyword occurs in the dialog template.
4. The method according to claim 1, wherein detecting the salesperson audio data according to the preset second keyword set, the second keyword set including a script keyword set and a violation keyword set, comprises:
Converting the salesperson audio data into salesperson text data;
Obtaining the script template of the recording node corresponding to the video to be detected, and extracting the corresponding script information from the salesperson text data according to the script template;
Obtaining script keywords from the second keyword set, and matching the script information against the script keywords;
Obtaining violation keywords from the second keyword set, and traversing the salesperson text data for the violation keywords.
5. The method according to claim 1, wherein after detecting the client audio data according to the preset first keyword set and detecting the salesperson audio data according to the preset second keyword set, the method further comprises:
When the number of times the must-read keywords in the first keyword set occur in the client audio data reaches the frequency threshold, a script keyword in the script keyword set is present in the salesperson audio data, and no violation keyword in the violation keyword set is present in the salesperson audio data, determining that the detection result of the audio to be detected is a pass.
6. The method according to claim 1, wherein segmenting the audio to be detected into multiple audio fragments according to the preset speech segmentation algorithm, and merging the audio fragments that belong to the same speaker among the multiple audio fragments according to the preset speech clustering algorithm to obtain salesperson audio data and client audio data, comprises:
Filtering the audio to be detected to remove noise and ambient sound from it;
Segmenting the filtered audio to be detected into multiple audio fragments according to the preset speech segmentation algorithm;
Merging the audio fragments that belong to the same speaker among the multiple audio fragments according to the preset speech clustering algorithm, obtaining salesperson audio data and client audio data.
7. A voice quality inspection device, wherein the device comprises:
An acquisition module, configured to obtain, in real time during video recording, the video to be detected for each recording node and the frequency threshold corresponding to the video to be detected, and to extract the audio to be detected for each recording node from the video to be detected;
An extraction module, configured to segment the audio to be detected into multiple audio fragments according to a preset voice partitioning algorithm, and to merge the audio fragments that belong to the same speaker according to a preset voice clustering algorithm, obtaining business personnel audio data and client audio data;
A detection module, configured to detect the client audio data according to a preset first keyword set, and to detect the business personnel audio data according to a preset second keyword set, the second keyword set comprising a script keyword set and a violation keyword set;
A processing module, configured to determine that the detection result of the audio to be detected is a failure and to generate a re-recording prompt when the number of times a must-read keyword in the first keyword set occurs in the client audio data is not equal to the frequency threshold, or no script keyword from the script keyword set is present in the business personnel audio data, or a violation keyword from the violation keyword set is present in the business personnel audio data.
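One way to picture how the modules of claim 7 chain together is a small class with pluggable back ends. This is a sketch under stated assumptions: the acquisition module (pulling audio out of the recorded video) is left out, `diarize` and `transcribe` are hypothetical callables supplied by the caller, and the prompt text is invented for illustration. The failure rule follows the claim wording, including the "not equal to the frequency threshold" comparison.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class InspectionResult:
    passed: bool
    prompt: str  # re-recording prompt when the check fails, otherwise empty

@dataclass
class VoiceQualityInspector:
    # Hypothetical back ends; the claim does not name concrete algorithms.
    diarize: Callable[[bytes], Tuple[bytes, bytes]]  # audio -> (personnel, client)
    transcribe: Callable[[bytes], str]               # audio -> text
    must_read_keywords: List[str]
    script_keywords: List[str]
    violation_keywords: List[str]

    def inspect(self, audio: bytes, frequency_threshold: int) -> InspectionResult:
        # Extraction module: split the recording by speaker.
        personnel_audio, client_audio = self.diarize(audio)
        # Detection module: transcribe and look for keywords.
        client_text = self.transcribe(client_audio)
        personnel_text = self.transcribe(personnel_audio)
        counts = {kw: client_text.count(kw) for kw in self.must_read_keywords}
        has_script = any(kw in personnel_text for kw in self.script_keywords)
        has_violation = any(kw in personnel_text for kw in self.violation_keywords)
        # Processing module: any one failure condition triggers a re-recording prompt.
        failed = (any(c != frequency_threshold for c in counts.values())
                  or not has_script
                  or has_violation)
        prompt = "Please re-record this node." if failed else ""
        return InspectionResult(passed=not failed, prompt=prompt)

# Stub back ends so the sketch runs without any audio libraries.
inspector = VoiceQualityInspector(
    diarize=lambda audio: tuple(audio.split(b"|", 1)),
    transcribe=lambda audio: audio.decode("utf-8"),
    must_read_keywords=["I agree"],
    script_keywords=["carries risk"],
    violation_keywords=["guaranteed"],
)
good = inspector.inspect(b"this product carries risk|I agree", 1)
bad = inspector.inspect(b"returns are guaranteed|I agree", 1)
```

Injecting the back ends keeps the decision logic testable on its own, independent of whichever speech recognition and diarization components are actually deployed.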
8. The device according to claim 7, wherein the detection module is further configured to: obtain multiple must-read keywords from the preset first keyword set; convert the client audio data into client text data; traverse the client text data according to each must-read keyword and count the number of times each must-read keyword occurs in the client text data; and determine, from the number of times each must-read keyword occurs in the client text data, the number of times each must-read keyword occurs in the client audio data.
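The counting step of claim 8 reduces to a per-keyword occurrence count over the client transcript. The sketch below assumes the speech-to-text conversion has already happened and takes the transcript as input; the function name and the sample sentences are hypothetical.

```python
def count_must_read_keywords(client_text, must_read_keywords):
    """Count non-overlapping occurrences of each must-read keyword in the transcript."""
    return {kw: client_text.count(kw) for kw in must_read_keywords}

counts = count_must_read_keywords(
    "I agree. Yes, I agree and I understand the risk.",
    ["I agree", "I understand", "I confirm"],
)
```

Because the transcript is derived from the client audio data, these counts are taken as the number of times each keyword was spoken in that audio.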
9. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 6.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910616721.8A CN110364183A (en) | 2019-07-09 | 2019-07-09 | Method, apparatus, computer equipment and the storage medium of voice quality inspection |
PCT/CN2020/086625 WO2021004128A1 (en) | 2019-07-09 | 2020-04-24 | Voice quality control method and device, computer apparatus, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910616721.8A CN110364183A (en) | 2019-07-09 | 2019-07-09 | Method, apparatus, computer equipment and the storage medium of voice quality inspection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110364183A (en) | 2019-10-22 |
Family
ID=68218251
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910616721.8A (Pending) CN110364183A (en) | 2019-07-09 | 2019-07-09 | Method, apparatus, computer equipment and the storage medium of voice quality inspection |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110364183A (en) |
WO (1) | WO2021004128A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112911180A (en) * | 2021-01-28 | 2021-06-04 | 中国建设银行股份有限公司 | Video recording method and device, electronic equipment and readable storage medium |
CN113035201A (en) * | 2021-03-16 | 2021-06-25 | 广州佰锐网络科技有限公司 | Financial service quality inspection method and system |
CN113240436A (en) * | 2021-04-22 | 2021-08-10 | 北京沃东天骏信息技术有限公司 | Method and device for online customer service call technical quality inspection |
CN113571048B (en) * | 2021-07-21 | 2023-06-23 | 腾讯科技(深圳)有限公司 | Audio data detection method, device, equipment and readable storage medium |
CN113506585A (en) * | 2021-09-09 | 2021-10-15 | 深圳市一号互联科技有限公司 | Quality evaluation method and system for voice call |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102543063A (en) * | 2011-12-07 | 2012-07-04 | 华南理工大学 | Method for estimating speech speed of multiple speakers based on segmentation and clustering of speakers |
CN108091328A (en) * | 2017-11-20 | 2018-05-29 | 北京百度网讯科技有限公司 | Speech recognition error correction method, device and readable medium based on artificial intelligence |
CN108962282A (en) * | 2018-06-19 | 2018-12-07 | 京北方信息技术股份有限公司 | Speech detection analysis method, apparatus, computer equipment and storage medium |
CN109327632A (en) * | 2018-11-23 | 2019-02-12 | 深圳前海微众银行股份有限公司 | Intelligent quality inspection system, method and the computer readable storage medium of customer service recording |
CN109587360A (en) * | 2018-11-12 | 2019-04-05 | 平安科技(深圳)有限公司 | Electronic device should talk with art recommended method and computer readable storage medium |
CN109599093A (en) * | 2018-10-26 | 2019-04-09 | 北京中关村科金技术有限公司 | Keyword detection method, apparatus, equipment and the readable storage medium storing program for executing of intelligent quality inspection |
CN109711996A (en) * | 2018-08-17 | 2019-05-03 | 深圳壹账通智能科技有限公司 | The double record file quality detecting methods of declaration form, device, equipment and readable storage medium storing program for executing |
CN109729383A (en) * | 2019-01-04 | 2019-05-07 | 深圳壹账通智能科技有限公司 | Double record video quality detection methods, device, computer equipment and storage medium |
CN109767765A (en) * | 2019-01-17 | 2019-05-17 | 平安科技(深圳)有限公司 | Talk about art matching process and device, storage medium, computer equipment |
CN109767335A (en) * | 2018-12-15 | 2019-05-17 | 深圳壹账通智能科技有限公司 | Double record quality detecting methods, device, computer equipment and storage medium |
CN109819128A (en) * | 2019-01-23 | 2019-05-28 | 平安科技(深圳)有限公司 | A kind of quality detecting method and device of telephonograph |
CN109830246A (en) * | 2019-01-25 | 2019-05-31 | 北京海天瑞声科技股份有限公司 | Audio quality appraisal procedure, device, electronic equipment and storage medium |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105261362B (en) * | 2015-09-07 | 2019-07-05 | 科大讯飞股份有限公司 | A kind of call voice monitoring method and system |
CN108737667B (en) * | 2018-05-03 | 2021-09-10 | 平安科技(深圳)有限公司 | Voice quality inspection method and device, computer equipment and storage medium |
JP6614589B2 (en) * | 2018-05-09 | 2019-12-04 | 株式会社野村総合研究所 | Compliance check system and compliance check program |
CN109660744A (en) * | 2018-10-19 | 2019-04-19 | 深圳壹账通智能科技有限公司 | The double recording methods of intelligence, equipment, storage medium and device based on big data |
CN109783338B (en) * | 2019-01-02 | 2022-11-15 | 深圳壹账通智能科技有限公司 | Recording processing method and device based on service information and computer equipment |
CN110364183A (en) * | 2019-07-09 | 2019-10-22 | 深圳壹账通智能科技有限公司 | Method, apparatus, computer equipment and the storage medium of voice quality inspection |
2019
- 2019-07-09 CN CN201910616721.8A patent/CN110364183A/en active Pending
2020
- 2020-04-24 WO PCT/CN2020/086625 patent/WO2021004128A1/en active Application Filing
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021004128A1 (en) * | 2019-07-09 | 2021-01-14 | 深圳壹账通智能科技有限公司 | Voice quality control method and device, computer apparatus, and storage medium |
WO2021212998A1 (en) * | 2020-04-24 | 2021-10-28 | 深圳壹账通智能科技有限公司 | Multi-level logic-based speech verbal skill inspection method and apparatus, and computer device and storage medium |
CN111696527A (en) * | 2020-06-15 | 2020-09-22 | 龙马智芯(珠海横琴)科技有限公司 | Method and device for positioning voice quality inspection area, positioning equipment and storage medium |
CN111723204A (en) * | 2020-06-15 | 2020-09-29 | 龙马智芯(珠海横琴)科技有限公司 | Method and device for correcting voice quality inspection area, correction equipment and storage medium |
CN111723204B (en) * | 2020-06-15 | 2021-04-02 | 龙马智芯(珠海横琴)科技有限公司 | Method and device for correcting voice quality inspection area, correction equipment and storage medium |
CN111883139A (en) * | 2020-07-24 | 2020-11-03 | 北京字节跳动网络技术有限公司 | Method, apparatus, device and medium for screening target voices |
CN113158662A (en) * | 2021-04-27 | 2021-07-23 | 中国工商银行股份有限公司 | Real-time monitoring method and device for audio data |
CN113641795A (en) * | 2021-08-20 | 2021-11-12 | 上海明略人工智能(集团)有限公司 | Method and device for dialectical statistics, electronic equipment and storage medium |
CN115883760A (en) * | 2022-01-11 | 2023-03-31 | 北京中关村科金技术有限公司 | Real-time quality inspection method and device for audio and video and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2021004128A1 (en) | 2021-01-14 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110364183A (en) | Method, apparatus, computer equipment and the storage medium of voice quality inspection | |
WO2020244153A1 (en) | Conference voice data processing method and apparatus, computer device and storage medium | |
US9258425B2 (en) | Method and system for speaker verification | |
CN110533288A (en) | Business handling process detection method, device, computer equipment and storage medium | |
US11646038B2 (en) | Method and system for separating and authenticating speech of a speaker on an audio stream of speakers | |
US8219404B2 (en) | Method and apparatus for recognizing a speaker in lawful interception systems | |
US8078463B2 (en) | Method and apparatus for speaker spotting | |
US20190325345A1 (en) | Bot-based data collection for detecting phone solicitations | |
CN110910901A (en) | Emotion recognition method and device, electronic equipment and readable storage medium | |
CN109766474A (en) | Inquest signal auditing method, device, computer equipment and storage medium | |
CN110598008B (en) | Method and device for detecting quality of recorded data and storage medium | |
CN111914926B (en) | Sliding window-based video plagiarism detection method, device, equipment and medium | |
CN111914649A (en) | Face recognition method and device, electronic equipment and storage medium | |
CN111010484A (en) | Automatic quality inspection method for call recording | |
KR20100124983A (en) | System and method for protecting pornograph | |
CN110378587A (en) | Intelligent quality detecting method, system, medium and equipment | |
US20140297280A1 (en) | Speaker identification | |
CN112468753B (en) | Method and device for acquiring and checking record data based on audio and video recognition technology | |
CN110298543B (en) | Service tracking method, device, computer equipment and storage medium | |
Pandey et al. | Cell-phone identification from audio recordings using PSD of speech-free regions | |
Andy et al. | Simple duplicate frame detection of MJPEG codec for video forensic | |
CN109446335B (en) | News main body judging method, device, computer equipment and storage medium | |
CN114666137A (en) | Threat information processing method and device | |
CN112783799A (en) | Software daemon test method and device | |
CN111339829A (en) | User identity authentication method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||