WO2021078298A1 - Traffic quality inspection method, device, storage medium and server - Google Patents

Traffic quality inspection method, device, storage medium and server

Info

Publication number
WO2021078298A1
Authority
WO
WIPO (PCT)
Prior art keywords
quality inspection
audio
real
call
evaluation information
Prior art date
Application number
PCT/CN2020/123657
Other languages
English (en)
French (fr)
Inventor
刘波
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2021078298A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/22 Arrangements for supervision, monitoring or testing
    • H04M3/2236 Quality of speech transmission monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/22 Arrangements for supervision, monitoring or testing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/51 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing

Definitions

  • The present application claims priority to Chinese patent application No. 201911021502.1, filed with the Chinese Patent Office on October 25, 2019 and entitled "Traffic quality inspection method, device, storage medium and server", the entire content of which is incorporated herein by reference.
  • the embodiments of the present application relate to the field of network monitoring technology, and in particular, to a traffic quality inspection method, device, storage medium, and server.
  • the contact center usually records the content of the conversation between the operator and the user. Therefore, the traditional traffic quality inspection is mainly to sample the recordings in the recording library, and then perform quality inspections on the sampled recordings.
  • the embodiments of the present application provide a traffic quality inspection method, device, storage medium, and server, which are used to solve the problems of low efficiency and limitations of the traffic quality inspection.
  • the technical solution is as follows.
  • A traffic quality inspection method includes: obtaining a real-time audio stream generated during a call when an operator is in the call with a user; performing quality inspection on the real-time audio stream; and generating a quality inspection result of the real-time audio stream.
  • A traffic quality inspection device includes: an acquisition module, configured to acquire a real-time audio stream generated during a call when an operator is in the call with a user; a quality inspection module, configured to perform quality inspection on the real-time audio stream; and a generating module, configured to generate a quality inspection result of the real-time audio stream.
  • A computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the traffic quality inspection method described above.
  • A server includes a processor and a memory; at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the traffic quality inspection method described above.
  • FIG. 1 is a method flowchart of a traffic quality inspection method provided by an embodiment of the present application.
  • FIG. 2 is a method flowchart of a traffic quality inspection method provided by another embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a traffic quality inspection method provided by another embodiment of the present application.
  • FIG. 4 is a structural block diagram of a traffic quality inspection device provided by still another embodiment of the present application.
  • FIG. 5 is a structural block diagram of a traffic quality inspection device provided by still another embodiment of the present application.
  • FIG. 1 shows a method flowchart of a traffic quality inspection method provided by an embodiment of the present application.
  • the traffic quality inspection method can be applied to a server.
  • the traffic quality inspection method may include the following steps.
  • Step 101 When the operator is in a call with a user, obtain a real-time audio stream generated during the call.
  • The call between the operator and the user can be initiated by either party. For example, when a user wants to make a business consultation, lodge a business complaint, or give a business suggestion, the user can call a contact center agent through the user's terminal, and the contact center agent assigns an operator to answer the call; or, when the operator wants to push services or information to the user, the operator can call the user's terminal through the contact center agent.
  • This embodiment does not limit the call initiation method between the operator and the user.
  • During the call, the audio stream can be collected. After every predetermined time period, the audio stream collected in the current time period is encoded using RTP (Real-time Transport Protocol) to obtain an RTP protocol data packet, and the RTP protocol data packet is then sent to the server through the communication core network or the media server.
  • the server unpacks and decodes each received RTP protocol data packet, and then combines the various audio streams obtained by decoding into a real-time audio stream.
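The unpacking step above can be sketched as follows. This is a minimal illustration, assuming the standard RTP header layout (12-byte fixed header, optional CSRC list, optional extension, optional padding) and assuming the payload can be concatenated directly; the patent does not name a codec, so real payloads would still need codec-specific decoding. The names `RtpPacket`, `parse_rtp`, and `assemble_stream` are hypothetical, and sequence-number wraparound is ignored for brevity.

```python
import struct
from typing import NamedTuple


class RtpPacket(NamedTuple):
    sequence: int
    timestamp: int
    payload: bytes


def parse_rtp(packet: bytes) -> RtpPacket:
    """Strip the RTP header (fixed part, CSRC list, extension, padding)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, _marker_pt, sequence, timestamp, _ssrc = struct.unpack(
        "!BBHII", packet[:12]
    )
    if first >> 6 != 2:
        raise ValueError("unsupported RTP version")
    offset = 12 + 4 * (first & 0x0F)  # skip the CSRC identifiers
    if first & 0x10:  # extension header present
        ext_words = struct.unpack("!H", packet[offset + 2:offset + 4])[0]
        offset += 4 + 4 * ext_words
    end = len(packet)
    if first & 0x20:  # padding present: the last byte holds the pad count
        end -= packet[-1]
    return RtpPacket(sequence, timestamp, packet[offset:end])


def assemble_stream(packets: list[bytes]) -> bytes:
    """Order packets by sequence number and concatenate their payloads."""
    parsed = sorted((parse_rtp(p) for p in packets), key=lambda p: p.sequence)
    return b"".join(p.payload for p in parsed)
```

Sorting by sequence number is one simple way to tolerate out-of-order delivery; a production receiver would more likely use a jitter buffer.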
  • Step 102 Perform quality inspection on the real-time audio stream.
  • the server can perform quality inspection on the real-time audio stream.
  • the quality inspection process is described in detail below, and will not be repeated here.
  • Step 103 Generate a quality inspection result of the real-time audio stream.
  • the server may generate the quality inspection result in real time during the quality inspection, or may generate the quality inspection result after the call ends, which is not limited in this embodiment.
  • The traffic quality inspection method obtains the real-time audio stream generated during the call when the operator is in a call with the user, performs quality inspection on the real-time audio stream, and generates the quality inspection result of the real-time audio stream. Quality inspection can therefore be performed during the call, without first recording the call content and then sampling the recordings, which improves the efficiency of quality inspection. In addition, all calls can be inspected, which avoids missing important and typical recordings during sampling and thereby improves the comprehensiveness of quality inspection.
  • FIG. 2 shows a method flowchart of a traffic quality inspection method provided by another embodiment of the present application.
  • the traffic quality inspection method can be applied to a server.
  • the traffic quality inspection method may include the following steps.
  • Step 201 When the operator is in a call with a user, obtain a real-time audio stream generated during the call.
  • The call between the operator and the user can be initiated by either party. For example, when a user wants to make a business consultation, lodge a business complaint, or give a business suggestion, the user can call a contact center agent through the user's terminal, and the contact center agent assigns an operator to answer the call; or, when the operator wants to push services or information to the user, the operator can call the user's terminal through the contact center agent.
  • This embodiment does not limit the call initiation method between the operator and the user.
  • During the call, the audio stream can be collected. After every predetermined time period, the audio stream collected in the current time period is encoded using RTP to obtain an RTP protocol data packet, and the RTP protocol data packet is then sent to the server through the communication core network or the media server.
  • the server unpacks and decodes each received RTP protocol data packet, and then combines the various audio streams obtained by decoding into a real-time audio stream.
  • Step 202 Divide the real-time audio stream into multiple audio segments.
  • the server may preset a division mode of the real-time audio stream, and divide the real-time audio stream according to the division mode to obtain multiple audio clips.
  • the division method is to intercept audio per unit time from a real-time audio stream, and use the audio as an audio segment.
  • the unit time can be any numerical value, for example, the unit time can be 1-5 seconds, which is not limited in this embodiment.
  • For example, with a unit time of 5 seconds, the server can take seconds 1-5 of the received real-time audio stream as the first audio segment, then take seconds 6-10 as the second audio segment, and so on, until the last audio segment in the real-time audio stream has been taken.
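The fixed-length division described above can be sketched as follows. This is an illustration only, assuming the decoded stream is uncompressed PCM held in memory; `split_stream` and its parameters are hypothetical names, and a live system would cut segments incrementally as audio arrives.

```python
def split_stream(
    pcm: bytes, sample_rate: int, sample_width: int, unit_seconds: float
) -> list[bytes]:
    """Cut a PCM byte stream into consecutive segments of unit_seconds each.

    The final segment may be shorter than the others.
    """
    segment_bytes = int(sample_rate * unit_seconds) * sample_width
    if segment_bytes <= 0:
        raise ValueError("unit_seconds is too small for this sample rate")
    return [
        pcm[start:start + segment_bytes]
        for start in range(0, len(pcm), segment_bytes)
    ]
```

For instance, 12 seconds of 8 kHz, 16-bit mono audio split with a 5-second unit yields two full segments and one 2-second remainder.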
  • Step 203 Perform quality inspection on all or part of the audio clips to obtain evaluation information of the audio clips.
  • the server may perform quality inspection on the obtained audio clips after the call ends; or, in order to improve the real-time quality of the quality inspection, the server may perform quality inspection on the obtained audio clips in real time during the call.
  • the quality inspection result of the real-time audio stream can be obtained at the end of the call, thereby improving the efficiency of the quality inspection.
  • The server can perform quality inspection on all of the obtained audio clips, thereby ensuring the comprehensiveness of the quality inspection; or, the server can select a subset of the obtained audio clips and perform quality inspection only on that subset, which reduces the processing pressure on the server by reducing the number of audio clips to inspect and improves the efficiency of quality inspection.
  • The server can select the subset of audio clips in several ways. For example, the server can select one audio clip after every predetermined number of audio clips, or the server can preprocess each audio clip and select audio clips based on the preprocessing result. Alternatively, the server may select audio clips at random, which is not limited in this embodiment.
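Two of the selection strategies mentioned above (periodic and random) might look like the sketch below. The function names, the interpretation of "every predetermined audio clip" as "every n-th clip", and the per-clip sampling probability are all assumptions made for illustration.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def select_every_nth(segments: Sequence[T], n: int) -> list[T]:
    """Keep the n-th, 2n-th, 3n-th ... segment (1-based counting)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return list(segments[n - 1::n])


def select_random(
    segments: Sequence[T], probability: float, rng: random.Random
) -> list[T]:
    """Keep each segment independently with the given probability."""
    return [s for s in segments if rng.random() < probability]
```

Passing an explicit `random.Random` instance keeps the random strategy reproducible when a seed is fixed.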
  • The server can perform quality inspection on each audio clip as soon as it is obtained, so as to ensure the comprehensiveness of the quality inspection; or, after obtaining an audio clip, the server can first judge whether the clip needs quality inspection, inspect it if so, and skip it otherwise. This reduces the processing pressure on the server by reducing the number of audio clips to inspect and improves the efficiency of quality inspection. There are many ways for the server to determine whether an audio clip needs quality inspection; for details, refer to the description above, which is not repeated here.
  • Evaluation information is the result of quality inspection on an audio clip.
  • the evaluation information may include the score of the audio clip, keywords, etc., which is not limited in this embodiment.
  • the server can set multiple quality inspection items for the audio clip in advance, and set a corresponding score for each quality inspection item.
  • The score of the audio clip is then obtained by adding up the scores of the quality inspection items that the audio clip meets; or, the server can preset evaluation models, set a corresponding score for each evaluation model, and use the score of the evaluation model that matches the audio clip as the score of the audio clip.
  • the following uses a preset evaluation model as an example to illustrate three possible implementation manners.
  • In the first implementation manner, performing quality inspection on each of all or some of the audio segments to obtain evaluation information of the audio segment may include the following steps.
  • The server may execute steps 1-4 to generate the evaluation information of an audio segment each time it obtains an audio segment that needs quality inspection, and stop once it has obtained the evaluation information of all audio segments that need quality inspection.
  • the server can use any speech-to-text technology to convert the audio segment into text, and this embodiment does not limit the conversion method.
  • The server may preset keyword extraction rules and extract keywords from the text according to those rules. For example, when the extraction rule is to extract honorific words, keywords such as "you" and "please" can be extracted from the text; when the extraction rule is to extract business terms, and the call content is assumed to be the operator introducing mobile broadband services to the user, keywords such as "mobile broadband service", "tariff", and "bandwidth" can be extracted from the text.
  • the semantic model is a kind of evaluation model, and the semantic model is related to the content of the call, and is used for quality inspection of the content of the call.
  • When each semantic model corresponds to a set of keywords, the server can compare the keywords extracted from the text with the set of keywords of each semantic model. When the keywords extracted from the text are the same as the set of keywords of a semantic model (that is, the keywords are identical) or similar to it (that is, the proportion of identical keywords reaches a predetermined ratio threshold), it can be determined that the semantic model matches the keywords.
  • the evaluation information of the audio segment is generated according to the semantic model.
  • When each semantic model corresponds to a score, the server can use the score of the matched semantic model as the score of the audio segment to obtain the evaluation information of the audio segment.
  • The keywords of the audio clip can also be added to the evaluation information.
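The keyword extraction and semantic-model matching described above can be sketched as follows. This is a hedged illustration: speech-to-text is assumed to have already produced `text`; the extraction rule is reduced to a vocabulary lookup; and the "proportion of identical keywords" is assumed to be measured against the model's keyword set. `SemanticModel`, `extract_keywords`, `match_semantic_model`, and the 0.6 threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SemanticModel:
    name: str
    keywords: frozenset[str]
    score: float


def extract_keywords(text: str, vocabulary: set[str]) -> set[str]:
    """Return the vocabulary terms that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return {term for term in vocabulary if term in lowered}


def match_semantic_model(
    keywords: set[str], models: list[SemanticModel], ratio_threshold: float = 0.6
) -> Optional[SemanticModel]:
    """Return the model whose keyword set overlaps most, if above the threshold."""
    best, best_ratio = None, 0.0
    for model in models:
        if not model.keywords:
            continue
        ratio = len(keywords & model.keywords) / len(model.keywords)
        if ratio >= ratio_threshold and ratio > best_ratio:
            best, best_ratio = model, ratio
    return best
```

An identical keyword set gives a ratio of 1.0, so the "same" case in the text is covered by the same comparison as the "similar" case.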
  • In the second implementation manner, performing quality inspection on each of all or some of the audio segments to obtain evaluation information of the audio segment may include the following steps.
  • An emotion parameter is extracted from the audio segment; the emotion parameter indicates the emotion of the operator or the user during the call.
  • The server may execute steps 1-3 to generate the evaluation information of an audio segment each time it obtains an audio segment that needs quality inspection, and stop once it has obtained the evaluation information of all audio segments that need quality inspection.
  • the operator or the user may change their voice to express their dissatisfaction.
  • the voice change can be reflected by voice feature parameters, and the voice feature parameters can include volume, pitch, and so on. It can be seen that the voice feature parameter during a call can be used as a kind of emotion parameter.
  • extracting the emotion parameter from the audio segment may include: extracting the voice feature parameter from the audio segment, and using the voice feature parameter as the emotion parameter.
  • Extracting the emotion parameter from the audio clip includes: converting the audio clip into text, extracting keywords from the text, and using the keywords as the emotion parameter.
  • The voice feature parameters and the keywords during a call can also be used together as emotion parameters.
  • In this case, extracting emotion parameters from an audio clip includes: extracting voice feature parameters from the audio clip, converting the audio clip into text, extracting keywords from the text, and using the keywords and the voice feature parameters as emotion parameters.
  • the emotion model is a kind of evaluation model, and the emotion model is related to the emotion, and is used for quality inspection of the emotions of both parties during the call (that is, the service attitude of the operator and the communication attitude of the user).
  • When the emotion parameter is a voice feature parameter and each emotion model corresponds to a voice feature parameter interval, the server can compare the extracted voice feature parameter with the voice feature parameter interval of each emotion model. When the extracted voice feature parameter falls within the voice feature parameter interval of an emotion model, it can be determined that the emotion model matches the extracted voice feature parameter.
  • When the emotion parameter is a keyword and each emotion model corresponds to a set of keywords, the server can compare the keywords extracted from the text with the set of keywords of each emotion model. When the extracted keywords are the same as the set of keywords of an emotion model (that is, the keywords are identical) or similar to it (that is, the proportion of identical keywords reaches the predetermined ratio threshold), it can be determined that the emotion model matches the keywords extracted from the text.
  • When the emotion parameters are voice feature parameters and keywords, the server can compare the extracted voice feature parameters with the voice feature parameter interval of each emotion model, and compare the keywords extracted from the text with the set of keywords of each emotion model. When the extracted voice feature parameters fall within the voice feature parameter interval of an emotion model, and the extracted keywords are the same as or similar to the set of keywords of that emotion model, it can be determined that the emotion model matches the extracted voice feature parameters and keywords.
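The interval-based matching of voice feature parameters can be sketched as follows. The sketch assumes 16-bit little-endian PCM input, uses RMS level in dBFS as a stand-in for "volume", and takes pitch as an already-computed number, since the patent does not say how features are measured. `EmotionModel`, `rms_dbfs`, `match_emotion_model`, and the feature names are hypothetical.

```python
import math
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmotionModel:
    name: str
    intervals: dict[str, tuple[float, float]]  # feature name -> [low, high)
    score: float


def rms_dbfs(pcm: bytes) -> float:
    """RMS level of 16-bit little-endian PCM, in dB relative to full scale."""
    count = len(pcm) // 2
    if count == 0:
        return float("-inf")
    samples = struct.unpack(f"<{count}h", pcm[:2 * count])
    rms = math.sqrt(sum(s * s for s in samples) / count)
    return 20 * math.log10(rms / 32768) if rms > 0 else float("-inf")


def match_emotion_model(
    features: dict[str, float], models: list[EmotionModel]
) -> Optional[EmotionModel]:
    """Return the first model whose every interval contains the feature value."""
    for model in models:
        if all(
            name in features and low <= features[name] < high
            for name, (low, high) in model.intervals.items()
        ):
            return model
    return None
```

Keyword-based emotion matching would follow the same set-overlap comparison used for semantic models and could be combined with this check by requiring both to succeed.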
  • the evaluation information of the audio segment is generated according to the emotion model.
  • When each emotion model corresponds to a score, the server can use the score of the matched emotion model as the score of the audio segment to obtain the evaluation information of the audio segment.
  • When the emotion parameter also includes keywords, the keywords of the audio clip may also be added to the evaluation information.
  • The first and second implementation manners can also be combined: the server can determine the semantic model and the emotion model that match the audio segment, and then generate the evaluation information of the audio segment based on both models. In this case, the server can calculate the average of the scores corresponding to the semantic model and the emotion model, and use the average as the score of the audio segment to obtain the evaluation information of the audio segment.
  • The server may also add the keywords of the audio clip to the evaluation information.
  • Keyword extraction is required when matching the semantic model, and it is also required when the emotion parameters include keywords. Therefore, in the third implementation manner, keyword extraction needs to be performed only once, which improves the efficiency of quality inspection.
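As a small illustration of the combined case, the evaluation information for one segment could be assembled as below. `EvaluationInfo` and `combine_scores` are hypothetical names; the only behaviour taken from the text is averaging the two model scores and attaching the keywords.

```python
from dataclasses import dataclass, field


@dataclass
class EvaluationInfo:
    score: float
    keywords: set[str] = field(default_factory=set)


def combine_scores(
    semantic_score: float, emotion_score: float, keywords: set[str]
) -> EvaluationInfo:
    """Average the semantic and emotion scores and attach the shared keywords."""
    return EvaluationInfo(
        score=(semantic_score + emotion_score) / 2, keywords=set(keywords)
    )
```

Because the keywords are passed in once, the same extraction result serves both the semantic match and the evaluation record.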
  • Step 204 Generate a quality inspection result of the real-time audio stream according to all the currently obtained evaluation information.
  • If the quality inspection result is generated after the call ends, the server can generate a quality inspection result based on all the evaluation information obtained once the call is over; if the quality inspection result is generated in real time during the call, the server can generate a quality inspection result at any moment during the call based on all the evaluation information obtained so far. For example, when the server obtains the first piece of evaluation information, it can generate a quality inspection result based on that piece; when it obtains the second piece, it can generate a quality inspection result based on the first and second pieces; when it obtains the third piece, it can generate a quality inspection result based on the first, second, and third pieces; and so on.
  • When the evaluation information includes a score, after all the evaluation information is obtained, the average of the scores in all the evaluation information can be calculated and used as the quality inspection result of the real-time audio stream.
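A running average is one straightforward way to produce a result after each new piece of evaluation information, as in the real-time case described above. The class name `RunningQualityResult` is hypothetical, and using the plain mean as the result is the simple case the text mentions.

```python
class RunningQualityResult:
    """Keeps the mean of all segment scores seen so far."""

    def __init__(self) -> None:
        self._total = 0.0
        self._count = 0

    def add(self, score: float) -> float:
        """Record one segment score and return the updated result."""
        self._total += score
        self._count += 1
        return self._total / self._count
```

Calling `add` once per inspected segment gives an up-to-date result at any moment of the call, and the last returned value is the end-of-call result.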
  • Generating the quality inspection result of the real-time audio stream based on all the evaluation information may include: obtaining the traffic type corresponding to the call; determining the evaluation criteria of the real-time audio stream according to the traffic type; and generating the quality inspection result of the real-time audio stream according to the evaluation criteria and the evaluation information of all audio clips.
  • the traffic type may include business consultation, complaint, business recommendation, etc., which is not limited in this embodiment.
  • Different types of traffic can correspond to different evaluation criteria.
  • For example, when the traffic type is business consultation, the evaluation criterion can determine the service score according to whether a business answer can be provided to the user; when the traffic type is business complaint, the evaluation criterion can determine the service score according to whether the user's problem can be solved; when the traffic type is business recommendation, the evaluation criterion can determine the service score according to whether the user can be persuaded to open a certain service, or according to the number of services that can be introduced to the user.
  • For example, the traffic type can be determined based on the keywords in all the evaluation information, the service score of the real-time audio stream can then be determined based on the keywords and the evaluation criteria, the score of the real-time audio stream can be determined next, and finally the service score and the score can be weighted to obtain the quality inspection result of the real-time audio stream.
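The weighting of a type-specific service (business) score with the averaged segment score could be sketched as below. Everything concrete here is an assumption made for illustration: the traffic type is passed in rather than inferred from keywords, the criteria for the two traffic types, the `KNOWN_SERVICES` set, the 25-points-per-service rule, and the 0.4 weight are not taken from the patent.

```python
# Hypothetical catalogue of service names that count as "introduced".
KNOWN_SERVICES = frozenset({"mobile broadband service", "family plan"})


def service_score(traffic_type: str, keywords: set[str]) -> float:
    """Apply an illustrative evaluation criterion chosen by traffic type."""
    if traffic_type == "business consultation":
        return 0.0 if "unanswerable" in keywords else 100.0
    if traffic_type == "business recommendation":
        return min(100.0, 25.0 * len(keywords & KNOWN_SERVICES))
    raise ValueError(f"no evaluation criterion for {traffic_type!r}")


def quality_result(
    traffic_type: str,
    evaluations: list[tuple[float, set[str]]],
    service_weight: float = 0.4,
) -> float:
    """Weight the service score against the mean of the segment scores."""
    if not evaluations:
        raise ValueError("no evaluation information available")
    all_keywords = set().union(*(kw for _, kw in evaluations))
    average = sum(score for score, _ in evaluations) / len(evaluations)
    service = service_score(traffic_type, all_keywords)
    return service_weight * service + (1 - service_weight) * average
```

Keeping the criteria in one function makes it easy to add further traffic types, such as complaints, without touching the weighting step.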
  • The method may further include: when the quality inspection result of the real-time audio stream does not meet a preset condition, generating improvement information according to the evaluation criteria and all the evaluation information currently obtained, where the improvement information is used to instruct the operator to improve the traffic level.
  • The preset condition can be a preset score: when the score of the quality inspection result is lower than the preset score, the improvement information can be generated and displayed to the operator, so that the operator can improve based on the improvement information.
  • For example, when the score of the operator's quality inspection result is low and it is determined according to the evaluation criteria that the operator recommended few services to the user, improvement information suggesting that more services be recommended to the user can be generated; or, when the score of the operator's quality inspection result is low and it is determined according to the evaluation criteria that the operator's service attitude is poor, improvement information suggesting a better service attitude can be generated.
  • The traffic quality inspection method obtains the real-time audio stream generated during the call when the operator is in a call with the user, performs quality inspection on the real-time audio stream, and generates the quality inspection result of the real-time audio stream. Quality inspection can therefore be performed during the call, without first recording the call content and then sampling the recordings, which improves the efficiency of quality inspection. In addition, all calls can be inspected, which avoids missing important and typical recordings during sampling and thereby improves the comprehensiveness of quality inspection.
  • The quality inspection result can be generated and the improvement information generated based on it, so that evaluations and suggestions are given in real time during the operator's work rather than as static after-the-fact evaluations, which can improve user satisfaction.
  • After quality inspection is performed on each of all or some of the audio segments, if a problem or unexpected situation is found during the call, the call can also be controlled so as to intervene in it in real time, so that problems are handled in real time and users receive more efficient, higher-quality service. Accordingly, after performing quality inspection on each of all or some of the audio segments to obtain the evaluation information of the audio segment, the method further includes: for each of all or some of the audio segments, determining whether the evaluation information of the audio segment meets the control condition; and when the evaluation information of the audio segment meets the control condition, controlling the call.
  • When the evaluation information includes a score, the server may compare the score with a predetermined score threshold; when the score is lower than the predetermined score threshold, it is determined that the evaluation information satisfies the control condition; when the score is higher than the predetermined score threshold, it is determined that the evaluation information does not satisfy the control condition, and the control process ends.
  • When the evaluation information includes a keyword, the server may compare the keyword with a predetermined keyword; when the keyword matches the predetermined keyword, it is determined that the evaluation information satisfies the control condition; when the keyword does not match the predetermined keyword, it is determined that the evaluation information does not satisfy the control condition, and the control process ends. For example, when the keyword matches the predetermined keyword "unanswerable", it means that the operator cannot answer the user's question, and the call needs to be controlled.
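The two control-condition checks above can be sketched together. The function name `needs_control`, the threshold of 60, and combining the score check and the keyword check with a logical OR are assumptions; the text describes each check separately.

```python
from typing import Iterable, Optional


def needs_control(
    score: Optional[float],
    keywords: Iterable[str],
    score_threshold: float = 60.0,
    trigger_keywords: frozenset[str] = frozenset({"unanswerable"}),
) -> bool:
    """Return True when a segment's evaluation information calls for control."""
    if score is not None and score < score_threshold:
        return True
    return bool(set(keywords) & trigger_keywords)
```

A segment with a high score can still trigger control when a predetermined keyword such as "unanswerable" appears in it.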
  • controlling the call may include the following steps.
  • When the evaluation information includes a score, a scoring interval corresponding to each problem level can be set, and the score can be compared with each scoring interval. When the score falls within a scoring interval, the server can use the problem level corresponding to that scoring interval as the problem level of the audio clip.
  • When the evaluation information includes keywords, a set of keywords corresponding to each problem level can be set, and the keywords can be compared with each set of keywords. When the keywords match a set of keywords, the server may use the problem level corresponding to that set of keywords as the problem level of the audio clip.
  • The control operation in this embodiment may include at least one of interception, monitoring, interruption, three-party conference, and forced teardown.
  • Interception means switching the call between the user and the operator to a call between the user and the quality inspector.
  • Monitoring means that the quality inspector listens to the call between the user and the operator.
  • Interruption means that the quality inspector can insert voice into the call between the user and the operator.
  • A three-party conference means switching the two-party call between the user and the operator to a three-way call among the user, the operator, and the quality inspector.
  • Forced teardown means forcibly terminating the call between the user and the operator.
  • The correspondence among the traffic type, the problem level, and the control operation can be preset.
  • For example, the preset correspondences may include: a correspondence among the business consultation traffic type, the low problem level, and a control operation; a correspondence in which the traffic type is business consultation, the problem level is intermediate, and the control operation is a three-party conference; a correspondence in which the traffic type is business consultation, the problem level is high, and the control operation is interception; and so on, which is not limited in this embodiment.
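The level lookup and the preset correspondence table can be sketched as follows. The scoring intervals are invented for illustration. The operations for the intermediate and high levels follow the examples in the text, while the operation for the low level is not stated there, so `MONITORING` is used purely as a placeholder. All names are hypothetical.

```python
from enum import Enum
from typing import Optional


class ControlOperation(Enum):
    INTERCEPTION = "interception"
    MONITORING = "monitoring"
    INTERRUPTION = "interruption"
    THREE_PARTY_CONFERENCE = "three-party conference"
    FORCED_TEARDOWN = "forced teardown"


# Hypothetical scoring intervals [low, high) for each problem level.
LEVEL_INTERVALS = {
    "low": (40.0, 60.0),
    "intermediate": (20.0, 40.0),
    "high": (0.0, 20.0),
}

# (traffic type, problem level) -> control operation.
CONTROL_TABLE = {
    # Placeholder: the source does not state the operation for this entry.
    ("business consultation", "low"): ControlOperation.MONITORING,
    ("business consultation", "intermediate"): ControlOperation.THREE_PARTY_CONFERENCE,
    ("business consultation", "high"): ControlOperation.INTERCEPTION,
}


def problem_level(score: float) -> Optional[str]:
    """Return the level whose scoring interval contains the score, if any."""
    for level, (low, high) in LEVEL_INTERVALS.items():
        if low <= score < high:
            return level
    return None


def choose_operation(traffic_type: str, score: float) -> Optional[ControlOperation]:
    """Look up the preset control operation for this traffic type and score."""
    level = problem_level(score)
    if level is None:
        return None
    return CONTROL_TABLE.get((traffic_type, level))
```

Keeping the correspondence in a table means new traffic types or levels can be configured without changing the lookup logic.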
  • FIG. 3 shows a schematic flowchart of a traffic quality inspection method.
  • Step 301 Acquire audio data packets (ie, RTP protocol data packets).
  • This step can be performed by the audio stream collection module in the server.
  • Step 302 Decode the audio data packet to obtain a real-time audio stream.
  • This step can be executed by the traffic processing module in the server.
  • Step 303 Divide the real-time audio stream into audio segments, convert the audio segments into text, and extract parameters, which include keywords, voice feature parameters, and so on.
  • This step can be performed by the audio stream processing module in the server.
  • Step 304 Obtain a semantic model matching the audio clip from the semantic model library according to the keywords.
  • Step 305 Obtain an emotion model matching the audio segment from the emotion model library according to the speech feature parameters and/or keywords.
  • Step 306 Obtain a score (ie, evaluation information) according to the semantic model and the emotion model.
  • Steps 304-306 can be performed by the model matching module in the server.
  • Step 307 Determine, according to the score, whether to transfer to the monitoring agent (that is, whether to control the call); if so, perform step 308; if not, perform step 309.
  • Step 308 Monitor the call, then perform step 309.
  • Steps 307-308 can be executed by the monitoring module in the server.
  • Step 309 Store the score, and generate a total evaluation (that is, the quality inspection result) after obtaining the scores of all audio clips.
  • This step can be executed by the evaluation module in the server.
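The per-segment loop of steps 303-309 can be sketched as below. Scoring and monitoring are injected as callables because their internals are described elsewhere; `run_quality_inspection`, the threshold of 60, and the use of the mean as the total evaluation are assumptions for illustration.

```python
from typing import Callable, Iterable


def run_quality_inspection(
    segments: Iterable[bytes],
    score_segment: Callable[[bytes], float],
    monitor_call: Callable[[int, float], None],
    monitor_threshold: float = 60.0,
) -> float:
    """Score each segment, hand low scores to monitoring, return the mean score."""
    scores: list[float] = []
    for index, segment in enumerate(segments):
        score = score_segment(segment)      # steps 303-306: extract, match, score
        if score < monitor_threshold:       # step 307: transfer to monitoring?
            monitor_call(index, score)      # step 308: monitor the call
        scores.append(score)                # step 309: store the score
    if not scores:
        raise ValueError("no audio segments were inspected")
    return sum(scores) / len(scores)        # step 309: total evaluation
```

Because `segments` can be any iterable, the same loop works with a generator that yields segments as the call progresses.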
  • FIG. 4 shows a structural block diagram of a traffic quality inspection device provided by an embodiment of the present application.
  • the traffic quality inspection device can be applied to a server.
  • the traffic quality inspection device may include the following modules.
  • the obtaining module 410 is used to obtain the real-time audio stream generated during the call when the operator is in the call with the user
  • Quality inspection module 420 for quality inspection of real-time audio streams
  • the generating module 430 is used to generate the quality inspection result of the real-time audio stream.
  • the quality inspection module 420 is also used to divide the real-time audio stream into multiple audio segments; perform quality inspection on all or part of the audio segments to obtain evaluation information of the audio segments
  • the generating module 430 is also used to generate the quality inspection result of the real-time audio stream according to all the currently obtained evaluation information.
  • The quality inspection module 420 is also used to, for each audio segment among all or part of the audio segments, convert the audio segment into text; extract keywords from the text; determine, from the semantic model library, a semantic model that matches the keywords, the semantic model being related to the content of the call; and generate the evaluation information of the audio segment according to the semantic model.
  • The quality inspection module 420 is also used to, for each audio segment among all or part of the audio segments, extract emotion parameters from the audio segment, the emotion parameters indicating the emotion of the operator or the user during the call; determine, from the emotion model library, an emotion model that matches the emotion parameters, the emotion model being related to emotion; and generate the evaluation information of the audio segment according to the emotion model.
  • The quality inspection module 420 is also used to extract voice feature parameters from the audio segment and use the voice feature parameters as the emotion parameters; or convert the audio segment into text, extract keywords from the text, and use the keywords as the emotion parameters; or extract voice feature parameters from the audio segment, convert the audio segment into text, extract keywords from the text, and use the keywords and the voice feature parameters as the emotion parameters.
  • The generating module 430 is also used to obtain the traffic type corresponding to the call; determine the evaluation standard of the real-time audio stream according to the traffic type; and generate the quality inspection result of the real-time audio stream according to the evaluation standard and all evaluation information currently obtained.
  • The generating module 430 is further used to, after generating the quality inspection result of the real-time audio stream according to the evaluation standard and all currently obtained evaluation information, generate improvement information according to the evaluation standard and the evaluation information of all audio segments when the quality inspection result does not meet the preset condition; the improvement information is used to instruct the operator to improve the traffic level.
  • the device further includes the following modules.
  • The determining module 440 is used to, after quality inspection is performed on all or part of the audio segments to obtain the evaluation information of the audio segments, determine for each of those audio segments whether its evaluation information satisfies the control condition.
  • The control module 450 is used to control the call when the evaluation information of the audio segment satisfies the control condition.
  • The control module 450 is also used to obtain the traffic type corresponding to the call; obtain the problem level corresponding to the evaluation information of the audio segment, the problem level indicating the level of the problem that occurred in the call; determine the control operation according to the traffic type and the problem level; and execute the control operation on the call.
  • In summary, the traffic quality inspection device obtains the real-time audio stream generated during the call while the operator is talking with the user, performs quality inspection on the real-time audio stream, and generates its quality inspection result. Quality inspection can therefore be performed during the call, without first recording the call and then sampling the recordings, which improves the efficiency of quality inspection. In addition, all calls can be inspected, which avoids missing important and typical recordings during sampling and thereby makes quality inspection more comprehensive.
  • The quality inspection result can be generated as soon as the call ends, and improvement information is generated from it, so that evaluations and suggestions on the operator's work are given in real time rather than as a static after-the-fact assessment, which can improve user satisfaction.
  • After quality inspection is performed on each of all or part of the audio segments, if a problem or an unexpected situation is found during the call, the call can also be controlled so as to intervene in real time, so that the problem can be handled in real time and users receive more efficient, higher-quality service.
  • An embodiment of the present application provides a computer-readable storage medium that stores at least one instruction, at least one program, a code set, or an instruction set; the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the traffic quality inspection method described above.
  • An embodiment of the present application provides a server that includes a processor and a memory; at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the traffic quality inspection method described above.
  • In addition, an embodiment of the present invention provides a computer program product that includes a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions that, when executed by a computer, cause the computer to execute the method in any of the foregoing method embodiments.
  • It should be noted that when the traffic quality inspection device provided in the above embodiment performs traffic quality inspection, the division into the above functional modules is only an example. In practice, the functions can be assigned to different functional modules as needed; that is, the internal structure of the traffic quality inspection device can be divided into different functional modules to complete all or part of the functions described above.
  • The traffic quality inspection device provided in the above embodiment and the traffic quality inspection method embodiment belong to the same concept; the specific implementation process is detailed in the method embodiment and is not repeated here.
  • A person of ordinary skill in the art can understand that all or part of the steps of the above embodiments can be implemented by hardware, or by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which can be a read-only memory, a magnetic disk, an optical disk, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

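The embodiments of this application disclose a traffic quality inspection method, device, storage medium, and server, belonging to the technical field of network monitoring. The method includes: obtaining, while an operator is on a call with a user, the real-time audio stream generated during the call; performing quality inspection on the real-time audio stream; and generating a quality inspection result of the real-time audio stream.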

Description

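Traffic quality inspection method, device, storage medium, and server
Cross-reference
This application claims priority to Chinese patent application No. 201911021502.1, filed with the China Patent Office on October 25, 2019 and entitled "Traffic quality inspection method, device, storage medium, and server", the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of this application relate to the technical field of network monitoring, and in particular to a traffic quality inspection method, device, storage medium, and server.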
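Background
To examine and audit the work of operators and to standardize the work of the call-handling department, the traffic of a contact center needs to be quality-inspected.
A contact center usually records the calls between operators and users, so traditional traffic quality inspection mainly samples recordings from the recording library and then inspects the sampled recordings.
Because the recording library holds a large number of recordings, sampling-based quality inspection is inefficient. In addition, because of the randomness of the sampling algorithm, important and typical recordings may be missed during sampling, which limits the quality inspection.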
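Summary
The embodiments of this application provide a traffic quality inspection method, device, storage medium, and server to address the low efficiency and the limitations of traffic quality inspection. The technical solutions are as follows.
In one aspect, a traffic quality inspection method is provided. The method includes: obtaining, while an operator is on a call with a user, the real-time audio stream generated during the call; performing quality inspection on the real-time audio stream; and generating a quality inspection result of the real-time audio stream.
In one aspect, a traffic quality inspection device is provided. The device includes: an obtaining module, configured to obtain, while an operator is on a call with a user, the real-time audio stream generated during the call; a quality inspection module, configured to perform quality inspection on the real-time audio stream; and a generating module, configured to generate a quality inspection result of the real-time audio stream.
In one aspect, a computer-readable storage medium is provided. The storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the traffic quality inspection method described above.
In one aspect, a server is provided. The server includes a processor and a memory; the memory stores at least one instruction, which is loaded and executed by the processor to implement the traffic quality inspection method described above.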
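Brief description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a method flowchart of a traffic quality inspection method provided by one embodiment of this application;
FIG. 2 is a method flowchart of a traffic quality inspection method provided by another embodiment of this application;
FIG. 3 is a schematic flow diagram of a traffic quality inspection method provided by another embodiment of this application;
FIG. 4 is a structural block diagram of a traffic quality inspection device provided by yet another embodiment of this application;
FIG. 5 is a structural block diagram of a traffic quality inspection device provided by yet another embodiment of this application.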
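Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the implementations of this application are described in further detail below with reference to the drawings.
Refer to FIG. 1, which shows a method flowchart of a traffic quality inspection method provided by one embodiment of this application. The method can be applied to a server and may include the following steps.
Step 101: While an operator is on a call with a user, obtain the real-time audio stream generated during the call.
A call between an operator and a user can be initiated by either party. For example, when a user wants to make a service inquiry, file a service complaint, or offer a service suggestion, the user's terminal can call the contact center agent position, which assigns the call to an operator to answer. Alternatively, when an operator wants to push services or information to a user, the operator can call the user's terminal through the contact center agent position. This embodiment does not limit how the call between the operator and the user is initiated.
When the call between the operator and the user is connected, audio capture can begin. After every predetermined time period, the audio captured in the current period is encoded using RTP (Real-time Transport Protocol) to obtain an RTP packet, which is then sent to the server through the communication core network or a media server. The server unpacks and decodes each received RTP packet and combines the decoded audio pieces into the real-time audio stream.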
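The server-side unpacking of RTP packets described above can be sketched as follows. This is a minimal Python sketch under stated assumptions: it parses only the fixed 12-byte RTP header defined in RFC 3550 and skips CSRC identifiers, it assumes packets carry no header extension or padding, it ignores sequence-number wrap-around, and it leaves codec decoding out entirely. `RtpPacket`, `parse_rtp`, and `assemble_stream` are hypothetical names, not part of the patent.

```python
import struct
from typing import List, NamedTuple


class RtpPacket(NamedTuple):
    sequence: int      # 16-bit sequence number
    timestamp: int     # 32-bit media timestamp
    payload_type: int  # 7-bit payload type (identifies the codec)
    payload: bytes     # encoded audio carried by the packet


def parse_rtp(packet: bytes) -> RtpPacket:
    """Parse the fixed RTP header and return the packet with its payload."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, second, seq, ts, _ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("unsupported RTP version")
    csrc_count = first & 0x0F
    offset = 12 + 4 * csrc_count  # skip optional CSRC identifiers
    return RtpPacket(seq, ts, second & 0x7F, packet[offset:])


def assemble_stream(packets: List[bytes]) -> bytes:
    """Order packets by sequence number and concatenate their payloads."""
    parsed = sorted((parse_rtp(p) for p in packets), key=lambda p: p.sequence)
    return b"".join(p.payload for p in parsed)
```

The concatenated payload is still codec-encoded audio; a real server would decode it according to the payload type before treating it as the real-time audio stream.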
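Step 102: Perform quality inspection on the real-time audio stream.
The server can perform quality inspection on the real-time audio stream; the inspection flow is described in detail below and is not repeated here.
Step 103: Generate the quality inspection result of the real-time audio stream.
In this embodiment, the server can generate the quality inspection result in real time during inspection or after the call ends; this embodiment does not limit this.
In summary, in the traffic quality inspection method provided by the embodiments of this application, the real-time audio stream generated during a call is obtained while the operator is on the call with the user, quality inspection is performed on it, and its quality inspection result is generated. Quality inspection can therefore take place during the call, without first recording the call and then sampling the recordings, which improves inspection efficiency. In addition, every call can be inspected, which avoids missing important and typical recordings during sampling and thus makes the inspection more comprehensive.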
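Refer to FIG. 2, which shows a method flowchart of a traffic quality inspection method provided by another embodiment of this application. The method can be applied to a server and may include the following steps.
Step 201: While an operator is on a call with a user, obtain the real-time audio stream generated during the call.
A call between an operator and a user can be initiated by either party. For example, when a user wants to make a service inquiry, file a service complaint, or offer a service suggestion, the user's terminal can call the contact center agent position, which assigns the call to an operator to answer. Alternatively, when an operator wants to push services or information to a user, the operator can call the user's terminal through the contact center agent position. This embodiment does not limit how the call between the operator and the user is initiated.
When the call between the operator and the user is connected, audio capture can begin. After every predetermined time period, the audio captured in the current period is encoded using RTP to obtain an RTP packet, which is then sent to the server through the communication core network or a media server. The server unpacks and decodes each received RTP packet and combines the decoded audio pieces into the real-time audio stream.
Step 202: Divide the real-time audio stream into multiple audio segments.
The server can preset a division scheme for the real-time audio stream and divide the stream accordingly to obtain multiple audio segments. For example, the division scheme may be to cut audio of one unit time from the real-time audio stream and treat it as one audio segment. The unit time can be any value, for example 1 to 5 seconds; this embodiment does not limit it.
Assuming the unit time is 5 seconds, the server can take seconds 1-5 of the received real-time audio stream as the first audio segment, seconds 6-10 as the second audio segment, and so on, stopping after the last audio segment of the real-time audio stream has been cut.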
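The fixed-duration division of step 202 can be sketched as follows. This is a minimal Python sketch that assumes the decoded stream is uncompressed mono PCM; the default of 8 kHz, 16-bit samples is an illustrative assumption, and `split_segments` is a hypothetical name.

```python
from typing import Iterator


def split_segments(pcm: bytes, sample_rate: int = 8000,
                   bytes_per_sample: int = 2,
                   unit_seconds: int = 5) -> Iterator[bytes]:
    """Yield consecutive segments of unit_seconds each from mono PCM audio."""
    step = sample_rate * bytes_per_sample * unit_seconds  # bytes per segment
    for start in range(0, len(pcm), step):
        # The final segment may be shorter than unit_seconds.
        yield pcm[start:start + step]
```

Because the function is a generator, it can also be fed a growing buffer during a live call, emitting each segment as soon as enough audio has arrived.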
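Step 203: Perform quality inspection on all or some of the audio segments to obtain evaluation information of the audio segments.
In this embodiment, the server can inspect the obtained audio segments after the call ends. Alternatively, to make inspection more timely, the server can inspect the audio segments in real time during the call, so that the quality inspection result of the real-time audio stream is available as soon as the call ends, improving inspection efficiency.
If the audio segments are inspected after the call ends, the server can inspect all of them, which ensures comprehensive inspection. Alternatively, the server can select some of the audio segments and inspect only those, reducing the number of inspected segments and therefore the processing load on the server, and improving inspection efficiency. There are many ways for the server to select segments: for example, it can select one audio segment after every predetermined number of audio segments, or preprocess each audio segment and select segments according to the preprocessing result, or select audio segments at random; this embodiment does not limit this.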
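Two of the selection strategies mentioned above, interval-based and random selection, can be sketched as follows. This is a minimal Python sketch; `pick_every_nth` and `pick_random` are hypothetical names, and selection by preprocessing result is omitted because the source does not say what the preprocessing is.

```python
import random
from typing import List, Optional, Sequence


def pick_every_nth(segments: Sequence[bytes], n: int) -> List[bytes]:
    """Keep one segment out of every n, starting with the first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return list(segments[::n])


def pick_random(segments: Sequence[bytes], k: int,
                seed: Optional[int] = None) -> List[bytes]:
    """Keep up to k randomly chosen segments, preserving call order."""
    rng = random.Random(seed)
    count = min(k, len(segments))
    indices = sorted(rng.sample(range(len(segments)), count))
    return [segments[i] for i in indices]
```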
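If the audio segments are inspected in real time during the call, the server can inspect each audio segment as soon as it is obtained, which ensures comprehensive inspection. Alternatively, the server can inspect an audio segment only after obtaining it and determining that it needs inspection, and skip a segment that it determines does not need inspection, reducing the number of inspected segments and therefore the processing load on the server, and improving inspection efficiency. There are many ways for the server to determine whether an audio segment needs inspection; see the description above, which is not repeated here.
Evaluation information is the result of inspecting one audio segment. It may include the segment's score, keywords, and so on; this embodiment does not limit this.
There are many ways to generate the evaluation information of an audio segment. For example, when the evaluation information includes a score, the server can define multiple quality inspection items for audio segments in advance, assign a point value to each item, and add up the point values of the items the segment satisfies to obtain the score. Alternatively, the server can preset evaluation models, assign a point value to each evaluation model, and use the point value that matches the audio segment as the score. Taking preset evaluation models as an example, three possible implementations are described below.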
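In the first implementation, performing quality inspection on each of all or some of the audio segments to obtain the evaluation information of the audio segment may include the following steps.
1) For each of all or some of the audio segments, convert the audio segment into text.
In this embodiment, each time the server obtains an audio segment that needs inspection, it can perform steps 1-4 to generate the evaluation information of that segment, stopping once the evaluation information of all segments that need inspection has been obtained.
The server can use any speech-to-text technology to convert the audio segment into text; this embodiment does not limit the conversion method.
2) Extract keywords from the text.
In this embodiment, the server can preset keyword extraction rules and extract keywords from the text accordingly. For example, when the rule is to extract honorifics, keywords such as "您" (the polite form of "you") and "请" ("please") can be extracted from the text. When the rule is to extract service terms, and assuming the operator is introducing a mobile broadband service to the user, keywords such as "mobile broadband service", "tariff", and "bandwidth" can be extracted from the text.
3) Determine, from the semantic model library, the semantic model that matches the keywords; the semantic model is related to the call content.
A semantic model is a kind of evaluation model. It is related to the call content and is used to inspect the content of the call.
In this embodiment, each semantic model corresponds to a group of keywords. The server can compare the keywords extracted from the text with each semantic model's group of keywords. When the extracted keywords are identical to a semantic model's group (all keywords are the same) or similar to it (the proportion of identical keywords reaches a predetermined ratio threshold), that semantic model is determined to match the extracted keywords.
4) Generate the evaluation information of the audio segment according to the semantic model.
In this embodiment, each semantic model corresponds to a point value, so the server can use the semantic model's point value as the segment's score to obtain the evaluation information of the audio segment. The segment's keywords can also be added to the evaluation information.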
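Steps 3 and 4 above (matching a semantic model by keyword overlap and taking its point value as the score) can be sketched as follows. This is a minimal Python sketch under stated assumptions: the source does not say what the "proportion of identical keywords" is measured against, so the sketch assumes it is the share of the model's keywords found in the text; `SemanticModel`, `match_model`, `evaluate_segment`, and the 0.6 threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Optional, Set


@dataclass(frozen=True)
class SemanticModel:
    name: str
    keywords: FrozenSet[str]  # the group of keywords this model corresponds to
    score: float              # point value assigned to the model


def match_model(extracted: Set[str], library: Iterable[SemanticModel],
                ratio_threshold: float = 0.6) -> Optional[SemanticModel]:
    """Return the model whose keywords overlap most with the extracted ones."""
    best, best_ratio = None, 0.0
    for model in library:
        if not model.keywords:
            continue
        # Assumed similarity: share of the model's keywords present in the text.
        ratio = len(extracted & model.keywords) / len(model.keywords)
        if ratio >= ratio_threshold and ratio > best_ratio:
            best, best_ratio = model, ratio
    return best


def evaluate_segment(extracted: Set[str],
                     library: Iterable[SemanticModel]) -> Dict[str, object]:
    """Build evaluation information: the matched model's score plus keywords."""
    model = match_model(extracted, library)
    return {"score": model.score if model else None,
            "keywords": sorted(extracted)}
```

An identical keyword group yields a ratio of 1.0, so the "identical" case in the text is covered by the same comparison as the "similar" case.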
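In the second implementation, performing quality inspection on each of all or some of the audio segments to obtain the evaluation information of the audio segment may include the following steps.
1) For each of all or some of the audio segments, extract emotion parameters from the audio segment; the emotion parameters indicate the emotion of the operator or the user during the call.
In this embodiment, each time the server obtains an audio segment that needs inspection, it can perform steps 1-3 to generate the evaluation information of that segment, stopping once the evaluation information of all segments that need inspection has been obtained.
In one application scenario, when communication between the operator and the user is not going smoothly, the operator or the user may change the way they speak to express dissatisfaction. Such a change is reflected in voice feature parameters, which may include volume, pitch, and so on. The voice feature parameters during the call can therefore serve as emotion parameters, in which case extracting emotion parameters from the audio segment may include: extracting voice feature parameters from the audio segment and using them as the emotion parameters.
In another application scenario, when communication between the operator and the user is not going smoothly, the operator or the user may express dissatisfaction in words; for example, insulting words may appear in the call. The keywords in the call can therefore serve as emotion parameters, in which case extracting emotion parameters from the audio segment includes: converting the audio segment into text, extracting keywords from the text, and using the keywords as the emotion parameters.
In yet another application scenario, both the voice feature parameters and the keywords of the call can serve as emotion parameters, in which case extracting emotion parameters from the audio segment includes: extracting voice feature parameters from the audio segment, converting the audio segment into text, extracting keywords from the text, and using the keywords and the voice feature parameters as the emotion parameters.
2) Determine, from the emotion model library, the emotion model that matches the emotion parameters; the emotion model is related to emotion.
An emotion model is a kind of evaluation model. It is related to emotion and is used to inspect the emotions of both parties during the call (that is, the operator's service attitude and the user's communication attitude).
When the emotion parameters include voice feature parameters, each emotion model corresponds to a voice feature parameter interval. The server can compare the extracted voice feature parameters with each emotion model's interval; when the extracted parameters fall within an emotion model's interval, that emotion model is determined to match the extracted voice feature parameters.
When the emotion parameters include keywords, each emotion model corresponds to a group of keywords. The server can compare the keywords extracted from the text with each emotion model's group of keywords; when the extracted keywords are identical to an emotion model's group (all keywords are the same) or similar to it (the proportion of identical keywords reaches a predetermined ratio threshold), that emotion model is determined to match the extracted keywords.
When the emotion parameters include both voice feature parameters and keywords, each emotion model corresponds to a voice feature parameter interval and a group of keywords. The server can compare the extracted voice feature parameters with each emotion model's interval and the extracted keywords with each emotion model's group of keywords; when the extracted parameters fall within an emotion model's interval and the extracted keywords are identical or similar to that emotion model's group of keywords, the emotion model is determined to match the extracted voice feature parameters and keywords.
3) Generate the evaluation information of the audio segment according to the emotion model.
In this embodiment, each emotion model corresponds to a point value, so the server can use the emotion model's point value as the segment's score to obtain the evaluation information of the audio segment. When the emotion parameters also include keywords, the segment's keywords can also be added to the evaluation information.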
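The emotion-model matching above can be sketched as follows. This is a minimal Python sketch under stated assumptions: volume is the only voice feature considered, each interval is treated as half-open, a model with an empty keyword group is matched on the interval alone, and the keyword ratio is measured against the model's keyword group. `EmotionModel`, `match_emotion`, and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class EmotionModel:
    name: str
    volume_range: Tuple[float, float]  # half-open interval [low, high), hypothetical scale
    keywords: FrozenSet[str]           # empty set: match on the voice feature alone
    score: float                       # point value assigned to the model


def match_emotion(volume: float, keywords: Set[str],
                  library: Iterable[EmotionModel],
                  ratio_threshold: float = 0.5) -> Optional[EmotionModel]:
    """Return the first model whose interval and keyword group both match."""
    for model in library:
        low, high = model.volume_range
        if not (low <= volume < high):
            continue  # voice feature falls outside this model's interval
        if model.keywords:
            shared = len(keywords & model.keywords) / len(model.keywords)
            if shared < ratio_threshold:
                continue  # too few of the model's keywords were spoken
        return model
    return None
```

The matched model's `score` then becomes the segment's score, as in step 3.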
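In the third implementation, the first and second implementations can be combined: the server determines the semantic model and the emotion model that match the audio segment and generates the evaluation information of the audio segment from both. In this case, the server can calculate the average of the point values of the semantic model and the emotion model and use that average as the segment's score to obtain the evaluation information. When the emotion parameters also include keywords, the server can also add the segment's keywords to the evaluation information.
Note that keyword extraction is needed both when matching the semantic model and when the emotion parameters include keywords, so in the third implementation the keywords need to be extracted only once, which improves inspection efficiency.
Step 204: Generate the quality inspection result of the real-time audio stream according to all the evaluation information currently obtained.
If the quality inspection result is generated after the call ends, the server can generate one result from all the evaluation information obtained once the call has ended. If the result is generated in real time during the call, the server can generate a result at any moment during the call from all the evaluation information obtained so far. For example, when the first piece of evaluation information is obtained, the server generates a result from it; when the second piece is obtained, it generates a result from the first and second pieces; when the third piece is obtained, it generates a result from the first, second, and third pieces; and so on.
In one implementation, when the evaluation information includes a score, the average of the scores in all the evaluation information can be calculated once all the evaluation information has been obtained, and that average is used as the quality inspection result of the real-time audio stream.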
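Regenerating the result each time a new piece of evaluation information arrives, with the mean score as the result, can be sketched as follows. This is a minimal Python sketch; `RunningResult` is a hypothetical name, and keeping a running total instead of the full score list is an implementation choice of the sketch.

```python
class RunningResult:
    """Maintain the mean of all segment scores seen so far."""

    def __init__(self) -> None:
        self._total = 0.0
        self._count = 0

    def add(self, score: float) -> float:
        """Record one segment score and return the updated quality result."""
        self._total += score
        self._count += 1
        return self._total / self._count
```

Calling `add` after every inspected segment gives the real-time variant; reading the value returned by the last call gives the after-the-call variant.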
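In one implementation, when the evaluation information includes a score and keywords, generating the quality inspection result of the real-time audio stream according to all the evaluation information may include: obtaining the traffic type corresponding to the call; determining the evaluation standard of the real-time audio stream according to the traffic type; and generating the quality inspection result of the real-time audio stream according to the evaluation standard and the evaluation information of all audio segments.
The traffic type may include service inquiry, complaint, service recommendation, and so on; this embodiment does not limit this. Different traffic types can correspond to different evaluation standards. For example, when the traffic type is service inquiry, the evaluation standard can be to determine a business score according to whether the user's question could be answered; when the traffic type is service complaint, the evaluation standard can be to determine the business score according to whether the user's problem could be solved; when the traffic type is service recommendation, the evaluation standard can be to determine the business score according to whether the user could be persuaded to subscribe to a service, or according to the number of services introduced to the user.
In this case, the traffic type can be determined from the keywords in all the evaluation information; the business score of the real-time audio stream is then determined from the keywords and the evaluation standard; the score of the real-time audio stream is determined; and finally a weighted calculation over the business score and the score yields the quality inspection result of the real-time audio stream.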
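The final weighted calculation can be sketched as follows. This is a minimal Python sketch under stated assumptions: the source gives no weights, so `business_weight` and its 0.4 default are hypothetical; the stream's score is assumed to be the mean of the segment scores; and `quality_result` is a hypothetical name.

```python
from typing import Sequence


def quality_result(business_score: float, segment_scores: Sequence[float],
                   business_weight: float = 0.4) -> float:
    """Combine the business score and the mean segment score by weighting."""
    if not segment_scores:
        raise ValueError("at least one segment score is required")
    if not 0.0 <= business_weight <= 1.0:
        raise ValueError("business_weight must lie in [0, 1]")
    rating = sum(segment_scores) / len(segment_scores)
    return business_weight * business_score + (1.0 - business_weight) * rating
```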
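In this embodiment, after the quality inspection result of the real-time audio stream is generated according to all the evaluation information currently obtained, the method may further include: when the quality inspection result of the real-time audio stream does not satisfy a preset condition, generating improvement information according to the evaluation standard and all the evaluation information currently obtained; the improvement information is used to instruct the operator to improve the traffic level.
The preset condition can be a preset score. That is, when the score of the quality inspection result is lower than the preset score, improvement information can be generated and shown to the operator, so that the operator can gain a full understanding and assessment of the whole call, improve what was not done well enough, and thereby raise the traffic level.
For example, when the score of the operator's quality inspection result is low and the evaluation standard shows that the operator recommended few services to the user, improvement information advising the operator to recommend more services can be generated. Alternatively, when the score of the operator's quality inspection result is low and the evaluation standard shows that the operator's service attitude was poor, improvement information advising a better service attitude can be generated.
In summary, in the traffic quality inspection method provided by the embodiments of this application, the real-time audio stream generated during a call is obtained while the operator is on the call with the user, quality inspection is performed on it, and its quality inspection result is generated. Quality inspection can therefore take place during the call, without first recording the call and then sampling the recordings, which improves inspection efficiency. In addition, every call can be inspected, which avoids missing important and typical recordings during sampling and thus makes the inspection more comprehensive.
The quality inspection result can be generated as soon as the call ends, and improvement information is generated from it, so that evaluations and suggestions on the operator's work are given in real time rather than as a static after-the-fact assessment, which can improve user satisfaction.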
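In this embodiment, after quality inspection is performed on each of all or some of the audio segments, if a problem or an unexpected situation is found during the call, the call can also be controlled so as to intervene in real time, so that the problem can be handled in real time and users receive more efficient, higher-quality service. Accordingly, after quality inspection is performed on each of all or some of the audio segments to obtain the evaluation information, the method further includes: for each of all or some of the audio segments, determining whether the evaluation information of the audio segment satisfies a control condition; and controlling the call when the evaluation information of the audio segment satisfies the control condition.
In one implementation, when the evaluation information includes a score, the server can compare the score with a predetermined score threshold. When the score is lower than the threshold, the evaluation information is determined to satisfy the control condition; when the score is higher than the threshold, the evaluation information is determined not to satisfy the control condition, and the control flow ends.
In another implementation, when the evaluation information includes keywords, the server can compare the keywords with predetermined keywords. When a keyword matches a predetermined keyword, the evaluation information is determined to satisfy the control condition; when it does not, the evaluation information is determined not to satisfy the control condition, and the control flow ends. For example, a match with the predetermined keyword "无法解答" ("unable to answer") indicates that the operator cannot answer the user's question, so the call needs to be controlled.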
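The two checks above can be sketched together as follows. This is a minimal Python sketch under stated assumptions: the source presents the score check and the keyword check as separate implementations, and combining them with a logical OR is a choice of the sketch; keyword matching is assumed to be exact; `meets_control_condition`, the 60.0 threshold, and the dictionary layout are hypothetical.

```python
from typing import Any, Dict, FrozenSet


def meets_control_condition(
        evaluation: Dict[str, Any],
        score_threshold: float = 60.0,
        trigger_keywords: FrozenSet[str] = frozenset({"无法解答"})) -> bool:
    """Return True when the segment's evaluation calls for controlling the call."""
    score = evaluation.get("score")
    if score is not None and score < score_threshold:
        return True  # score below the predetermined threshold
    # Otherwise check whether any extracted keyword is a predetermined one.
    return bool(trigger_keywords & set(evaluation.get("keywords", ())))
```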
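Controlling the call may include the following steps.
1) Obtain the traffic type corresponding to the call.
The flow for determining the traffic type is described above and is not repeated here.
2) Obtain the problem level corresponding to the evaluation information of the audio segment; the problem level indicates the level of the problem that occurred in the call.
In one implementation, when the evaluation information includes a score, a score interval can be set for each problem level and the score compared with each interval; when the score falls within an interval, the server can use the problem level of that interval as the problem level of the audio segment.
In another implementation, when the evaluation information includes keywords, a group of keywords can be set for each problem level and the keywords compared with each group; when the keywords are identical or similar to a group, the server can use the problem level of that group as the problem level of the audio segment.
3) Determine the control operation according to the traffic type and the problem level.
The control operation in this embodiment may include at least one of interception, monitoring, barge-in, three-party conference, and forced release. Interception switches the call between the user and the operator to a call between the user and a quality inspector; monitoring means the quality inspector listens to the call between the user and the operator; barge-in means the quality inspector can insert speech into the call between the user and the operator; three-party conference switches the two-party call between the user and the operator to a three-party call among the user, the operator, and the quality inspector; forced release forcibly terminates the call between the user and the operator.
In one implementation, the correspondence among traffic type, problem level, and control operation can be preset. For example: traffic type service inquiry, problem level low, control operation barge-in; traffic type service inquiry, problem level medium, control operation three-party conference; traffic type service inquiry, problem level high, control operation interception; and so on. This embodiment does not limit this.
4) Execute the control operation on the call.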
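Steps 2 and 3 (deriving a problem level from the score and looking up the control operation) can be sketched as follows. This is a minimal Python sketch under stated assumptions: the three table entries come from the example in the text, while the score intervals, the English labels, and the names `ControlOp`, `problem_level`, and `choose_operation` are hypothetical.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class ControlOp(Enum):
    INTERCEPT = "intercept"
    MONITOR = "monitor"
    BARGE_IN = "barge-in"
    CONFERENCE = "three-party conference"
    FORCE_RELEASE = "forced release"


# Hypothetical half-open score intervals [low, high) for each problem level.
LEVEL_INTERVALS: List[Tuple[float, float, str]] = [
    (0.0, 40.0, "high"),
    (40.0, 60.0, "medium"),
    (60.0, 80.0, "low"),
]

# Preset correspondence: (traffic type, problem level) -> control operation.
CONTROL_TABLE: Dict[Tuple[str, str], ControlOp] = {
    ("service inquiry", "low"): ControlOp.BARGE_IN,
    ("service inquiry", "medium"): ControlOp.CONFERENCE,
    ("service inquiry", "high"): ControlOp.INTERCEPT,
}


def problem_level(score: float) -> Optional[str]:
    """Map a segment score to a problem level, or None if no interval applies."""
    for low, high, level in LEVEL_INTERVALS:
        if low <= score < high:
            return level
    return None


def choose_operation(traffic_type: str, score: float) -> Optional[ControlOp]:
    """Look up the control operation for a traffic type and segment score."""
    level = problem_level(score)
    if level is None:
        return None
    return CONTROL_TABLE.get((traffic_type, level))
```

Keeping the correspondence in a table means new traffic types or levels can be added without changing the lookup logic.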
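In this embodiment, call control and quality inspection can also be combined. Refer to FIG. 3, which shows a schematic flow diagram of a traffic quality inspection method.
Step 301: Obtain audio data packets (that is, RTP packets).
This step can be executed by the audio stream capture module in the server.
Step 302: Decode the audio data packets to obtain the real-time audio stream.
This step can be executed by the traffic processing module in the server.
Step 303: Divide the real-time audio stream into audio segments, convert the audio segments into text, and extract parameters, which include keywords, voice feature parameters, and so on.
This step can be executed by the audio stream processing module in the server.
Step 304: Obtain, from the semantic model library, the semantic model that matches the audio segment according to the keywords.
Step 305: Obtain, from the emotion model library, the emotion model that matches the audio segment according to the voice feature parameters and/or the keywords.
Step 306: Obtain a score (that is, the evaluation information) according to the semantic model and the emotion model.
Steps 304-306 can be executed by the model matching module in the server.
Step 307: Determine, according to the score, whether to transfer to the monitoring agent (that is, whether to control the call); if so, perform step 308; if not, perform step 309.
Step 308: Monitor the call, then perform step 309.
Steps 307-308 can be executed by the monitoring module in the server.
Step 309: Store the score, and generate the total evaluation (that is, the quality inspection result) once the scores of all audio segments have been obtained.
This step can be executed by the evaluation model in the server.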
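Refer to FIG. 4, which shows a structural block diagram of a traffic quality inspection device provided by one embodiment of this application. The device can be applied to a server and may include the following modules.
The obtaining module 410 is configured to obtain, while an operator is on a call with a user, the real-time audio stream generated during the call.
The quality inspection module 420 is configured to perform quality inspection on the real-time audio stream.
The generating module 430 is configured to generate the quality inspection result of the real-time audio stream.
In one implementation, the quality inspection module 420 is further configured to divide the real-time audio stream into multiple audio segments, and to perform quality inspection on all or some of the audio segments to obtain evaluation information of the audio segments.
The generating module 430 is further configured to generate the quality inspection result of the real-time audio stream according to all the evaluation information currently obtained.
In one implementation, the quality inspection module 420 is further configured to, for each of all or some of the audio segments, convert the audio segment into text; extract keywords from the text; determine, from the semantic model library, the semantic model that matches the keywords, the semantic model being related to the call content; and generate the evaluation information of the audio segment according to the semantic model.
In one implementation, the quality inspection module 420 is further configured to, for each of all or some of the audio segments, extract emotion parameters from the audio segment, the emotion parameters indicating the emotion of the operator or the user during the call; determine, from the emotion model library, the emotion model that matches the emotion parameters, the emotion model being related to emotion; and generate the evaluation information of the audio segment according to the emotion model.
In one implementation, the quality inspection module 420 is further configured to extract voice feature parameters from the audio segment and use them as the emotion parameters; or convert the audio segment into text, extract keywords from the text, and use the keywords as the emotion parameters; or extract voice feature parameters from the audio segment, convert the audio segment into text, extract keywords from the text, and use the keywords and the voice feature parameters as the emotion parameters.
In one implementation, the generating module 430 is further configured to obtain the traffic type corresponding to the call; determine the evaluation standard of the real-time audio stream according to the traffic type; and generate the quality inspection result of the real-time audio stream according to the evaluation standard and all the evaluation information currently obtained.
In one implementation, the generating module 430 is further configured to, after the quality inspection result of the real-time audio stream is generated according to the evaluation standard and all the evaluation information currently obtained, generate improvement information according to the evaluation standard and the evaluation information of all audio segments when the quality inspection result does not satisfy the preset condition; the improvement information is used to instruct the operator to improve the traffic level.
In one implementation, referring to FIG. 5, the device further includes the following modules.
The determining module 440 is configured to, after quality inspection is performed on all or some of the audio segments to obtain the evaluation information, determine for each of those audio segments whether its evaluation information satisfies the control condition.
The control module 450 is configured to control the call when the evaluation information of the audio segment satisfies the control condition.
In one implementation, the control module 450 is further configured to obtain the traffic type corresponding to the call; obtain the problem level corresponding to the evaluation information of the audio segment, the problem level indicating the level of the problem that occurred in the call; determine the control operation according to the traffic type and the problem level; and execute the control operation on the call.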
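In summary, the traffic quality inspection device provided by the embodiments of this application obtains the real-time audio stream generated during a call while the operator is on the call with the user, performs quality inspection on it, and generates its quality inspection result. Quality inspection can therefore take place during the call, without first recording the call and then sampling the recordings, which improves inspection efficiency. In addition, every call can be inspected, which avoids missing important and typical recordings during sampling and thus makes the inspection more comprehensive.
The quality inspection result can be generated as soon as the call ends, and improvement information is generated from it, so that evaluations and suggestions on the operator's work are given in real time rather than as a static after-the-fact assessment, which can improve user satisfaction.
After quality inspection is performed on each of all or some of the audio segments, if a problem or an unexpected situation is found during the call, the call can also be controlled so as to intervene in real time, so that the problem can be handled in real time and users receive more efficient, higher-quality service.
One embodiment of this application provides a computer-readable storage medium. The storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the traffic quality inspection method described above.
One embodiment of this application provides a server. The server includes a processor and a memory; the memory stores at least one instruction, which is loaded and executed by the processor to implement the traffic quality inspection method described above.
In addition, an embodiment of the present invention provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions that, when executed by a computer, cause the computer to execute the method in any of the foregoing method embodiments.
It should be noted that when the traffic quality inspection device provided by the above embodiments performs traffic quality inspection, the division into the above functional modules is only an example. In practice, the functions can be assigned to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or some of the functions described above. In addition, the traffic quality inspection device and the traffic quality inspection method embodiments provided above belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
A person of ordinary skill in the art can understand that all or some of the steps of the above embodiments can be implemented by hardware, or by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which can be a read-only memory, a magnetic disk, an optical disk, or the like.
The above description is not intended to limit the embodiments of this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the embodiments of this application shall fall within their scope of protection.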

Claims (12)

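  1. A traffic quality inspection method, wherein the method comprises:
    obtaining, while an operator is on a call with a user, the real-time audio stream generated during the call;
    performing quality inspection on the real-time audio stream;
    generating a quality inspection result of the real-time audio stream.
  2. The method according to claim 1, wherein
    performing quality inspection on the real-time audio stream comprises: dividing the real-time audio stream into multiple audio segments; and performing quality inspection on all or some of the audio segments to obtain evaluation information of the audio segments;
    generating the quality inspection result of the real-time audio stream comprises: generating the quality inspection result of the real-time audio stream according to all the evaluation information currently obtained.
  3. The method according to claim 2, wherein performing quality inspection on all or some of the audio segments to obtain the evaluation information of the audio segments comprises:
    for each of the all or some audio segments, converting the audio segment into text;
    extracting keywords from the text;
    determining, from a semantic model library, a semantic model that matches the keywords;
    generating the evaluation information of the audio segment according to the semantic model.
  4. The method according to claim 2, wherein performing quality inspection on all or some of the audio segments to obtain the evaluation information of the audio segments comprises:
    for each of the all or some audio segments, extracting emotion parameters from the audio segment, the emotion parameters indicating the emotion of the operator or the user during the call;
    determining, from an emotion model library, an emotion model that matches the emotion parameters;
    generating the evaluation information of the audio segment according to the emotion model.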
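  5. The method according to claim 4, wherein extracting emotion parameters from the audio segment comprises:
    extracting voice feature parameters from the audio segment and using the voice feature parameters as the emotion parameters; or
    converting the audio segment into text, extracting keywords from the text, and using the keywords as the emotion parameters; or
    extracting voice feature parameters from the audio segment, converting the audio segment into text, extracting keywords from the text, and using the keywords and the voice feature parameters as the emotion parameters.
  6. The method according to claim 2, wherein generating the quality inspection result of the real-time audio stream according to all the evaluation information currently obtained comprises:
    obtaining the traffic type corresponding to the call;
    determining the evaluation standard of the real-time audio stream according to the traffic type;
    generating the quality inspection result of the real-time audio stream according to the evaluation standard and all the evaluation information currently obtained.
  7. The method according to claim 6, wherein, after generating the quality inspection result of the real-time audio stream according to the evaluation standard and all the evaluation information currently obtained, the method further comprises:
    when the quality inspection result of the real-time audio stream does not satisfy a preset condition, generating improvement information according to the evaluation standard and the evaluation information of all audio segments, the improvement information being used to instruct the operator to improve the traffic level.
  8. The method according to any one of claims 2 to 7, wherein, after performing quality inspection on all or some of the audio segments to obtain the evaluation information of the audio segments, the method further comprises:
    for each of the all or some audio segments, determining whether the evaluation information of the audio segment satisfies a control condition;
    controlling the call when the evaluation information of the audio segment satisfies the control condition.
  9. The method according to claim 8, wherein controlling the call comprises:
    obtaining the traffic type corresponding to the call;
    obtaining the problem level corresponding to the evaluation information of the audio segment, the problem level indicating the level of the problem that occurred in the call;
    determining a control operation according to the traffic type and the problem level;
    executing the control operation on the call.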
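  10. A traffic quality inspection device, wherein the device comprises:
    an obtaining module, configured to obtain, while an operator is on a call with a user, the real-time audio stream generated during the call;
    a quality inspection module, configured to perform quality inspection on the real-time audio stream;
    a generating module, configured to generate a quality inspection result of the real-time audio stream.
  11. A computer-readable storage medium, wherein the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the traffic quality inspection method according to any one of claims 1 to 9.
  12. A server, wherein the server comprises a processor and a memory, the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the traffic quality inspection method according to any one of claims 1 to 9.
PCT/CN2020/123657 2019-10-25 2020-10-26 Traffic quality inspection method, device, storage medium, and server WO2021078298A1 (zh)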

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911021502.1 2019-10-25
CN201911021502.1A CN112714217A (zh) 2019-10-25 2019-10-25 Traffic quality inspection method, device, storage medium, and server

Publications (1)

Publication Number Publication Date
WO2021078298A1 true WO2021078298A1 (zh) 2021-04-29

Family

ID=75541455

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/123657 WO2021078298A1 (zh) 2019-10-25 2020-10-26 Traffic quality inspection method, device, storage medium, and server

Country Status (2)

Country Link
CN (1) CN112714217A (zh)
WO (1) WO2021078298A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113593553B (zh) * 2021-07-12 2022-05-24 深圳市明源云客电子商务有限公司 语音识别方法、装置、语音管理服务器以及存储介质

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7313232B1 (en) * 2003-08-26 2007-12-25 Nortel Networks Limited Monitoring for operator services
CN101662550A (zh) * 2009-09-11 2010-03-03 中兴通讯股份有限公司 呼叫中心服务质量检测方法及系统
CN102082879A (zh) * 2009-11-27 2011-06-01 华为技术有限公司 呼叫中心语音检测的方法、装置及系统
CN102625005A (zh) * 2012-03-05 2012-08-01 广东天波信息技术股份有限公司 具有服务质量实时监督功能的呼叫中心系统及其实现方法
CN103905657A (zh) * 2012-12-28 2014-07-02 中国移动通信集团江苏有限公司 一种监控呼叫服务质量的方法和系统
CN104168394A (zh) * 2014-06-27 2014-11-26 国家电网公司 一种呼叫中心抽样质检方法及系统
CN106776806A (zh) * 2016-11-22 2017-05-31 广东电网有限责任公司佛山供电局 呼叫中心质检语音的评分方法和系统
CN109639914A (zh) * 2019-01-08 2019-04-16 深圳市沃特沃德股份有限公司 智能考评方法、系统及计算机可读存储介质

Also Published As

Publication number Publication date
CN112714217A (zh) 2021-04-27

Similar Documents

Publication Publication Date Title
US10276153B2 (en) Online chat communication analysis via mono-recording system and methods
US10410636B2 (en) Methods and system for reducing false positive voice print matching
US10194029B2 (en) System and methods for analyzing online forum language
CN110266899B (zh) 客户意图的识别方法和客服系统
US7599475B2 (en) Method and apparatus for generic analytics
US7596498B2 (en) Monitoring, mining, and classifying electronically recordable conversations
CN104050221A (zh) 用于在虚拟会议中自动记笔记的方法和系统
CN111128241A (zh) 语音通话的智能质检方法及系统
CA2665055C (en) Treatment processing of a plurality of streaming voice signals for determination of responsive action thereto
CN109417583A (zh) 一种将音频信号实时转录为文本的系统和方法
WO2021078298A1 (zh) 话务质检方法、装置、存储介质及服务器
CN111654658A (zh) 音视频通话的处理方法、系统、编解码器及存储装置
JP2016143909A (ja) 通話内容分析表示装置、通話内容分析表示方法、及びプログラム
TWI782442B (zh) 一種在線訪談的方法及系統
US11606461B2 (en) Method for training a spoofing detection model using biometric clustering
US8751222B2 (en) Recognition processing of a plurality of streaming voice signals for determination of a responsive action thereto
Pornpongtechavanich et al. Audiovisual quality assessment: a study of video calls provided by social media applications
RU2783966C1 (ru) Способ обработки входящих звонков
CN115731937A (zh) 信息处理方法、装置、电子设备及存储介质
CN115914673A (zh) 一种基于流媒体服务的合规检测方法及装置
CN115798479A (zh) 确定会话信息的方法、装置、电子设备及存储介质
CN116975242A (zh) 语音播报打断处理方法、装置、设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20879986

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20879986

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20/02/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20879986

Country of ref document: EP

Kind code of ref document: A1