WO2021169423A1 - Quality inspection method, device, equipment and storage medium for customer service recording - Google Patents

Quality inspection method, device, equipment and storage medium for customer service recording

Info

Publication number
WO2021169423A1
Authority
WO
WIPO (PCT)
Prior art keywords
quality inspection
customer service
service recording
candidate
score
Prior art date
Application number
PCT/CN2020/129256
Other languages
English (en)
French (fr)
Inventor
黄研洲
张超
杨海军
徐倩
杨强
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2021169423A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/635 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management

Definitions

  • This application provides a quality inspection method, device, equipment and storage medium for customer service recording, which aims to reduce the workload of customer service recording quality inspection and improve the efficiency of quality inspection.
  • the candidate segments and their scores are input into the quality inspection model, and the quality inspection result of the customer service recording is determined according to the prediction result output by the quality inspection model.
  • the step of mining the segments of the customer service recording text includes:
  • before the step of acquiring the customer service recording text and mining the segments of the customer service recording text, the method further includes:
  • this application also provides a quality inspection device for customer service recording, and the quality inspection device for customer service recording includes:
  • the mining module is used to obtain the customer service recording text and mine the segments of the customer service recording text;
  • the screening module is used to calculate the score of each segment and screen out candidate segments based on the scores;
  • the quality inspection equipment for customer service recording includes a processor, a memory, and a quality inspection program for customer service recording stored in the memory.
  • when the quality inspection program is run by the processor, the steps of the quality inspection method for customer service recording as described above are implemented.
  • this application also provides a computer storage medium that stores a quality inspection program for customer service recording; when the program is executed by a processor, the steps of the above quality inspection method for customer service recording are implemented.
  • this application provides a quality inspection method, device, equipment, and storage medium for customer service recording, which acquires the customer service recording text and mines its segments; calculates the score of each segment and screens out candidate segments based on the scores; and inputs the candidate segments and their scores into the quality inspection model, determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model.
  • FIG. 5 is a schematic diagram of the functional modules of the first embodiment of the quality inspection device for customer service recording of this application.
  • the quality inspection equipment for customer service recording mainly involved in the embodiments of the present application refers to a network connection device capable of realizing network connection, and the quality inspection equipment for customer service recording may be a server, a cloud platform, and the like.
  • the quality inspection method for customer service recording is applied to the quality inspection equipment for customer service recording, and the method includes:
  • Step S101 Obtain the customer service recording text, and mine the fragments of the customer service recording text;
  • the customer service recording text to be quality-inspected can be obtained through the preset text input interface.
  • the customer service recording text may carry no role labels, and it may also not be split into sentences.
  • the format of the customer service recording text may be txt, doc, xls, pdf, etc.
  • Step S101-1 obtaining keywords of the recorded text of the customer service, and obtaining a keyword set based on the keywords;
  • the step of obtaining keywords of the customer service recording text and obtaining a keyword set based on the keywords includes:
  • Step a Perform word segmentation and part-of-speech tagging on the customer service recording text, and filter based on the part-of-speech tagging results to obtain candidate keywords;
  • word segmentation mainly refers to Chinese word segmentation, that is, splitting a sequence of Chinese characters into individual words: the process of recombining a continuous character sequence into a word sequence according to a given specification.
  • there are word segmentation methods based on string matching, methods based on understanding, and methods based on statistics.
  • the word segmentation methods based on string matching include forward maximum matching, reverse maximum matching, bidirectional maximum matching, N-gram (Chinese language model) bidirectional maximum matching, and the like.
  • forward maximum matching scans from left to right and matches the longest run of consecutive characters in the customer service recording text against the vocabulary; if they match, a word is segmented.
  • reverse maximum matching scans from right to left and matches the longest run of consecutive characters in the customer service recording text against the vocabulary; if they match, a word is segmented.
  • bidirectional maximum matching includes both the forward maximum matching algorithm and the reverse maximum matching algorithm.
  • N-gram bidirectional maximum matching builds on the forward and reverse maximum matching algorithms of the string-based methods: where the sequences obtained in the two directions differ, a bi-gram model is used to pick the more probable part, and the parts are finally concatenated to obtain the best word sequence.
  • part-of-speech tagging can be performed based on a dictionary search algorithm of string matching. Specifically, the part of speech of each word is searched from the dictionary, and the corresponding part of speech is marked according to the found part of speech.
  • the customer service recording text is filtered based on the part-of-speech tagging results, for example, words with specified parts of speech are filtered out; words appearing in the stop word list can also be filtered out, and the stop word list is preset; It is also possible to filter words whose length is less than a preset value according to the length of the word.
  • the preset value may be 3. Mark the remaining words after filtering as candidate keywords.
  • Step b Select one candidate keyword at a time from the candidate keywords and construct a keyword graph for each candidate keyword.
  • the keyword graph includes four edges formed by the selected candidate keyword and each of the four candidate keywords that follow it;
  • a keyword graph is constructed for each candidate keyword, with that candidate keyword as the center.
  • the candidate keyword and the four candidate keywords following it form the four edges of the keyword graph, and the keyword graph includes these four edges and their corresponding weights.
  • with the selected candidate keyword denoted A and the four candidate keywords following it denoted B, C, D and E, the four edges are (A,B), (A,C), (A,D) and (A,E). The initial weight of each edge is 1; each time the edge appears again, 1 is added to the weight, and the final sum is used as the weight.
  • one candidate keyword is selected at a time and its keyword graph is constructed, until a keyword graph has been constructed for every candidate keyword.
  • the preset formula is: (formula image not reproduced in this text)
  • S represents the weight
  • d represents the damping coefficient
  • ε represents the candidate keyword set
  • i represents the target candidate keyword
  • j represents each candidate keyword before i
  • w represents the degree of importance between i and j
  • out(vj) is the number of candidate keywords.
  • the value of the damping coefficient d may be 0.85.
  • the weight of i depends on the weight of the edge (j, i) composed of i and the points j before i, and the sum of the weights from j to the other edges.
  • Step d Sort the candidate keywords according to the weights in the iteration results, select candidate keywords according to the ranking results, and save the candidate keywords and their corresponding weights as the keyword set.
  • the keywords may be sorted in reverse order according to the weights, and a number of candidate keywords at the top of the ranking may be selected according to the sorting result, and the number of the candidate keywords may be selected according to actual conditions. Finally, the several candidate keywords and their corresponding weights are saved as the keyword set, thereby obtaining the keyword set.
  • step S101-2 the segment is determined according to the word map and the keyword set, and the word map is constructed according to the standard example sentences of the quality inspection items.
  • the word graph defines the word-to-word transition matrix. For example, for "you", "of", "ID number", "last", "four digits", "is" and "how many", "ID number" can be transferred to the position before "of".
  • the segment is determined based on the word map.
  • the word map is constructed based on the standard example sentences of quality inspection items.
  • fragments can be obtained by processing only the customer service recording text, and there is no sentence segmentation requirement for the customer service recording text, so there will be no misjudgment due to the pause of the speaker, and the accuracy of quality inspection can be improved.
  • Step S102 Calculate the score of the fragment, and filter out candidate fragments based on the score
  • the score is a criterion for screening candidate fragments. Understandably, the score can be a percent system, a ten-point system, and so on.
  • Step S102a Obtain the similarity between the segment and the quality inspection item, and calculate the score of the segment belonging to each quality inspection item based on the similarity;
  • the score of the segment belonging to each quality inspection item is calculated.
  • the interval of the score can be set to 0-100.
  • step S102b if the score is greater than or equal to the score threshold, the corresponding segment is screened as a candidate segment.
  • the score threshold may be 80, 70, 60, and so on. If the score is greater than or equal to the score threshold, the corresponding segment is screened as a candidate segment.
  • Step S103 Input the candidate segments and their scores into the quality inspection model, and determine the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model.
  • the step of determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model includes:
  • Step S103c If the target prediction label includes all the quality inspection items, it is determined that the customer service recording meets the requirements, and the quality inspection result is determined to be qualified;
  • the quality inspection items are set according to the quality inspection requirements. For example, for loan risk assessment, the quality inspection items can be "whether you are the person, whether you have the ability to repay, whether you have the willingness to repay", and so on.
  • the standard example sentences of a quality inspection item are one or more related standard sentences for that item. For example, for the item "whether the customer is the person in question", the standard example sentences can be "what is your name", "what is your contact number", "what is your work unit" and "what is your ID number". For another example, for the quality inspection item "customer work unit address verification", the corresponding standard example sentences can be set to "where is your work unit address", "where do you work now", etc.
  • one or more quality inspection items are set, and several standard example sentences are set for each quality inspection item; the customer service recording text is obtained and its segments are mined; the score of each segment is calculated and candidate segments are selected based on the scores; the candidate segments and their scores are input into the quality inspection model, and the quality inspection result of the customer service recording is determined according to the prediction result output by the quality inspection model.
  • FIG. 4 is a schematic flowchart of the third embodiment of the quality inspection method for customer service recording of this application.
  • the method further includes:
  • Step S1031 training according to the quality inspection example sentences and their corresponding quality inspection labels to obtain a quality inspection model
  • the quality inspection model includes a representation of a word embedding layer and a representation of a multi-layer neural network.
  • the quality inspection model is constructed based on a Multilayer Perceptron (MLP), and the layers of the Multilayer Perceptron are fully connected.
  • the bottom layer of the multilayer perceptron is the input layer, the middle is the hidden layer, and the last is the output layer.
  • Word embedding is a class of methods that represent words and documents with dense vectors. It improves on the traditional bag-of-words encoding scheme, in which each word is represented by a large sparse vector, or each word is assigned a number within a vector that represents the entire vocabulary. Those representations are sparse because the vocabulary is large, so a given word or document is represented by a vector consisting mostly of zeros.
  • word embedding can be performed by Word2Vec neural network or GloVe neural network.
  • the initial parameters of the word embedding layer and the multilayer neural network are random, or the initial parameters are determined based on experience. Therefore, the quality inspection model needs to be trained, and the specific training process is as follows:
  • a cross entropy loss function is calculated based on the predicted label Z and the quality inspection label.
  • the cross-entropy loss function is calculated over mini-batches. The gradient of each parameter in the initial model is computed from the cross-entropy loss function, and each parameter is updated according to its gradient; that is, the parameters of the word embedding layer and the multi-layer neural network in the quality inspection model are adjusted.
  • the process of updating the parameters according to the cross-entropy loss function is similar to existing model parameter update processes and is not described in detail here.
  • Step S1032 judging whether the quality inspection model converges according to the loss function
  • Step S1033 If the quality inspection model is in a convergent state, stop training, save the model parameters, and obtain the quality inspection model.
  • the quality inspection model is obtained by training on the quality inspection example sentences and their corresponding quality inspection labels; whether the quality inspection model has converged is judged according to the loss function; if the quality inspection model is in a convergent state, training is stopped, the model parameters are saved, and the quality inspection model is obtained.
  • the quality inspection model is trained according to the quality inspection example sentences, which improves the pertinence of the quality inspection model, and can improve the effect and accuracy of the quality inspection of customer service recordings.
  • the quality inspection device for customer service recording includes:
  • the quality inspection module is used to input the candidate fragments and their scores into the quality inspection model, and determine the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model.
  • the screening module includes:
  • the first screening unit is configured to screen the corresponding fragment as a candidate fragment if the score is greater than or equal to the score threshold.
  • the quality inspection module further includes:
  • the training unit is used to train to obtain the quality inspection model according to the quality inspection example sentences and their corresponding quality inspection labels;
  • the obtaining unit is configured to stop training if the quality inspection model is in a convergent state, save the model parameters, and obtain the quality inspection model.
  • the mining module further includes:
  • a mining unit configured to obtain keywords of the customer service recording text, and obtain a keyword set based on the keywords;
  • the determining unit is configured to determine the segment according to the word map and the keyword set, and the word map is constructed according to the standard example sentences of the quality inspection items.
  • the mining unit further includes:
  • the first selection subunit is used to select one candidate keyword at a time from the candidate keywords and construct a keyword graph for each candidate keyword.
  • the keyword graph includes four edges formed by the selected candidate keyword and each of the four candidate keywords that follow it;
  • the iterative subunit is used to iteratively propagate the weight of each node of the keyword graph according to a preset formula until convergence;
  • the second selection subunit is used to sort the candidate keywords according to the weights in the iteration result, select candidate keywords according to the ranking, and save the selected candidate keywords and their corresponding weights as the keyword set.
  • the quality inspection module further includes:
  • the second screening unit is used to screen the prediction result to obtain a target prediction label with a probability greater than or equal to a probability threshold;
  • a comparison unit configured to compare the target prediction label with the quality inspection item, and determine whether the customer service recording meets the requirements
  • the first determining unit is configured to determine that the customer service recording meets the requirements if the target prediction label includes all the quality inspection items, and determine the quality inspection result as qualified;
  • the second determination unit is configured to determine that the customer service recording text does not meet the requirements if the target prediction tag does not include all the quality inspection items, and determine the quality inspection result as unqualified.
  • the setting unit is used to set one or more quality inspection items, and set several quality inspection item standard example sentences for the quality inspection items.
  • the embodiment of the present application also provides a computer storage medium that stores a quality inspection program for customer service recording; when the program is executed by a processor, the steps of the quality inspection method for customer service recording described above are implemented, which are not repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Computation (AREA)
  • Strategic Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Economics (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Biology (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

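This application discloses a quality inspection method, device, equipment and storage medium for customer service recording. The method includes: obtaining the customer service recording text and mining segments of the customer service recording text; calculating the score of each segment and screening out candidate segments based on the scores; and inputting the candidate segments and their scores into a quality inspection model, and determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model.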

Description

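Quality inspection method, device, equipment and storage medium for customer service recording
Priority information
This application claims priority to Chinese patent application No. 202010123365.9, filed on February 26, 2020 and entitled "Quality inspection method, device, equipment and storage medium for customer service recording", the entire contents of which are incorporated into this application by reference.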
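Technical field
This application relates to the field of artificial intelligence, and in particular to a quality inspection method, device, equipment and storage medium for customer service recording.
Background
With the development of computer technology, more and more technologies (big data, distributed computing, blockchain, artificial intelligence, etc.) are being applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). However, the security and real-time requirements of the financial industry also place higher demands on these technologies.
At present, for quality inspection recordings that contain both the customer service agent and the customer, if the recording was captured on a single channel, the speaker roles cannot be separated physically. Either all of the recording has to be inspected, or the recording text to be inspected has to be annotated with roles first and then inspected selectively according to the annotation. As a result, the quality inspection of customer service recordings involves a heavy workload and is time-consuming and laborious.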
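Summary
This application provides a quality inspection method, device, equipment and storage medium for customer service recording, aiming to reduce the workload of customer service recording quality inspection and improve its efficiency.
To achieve the above objective, this application provides a quality inspection method for customer service recording, the method including:
obtaining the customer service recording text and mining segments of the customer service recording text;
calculating the score of each segment and screening out candidate segments based on the scores;
inputting the candidate segments and their scores into a quality inspection model, and determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model.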
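In an embodiment, the step of calculating the score of each segment and screening out candidate segments based on the scores includes:
obtaining the similarity between the segment and the quality inspection items, and calculating, based on the similarity, the score of the segment for each quality inspection item;
if the score is greater than or equal to a score threshold, screening the corresponding segment as a candidate segment.
In an embodiment, before the step of inputting the candidate segments and their scores into the quality inspection model and determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model, the method further includes:
training on quality inspection example sentences and their corresponding quality inspection labels to obtain the quality inspection model;
judging, according to the loss function, whether the quality inspection model has converged;
if the model is in a convergent state, stopping training and obtaining the quality inspection model.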
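In an embodiment, the step of mining segments of the customer service recording text includes:
obtaining keywords of the customer service recording text, and obtaining a keyword set based on the keywords;
determining the segments according to a word graph and the keyword set, the word graph being constructed from the standard example sentences of the quality inspection items.
In an embodiment, the step of obtaining keywords of the customer service recording text and obtaining a keyword set based on the keywords includes:
performing word segmentation and part-of-speech tagging on the customer service recording text, and filtering based on the part-of-speech tagging result to obtain candidate keywords;
selecting one candidate keyword at a time from the candidate keywords and constructing a keyword graph for each candidate keyword, the keyword graph including four edges formed by the selected candidate keyword and each of the four candidate keywords that follow it;
iteratively propagating the weight of each node of the keyword graph according to a preset formula until convergence;
sorting the candidate keywords according to the weights in the iteration result, selecting candidate keywords according to the ranking, and saving the selected candidate keywords and their corresponding weights as the keyword set.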
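In an embodiment, the step of determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model includes:
screening the prediction result to obtain target predicted labels whose probability is greater than or equal to a probability threshold;
comparing the target predicted labels with the quality inspection items to judge whether the customer service recording meets the requirements;
if the target predicted labels include all of the quality inspection items, determining that the customer service recording meets the requirements and setting the quality inspection result to qualified;
if the target predicted labels do not include all of the quality inspection items, determining that the customer service recording text does not meet the requirements and setting the quality inspection result to unqualified.
In an embodiment, before the step of obtaining the customer service recording text and mining segments of the customer service recording text, the method further includes:
setting one or more quality inspection items, and setting several standard example sentences for each quality inspection item.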
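To achieve the above objective, this application also provides a quality inspection device for customer service recording, the device including:
a mining module, used to obtain the customer service recording text and mine segments of the customer service recording text;
a screening module, used to calculate the score of each segment and screen out candidate segments based on the scores;
a quality inspection module, used to input the candidate segments and their scores into the quality inspection model and determine the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model.
To achieve the above objective, this application also provides quality inspection equipment for customer service recording. The equipment includes a processor, a memory, and a quality inspection program for customer service recording stored in the memory; when the program is run by the processor, the steps of the quality inspection method for customer service recording described above are implemented.
To achieve the above objective, this application also provides a computer storage medium on which a quality inspection program for customer service recording is stored; when the program is run by a processor, the steps of the quality inspection method for customer service recording described above are implemented.
Compared with the prior art, this application provides a quality inspection method, device, equipment and storage medium for customer service recording, which obtain the customer service recording text and mine its segments; calculate the score of each segment and screen out candidate segments based on the scores; and input the candidate segments and their scores into the quality inspection model, determining the quality inspection result of the customer service recording according to the prediction result output by the model. Segments are thus mined from the customer service recording and only the screened candidate segments are inspected, which reduces the workload of customer service recording quality inspection and improves its efficiency.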
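Brief description of the drawings
FIG. 1 is a schematic diagram of the hardware structure of the quality inspection equipment for customer service recording involved in the embodiments of this application;
FIG. 2 is a schematic flowchart of the first embodiment of the quality inspection method for customer service recording of this application;
FIG. 3 is a schematic flowchart of the second embodiment of the quality inspection method for customer service recording of this application;
FIG. 4 is a schematic flowchart of the third embodiment of the quality inspection method for customer service recording of this application;
FIG. 5 is a schematic diagram of the functional modules of the first embodiment of the quality inspection device for customer service recording of this application.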
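Detailed description
It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.
The quality inspection equipment for customer service recording mainly involved in the embodiments of this application is a network-connected device capable of establishing network connections; it may be a server, a cloud platform, and the like.
Referring to FIG. 1, FIG. 1 is a schematic diagram of the hardware structure of the quality inspection equipment for customer service recording involved in the embodiments of this application. In the embodiments of this application, the equipment may include a processor 1001 (for example a central processing unit, CPU), a communication bus 1002, an input port 1003, an output port 1004 and a memory 1005. The communication bus 1002 provides connection and communication among these components; the input port 1003 is used for data input; the output port 1004 is used for data output. The memory 1005 may be high-speed RAM or non-volatile memory such as disk storage, and may optionally be a storage device independent of the processor 1001. Those skilled in the art will understand that the hardware structure shown in FIG. 1 does not limit this application, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
Still referring to FIG. 1, the memory 1005, as a readable storage medium, may include an operating system, a network communication module, an application program module and the quality inspection program for customer service recording. In FIG. 1, the network communication module is mainly used to connect to a server and exchange data with it, while the processor 1001 can call the quality inspection program stored in the memory 1005 and execute the quality inspection method for customer service recording provided by the embodiments of this application.
The embodiments of this application provide a quality inspection method for customer service recording.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of the first embodiment of the quality inspection method for customer service recording of this application.
In this embodiment, the quality inspection method for customer service recording is applied to the quality inspection equipment for customer service recording, and the method includes: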
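Step S101: obtain the customer service recording text and mine segments of the customer service recording text;
In this embodiment, the customer service recording text to be inspected can be obtained through a preset text input interface. The text may carry no role labels, and it may also not be split into sentences. Its format may be txt, doc, xls, pdf, etc.
In this embodiment, keywords are mined with the TextRank algorithm. TextRank is an algorithm for mining keywords from text and can be implemented on a keyword graph. In general, if a word appears after many other words, that word is relatively important; if a word follows a word with a high TextRank value, its own TextRank value will also be relatively high.
The step of mining segments of the customer service recording text includes:
Step S101-1: obtain keywords of the customer service recording text, and obtain a keyword set based on the keywords;
Scan the customer service recording text and mine its keywords using natural language processing techniques to obtain one or more keywords.
Specifically, the step of obtaining keywords of the customer service recording text and obtaining a keyword set based on the keywords includes:
Step a: perform word segmentation and part-of-speech tagging on the customer service recording text, and filter based on the part-of-speech tagging result to obtain candidate keywords;
In this embodiment, word segmentation mainly refers to Chinese word segmentation, that is, splitting a sequence of Chinese characters into individual words: the process of recombining a continuous character sequence into a word sequence according to a given specification. In general, there are word segmentation methods based on string matching, methods based on understanding, and methods based on statistics. The string-matching methods in turn include forward maximum matching, reverse maximum matching, bidirectional maximum matching, N-gram (Chinese language model) bidirectional maximum matching, and so on. Forward maximum matching scans from left to right and matches the longest run of consecutive characters in the customer service recording text against the vocabulary; if they match, a word is segmented. Reverse maximum matching does the same from right to left. Bidirectional maximum matching runs both the forward and the reverse maximum matching algorithms: if the two produce the same segmentation, the segmentation is judged successful; otherwise an ambiguity or an out-of-vocabulary word is judged to have occurred. N-gram bidirectional maximum matching builds on the forward and reverse maximum matching algorithms of the string-based methods: where the sequences obtained in the two directions differ, a bi-gram model is used to pick the more probable part, and the parts are finally concatenated to obtain the best word sequence.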
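The forward, reverse and bidirectional maximum matching described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names, the `max_len` limit of four characters, the single-character fallback and the toy vocabulary `VOCAB` are all assumptions made for the example.

```python
def forward_max_match(text, vocab, max_len=4):
    """Segment left to right, always taking the longest vocabulary match."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Assumed fallback: a single character is always accepted,
            # so the scan advances even for out-of-vocabulary text.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, vocab, max_len=4):
    """Segment right to left, always taking the longest vocabulary match."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    return words[::-1]


def bidirectional_max_match(text, vocab, max_len=4):
    """Return (words, agreed); agreed is False when the two directions differ,
    which the text treats as an ambiguity or out-of-vocabulary case."""
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)
    return fwd, fwd == bwd


# Hypothetical toy vocabulary built from the example sentence used later in the text.
VOCAB = {"您", "的", "身份证号", "后", "四位", "是", "多少"}
```

When the two directions disagree, the N-gram variant described above would resolve the differing spans with bi-gram probabilities; that step is left out here because it needs a trained language model.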
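After the customer service recording text has been segmented, part-of-speech tagging can be performed with a dictionary lookup algorithm based on string matching. Specifically, the part of speech of each word is looked up in the dictionary and the word is tagged accordingly.
After tagging, the customer service recording text is filtered based on the part-of-speech tagging result, for example by filtering out words with specified parts of speech. Words that appear in a preset stop word list can also be filtered out, and words whose length is less than a preset value can be filtered out by length; for example, the preset value may be 3. The words remaining after filtering are marked as candidate keywords.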
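The three filters just described (part of speech, stop word list, minimum length) can be combined in one pass. The sketch below is an illustration under assumptions: the function name, the part-of-speech tag letters and the sample stop word list are hypothetical, and only the idea of a preset minimum length comes from the text.

```python
def filter_candidates(tagged_words, drop_pos, stop_words, min_len):
    """Keep words that pass the part-of-speech, stop-word and length filters.

    tagged_words: list of (word, pos_tag) pairs from the tagging step.
    drop_pos:     set of part-of-speech tags to discard (assumed tag names).
    stop_words:   preset stop word list.
    min_len:      words shorter than this are discarded.
    """
    return [
        word
        for word, pos in tagged_words
        if pos not in drop_pos and word not in stop_words and len(word) >= min_len
    ]


# Hypothetical tagged input; the tag letters are illustrative only.
TAGGED = [("您", "r"), ("的", "u"), ("身份证号", "n"), ("后", "f"),
          ("四位", "m"), ("是", "v"), ("多少", "r")]
```

With `min_len=3`, the example value given in the text, only "身份证号" survives from this sentence, so a smaller value may suit short Chinese words better.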
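Step b: select one candidate keyword at a time from the candidate keywords and construct a keyword graph for each candidate keyword, the keyword graph including four edges formed by the selected candidate keyword and each of the four candidate keywords that follow it;
A keyword graph is constructed for each candidate keyword, with that candidate keyword as the center.
Specifically, first select a candidate keyword, denote it A, and mark candidate keyword A as the target candidate keyword;
With the target candidate keyword as the center, obtain the four candidate keywords that immediately follow it; suppose the four candidate keywords after A are B, C, D and E;
The candidate keyword and each of the four candidate keywords following it form the four edges of the keyword graph, and the keyword graph includes these four edges and their corresponding weights.
The keyword graph consists of the four edges formed by the selected candidate keyword and each of the four candidate keywords that follow it: (A,B), (A,C), (A,D) and (A,E). The initial weight of each edge is 1; each time the edge appears again later, 1 is added to the weight, and the final sum is used as the weight.
One candidate keyword is selected at a time and its keyword graph is constructed, until a keyword graph has been constructed for every candidate keyword.
For example, take "您" (you), "的" (of), "身份证号" (ID number), "后" (last), "四位" (four digits), "是" (is), "多少" (how many). If "身份证号" is selected as the candidate keyword in this sentence, the four corresponding edges are (身份证号, 后), (身份证号, 四位), (身份证号, 是) and (身份证号, 多少).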
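The graph construction in step b amounts to sliding a window of four over the candidate keyword sequence and counting repeated edges. A minimal sketch, assuming the candidates are given in text order; the function name and the `window` parameter are illustrative.

```python
from collections import defaultdict


def build_keyword_graph(candidates, window=4):
    """Link each candidate keyword to the `window` candidates that follow it.

    Returns a dict mapping (head, tail) edges to integer weights.
    """
    edges = defaultdict(int)
    for i, head in enumerate(candidates):
        for tail in candidates[i + 1:i + 1 + window]:
            # The first occurrence gives weight 1; every repeat adds 1.
            edges[(head, tail)] += 1
    return dict(edges)


graph = build_keyword_graph(["您", "的", "身份证号", "后", "四位", "是", "多少"])
id_edges = sorted(edge for edge in graph if edge[0] == "身份证号")
```

Here `id_edges` holds exactly the four edges listed in the example above. Candidates near the end of the text have fewer than four followers and therefore contribute fewer edges.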
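Step c: iteratively propagate the weight of each node of the keyword graph according to a preset formula until convergence;
In this embodiment, the preset formula is:
[Formula image PCTCN2020129256-appb-000001, not reproduced in this text]
where
S represents the weight, d represents the damping coefficient, ε represents the candidate keyword set, i represents the target candidate keyword, j represents each candidate keyword before i, w represents the degree of importance between i and j, and out(vj) is the number of candidate keywords. The damping coefficient d may take the value 0.85.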
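The formula itself survives here only as an image reference. Assuming it is the standard weighted TextRank update, which is consistent with the symbols S, d, w and out(v_j) defined above and with the description of how the weight of i depends on the edges (j, i), it would read as follows. The notation in(v_i) for the candidate keywords that precede i and the inner index k are introduced for this sketch and do not appear in the source.

```latex
S(v_i) = (1 - d) + d \sum_{v_j \in \mathrm{in}(v_i)}
         \frac{w_{ji}}{\sum_{v_k \in \mathrm{out}(v_j)} w_{jk}} \, S(v_j)
```

With d = 0.85, a node with no incoming edges settles at S = 0.15, and every other node adds to that the share of each predecessor's weight carried by the edge pointing at it.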
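Thus the weight of i depends on the weight of each edge (j, i) formed by i and a node j that precedes i, and on the sum of the weights of the edges from j to its other nodes.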
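The iterative propagation can be sketched as a fixed-point loop over the weighted edges. The sketch assumes the standard weighted TextRank update, in which each node j passes on its weight in proportion to w(j, i) divided by the total weight of the edges leaving j; the function name, the tolerance `tol` and the iteration cap `max_iter` are illustrative choices, and only d = 0.85 comes from the text.

```python
def textrank(edges, d=0.85, tol=1e-6, max_iter=200):
    """Iterate S(i) = (1 - d) + d * sum_j w(j, i) / out_weight(j) * S(j).

    edges: dict mapping (j, i) to a positive weight, meaning j precedes i.
    Returns a dict of node weights once the largest change drops below tol.
    """
    nodes = {n for edge in edges for n in edge}
    out_weight = {n: 0.0 for n in nodes}
    for (j, _), w in edges.items():
        out_weight[j] += w
    scores = {n: 1.0 for n in nodes}
    for _ in range(max_iter):
        new_scores = {n: 1.0 - d for n in nodes}
        for (j, i), w in edges.items():
            new_scores[i] += d * w / out_weight[j] * scores[j]
        delta = max(abs(new_scores[n] - scores[n]) for n in nodes)
        scores = new_scores
        if delta < tol:
            break
    return scores


scores = textrank({("A", "B"): 1, ("A", "C"): 1, ("B", "C"): 1})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Sorting the result in descending order, as in `ranking`, gives the ordering that the next step uses to pick the top candidate keywords.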
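Step d: sort the candidate keywords according to the weights in the iteration result, select candidate keywords according to the ranking, and save the selected candidate keywords and their corresponding weights as the keyword set.
In this embodiment, the keywords can be sorted in descending order of weight and a number of top-ranked candidate keywords selected; how many are selected can be chosen according to the actual situation. Finally, these candidate keywords and their corresponding weights are saved as the keyword set, which yields the keyword set.
Step S101-2: determine the segments according to the word graph and the keyword set, the word graph being constructed from the standard example sentences of the quality inspection items. The word graph defines the word-to-word transition matrix. For example, for "您" (you), "的" (of), "身份证号" (ID number), "后" (last), "四位" (four digits), "是" (is), "多少" (how many), "身份证号" can be transferred to the position before "的".
After the keywords are obtained, the segments are determined based on the word graph. In this embodiment, the word graph is constructed from the standard example sentences of the quality inspection items.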
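The text does not spell out how segments are read off the word graph, so the following is only one plausible reading, offered as a sketch. It assumes the word graph is the set of word-to-word transitions observed in the tokenized standard example sentences, and that a segment is a maximal run of tokens whose adjacent pairs are all in that graph and which contains at least one keyword. Both function names and the selection rule are assumptions.

```python
def build_word_graph(example_sentences):
    """Collect word-to-word transitions seen in tokenized standard example sentences."""
    transitions = set()
    for tokens in example_sentences:
        transitions.update(zip(tokens, tokens[1:]))
    return transitions


def mine_segments(tokens, word_graph, keywords):
    """Return maximal token runs whose adjacent pairs are all in the word graph
    and which contain at least one keyword (an assumed selection rule)."""
    segments, run = [], tokens[:1]
    for prev, cur in zip(tokens, tokens[1:]):
        if (prev, cur) in word_graph:
            run.append(cur)
        else:
            segments.append(run)
            run = [cur]
    if run:
        segments.append(run)
    return [seg for seg in segments if any(tok in keywords for tok in seg)]


# Hypothetical data: one standard example sentence and an unsegmented dialogue.
EXAMPLES = [["您", "的", "工作单位", "地址", "在", "哪里"]]
DIALOGUE = ["您", "的", "工作单位", "地址", "在", "哪里", "在", "广东", "深圳市", "南山区"]
found = mine_segments(DIALOGUE, build_word_graph(EXAMPLES), {"工作单位", "地址"})
```

Under these assumptions the agent's question is recovered as one segment and the customer's answer is dropped, without role labels or sentence boundaries in the input.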
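Segments can thus be obtained by processing the customer service recording text alone, with no requirement that the text be split into sentences, so pauses by the speaker do not cause misjudgments and the accuracy of quality inspection can be improved. For example, in an auto finance loan scenario: "Agent: Where is your work unit address? Customer: In Nanshan District, Shenzhen, Guangdong." Because the speaker roles are unknown, a quality inspection model would have to inspect both sentences (what the customer says should not fall within the scope of inspection), which hurts efficiency. In addition, because of a pause by the speaker, the agent utterance "Where is your work unit address" may be recognized by the speech recognition system as two sentences, "your work unit" and "where is the address". The quality inspection model can then easily confuse "your work unit" with the quality inspection item "customer work unit verification", causing an error.
Step S102: calculate the score of each segment and screen out candidate segments based on the scores;
The score is the criterion for screening candidate segments. Understandably, the score may be on a scale of 100, a scale of 10, and so on.
Specifically, the step of calculating the score of each segment according to a preset function and screening out candidate segments based on the scores includes:
Step S102a: obtain the similarity between the segment and the quality inspection items, and calculate, based on the similarity, the score of the segment for each quality inspection item;
In this embodiment, the function is defined as Score(Span, Si, Li, Di, Zi), where Score is the score, Span is the segment, Si is the keyword set, Li is the word graph, Di is the customer service recording text, and Zi is the quality inspection item.
The score of the segment for each quality inspection item is calculated from the similarity between the segment Span and the quality inspection item Zi: the higher the similarity, the higher the score. The score range can be set to 0-100.
Step S102b: if the score is greater than or equal to the score threshold, screen the corresponding segment as a candidate segment.
The score threshold may be 80, 70, 60, and so on. If the score is greater than or equal to the score threshold, the corresponding segment is screened as a candidate segment.
Conversely, if the score is less than the score threshold, the segment corresponding to that score is ignored.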
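Steps S102a and S102b can be sketched as below. The text does not say how the similarity is computed, so Jaccard overlap between token sets is used purely as a stand-in, scaled to the 0-100 range mentioned above. The function names and the sample item are hypothetical, and each quality inspection item is assumed to have at least one standard example sentence.

```python
def similarity(segment_tokens, example_tokens):
    """Jaccard overlap of two token lists (a stand-in for the unspecified similarity)."""
    a, b = set(segment_tokens), set(example_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0


def score_segment(segment_tokens, items):
    """Score the segment against every quality inspection item on a 0-100 scale.

    items maps an item name to its tokenized standard example sentences.
    """
    return {
        item: 100.0 * max(similarity(segment_tokens, ex) for ex in examples)
        for item, examples in items.items()
    }


def select_candidates(segments, items, threshold=80.0):
    """Keep (segment, item, score) triples whose score reaches the threshold."""
    selected = []
    for seg in segments:
        for item, score in score_segment(seg, items).items():
            if score >= threshold:
                selected.append((seg, item, score))
    return selected


# Hypothetical quality inspection item with one standard example sentence.
ITEMS = {"work unit address": [["您", "的", "工作单位", "地址", "在", "哪里"]]}
```

A segment identical to a standard example sentence scores 100 and is kept; a segment that shares a single token with it scores about 14 and is ignored at any of the thresholds mentioned in the text.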
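Step S103: input the candidate segments and their scores into the quality inspection model, and determine the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model.
Specifically, the step of determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model includes:
Step S103a: screen the prediction result to obtain target predicted labels whose probability is greater than or equal to the probability threshold;
Obtain the predicted labels output by the quality inspection model; the predicted labels include the prediction results and their probabilities. The prediction results are screened by probability and divided into two parts: first prediction results, whose probability is greater than or equal to the probability threshold, and second prediction results, whose probability is less than the probability threshold. In this embodiment, to obtain an accurate quality inspection result, the first prediction results are marked as the target prediction results, and the target predicted labels corresponding to the target prediction results are obtained.
Step S103b: compare the target predicted labels with the quality inspection items to judge whether the customer service recording meets the requirements;
In this embodiment, the target predicted labels correspond to the quality inspection items. First compare whether the number of target predicted labels equals the number of quality inspection items;
If the number of target predicted labels equals the number of quality inspection items, further compare whether each target predicted label is identical in content to a quality inspection item; if the predicted labels and the quality inspection items are identical in content, the target predicted labels are judged to include all of the quality inspection items.
If the number of target predicted labels is less than the number of quality inspection items, the target predicted labels are directly judged not to include all of the quality inspection items.
Understandably, if identical predicted labels exist, all identical target predicted labels are merged beforehand.
Step S103c: if the target predicted labels include all of the quality inspection items, determine that the customer service recording meets the requirements and set the quality inspection result to qualified;
If the target predicted labels do not include all of the quality inspection items, determine that the customer service recording text does not meet the requirements and set the quality inspection result to unqualified.
For example, suppose ten quality inspection items are set for loan risk assessment. If the target predicted labels include all ten quality inspection items, the customer service recording is judged to meet the requirements and the quality inspection result is qualified; if the target predicted labels do not fully include the ten quality inspection items, the customer service recording is judged not to meet the requirements and the quality inspection result is unqualified.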
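Steps S103a to S103c reduce to a probability filter followed by a set inclusion check. A minimal sketch, in which the function name, the item names and the default threshold of 0.5 are assumptions (the text gives no threshold value):

```python
def inspect(predictions, required_items, prob_threshold=0.5):
    """Return 'qualified' when every required item has a confident predicted label.

    predictions:    list of (label, probability) pairs output by the model.
    required_items: the configured quality inspection items.
    """
    # A set merges duplicate labels, matching the merging step in the text.
    target_labels = {label for label, prob in predictions if prob >= prob_threshold}
    return "qualified" if set(required_items) <= target_labels else "unqualified"


# Hypothetical model output for a loan risk assessment call.
PREDICTIONS = [
    ("identity", 0.9),
    ("identity", 0.8),
    ("repayment ability", 0.7),
    ("repayment willingness", 0.3),
]
REQUIRED = ["identity", "repayment ability", "repayment willingness"]
```

Using inclusion rather than the count-then-compare procedure in the text gives the same verdict when the labels match the items one to one, and it also tolerates extra labels that are not quality inspection items.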
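With the above scheme, this embodiment obtains the customer service recording text and mines its segments; calculates the score of each segment and screens out candidate segments based on the scores; and inputs the candidate segments and their scores into the quality inspection model, determining the quality inspection result of the customer service recording according to the prediction result output by the model. Segments are thus mined from the customer service recording and only the screened candidate segments are inspected, which reduces the workload of customer service recording quality inspection and improves its efficiency.
Based on the first embodiment shown in FIG. 2, a second embodiment of this application is proposed. As shown in FIG. 3, FIG. 3 is a schematic flowchart of the second embodiment of the quality inspection method for customer service recording of this application.
Further, before step S101 of obtaining the customer service recording text and mining segments of the customer service recording text, the method also includes:
Step S100: set one or more quality inspection items, and set several standard example sentences for each quality inspection item.
The quality inspection items are set according to the quality inspection requirements. For example, for loan risk assessment, the quality inspection items can be "whether the customer is the person in question, whether the customer is able to repay, whether the customer is willing to repay", and so on.
Further, several standard example sentences are then set for each quality inspection item. The standard example sentences of a quality inspection item are one or more related standard sentences for that item. For example, for the item "whether the customer is the person in question", the standard example sentences can be "what is your name", "what is your contact number", "what is your work unit" and "what is your ID number". For another example, for the quality inspection item "customer work unit address verification", the corresponding standard example sentences can be set to "where is your work unit address", "where do you work now", etc.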
本实施例通过上述方案,设置一个或多个质检项,并为所述质检项设置若干个质检项标准例句;获取客服录音文本,挖掘所述客服录音文本的片段;计算所述片段的分数,并基于所述分数筛选出候选片段;将所述候选片段及其分数输入质检模型,并根据所述质检模型输出的预测结果确定客服录音的质检结果。由此,从客服录音中挖掘片段,并只对筛选出来的候选片段进行质检,并且,设置所述质检项可以有针对性地客服录音进行质检,减轻了客服录音质检的工作量,提高了质检效率。
基于上述图1、图2所述的第一实施例和第二实施例,提出本申请的第三实施例,如图4所示,图4是本申请客服录音的质检方法第三实施例的流程示意图
所述将所述候选片段及其分数输入质检模型,并根据所述质检模型输出的预测结果确定客服录音的质检结果的步骤之前还包括:
步骤S1031,根据质检例句及其对应的质检标签进行训练获得质检模型;
所述质检模型包括词嵌入层的表示和多层神经网络的表示。
本实施例基于多层感知机(MLP,Multilayer Perceptron)构建所述质检模型,多层感知机层与层之间是全连接的。多层感知机的最底层是输入层,中 间是隐藏层,最后是输出层。词嵌入是使用密集向量表示来表示单词和文档的一类方法。这是对传统的袋型(bag-of-word)模型编码方案的改进,其中使用大的稀疏向量来表示每个单词或向量中的每个单词进行数字分配以表示整个词汇表。这些表示是稀疏的,因为词汇是广泛的,这样一个给定的单词或文档将由一个主要由零值组成的向量几何表示。本实施例中,可以通过Word2Vec神经网络或GloVe神经网络来进行词嵌入。
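For the quality inspection model, the initial parameters of the word embedding layer and the multilayer neural network are random, or are determined empirically. The quality inspection model therefore needs to be trained. The specific training process is as follows:
A large number of quality inspection example sentences are collected and annotated with quality inspection labels. The quality inspection example sentences are denoted X1, X2, ..., Xn, and the corresponding quality inspection labels are denoted Y1, Y2, ..., Yn. The quality inspection example sentences are input into the quality inspection model built from the word embedding layer and the multilayer neural network; the word embedding layer and the multilayer neural network in the model process the example sentences using the initial parameters and output a predicted label Z. After the predicted label Z is obtained, a cross-entropy loss function is calculated from the predicted label Z and the quality inspection labels. In this embodiment, the cross-entropy loss function is calculated over mini-batches. The gradient of each parameter of the model is calculated from the cross-entropy loss function, and each parameter is updated according to its gradient; that is, the parameters of the word embedding layer and the multilayer neural network in the quality inspection model are adjusted. The process of updating the parameters from the cross-entropy loss function is similar to existing model parameter update procedures and is not described in detail here.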
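The mini-batch cross-entropy loss and one gradient update can be sketched as below. To keep the example short, it assumes a single linear layer followed by softmax instead of the full word embedding layer and multilayer perceptron, and a fixed learning rate of 0.5; the function names are hypothetical. The gradient of the cross-entropy loss with respect to each logit is the predicted probability minus the one-hot label.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (feature vector, index of the true label)


def softmax(z: List[float]) -> List[float]:
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x: List[float], W: List[List[float]], b: List[float]) -> List[float]:
    """Label probabilities from a linear layer followed by softmax."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)])


def batch_cross_entropy(batch: List[Sample], W, b) -> float:
    """Mean cross-entropy over one mini-batch."""
    return -sum(math.log(predict(x, W, b)[y]) for x, y in batch) / len(batch)


def sgd_step(batch: List[Sample], W, b, lr: float = 0.5) -> None:
    """Update W and b in place with the mini-batch gradient."""
    n = len(batch)
    grad_W = [[0.0] * len(W[0]) for _ in W]
    grad_b = [0.0] * len(b)
    for x, y in batch:
        p = predict(x, W, b)
        for k in range(len(W)):
            delta = p[k] - (1.0 if k == y else 0.0)  # d(loss)/d(logit k)
            grad_b[k] += delta / n
            for i, xi in enumerate(x):
                grad_W[k][i] += delta * xi / n
    for k in range(len(W)):
        b[k] -= lr * grad_b[k]
        for i in range(len(W[0])):
            W[k][i] -= lr * grad_W[k][i]
```

Calling `sgd_step` repeatedly on successive mini-batches and recording `batch_cross_entropy` gives the loss sequence on which a convergence check can operate.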
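Step S1032: judge whether the quality inspection model has converged according to the loss function;
It is judged whether the cross-entropy loss function has converged; if the cross-entropy loss function has converged, the corresponding quality inspection model is determined to have converged.
Step S1033: if the quality inspection model is in a converged state, stop training, save the model parameters, and obtain the quality inspection model.
If the quality inspection model is in a converged state, training is stopped, the parameters of the last training iteration are saved as the final model parameters, and the quality inspection model is obtained from the final model parameters.
Conversely, if the quality inspection model has not reached a converged state, training continues: the parameters are updated iteratively until convergence, and the quality inspection model is finally obtained.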
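The source does not state the convergence criterion. One common choice, shown here purely as an assumption, is to treat the loss as converged when it changes by less than a tolerance for several consecutive iterations. The names `has_converged` and `train_until_converged`, and the default values of `tol`, `patience` and `max_iters`, are hypothetical.

```python
from typing import Callable, List


def has_converged(losses: List[float], tol: float = 1e-4, patience: int = 3) -> bool:
    """True if the last `patience` consecutive loss changes are all below `tol`."""
    if len(losses) <= patience:
        return False
    recent = losses[-(patience + 1):]
    return all(abs(recent[i] - recent[i + 1]) < tol for i in range(patience))


def train_until_converged(step: Callable[[], float], max_iters: int = 1000,
                          tol: float = 1e-4, patience: int = 3) -> List[float]:
    """Run training steps (each returning the current loss) until convergence.

    `max_iters` is a safety cap added for this sketch; the text simply
    iterates until the loss converges.
    """
    losses: List[float] = []
    for _ in range(max_iters):
        losses.append(step())
        if has_converged(losses, tol, patience):
            break  # S1033: stop training; the caller then saves the parameters
    return losses
```

Here `step` stands for one parameter update that returns the current loss value.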
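In this embodiment, the quality inspection model is a classifier whose output prediction results are predicted labels and their probabilities. The predicted labels correspond to the quality inspection items. It can be understood that, in actual operation, an input may match none of the quality inspection items, so the predicted labels also include "other".
With the above solution, this embodiment trains on quality inspection example sentences and their corresponding quality inspection labels to obtain the quality inspection model; judges whether the quality inspection model has converged according to the loss function; and, if the quality inspection model is in a converged state, stops training, saves the model parameters, and obtains the quality inspection model. Training the quality inspection model on quality inspection example sentences makes the model more targeted and can improve the effectiveness and accuracy of customer service recording quality inspection.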
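In addition, this embodiment further provides a quality inspection apparatus for customer service recordings. Referring to Figure 5, Figure 5 is a schematic diagram of the functional modules of the first embodiment of the quality inspection apparatus for customer service recordings of the present application.
In this embodiment, the quality inspection apparatus for customer service recordings is a virtual apparatus stored in the memory 1005 of the quality inspection device for customer service recordings shown in Figure 1, and implements all functions of the quality inspection program for customer service recordings: obtaining the customer service recording text and mining segments of the customer service recording text; calculating the scores of the segments and screening out candidate segments based on the scores; and inputting the candidate segments and their scores into the quality inspection model and determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model.
Specifically, the quality inspection apparatus for customer service recordings includes:
a mining module, configured to obtain the customer service recording text and mine segments of the customer service recording text;
a screening module, configured to calculate the scores of the segments and screen out candidate segments based on the scores;
a quality inspection module, configured to input the candidate segments and their scores into the quality inspection model and determine the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model.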
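The three modules listed above form a simple pipeline. The sketch below shows one possible arrangement, assuming each module is a plain callable; the class name `QualityInspectionApparatus`, the field names, and the type aliases are hypothetical and only illustrate how the modules hand data to one another.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Segment = str
ScoredSegment = Tuple[Segment, float]


@dataclass
class QualityInspectionApparatus:
    # Mining module: recording text -> segments.
    mine: Callable[[str], List[Segment]]
    # Screening module: segments -> candidate segments with their scores.
    screen: Callable[[List[Segment]], List[ScoredSegment]]
    # Quality inspection module: scored candidates -> quality inspection result.
    inspect: Callable[[List[ScoredSegment]], str]

    def run(self, recording_text: str) -> str:
        """Chain the three modules in the order given in the text."""
        segments = self.mine(recording_text)
        candidates = self.screen(segments)
        return self.inspect(candidates)
```

Because each module is injected as a callable, the mining, screening, and inspection logic can be replaced or tested independently.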
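Further, the screening module includes:
an obtaining unit, configured to obtain the similarity between each segment and the quality inspection items, and to calculate, based on the similarity, the score of the segment for each quality inspection item;
a first screening unit, configured to select the corresponding segment as a candidate segment if the score is greater than or equal to a score threshold.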
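The scoring and thresholding performed by the screening module can be sketched as follows. This section does not specify the similarity measure, so the sketch assumes Jaccard overlap between token sets and takes, for each quality inspection item, the best similarity over its standard example sentences. The function names and the threshold of 0.5 are hypothetical.

```python
from typing import Dict, List, Tuple

Tokens = List[str]


def jaccard(a: Tokens, b: Tokens) -> float:
    """Token-set overlap; a stand-in for the unspecified similarity measure."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def score_segment(segment: Tokens, item_sentences: Dict[str, List[Tokens]]) -> Dict[str, float]:
    """Score of one segment for each quality inspection item."""
    return {
        item: max((jaccard(segment, s) for s in sentences), default=0.0)
        for item, sentences in item_sentences.items()
    }


def select_candidates(segments: List[Tokens], item_sentences: Dict[str, List[Tokens]],
                      score_threshold: float = 0.5) -> List[Tuple[Tokens, Dict[str, float]]]:
    """Keep segments with at least one item score >= the score threshold."""
    candidates = []
    for segment in segments:
        scores = score_segment(segment, item_sentences)
        kept = {item: s for item, s in scores.items() if s >= score_threshold}
        if kept:
            candidates.append((segment, kept))
    return candidates
```

Each candidate is returned together with the scores that passed the threshold, so both can be handed to the quality inspection model.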
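Further, the quality inspection module also includes:
a training unit, configured to train on quality inspection example sentences and their corresponding quality inspection labels to obtain the quality inspection model;
a judging unit, configured to judge whether the quality inspection model has converged according to the loss function;
an acquiring unit, configured to stop training, save the model parameters, and obtain the quality inspection model if the quality inspection model is in a converged state.
Further, the mining module also includes:
a mining unit, configured to obtain keywords of the customer service recording text and obtain a keyword set based on the keywords;
a determining unit, configured to determine the segments according to a word graph and the keyword set, the word graph being constructed from the standard example sentences of the quality inspection items.
Further, the mining unit also includes:
a tagging subunit, configured to perform word segmentation and part-of-speech tagging on the customer service recording text, and to filter based on the part-of-speech tagging result to obtain candidate keywords;
a first selection subunit, configured to select candidate keywords one at a time in sequence from the candidate keywords and construct a keyword graph of the candidate keywords, the keyword graph including four edges formed respectively by the selected candidate keyword and the four candidate keywords following it;
an iteration subunit, configured to iteratively propagate the weights of the nodes of the keyword graph according to a preset formula until convergence;
a second selection subunit, configured to sort the candidate keywords by the weights in the iteration result, select candidate keywords according to the sorting result, and save the selected candidate keywords and their corresponding weights as the keyword set.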
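The keyword graph construction and weight propagation described for the mining unit resemble a TextRank-style procedure. The sketch below is written under that assumption: the preset formula is not given in this section, so the usual damped update with a damping factor of 0.85 on an undirected, unweighted graph is used as a stand-in. Word segmentation and part-of-speech filtering are assumed to have been done already, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def build_keyword_graph(candidates: List[str], window: int = 4) -> Dict[str, Set[str]]:
    """Link each candidate keyword to the (up to) four candidates that follow it."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for i, word in enumerate(candidates):
        for other in candidates[i + 1:i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def rank_keywords(graph: Dict[str, Set[str]], damping: float = 0.85,
                  tol: float = 1e-6, max_iters: int = 200) -> Dict[str, float]:
    """Iteratively propagate node weights until the largest change is below `tol`."""
    if not graph:
        return {}
    weights = {w: 1.0 for w in graph}
    for _ in range(max_iters):
        updated = {
            w: (1 - damping) + damping * sum(weights[n] / len(graph[n]) for n in graph[w])
            for w in graph
        }
        change = max(abs(updated[w] - weights[w]) for w in graph)
        weights = updated
        if change < tol:
            break
    return weights


def top_keywords(candidates: List[str], k: int = 5) -> List[Tuple[str, float]]:
    """Sort candidates by weight and keep the top k with their weights."""
    weights = rank_keywords(build_keyword_graph(candidates))
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]
```

The list returned by `top_keywords` pairs each selected keyword with its weight, which corresponds to the keyword set saved by the second selection subunit.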
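Further, the quality inspection module also includes:
a second screening unit, configured to filter the prediction results to obtain target predicted labels whose probability is greater than or equal to the probability threshold;
a comparison unit, configured to compare the target predicted labels with the quality inspection items and judge whether the customer service recording meets the requirements;
a first determination unit, configured to determine that the customer service recording meets the requirements and that the quality inspection result is qualified if the target predicted labels include all of the quality inspection items;
a second determination unit, configured to determine that the customer service recording does not meet the requirements and that the quality inspection result is unqualified if the target predicted labels do not include all of the quality inspection items.
Further, the mining module also includes:
a setting unit, configured to set one or more quality inspection items and to set several standard example sentences for each quality inspection item.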
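In addition, an embodiment of the present application further provides a computer storage medium on which a quality inspection program for customer service recordings is stored. When run by a processor, the quality inspection program for customer service recordings implements the steps of the quality inspection method for customer service recordings described above, which are not repeated here.
Compared with the prior art, the present application proposes a quality inspection method, apparatus, device and storage medium for customer service recordings. The method includes: obtaining the customer service recording text and mining segments of the customer service recording text; calculating the scores of the segments and screening out candidate segments based on the scores; and inputting the candidate segments and their scores into the quality inspection model and determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model. Segments are thus mined from the customer service recording and only the screened candidate segments are inspected, which reduces the workload of customer service recording quality inspection and improves its efficiency.
It should be noted that, in this document, the terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or system that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not represent the merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device to execute the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the content of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the patent protection scope of the present application.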

Claims (20)

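  1. A quality inspection method for customer service recordings, wherein the method comprises:
    obtaining customer service recording text, and mining segments of the customer service recording text;
    calculating scores of the segments, and screening out candidate segments based on the scores;
    inputting the candidate segments and their scores into a quality inspection model, and determining a quality inspection result of the customer service recording according to a prediction result output by the quality inspection model.
  2. The quality inspection method for customer service recordings according to claim 1, wherein the step of calculating scores of the segments and screening out candidate segments based on the scores comprises:
    obtaining the similarity between the segments and quality inspection items, and calculating, based on the similarity, the score of each segment for each quality inspection item;
    if the score is greater than or equal to a score threshold, selecting the corresponding segment as a candidate segment.
  3. The quality inspection method for customer service recordings according to claim 1, wherein, before the step of inputting the candidate segments and their scores into the quality inspection model and determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model, the method further comprises:
    training on quality inspection example sentences and their corresponding quality inspection labels to obtain the quality inspection model;
    judging whether the quality inspection model has converged according to a loss function;
    if the quality inspection model is in a converged state, stopping training, saving the model parameters, and obtaining the quality inspection model.
  4. The quality inspection method for customer service recordings according to claim 1, wherein the step of mining segments of the customer service recording text comprises:
    obtaining keywords of the customer service recording text, and obtaining a keyword set based on the keywords;
    determining the segments according to a word graph and the keyword set, the word graph being constructed from the standard example sentences of the quality inspection items.
  5. The quality inspection method for customer service recordings according to claim 4, wherein the step of obtaining keywords of the customer service recording text and obtaining a keyword set based on the keywords comprises:
    performing word segmentation and part-of-speech tagging on the customer service recording text, and filtering based on the part-of-speech tagging result to obtain candidate keywords;
    selecting candidate keywords one at a time in sequence from the candidate keywords, and constructing a keyword graph of the candidate keywords, the keyword graph comprising four edges formed respectively by the selected candidate keyword and the four candidate keywords following it;
    iteratively propagating the weights of the nodes of the keyword graph according to a preset formula until convergence;
    sorting the candidate keywords by the weights in the iteration result, selecting candidate keywords according to the sorting result, and saving the selected candidate keywords and their corresponding weights as the keyword set.
  6. The quality inspection method for customer service recordings according to claim 1, wherein the step of determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model comprises:
    filtering the prediction results to obtain target predicted labels whose probability is greater than or equal to a probability threshold;
    comparing the target predicted labels with the quality inspection items to judge whether the customer service recording meets the requirements;
    if the target predicted labels include all of the quality inspection items, determining that the customer service recording meets the requirements, and determining the quality inspection result to be qualified;
    if the target predicted labels do not include all of the quality inspection items, determining that the customer service recording text does not meet the requirements, and determining the quality inspection result to be unqualified.
  7. The quality inspection method for customer service recordings according to claim 1, wherein, before the step of obtaining customer service recording text and mining segments of the customer service recording text, the method further comprises:
    setting one or more quality inspection items, and setting several standard example sentences for each quality inspection item.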
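  8. A quality inspection apparatus for customer service recordings, wherein the quality inspection apparatus for customer service recordings comprises:
    a mining module, configured to obtain customer service recording text and mine segments of the customer service recording text;
    a screening module, configured to calculate scores of the segments and screen out candidate segments based on the scores;
    a quality inspection module, configured to input the candidate segments and their scores into a quality inspection model and determine a quality inspection result of the customer service recording according to a prediction result output by the quality inspection model.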
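  9. A quality inspection device for customer service recordings, wherein the quality inspection device for customer service recordings comprises a processor, a memory, and a quality inspection program for customer service recordings stored in the memory, and when the quality inspection program for customer service recordings is run by the processor, the following steps are implemented:
    obtaining customer service recording text, and mining segments of the customer service recording text;
    calculating scores of the segments, and screening out candidate segments based on the scores;
    inputting the candidate segments and their scores into a quality inspection model, and determining a quality inspection result of the customer service recording according to a prediction result output by the quality inspection model.
  10. The quality inspection device for customer service recordings according to claim 9, wherein the step of calculating scores of the segments and screening out candidate segments based on the scores comprises:
    obtaining the similarity between the segments and quality inspection items, and calculating, based on the similarity, the score of each segment for each quality inspection item;
    if the score is greater than or equal to a score threshold, selecting the corresponding segment as a candidate segment.
  11. The quality inspection device for customer service recordings according to claim 9, wherein, before the step of inputting the candidate segments and their scores into the quality inspection model and determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model, the steps further comprise:
    training on quality inspection example sentences and their corresponding quality inspection labels to obtain the quality inspection model;
    judging whether the quality inspection model has converged according to a loss function;
    if the quality inspection model is in a converged state, stopping training, saving the model parameters, and obtaining the quality inspection model.
  12. The quality inspection device for customer service recordings according to claim 9, wherein the step of mining segments of the customer service recording text comprises:
    obtaining keywords of the customer service recording text, and obtaining a keyword set based on the keywords;
    determining the segments according to a word graph and the keyword set, the word graph being constructed from the standard example sentences of the quality inspection items.
  13. The quality inspection device for customer service recordings according to claim 12, wherein the step of obtaining keywords of the customer service recording text and obtaining a keyword set based on the keywords comprises:
    performing word segmentation and part-of-speech tagging on the customer service recording text, and filtering based on the part-of-speech tagging result to obtain candidate keywords;
    selecting candidate keywords one at a time in sequence from the candidate keywords, and constructing a keyword graph of the candidate keywords, the keyword graph comprising four edges formed respectively by the selected candidate keyword and the four candidate keywords following it;
    iteratively propagating the weights of the nodes of the keyword graph according to a preset formula until convergence;
    sorting the candidate keywords by the weights in the iteration result, selecting candidate keywords according to the sorting result, and saving the selected candidate keywords and their corresponding weights as the keyword set.
  14. The quality inspection device for customer service recordings according to claim 9, wherein the step of determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model comprises:
    filtering the prediction results to obtain target predicted labels whose probability is greater than or equal to a probability threshold;
    comparing the target predicted labels with the quality inspection items to judge whether the customer service recording meets the requirements;
    if the target predicted labels include all of the quality inspection items, determining that the customer service recording meets the requirements, and determining the quality inspection result to be qualified;
    if the target predicted labels do not include all of the quality inspection items, determining that the customer service recording text does not meet the requirements, and determining the quality inspection result to be unqualified.
  15. The quality inspection device for customer service recordings according to claim 9, wherein, before the step of obtaining customer service recording text and mining segments of the customer service recording text, the steps further comprise:
    setting one or more quality inspection items, and setting several standard example sentences for each quality inspection item.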
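  16. A computer storage medium, wherein a quality inspection program for customer service recordings is stored on the computer storage medium, and when the quality inspection program for customer service recordings is run by a processor, the following steps are implemented:
    obtaining customer service recording text, and mining segments of the customer service recording text;
    calculating scores of the segments, and screening out candidate segments based on the scores;
    inputting the candidate segments and their scores into a quality inspection model, and determining a quality inspection result of the customer service recording according to a prediction result output by the quality inspection model.
  17. The computer storage medium according to claim 16, wherein the step of calculating scores of the segments and screening out candidate segments based on the scores comprises:
    obtaining the similarity between the segments and quality inspection items, and calculating, based on the similarity, the score of each segment for each quality inspection item;
    if the score is greater than or equal to a score threshold, selecting the corresponding segment as a candidate segment.
  18. The computer storage medium according to claim 16, wherein, before the step of inputting the candidate segments and their scores into the quality inspection model and determining the quality inspection result of the customer service recording according to the prediction result output by the quality inspection model, the steps further comprise:
    training on quality inspection example sentences and their corresponding quality inspection labels to obtain the quality inspection model;
    judging whether the quality inspection model has converged according to a loss function;
    if the quality inspection model is in a converged state, stopping training, saving the model parameters, and obtaining the quality inspection model.
  19. The computer storage medium according to claim 16, wherein the step of mining segments of the customer service recording text comprises:
    obtaining keywords of the customer service recording text, and obtaining a keyword set based on the keywords;
    determining the segments according to a word graph and the keyword set, the word graph being constructed from the standard example sentences of the quality inspection items.
  20. The computer storage medium according to claim 19, wherein the step of obtaining keywords of the customer service recording text and obtaining a keyword set based on the keywords comprises:
    performing word segmentation and part-of-speech tagging on the customer service recording text, and filtering based on the part-of-speech tagging result to obtain candidate keywords;
    selecting candidate keywords one at a time in sequence from the candidate keywords, and constructing a keyword graph of the candidate keywords, the keyword graph comprising four edges formed respectively by the selected candidate keyword and the four candidate keywords following it;
    iteratively propagating the weights of the nodes of the keyword graph according to a preset formula until convergence;
    sorting the candidate keywords by the weights in the iteration result, selecting candidate keywords according to the sorting result, and saving the selected candidate keywords and their corresponding weights as the keyword set.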
PCT/CN2020/129256 2020-02-26 2020-11-17 Quality inspection method, apparatus, device and storage medium for customer service recordings WO2021169423A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010123365.9A CN111368130A (zh) 2020-02-26 2020-02-26 Quality inspection method, apparatus, device and storage medium for customer service recordings
CN202010123365.9 2020-02-26

Publications (1)

Publication Number Publication Date
WO2021169423A1 true WO2021169423A1 (zh) 2021-09-02

Family

ID=71206448

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/129256 WO2021169423A1 (zh) 2020-02-26 2020-11-17 Quality inspection method, apparatus, device and storage medium for customer service recordings

Country Status (2)

Country Link
CN (1) CN111368130A (zh)
WO (1) WO2021169423A1 (zh)

Also Published As

Publication number Publication date
CN111368130A (zh) 2020-07-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20920821

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20920821

Country of ref document: EP

Kind code of ref document: A1