CN109767791A - Speech emotion recognition and application system for call-center calls - Google Patents
- Publication number: CN109767791A
- Application number: CN201910217722.5A
- Authority: CN (China)
- Prior art keywords: module, voice, mood, submodule, emotion
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
- Landscapes: Telephonic Communication Services; Information Retrieval, Db Structures And Fs Structures Therefor
Abstract
A speech emotion recognition and application system for call-center calls comprises a speech processing module for extracting and pre-processing voice information; a voice keyword detection module, which analyzes the data of the speech-feature analysis submodule to identify emotion-class keywords and topic-class keywords and thereby obtains emotion data and data on the problems the customer reports; an emotion model set module for dynamically capturing and tracking the emotional state of the caller; an emotion classification module for judging the emotion category of the voice in the call under test; and a business application module that provides response assistance for customer-service staff and management assistance for administrators. The speech emotion recognition and application system for call-center calls of the present invention enables customer-service staff to accurately understand the customer's emotion, supplies effective response schemes, and allows agents to be assessed accurately.
Description
Technical field
The present invention relates to the field of audio data processing technology, and in particular to a speech emotion recognition and application system for call-center calls.
Background art
In a modern enterprise the call center carries the substantial responsibility of maintaining customer relationships and marketing, so monitoring the quality of call-center voice service is of great significance. Performing sentiment analysis on call audio can identify the emotional states of both the agent and the customer during a service call, and thereby effectively track and monitor service quality. Most existing emotion recognition schemes for call-center calls first convert the speech to text through speech recognition and then perform sentiment analysis on the text. On the one hand, this approach depends on the accuracy and robustness of the speech recognition model, and a certain error rate is introduced during transcription; on the other hand, converting speech to text discards the emotional information carried by the speech itself, since variations in intensity, intonation, speaking rate and so on all reflect a person's emotion. Both factors reduce the accuracy of speech emotion recognition.

There are also call-center emotion recognition schemes that classify the speech signal itself, but these schemes generally use only a single emotion classification model. Because of the diversity of callers, scenarios and dialogue contexts, a single model rarely achieves stable recognition performance; moreover, such schemes only recognize emotion and cannot promptly surface the key information the customer reports in a problematic call. The service quality of call-center agents is usually assessed by manual customer scoring, which is often inaccurate: it depends entirely on the customer's subjectivity, or the agent asks the customer for a score and thereby biases it, and customers sometimes skip scoring altogether to save time. This is unfavorable to the development of the enterprise, which cannot truly understand its customers. In addition, customer-service staff are usually not professional technicians and sometimes cannot properly handle the product defects raised by customers; faced with customers of widely varying moods, agents, especially new ones, find it difficult to respond with an effective handling approach.
Summary of the invention
To solve the above problems, the present invention provides a speech emotion recognition and application system for call-center calls that enables customer-service staff to accurately understand the customer's emotion, provides effective response schemes, and allows agents to be assessed accurately.

To achieve the above goals, the technical solution adopted by the present invention is as follows:
A speech emotion recognition and application system for call-center calls comprises a speech processing module, a voice keyword detection module, an emotion model set module, an emotion classification module, a business application module and a database module.
The speech processing module includes a voice extraction submodule and a speech-feature analysis submodule. The voice extraction submodule acquires the voice in the call to be analyzed. The speech-feature analysis submodule receives the voice data of the voice extraction submodule and processes it by pre-emphasis, windowed framing and endpoint detection, so as to obtain the prosodic, voice-quality and spectral features of the extracted voice.

The voice keyword detection module receives the voice data of the speech-feature analysis submodule and identifies emotion-class keywords and topic-class keywords against an established keyword database, thereby obtaining the customer's emotion data and data on the problems the customer reports.

The emotion model set module stores a plurality of different emotion classification models; it receives the data of the speech-feature analysis submodule and dynamically captures and tracks the emotional state of the caller.

The emotion classification module obtains the data of the voice keyword detection module and the emotion model set module and uses classification models to judge the emotion category of the voice in the call under test.
The business application module includes a customer information display submodule, an emotion display submodule, a response prompting submodule, an assessment data submodule and an enterprise problem analysis submodule. The customer information display submodule is in communication with a product sales data platform and displays, according to the customer's telephone number, the customer's purchase information held on that platform. The emotion display submodule receives the data of the voice keyword detection module and the emotion classification module and displays the customer's emotion trend in the current call in real time. The response prompting submodule includes a response scheme database and a response prompting frame; the response scheme database stores product information, handling schemes for different emotion types, response phrasing and issue-handling procedures, while the response prompting frame uses machine-learning algorithms to combine the data of the voice keyword detection module, the emotion classification module and the response scheme database to automatically generate and display a suggested response. The assessment data submodule assesses service quality from the data of the emotion classification module. The enterprise problem analysis submodule analyzes the product situation from the data of the voice keyword detection module.

The database module stores and transmits the data of the voice keyword detection module, the emotion model set module, the emotion classification module and the business application module.
Further, the speech-feature analysis submodule extracts, from the prosodic, voice-quality and spectral features in the time or frequency domain, features such as short-time energy, average amplitude, short-time average zero-crossing rate, fundamental frequency, formants, Mel spectral features, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients and the spectrogram, and computes for each feature statistics including the maximum, minimum, range, frame index of the maximum, frame index of the minimum, mean, linear-approximation slope, linear-approximation offset, linear-approximation quadratic deviation, standard deviation, skewness, kurtosis, and the first-order and second-order differences.
Further, the voice keyword detection module includes an emotion weight database and a keyword extraction submodule. The emotion weight database establishes and stores the emotion weights of keywords. The keyword extraction submodule passes the data of the speech-feature analysis submodule through an acoustic model, a language model, a pronunciation dictionary and a decoder, matches the results against the emotion weight database to identify emotion-class keywords and topic-class keywords, and counts the frequency with which emotion-class and topic-class keywords occur in the speech. The keyword extraction submodule also assigns a weight to the emotional tendency of each emotion-class keyword according to the emotion weight database, and scores each emotion of the speech by combining the weights with the keyword frequencies. The emotion model set module trains a hidden Markov model, a Gaussian mixture model, a support vector machine model, an artificial neural network, a convolutional neural network model and a long short-term memory network model, combines the models, and scores the speech emotional tendency of the affective features of the speech processing module with each model.
Further, the emotion classification module judges the emotion category of the voice obtained by the speech-feature analysis submodule from the model outputs provided by the voice keyword detection module and the emotion model set module, using a voting method, a scoring method and a combination method.

The voting method obtains the emotion classification result of each model in the keyword extraction submodule and the emotion model set module, counts the number of models that judge the current speech to belong to each emotion class, and takes the emotion class with the most votes as the recognition result.

The scoring method obtains the score values from the keyword extraction submodule and the emotion model set module, assembles the score values into a new feature vector, feeds it into trained decision-tree, SVM and neural-network classification models, and outputs the emotion recognition result.

The combination method combines the score values obtained from the keyword extraction submodule and the emotion model set module with the speech-feature data of the speech-feature analysis submodule into a new speech feature vector, which is trained and classified with decision-tree, SVM and neural-network classification models to obtain the emotion recognition result.
Further, the database module uses WebSocket to establish a communication channel between the data storage end and the Web end, providing an instant data service for the data transmission between the voice keyword detection module, the emotion classification module and the business application module.
Further, the assessment data submodule obtains the data of the emotion classification module and analyzes the emotions of all calls handled by each agent over a given period; the assessment data submodule can also automatically generate statistical tables and charts from the call-emotion statistics.
Further, the enterprise problem analysis submodule obtains the topic-class keywords from the voice keyword detection module and analyzes them to collect and count the key problems reported by customers.
Further, the business application module obtains the corresponding data through a search computing engine, which provides data access support through multi-source heterogeneous data access components, metadata management and access modules.
The beneficial effects of the present invention are:
1. The voice keyword detection module extracts keywords from the data of the speech processing module. Compared with continuous speech recognition, keyword spotting does not need to recognize the entire voice stream; only a table of the keywords of interest needs to be constructed, which gives better flexibility, and the lower demands on grammar and background noise make it better suited to complex call scenarios. The voice keyword detection module also identifies emotion-class and topic-class keywords, so that subsequent processing can analyze the customer's emotional tendency and the problems to be solved. The emotion model set module dynamically captures and tracks the customer's emotional state through multiple trained models. The emotion classification module integrates the voice keyword detection module and the emotion model set module and trains multiple machine-learning and deep-learning models, so that the strengths of each model are exploited, improving the precision and robustness of recognition and guaranteeing the accuracy of the emotion identified in the speech. The business application module obtains the data of the voice keyword detection module, the emotion model set module and the emotion classification module and accesses the product sales data platform, associating the customer's telephone number with the products sold, so that the agent knows which products the customer has bought, grasps the customer's situation, and can conduct the call more smoothly. At the same time it provides real-time response schemes, agent performance assessment and enterprise problem analysis, showing the agent the customer's emotion in real time together with accurate, standardized response prompts, so that the agent can handle the customer's problem correctly.
2. The speech-feature analysis submodule extracts, from the prosodic, voice-quality and spectral features in the time or frequency domain, short-time energy, average amplitude, short-time average zero-crossing rate, fundamental frequency, formants, Mel spectral features, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients and the spectrogram, and computes the associated statistics. This yields rich, comprehensive speech features, avoids the limitation of expressing emotional information with a single kind or dimension of feature, and provides the necessary means for emotion recognition.
3. The keyword extraction submodule 22 supports continual training, so new keywords can be learned through supplementary training and the system has good extensibility. Because of the rapid development of the Internet, a large number of Internet slang terms appear in customers' speech; in the present invention the system is pre-trained with sample data and the keywords are updated by continual training iterations, so that the system can recognize a wide range of unconventional keywords.
4. The assessment data submodule analyzes the emotions of all calls handled by each agent over a given period and assesses the agent according to the proportion of each emotion type, avoiding the errors of traditional assessment and effectively improving agent quality. The enterprise problem analysis submodule counts, from the voice keyword detection module, the frequency with which problems arise for each product, so that administrators know the state of the enterprise's products and can find problems promptly and optimize the products in a targeted way.
Brief description of the drawings
Fig. 1 is a structural block diagram of a speech emotion recognition and application system for call-center calls according to a preferred embodiment of the invention.
In the figure: 1 - speech processing module; 11 - voice extraction submodule; 12 - speech-feature analysis submodule; 2 - voice keyword detection module; 21 - emotion weight database; 22 - keyword extraction submodule; 3 - emotion model set module; 4 - emotion classification module; 5 - business application module; 51 - customer information display submodule; 52 - emotion display submodule; 53 - response prompting submodule; 54 - assessment data submodule; 55 - enterprise problem analysis submodule.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention; all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the invention.

Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the technical field of the invention. The terms used in this specification are intended only to describe specific embodiments, not to limit the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Referring to Fig. 1, a speech emotion recognition and application system for call-center calls according to a preferred embodiment of the invention includes a speech processing module 1, a voice keyword detection module 2, an emotion model set module 3, an emotion classification module 4, a business application module 5 and a database module 6.
The speech processing module 1 includes a voice extraction submodule 11 and a speech-feature analysis submodule 12. The voice extraction submodule 11 acquires the voice in the call to be analyzed. The speech-feature analysis submodule 12 receives the voice data of the voice extraction submodule 11 and processes it by pre-emphasis, windowed framing and endpoint detection, so as to obtain its prosodic, voice-quality and spectral features.
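The pre-processing steps named above can be sketched as follows. This is a minimal numpy illustration, not the patent's implementation: the sample rate, frame sizes, pre-emphasis coefficient and the energy threshold of the endpoint detector are all assumed values chosen for the demo.

```python
import numpy as np

def preprocess(signal, sr=8000, frame_ms=25, hop_ms=10, alpha=0.97):
    """Pre-emphasis, windowed framing and a simple energy-based
    endpoint detector, in the order described for submodule 12."""
    # Pre-emphasis: y[n] = x[n] - alpha * x[n-1] boosts high frequencies
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Split into overlapping frames and apply a Hamming window
    flen, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    n_frames = 1 + (len(emphasized) - flen) // hop
    frames = np.stack([emphasized[i * hop:i * hop + flen] for i in range(n_frames)])
    frames = frames * np.hamming(flen)
    # Endpoint detection: keep frames whose short-time energy exceeds
    # a fraction of the maximum frame energy (a crude voiced/silence cut)
    energy = (frames ** 2).sum(axis=1)
    return frames[energy > 0.05 * energy.max()]

# Tiny demo: one second of near-silence with a 0.2 s tone "utterance"
sr = 8000
t = np.arange(sr) / sr
sig = np.where((t > 0.4) & (t < 0.6), np.sin(2 * np.pi * 440 * t),
               0.001 * np.random.randn(sr))
frames = preprocess(sig, sr)
```

Only the frames overlapping the tone survive the endpoint cut, so the downstream feature extraction operates on voiced speech only.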
In this embodiment, the speech-feature analysis submodule 12 extracts, from the prosodic, voice-quality and spectral features in the time or frequency domain, short-time energy, average amplitude, short-time average zero-crossing rate, fundamental frequency, formants, Mel spectral features, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients and the spectrogram, and computes for each feature statistics including the maximum, minimum, range, frame index of the maximum, frame index of the minimum, mean, linear-approximation slope, linear-approximation offset, linear-approximation quadratic deviation, standard deviation, skewness, kurtosis, and the first-order and second-order differences. Extracting these statistics of the call-voice features in the speech-feature analysis submodule 12 yields rich, comprehensive speech features, avoids the limitation of a single kind or dimension of feature in expressing emotional information, and facilitates the subsequent speech-recognition work.
The voice keyword detection module 2 receives the voice data of the speech-feature analysis submodule 12 and identifies emotion-class keywords and topic-class keywords against an established keyword database, thereby obtaining the emotion data and the reported-problem data of the voice acquired by the voice extraction submodule 11.
In this embodiment, the voice keyword detection module 2 includes an emotion weight database 21 and a keyword extraction submodule 22.

The emotion weight database 21 establishes and stores the emotion weights of keywords.
The keyword extraction submodule 22 passes the data of the speech-feature analysis submodule 12 through an acoustic model, a language model, a pronunciation dictionary and a decoder, matches the results against the emotion weight database 21 to identify emotion-class keywords and topic-class keywords, and counts the frequency with which they occur in the speech. The keyword extraction submodule 22 also assigns a weight to the emotional tendency of each emotion-class keyword according to the emotion weight database 21, and scores each emotion of the speech by combining the weights with the keyword frequencies.
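The weight-times-frequency scoring can be sketched in a few lines. The keyword entries and weights below are purely illustrative stand-ins for the contents of the emotion weight database 21:

```python
from collections import Counter

# Hypothetical emotion weight entries: keyword -> (emotion class, weight).
# A real system would load these from the emotion weight database 21.
EMOTION_WEIGHTS = {
    "terrible": ("angry", 0.9), "refund": ("angry", 0.6),
    "thanks": ("satisfied", 0.8), "great": ("satisfied", 0.7),
    "okay": ("neutral", 0.5),
}

def score_emotions(detected_keywords):
    """Score each emotion as sum(weight * frequency) over the
    emotion-class keywords spotted in the call."""
    freq = Counter(detected_keywords)
    scores = Counter()
    for word, count in freq.items():
        if word in EMOTION_WEIGHTS:
            emotion, weight = EMOTION_WEIGHTS[word]
            scores[emotion] += weight * count
    return dict(scores)

scores = score_emotions(["terrible", "refund", "terrible", "thanks"])
```

Each emotion's score is then one of the inputs the emotion classification module 4 later aggregates.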
The keyword extraction submodule 22 detects whether designated keywords occur in the speech. Compared with continuous speech recognition, keyword spotting does not need to recognize the entire voice stream; only a table of the keywords of interest needs to be constructed, which gives better flexibility, while the lower demands on grammar and background noise make it better suited to complex call scenarios.
The acoustic model may use a hidden Markov model (HMM) to model each acoustic unit; each model consists of a sequence of states and the transitions between them. During acoustic-model training, the observation sentences given in the corpus are correctly labeled and the parameters are iteratively optimized, so that a correct label maximizes the posterior probability of generating the corresponding pronunciation.
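The state-and-transition structure of such an HMM can be made concrete with the standard forward algorithm, which evaluates how likely a model is to have generated an observation sequence. The two-state model below uses made-up numbers purely for illustration:

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward algorithm for a discrete HMM: probability that the model
    (initial dist. pi, transition matrix A, emission matrix B)
    generated the observation sequence obs."""
    alpha = pi * B[:, obs[0]]            # initialize with first symbol
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # propagate along transitions, emit
    return alpha.sum()

# Toy 2-state, 2-symbol acoustic unit (illustrative parameters only)
pi = np.array([1.0, 0.0])
A = np.array([[0.7, 0.3],
              [0.0, 1.0]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
p = forward(pi, A, B, [0, 1, 1])
```

Training adjusts `A` and `B` iteratively so that the correct transcription yields the highest such probability for its audio.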
The language model is used to turn the acoustic-model output into recognized text. To cope with keyword detection over a large vocabulary, a statistical language model is used, i.e., one that describes the relationships between words from the angle of probability, based on statistics over a corpus; a large text corpus is used as the training set during model training to improve model accuracy.
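The simplest statistical language model of this kind is a bigram model estimated by counting. A minimal sketch, with a two-sentence toy "corpus" standing in for the large training corpus mentioned above:

```python
from collections import Counter

def train_bigram(corpus_sentences):
    """Estimate bigram probabilities P(w2 | w1) by counting over a
    corpus -- the simplest statistical language model."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus_sentences:
        words = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(words[:-1])                 # history words
        bigrams.update(zip(words[:-1], words[1:]))  # adjacent pairs
    return lambda w1, w2: (bigrams[(w1, w2)] / unigrams[w1]
                           if unigrams[w1] else 0.0)

p = train_bigram(["i want a refund", "i want help"])
prob_want = p("i", "want")   # both sentences continue "i" with "want"
prob_a = p("want", "a")      # only one of the two does so with "a"
```

A production model would add smoothing and a larger n-gram order, but the probability-from-counts principle is the same.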
The pronunciation dictionary connects the acoustic model and the language model; it contains the mapping from words to phonemes. When building the pronunciation dictionary, the emotion-class and topic-class keywords of interest are covered as fully as possible while unneeded words are discarded, so as to improve retrieval efficiency and recognition performance.
Obtaining the keyword result in the keyword extraction submodule 22 also requires a decoder. The decoder decodes with the Viterbi algorithm: it first decodes the HMM states to obtain the optimal state sequence, and then decodes the spotted keywords to obtain the final recognition result.
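The first of those two steps, recovering the optimal state sequence, is the textbook Viterbi recursion. A compact numpy sketch over a toy two-state model (parameters invented for the demo):

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Viterbi decoding: most likely HMM state sequence for obs."""
    delta = np.log(pi) + np.log(B[:, obs[0]])
    back = []
    for o in obs[1:]:
        trans = delta[:, None] + np.log(A)   # score of every transition
        back.append(trans.argmax(axis=0))    # best predecessor per state
        delta = trans.max(axis=0) + np.log(B[:, o])
    # Trace back from the best final state
    state = int(delta.argmax())
    path = [state]
    for bp in reversed(back):
        state = int(bp[state])
        path.append(state)
    return path[::-1]

A = np.array([[0.8, 0.2], [0.3, 0.7]])       # state transitions
B = np.array([[0.9, 0.1], [0.2, 0.8]])       # symbol emissions
path = viterbi(np.array([0.5, 0.5]), A, B, [0, 0, 1, 1])
```

Since state 0 prefers symbol 0 and state 1 prefers symbol 1, the decoded path switches states exactly where the observations do; keyword spotting then reads the keyword off the decoded state sequence.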
In this embodiment, the keyword extraction submodule 22 supports continual training, so supplementary training can be performed for new keywords and the system has good extensibility. Because of the rapid development of the Internet, a large number of Internet slang terms appear in customers' speech; in the present invention the system is pre-trained with sample data and the keywords are updated by continual training iterations, enabling the system to recognize a wide range of unconventional keywords.
The emotion model set module 3 stores a plurality of different emotion classification models; it receives the data of the speech processing module 1 and dynamically captures and tracks the emotional state of the caller. In this embodiment, the emotion model set module 3 trains a hidden Markov model, a Gaussian mixture model, a support vector machine model, an artificial neural network, a convolutional neural network model and a long short-term memory network model, combines the models, and scores the speech emotional tendency of the affective features of the speech processing module 1 with each model.
Since every model has its own benefits and limitations, training multiple different models plays to their respective strengths and improves the accuracy and robustness of the system as a whole. Moreover, the statistical emotional features of speech are largely speaker-independent and therefore adapt well; classifying with these features by training a Gaussian mixture model or a support vector machine model can further improve the robustness of the system.
The speech-feature analysis submodule 12 extracts the various features of the call voice, which can then be computed and selected according to the model requirements of the emotion model set module 3. When training the convolutional neural network model, features are extracted from the spectrogram, which is rich in emotional information, and the result is input to the long short-term memory network model, so that the emotional state of the caller can be dynamically captured and tracked.
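The data flow of that CNN-to-LSTM pipeline, convolve over the spectrogram, then run a recurrence over the resulting time sequence, can be sketched without a deep-learning framework. The "recurrent unit" below is a single-unit vanilla recurrence standing in for the real LSTM, and all shapes and weights are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_features(spectrogram, kernel):
    """Valid 2-D convolution over a (freq, time) spectrogram, then
    mean-pool over frequency to leave one feature per time step."""
    kf, kt = kernel.shape
    F, T = spectrogram.shape
    out = np.array([[(spectrogram[i:i + kf, j:j + kt] * kernel).sum()
                     for j in range(T - kt + 1)]
                    for i in range(F - kf + 1)])
    return out.mean(axis=0)              # shape: (T - kt + 1,)

def recurrent_score(xs, w=0.5, u=0.3):
    """Single-unit recurrence over the time sequence -- a toy
    stand-in for the LSTM that tracks state across frames."""
    h = 0.0
    for x in xs:
        h = np.tanh(w * x + u * h)       # state carries across time steps
    return h

spec = rng.random((40, 100))             # fake 40-bin, 100-frame spectrogram
feats = conv_features(spec, kernel=np.ones((3, 3)) / 9)
emotion_score = recurrent_score(feats)
```

The point is the shape discipline: the convolution turns a 2-D spectrogram into a 1-D time sequence, and the recurrence carries emotional state across frames, which is what lets the module track the caller's emotion dynamically.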
The emotion classification module 4 judges the emotion category of the voice in the call under test from the data of the voice keyword detection module 2 and the emotion model set module 3, using classification models; in this embodiment the classification model may be any of a decision tree, an SVM, a neural network, etc.
In this embodiment, the emotion classification module 4 judges the emotion category of the voice obtained by the speech processing module 1 from the model outputs provided by the voice keyword detection module 2 and the emotion model set module 3, using a voting method, a scoring method and a combination method.
The voting method obtains the emotion classification result of each model in the keyword extraction submodule 22 and the emotion model set module 3, counts the number of models that judge the current speech to belong to each emotion class, and takes the class with the most votes as the recognition result.
The scoring method obtains the score values from the keyword extraction submodule 22 and the emotion model set module 3, assembles the score values into a new feature vector, feeds it into trained decision-tree, SVM and neural-network classification models, and outputs the emotion recognition result.
The combination method combines the score values obtained from the keyword extraction submodule 22 and the emotion model set module 3 with the speech-feature data of the speech-feature analysis submodule 12 into a new speech feature vector, which is trained and classified with decision-tree, SVM and neural-network classification models to obtain the emotion recognition result.
The emotion classification module 4 makes comprehensive use of the advantages of each model and avoids the limitations of a single model. Its output is the specific emotion class to which a given voice belongs, generally divided into three or more classes such as angry, neutral and satisfied.
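The voting and combination steps are simple to sketch; the example labels and score values below are illustrative, not taken from the patent, and the downstream decision-tree/SVM/neural-network classifier is omitted:

```python
from collections import Counter
import numpy as np

def vote(model_labels):
    """Voting method: the emotion class predicted by the most models wins."""
    return Counter(model_labels).most_common(1)[0][0]

def combine_features(model_scores, speech_features):
    """Combination method: concatenate the per-model emotion scores with
    the raw speech-feature vector to form the new feature vector that a
    decision tree / SVM / neural network is then trained on."""
    return np.concatenate([np.asarray(model_scores, dtype=float),
                           np.asarray(speech_features, dtype=float)])

labels = ["angry", "neutral", "angry", "angry", "satisfied"]  # 5 models
result = vote(labels)
new_vec = combine_features([0.7, 0.1, 0.2], [120.5, 0.031])
```

The scoring method is the same concatenation with `speech_features` left out, so all three strategies share this feed-the-classifier structure.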
The business application module 5 includes a customer information display submodule 51, an emotion display submodule 52, a response prompting submodule 53, an assessment data submodule 54 and an enterprise problem analysis submodule 55.
The customer information display submodule 51 is in communication with the product sales data platform and displays, according to the customer's telephone number, the customer's purchase information held on that platform. By associating the customer's telephone number with the products sold, the customer information display submodule lets the agent know which products the customer has bought, so that the agent grasps the customer's situation and the call proceeds more smoothly.
The emotion display submodule 52 receives the data of the keyword extraction submodule 22 and the emotion classification module 4 and displays in real time the customer's emotional tendency and emotion keywords in the current call. Through the emotion display submodule 52 the agent can directly observe the customer's emotion information and grasp the customer's real-time emotion, which facilitates communication between agent and customer and increases its effectiveness.
The response prompting submodule 53 includes a response scheme database 531 and a response prompting frame 532. The response scheme database 531 stores product information, handling schemes for different emotion types, response phrasing and issue-handling procedures. The response prompting frame 532 uses machine-learning algorithms to combine the data of the keyword extraction submodule 22, the emotion classification module 4 and the response scheme database 531 to automatically generate and display a suggested response. The response schemes in the response scheme database 531 can be established by machine-learning and deep-learning models, so that the database can be retrained with sample data and updated as products change.
The assessment data submodule 54 assesses service quality from the data of the emotion classification module 4. In this embodiment, the assessment data submodule 54 obtains the data of the emotion classification module 4 and analyzes the emotions of all calls handled by each agent over a given period; it can also automatically generate statistical tables and charts from the call-emotion statistics. The assessment data submodule 54 assesses agents according to the proportion of each emotion type, avoiding the errors of traditional assessment and effectively improving agent quality. With the statistical tables and charts, administrators can intuitively see each agent's service quality and formulate a suitable management system.
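The per-agent emotion-proportion statistic described above reduces to a grouped count. A minimal sketch with invented agent names and call records:

```python
from collections import Counter

def assess_agents(call_records):
    """Per-agent proportion of each call-emotion class over a period --
    the statistic the assessment data submodule reports and charts."""
    by_agent = {}
    for agent, emotion in call_records:
        by_agent.setdefault(agent, Counter())[emotion] += 1
    return {
        agent: {emo: n / sum(counts.values()) for emo, n in counts.items()}
        for agent, counts in by_agent.items()
    }

# Illustrative (agent, call emotion) records for one period
records = [("alice", "satisfied"), ("alice", "angry"),
           ("alice", "satisfied"), ("alice", "satisfied"),
           ("bob", "angry"), ("bob", "neutral")]
report = assess_agents(records)
```

The resulting proportions feed directly into the statistical tables and charts mentioned above; ranking agents by, say, the "angry" share replaces manual customer scoring.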
The enterprise case analysis submodule 55 is used to analyze product conditions according to the data of the voice keyword detection module 2. In this embodiment, the enterprise case analysis submodule 55 obtains the theme class keywords from the keyword extraction submodule 22 and, by analyzing and counting them, collects the critical issues reflected by customers. The enterprise case analysis submodule 55 can count, from the keyword extraction submodule 22, the frequency with which problems arise for each product, so that managers know the state of the enterprise's products and can find and fix problems in a timely and targeted way.
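A minimal sketch of counting theme-class keywords to surface the most frequently reported issues (keyword values are illustrative):

```python
from collections import Counter

def top_issues(theme_keywords, n=3):
    """theme_keywords: list of theme-class keywords extracted from calls.
    Returns the n most frequently reported issues with their counts."""
    return Counter(theme_keywords).most_common(n)

issues = ["refund", "delivery delay", "refund", "broken screen",
          "refund", "delivery delay"]
print(top_issues(issues, n=2))  # [('refund', 3), ('delivery delay', 2)]
```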
The database module 6 is used for the storage and transmission of the data of the voice keyword detection module 2, the emotion model collection module 3, the mood categorization module 4 and the business application module 5.
In this embodiment, the database module 6 establishes a connection channel between the data storage end and the Web end with WebSocket, providing an instant data service for data transmission between the voice keyword detection module 2, the mood categorization module 4 and the business application module 5. The response scheme database 531 can be stored in the database module 6; data can be stored by category in a distributed storage mode to provide fast query responses, and the stored data subjects may include call emotion recognition results, customer call records, keyword themes, standard response schemes, etc.
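One instant-data update pushed over the WebSocket channel might look like the following; the message schema is an assumption for illustration, not defined by the patent:

```python
import json

def build_push_message(customer_id, mood, keywords, response_scheme):
    """Assemble one real-time update for the Web end as a JSON string.
    Field names are hypothetical."""
    return json.dumps({
        "customer_id": customer_id,
        "mood": mood,                       # output of the mood categorization module
        "keywords": keywords,               # emotion/theme class keywords from the call
        "response_scheme": response_scheme  # matched standard response scheme
    })

msg = build_push_message("13800000000", "angry", ["refund"],
                         "Apologize and explain the refund process")
print(json.loads(msg)["mood"])  # angry
```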
In this embodiment, the business application module 5 obtains the corresponding data information through a search computing engine, which provides data access support through multi-source heterogeneous data access components, metadata management and access modules. The search computing engine can query, classify, aggregate, describe and visualize data and support operational decision-making, so that the data required by the customer information display submodule 51, the mood display submodule 52 and the response prompting submodule 53 can be efficiently retrieved from the database module 6.
The business application module 5 can be established at the Web end; by accessing the search computing engine from the Web end, different data display requirements are met. The examination data submodule 54 and the enterprise case analysis submodule 55 can render the data in the database module 6 as tables, curves, distribution maps, pie charts and other graphical forms, making the data more intuitive and more useful for decision-making.
When the speech processing module 1 receives the customer's voice, the phonetic feature analysis submodule 12 processes the voice content obtained by the voice extracting submodule 11 and obtains the prosody, voice quality and spectral features of the voice.
The keyword extraction submodule 22 detects whether this segment of voice contains keywords specified in the emotion weight database 21, identifies the acquired keywords as emotion class keywords and theme class keywords, assigns weights to the emotion class keywords according to their emotional tendency, and scores each mood of the voice by analyzing the weights together with the frequency of the emotion class keywords.
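A toy version of this weight-times-frequency mood scoring, with a hypothetical weight table standing in for the emotion weight database 21:

```python
def score_moods(keyword_freqs, emotion_weights):
    """keyword_freqs: dict keyword -> occurrence count in the call.
    emotion_weights: dict keyword -> (mood label, weight), a stand-in
    for the emotion weight database 21 (contents are illustrative)."""
    scores = {}
    for kw, freq in keyword_freqs.items():
        if kw in emotion_weights:
            mood, weight = emotion_weights[kw]
            # Each keyword contributes weight * frequency to its mood's score.
            scores[mood] = scores.get(mood, 0.0) + weight * freq
    return scores

weights = {"terrible": ("angry", 2.0), "thanks": ("happy", 1.0)}
print(score_moods({"terrible": 2, "thanks": 1}, weights))  # {'angry': 4.0, 'happy': 1.0}
```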
The emotion model collection module 3 stores multiple different sentiment classification model sets, receives the data information of the speech processing module 1 so as to dynamically capture and track the caller's affective state, and combines the models so that each model scores the speech emotional tendency of the affective features from the speech processing module 1.
The mood categorization module 4 judges the mood category of the voice obtained by the speech processing module 1 from the different model data provided by the keyword extraction submodule 22 and the emotion model collection module 3, using a voting method, a scoring method and a combination method.
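The voting method can be illustrated as a simple majority vote over the per-model predictions (a sketch; tie-breaking is not specified in the patent, so here the first-seen label wins):

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: one mood label per model (keyword submodule plus each
    sentiment model).  Returns the label judged by the most models;
    Counter.most_common breaks ties by first insertion order."""
    return Counter(predictions).most_common(1)[0][0]

print(majority_vote(["angry", "neutral", "angry", "happy"]))  # angry
```

The scoring and combination methods would instead feed the per-model scores (possibly concatenated with the acoustic features) into a trained classifier such as a decision tree, SVM or neural network.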
The data of the speech processing module 1, the voice keyword detection module 2, the emotion model collection module 3 and the mood categorization module 4 are stored in the database module 6. During a call, the business application module 5 identifies the customer by phone number and feeds the customer's mood, call keywords and other information to the search computing engine in real time; the search computing engine automatically matches the optimal response scheme from the data storage system and pushes the relevant information to the Web end for display, so that the customer service representative can obtain the relevant business information from the customer information display submodule 51, the mood display submodule 52 and the response prompting submodule 53 and communicate with the customer effectively.
When managers use the examination data submodule 54 and the enterprise case analysis submodule 55 for applications such as performance appraisal and problem analysis, the search computing engine queries the relevant data from the data storage system according to the query demand, performs statistical analysis and calculation, and pushes the results to the Web end for display, so that customer service representatives and product quality can be managed.
Claims (8)
1. A voice mood identification and application system for call center calls, characterized by comprising a speech processing module (1), a voice keyword detection module (2), an emotion model collection module (3), a mood categorization module (4), a business application module (5) and a database module (6);
the speech processing module (1) comprises a voice extracting submodule (11) and a phonetic feature analysis submodule (12), the voice extracting submodule (11) being used for acquiring the voice in a call to be detected; the phonetic feature analysis submodule (12) being used for receiving the voice data of the voice extracting submodule (11) and processing the voice of the voice extracting submodule (11) by pre-emphasis, windowed framing and endpoint detection, so as to obtain the prosody, voice quality and spectral features of the voice extracting submodule (11);
the voice keyword detection module (2) is used for receiving the voice data of the phonetic feature analysis submodule (12) and identifying emotion class keywords and theme class keywords by establishing a keyword database, so as to obtain the customer's mood data information and reported problem data information from the voice extracting submodule (11);
the emotion model collection module (3) is used for storing multiple different sentiment classification model sets and receiving the data information of the phonetic feature analysis submodule (12), so as to dynamically capture and track the affective state of the caller;
the mood categorization module (4) is used for obtaining the data information of the voice keyword detection module (2) and the emotion model collection module (3) and judging, using classification models, the mood category of the voice in the call to be detected;
the business application module (5) comprises a customer information display submodule (51), a mood display submodule (52), a response prompting submodule (53), an examination data submodule (54) and an enterprise case analysis submodule (55); the customer information display submodule (51) is in communication connection with a product sales data platform and is used for displaying the customer's purchase information from the product sales data platform according to the customer's phone number; the mood display submodule (52) is used for receiving the data information of the voice keyword detection module (2) and the mood categorization module (4) and displaying the customer's mood trend in the current call in real time; the response prompting submodule (53) comprises a response scheme database (531) and a response prompting frame (532), the response scheme database (531) being used for storing the data information of product-related information, processing schemes for different emotion types, response phrases and issue handling processes; the response prompting frame (532) being used for automatically generating and displaying a response prompt scheme by combining, through a machine learning algorithm, the data of the voice keyword detection module (2), the mood categorization module (4) and the response scheme database (531); the examination data submodule (54) is used for assessing customer service quality according to the data of the mood categorization module (4); the enterprise case analysis submodule (55) is used for analyzing product conditions according to the data of the voice keyword detection module (2);
the database module (6) is used for the storage and transmission of the data of the voice keyword detection module (2), the emotion model collection module (3), the mood categorization module (4) and the business application module (5).
2. The voice mood identification and application system for call center calls according to claim 1, characterized in that: the phonetic feature analysis submodule (12) obtains, from the prosody, voice quality and spectral features in the time or frequency domain, different features including short-time energy, average amplitude, short-time average zero-crossing rate, fundamental frequency, formants, Mel spectrum features, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients and the spectrogram, and analyzes and calculates statistics of these features including the maximum value, minimum value, range, the frames where the maximum and minimum values occur, mean, linear approximation slope, linear approximation offset, quadratic deviation of the linear approximation, standard deviation, skewness, kurtosis, and the first-order and second-order differences.
3. The voice mood identification and application system for call center calls according to claim 1, characterized in that: the voice keyword detection module (2) comprises an emotion weight database (21) and a keyword extraction submodule (22), the emotion weight database (21) being used for establishing and storing an emotion weight database of keywords; the keyword extraction submodule (22) is used for matching the data of the phonetic feature analysis submodule (12), through an acoustic model, a language model, a pronunciation dictionary and a decoder, against the data of the emotion weight database (21) so as to identify emotion class keywords and theme class keywords, and can count the frequency with which the emotion class keywords and theme class keywords occur in the voice; the keyword extraction submodule (22) can also assign weights to the emotional tendencies of the emotion class keywords according to the data of the emotion weight database (21), and score each mood of the voice by analyzing the weights together with the frequency of the emotion class keywords; the emotion model collection module (3) trains hidden Markov models, Gaussian mixture models, support vector machine models, artificial neural networks, convolutional neural network models and long short-term memory network models, and combines the models so that each model scores the speech emotional tendency of the affective features from the speech processing module (1).
4. The voice mood identification and application system for call center calls according to claim 3, characterized in that: the mood categorization module (4) judges the mood category of the voice obtained by the phonetic feature analysis submodule (12) from the different model data provided by the voice keyword detection module (2) and the emotion model collection module (3), using a voting method, a scoring method and a combination method;
the voting method obtains the mood classification results of each model in the keyword extraction submodule (22) and the emotion model collection module (3), counts the number of models that judge the current voice as each mood class, and takes the mood class with the most votes as the recognition result;
the scoring method obtains the score values from the keyword extraction submodule (22) and the emotion model collection module (3), forms the score values into new features that are input into trained decision tree, SVM and neural network classification models for calculation, and outputs the emotion recognition result;
the combination method combines the score values acquired from the keyword extraction submodule (22) and the emotion model collection module (3) with the voice feature data of the phonetic feature analysis submodule (12) into new voice features, which are trained and classified by decision tree, SVM and neural network classification models to obtain the emotion recognition result.
5. The voice mood identification and application system for call center calls according to claim 1, characterized in that: the database module (6) establishes a connection channel between the data storage end and the Web end with WebSocket, providing an instant data service for data transmission between the voice keyword detection module (2), the mood categorization module (4) and the business application module (5).
6. The voice mood identification and application system for call center calls according to claim 1, characterized in that: the examination data submodule (54) is used for obtaining the data of the mood categorization module (4) and analyzing the distribution of call moods of each customer service representative over a given period; the examination data submodule (54) can also automatically generate statistical tables and charts according to the call mood data.
7. The voice mood identification and application system for call center calls according to claim 1, characterized in that: the enterprise case analysis submodule (55) is used for obtaining the theme class keywords in the voice keyword detection module (2) and, by analyzing and calculating the theme class keywords in the voice keyword detection module (2), collecting and counting the critical issues reflected by customers.
8. The voice mood identification and application system for call center calls according to claim 1, characterized in that: the business application module (5) obtains the corresponding data information through a search computing engine, and the search computing engine provides data access support through multi-source heterogeneous data access components, metadata management and access modules.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910217722.5A CN109767791B (en) | 2019-03-21 | 2019-03-21 | Voice emotion recognition and application system for call center calls |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109767791A true CN109767791A (en) | 2019-05-17 |
CN109767791B CN109767791B (en) | 2021-03-30 |
Family
ID=66459613
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910217722.5A Active CN109767791B (en) | 2019-03-21 | 2019-03-21 | Voice emotion recognition and application system for call center calls |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109767791B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101930735A (en) * | 2009-06-23 | 2010-12-29 | 富士通株式会社 | Speech emotion recognition equipment and speech emotion recognition method |
US20160117684A1 (en) * | 2005-06-24 | 2016-04-28 | Iii Holdings 1, Llc | Evaluation of voice communications |
CN105657201A (en) * | 2016-01-26 | 2016-06-08 | 北京京东尚科信息技术有限公司 | Method and system for processing call based on decision tree model |
CN105893344A (en) * | 2016-03-28 | 2016-08-24 | 北京京东尚科信息技术有限公司 | User semantic sentiment analysis-based response method and device |
CN107368948A (en) * | 2017-06-21 | 2017-11-21 | 厦门快商通科技股份有限公司 | A kind of simulation test checking system for customer service post |
CN108334583A (en) * | 2018-01-26 | 2018-07-27 | 上海智臻智能网络科技股份有限公司 | Affective interaction method and device, computer readable storage medium, computer equipment |
CN109147768A (en) * | 2018-09-13 | 2019-01-04 | 云南电网有限责任公司 | A kind of audio recognition method and system based on deep learning |
Non-Patent Citations (2)
Title |
---|
罗键: "《现代集成质量系统原理与应用》", 30 September 2009, 厦门大学出版社 * |
谢竹: "面向中文短文本的情感倾向性分析", 《信息科技辑》 * |
Cited By (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110097894A (en) * | 2019-05-21 | 2019-08-06 | 焦点科技股份有限公司 | A kind of method and system of speech emotion recognition end to end |
CN110097894B (en) * | 2019-05-21 | 2021-06-11 | 焦点科技股份有限公司 | End-to-end speech emotion recognition method and system |
CN110298682A (en) * | 2019-05-22 | 2019-10-01 | 深圳壹账通智能科技有限公司 | Intelligent Decision-making Method, device, equipment and medium based on user information analysis |
CN110166640A (en) * | 2019-05-31 | 2019-08-23 | 重庆先特服务外包产业有限公司 | Call center's Common Information Model establishes system and method |
WO2020244073A1 (en) * | 2019-06-06 | 2020-12-10 | 平安科技(深圳)有限公司 | Speech-based user classification method and device, computer apparatus, and storage medium |
CN110444229A (en) * | 2019-06-17 | 2019-11-12 | 深圳壹账通智能科技有限公司 | Communication service method, device, computer equipment and storage medium based on speech recognition |
CN112102850B (en) * | 2019-06-18 | 2023-06-20 | 杭州海康威视数字技术股份有限公司 | Emotion recognition processing method and device, medium and electronic equipment |
CN112102850A (en) * | 2019-06-18 | 2020-12-18 | 杭州海康威视数字技术股份有限公司 | Processing method, device and medium for emotion recognition and electronic equipment |
CN110335596A (en) * | 2019-06-19 | 2019-10-15 | 深圳壹账通智能科技有限公司 | Products Show method, apparatus, equipment and storage medium based on speech recognition |
WO2020253362A1 (en) * | 2019-06-20 | 2020-12-24 | 深圳壹账通智能科技有限公司 | Service processing method, apparatus and device based on emotion analysis, and storage medium |
CN110472224A (en) * | 2019-06-24 | 2019-11-19 | 深圳追一科技有限公司 | Detection method, device, computer equipment and the storage medium of service quality |
CN110472224B (en) * | 2019-06-24 | 2023-07-07 | 深圳追一科技有限公司 | Quality of service detection method, apparatus, computer device and storage medium |
CN110475032A (en) * | 2019-06-28 | 2019-11-19 | 平安科技(深圳)有限公司 | Multi-service interface switching method, device, computer installation and storage medium |
CN110364185A (en) * | 2019-07-05 | 2019-10-22 | 平安科技(深圳)有限公司 | A kind of Emotion identification method, terminal device and medium based on voice data |
CN110364185B (en) * | 2019-07-05 | 2023-09-29 | 平安科技(深圳)有限公司 | Emotion recognition method based on voice data, terminal equipment and medium |
CN112242135A (en) * | 2019-07-18 | 2021-01-19 | 北京声智科技有限公司 | Voice data processing method and intelligent customer service device |
WO2021042548A1 (en) * | 2019-09-04 | 2021-03-11 | 平安科技(深圳)有限公司 | Code-scanning vehicle moving method, electronic apparatus and storage medium |
CN110480656A (en) * | 2019-09-09 | 2019-11-22 | 国家康复辅具研究中心 | One kind is accompanied and attended to robot, accompany and attend to robot control method and device |
CN110480656B (en) * | 2019-09-09 | 2021-09-28 | 国家康复辅具研究中心 | Accompanying robot, accompanying robot control method and accompanying robot control device |
CN110727772A (en) * | 2019-10-09 | 2020-01-24 | 浙江百应科技有限公司 | Method for realizing dynamic interaction of robot through condition judgment |
CN110890096A (en) * | 2019-10-12 | 2020-03-17 | 深圳供电局有限公司 | Intelligent voice system and method based on voice analysis |
CN111128241A (en) * | 2019-12-30 | 2020-05-08 | 上海浩琨信息科技有限公司 | Intelligent quality inspection method and system for voice call |
CN111294468A (en) * | 2020-02-07 | 2020-06-16 | 普强时代(珠海横琴)信息技术有限公司 | Tone quality detection and analysis system for customer service center calling |
CN111488456A (en) * | 2020-04-10 | 2020-08-04 | 北京洛必德科技有限公司 | Merchant user evaluation analysis method and system |
CN111739516A (en) * | 2020-06-19 | 2020-10-02 | 中国—东盟信息港股份有限公司 | Speech recognition system for intelligent customer service call |
CN112329437B (en) * | 2020-10-21 | 2024-05-28 | 交通银行股份有限公司 | Intelligent customer service voice quality inspection scoring method, equipment and storage medium |
CN112329437A (en) * | 2020-10-21 | 2021-02-05 | 交通银行股份有限公司 | Intelligent customer service voice quality inspection scoring method, equipment and storage medium |
CN113035193B (en) * | 2021-03-01 | 2024-04-12 | 上海匠芯知音信息科技有限公司 | Staff management system and application |
CN113035193A (en) * | 2021-03-01 | 2021-06-25 | 上海匠芯知音信息科技有限公司 | Staff management system and application |
CN113011928A (en) * | 2021-04-02 | 2021-06-22 | 深圳市中深伟业科技有限公司 | Customer service early warning analysis system based on deep learning |
CN113239154B (en) * | 2021-06-01 | 2023-08-01 | 平安科技(深圳)有限公司 | Quality inspection method and device for seat conversation operation, computer equipment and storage medium |
CN113555011A (en) * | 2021-07-07 | 2021-10-26 | 广西电网有限责任公司 | Electric power industry customer service center voice translation modeling method, system and medium |
CN113676602A (en) * | 2021-07-23 | 2021-11-19 | 上海原圈网络科技有限公司 | Method and device for processing manual transfer in automatic response |
CN113808577A (en) * | 2021-09-18 | 2021-12-17 | 平安银行股份有限公司 | Intelligent extraction method and device of voice abstract, electronic equipment and storage medium |
CN113641801A (en) * | 2021-10-19 | 2021-11-12 | 成都中航信虹科技股份有限公司 | Control method and system of voice scheduling system and electronic equipment |
CN114022192A (en) * | 2021-10-20 | 2022-02-08 | 百融云创科技股份有限公司 | Data modeling method and system based on intelligent marketing scene |
CN114298019A (en) * | 2021-12-29 | 2022-04-08 | 中国建设银行股份有限公司 | Emotion recognition method, emotion recognition apparatus, emotion recognition device, storage medium, and program product |
CN115910110A (en) * | 2022-11-30 | 2023-04-04 | 杭州网筌科技有限公司 | Government affair service system based on natural language identification |
CN116597821A (en) * | 2023-07-17 | 2023-08-15 | 深圳市国硕宏电子有限公司 | Intelligent customer service voice recognition method and system based on deep learning |
CN117240964A (en) * | 2023-11-16 | 2023-12-15 | 福建博士通信息股份有限公司 | Voice recognition method in call process |
CN117240964B (en) * | 2023-11-16 | 2024-02-27 | 福建博士通信息股份有限公司 | Voice recognition method in call process |
Also Published As
Publication number | Publication date |
---|---|
CN109767791B (en) | 2021-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109767791A (en) | A kind of voice mood identification and application system conversed for call center | |
CN109151218A (en) | Call voice quality detecting method, device, computer equipment and storage medium | |
US8676586B2 (en) | Method and apparatus for interaction or discourse analytics | |
US8825479B2 (en) | System and method for recognizing emotional state from a speech signal | |
CN110717018A (en) | Industrial equipment fault maintenance question-answering system based on knowledge graph | |
CN103811009A (en) | Smart phone customer service system based on speech analysis | |
TWI395201B (en) | Method and system for identifying emotional voices | |
Gupta et al. | Two-stream emotion recognition for call center monitoring | |
CN111489765A (en) | Telephone traffic service quality inspection method based on intelligent voice technology | |
CN107886951B (en) | Voice detection method, device and equipment | |
CN109313892A (en) | Steady language identification method and system | |
CN107767881B (en) | Method and device for acquiring satisfaction degree of voice information | |
US11450306B2 (en) | Systems and methods for generating synthesized speech responses to voice inputs by training a neural network model based on the voice input prosodic metrics and training voice inputs | |
Polzehl et al. | Approaching multi-lingual emotion recognition from speech-on language dependency of acoustic/prosodic features for anger detection | |
CN107871499A (en) | Audio recognition method, system, computer equipment and computer-readable recording medium | |
Yang et al. | Open source magicdata-ramc: A rich annotated mandarin conversational (ramc) speech dataset | |
Drygajlo | Automatic speaker recognition for forensic case assessment and interpretation | |
CN110019741A (en) | Request-answer system answer matching process, device, equipment and readable storage medium storing program for executing | |
CN114818649A (en) | Service consultation processing method and device based on intelligent voice interaction technology | |
WO2014203328A1 (en) | Voice data search system, voice data search method, and computer-readable storage medium | |
Xu | English speech recognition and evaluation of pronunciation quality using deep learning | |
CN109119073A (en) | Audio recognition method, system, speaker and storage medium based on multi-source identification | |
KR102407055B1 (en) | Apparatus and method for measuring dialogue quality index through natural language processing after speech recognition | |
CN115168563A (en) | Airport service guiding method, system and device based on intention recognition | |
CN110099332A (en) | A kind of audio environment methods of exhibiting and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||