CN105261362B

CN105261362B - Call voice monitoring method and system

Info

Publication number: CN105261362B
Application number: CN201510563318.5A
Authority: CN
Inventors: 吴奎; 王影; 王平华; 刘江
Original assignee: iFlytek Co Ltd
Current assignee: Iflytek Changjiang Information Technology Co Ltd
Priority date: 2015-09-07
Filing date: 2015-09-07
Publication date: 2019-07-05
Anticipated expiration: 2035-09-07
Also published as: CN105261362A

Abstract

The invention discloses a call voice monitoring method and system. The method comprises: separately acquiring, in real time, the voice data of the server side and of the client side of a call; performing speech recognition on the server-side and client-side voice data to obtain server-side recognized text and client-side recognized text, respectively; and monitoring the server-side call voice in real time according to the server-side call voice data and its recognized text together with the client-side call voice data and its recognized text. The invention enables comprehensive, real-time monitoring of the call voice of customer service agents and thereby improves the service quality of server-side customer service agents.

Description

Call voice monitoring method and system

Technical field

The present invention relates to the field of voice signal processing, and in particular to a call voice monitoring method and system.

Background art

With the development and maturation of network, communication and computer technology, more and more enterprises are becoming electronic, remote, virtualized and networked. Communication between customers and enterprises has likewise shifted from face-to-face exchange to exchange through remote intelligent devices such as networks and telephones. More and more large enterprises serve their customers by establishing call centers, whose customer service agents answer customers' calls through the server side. Every day, agents provide a large volume of telephone voice service through the server side and handle customers' diverse service needs, including pre-sales consultation, purchasing, after-sales service and complaints. During service, agents have to deal with customers in different moods and respond appropriately to customers' questions, complaints and grievances. The customer service agents of a call center can be said to represent the image of the enterprise. Raising customer satisfaction by improving the service quality of agents has become an important public relations goal for enterprises. Therefore, more and more enterprises have begun to monitor server-side calls.

At present, the main monitoring approach is to record the call voice of the server side and the client side with recording equipment, and then either spot-check the recordings manually or have a machine perform simple analysis on them, such as speech signal analysis, to detect whether the voice signal is abnormal. Manual spot checks easily miss problems and are also costly in labor. Simple machine analysis of recordings is suitable only for offline analysis of the voice signal and cannot monitor the service quality of agents effectively in real time. Moreover, simple machine analysis of recordings yields only a general assessment of an agent's service attitude, so the evaluation is both biased and coarse, and agents cannot tell in which specific respects their service quality falls short.

Summary of the invention

The present invention provides a call voice monitoring method and system that realize comprehensive, real-time monitoring of the call voice of customer service agents and improve the service quality of server-side customer service agents.

To this end, the invention provides the following technical solution:

A call voice monitoring method, comprising:

Separately acquiring, in real time, the voice data of the server side and of the client side of a call;

Performing speech recognition on the server-side and client-side voice data to obtain server-side recognized text and client-side recognized text, respectively;

Monitoring the server-side call voice in real time according to the server-side call voice data and its recognized text together with the client-side call voice data and its recognized text.

Preferably, separately acquiring the voice data of the server side and of the client side in real time comprises:

Acquiring the server-side voice data in real time by recording directly from a physical sound card;

Acquiring the client-side voice data in real time by recording from a configured virtual sound card.

Preferably, monitoring the server-side call voice in real time comprises: monitoring whether the server-side call voice contains forbidden service phrases, and/or monitoring the validity of the customer service agent's replies in the server-side call voice.

Preferably, monitoring whether the server-side call voice contains forbidden service phrases comprises:

Converting the current server-side recognized text into a semantic vector;

Calculating the distance between the semantic vector and the semantic vector of each forbidden phrase in a pre-built forbidden phrase database;

If the distance is less than a set distance threshold, reminding the customer service agent that a forbidden service phrase has been used;

Monitoring the validity of the customer service agent's replies in the server-side call voice comprises:

Searching a pre-built answer library, according to the current client-side recognized text, for the answer corresponding to the question raised by the customer;

Extracting the keywords of the answer;

Counting how many of the keywords appear in the current server-side recognized text;

If the count is less than a set quantity threshold, reminding the customer service agent to pay attention to the validity of the reply.

Preferably, monitoring the validity of the customer service agent's replies in the server-side call voice further comprises:

If the count is less than the set quantity threshold, sending the answer to the server side and displaying it, so as to prompt the customer service agent with the content of the correct answer.

Preferably, the method further comprises monitoring any one or more of the following in the server-side call voice: volume, speech rate, tone;

Monitoring the volume of the server-side call voice comprises:

Calculating the average energy value of the current server-side voice data;

If the average energy value falls outside an energy reference range, reminding the customer service agent to pay attention to the call volume;

Monitoring the speech rate of the server-side call voice comprises:

Calculating the average speech rate of the current server-side voice data;

If the average speech rate falls outside a speech rate reference range, reminding the customer service agent to pay attention to the speech rate;

Monitoring the tone of the server-side call voice comprises:

Extracting stress-related features of the current server-side voice data syllable by syllable;

Detecting, according to the stress-related features, the stressed syllables of the current server-side call voice with a pre-trained stress detection model;

If the number of stressed syllables in the current server-side call voice data is greater than a stressed-syllable quantity threshold, reminding the customer service agent to pay attention to the tone of the call.

Preferably, reminding the customer service agent to pay attention to the call volume or speech rate comprises any one or more of the following:

Converting the energy value or speech rate of the current server-side call voice data into a curve displayed in real time, and distinguishing, in a different color, the part of the curve corresponding to the volume or speech rate that needs the agent's attention;

Representing the energy value or speech rate of the current server-side call voice data as signal bars, and marking the signal bars that correspond to the volume or speech rate that needs the agent's attention;

Representing the energy value or speech rate of the current server-side call voice data as a color gradient, and marking the color corresponding to the energy value or speech rate that needs the agent's attention;

Representing the energy value or speech rate of the current server-side call voice data as a bubble, which bursts when the agent needs to pay attention to the volume or speech rate;

Reminding the agent with a personalized voice prompt;

Reminding the customer service agent to pay attention to the tone of the call comprises:

Displaying the stressed syllables in the current server-side call voice data; or

Reminding the agent with a personalized voice prompt.

A call voice monitoring system, comprising:

First voice acquisition module, for acquiring the voice data of the server side of a call in real time;

Second voice acquisition module, for acquiring the voice data of the client side of the call in real time;

Speech recognition module, for performing speech recognition on the server-side call voice data acquired by the first voice acquisition module and on the client-side call voice data acquired by the second voice acquisition module, respectively, to obtain server-side recognized text and client-side recognized text;

Voice monitoring module, for monitoring the server-side call voice in real time according to the server-side call voice data and its recognized text together with the client-side call voice data and its recognized text.

Preferably, the first voice acquisition module is specifically configured to acquire the server-side voice data in real time by recording directly from a physical sound card;

The second voice acquisition module is specifically configured to acquire the client-side voice data in real time by recording from a configured virtual sound card.

Preferably, the voice monitoring module includes: a forbidden service phrase monitoring submodule, and/or a reply validity monitoring submodule;

The forbidden service phrase monitoring submodule is configured to monitor whether the server-side call voice contains forbidden service phrases;

The reply validity monitoring submodule is configured to monitor the validity of the customer service agent's replies in the server-side call voice.

Preferably, the forbidden service phrase monitoring submodule includes:

Semantic vector conversion unit, for converting the current server-side recognized text into a semantic vector;

Distance calculation unit, for calculating the distance between the semantic vector and the semantic vector of each forbidden phrase in a pre-built forbidden phrase database;

Forbidden service phrase reminding unit, for reminding the customer service agent that a forbidden service phrase has been used when the distance is less than a set distance threshold;

The reply validity monitoring submodule includes:

Answer searching unit, for searching a pre-built answer library, according to the current client-side recognized text, for the answer corresponding to the question raised by the customer;

Keyword extraction unit, for extracting the keywords of the answer;

Quantity checking unit, for counting how many of the keywords appear in the current server-side recognized text;

Reply validity reminding unit, for reminding the customer service agent to pay attention to the validity of the reply when the count is less than a set quantity threshold.

Preferably, the reply validity reminding unit further includes:

Answer sending unit, for sending the answer to the server side when the count is less than the set quantity threshold;

Answer display unit, for displaying the answer so as to prompt the customer service agent with the content of the correct answer.

Preferably, the system further includes any one or more of the following modules: a volume monitoring module, a speech rate monitoring module, a tone monitoring module;

The volume monitoring module is configured to monitor the volume of the server-side call voice;

The speech rate monitoring module is configured to monitor the speech rate of the server-side call voice;

The tone monitoring module is configured to monitor the tone of the server-side call voice;

The volume monitoring module includes:

Energy value calculation submodule, for calculating the average energy value of the current server-side voice data;

Volume reminding submodule, for reminding the customer service agent to pay attention to the call volume when the average energy value falls outside an energy reference range;

The speech rate monitoring module includes:

Speech rate calculation submodule, for calculating the average speech rate of the current server-side voice data;

Speech rate reminding submodule, for reminding the customer service agent to pay attention to the speech rate when the average speech rate falls outside a speech rate reference range;

The tone monitoring module includes:

Stress-related feature extraction submodule, for extracting stress-related features of the current server-side voice data syllable by syllable;

Stressed syllable detection submodule, for detecting, according to the stress-related features, the stressed syllables of the current server-side call voice with a pre-trained stress detection model;

Tone reminding submodule, for reminding the customer service agent to pay attention to the tone of the call when the number of stressed syllables in the current server-side call voice data is greater than a stressed-syllable quantity threshold.

Preferably, the volume reminding submodule or the speech rate reminding submodule specifically reminds the customer service agent in any one or more of the following ways:

Reminding the agent with a personalized voice prompt;

The tone reminding submodule is specifically configured to display the stressed syllables in the current server-side call voice data, or to remind the agent with a personalized voice prompt.

The call voice monitoring method and system provided by the invention can monitor the call voice of customer service agents comprehensively and effectively, perform different analyses for different factors, and, once an analysis result is obtained, send reminders with content specific to the problem found. This helps agents correct the problem promptly and effectively, so that server-side service quality is improved while call quality is ensured.

Brief description of the drawings

To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings required by the embodiments are briefly introduced below. Evidently, the drawings described below are only some of the embodiments recorded in the present invention, and those of ordinary skill in the art may derive other drawings from them.

Fig. 1 is the flow chart of the call voice monitoring method of the embodiment of the present invention；

Fig. 2 is the structure chart of the call voice monitoring system of the embodiment of the present invention.

Detailed description of the embodiments

To help those skilled in the art better understand the solutions of the embodiments of the present invention, the embodiments are described in further detail below with reference to the drawings and implementations.

As shown in Fig. 1, the call voice monitoring method of an embodiment of the present invention comprises the following steps:

Step 101, separately acquire, in real time, the voice data of the server side and of the client side of a call.

During voice data acquisition, the voice data of the server side and of the client side may be acquired in real time through two separate recording channels. Specifically, the server-side voice data may be acquired in real time by recording directly from the physical sound card, while a virtual sound card is installed for the client side and the client-side voice data is acquired through that virtual sound card. Of course, other ways of separately acquiring the server-side and client-side voice data in real time may also be used; the embodiment of the present invention places no limitation on this.

Step 102, perform speech recognition on the server-side and client-side voice data to obtain server-side recognized text and client-side recognized text, respectively.

It should be noted that, to improve the accuracy of speech recognition, a customized dialect model may be used for speech recognition once the server side or the client side is detected to be using a dialect. Likewise, a corresponding customized foreign-language model may be used once the server side or the client side is detected to be using a foreign language. The specific speech recognition method may follow the prior art; the embodiment of the present invention places no limitation on this either.

Step 103, monitor the server-side call voice in real time according to the server-side call voice data and its recognized text together with the client-side call voice data and its recognized text.

To improve the service quality of customer service agents, a corresponding reminder is issued whenever a problem arises in an agent's service, so that the agent can correct the problem promptly and effectively, thereby realizing real-time, comprehensive and effective monitoring of the server-side call voice.

Specifically, monitoring the server-side call voice in real time may include: monitoring whether the server-side call voice contains forbidden service phrases, and/or monitoring the validity of the customer service agent's replies in the server-side call voice.

Monitoring whether the server-side call voice contains forbidden service phrases may comprise the following steps:

(1) Convert the current server-side recognized text into a semantic vector.

Specifically, the prior art may be used: the semantic vector of each word is obtained by training, and the semantic vectors of all the words in the recognized text are then combined to obtain the semantic vector of the whole sentence. For example, the current server-side recognized text may be converted into a semantic vector with a Sentence2Vec model. Other conversion methods are of course possible; the embodiment of the present invention places no limitation on this.
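As a minimal sketch of combining word vectors into a sentence vector, the following Python example averages the vectors of the words in the recognized text. Averaging is only one possible way of combining and is an assumption here, as are the function name `sentence_vector`, the pre-tokenized input and the policy of skipping unknown words; the text does not specify how a Sentence2Vec model combines vectors.

```python
from typing import Dict, List, Sequence


def sentence_vector(tokens: Sequence[str],
                    word_vectors: Dict[str, List[float]],
                    dim: int) -> List[float]:
    """Average the vectors of all known tokens; return a zero vector if none are known."""
    total = [0.0] * dim
    known = 0
    for token in tokens:
        vec = word_vectors.get(token)
        if vec is None:
            continue  # assumption: out-of-vocabulary words are simply skipped
        for i, value in enumerate(vec):
            total[i] += value
        known += 1
    if known == 0:
        return total
    return [value / known for value in total]
```

In practice the word vectors would come from a trained embedding model, and the recognized text would first be segmented into words.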

(2) Calculate the distance between the semantic vector and the semantic vector of each forbidden phrase in a pre-built forbidden phrase database.

The distance may be a cosine distance or another distance; the embodiment of the present invention places no limitation on this. The forbidden phrase database may be built by collecting words, phrases or sentences that tend to provoke negative emotional reactions in customers. These words, phrases or sentences may be stored together with their semantic vectors, or only the corresponding semantic vectors may be stored. In addition, the forbidden phrase database may be updated continuously by adding newly emerging words, phrases or sentences that tend to provoke negative emotional reactions in customers.

(3) If the distance is less than the set distance threshold, remind the customer service agent that a forbidden service phrase has been used.

Monitoring whether the server-side call voice contains forbidden service phrases is mainly intended to prevent agents, when answering customers' questions or mediating complaints, from using sentences that carry personal emotion or impatience, such as "Hurry up and say it, don't waste my time" or "Any more questions? I have other things to do", which would lower service quality.
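Steps (2) and (3) can be sketched as follows, assuming cosine distance (defined here as one minus cosine similarity) and a database held as a list of (phrase, vector) pairs. The names `cosine_distance` and `find_forbidden`, the storage layout and the threshold value used in the usage note are illustrative assumptions, not part of the described method.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; return 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def find_forbidden(sentence_vec: Sequence[float],
                   forbidden_db: List[Tuple[str, List[float]]],
                   threshold: float) -> Optional[str]:
    """Return the closest forbidden phrase if its distance is below the threshold."""
    best_phrase: Optional[str] = None
    best_dist: Optional[float] = None
    for phrase, vec in forbidden_db:
        dist = cosine_distance(sentence_vec, vec)
        if best_dist is None or dist < best_dist:
            best_phrase, best_dist = phrase, dist
    if best_dist is not None and best_dist < threshold:
        return best_phrase  # a reminder would be issued for this phrase
    return None
```

With a toy database of two-dimensional vectors, `find_forbidden([0.9, 0.1], db, 0.2)` returns the phrase whose vector points in nearly the same direction, while a vector far from every entry returns `None` and triggers no reminder.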

Monitoring the validity of the customer service agent's replies in the server-side call voice may comprise the following steps:

(1) According to the current client-side recognized text, search a pre-built answer library for the answer corresponding to the question raised by the customer.

It should be noted that, when the answer library is built, the questions and answers in it may be expanded, for example by synonym expansion, hypernym expansion or hyponym expansion. For instance, the question "I want to set up a ring-back tone (CRBT)" may be expanded to "I want to activate a ring-back tone", "I want to use a ring-back tone", and so on. There are of course many other expansion methods; the embodiment of the present invention places no limitation on this.

(2) Extract the keywords of the answer.

(3) Count how many of the keywords appear in the current server-side recognized text.

(4) If the number of keywords is less than the set quantity threshold, remind the customer service agent to pay attention to the validity of the reply.

It should be noted that, if the check in step (4) finds that the number of keywords is less than the set quantity threshold, the answer may also be sent to the server side and displayed, so as to prompt the agent with the content of the correct answer. This effectively helps the agent find the answer to the customer's question in time, avoids misunderstandings in communication and the negative emotional reactions they cause in customers, and also reduces the time the agent spends thinking about the answer, making the service more efficient.
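The four steps above can be sketched as follows. This is only an illustration under simplifying assumptions: the answer library is a dictionary keyed by the normalized question text, keywords are stored alongside each answer rather than extracted at run time, and keyword matching is a case-insensitive substring test. The names `count_keywords` and `check_reply` and the library layout are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def count_keywords(agent_text: str, keywords: Sequence[str]) -> int:
    """Count how many distinct keywords occur in the agent's recognized text."""
    lowered = agent_text.lower()
    return sum(1 for kw in set(keywords) if kw.lower() in lowered)


def check_reply(customer_text: str,
                agent_text: str,
                answer_library: Dict[str, Dict[str, object]],
                min_keywords: int) -> Optional[str]:
    """Return the library answer to display if the reply looks ineffective, else None."""
    entry = answer_library.get(customer_text.strip().lower())
    if entry is None:
        return None  # assumption: no matching question means nothing to check
    keywords: List[str] = list(entry["keywords"])  # type: ignore[arg-type]
    if count_keywords(agent_text, keywords) < min_keywords:
        return str(entry["answer"])  # shown to the agent as the suggested answer
    return None
```

A real system would match the customer's question against the expanded questions in the library rather than requiring an exact match, and would derive the keywords with a keyword extraction method of its choice.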

In another embodiment of the invention, the call voice monitoring method may further comprise the following step:

Monitoring the volume, speech rate and tone of the server-side call voice in real time; of course, only one or two of the three may be selected for monitoring.

Specifically, the volume of the server-side call voice may be monitored as follows: calculate the average energy value of the current server-side voice data, and if the average energy value falls outside the energy reference range, remind the customer service agent to pay attention to the call volume.

The energy reference range may be determined by collecting in advance a large amount of server-side call voice data of good quality, computing the average energy of that voice data, taking the average as the energy reference value at which customers can comfortably hear the voice, and then extending the reference value upward and downward by a set percentage, such as 10%, to obtain the energy reference range. Other methods of obtaining the energy reference range are of course possible; the embodiment of the present invention places no limitation on this.
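The volume check and the derivation of the reference range can be sketched as follows. The energy measure used here, the mean of the squared sample values, is an assumption, since the text does not define how the average energy is computed; the function names and the alert strings are likewise illustrative.

```python
from typing import List, Optional, Sequence, Tuple


def average_energy(samples: Sequence[float]) -> float:
    """Mean of squared sample values; 0.0 for an empty segment."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def energy_reference_range(good_recordings: List[Sequence[float]],
                           margin: float = 0.10) -> Tuple[float, float]:
    """Average the energy of good recordings, then widen by +/- margin (e.g. 10%)."""
    energies = [average_energy(r) for r in good_recordings]
    reference = sum(energies) / len(energies)
    return reference * (1.0 - margin), reference * (1.0 + margin)


def volume_alert(samples: Sequence[float], low: float, high: float) -> Optional[str]:
    """Return a reminder label if the segment's energy is outside [low, high]."""
    energy = average_energy(samples)
    if energy < low:
        return "too quiet"
    if energy > high:
        return "too loud"
    return None
```

Returning distinct labels for the two cases lets the caller issue separate reminders for excessive and insufficient volume.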

Reminding the agent to pay attention to the call volume may involve issuing separate reminders when the agent's volume is detected to be too loud or too quiet, so that the agent understands more clearly where the problem in the call lies.

The ways of reminding the agent to pay attention to the call volume may include any one or more of the following:

(1) Convert the energy value of the current server-side call voice data into a curve displayed in real time, and distinguish, in a different color, the part of the curve corresponding to the volume that needs the agent's attention.

(2) Represent the energy value of the current server-side call voice data as signal bars, and mark the signal bars that correspond to the volume that needs the agent's attention.

(3) Represent the energy value of the current server-side call voice data as a color gradient, and mark the color corresponding to the energy value that needs the agent's attention.

(4) Represent the energy value of the current server-side call voice data as a bubble, which bursts when the agent needs to pay attention to the volume.

(5) Remind the agent with a personalized voice prompt.

It should be noted that other methods of reminding the agent to pay attention to the call volume or speech rate are also possible; they are not enumerated here, and the embodiment of the present invention places no limitation on this either.
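As one illustration of reminder mode (2), the sketch below renders an energy value as a row of text cells and appends a marker when the value lies outside the reference range. The text-based rendering, the five-cell scale, the `!` marker and the function names are all assumptions made for the example; an actual interface would draw graphical signal bars.

```python
def signal_bars(energy: float, max_energy: float, bars: int = 5) -> str:
    """Render energy as filled ('#') and empty ('.') cells on a fixed scale."""
    if max_energy <= 0:
        raise ValueError("max_energy must be positive")
    ratio = min(max(energy / max_energy, 0.0), 1.0)  # clamp to [0, 1]
    filled = round(ratio * bars)
    return "#" * filled + "." * (bars - filled)


def render_volume(energy: float, low: float, high: float,
                  max_energy: float, bars: int = 5) -> str:
    """Show the bars and mark them with '!' when the energy needs attention."""
    cells = signal_bars(energy, max_energy, bars)
    flag = " !" if energy < low or energy > high else ""
    return f"[{cells}]{flag}"
```

The same rendering could be reused for speech rate by passing a rate and its reference range instead of an energy value.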

Monitoring the volume of the server-side call voice prevents the agent's call volume from being too loud or too quiet: excessive volume can damage the customer's hearing, while insufficient volume prevents the customer from hearing what the agent says. The method provided by the embodiment of the present invention can effectively avoid these problems by issuing timely and effective reminders when the agent's call volume is detected to be too loud or too quiet, helping the agent correct the problem.

Monitoring the speech rate of the server-side call voice may include: calculating the average speech rate of the current server-side voice data, and if the average speech rate falls outside the speech rate reference range, reminding the customer service agent to pay attention to the speech rate.

Specifically, the speech rate reference range may be determined by collecting in advance a large amount of server-side call voice data of good quality, counting the average number of words spoken per minute in that voice data, taking the resulting average as the standard speech rate for server-side calls, and then extending the standard speech rate upward and downward, for example by several words slower or several words faster than the standard, to obtain the speech rate reference range. Other methods of obtaining the speech rate reference range are of course possible; the embodiment of the present invention places no limitation on this.
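The speech rate check can be sketched in the same way. The example assumes the word count is taken from the recognized text and the duration from the audio segment; the function names, the tolerance value and the alert strings are illustrative assumptions.

```python
from typing import List, Optional, Tuple


def speech_rate(word_count: int, duration_seconds: float) -> float:
    """Words per minute for one segment."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return word_count * 60.0 / duration_seconds


def rate_reference_range(good_calls: List[Tuple[int, float]],
                         tolerance: float) -> Tuple[float, float]:
    """Average the rate of good calls, then allow +/- tolerance words per minute."""
    rates = [speech_rate(words, seconds) for words, seconds in good_calls]
    standard = sum(rates) / len(rates)
    return standard - tolerance, standard + tolerance


def rate_alert(word_count: int, duration_seconds: float,
               low: float, high: float) -> Optional[str]:
    """Return a reminder label if the rate is outside [low, high]."""
    rate = speech_rate(word_count, duration_seconds)
    if rate < low:
        return "too slow"
    if rate > high:
        return "too fast"
    return None
```

For example, good calls of 200 words in 60 seconds and 440 words in 120 seconds give a standard rate of 210 words per minute, and a tolerance of 30 yields the range 180 to 240.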

Reminding the agent to pay attention to the speech rate may involve issuing separate reminders when the agent's speech rate is detected to be too fast or too slow, so that the agent understands more clearly where the speech rate problem lies.

There may be many specific ways of reminding the agent to pay attention to the speech rate, for example ways similar to those described above for reminding the agent to pay attention to the call volume.

The method provided by the embodiment of the present invention can issue timely and effective reminders when the agent's speech rate is detected to be too fast or too slow, which effectively prevents the customer from being unable to follow what is said, helps the agent correct the problem, and lets the customer converse with the agent more comfortably.

Monitoring the tone of the server-side call voice may include:

(1) Extract stress-related features of the current server-side voice data syllable by syllable.

The stress-related features may include the fundamental frequency of the current syllable, the fundamental frequency of the preceding syllable, the fundamental frequency of the following syllable, the position of the current syllable in the sentence, and whether the current syllable should be stressed. Whether the current syllable should be stressed may be obtained by manually annotating a large amount of voice data in advance.

(2) According to the stress-related features, detect the stressed syllables of the current server-side call voice with a pre-trained stress detection model.

The stress detection model may be trained on the stress-related features extracted from a large amount of good server-side call voice data. The stress detection model may be represented by a model commonly used in statistics, such as a support vector machine model or a neural network model. It should be noted that there are many other methods for training and representing the stress detection model, which are not enumerated here; the embodiment of the present invention places no limitation on this either.

(3) If the number of stressed syllables in the current server-side call voice data is greater than the stressed-syllable quantity threshold, remind the customer service agent to pay attention to the tone of the call.
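The three steps above can be sketched as follows. The example builds, for each syllable, the fundamental frequency of the current, preceding and following syllables plus a relative position; the manually annotated label is needed only for training and is omitted. The trained model is represented by an arbitrary callable, and `toy_model` is a hand-written stand-in rule, not a trained support vector machine or neural network. All names and the 20% rule are assumptions made for the example.

```python
from typing import Callable, List, Sequence

Features = List[float]


def stress_features(f0: Sequence[float]) -> List[Features]:
    """Build [f0, previous f0, next f0, relative position] for each syllable."""
    n = len(f0)
    features: List[Features] = []
    for i, cur in enumerate(f0):
        prev = f0[i - 1] if i > 0 else cur      # edge syllables reuse their own f0
        nxt = f0[i + 1] if i < n - 1 else cur
        position = i / (n - 1) if n > 1 else 0.0
        features.append([cur, prev, nxt, position])
    return features


def count_stressed(f0: Sequence[float],
                   is_stressed: Callable[[Features], bool]) -> int:
    """Count the syllables that the supplied model labels as stressed."""
    return sum(1 for feat in stress_features(f0) if is_stressed(feat))


def toy_model(feat: Features) -> bool:
    """Stand-in for a trained classifier: pitch at least 20% above both neighbours."""
    cur, prev, nxt, _position = feat
    return cur > 1.2 * prev and cur > 1.2 * nxt


def tone_alert(f0: Sequence[float],
               is_stressed: Callable[[Features], bool],
               max_stressed: int) -> bool:
    """True when the number of stressed syllables exceeds the threshold."""
    return count_stressed(f0, is_stressed) > max_stressed
```

Passing the classifier as a callable keeps the feature extraction and threshold logic independent of whichever model is actually trained.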

The methods of reminding the agent to pay attention to the tone of the call may include: displaying the stressed syllables in the current server-side call voice data, or reminding the agent with a personalized voice prompt. Other reminder methods are of course possible; they are not enumerated here, and the embodiment of the present invention places no limitation on this either.

By monitoring the tone of the server-side call voice, agents providing voice service to customers can be helped to avoid unnecessary stress that would affect the customer's mood.

The call voice monitoring method provided by the embodiment of the present invention separately acquires the voice data of the server side and of the client side of a call in real time, performs speech recognition on it, and monitors the server-side call voice in real time according to the server-side call voice data and its recognized text together with the client-side call voice data and its recognized text. It can thus monitor agents' call voice comprehensively and effectively, perform different analyses for different factors, and, once an analysis result is obtained, send reminders with content specific to the problem found, helping agents correct the problem promptly and effectively, so that server-side service quality is improved while call quality is ensured.

Further, in another embodiment of the method of the present invention, one or more of the volume, speech rate and tone of the server-side call voice may also be monitored, and the agent is reminded promptly when a problem occurs, which helps the server side adjust in time and thus better ensures service quality.

Correspondingly, the embodiment of the present invention also provides a call voice monitoring system. As shown in Fig. 2, which is a structure chart of the call voice monitoring system of the embodiment of the present invention, the system may include the following modules:

First voice acquisition module 201, for acquiring the voice data of the server side of a call in real time.

Second voice acquisition module 202, for acquiring the voice data of the client side of the call in real time.

Speech recognition module 203, for performing speech recognition on the server-side call voice data acquired by the first voice acquisition module 201 and on the client-side voice data acquired by the second voice acquisition module 202, respectively, to obtain server-side recognized text and client-side recognized text.

Voice monitoring module 204, for monitoring the server-side call voice in real time according to the server-side call voice data and its recognized text together with the client-side call voice data and its recognized text.
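The way the four modules fit together can be sketched as a small pipeline. The class name `CallMonitoringSystem`, the callable-based interfaces and the representation of audio as `bytes` are assumptions made for illustration; each callable stands in for a real module (sound card capture, a speech recognizer, and the monitoring checks).

```python
from dataclasses import dataclass
from typing import Callable, List

Recognizer = Callable[[bytes], str]
Monitor = Callable[[bytes, str, bytes, str], List[str]]


@dataclass
class CallMonitoringSystem:
    capture_server: Callable[[], bytes]   # first voice acquisition module (201)
    capture_client: Callable[[], bytes]   # second voice acquisition module (202)
    recognize: Recognizer                 # speech recognition module (203)
    monitor: Monitor                      # voice monitoring module (204)

    def step(self) -> List[str]:
        """Process one pair of audio segments and return any reminders."""
        server_audio = self.capture_server()
        client_audio = self.capture_client()
        server_text = self.recognize(server_audio)
        client_text = self.recognize(client_audio)
        return self.monitor(server_audio, server_text, client_audio, client_text)
```

Calling `step()` repeatedly on successive audio segments would approximate the real-time behaviour described above.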

The first voice acquisition module 201 may specifically acquire the server-side voice data in real time by recording directly from the physical sound card.

The second voice acquisition module 202 may specifically acquire the client-side voice data in real time by configuring a virtual sound card for the client side and recording from that virtual sound card.

The voice monitoring module 204 may specifically monitor whether the server-side call voice contains forbidden service phrases, and/or monitor the validity of the customer service agent's replies in the server-side call voice. Correspondingly, the voice monitoring module 204 may include a forbidden service phrase monitoring submodule and/or a reply validity monitoring submodule. The forbidden service phrase monitoring submodule is configured to monitor whether the server-side call voice contains forbidden service phrases. The reply validity monitoring submodule is configured to monitor the validity of the agent's replies in the server-side call voice.

Language monitoring submodule is prohibited in the service

Semantic vector converting unit, for being semantic vector by current service end identification text conversion.

Metrics calculation unit prohibits language with every prohibited in language database constructed in advance for calculating the semantic vector The distance of semantic vector.

Language reminding unit is prohibited in service, when for being less than the distance threshold of setting in the distance, contact staff is reminded to use Language is prohibited in service.
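
The three units above can be illustrated with a minimal Python sketch. It is not the patented implementation: the cosine distance metric, the function names `cosine_distance` and `find_prohibited_match`, and the toy three-dimensional vectors are all assumptions made for illustration, and the conversion of recognition text into a semantic vector is taken as already done.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two vectors (an assumed metric, not taken from the source)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an all-zero vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def find_prohibited_match(utterance_vec, prohibited_db, distance_threshold):
    """Return (phrase, distance) for the closest prohibited phrase whose
    distance is below the threshold, or None if no phrase is that close."""
    best = None
    for phrase, vec in prohibited_db.items():
        d = cosine_distance(utterance_vec, vec)
        if d < distance_threshold and (best is None or d < best[1]):
            best = (phrase, d)
    return best


# Hypothetical prohibited language database: phrase -> precomputed semantic vector.
prohibited_db = {
    "not my problem": [1.0, 0.0, 0.0],
    "hang up then": [0.0, 1.0, 0.0],
}

match = find_prohibited_match([0.9, 0.1, 0.0], prohibited_db, distance_threshold=0.2)
if match is not None:
    print("Reminder: prohibited service language detected, closest to:", match[0])
```

In practice the distance threshold would be tuned on labeled calls, since a value that is too large flags harmless sentences and a value that is too small misses paraphrased prohibited phrases.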

The reply validity monitoring submodule includes:

An answer search unit, for searching a pre-built answer library, according to the current client recognition text, for the answer corresponding to the question raised by the client.

A keyword extraction unit, for extracting keywords from the answer.

A quantity check unit, for checking the number of those keywords that occur in the current server-side recognition text.

A reply validity reminder unit, for reminding the contact staff to pay attention to the validity of the reply when the number of keywords is less than a set quantity threshold.

It should be noted that the reply validity reminder unit may further include an answer sending unit and an answer display unit. The answer sending unit sends the answer to the server side when the number of keywords is less than the set quantity threshold. The answer display unit displays the answer, to prompt the contact staff with the content of the correct answer. This can effectively help the contact staff find the answer to the client's question in time, avoiding misunderstandings in communication and thereby avoiding negative emotional reactions from the client. It also reduces the time the contact staff spends thinking about the answer, making the service more efficient.
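
The reply validity check described above can be sketched as follows. This is a simplified illustration under stated assumptions: the answer library is modeled as a dictionary keyed by a trigger phrase, keywords are extracted by dropping a small hypothetical stop-word list, and text is tokenized as English words. The names `find_answer`, `extract_keywords` and `check_reply` are hypothetical; the source does not specify how the lookup or the keyword extraction is performed.

```python
import re

# Hypothetical stop-word list; a real system would use a proper keyword extractor.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "and", "within"}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_keywords(answer):
    """Keep each non-stop-word token of the answer once, in order of appearance."""
    keywords = []
    for token in tokenize(answer):
        if token not in STOP_WORDS and token not in keywords:
            keywords.append(token)
    return keywords


def find_answer(client_text, answer_library):
    """Return the stored answer whose trigger phrase occurs in the client's text."""
    text = client_text.lower()
    for trigger, answer in answer_library.items():
        if trigger in text:
            return answer
    return None


def check_reply(client_text, agent_text, answer_library, quantity_threshold):
    """Count answer keywords present in the agent's reply and decide whether to remind."""
    answer = find_answer(client_text, answer_library)
    if answer is None:
        return None  # no known question was recognized, so there is nothing to check
    agent_tokens = set(tokenize(agent_text))
    hits = sum(1 for kw in extract_keywords(answer) if kw in agent_tokens)
    return {"answer": answer, "hits": hits, "remind": hits < quantity_threshold}


answer_library = {"refund": "Refunds are issued to the original card within seven days"}
result = check_reply(
    "How long does a refund take?",
    "Please wait a moment.",
    answer_library,
    quantity_threshold=3,
)
if result is not None and result["remind"]:
    # Corresponds to sending and displaying the answer on the server side.
    print("Suggested answer:", result["answer"])
```

Returning the stored answer together with the reminder flag mirrors the answer sending and answer display units: the same lookup result drives both the warning and the prompt shown to the contact staff.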

In another embodiment of the present invention, the call voice monitoring system may further include any one or more of the following modules:

A volume monitoring module, for monitoring the volume of the server-side call voice according to the server-side call voice data and the client call voice data.

A speech rate monitoring module, for monitoring the speech rate of the server-side call voice according to the server-side call voice data and the client call voice data.

A tone monitoring module, for monitoring the tone of the server-side call voice according to the server-side call voice data and the client call voice data.

Specifically, the volume monitoring module may include:

An energy value calculation submodule, for calculating the average energy value of the current server-side voice data.

A volume reminder submodule, for reminding the contact staff to pay attention to the call volume when the average energy value falls outside the energy reference range.
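
As a rough illustration of the two submodules above, the following Python sketch computes an average energy value and compares it with a reference range. The definition of energy as the mean of the squared sample amplitudes, the normalized sample values, and the example range are assumptions; the source does not state here how the energy or the reference range is obtained.

```python
def average_energy(samples):
    """Mean of squared amplitudes over a window of samples (assumed energy definition)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def volume_reminder(samples, energy_range):
    """Return a reminder string when the average energy leaves the reference range."""
    low, high = energy_range
    energy = average_energy(samples)
    if energy < low:
        return "too quiet"
    if energy > high:
        return "too loud"
    return None  # within the reference range, no reminder needed


# Hypothetical window of normalized samples in [-1.0, 1.0] and reference range.
window = [0.5, -0.5, 0.5, -0.5]
print(volume_reminder(window, energy_range=(0.01, 0.1)))
```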

The speech rate monitoring module may include:

A speech rate calculation submodule, for calculating the average speech rate of the current server-side voice data.

A speech rate reminder submodule, for reminding the contact staff to pay attention to the speech rate of the call when the average speech rate falls outside the speech rate reference range.
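
The speech rate check can be sketched in the same way. The sketch assumes that the average speech rate is the number of recognized words divided by the speaking duration in seconds; for Chinese speech, characters or syllables per second would be the more natural unit. The function names and the reference range are hypothetical.

```python
def average_speech_rate(recognized_text, duration_seconds):
    """Words per second over the given duration (assumed definition of speech rate)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(recognized_text.split()) / duration_seconds


def speech_rate_reminder(recognized_text, duration_seconds, rate_range):
    """Return a reminder string when the average rate leaves the reference range."""
    low, high = rate_range
    rate = average_speech_rate(recognized_text, duration_seconds)
    if rate < low:
        return "too slow"
    if rate > high:
        return "too fast"
    return None  # within the reference range, no reminder needed


# Six recognized words spoken in one second, against a hypothetical range.
print(speech_rate_reminder("one two three four five six", 1.0, rate_range=(1.5, 3.5)))
```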

The volume reminder submodule or the speech rate reminder submodule may specifically remind the contact staff in any one or more of the following ways:

(1) Convert the energy value or speech rate of the server side's current call voice data into a curve displayed in real time, and distinguish, in a different color, the portion of the curve corresponding to the volume or speech rate that requires the contact staff's attention.

(2) Convert the energy value or speech rate of the server side's current call voice data into a signal-bar representation, and mark the signal bars of the volume or speech rate that requires the contact staff's attention.

(3) Represent the energy value or speech rate of the server side's current call voice data as a color gradient, and mark the color corresponding to the energy value or speech rate that requires the contact staff's attention.

(4) Represent the energy value or speech rate of the server side's current call voice data as a bubble; when the contact staff needs to pay attention to the volume or speech rate, the bubble bursts.

(5) Remind the contact staff with a personalized voice prompt.

Of course, the volume reminder submodule or the speech rate reminder submodule may also remind the contact staff to pay attention to the call volume or speech rate in other ways, which are not enumerated here; the embodiment of the present invention places no limitation on this.

The tone monitoring module may include:

A stress-related feature extraction submodule, for extracting the stress-related features of the current server-side voice data in units of syllables.

A stressed syllable detection submodule, for detecting the stressed syllables of the current server-side call voice from the stress-related features by means of a pre-trained stress detection model.

A tone reminder submodule, which may specifically remind the contact staff by displaying the stressed syllables in the server side's current call voice data or by using a personalized voice prompt.
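
The tone check can be sketched as counting the syllables that a stress detection model marks as stressed and comparing that count with a threshold, which is the condition stated in the claims. In the sketch below the pre-trained model is replaced by a toy rule, and the per-syllable features `energy` and `duration` are assumptions chosen only for illustration; the actual features and model are not specified in this section.

```python
def toy_stress_model(features):
    """Stand-in for the pre-trained stress detection model (a hypothetical rule)."""
    return features["energy"] > 0.6 and features["duration"] > 0.2


def tone_reminder(syllable_features, stress_model, stressed_threshold):
    """Find stressed syllables and decide whether the count exceeds the threshold."""
    stressed = [i for i, f in enumerate(syllable_features) if stress_model(f)]
    return {"stressed_indices": stressed, "remind": len(stressed) > stressed_threshold}


# Hypothetical per-syllable features for one server-side utterance.
syllables = [
    {"energy": 0.9, "duration": 0.30},
    {"energy": 0.3, "duration": 0.15},
    {"energy": 0.8, "duration": 0.25},
    {"energy": 0.7, "duration": 0.28},
]

result = tone_reminder(syllables, toy_stress_model, stressed_threshold=2)
if result["remind"]:
    print("Reminder: watch the tone; stressed syllables at", result["stressed_indices"])
```

Keeping the indices of the stressed syllables, rather than only their count, supports the display-based reminder, where the stressed syllables themselves are shown to the contact staff.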

The call voice monitoring system provided by the embodiment of the present invention acquires the voice data of the server-side and client calls in real time, performs speech recognition on that data, and monitors the server-side call voice in real time according to the server-side call voice data and its corresponding recognition text and the client call voice data and its corresponding recognition text. It can thus monitor the contact staff's call voice comprehensively and effectively, perform different analyses for different factors, and, once an analysis result is obtained, send a reminder whose content is specific to the problem found. This helps the contact staff correct the problem promptly and effectively, improving server-side service quality while ensuring call quality.

Further, the call voice monitoring system of the present invention can also monitor one or more of the volume, speech rate and tone of the server-side call voice, and remind the contact staff promptly when a problem occurs, making it easier for the server side to adjust in time and thereby better ensuring service quality.

All the embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, see the description of the method embodiment. The system embodiment described above is merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

The embodiments of the present invention have been described in detail above, and specific embodiments have been used herein to illustrate the present invention; the description of the above embodiments is only intended to help in understanding the method and system of the present invention. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementation and scope of application in accordance with the idea of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims

1. A call voice monitoring method, characterized by comprising:

performing speech recognition on the voice data of the server side and the client, to obtain server-side recognition text and client recognition text respectively;

monitoring the server-side call voice in real time according to the server-side call voice data and its corresponding recognition text and the client call voice data and its corresponding recognition text, including monitoring the validity of the contact staff's replies in the server-side call voice;

searching a pre-built answer library, according to the current client recognition text, for the answer corresponding to the question raised by the client;

extracting keywords from the answer;

2. The method according to claim 1, characterized in that said acquiring, respectively and in real time, the voice data of the server-side and client calls comprises:

3. The method according to claim 1, characterized in that said monitoring the server-side call voice in real time further comprises: monitoring whether the server-side call voice contains prohibited service language.

4. The method according to claim 3, characterized in that

said monitoring whether the server-side call voice contains prohibited service language comprises:

converting the current server-side recognition text into a semantic vector;

if the distance is less than a set distance threshold, reminding the contact staff that prohibited service language has been used.

5. The method according to claim 1, characterized in that said monitoring the validity of the contact staff's replies in the server-side call voice further comprises:

if the quantity is less than the set quantity threshold, sending the answer to the server side and displaying it, to prompt the contact staff with the content of the correct answer.

6. The method according to any one of claims 1 to 5, characterized in that the method further comprises: monitoring any one or more of the following of the server-side call voice: volume, speech rate, tone;

said monitoring the volume of the server-side call voice comprises:

calculating the average energy value of the current server-side voice data;

said monitoring the speech rate of the server-side call voice comprises:

calculating the average speech rate of the current server-side voice data;

said monitoring the tone of the server-side call voice comprises:

detecting, according to the stress-related features, the stressed syllables of the current server-side call voice by means of a pre-trained stress detection model;

if the number of stressed syllables in the current server-side call voice data is greater than a stressed-syllable quantity threshold, reminding the contact staff to pay attention to the tone of the call.

7. The method according to claim 6, characterized in that

said reminding the contact staff to pay attention to the call volume or speech rate comprises any one or more of the following:

converting the energy value or speech rate of the server side's current call voice data into a curve displayed in real time, and distinguishing, in a different color, the portion of the curve corresponding to the volume or speech rate that requires the contact staff's attention;

converting the energy value or speech rate of the server side's current call voice data into a signal-bar representation, and marking the signal bars of the volume or speech rate that requires the contact staff's attention;

representing the energy value or speech rate of the server side's current call voice data as a color gradient, and marking the color corresponding to the energy value or speech rate that requires the contact staff's attention;

representing the energy value or speech rate of the server side's current call voice data as a bubble, the bubble bursting when the contact staff needs to pay attention to the volume or speech rate;

reminding the contact staff with a personalized voice prompt;

said reminding the contact staff to pay attention to the tone of the call comprises:

displaying the stressed syllables in the server side's current call voice data; or

reminding the contact staff with a personalized voice prompt.

8. A call voice monitoring system, characterized by comprising:

a speech recognition module, for performing speech recognition separately on the server-side call voice data acquired by the first voice acquisition module and the client call voice data acquired by the second voice acquisition module, to obtain server-side recognition text and client recognition text;

a speech monitoring module, for monitoring the server-side call voice in real time according to the server-side call voice data and its corresponding recognition text and the client call voice data and its corresponding recognition text; the speech monitoring module includes a reply validity monitoring submodule, the reply validity monitoring submodule being for monitoring the validity of the contact staff's replies in the server-side call voice;

the reply validity monitoring submodule includes:

an answer search unit, for searching a pre-built answer library, according to the current client recognition text, for the answer corresponding to the question raised by the client;

a keyword extraction unit, for extracting keywords from the answer;

a reply validity reminder unit, for reminding the contact staff to pay attention to the validity of the reply when the quantity is less than a set quantity threshold.

9. The system according to claim 8, characterized in that

the first voice acquisition module is specifically for acquiring the server-side voice data in real time by recording directly from a physical sound card;

the second voice acquisition module is specifically for acquiring the client voice data in real time by recording from a configured virtual sound card.

10. The system according to claim 8, characterized in that the speech monitoring module further includes: a prohibited service language monitoring submodule;

the prohibited service language monitoring submodule is for monitoring whether the server-side call voice contains prohibited service language.

11. The system according to claim 10, characterized in that

the prohibited service language monitoring submodule includes:

a distance calculation unit, for calculating the distance between the semantic vector and the semantic vector of each prohibited phrase in a pre-built prohibited language database;

a prohibited service language reminder unit, for reminding the contact staff that prohibited service language has been used when the distance is less than a set distance threshold.

12. The system according to claim 8, characterized in that the reply validity reminder unit further includes:

13. The system according to claim 8, characterized in that the system further includes any one or more of the following modules: a volume monitoring module, a speech rate monitoring module, a tone monitoring module;

the volume monitoring module includes:

a volume reminder submodule, for reminding the contact staff to pay attention to the call volume when the average energy value falls outside the energy reference range;

the speech rate monitoring module includes:

a speech rate reminder submodule, for reminding the contact staff to pay attention to the speech rate of the call when the average speech rate falls outside the speech rate reference range;

the tone monitoring module includes:

a stress-related feature extraction submodule, for extracting the stress-related features of the current server-side voice data in units of syllables;

a stressed syllable detection submodule, for detecting the stressed syllables of the current server-side call voice from the stress-related features by means of a pre-trained stress detection model;

a tone reminder submodule, for reminding the contact staff to pay attention to the tone of the call when the number of stressed syllables in the current server-side call voice data is greater than a stressed-syllable quantity threshold.

14. The system according to claim 13, characterized in that

the volume reminder submodule or the speech rate reminder submodule specifically reminds the contact staff in any one or more of the following ways:

reminding the contact staff with a personalized voice prompt;

the tone reminder submodule is specifically for displaying the stressed syllables in the server side's current call voice data, or for reminding the contact staff with a personalized voice prompt.