CN110519470A

CN110519470A - Speech processing method, server, and audio access device

Info

Publication number: CN110519470A
Application number: CN201910779362.8A
Authority: CN
Inventors: 徐菲; 蔡昀天; 蔡劲松; 朱达贤
Original assignee: Individual
Current assignee: Individual
Priority date: 2019-08-22
Filing date: 2019-08-22
Publication date: 2019-11-29

Abstract

Embodiments of the present application provide a speech processing method. A server receives voice data sent by an audio access device, the voice data being real-time call data between a user and a first customer service agent. In other words, the audio access device obtains the call data between the user and the first customer service agent in real time and sends the obtained voice data to the server. The server converts the received voice data into text and performs semantic analysis on the text, obtaining a semantic analysis result that indicates whether the voice data meets a preset call specification. The server then sends the semantic analysis result to a terminal device. The server can thus analyze the call between the user and the first customer service agent in real time and determine whether it meets the preset call specification. Compared with the after-the-fact spot checks used in the traditional approach, the service quality of the first customer service agent can be supervised in a timely manner.

Description

Speech processing method, server, and audio access device

Technical field

This application relates to the field of voice supervision, and in particular to a speech processing method, a server, and an audio access device.

Background

At present, many enterprises offer a voice service function in order to serve their users better. A user can dial a service hotline and talk with a customer service agent, who answers the questions the user raises.

To guarantee the service quality of customer service agents, an enterprise generally takes some measures to supervise service calls. Specifically, the conversations between customer service agents and users are recorded, and the service quality of the agents is then supervised by spot-checking the recordings.

With spot-checked recordings, however, the service quality of a customer service agent can only be determined after the fact, so supervision is not timely.

Summary of the invention

The technical problem to be solved by this application is that the traditional way of supervising the service quality of customer service agents is not timely. To this end, a speech processing method, a server, and an audio access device are provided.

In a first aspect, an embodiment of the present application provides a speech processing method, the method comprising:

A server receives voice data sent by an audio access device; the voice data is real-time call data between a user and a first customer service agent;

The server converts the received voice data into text;

The server performs semantic analysis on the text to obtain a semantic analysis result, the semantic analysis result indicating whether the voice data meets a preset call specification;

The server sends the semantic analysis result to a terminal device.

Optionally, the method further comprises:

If the semantic analysis result indicates that the voice data does not meet the preset call specification, the server sends a hang-up instruction to the audio access device, so that the audio access device cuts off the call channel between the call device of the user and the call device of the first customer service agent.

Optionally, the method further comprises:

If the semantic analysis result indicates that the voice data does not meet the preset call specification, the server sends a transfer instruction to the audio access device, so that the audio access device cuts off the call channel between the call device of the user and the call device of the first customer service agent, and connects a call channel between the call device of the user and the call device of a second customer service agent.
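
The two optional reactions above can be pictured with a minimal sketch. The text presents the hang-up instruction and the transfer instruction as separate options; preferring a transfer whenever a second agent is available is an assumption made here for illustration, and the names `Instruction` and `choose_instruction` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instruction:
    kind: str                           # "hang_up" or "transfer" (hypothetical labels)
    target_agent: Optional[str] = None  # only set for a transfer


def choose_instruction(compliant: bool,
                       second_agent: Optional[str]) -> Optional[Instruction]:
    """Return the instruction the server would send to the audio access device."""
    if compliant:
        # The call meets the preset call specification: nothing to send.
        return None
    if second_agent is not None:
        # Assumed policy: hand the user over to a second agent when one exists.
        return Instruction("transfer", second_agent)
    # Otherwise cut the channel between the user and the first agent.
    return Instruction("hang_up")
```

For example, `choose_instruction(False, "agent-2")` yields a transfer instruction targeting `agent-2`, while `choose_instruction(False, None)` yields a hang-up instruction.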

Optionally,

The call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses a carrier system by means of agent registration;

Alternatively,

The call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses a computer telecommunication integration (CTI) system by means of agent registration.

Optionally, the server performing semantic analysis on the text comprises:

The server performs semantic analysis on the text using a pre-trained semantic analysis model; the pre-trained semantic analysis model is obtained by training on training texts and the labels carried by the training texts, and the label carried by a training text indicates whether the voice data corresponding to that training text meets the preset call specification.

Optionally, the server performing semantic analysis on the text comprises:

The server matches the text against standard texts in a pre-built semantic library, thereby performing semantic analysis on the text;

The voice data corresponding to a standard text meets the preset call specification; alternatively, the voice data corresponding to a standard text does not meet the preset call specification.

Optionally, the preset call specification comprises:

The call content complies with the rules; and/or the call contains no negative emotion.

Optionally, the method further comprises:

The server receives a voice data acquisition request sent by the terminal device;

The server sends the voice data to the terminal device.

Optionally, the server receiving the voice data sent by the audio access device comprises:

The server receives voice data that has undergone speech codec processing and is sent by the audio access device; the voice data after the speech codec processing conforms to a voice data format supported by the server.

Optionally, the method further comprises:

The server obtains the semantic analysis results corresponding to multiple pieces of voice data associated with the first customer service agent;

The server determines the service quality of the first customer service agent based on the semantic analysis results corresponding to the multiple pieces of voice data associated with the first customer service agent.
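
The text does not say how the individual results are combined into a service quality. As one possible reading, the sketch below takes the share of calls that met the preset call specification and maps it to a coarse grade. The function names, the grade labels, and the 0.9 threshold are all assumptions for illustration.

```python
from typing import Iterable


def compliance_rate(results: Iterable[bool]) -> float:
    """Fraction of analysed calls that met the preset call specification."""
    results = list(results)
    if not results:
        raise ValueError("no semantic analysis results for this agent")
    return sum(results) / len(results)


def service_quality(results: Iterable[bool], threshold: float = 0.9) -> str:
    """Map the compliance rate to a coarse grade (threshold is illustrative)."""
    return "acceptable" if compliance_rate(results) >= threshold else "needs review"
```

A weighted score, or a rule that flags an agent after a single serious violation, would fit the description equally well.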

In a second aspect, an embodiment of the present application provides a speech processing method, the method comprising:

An audio access device obtains voice data; the voice data is real-time call data between a user and a first customer service agent;

The audio access device sends the voice data to a server.

Optionally, the method further comprises:

The audio access device receives a hang-up instruction sent by the server;

The audio access device, according to the hang-up instruction, cuts off the call channel between the call device of the user and the call device of the first customer service agent.

Optionally, the method further comprises:

The audio access device receives a transfer instruction sent by the server;

The audio access device, based on the transfer instruction, cuts off the call channel between the call device of the user and the call device of the first customer service agent, and connects a call channel between the call device of the user and the call device of a second customer service agent.

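Optionally, the call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses a carrier system by means of agent registration;

Alternatively,

The call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses a computer telecommunication integration (CTI) system by means of agent registration.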

Optionally, the audio access device sending the voice data to the server comprises:

The audio access device performs speech codec processing on the voice data, so that the processed voice data conforms to a voice data format supported by the server;

The audio access device sends the voice data that has undergone speech codec processing to the server.

Optionally, the audio access device obtaining the voice data comprises:

The audio access device receives a record command and obtains the voice data based on the record command.

In a third aspect, an embodiment of the present application provides a server, the server comprising:

A first receiving unit, configured to receive voice data sent by an audio access device; the voice data is real-time call data between a user and a first customer service agent;

A converting unit, configured to convert the received voice data into text;

A semantic analysis unit, configured to perform semantic analysis on the text to obtain a semantic analysis result, the semantic analysis result indicating whether the voice data meets a preset call specification;

A first sending unit, configured to send the semantic analysis result to a terminal device.

Optionally, the server further comprises:

A second sending unit, configured to send a hang-up instruction to the audio access device if the semantic analysis result indicates that the voice data does not meet the preset call specification, so that the audio access device cuts off the call channel between the call device of the user and the call device of the first customer service agent.

Optionally, the server further comprises:

A third sending unit, configured to send a transfer instruction to the audio access device if the semantic analysis result indicates that the voice data does not meet the preset call specification, so that the audio access device cuts off the call channel between the call device of the user and the call device of the first customer service agent, and connects a call channel between the call device of the user and the call device of a second customer service agent.

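Optionally,

The call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses a carrier system by means of agent registration;

Alternatively,

The call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses a computer telecommunication integration (CTI) system by means of agent registration.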

Optionally, the semantic analysis unit is specifically configured to:

Perform semantic analysis on the text using a pre-trained semantic analysis model; the pre-trained semantic analysis model is obtained by training on training texts and the labels carried by the training texts, and the label carried by a training text indicates whether the voice data corresponding to that training text meets the preset call specification.

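Optionally, the semantic analysis unit is specifically configured to:

Match the text against standard texts in a pre-built semantic library, thereby performing semantic analysis on the text;

The voice data corresponding to a standard text meets the preset call specification; alternatively, the voice data corresponding to a standard text does not meet the preset call specification.

Optionally, the preset call specification comprises:

The call content complies with the rules; and/or the call contains no negative emotion.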

Optionally, the server further comprises:

A second receiving unit, configured to receive a voice data acquisition request sent by the terminal device;

A fourth sending unit, configured to send the voice data to the terminal device.

Optionally, the first receiving unit is specifically configured to:

Receive voice data that has undergone speech codec processing and is sent by the audio access device; the voice data after the speech codec processing conforms to a voice data format supported by the server.

Optionally, the server further comprises:

An obtaining unit, configured to obtain the semantic analysis results corresponding to multiple pieces of voice data associated with the first customer service agent;

A determining unit, configured to determine the service quality of the first customer service agent based on the semantic analysis results corresponding to the multiple pieces of voice data associated with the first customer service agent.

In a fourth aspect, an embodiment of the present application provides an audio access device, the audio access device comprising:

An obtaining unit, configured to obtain voice data; the voice data is real-time call data between a user and a first customer service agent;

A sending unit, configured to send the voice data to a server.

Optionally, the audio access device further comprises:

A first receiving unit, configured to receive a hang-up instruction sent by the server;

A hang-up unit, configured to cut off, according to the hang-up instruction, the call channel between the call device of the user and the call device of the first customer service agent.

Optionally, the audio access device further comprises:

A second receiving unit, configured to receive a transfer instruction sent by the server;

A transfer unit, configured to cut off, based on the transfer instruction, the call channel between the call device of the user and the call device of the first customer service agent, and to connect a call channel between the call device of the user and the call device of a second customer service agent.

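Optionally, the call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses a carrier system by means of agent registration;

Alternatively,

The call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses a computer telecommunication integration (CTI) system by means of agent registration.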

Optionally, the sending unit is specifically configured to:

Perform speech codec processing on the voice data, so that the processed voice data conforms to a voice data format supported by the server;

Send the voice data that has undergone speech codec processing to the server.

Optionally, the obtaining unit is specifically configured to: receive a record command, and obtain the voice data based on the record command.

Compared with the prior art, the embodiments of the present application have the following advantages:

An embodiment of the present application provides a speech processing method. Specifically, a server receives voice data sent by an audio access device, the voice data being real-time call data between a user and a first customer service agent. In other words, the audio access device obtains the call data between the user and the first customer service agent in real time and sends the obtained voice data to the server. The server converts the received voice data into text and performs semantic analysis on the text, obtaining a semantic analysis result that indicates whether the voice data meets a preset call specification. The server then sends the semantic analysis result to a terminal device. With the scheme provided by the embodiments of the present application, the server can therefore analyze the call between the user and the first customer service agent in real time and determine whether it meets the preset call specification. Compared with the after-the-fact spot checks used in the traditional approach, the service quality of the first customer service agent can be supervised in a timely manner.

Brief description of the drawings

To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments recorded in this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic structural diagram of a speech processing system provided by an embodiment of the present application;

Fig. 2 is a signaling interaction diagram of a speech processing method provided by an embodiment of the present application;

Fig. 3a is a schematic diagram of an access scenario of an audio access module provided by an embodiment of the present application;

Fig. 3b is a schematic diagram of another access scenario of an audio access module provided by an embodiment of the present application;

Fig. 4 is a schematic structural diagram of a server provided by an embodiment of the present application;

Fig. 5 is a schematic structural diagram of an audio access device provided by an embodiment of the present application.

Detailed description of the embodiments

To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments in this application, without creative effort, fall within the protection scope of this application.

The inventors found through research that, in the traditional approach, an enterprise generally takes some measures to supervise service calls in order to guarantee the service quality of its customer service agents. Specifically, the conversations between customer service agents and users are recorded, and the service quality of the agents is then supervised by spot-checking the recordings. With spot-checked recordings, however, the service quality of an agent can only be determined after the fact, so supervision is not timely.

To solve this problem, the embodiments of the present application provide a speech processing method and device that can supervise the service quality of customer service agents in a timely manner.

Various non-limiting embodiments of the application are described in detail below with reference to the drawings.

Exemplary method

For ease of understanding, the scenario of the speech processing method provided by the embodiments of the present application is introduced first.

Referring to Fig. 1, which is a schematic structural diagram of a speech processing system provided by an embodiment of the present application. The speech processing system 100 provided by the embodiment includes a server 101 and an audio access device 102. The server 101 and the audio access device 102 can be connected through a network.

The server 101 mentioned in the embodiments of the present application may be a dedicated server used only for executing the speech processing method provided by the embodiments, or a general-purpose server that also has other data processing functions; the embodiments of the present application impose no specific limitation.

The audio access device 102 provided by the embodiments of the present application can be connected in series in the call channel between the call device of the user and the call device of the customer service agent, and is used to obtain the call data between the user and the customer service agent in real time.

The speech processing method provided by the embodiments of the present application is introduced below with reference to Fig. 2.

Referring to Fig. 2, which is a signaling interaction diagram of a speech processing method provided by an embodiment of the present application. The method may be implemented, for example, through the following steps S201 to S205.

It should be noted that the server shown in Fig. 2 may be the server 101 in Fig. 1, and the audio access device shown in Fig. 2 may be the audio access device 102 in Fig. 1.

S201: The audio access device obtains voice data, the voice data being real-time call data between a user and a first customer service agent.

As described above, the audio access device can be connected in series in the call channel between the call device of the user and the call device of the customer service agent. Therefore, while the user is talking with a customer service agent, for example the first customer service agent, the audio access device can record the call and thereby obtain the real-time call data between the user and the first customer service agent.

In practice, when a user calls a service hotline, the number dialed may belong to an ordinary landline or to an agent seat phone. In the embodiments of the present application, if the user talks with the first customer service agent over an ordinary landline, the call device of the user establishes a call channel with the ordinary landline of the first customer service agent through the carrier system. This case is illustrated in Fig. 3a, a schematic diagram of an access scenario of the audio access module provided by an embodiment of the present application. As shown in Fig. 3a, the audio access device 301 sits between the ordinary landline device 310 (i.e. the call device of the first customer service agent) and the carrier system 320. If the user talks with the first customer service agent over an agent seat phone, the call device of the user establishes a call channel with the agent seat phone of the first customer service agent through a computer telecommunication integration (Computer Telecommunication Integration, CTI) system. This case is illustrated in Fig. 3b, a schematic diagram of another access scenario of the audio access module provided by an embodiment of the present application. As shown in Fig. 3b, the audio access device 301 sits between the agent seat phone 330 (i.e. the call device of the first customer service agent) and the CTI system 340. The CTI system may also be called a call center system.

In practice, when the call device of the user initiates a call request to the carrier system or the CTI system, that system first establishes the call channel between itself and the call device of the first customer service agent, then establishes the call channel between itself and the call device of the user, and only after the first customer service agent goes off-hook and answers can the user talk with the agent. This is why a caller usually hears a ring tone for a while after dialing. If the audio access device started recording before the call channel between the carrier system or CTI system and the agent's call device was fully established, or before the first customer service agent had answered, it might record a stretch of useless data such as the ring tone rather than real-time call data between the user and the first customer service agent. For this reason, in the embodiments of the present application the audio access device can start recording based on a record command. That is, the audio access device starts recording after it receives the record command, instead of starting immediately once the carrier system or CTI system has established the call channel to the call device of the first customer service agent, which avoids recording useless data such as the ring tone. The record command indicates that the user can now talk with the first customer service agent, so the data recorded after the command is received is real-time call data between the user and the first customer service agent.
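
The gating behaviour described above can be sketched as a small state holder that discards audio until the record command arrives. This is a minimal illustration, not the device's actual design; the class and method names are hypothetical, and the audio is modelled as opaque byte frames.

```python
from typing import List


class CallRecorder:
    """Buffers audio frames only after a record command has been received."""

    def __init__(self) -> None:
        self.recording = False
        self.frames: List[bytes] = []

    def on_record_command(self) -> None:
        # Per the description, the command arrives once the agent has answered.
        self.recording = True

    def on_audio_frame(self, frame: bytes) -> None:
        # Frames seen before the command (for example the ring tone) are dropped.
        if self.recording:
            self.frames.append(frame)
```

Feeding a frame, then the record command, then a second frame leaves only the second frame in `frames`.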

It should be noted that, in the embodiments of the present application, if the number the user dials belongs to an ordinary landline, the record command may be sent to the audio access device by the carrier system. Specifically, the carrier system may send the record command to the audio access device after the call channel between the call device of the user and the carrier system has been established and the first customer service agent has gone off-hook and answered. If the number the user dials belongs to an agent seat phone, the record command may be sent to the audio access device by the CTI system. Specifically, the CTI system may send the record command to the audio access device after the call channel between the call device of the user and the CTI system has been established and the first customer service agent has gone off-hook and answered.

Because the audio access device is connected in series in the call channel between the call device of the user and the call device of the first customer service agent, a failure of the audio access device would prevent the user from talking with the first customer service agent. To avoid this problem, in the embodiments of the present application, if the user talks with the first customer service agent over an ordinary landline, the call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses the carrier system by means of agent registration. Since the agent's call device is attached to the audio access device through registration, the registered address of the agent's call device can be changed from the audio access device to the carrier system if the audio access device fails, so the call between the user and the first customer service agent is not affected. If the user talks with the first customer service agent over an agent seat phone, the call device of the first customer service agent accesses the audio access device by means of agent registration, and the audio access device accesses the CTI system by means of agent registration. If the audio access device fails, the registered address of the agent's call device can be changed from the audio access device to the CTI system, so the call between the user and the first customer service agent is likewise not affected.
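
The failover idea, stripped of any signalling detail, amounts to choosing which address the agent's call device registers with. The sketch below assumes a health flag is available from some monitoring mechanism that the text does not describe; the class, function, and address names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AgentCallDevice:
    registered_address: str


def update_registration(device: AgentCallDevice,
                        access_device_ok: bool,
                        access_device_addr: str,
                        upstream_addr: str) -> str:
    """Register with the audio access device while it is healthy; otherwise
    register directly with the upstream carrier or CTI system."""
    device.registered_address = (
        access_device_addr if access_device_ok else upstream_addr
    )
    return device.registered_address
```

While the device bypasses the audio access device in this way, calls still connect but are no longer captured for analysis.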

S202: The audio access device sends the obtained voice data to the server.

In the embodiments of the present application, the audio access device and the server can be connected through a network. Therefore, after the audio access device obtains the voice data, it can send the voice data to the server over the network.

In practice there are many ways of encoding and decoding voice data, and the voice data format supported by the server may be voice data encoded with one particular preset codec. For this reason, in one implementation of the embodiments of the present application, after the audio access device obtains the voice data it can perform codec processing on the voice data so that the processed voice data conforms to the voice data format supported by the server, and then send the voice data that has undergone speech codec processing to the server.
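
The text names no particular codec. Purely as an illustration, the sketch below assumes the call arrives as 8-bit G.711 mu-law samples and that the server expects 16-bit little-endian linear PCM, and performs that conversion with the standard mu-law expansion. Both formats are assumptions, as are the function names.

```python
import struct


def ulaw_byte_to_pcm16(u: int) -> int:
    """Expand one 8-bit mu-law code word to a signed 16-bit linear sample."""
    u = ~u & 0xFF                  # mu-law code words are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_pcm16(data: bytes) -> bytes:
    """Convert a mu-law byte stream to 16-bit little-endian PCM bytes."""
    samples = [ulaw_byte_to_pcm16(b) for b in data]
    return struct.pack("<%dh" % len(samples), *samples)
```

Each input byte becomes two output bytes, so the payload sent to the server doubles in size under these assumptions.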

S203: The server converts the received voice data into text.

In the embodiments of the present application, after the server receives the voice data it can convert the voice data into text, so that semantic analysis can subsequently be performed on the converted text. The embodiments do not limit how the server converts voice data into text. As an example, the server can call a speech recognition engine to convert the voice data into text. The speech recognition engine may be provided by a third party or developed in-house; the embodiments of the present application impose no specific limitation.

S204: The server performs semantic analysis on the text to obtain a semantic analysis result, the semantic analysis result indicating whether the voice data meets a preset call specification.

In the embodiments of the present application, the server may perform semantic analysis on all of the converted text, or on only part of it; the embodiments impose no specific limitation. The partial text may, for example, be obtained by preprocessing the converted text. Such preprocessing may, for example, remove text that carries no actual meaning. Text of this kind is often related to the speaker's habits, such as filler words like "uh" that a user inserts while speaking. In the embodiments of the present application, the converted text can be preprocessed first to remove text without actual meaning, and semantic analysis is then performed on the text obtained after preprocessing.
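
A minimal sketch of such preprocessing is shown below. It drops whitespace-separated tokens that are filler words once surrounding punctuation is ignored. The filler list and the function name are illustrative assumptions; a production system would likely work on the recognizer's own token output and in the language of the call.

```python
import string

# Illustrative filler list (an assumption, not taken from the source text).
FILLERS = {"uh", "um", "er", "hmm"}


def strip_fillers(text: str) -> str:
    """Remove filler tokens, keeping all other tokens in their original order."""
    kept = []
    for token in text.split():
        core = token.strip(string.punctuation).lower()
        if core not in FILLERS:
            kept.append(token)
    return " ".join(kept)
```

Applied to `"Uh, I would like, um, to check my bill"`, the function keeps every token except `Uh,` and `um,`.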

The embodiments of the present application do not specifically limit the preset call specification. In practice, customer service agents are generally required to speak in a friendly tone and to show no negative emotion while communicating with users. Therefore, in one implementation of the embodiments of the present application, the preset call specification may be, for example, that the call contains no negative emotion.

Negative emotion in this embodiment is defined relative to normal emotion. Normal emotion means that the mood is fairly calm and does not fluctuate much, for example a friendly tone that stays steady throughout the conversation. Any emotion other than normal emotion is a negative emotion in the sense of this embodiment. For example, a harsh tone during the conversation counts as negative emotion.

In addition, in practice there are certain requirements on what a customer service agent says during a call with a user, namely that the call content must comply with the rules. For this reason, in another implementation of the embodiments of the present application, the preset call specification may be, for example, that the call content complies with the rules. Compliance of the call content can be reflected in at least one of the following two respects. On the one hand, the call contains no prohibited content: for example, the agent must not ask the customer for a favorable review, and the agent must not induce the customer to add the agent on WeChat privately. On the other hand, the call contains the content that must be mentioned: for example, for a business consultation call, the call content must include an introduction to the service, the service charges, and so on.
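
The two sides of content compliance, prohibited content and mandatory content, can be illustrated with a plain substring check. This is a deliberately naive sketch: the phrase lists, the call type key, and the function names are assumptions, and the text leaves the actual analysis to a trained model or a semantic library.

```python
from typing import Dict, List

# Illustrative phrase lists (assumptions, loosely based on the examples above).
FORBIDDEN = ["give us a good review", "add me on wechat"]
REQUIRED_BY_CALL_TYPE = {
    "business_consultation": ["service introduction", "fee"],
}


def check_content(agent_text: str, call_type: str) -> Dict[str, List[str]]:
    """List prohibited phrases that occur and mandatory phrases that are missing."""
    text = agent_text.lower()
    violations = [p for p in FORBIDDEN if p in text]
    missing = [p for p in REQUIRED_BY_CALL_TYPE.get(call_type, []) if p not in text]
    return {"violations": violations, "missing": missing}


def content_compliant(agent_text: str, call_type: str) -> bool:
    """True when nothing prohibited was said and nothing mandatory was omitted."""
    report = check_content(agent_text, call_type)
    return not report["violations"] and not report["missing"]
```

Exact substring matching misses paraphrases, which is one reason the description turns to semantic analysis rather than keyword rules.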

S205: The server sends the semantic analysis result to a terminal device.

After the server obtains the semantic analysis result, it can send the result to a terminal device, and the terminal device can display it. The terminal device here may be a terminal device of the enterprise that is used to supervise the service quality of customer service agents. A user of that terminal, such as a supervisor, can then determine from the displayed semantic analysis result whether the real-time call between the first customer service agent and the user meets the preset call specification. In other words, the supervisor can supervise the service quality of the first customer service agent in real time. As the above description shows, with the scheme provided by the embodiments of the present application the server can analyze the call between the user and the first customer service agent in real time and determine whether it meets the preset call specification. Compared with the after-the-fact spot checks used in the traditional approach, the service quality of the first customer service agent can be supervised in a timely manner.
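
The server-side part of steps S203 to S205 can be summarized as a small pipeline. In the sketch below, the speech recognizer, the semantic analyzer, and the transport to the terminal device are passed in as plain callables, because the text leaves all three open. Every name here is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AnalysisResult:
    agent_id: str
    text: str
    compliant: bool   # whether the call meets the preset call specification


def process_voice_data(agent_id: str,
                       voice_data: bytes,
                       recognize: Callable[[bytes], str],
                       analyze: Callable[[str], bool],
                       send_to_terminal: Callable[[AnalysisResult], None]) -> AnalysisResult:
    """Run speech-to-text, semantic analysis, and result delivery in order."""
    text = recognize(voice_data)                  # S203: voice data to text
    compliant = analyze(text)                     # S204: semantic analysis
    result = AnalysisResult(agent_id, text, compliant)
    send_to_terminal(result)                      # S205: push to the terminal device
    return result
```

Keeping the three stages as injected callables makes it straightforward to swap a third-party recognizer for an in-house one, which the description explicitly allows.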

The method of speech processing provided by the embodiments of the present application has been described above. The specific implementation of "the server carries out semantic analysis on the text to obtain a semantic analysis result" in the aforementioned step S203 is introduced below. In the embodiments of the present application, this step can be implemented in a variety of ways; two possible implementations are introduced below.

The first implementation:

The server can perform semantic analysis on the text using a semantic analysis model trained in advance. The embodiments of the present application do not specifically limit the semantic analysis model; it can be, for example, a machine learning model.

In the embodiments of the present application, the semantic analysis model trained in advance is obtained by training on training text and the labels carried by the training text. The label carried by a training text indicates whether the voice data corresponding to that training text meets the default conversation discipline. The labels corresponding to the training text can be annotated manually.

In the embodiments of the present application, all or part of the text obtained by the aforementioned conversion can be input into the semantic analysis model, and the model outputs the label corresponding to the input text; that is, the model outputs whether the input text meets the default conversation discipline.
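As a rough illustration of the first implementation, the sketch below trains a very small bag-of-words naive Bayes classifier on manually labeled training text and then outputs a label for new text. The application does not name a model type, so the choice of naive Bayes, the class name `TinyNaiveBayes`, and the toy training sentences are all assumptions made only for illustration; the label `True` stands for "meets the default conversation discipline".

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Hypothetical stand-in for the semantic analysis model trained in advance."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of training texts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus log likelihood with add-one smoothing.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Manually labeled training text (toy data, invented for this sketch).
train_texts = [
    "please give me a good review",
    "add my private wechat",
    "our plan costs ten yuan per month",
    "let me introduce the service",
]
train_labels = [False, False, True, True]

model = TinyNaiveBayes().fit(train_texts, train_labels)
meets_discipline = model.predict("please add my wechat")
```

A production system would more likely use a much larger labeled corpus and a stronger model; the point here is only the train-on-labeled-text, predict-a-label flow described above.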

The second implementation:

A semantic library can be constructed in advance. The semantic library contains standard text, where the voice data corresponding to the standard text either meets the default conversation discipline or does not meet it. The embodiments of the present application do not specifically limit how the semantic library is constructed; as an example, a large amount of historical call data can be collected, and the semantic library can be constructed based on that historical data.

The server can match some or all of the text obtained by the aforementioned conversion against the standard text in the semantic library constructed in advance, so as to perform semantic analysis on the text. Specifically, if the voice data corresponding to the standard text meets the default conversation discipline, and some or all of the converted text matches the standard text successfully, the semantic analysis result is that the voice data meets the default conversation discipline. If the voice data corresponding to the standard text does not meet the default conversation discipline, and some or all of the converted text matches the standard text successfully, the semantic analysis result is that the voice data does not meet the default conversation discipline.
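The matching described for the second implementation can be sketched as follows. The names `StandardText` and `match_semantic_library` are hypothetical, and plain case-insensitive substring matching is assumed. The application does not say what to do when both kinds of standard text match, or when nothing matches; this sketch assumes that a non-compliant match takes priority and that no match yields `None` (undetermined).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StandardText:
    """One entry of the semantic library constructed in advance."""
    phrase: str
    compliant: bool  # whether the corresponding voice data meets the discipline


def match_semantic_library(text: str, library: List[StandardText]) -> Optional[bool]:
    """Return False/True when a non-compliant/compliant entry matches, else None."""
    lowered = text.lower()
    matched = [entry for entry in library if entry.phrase.lower() in lowered]
    if any(not entry.compliant for entry in matched):
        return False  # assumption: a non-compliant match takes priority
    if matched:
        return True
    return None  # assumption: no match means the result is undetermined


library = [
    StandardText("add me on wechat", compliant=False),
    StandardText("thank you for calling", compliant=True),
]
result = match_semantic_library("Thank you for calling, how can I help?", library)
```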

In one implementation of the embodiments of the present application, in order to further improve service quality, after the server obtains the semantic analysis result, if it determines that the result indicates that the voice data does not meet the default conversation discipline, the server can interrupt the call between the first contact staff and the user. Specifically, the server can send a hang-up instruction to the audio access device; after receiving the hang-up instruction, the audio access device can cut off the call channel between the call device of the user and the call device of the first contact staff. More specifically, the audio access device can cut off the call channel between the audio access device and the call device of the first contact staff. In this way, further conflict between the user and the first contact staff can be avoided.
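On the server side, the decision described above could look like the following minimal sketch. The function name `build_hangup_instruction`, the JSON message layout, and the field names `type` and `call_id` are assumptions made for illustration; the application does not define a message format.

```python
import json
from typing import Optional


def build_hangup_instruction(call_id: str, meets_discipline: bool) -> Optional[str]:
    """Return a serialized hang-up instruction only when the discipline is violated."""
    if meets_discipline:
        return None  # nothing to send; the call continues
    # Hypothetical message layout, to be sent to the audio access device.
    return json.dumps({"type": "hang_up", "call_id": call_id})


message = build_hangup_instruction("call-1", meets_discipline=False)
```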

In another implementation of the embodiments of the present application, in order to further improve service quality, after the server obtains the semantic analysis result, if it determines that the result indicates that the voice data does not meet the default conversation discipline, the server can interrupt the call between the first contact staff and the user and have the user talk with a second contact staff. Specifically, the server can send a transfer instruction to the audio access device; after receiving the transfer instruction, the audio access device can cut off the call channel between the call device of the user and the call device of the first contact staff, and connect the call channel between the call device of the user and the call device of the second contact staff. More specifically, since the call devices of contact staff access the audio access device by agent-seat access, the audio access device can cut off the call channel between itself and the call device of the first contact staff and then connect the call channel between the call device of the user and the call device of the second contact staff. In this way, further conflict between the user and the first contact staff can be avoided, and the user can continue the call with the second contact staff.
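How an audio access device might react to hang-up and transfer instructions can be sketched as below. This is a toy state model, not a telephony implementation: the class name `AudioAccessDeviceSketch`, the dictionary-based instruction format, and the field `target_agent` are all hypothetical. The user's side of the call is assumed to stay attached to the device throughout, and only the agent side is switched.

```python
from typing import Dict, Optional


class AudioAccessDeviceSketch:
    """Tracks which contact staff is currently bridged to the user on each call."""

    def __init__(self) -> None:
        self.agent_leg: Dict[str, str] = {}  # call_id -> agent currently connected

    def connect(self, call_id: str, agent_id: str) -> None:
        self.agent_leg[call_id] = agent_id

    def handle_instruction(self, instruction: dict) -> Optional[str]:
        """Apply a hang-up or transfer instruction; return the agent now connected."""
        call_id = instruction["call_id"]
        kind = instruction["type"]
        if kind not in ("hang_up", "transfer"):
            raise ValueError(f"unknown instruction type: {kind}")
        # Both instructions first cut the channel to the current contact staff.
        self.agent_leg.pop(call_id, None)
        if kind == "transfer":
            # Then connect the user to the second contact staff.
            self.agent_leg[call_id] = instruction["target_agent"]
        return self.agent_leg.get(call_id)


device = AudioAccessDeviceSketch()
device.connect("call-1", "agent-1")
now_connected = device.handle_instruction(
    {"type": "transfer", "call_id": "call-1", "target_agent": "agent-2"}
)
```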

In the embodiments of the present application, it is considered that, in practical applications, besides determining whether the real time phone call data between the user and the first contact staff meets the default conversation discipline, a supervisor may also wish to obtain that real time phone call data (i.e. the aforementioned voice data) in order to process it further. For this case, in one implementation of the embodiments of the present application, a user such as the aforementioned supervisor can trigger a voice data export operation on the terminal device. Based on the export operation triggered by the supervisor, the terminal device generates a voice data acquisition request and sends it to the server over the network. After receiving the voice data acquisition request, the server can send the voice data to the terminal device, so that the terminal device can process the voice data further.

In addition, in one implementation of the embodiments of the present application, in order to supervise the service quality of the first contact staff, the server can also obtain the semantic analysis results respectively corresponding to multiple pieces of voice data corresponding to the first contact staff, and determine the service quality of the first contact staff based on those semantic analysis results.

In the embodiments of the present application, "the server obtains the semantic analysis results respectively corresponding to multiple pieces of voice data corresponding to the first contact staff" can be implemented, for example, as follows: after obtaining the aforementioned semantic analysis result, the server establishes and saves the correspondence between the semantic analysis result and the identifier of the first contact staff. The server can then use the identifier of the first contact staff as a search condition to obtain the semantic analysis results respectively corresponding to the multiple pieces of voice data corresponding to the first contact staff. Further, the server can determine the service quality of the first contact staff according to those semantic analysis results (i.e. multiple semantic analysis results).
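The save-then-search step can be sketched with an in-memory store keyed by the identifier of the contact staff. The class name `AnalysisResultStore` and its methods are hypothetical; a real server would presumably use a database, which the application does not specify.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class AnalysisResultStore:
    """Keeps the correspondence between contact staff identifiers and results."""

    def __init__(self) -> None:
        # staff identifier -> list of (call identifier, meets_discipline)
        self._by_staff: Dict[str, List[Tuple[str, bool]]] = defaultdict(list)

    def save(self, staff_id: str, call_id: str, meets_discipline: bool) -> None:
        self._by_staff[staff_id].append((call_id, meets_discipline))

    def find_by_staff(self, staff_id: str) -> List[Tuple[str, bool]]:
        """Use the staff identifier as the search condition."""
        return list(self._by_staff.get(staff_id, []))


store = AnalysisResultStore()
store.save("staff-1", "call-1", True)
store.save("staff-1", "call-2", False)
store.save("staff-2", "call-3", True)
results_for_first_staff = store.find_by_staff("staff-1")
```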

The embodiments of the present application do not specifically limit how the server determines the service quality of the first contact staff based on the multiple semantic analysis results. As an example, the service quality of the first contact staff can be determined from the number of semantic analysis results, among the multiple semantic analysis results, that indicate that the voice data meets the default conversation discipline and the number that indicate that it does not. For example, suppose ten semantic analysis results are obtained for the first contact staff in total, of which one indicates that the voice data does not meet the default conversation discipline and the other nine indicate that it does; it can then be determined that the service quality of the first contact staff is relatively good, with a compliance rate of 90%.
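The worked example above (nine compliant results out of ten) corresponds to a simple ratio. The function name `compliance_rate` is hypothetical, and the application does not fix any threshold for what counts as good service quality.

```python
from typing import List


def compliance_rate(results: List[bool]) -> float:
    """Share of semantic analysis results indicating that the discipline was met."""
    if not results:
        raise ValueError("at least one semantic analysis result is required")
    return sum(1 for meets in results if meets) / len(results)


# Ten results for the first contact staff: nine compliant, one not.
rate = compliance_rate([True] * 9 + [False])  # 0.9, i.e. 90%
```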

Example devices

Based on the method for speech processing that above embodiments provide, the embodiment of the present application also provides a kind of server and one kind Audio access device introduces the server and audio access device below in conjunction with attached drawing.

Referring to Fig. 4, the figure is a structural schematic diagram of a server provided by the embodiments of the present application.

The server 400 provided by the embodiments of the present application may, for example, specifically include: a first receiving unit 401, a converting unit 402, a semantic analysis unit 403 and a first transmission unit 404.

First receiving unit 401, for receiving the voice data sent by the audio access device; the voice data is real time phone call data between the user and the first contact staff;

Converting unit 402, for converting the received voice data into text;

Semantic analysis unit 403, for carrying out semantic analysis on the text to obtain a semantic analysis result, the semantic analysis result indicating whether the voice data meets the default conversation discipline;

First transmission unit 404, for sending the semantic analysis result to the terminal device.
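How units 401 to 404 cooperate can be sketched as a small pipeline with pluggable steps. The class name `SpeechProcessingServer` and the three callables are hypothetical placeholders: `to_text` stands in for whatever speech recognition the converting unit uses, `analyze` for either semantic analysis implementation, and `send_to_terminal` for the transport to the terminal device. None of these are named in the application.

```python
from typing import Callable


class SpeechProcessingServer:
    """Receive voice data, convert it to text, analyze it, and report the result."""

    def __init__(
        self,
        to_text: Callable[[bytes], str],           # role of converting unit 402
        analyze: Callable[[str], bool],            # role of semantic analysis unit 403
        send_to_terminal: Callable[[dict], None],  # role of first transmission unit 404
    ) -> None:
        self.to_text = to_text
        self.analyze = analyze
        self.send_to_terminal = send_to_terminal

    def on_voice_data(self, voice_data: bytes) -> bool:
        """Entry point playing the role of first receiving unit 401."""
        text = self.to_text(voice_data)
        meets_discipline = self.analyze(text)
        self.send_to_terminal({"text": text, "meets_discipline": meets_discipline})
        return meets_discipline


# Stub wiring: the "voice data" is just UTF-8 text so the sketch runs without
# a speech recognition engine.
sent = []
server = SpeechProcessingServer(
    to_text=lambda data: data.decode("utf-8"),
    analyze=lambda text: "wechat" not in text.lower(),
    send_to_terminal=sent.append,
)
server.on_voice_data(b"Please add me on WeChat")
```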

Optionally, the server 400 further includes:

Optionally,

Alternatively,

Optionally, the semantic analysis unit 403, is specifically used for:

Optionally, the default conversation discipline, comprising:

Optionally, the server 400 further includes:

A fourth transmission unit, for sending the voice data to the terminal device.

Optionally, first receiving unit 401, is specifically used for:

Optionally, the server 400 further includes:

Since the server 400 is the server corresponding to the method provided by the above method embodiments, the specific implementation of each unit of the server 400 follows the same concept as the above method embodiments. Therefore, for the specific implementation of each unit of the server 400, reference can be made to the description of the server in the above method embodiments, which is not repeated here.

Referring to Fig. 5, the figure is a structural schematic diagram of an audio access device provided by the embodiments of the present application.

The audio access device 500 provided by the embodiments of the present application may, for example, specifically include: an acquiring unit 501 and a transmission unit 502.

Acquiring unit 501, for obtaining voice data; the voice data is real time phone call data between the user and the first contact staff;

Transmission unit 502, for sending the voice data to the server.

Optionally, the audio access device 500 further includes:

A first receiving unit, for receiving the hang-up instruction sent by the server;

Optionally, the audio access device 500 further includes:

A second receiving unit, for receiving the transfer instruction sent by the server;

Alternatively,

Optionally, the transmission unit 502, is specifically used for:

Optionally, the acquiring unit 501 is specifically used for: receiving a recording instruction, and obtaining the voice data based on the recording instruction.

Since the audio access device 500 is the audio access device corresponding to the method provided by the above method embodiments, the specific implementation of each unit of the audio access device 500 follows the same concept as the above method embodiments. Therefore, for the specific implementation of each unit of the audio access device 500, reference can be made to the description of the audio access device in the above method embodiments, which is not repeated here.

Those skilled in the art, after considering the specification and practicing the invention disclosed here, will readily think of other embodiments of the present application. The present application is intended to cover any variations, uses, or adaptations of the application that follow its general principles and include common knowledge or conventional technical means in the art not disclosed in the present disclosure. The description and examples are to be considered illustrative only, and the true scope and spirit of the application are indicated by the following claims.

It should be understood that the present application is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

The foregoing are merely preferred embodiments of the present application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

Claims

1. A method of speech processing, characterized in that the method includes:

A server receives voice data sent by an audio access device; the voice data is real time phone call data between a user and a first contact staff;

The server converts the received voice data into text;

The server carries out semantic analysis on the text to obtain a semantic analysis result, the semantic analysis result indicating whether the voice data meets a default conversation discipline;

The server sends the semantic analysis result to a terminal device.

2. The method according to claim 1, characterized in that the method further includes:

If the semantic analysis result indicates that the voice data does not meet the default conversation discipline, the server sends a hang-up instruction to the audio access device, so that the audio access device cuts off the call channel between the call device of the user and the call device of the first contact staff.

3. The method according to claim 1, characterized in that the method further includes:

If the semantic analysis result indicates that the voice data does not meet the default conversation discipline, the server sends a transfer instruction to the audio access device, so that the audio access device cuts off the call channel between the call device of the user and the call device of the first contact staff, and connects the call channel between the call device of the user and the call device of a second contact staff.

4. The method according to claim 2 or 3, characterized in that

The call device of the first contact staff accesses the audio access device by agent-seat registration, and the audio access device accesses a carrier system by agent-seat registration;

Alternatively,

The call device of the first contact staff accesses the audio access device by agent-seat registration, and the audio access device accesses a computer telephony integration (CTI) system by agent-seat registration.

5. The method according to claim 1, characterized in that the server carrying out semantic analysis on the text includes:

The server carries out semantic analysis on the text using a semantic analysis model trained in advance; the semantic analysis model trained in advance is obtained by training on training text and the labels carried by the training text, and the label carried by a training text indicates whether the voice data corresponding to that training text meets the default conversation discipline.

6. The method according to claim 1, characterized in that the server carrying out semantic analysis on the text includes:

The server matches the text against standard text in a semantic library constructed in advance, so as to carry out semantic analysis on the text;

The voice data corresponding to the standard text meets the default conversation discipline; alternatively, the voice data corresponding to the standard text does not meet the default conversation discipline.

7. The method according to any one of claims 1-6, characterized in that the default conversation discipline includes:

8. The method according to claim 1, characterized in that the method further includes:

The server sends the voice data to the terminal device.

9. The method according to claim 1, characterized in that the server receiving the voice data sent by the audio access device includes:

The server receives voice data that has undergone speech codec processing, sent by the audio access device; the voice data after speech codec processing conforms to a voice data format supported by the server.

10. The method according to claim 1, characterized in that the method further includes:

The server determines the service quality of the first contact staff based on the semantic analysis results respectively corresponding to multiple pieces of voice data corresponding to the first contact staff.

11. A method of speech processing, characterized in that the method includes:

An audio access device obtains voice data; the voice data is real time phone call data between a user and a first contact staff;

The audio access device sends the voice data to a server.

12. The method according to claim 11, characterized in that the method further includes:

The audio access device cuts off, according to the hang-up instruction, the call channel between the call device of the user and the call device of the first contact staff.

13. The method according to claim 11, characterized in that the method further includes:

The audio access device receives a transfer instruction sent by the server;

The audio access device cuts off, based on the transfer instruction, the call channel between the call device of the user and the call device of the first contact staff, and connects the call channel between the call device of the user and the call device of a second contact staff.

14. The method according to claim 12 or 13, characterized in that the call device of the first contact staff accesses the audio access device by agent-seat registration, and the audio access device accesses a carrier system by agent-seat registration;

Alternatively,

15. The method according to claim 11, characterized in that the audio access device sending the voice data to the server includes:

The audio access device carries out speech codec processing on the voice data, so that the processed voice data conforms to a voice data format supported by the server;

The audio access device sends the voice data after speech codec processing to the server.

16. The method according to claim 11, characterized in that the audio access device obtaining voice data includes:

17. A server, characterized in that the server includes:

A first receiving unit, for receiving voice data sent by an audio access device; the voice data is real time phone call data between a user and a first contact staff;

A converting unit, for converting the received voice data into text;

A semantic analysis unit, for carrying out semantic analysis on the text to obtain a semantic analysis result, the semantic analysis result indicating whether the voice data meets a default conversation discipline;

18. The server according to claim 17, characterized in that the server further includes:

A second transmission unit, for sending a hang-up instruction to the audio access device if the semantic analysis result indicates that the voice data does not meet the default conversation discipline, so that the audio access device cuts off the call channel between the call device of the user and the call device of the first contact staff.

19. The server according to claim 17, characterized in that the server further includes:

A third transmission unit, for sending a transfer instruction to the audio access device if the semantic analysis result indicates that the voice data does not meet the default conversation discipline, so that the audio access device cuts off the call channel between the call device of the user and the call device of the first contact staff, and connects the call channel between the call device of the user and the call device of a second contact staff.

20. The server according to claim 18 or 19, characterized in that

Alternatively,

21. The server according to claim 17, characterized in that the semantic analysis unit is specifically used for:

Carrying out semantic analysis on the text using a semantic analysis model trained in advance; the semantic analysis model trained in advance is obtained by training on training text and the labels carried by the training text, and the label carried by a training text indicates whether the voice data corresponding to that training text meets the default conversation discipline.

22. The server according to claim 17, characterized in that the semantic analysis unit is specifically used for:

Matching the text against standard text in a semantic library constructed in advance, so as to carry out semantic analysis on the text;

23. The server according to any one of claims 17-22, characterized in that the default conversation discipline includes:

24. The server according to claim 17, characterized in that the server further includes:

A fourth transmission unit, for sending the voice data to the terminal device.

25. The server according to claim 17, characterized in that the first receiving unit is specifically used for:

Receiving voice data that has undergone speech codec processing, sent by the audio access device; the voice data after speech codec processing conforms to a voice data format supported by the server.

26. The server according to claim 17, characterized in that the server further includes:

An acquiring unit, for obtaining the semantic analysis results respectively corresponding to multiple pieces of voice data corresponding to the first contact staff;

A determination unit, for determining the service quality of the first contact staff based on the semantic analysis results respectively corresponding to the multiple pieces of voice data corresponding to the first contact staff.

27. An audio access device, characterized in that the audio access device includes:

An acquiring unit, for obtaining voice data; the voice data is real time phone call data between a user and a first contact staff;

A transmission unit, for sending the voice data to a server.

28. The audio access device according to claim 27, characterized in that the audio access device further includes:

A first receiving unit, for receiving a hang-up instruction sent by the server;

A hang-up unit, for cutting off, according to the hang-up instruction, the call channel between the call device of the user and the call device of the first contact staff.

29. The audio access device according to claim 27, characterized in that the audio access device further includes:

A second receiving unit, for receiving a transfer instruction sent by the server;

A transfer unit, for cutting off, based on the transfer instruction, the call channel between the call device of the user and the call device of the first contact staff, and connecting the call channel between the call device of the user and the call device of a second contact staff.

30. The audio access device according to claim 28 or 29, characterized in that the call device of the first contact staff accesses the audio access device by agent-seat registration, and the audio access device accesses a carrier system by agent-seat registration;

Alternatively,

31. The audio access device according to claim 27, characterized in that the transmission unit is specifically used for:

Carrying out speech codec processing on the voice data, so that the processed voice data conforms to a voice data format supported by the server;

32. The audio access device according to claim 27, characterized in that the acquiring unit is specifically used for: receiving a recording instruction, and obtaining the voice data based on the recording instruction.