CN108184032A - Service method and device for a customer service system

Service method and device for a customer service system

Info

Publication number
CN108184032A
Authority
CN
China
Prior art keywords
speech
synthesized
customer service agent
text
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611116110.XA
Other languages
Chinese (zh)
Other versions
CN108184032B (en
Inventor
王朝民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Communications Co Ltd
Priority to CN201611116110.XA
Publication of CN108184032A
Application granted
Publication of CN108184032B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/527 Centralised call answering arrangements not requiring operator intervention
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

Abstract

The invention discloses a service method and device for a customer service system. The method includes: receiving a speech synthesis instruction; determining, according to the received speech synthesis instruction, the speech text to be synthesized; synthesizing, according to the determined speech text to be synthesized and a speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, speech for that text that has the timbre characteristics of the customer service agent; and receiving an instruction from the customer service agent and playing, according to the instruction, a sentence composed of the synthesized speech and/or the live speech of the customer service agent. Because what is played to the user is a sentence composed of synthesized speech with the agent's timbre characteristics and/or the agent's live speech, the amount the agent has to speak during manual service is greatly reduced, which relieves the agent's fatigue. In addition, the user assumes that the agent has been speaking throughout, which improves the service quality of the customer service system and enhances the user experience.

Description

Service method and device for a customer service system
Technical field
The present invention relates to the field of communication technology, and in particular to a service method and device for a customer service system.
Background art
At present, the customer service systems of the three major telecom operators, China Mobile, China Unicom and China Telecom, usually consist of machine customer service and human customer service. During telephone service, when a call from a user is received, the machine customer service serves the user first. When the user considers that the machine customer service cannot solve the problem raised, the user manually selects human customer service and consults a human agent.
In such a customer service system, the speech of the machine customer service is rather monotonous and does not sound as vivid as natural speech. Moreover, the machine customer service cannot adapt on the spot and can solve only a limited range of problems, so human customer service plays an important role in telephone service. However, a human agent has to work continuously for 6 hours or more per shift and, during telephone service, has to give a large number of answers and explanations according to the different problems and situations of users, so the agent tires very easily. Fatigue can cause the agent to mispronounce or misspeak, which lowers the quality of customer service and harms the user experience.
Therefore, how to improve the service quality of a customer service system, and thereby improve the user experience, is a technical problem that urgently needs to be solved.
Summary of the invention
Embodiments of the present invention provide a service method and device for a customer service system, to solve the prior-art problem of how to improve the service quality of a customer service system and thereby improve the user experience.
An embodiment of the present invention provides a service method for a customer service system, including:
Receiving a speech synthesis instruction;
Determining, according to the received speech synthesis instruction, the speech text to be synthesized;
Synthesizing, according to the determined speech text to be synthesized and a speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent;
Receiving an instruction from the customer service agent, and playing, according to the instruction, a sentence composed of the synthesized speech and/or the live speech of the customer service agent.
In a possible implementation, in the above service method provided by the embodiments of the present invention, determining the speech text to be synthesized according to the received speech synthesis instruction specifically includes:
Determining whether the speech text to be synthesized that corresponds to the received speech synthesis instruction is a standard script sentence;
If so, determining the standard script sentence corresponding to the speech synthesis instruction as the speech text to be synthesized;
If not, taking the fill-in-the-blank script sentence, after the text carried by the speech synthesis instruction has been filled in, as the speech text to be synthesized.
In a possible implementation, in the above service method provided by the embodiments of the present invention, synthesizing, according to the determined speech text to be synthesized and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent specifically includes:
Segmenting the determined speech text to be synthesized with a text analyzer to obtain a word annotation file corresponding to the speech text to be synthesized;
Determining, according to the word annotation file and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech feature parameters corresponding to the speech text to be synthesized;
Synthesizing, according to the speech feature parameters, the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent.
In a possible implementation, in the above service method provided by the embodiments of the present invention, determining, according to the word annotation file and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech feature parameters corresponding to the speech text to be synthesized specifically includes:
Searching the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call for the speech parameter model corresponding to each word in the word annotation file;
Determining, by a parameter generation algorithm and according to the speech parameter model corresponding to each word, the following parameters for the speech text to be synthesized: LF0, obtained by converting the fundamental frequency information to the log domain; BAP, the average values of the aperiodic component spectrum information over different frequency bands; and LSP, the 18-dimensional line spectral pair parameters extracted within each frame from the vocal tract spectrum information.
In a possible implementation, in the above service method provided by the embodiments of the present invention, synthesizing, according to the speech feature parameters, the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent specifically includes:
Forming a mixed excitation source corresponding to the speech text to be synthesized from the determined LF0 and BAP;
Feeding the determined mixed excitation source into a filter, and controlling the filter with the determined LSP, to synthesize the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent.
In a possible implementation, the above service method provided by the embodiments of the present invention further includes establishing the speech parameter model library with the customer service agent's timbre in the following manner:
Decomposing the original speech waveform files contained in the speech database of the customer service agent to obtain the fundamental frequency information, aperiodic component spectrum information and vocal tract spectrum information of each syllable in the original speech waveform files;
Converting the fundamental frequency information of each syllable to the log domain to obtain LF0;
Averaging the aperiodic component spectrum information of each syllable within each preset frequency band to obtain BAP;
Extracting, within each frame, the 18-dimensional line spectral pair parameters LSP from the vocal tract spectrum information of each syllable;
Establishing, according to the word annotation files corresponding to the original speech waveform files, speech parameter models for the LF0, BAP and LSP determined for each syllable, based on a hidden Markov model;
Performing model clustering and model training on the established speech parameter models to obtain the speech parameter model library with the customer service agent's timbre.
An embodiment of the present invention further provides a service device for a customer service system, including:
A receiving unit, configured to receive a speech synthesis instruction;
A determination unit, configured to determine, according to the received speech synthesis instruction, the speech text to be synthesized;
A synthesis unit, configured to synthesize, according to the determined speech text to be synthesized and a speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent;
A playback unit, configured to receive an instruction from the customer service agent and to play, according to the instruction, a sentence composed of the synthesized speech and/or the live speech of the customer service agent.
In a possible implementation, in the above service device provided by the embodiments of the present invention, the determination unit is specifically configured to determine whether the speech text to be synthesized that corresponds to the received speech synthesis instruction is a standard script sentence; if so, to determine the standard script sentence corresponding to the speech synthesis instruction as the speech text to be synthesized; and if not, to take the fill-in-the-blank script sentence, after the text carried by the speech synthesis instruction has been filled in, as the speech text to be synthesized.
In a possible implementation, in the above service device provided by the embodiments of the present invention, the synthesis unit includes:
A first synthesis subunit, configured to segment the determined speech text to be synthesized with a text analyzer to obtain a word annotation file corresponding to the speech text to be synthesized;
A second synthesis subunit, configured to determine, according to the word annotation file and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech feature parameters corresponding to the speech text to be synthesized;
A third synthesis subunit, configured to synthesize, according to the speech feature parameters, the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent.
In a possible implementation, in the above service device provided by the embodiments of the present invention, the second synthesis subunit is specifically configured to search the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call for the speech parameter model corresponding to each word in the word annotation file; and to determine, by a parameter generation algorithm and according to the speech parameter model corresponding to each word, the following parameters for the speech text to be synthesized: LF0, obtained by converting the fundamental frequency information to the log domain; BAP, the average values of the aperiodic component spectrum information over different frequency bands; and LSP, the 18-dimensional line spectral pair parameters extracted within each frame from the vocal tract spectrum information.
In a possible implementation, in the above service device provided by the embodiments of the present invention, the third synthesis subunit is specifically configured to form a mixed excitation source corresponding to the speech text to be synthesized from the determined LF0 and BAP; and to feed the determined mixed excitation source into a filter and control the filter with the determined LSP, to synthesize the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent.
In a possible implementation, the above service device provided by the embodiments of the present invention further includes a modeling unit, configured to decompose the original speech waveform files contained in the speech database of the customer service agent to obtain the fundamental frequency information, aperiodic component spectrum information and vocal tract spectrum information of each syllable in the original speech waveform files; to convert the fundamental frequency information of each syllable to the log domain to obtain LF0; to average the aperiodic component spectrum information of each syllable within each preset frequency band to obtain BAP; to extract, within each frame, the 18-dimensional line spectral pair parameters LSP from the vocal tract spectrum information of each syllable; to establish, according to the word annotation files corresponding to the original speech waveform files, speech parameter models for the LF0, BAP and LSP determined for each syllable, based on a hidden Markov model; and to perform model clustering and model training on the established speech parameter models to obtain the speech parameter model library with the customer service agent's timbre.
The beneficial effects of the present invention are as follows:
The service method and device for a customer service system provided by the embodiments of the present invention include: receiving a speech synthesis instruction; determining, according to the received speech synthesis instruction, the speech text to be synthesized; synthesizing, according to the determined speech text to be synthesized and a speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent; and receiving an instruction from the customer service agent and playing, according to the instruction, a sentence composed of the synthesized speech and/or the live speech of the customer service agent. Because the speech for the text to be synthesized is obtained from a speech parameter model library established in advance from the timbre of the agent currently answering the call, and because a sentence composed of the synthesized speech and/or the agent's live speech can be played to the user on the agent's instruction, the amount the agent has to speak during manual service is reduced and the agent's fatigue is relieved, which in turn improves the service quality of the customer service system and enhances the user experience. Moreover, what is played to the user is speech with the agent's timbre characteristics, which sounds vivid, so the user does not notice that a machine is taking part in the interaction and assumes that the agent has been speaking throughout. This further improves the service quality of the customer service system and enhances the user experience.
Brief description of the drawings
Fig. 1 is a flowchart of the service method for a customer service system provided by an embodiment of the present invention;
Fig. 2 is a flowchart of synthesizing the speech, with the customer service agent's timbre characteristics, for the speech text to be synthesized in an embodiment of the present invention;
Fig. 3 is a flowchart of establishing the speech parameter model library with the customer service agent's timbre in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the service device for a customer service system provided by an embodiment of the present invention;
Fig. 5 is the framework of the hidden Markov model based parametric speech synthesis system provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of the service device of the customer service system assisting the customer service agent in providing service, provided by an embodiment of the present invention.
Detailed description
Specific embodiments of the service method and device for a customer service system provided by the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
A service method for a customer service system provided by an embodiment of the present invention, as shown in Fig. 1, specifically includes the following steps:
S101, receiving a speech synthesis instruction;
S102, determining, according to the received speech synthesis instruction, the speech text to be synthesized;
S103, synthesizing, according to the determined speech text to be synthesized and a speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent;
S104, receiving an instruction from the customer service agent, and playing, according to the instruction, a sentence composed of the synthesized speech and/or the live speech of the customer service agent.
Specifically, in the above service method provided by the embodiments of the present invention, the speech for the text to be synthesized is obtained from a speech parameter model library established in advance from the timbre of the agent currently answering the call, and a sentence composed of the synthesized speech and/or the agent's live speech can be played to the user on the agent's instruction. This reduces the amount the agent has to speak during manual service and relieves the agent's fatigue, which in turn improves the service quality of the customer service system and enhances the user experience. Moreover, what is played to the user is speech with the agent's timbre characteristics, which sounds vivid, so the user does not notice that a machine is taking part in the interaction and assumes that the agent has been speaking throughout. This further improves the service quality of the customer service system and enhances the user experience.
In a specific implementation, in the above service method provided by the embodiments of the present invention, step S102, determining the speech text to be synthesized according to the received speech synthesis instruction, may be implemented as follows:
Determining whether the speech text to be synthesized that corresponds to the received speech synthesis instruction is a standard script sentence;
If so, determining the standard script sentence corresponding to the speech synthesis instruction as the speech text to be synthesized;
If not, taking the fill-in-the-blank script sentence, after the text carried by the speech synthesis instruction has been filled in, as the speech text to be synthesized.
Specifically, in the above service method provided by the embodiments of the present invention, the standard script sentences in the implementation of step S102 are basic exchanges that the customer service agent uses when serving users by telephone, such as "Happy to be of service to you" and "Please enter your ID card number". While a standard script sentence is being played to the user, if either the user or the customer service agent starts speaking, playback can be stopped at any time, to ensure good interaction between the agent and the user and improve the user experience.
Specifically, in the above service method provided by the embodiments of the present invention, the fill-in-the-blank script sentences in the implementation of step S102 are sentences that have to be assembled according to the user's actual consumption or service status. For example, in "Your current account balance is XX yuan", XX is data from the billing system that has to be inserted into a fixed sentence pattern, after which the sentence is synthesized online by personalized speech synthesis and output as speech. Fill-in-the-blank script sentences can of course be implemented in other ways. Taking the same example, only the fixed part, "Your current account balance is ... yuan", may be synthesized and output, while the balance "XX" is spoken by the agent; no limitation is imposed here.
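The decision in step S102 can be illustrated with a minimal Python sketch. The instruction layout (`script_id`, `fields`), the two script tables and the function name `determine_text` are assumptions made for illustration; the patent does not specify how scripts are stored or how an instruction carries its text.

```python
from string import Template

# Hypothetical script tables; the storage format is an assumption.
STANDARD_SCRIPTS = {
    "greeting": "Happy to be of service to you.",
    "ask_id": "Please enter your ID card number.",
}
FILL_IN_SCRIPTS = {
    "balance": Template("Your current account balance is $amount yuan."),
}

def determine_text(instruction: dict) -> str:
    """Return the speech text to be synthesized for one synthesis instruction."""
    script_id = instruction["script_id"]
    if script_id in STANDARD_SCRIPTS:
        # Standard script sentence: used as is.
        return STANDARD_SCRIPTS[script_id]
    # Fill-in-the-blank script sentence: insert the text carried by the instruction.
    return FILL_IN_SCRIPTS[script_id].substitute(instruction["fields"])

print(determine_text({"script_id": "greeting"}))
print(determine_text({"script_id": "balance", "fields": {"amount": "35.60"}}))
```

The alternative mentioned in the text, where the agent speaks the balance, would simply synthesize the fixed part of the template and leave the variable part out.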
In a specific implementation, in the above service method provided by the embodiments of the present invention, step S103, synthesizing, according to the determined speech text to be synthesized and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent, may, as shown in Fig. 2, include the following steps:
S201, segmenting the determined speech text to be synthesized with a text analyzer to obtain a word annotation file corresponding to the speech text to be synthesized;
S202, determining, according to the word annotation file and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech feature parameters corresponding to the speech text to be synthesized;
S203, synthesizing, according to the speech feature parameters, the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent.
Specifically, in the above service method provided by the embodiments of the present invention, take "Happy to be of service to you" (很高兴为您服务) as the speech text to be synthesized. The text analyzer yields the individual units "很", "高", "兴", "为", "您", "服", "务" and the annotation file corresponding to each. Using the annotation files, the speech feature parameters corresponding to each of these units can then be found in the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call. Finally, from the speech feature parameters found, the utterance "Happy to be of service to you" can be synthesized with the timbre characteristics of the customer service agent.
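Steps S201 and S202 can be sketched as a segmentation followed by a lookup. The sketch assumes the example sentence is 很高兴为您服务, which is inferred from the seven per-character units listed in the translation, and it uses placeholder strings where the patent would hold trained speech parameter models. The names `annotate`, `find_models` and `MODEL_LIBRARY` are hypothetical.

```python
# Placeholder library: one entry per character unit. In the patent each entry
# would be a trained speech parameter model; strings stand in for them here.
MODEL_LIBRARY = {ch: f"model[{ch}]" for ch in "很高兴为您服务"}

def annotate(text):
    """Split the text into per-character units and record their positions."""
    return [{"position": i, "unit": ch} for i, ch in enumerate(text)]

def find_models(annotations, library):
    """Look up the model for every unit; fail loudly if one is missing."""
    missing = [a["unit"] for a in annotations if a["unit"] not in library]
    if missing:
        raise KeyError(f"no speech parameter model for: {missing}")
    return [library[a["unit"]] for a in annotations]

annotations = annotate("很高兴为您服务")
models = find_models(annotations, MODEL_LIBRARY)
print(len(models), "models found")
```

A real text analyzer would also emit pronunciation and prosodic context for each unit; the sketch keeps only the unit and its position.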
In a specific implementation, in the above service method provided by the embodiments of the present invention, step S202, determining, according to the word annotation file and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech feature parameters corresponding to the speech text to be synthesized, may be implemented as follows:
Searching the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call for the speech parameter model corresponding to each word in the word annotation file;
Determining, by a parameter generation algorithm and according to the speech parameter model corresponding to each word, the following parameters for the speech text to be synthesized: LF0, obtained by converting the fundamental frequency information to the log domain; BAP, the average values of the aperiodic component spectrum information over different frequency bands; and LSP, the 18-dimensional line spectral pair parameters extracted within each frame from the vocal tract spectrum information.
Specifically, in the above service method provided by the embodiments of the present invention, to improve the quality of the synthesized speech, the BAP in the implementation of step S202 may be obtained by averaging the aperiodic component spectrum Ap over 5 frequency bands. The 5 frequency bands may be 0~1000 Hz, 1000~2000 Hz, 2000~4000 Hz, 4000~6000 Hz and 6000~8000 Hz; no limitation is imposed here.
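The band averaging that produces BAP can be sketched as follows. The band edges come from the text; the 16 kHz sample rate, the evenly spaced spectrum layout and the function name `band_aperiodicity` are assumptions, since the patent does not state how the aperiodic spectrum Ap is sampled.

```python
# Band edges in Hz, as listed in the text.
BANDS_HZ = [(0, 1000), (1000, 2000), (2000, 4000), (4000, 6000), (6000, 8000)]

def band_aperiodicity(ap, sample_rate=16000):
    """Average one frame's aperiodic spectrum `ap` over the five bands.

    `ap` is assumed to hold values at evenly spaced frequencies from 0 Hz
    up to the Nyquist frequency (sample_rate / 2), inclusive.
    """
    nyquist = sample_rate / 2
    step = nyquist / (len(ap) - 1)  # frequency spacing between bins
    bap = []
    for low, high in BANDS_HZ:
        # Bands are half-open [low, high); the last band also takes the Nyquist bin.
        values = [
            v for i, v in enumerate(ap)
            if low <= i * step < high or (high == nyquist and i * step == nyquist)
        ]
        bap.append(sum(values) / len(values))
    return bap

# Nine bins at 0, 1000, ..., 8000 Hz.
print(band_aperiodicity([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

Reducing a full aperiodic spectrum to five numbers per frame keeps the feature vector small, which is presumably why the band averages rather than the raw spectrum are modeled.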
In a specific implementation, in the above service method provided by the embodiments of the present invention, step S203, synthesizing, according to the speech feature parameters, the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent, may be implemented as follows:
Forming a mixed excitation source corresponding to the speech text to be synthesized from the determined LF0 and BAP;
Feeding the determined mixed excitation source into a filter, and controlling the filter with the determined LSP, to synthesize the speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent.
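A heavily simplified, single-frame sketch of step S203 follows. It makes several assumptions that are not in the patent: the excitation blends a pulse train and Gaussian noise with one aperiodicity weight instead of five band-wise BAP values, the filter is a plain all-pole filter 1/A(z), and A(z) is derived from the LSP frequencies by the textbook sum of a symmetric and an antisymmetric polynomial. All function names are hypothetical.

```python
import math
import random

def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def lsp_to_lpc(lsp):
    """Convert ascending LSP frequencies (radians, even count) to A(z) coefficients."""
    p_poly, q_poly = [1.0, 1.0], [1.0, -1.0]  # (1 + z^-1) and (1 - z^-1)
    for i, w in enumerate(lsp):
        factor = [1.0, -2.0 * math.cos(w), 1.0]
        if i % 2 == 0:
            p_poly = poly_mul(p_poly, factor)  # 1st, 3rd, ... frequencies
        else:
            q_poly = poly_mul(q_poly, factor)  # 2nd, 4th, ... frequencies
    # A(z) = (P(z) + Q(z)) / 2; the highest-order terms cancel, so drop the last one.
    return [(p + q) / 2.0 for p, q in zip(p_poly, q_poly)][:-1]

def mixed_excitation(lf0, aperiodicity, n_samples, sample_rate=16000, seed=0):
    """Blend a pulse train at exp(lf0) Hz with Gaussian noise (one global weight)."""
    rng = random.Random(seed)
    period = max(1, round(sample_rate / math.exp(lf0)))  # samples per pitch period
    signal = []
    for t in range(n_samples):
        pulse = 1.0 if t % period == 0 else 0.0
        noise = rng.gauss(0.0, 1.0)
        signal.append((1.0 - aperiodicity) * pulse + aperiodicity * noise)
    return signal

def all_pole_filter(excitation, a):
    """Run the excitation through 1 / A(z), assuming a[0] == 1."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * out[n - k]
        out.append(acc)
    return out

# One frame with an 18-dimensional LSP vector. Evenly spaced LSPs correspond to
# a flat spectrum, so they serve only as a neutral placeholder here.
lsp = [math.pi * (i + 1) / 19 for i in range(18)]
excitation = mixed_excitation(lf0=math.log(200.0), aperiodicity=0.1, n_samples=80)
frame = all_pole_filter(excitation, lsp_to_lpc(lsp))
```

A full synthesizer would update LF0, BAP and LSP frame by frame and shape the pulse and noise components per band before mixing; the sketch shows only how the three parameter streams fit together.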
In a specific implementation, the above service method provided by the embodiments of the present invention may further include establishing the speech parameter model library with the customer service agent's timbre in the following manner, as shown in Fig. 3:
S301, decomposing the original speech waveform files contained in the speech database of the customer service agent to obtain the fundamental frequency information, aperiodic component spectrum information and vocal tract spectrum information of each syllable in the original speech waveform files;
S302, converting the fundamental frequency information of each syllable to the log domain to obtain LF0;
S303, averaging the aperiodic component spectrum information of each syllable within each preset frequency band to obtain BAP, where the preset frequency bands may be 0~1000 Hz, 1000~2000 Hz, 2000~4000 Hz, 4000~6000 Hz and 6000~8000 Hz, without limitation here;
S304, extracting, within each frame, the 18-dimensional line spectral pair parameters LSP from the vocal tract spectrum information of each syllable;
S305, establishing, according to the word annotation files corresponding to the original speech waveform files, speech parameter models for the LF0, BAP and LSP determined for each syllable, based on a hidden Markov model;
S306, performing model clustering and model training on the established speech parameter models to obtain the speech parameter model library with the customer service agent's timbre.
It should be noted that the order of steps S302 to S304 in the above service method provided by the embodiments of the present invention may be changed and is not limited to the order described above.
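The log-domain conversion of step S302 is a one-line transform, sketched below. The patent does not say how unvoiced frames, where no fundamental frequency exists, are represented; the sentinel value `UNVOICED` and the convention that an F0 of 0 marks an unvoiced frame are assumptions borrowed from common practice.

```python
import math

UNVOICED = -1e10  # assumed sentinel for frames without a fundamental frequency

def to_lf0(f0_hz):
    """Convert per-frame F0 values in Hz to the log domain (LF0)."""
    return [math.log(f) if f > 0 else UNVOICED for f in f0_hz]

def from_lf0(lf0):
    """Invert the conversion; unvoiced frames map back to 0 Hz."""
    return [math.exp(v) if v != UNVOICED else 0.0 for v in lf0]

lf0_track = to_lf0([0.0, 180.0, 200.0, 0.0])
print(lf0_track)
```

Because each of LF0, BAP and LSP is computed independently from the decomposed waveform, the three extraction steps can run in any order, which matches the note on S302 to S304.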
Based on the same inventive concept, an embodiment of the present invention further provides a service device for a customer service system. Because the principle by which the service device solves the problem is similar to that of the above service method, the implementation of the service device may refer to the implementation of the service method, and repeated details are omitted.
The service device for a customer service system provided by an embodiment of the present invention, as shown in Fig. 4, may include:
A receiving unit 401, configured to receive a speech synthesis instruction;
A determination unit 402, configured to determine, according to the received speech synthesis instruction, the speech text to be synthesized;
A synthesis unit 403, configured to synthesize, according to the determined speech text to be synthesized and a speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, speech for the speech text to be synthesized that has the timbre characteristics of the customer service agent;
A playback unit 404, configured to receive an instruction from the customer service agent and to play, according to the instruction, a sentence composed of the synthesized speech and/or the live speech of the customer service agent.
In the specific implementation, in above-mentioned service unit provided in an embodiment of the present invention, determination unit 402 specifically can be with For determining that the phonetic synthesis received instructs whether corresponding speech text to be synthesized is standard words term sentence;It if so, will Phonetic synthesis instructs corresponding standard words term sentence to be determined as speech text to be synthesized;If it is not, it will then insert phonetic synthesis instruction Formula of filling a vacancy after the text of carrying talks about term sentence as speech text to be synthesized.
In a specific implementation of the above service device provided by the embodiment of the present invention, the synthesis unit 403 may include:
A first synthesizing subunit 4031, configured to segment the determined text to be synthesized into words using a text analyzer, obtaining a text label file corresponding to the text to be synthesized;
A second synthesizing subunit 4032, configured to determine the speech feature parameters corresponding to the text to be synthesized, according to the text label file and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call;
A third synthesizing subunit 4033, configured to synthesize, according to the speech feature parameters, speech of the text to be synthesized that has the agent's timbre characteristics.
In a specific implementation of the above service device provided by the embodiment of the present invention, the second synthesizing subunit 4032 may specifically be configured to look up, in the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech parameter model corresponding to each word in the text label file; and, according to the speech parameter model of each word, to determine by a parameter generation algorithm the following parameters for the text to be synthesized: LF0, obtained by converting the fundamental frequency information to the log domain; BAP, the average of the aperiodic component spectrum information over different frequency bands; and the 18-dimensional line spectral pair (LSP) parameters, extracted frame by frame from the vocal tract spectrum information.
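The lookup step can be pictured with a toy sketch. It is deliberately much simpler than an HMM parameter generation algorithm: it assumes each word maps to a list of states that store only a mean LF0 and a duration in frames, and it emits the state means without any smoothing. `StateModel`, `MODEL_LIBRARY` and `generate_lf0` are hypothetical names, and the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class StateModel:
    mean_lf0: float  # mean log F0 of this state
    frames: int      # expected duration of this state, in frames


# Hypothetical model library: one list of states per word.
MODEL_LIBRARY = {
    "hello": [StateModel(5.0, 3), StateModel(5.1, 2)],
    "welcome": [StateModel(4.9, 4)],
}


def generate_lf0(words, library=MODEL_LIBRARY):
    """Look up each word's model and emit a frame-level LF0 track."""
    track = []
    for word in words:
        for state in library[word]:  # raises KeyError if a word has no model
            # Repeat the state mean for the state's duration.
            track.extend([state.mean_lf0] * state.frames)
    return track
```

A real system would generate BAP and LSP tracks in the same pass and would smooth the trajectories; this sketch only shows the per-word lookup followed by frame-level expansion.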
In a specific implementation of the above service device provided by the embodiment of the present invention, the third synthesizing subunit 4033 may specifically be configured to form a mixed excitation source corresponding to the text to be synthesized from the determined LF0 and BAP; and to feed the mixed excitation source into a filter controlled by the determined LSP, thereby synthesizing speech of the text to be synthesized that has the agent's timbre characteristics.
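As a rough illustration of how LF0 and BAP could drive a mixed excitation, the sketch below blends a pulse train at the pitch rate with white noise. It is a simplification under stated assumptions, not the patent's method: the per-band BAP values are collapsed into one broadband noise weight, BAP values are assumed to lie in [0, 1], unvoiced frames are marked with `None`, and the LSP-controlled filter is not modelled. The function name and its parameters (`frame_len`, `sample_rate`, `seed`) are hypothetical.

```python
import math
import random


def mixed_excitation(lf0, bap, frame_len=80, sample_rate=16000, seed=0):
    """Build an excitation signal frame by frame.

    lf0: per-frame log F0, or None for an unvoiced frame.
    bap: per-frame list of band aperiodicity values, assumed in [0, 1].
    """
    rng = random.Random(seed)
    out = []
    phase = 0.0  # fractional position inside the current pitch period
    for frame_lf0, frame_bap in zip(lf0, bap):
        voiced = frame_lf0 is not None
        f0 = math.exp(frame_lf0) if voiced else 0.0
        # Simplification: one broadband noise weight instead of per-band mixing.
        noise_w = sum(frame_bap) / len(frame_bap) if voiced else 1.0
        for _ in range(frame_len):
            pulse = 0.0
            if voiced:
                phase += f0 / sample_rate
                if phase >= 1.0:
                    phase -= 1.0
                    # Scale so the pulse train has roughly unit power.
                    pulse = math.sqrt(sample_rate / f0)
            noise = rng.gauss(0.0, 1.0)
            out.append((1.0 - noise_w) * pulse + noise_w * noise)
    return out
```

With all BAP values at zero a voiced frame yields a pure pulse train, and an unvoiced frame yields pure noise; intermediate values blend the two.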
In a specific implementation of the above service device provided by the embodiment of the present invention, the device may further include a modeling unit 405, configured to: decompose the original speech waveform files contained in the customer service agent's speech database, obtaining the fundamental frequency information, aperiodic component spectrum information and vocal tract spectrum information of each syllable in the original speech waveform files; convert the fundamental frequency information of each syllable to the log domain to obtain LF0; average the aperiodic component spectrum information of each syllable within each preset frequency band to obtain BAP; extract 18-dimensional line spectral pair (LSP) parameters frame by frame from the vocal tract spectrum information of each syllable; according to the text label file corresponding to the original speech waveform files, build a speech parameter model from the LF0, BAP and LSP determined for each syllable using a hidden Markov model; and, after performing model clustering and model training on the speech parameter models so built, obtain the speech parameter model library having the agent's timbre.
For a better understanding of the technical solution, the present invention provides a specific embodiment of establishing the speech parameter model library having the agent's timbre and of synthesizing speech of the text to be synthesized that has the agent's timbre characteristics in the above service method, namely a parametric speech synthesis system framework based on hidden Markov models, as shown in Fig. 5:
Part A of Fig. 5 shows a specific embodiment of establishing the speech parameter model library having the agent's timbre. The speech database of the target customer service agent contains original speech waveform files in WAV format and the corresponding label files. The original speech waveform files are decomposed by adaptive interpolation of the weighted spectrum, i.e. the STRAIGHT analysis technique, into source information and vocal tract information, where the source information comprises the fundamental frequency F0 and the aperiodic component spectrum AP, and the vocal tract information is the vocal tract spectrum SP. These are then processed further: F0 is converted to the log domain to obtain LF0; AP is averaged over 5 frequency bands to obtain BAP, the 5 bands being 0-1000 Hz, 1000-2000 Hz, 2000-4000 Hz, 4000-6000 Hz and 6000-8000 Hz; and 18-dimensional line spectral pair (LSP) parameters are extracted frame by frame from SP. Finally, with reference to the label files, hidden Markov models are built on the combined LF0, BAP and LSP parameters to establish the speech parameter models; model clustering and model training are then performed on the established models, and this cycle is repeated about 3 times to obtain the speech parameter models of the target customer service agent.
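The two simplest post-processing steps of Part A, converting F0 to the log domain and averaging AP over the five listed bands, can be sketched as follows. The sketch makes several assumptions that the description does not state: a 16 kHz sampling rate (so that the top band ends at the Nyquist frequency), an aperiodicity spectrum sampled at evenly spaced frequencies from 0 Hz to Nyquist, a plain arithmetic mean, and a sentinel value for unvoiced frames. The function names are hypothetical; STRAIGHT analysis and LSP extraction are not shown.

```python
import math

# Band edges in Hz, as listed in the description.
BANDS_HZ = [(0, 1000), (1000, 2000), (2000, 4000), (4000, 6000), (6000, 8000)]


def to_lf0(f0_hz, unvoiced=-1e10):
    """Convert per-frame F0 (Hz) to log F0.

    Frames with F0 <= 0 are treated as unvoiced and get a sentinel value
    (the sentinel itself is an assumption of this sketch).
    """
    return [math.log(f) if f > 0 else unvoiced for f in f0_hz]


def band_aperiodicity(ap_frame, sample_rate=16000):
    """Average one frame of aperiodicity spectrum over the five bands.

    ap_frame holds values at evenly spaced frequencies from 0 Hz to the
    Nyquist frequency inclusive, so it needs at least two bins.
    """
    step = (sample_rate / 2) / (len(ap_frame) - 1)  # Hz between adjacent bins
    bap = []
    for i, (lo, hi) in enumerate(BANDS_HZ):
        is_last = i == len(BANDS_HZ) - 1
        # The last band also takes the bin sitting exactly on its upper edge.
        values = [a for k, a in enumerate(ap_frame)
                  if lo <= k * step and (k * step < hi or is_last)]
        if not values:
            raise ValueError("too few spectrum bins to cover every band")
        bap.append(sum(values) / len(values))
    return bap
```

Each frame thus contributes one LF0 value and five BAP values, which together with the 18 LSP coefficients would form the observation vector that the hidden Markov models are trained on.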
Part B of Fig. 5 shows a specific embodiment of synthesizing speech of the text to be synthesized that has the agent's timbre characteristics. The text to be synthesized is passed through a text analyzer to obtain a label file in the format required for synthesis. Then, using the target agent's speech parameter model library obtained in Part A of Fig. 5, the speech feature parameters corresponding to the text to be synthesized, namely LF0, BAP and LSP, are found. Finally, a mixed excitation source corresponding to the text to be synthesized is formed from LF0 and BAP; the mixed excitation source is fed into a filter controlled by the determined LSP, and speech of the text having the agent's timbre characteristics is synthesized.
In addition, the present invention also provides a specific embodiment in which a customer service agent delivers voice service using the above service method and service device, as shown in Fig. 6:
After the customer service agent picks up the user's call, text to be synthesized, such as standard script sentences and fill-in-the-blank script sentences, can be synthesized by the above service device into speech with the agent's timbre and played to the user. For example, the standard script sentence "Hello, I am glad to be of service" is synthesized by the service device into speech with the agent's timbre and played to the user. As another example, when the user needs to handle or change a service, the service device can generate speech with the agent's timbre for the standard script sentence "Please enter your ID card number" and play it to the user. To ensure a good user experience, playback can be stopped at any time. When the user asks about the account balance, the balance data XX of the inquiring user is retrieved from the billing system and filled into the fixed sentence pattern "Your current account balance is __ yuan"; the completed sentence "Your current account balance is XX yuan" is then synthesized and output by the service device. As can be seen, the agent only needs to speak live where the reply must be adjusted on the fly to the way the user communicates, for example basic exchanges such as "OK" or "Here is the situation". In the two cases above (standard and fill-in-the-blank script sentences), speech with the agent's own timbre characteristics is played to the customer, the user still perceives the agent as the one speaking, and the experience is good.
The service method and device of the customer service system provided by the embodiments of the present invention include: receiving a speech synthesis instruction; determining the text to be synthesized according to the received speech synthesis instruction; synthesizing speech of the text to be synthesized that has the agent's timbre characteristics, according to the determined text and a speech parameter model library established in advance from the timbre of the customer service agent currently answering the call; and receiving an instruction from the customer service agent and, according to the instruction, playing a sentence composed of the synthesized speech and/or the agent's live speech. Because speech of the text to be synthesized that has the agent's timbre characteristics is obtained from the speech parameter model library established in advance from the timbre of the agent currently answering the call, and a sentence composed of the synthesized speech and/or the agent's live speech can be played to the user according to the agent's instruction, the amount the agent has to speak during service is reduced and the agent's fatigue is relieved, which in turn improves the service quality of the customer service system and enhances the user experience. Moreover, what is played to the user is speech with the agent's timbre characteristics, which sounds natural, so that the user does not perceive that a machine is involved in the interaction and assumes the agent is speaking throughout. This further improves the service quality of the customer service system and enhances the user experience.
In addition, personalized speech synthesis is a technique that synthesizes a target speaker's voice by building a model of that speaker's speech features. The technique first collects recordings with a certain phoneme coverage, then extracts speech features that capture the speaker's characteristics and builds a feature model of the target speaker. For any given text, the model then generates the speech parameter features of that text, and a vocoder finally synthesizes speech of the text with the target speaker's characteristics. Current speech synthesis techniques are mainly waveform-concatenation synthesis and parametric synthesis.
However, speech synthesis is currently used in the customer service field only for voice announcements and has not been widely used in other customer service applications. The service method and device of the customer service system provided by the embodiments of the present invention open up a new application scenario for speech synthesis in the customer service field: using personalized speech synthesis during inbound and outbound customer service calls greatly reduces the workload of customer service agents, thereby improving the service quality and user experience of the customer service system, and has broad application prospects.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (12)

1. A service method for a customer service system, characterized by comprising:
receiving a speech synthesis instruction;
determining text to be synthesized according to the received speech synthesis instruction;
synthesizing, according to the determined text to be synthesized and a speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, speech of the text to be synthesized that has the timbre characteristics of the customer service agent;
receiving an instruction from the customer service agent, and playing, according to the instruction, a sentence composed of the synthesized speech and/or live speech of the customer service agent.
2. The service method according to claim 1, characterized in that determining the text to be synthesized according to the received speech synthesis instruction specifically comprises:
determining whether the text to be synthesized that corresponds to the received speech synthesis instruction is a standard script sentence;
if so, taking the standard script sentence corresponding to the speech synthesis instruction as the text to be synthesized;
if not, taking the fill-in-the-blank script sentence, after the text carried by the speech synthesis instruction has been filled into it, as the text to be synthesized.
3. The service method according to claim 1, characterized in that synthesizing, according to the determined text to be synthesized and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, speech of the text to be synthesized that has the timbre characteristics of the customer service agent specifically comprises:
segmenting the determined text to be synthesized into words using a text analyzer, to obtain a text label file corresponding to the text to be synthesized;
determining speech feature parameters corresponding to the text to be synthesized, according to the text label file and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call;
synthesizing, according to the speech feature parameters, speech of the text to be synthesized that has the timbre characteristics of the customer service agent.
4. The service method according to claim 3, characterized in that determining the speech feature parameters corresponding to the text to be synthesized, according to the text label file and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, specifically comprises:
looking up, in the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech parameter model corresponding to each word in the text label file;
determining, according to the speech parameter model corresponding to each word and by a parameter generation algorithm, the following parameters corresponding to the text to be synthesized: LF0, obtained by converting fundamental frequency information to the log domain; BAP, the average of aperiodic component spectrum information over different frequency bands; and 18-dimensional line spectral pair (LSP) parameters, extracted frame by frame from vocal tract spectrum information.
5. The service method according to claim 4, characterized in that synthesizing, according to the speech feature parameters, speech of the text to be synthesized that has the timbre characteristics of the customer service agent specifically comprises:
forming a mixed excitation source corresponding to the text to be synthesized from the determined LF0 and BAP;
feeding the determined mixed excitation source into a filter, and controlling the filter by the determined LSP, to synthesize speech of the text to be synthesized that has the timbre characteristics of the customer service agent.
6. The service method according to any one of claims 1 to 5, characterized by further comprising establishing the speech parameter model library having the timbre of the customer service agent as follows:
decomposing the original speech waveform files contained in the speech database of the customer service agent, to obtain fundamental frequency information, aperiodic component spectrum information and vocal tract spectrum information of each syllable in the original speech waveform files;
converting the fundamental frequency information of each syllable to the log domain to obtain LF0;
averaging the aperiodic component spectrum information of each syllable within each preset frequency band to obtain BAP;
extracting 18-dimensional line spectral pair (LSP) parameters frame by frame from the vocal tract spectrum information of each syllable;
building, according to the text label file corresponding to the original speech waveform files, a speech parameter model from the LF0, BAP and LSP determined for each syllable using a hidden Markov model;
performing model clustering and model training on the speech parameter models so built, to obtain the speech parameter model library having the timbre of the customer service agent.
7. A service device for a customer service system, characterized by comprising:
a receiving unit, configured to receive a speech synthesis instruction;
a determination unit, configured to determine text to be synthesized according to the received speech synthesis instruction;
a synthesis unit, configured to synthesize, according to the determined text to be synthesized and a speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, speech of the text to be synthesized that has the timbre characteristics of the customer service agent;
a playback unit, configured to receive an instruction from the customer service agent and to play, according to the instruction, a sentence composed of the synthesized speech and/or live speech of the customer service agent.
8. The service device according to claim 7, characterized in that the determination unit is specifically configured to determine whether the text to be synthesized that corresponds to the received speech synthesis instruction is a standard script sentence; if so, to take the standard script sentence corresponding to the speech synthesis instruction as the text to be synthesized; if not, to take the fill-in-the-blank script sentence, after the text carried by the speech synthesis instruction has been filled into it, as the text to be synthesized.
9. The service device according to claim 7, characterized in that the synthesis unit comprises:
a first synthesizing subunit, configured to segment the determined text to be synthesized into words using a text analyzer, to obtain a text label file corresponding to the text to be synthesized;
a second synthesizing subunit, configured to determine speech feature parameters corresponding to the text to be synthesized, according to the text label file and the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call;
a third synthesizing subunit, configured to synthesize, according to the speech feature parameters, speech of the text to be synthesized that has the timbre characteristics of the customer service agent.
10. The service device according to claim 9, characterized in that the second synthesizing subunit is specifically configured to look up, in the speech parameter model library established in advance from the timbre of the customer service agent currently answering the call, the speech parameter model corresponding to each word in the text label file; and to determine, according to the speech parameter model corresponding to each word and by a parameter generation algorithm, the following parameters corresponding to the text to be synthesized: LF0, obtained by converting fundamental frequency information to the log domain; BAP, the average of aperiodic component spectrum information over different frequency bands; and 18-dimensional line spectral pair (LSP) parameters, extracted frame by frame from vocal tract spectrum information.
11. The service device according to claim 10, characterized in that the third synthesizing subunit is specifically configured to form a mixed excitation source corresponding to the text to be synthesized from the determined LF0 and BAP; and to feed the determined mixed excitation source into a filter and control the filter by the determined LSP, to synthesize speech of the text to be synthesized that has the timbre characteristics of the customer service agent.
12. The service device according to any one of claims 7 to 11, characterized by further comprising a modeling unit, configured to: decompose the original speech waveform files contained in the speech database of the customer service agent, to obtain fundamental frequency information, aperiodic component spectrum information and vocal tract spectrum information of each syllable in the original speech waveform files; convert the fundamental frequency information of each syllable to the log domain to obtain LF0; average the aperiodic component spectrum information of each syllable within each preset frequency band to obtain BAP; extract 18-dimensional line spectral pair (LSP) parameters frame by frame from the vocal tract spectrum information of each syllable; build, according to the text label file corresponding to the original speech waveform files, a speech parameter model from the LF0, BAP and LSP determined for each syllable using a hidden Markov model; and perform model clustering and model training on the speech parameter models so built, to obtain the speech parameter model library having the timbre of the customer service agent.
CN201611116110.XA 2016-12-07 2016-12-07 Service method and device of customer service system Active CN108184032B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611116110.XA CN108184032B (en) 2016-12-07 2016-12-07 Service method and device of customer service system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611116110.XA CN108184032B (en) 2016-12-07 2016-12-07 Service method and device of customer service system

Publications (2)

Publication Number Publication Date
CN108184032A true CN108184032A (en) 2018-06-19
CN108184032B CN108184032B (en) 2020-02-21

Family

ID=62544670

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611116110.XA Active CN108184032B (en) 2016-12-07 2016-12-07 Service method and device of customer service system

Country Status (1)

Country Link
CN (1) CN108184032B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1336750A (en) * 2000-07-27 2002-02-20 霈捷科技股份有限公司 Multiplex telephone service system and mechanism
US20100145702A1 (en) * 2005-09-21 2010-06-10 Amit Karmarkar Association of context data with a voice-message component
CN102231275A (en) * 2011-06-01 2011-11-02 北京宇音天下科技有限公司 Embedded speech synthesis method based on weighted mixed excitation
CN103065619A (en) * 2012-12-26 2013-04-24 安徽科大讯飞信息科技股份有限公司 Speech synthesis method and speech synthesis system
CN105261355A (en) * 2015-09-02 2016-01-20 百度在线网络技术(北京)有限公司 Voice synthesis method and apparatus
CN105304080A (en) * 2015-09-22 2016-02-03 科大讯飞股份有限公司 Speech synthesis device and speech synthesis method

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109785823A (en) * 2019-01-22 2019-05-21 中财颐和科技发展(北京)有限公司 Phoneme synthesizing method and system
CN109933658A (en) * 2019-03-21 2019-06-25 中国联合网络通信集团有限公司 Customer service speaking analysis method and device
CN109933658B (en) * 2019-03-21 2021-05-11 中国联合网络通信集团有限公司 Customer service call analysis method and device
CN110085209A (en) * 2019-04-11 2019-08-02 广州多益网络股份有限公司 A kind of tone color screening technique and device
CN110085209B (en) * 2019-04-11 2021-07-23 广州多益网络股份有限公司 Tone screening method and device
CN110610720A (en) * 2019-09-19 2019-12-24 北京搜狗科技发展有限公司 Data processing method and device and data processing device
CN113808576A (en) * 2020-06-16 2021-12-17 阿里巴巴集团控股有限公司 Voice conversion method, device and computer system
CN111883133A (en) * 2020-07-20 2020-11-03 深圳乐信软件技术有限公司 Customer service voice recognition method, customer service voice recognition device, customer service voice recognition server and storage medium
CN111883133B (en) * 2020-07-20 2023-08-29 深圳乐信软件技术有限公司 Customer service voice recognition method, customer service voice recognition device, server and storage medium
CN112988998A (en) * 2021-03-15 2021-06-18 中国联合网络通信集团有限公司 Response method and device
CN112988998B (en) * 2021-03-15 2023-06-16 中国联合网络通信集团有限公司 Response method and device

Also Published As

Publication number Publication date
CN108184032B (en) 2020-02-21

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100053 53a, xibianmennei street, Xuanwu District, Beijing

Patentee after: CHINA MOBILE COMMUNICATION LTD., Research Institute

Patentee after: CHINA MOBILE COMMUNICATIONS GROUP Co.,Ltd.

Address before: 100053 53a, xibianmennei street, Xuanwu District, Beijing

Patentee before: CHINA MOBILE COMMUNICATION LTD., Research Institute

Patentee before: CHINA MOBILE COMMUNICATIONS Corp.