CN111090726A - NLP-based text customer service interaction method for the electric power industry

Info

Publication number: CN111090726A
Application number: CN201911226062.3A
Authority: CN
Inventors: 胡飞飞; 洪丹轲; 黄昱; 曾时博; 刘丽; 舒然; 范俊成; 梁寿愚; 王科; 张坤; 方文崇
Original assignee: China Southern Power Grid Co Ltd
Current assignee: China Southern Power Grid Co Ltd
Priority date: 2019-12-04
Filing date: 2019-12-04
Publication date: 2020-05-01

Abstract

The invention relates to the technical field of artificial intelligence customer service and intelligent voice analysis, and in particular discloses an NLP-based text customer service interaction method for the electric power industry, which comprises the following steps: step one, preprocessing voice data; step two, speech recognition and processing by a generative question-answering system; step three, generating a semantic database; step four, semantic matching; step five, generating text by a natural language processing method; step six, outputting the text. By adopting intelligent voice-text interaction, the method improves the efficiency and experience of customer communication, reduces labor cost, and has high research value.

Description

NLP-based text customer service interaction method for the electric power industry

Technical Field

The invention relates to the technical field of voice recognition interaction, and in particular to an NLP-based text customer service interaction method for the electric power industry.

Background

The electric power industry is one of the most important basic industries in China and has made great progress in recent years. As the user base grows, manual customer service in the power grid field is under heavy pressure. As artificial intelligence technology improves, intelligent customer service has become an important development direction in the customer service field. It continues to advance in interactive experience, function perception and scenario-based service, and can gradually strengthen and surpass traditional service. Because of the complexity of the power grid industry, the application of intelligent customer service there is still at an exploratory, early stage and is not yet mature.

Intelligent voice technology took off with the release of Apple's Siri in 2010. Its development in China has been broadly in step with international development: as large voice technology enterprises have been established and have grown, intelligent voice technology represented by speech recognition and speech synthesis has seen continuous breakthroughs, and the intelligent voice industry in China has developed rapidly. The power grid field also needs to follow this technical development by fully combining technologies such as automatic speech recognition (ASR), text-to-speech (TTS) and natural language processing (NLP) and applying them to power grid customer service, so as to make the service intelligent and efficient, relieve the burden on staff, reduce labor cost, and improve customer loyalty and customer experience.

Disclosure of Invention

In order to solve the above problems, the present invention provides an NLP-based text customer service interaction method for the electric power industry, which specifically comprises the following steps:

step one: preprocessing the input general voice signals and the voice signals with region information to form training data, wherein the preprocessing comprises denoising and feature extraction;

step two: performing speech recognition on the preprocessed general voice signals to obtain semantic information, and generating a semantic library from all of the semantic information; meanwhile, passing the preprocessed voice signals with region information through a generative question-answering system composed of sequence-to-sequence models to obtain the syntactic information and keyword information of the sentences;

step three: fusing the syntactic information, the keyword information and the semantic library to form a database with syntactic information;

step four: extracting, by the speech recognition model through semantic understanding, keywords from the customer's speech sentences, and matching them with keywords in the database;

step five: generating the reply text according to the answer matched in the database, in combination with the syntactic information and the keyword information;

step six: displaying the text corresponding to the reply speech to complete the interaction with the customer.

Preferably, the database includes a plurality of customer question sentences, a keyword corresponding to each of the customer question sentences, a reply speech corresponding to each of the customer question sentences, and region information of the customers' speech.
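As a minimal sketch of how one record of such a database could be laid out, the following Python fragment uses a dataclass with one field per item listed above. The names `QAEntry` and `lookup`, the exact-match lookup, and the two sample records are illustrative assumptions and are not taken from the invention.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QAEntry:
    """One database record (field names are illustrative assumptions)."""
    question: str  # customer question sentence
    keyword: str   # keyword corresponding to the question
    reply: str     # reply corresponding to the question
    region: str    # region information of the customer's speech


def lookup(entries: List[QAEntry], keyword: str) -> Optional[QAEntry]:
    """Return the first entry whose keyword equals the given keyword, or None."""
    for entry in entries:
        if entry.keyword == keyword:
            return entry
    return None


# Invented sample records, for illustration only.
entries = [
    QAEntry("How do I pay my electricity bill?", "bill payment",
            "You can pay through the online service hall.", "Guangdong"),
    QAEntry("Why is there a power outage?", "power outage",
            "Please provide your account number so we can check.", "Guangxi"),
]
```

Exact string equality is the simplest possible matching rule; the matching described in the detailed description below is semantic rather than literal.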

Preferably, the generative question-answering system is composed of a Sequence-to-Sequence (Seq2Seq) model and is trained on standard customer service questions and extended questions from the power industry.

Compared with the prior art, the invention has the following beneficial effects: intelligent interaction is realized in the customer service field of the power industry; highly repetitive customer service inquiries can be handled entirely by the method; the semantic library is improved by the generative question-answering system; and the output text is more accurate and closer to sentences spoken by a natural person. The method not only improves efficiency and customer satisfaction but also greatly reduces labor cost.

Drawings

FIG. 1 is a block diagram showing the structure of the present invention.

FIG. 2 is a flow chart of speech processing according to the present invention.

FIG. 3 is a schematic diagram of the model of the generative question-answering system according to the present invention.

FIG. 4 is a schematic diagram of a conventional long short-term memory network.

FIG. 5 is a schematic diagram of the bidirectional long short-term memory network according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to FIGS. 1 to 5, the present invention provides an artificial-intelligence-based method for processing customer service intelligent navigation voice data, which specifically includes the following steps:

Step one, voice preprocessing. The input voice signal is processed by pre-emphasis, windowing, discrete Fourier transform, filtering and the like to obtain a speech feature vector.
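The first three preprocessing operations can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the pre-emphasis coefficient 0.97 and the Hamming window are common choices that the text does not specify, the function names are hypothetical, and the filtering stage is omitted because the text does not say which filter is used.

```python
import cmath
import math
from typing import List


def pre_emphasis(signal: List[float], alpha: float = 0.97) -> List[float]:
    """y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is an assumed, common value."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def hamming_window(frame: List[float]) -> List[float]:
    """Apply a Hamming window (the window type is an assumption)."""
    n_samples = len(frame)
    if n_samples == 1:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1)))
            for n, x in enumerate(frame)]


def dft_magnitude(frame: List[float]) -> List[float]:
    """Magnitude of the discrete Fourier transform, computed directly in O(N^2)."""
    n_samples = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                    for n, x in enumerate(frame)))
            for k in range(n_samples)]


def frame_features(frame: List[float]) -> List[float]:
    """Pre-emphasis, then windowing, then DFT magnitude for a single frame."""
    return dft_magnitude(hamming_window(pre_emphasis(frame)))
```

A practical system would use a fast Fourier transform and process overlapping frames; the direct DFT is used here only to keep the sketch dependency-free.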

Step two, speech recognition. The speech feature vector is input to a bidirectional long short-term memory network (BLSTM) to train a speech recognition model. This network not only has the data memory capacity of an ordinary long short-term memory network (LSTM) but also takes the influence of reverse-time-order information into account, which gives it an advantage in recognizing speech data. A Gaussian mixture model (GMM) is added after the network in place of the traditional fully connected network, which improves the recognition capability of the network. The output of the Gaussian mixture model is the posterior probability of the feature vector, and the speech features are obtained by decoding. The data set consists of 300 hours of voice data collected in the power grid field, in two-channel WAV format at 8000 Hz, 16 bit and 128 kbps. The BLSTM output information is calculated as follows:

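$$i^t = \mathrm{sigmoid}\left(W_{xi} x^t + W_{ri} r^{t-1} + W_{ci} c^{t-1} + b_i\right)$$

$$f^t = \mathrm{sigmoid}\left(W_{xf} x^t + W_{rf} r^{t-1} + W_{cf} c^{t-1} + b_f\right)$$

$$g^t = \tanh\left(W_{xg} x^t + W_{rg} r^{t-1} + b_g\right)$$

$$h^t = \mathrm{sigmoid}\left(W_{xh} x^t + W_{rh} r^{t-1} + W_{ch} c^{t-1} + b_h\right)$$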

Here i is the input gate, which controls the amount of information input to the BLSTM; f is the forget gate, which controls the operation of the excitation signal; h is the output gate, which controls the amount of information output by the BLSTM; c is the information of the BLSTM after the forget gate; r is the information output by the BLSTM; W denotes the weight matrices and b the biases. In speech recognition, the customer's speech sentence is input into the Gaussian mixture model for recognition. The final activation function of the bidirectional long short-term memory network is a sigmoid function, which maps the data to between 0 and 1 in order to judge whether recognition is successful. The specific function is as follows:

The customer's speech sentence is thus obtained through speech recognition, in preparation for the semantic matching in the next step.
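The four gate equations given above for i, f, g and h can be evaluated for a single time step as in the following sketch. The dictionary layout of the weights and biases and the helper names are assumptions made for illustration, and `sigmoid` is the standard logistic function. The text does not state how c and r are updated from the gates, so the sketch deliberately stops at the gate activations.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def add(*vectors: Vector) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(values) for values in zip(*vectors)]


def gate_activations(x_t: Vector, r_prev: Vector, c_prev: Vector,
                     W: Dict[str, Matrix], b: Dict[str, Vector]) -> Dict[str, Vector]:
    """Evaluate i^t, f^t, g^t and h^t for one time step.

    W maps names such as "xi", "ri", "ci" to weight matrices and b maps
    "i", "f", "g", "h" to bias vectors (this layout is an assumption).
    """
    gates: Dict[str, Vector] = {}
    for name in ("i", "f", "h"):
        pre = add(matvec(W["x" + name], x_t), matvec(W["r" + name], r_prev),
                  matvec(W["c" + name], c_prev), b[name])
        gates[name] = [sigmoid(z) for z in pre]
    # g^t has no term in c^{t-1} and uses tanh instead of sigmoid.
    pre_g = add(matvec(W["xg"], x_t), matvec(W["rg"], r_prev), b["g"])
    gates["g"] = [math.tanh(z) for z in pre_g]
    return gates
```

A bidirectional network would apply the same computation once over the sequence in forward order and once in reverse order, each direction with its own weights.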

The generative question answering uses a Sequence-to-Sequence (Seq2Seq) model and obtains syntactic information by training on standard and extended question-answer data from customer service in the power industry. The system is formulated as:

p represents the probability of the model output sequence; x represents an input sequence; y represents an output sequence; i denotes the sequence index.
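The formula itself does not appear in this text. A factorization commonly used for Seq2Seq models, and consistent with the symbols defined here, is shown below as an assumption rather than as the formula of the invention; the notation $y_{<i}$ (the outputs before index i) is introduced only for this sketch.

```latex
p(y \mid x) = \prod_{i} p\left(y_i \mid y_{<i},\, x\right)
```

Under this reading, the model scores an output sequence by multiplying the probability of each output element given the input sequence and the elements already produced.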

Step three, the syntactic information and keyword information obtained by the generative question-answering system are merged with the semantic library into a database with syntactic information, referred to as the syntactic information database.

Step four, semantic matching. The intent of the user's spoken input is obtained through semantic-level processing, and the keywords in the sentence are extracted and matched against the corresponding keywords in the syntactic information database. The matching method is as follows:

1) In the bidirectional long short-term memory network, the customer's speech sentence is read in both directions so that the semantic information in the sentence is captured more fully. The reading process is as follows:

where c represents the customer speech statement read from both directions and h is the resulting implicit semantic information.

2) The obtained semantic information h is divided over the characters of the whole customer speech sentence to obtain the final character-level semantic representation W, as in the following formula:

where Q is the number of characters in the entire customer speech statement.
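The formula for W is not present in this text. One plausible reading, sketched below purely as an assumption, is that the network yields one hidden vector per character and that W is their average over the Q characters. The function name is hypothetical.

```python
from typing import List


def character_level_representation(hidden_states: List[List[float]]) -> List[float]:
    """Average per-character hidden vectors: W = (1/Q) * sum of h_q (assumed form)."""
    q = len(hidden_states)  # Q: number of characters in the sentence
    if q == 0:
        raise ValueError("the sentence must contain at least one character")
    dim = len(hidden_states[0])
    return [sum(h[d] for h in hidden_states) / q for d in range(dim)]
```

For two characters with hidden vectors [1.0, 2.0] and [3.0, 4.0], this yields [2.0, 3.0].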

3) A matching algorithm judges whether the answer speech matches the reply speech in the database. The judgment formulas are as follows:

$$r = \sigma\left(W_r q + b_r\right)$$

$$\mathrm{Score} = W_{score}\, r + b_{score}$$

where r is the weighted data information, b denotes the biases, and W denotes the weight matrices.
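The two judgment formulas can be evaluated as in the sketch below. It assumes that q is a vector representing the sentence (the text does not define q), that the sigma in the first formula is the logistic sigmoid, and that Score is a scalar, so that the score weights form a single row vector. The function and parameter names are hypothetical.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def match_score(q: List[float], W_r: List[List[float]], b_r: List[float],
                W_score: List[float], b_score: float) -> float:
    """Compute r = sigmoid(W_r q + b_r), then Score = W_score r + b_score."""
    r = [sigmoid(sum(w * x for w, x in zip(row, q)) + bias)
         for row, bias in zip(W_r, b_r)]
    return sum(w * v for w, v in zip(W_score, r)) + b_score
```

A higher score would indicate a better match between the customer's sentence and a candidate reply; how the score is turned into a decision is not specified in this step.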

4) On the basis of the bidirectional long short-term memory network framework, an attention mechanism is added. The attention network uses Softmax to rank the semantic units by importance and to normalize them. Combined with the bidirectional long short-term memory network used in the invention, the final formula for the matched text is as follows:

where T is a text semantic unit; w is the weight matrix of the attention network; z is the high-dimensional semantic text matching vector output by the layered convolutional neural network; omega is the weight of text semantic learning; and gamma is the bias of the attention mechanism in the text semantics. Adding the attention mechanism improves both the speech recognition capability and the text matching capability of the network.
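The Softmax-based normalization and importance ranking of semantic units can be sketched as follows. The attention formula of the invention is not present in this text, so the sketch takes the raw per-unit scores as given inputs and shows only the normalization and ranking step; the function names are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_units(units: List[str], scores: List[float]) -> List[Tuple[str, float]]:
    """Return (unit, attention weight) pairs, most important first."""
    weights = softmax(scores)
    return sorted(zip(units, weights), key=lambda pair: pair[1], reverse=True)
```

The weights returned by `softmax` are positive and sum to 1, so they can be used directly to weight the semantic units when forming the matched-text representation.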

5) After obtaining the semantic information, the error rate is judged by the following formula:

where N is the total number of words recognized, S is the number of substitution errors, I is the number of insertion errors, and D is the number of deletion errors. Whether the output result matches is then judged against a preset threshold.
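The symbols S, D, I and N are those of the standard word error rate, (S + D + I) / N, and the sketch below assumes that this is the intended formula, since the formula itself does not appear in this text. It counts the three error types with a minimum-edit alignment, takes N as the number of words in the reference transcript (the usual convention, assumed here), and leaves the threshold as a parameter because no value is given. All function names are hypothetical.

```python
from typing import List, Tuple


def error_counts(reference: List[str], hypothesis: List[str]) -> Tuple[int, int, int]:
    """Return (S, D, I) for a minimum-edit alignment of hypothesis to reference."""
    rows, cols = len(reference) + 1, len(hypothesis) + 1
    # cost[i][j] = (total edits, S, D, I) for reference[:i] versus hypothesis[:j]
    cost = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, cols):
        cost[0][j] = (j, 0, 0, j)  # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            if reference[i - 1] == hypothesis[j - 1]:
                cost[i][j] = cost[i - 1][j - 1]  # match, no edit
                continue
            t, s, d, ins = cost[i - 1][j - 1]
            substitute = (t + 1, s + 1, d, ins)
            t, s, d, ins = cost[i - 1][j]
            delete = (t + 1, s, d + 1, ins)
            t, s, d, ins = cost[i][j - 1]
            insert = (t + 1, s, d, ins + 1)
            cost[i][j] = min(substitute, delete, insert)
    _, s, d, ins = cost[-1][-1]
    return s, d, ins


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """(S + D + I) / N, with N taken as the number of reference words."""
    if not reference:
        raise ValueError("reference must not be empty")
    s, d, ins = error_counts(reference, hypothesis)
    return (s + d + ins) / len(reference)


def is_match(reference: List[str], hypothesis: List[str], threshold: float) -> bool:
    """Accept the result when the error rate does not exceed the preset threshold."""
    return word_error_rate(reference, hypothesis) <= threshold
```

For the reference "a b c d" and the hypothesis "a x c", the alignment has one substitution and one deletion, giving an error rate of 2 / 4 = 0.5.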

Step five, text generation. After semantic matching is completed, the text generated from the matched answer in the syntactic information database is shown on a display, completing the interaction with the user.

To illustrate the effectiveness and adaptability of the present invention, it was compared experimentally with a conventional method that has no generative question-answering system. In a test using 50 test utterances and actual scenarios, the recognition rate of the method improved from 80 percent to 91 percent. The method can therefore effectively improve customer service capacity and quality in the power grid field.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. An NLP-based text customer service interaction method for the electric power industry, characterized by comprising the following steps:

step two: performing speech recognition on the preprocessed general voice signals to obtain semantic information, and generating a semantic library from all of the semantic information; meanwhile, passing the preprocessed voice signals with region information through a generative question-answering system composed of sequence-to-sequence models to obtain the syntactic information and keyword information of the sentences;

step three: fusing the syntactic information, the keyword information and the semantic library to form a database with syntactic information;

step four: extracting, by the speech recognition model through semantic understanding, keywords from the customer's speech sentences, and matching them with keywords in the database;

2. The method of claim 1, wherein the speech recognition model uses a bidirectional long short term memory network (BLSTM) with attention to extract the keywords and converts the keywords into corresponding textual information for matching against the database.

3. The method of claim 1, wherein the database comprises a plurality of customer question statements, a keyword associated with each of the plurality of customer question statements, and a reply utterance associated with each of the plurality of customer question statements.

4. The method of claim 1, wherein the matching of the speech recognition model to the database in step four is formulated as follows:

wherein T is a text semantic unit; w is the weight matrix of the attention network; z is the high-dimensional semantic text matching vector output by the layered convolutional neural network; omega is the weight of text semantic learning; gamma is the bias of the attention mechanism in the text semantics; h is the output gate; and b is the bias; and wherein adding the attention mechanism improves both the speech recognition capability and the text matching capability of the network.