CN114792117A - Training method and device of session classification model and session classification method and device

Training method and device of session classification model and session classification method and device

Info

Publication number
CN114792117A
CN114792117A (application CN202210622535.7A)
Authority
CN
China
Prior art keywords
session
training
classification model
classification
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210622535.7A
Other languages
Chinese (zh)
Inventor
王颢
张振华
聂强强
曹喆岫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Travel Network Technology Shanghai Co Ltd
Original Assignee
Ctrip Travel Network Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Travel Network Technology Shanghai Co Ltd filed Critical Ctrip Travel Network Technology Shanghai Co Ltd
Priority to CN202210622535.7A
Publication of CN114792117A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a training method and device for a session classification model, and a session classification method and device. The model training method comprises the following steps: obtaining a training sample; segmenting the sample session with an industry lexicon to obtain target tokens; obtaining word vector features corresponding to the target tokens; querying order information and/or user information corresponding to the sample session; collecting order features and/or user features; and training a machine learning network with the word vector features, order features and/or user features as input and the category labels as output to obtain the session classification model. By adding industry terms specific to the OTA scenario to the lexicon, the invention ensures word segmentation accuracy and reduces the learning difficulty of the model. Moreover, combining user features and order features with the text features yields a more accurate classification of session content. In addition, using different networks for different types of categories is more accurate than predicting multiple types simultaneously with a single network.

Description

Training method and device of session classification model and session classification method and device
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a training method and device for a session classification model, and a session classification method and device.
Background
After entering the customer service page, the user first converses with an automated ("intelligent") customer service agent; when the answers given by the intelligent agent cannot solve the user's problem, the user is transferred to a human agent, who then gives a professional answer. Predicting user intent and having sufficient human customer service resources directly affect how efficiently users resolve their problems, and thereby the user's service experience.
At present, most mainstream session classification methods extract features from the session text to predict the session type. Because text specific to OTA (online travel agency) customer service scenarios is scarce, the classification accuracy for customer service sessions in OTA scenarios is low, which leads to unreasonable allocation of human customer service resources and reduces the working efficiency of human agents.
Disclosure of Invention
The invention aims to overcome the low accuracy of session classification in the prior art, and provides a training method and device for a session classification model, and a session classification method and device.
The invention solves the technical problems through the following technical scheme:
according to a first aspect of the present invention, there is provided a training method for a conversational classification model, comprising the following steps:
acquiring a training sample, wherein the training sample comprises a sample session and a corresponding class label;
segmenting the sample session using an industry lexicon to obtain target tokens, wherein the industry lexicon comprises a plurality of tokens corresponding to the OTA field;
obtaining word vector features corresponding to the target tokens based on an embedding method;
inquiring order information and/or user information corresponding to the sample session;
acquiring order features and/or user features from the order information and/or the user information;
and training a machine learning network to obtain the session classification model by taking the word vector characteristics, the order characteristics and/or the user characteristics as input and the category labels as output.
Preferably, the machine learning network includes a recurrent neural network and a fully connected network, and the step of training the machine learning network with the word vector features, order features and/or user features as input and the category labels as output to obtain the session classification model includes:
inputting the word vector features into the recurrent neural network to obtain an intermediate output;
inputting the intermediate output, the order features and/or the user features into the fully connected network to obtain a session classification result;
and training the machine learning network according to the session classification result and the class label to obtain the session classification model.
Preferably, between the step of segmenting the sample session using the industry lexicon and the step of obtaining word vector features corresponding to the target tokens based on an embedding method, the method further includes:
acquiring a synonym replacement library corresponding to the OTA field, wherein the synonym replacement library comprises a plurality of standard words and a plurality of synonyms corresponding to the standard words;
respectively determining, for each target token, whether a corresponding synonym exists in the synonym replacement library, and if so, replacing the target token with the standard word corresponding to that synonym;
filtering out special characters from the sample session;
adjusting the lengths of the filtered sample sessions so that all sample sessions have the same length.
Preferably, the class labels include at least two different types of labels, and the step of training the machine learning network with the word vector features, order features and/or user features as input and the class labels as output to obtain the session classification model further includes:
for each type of label, training a corresponding session classification model with the word vector features, order features and/or user features as input and the labels of that type as output, wherein the types include product and user behavior.
According to a second aspect of the present invention, there is provided a session classification method, comprising the steps of:
acquiring a customer service session to be classified;
and inputting the customer service session into a session classification model to obtain a customer service session classification result, wherein the session classification model is obtained by the training method of the session classification model.
Preferably, when a plurality of session classification models have been trained, the step of inputting the customer service session into a session classification model includes:
inputting the customer service session into each of a plurality of different session classification models to obtain a plurality of different customer service session classification results.
According to a third aspect of the present invention, there is provided a training apparatus for a session classification model, comprising:
the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring a training sample, and the training sample comprises a sample session and a corresponding class label;
the segmentation module is used for segmenting the sample conversation by utilizing an industry word bank to obtain target participles, and the industry word bank comprises a plurality of participles corresponding to the OTA field;
the second acquisition module is used for acquiring word vector characteristics corresponding to the target word segmentation based on an Embedding method;
the query module is used for querying order information and/or user information corresponding to the sample session;
the acquisition module is used for acquiring order features and/or user features from the order information and/or the user information;
and the training module is used for training a machine learning network to obtain the session classification model by taking the word vector characteristics, the order characteristics and/or the user characteristics as input and the category labels as output.
According to a fourth aspect of the present invention, there is provided a session classification apparatus, comprising:
a session acquisition module, configured to acquire the customer service session to be classified;
and a session classification module, configured to input the customer service session into the session classification model to obtain a customer service session classification result, where the session classification model is obtained by the above training apparatus for the session classification model.
According to a fifth aspect of the present invention, there is provided an electronic device comprising a memory and a processor connected to the memory, wherein the processor, when executing a computer program stored on the memory, implements the training method of the session classification model of the invention or the session classification method of the invention.
According to a sixth aspect of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the training method of the session classification model of the invention or the session classification method of the invention.
The positive effects of the invention are as follows:
Adding industry terms specific to professional OTA scenarios to the lexicon ensures word segmentation accuracy and reduces the learning difficulty of the model. Meanwhile, a basic network model is improved during training: combining user features and order features with the text features of the session makes it easier to identify the category corresponding to the session, so that the session content between customer service and the user is accurately classified. In addition, using different networks for different types of categories is more accurate than predicting multiple types simultaneously with a single network, and the two networks can be computed in parallel.
Drawings
Fig. 1 is a flowchart illustrating a training method of a conversational classification model according to embodiment 1 of the present invention.
Fig. 2 is a schematic diagram of a framework of a training method of a conversational classification model according to embodiment 1 of the present invention.
Fig. 3 is a schematic diagram of a framework of a training method of a conversational classification model according to embodiment 2 of the present invention.
Fig. 4 is a flowchart illustrating a session classification method according to embodiment 3 of the present invention.
Fig. 5 is a schematic structural diagram of a training apparatus for a conversational classification model according to embodiment 5 of the present invention.
Fig. 6 is a schematic structural diagram of a session classification apparatus according to embodiment 7 of the present invention.
Fig. 7 is a schematic structural diagram of an electronic device according to embodiment 9 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
The present embodiment provides a training method for a session classification model. As shown in fig. 1, the training method includes the following steps:
and S11, obtaining a training sample, wherein the training sample comprises a sample session and a corresponding class label.
In this embodiment, the sample session is a complete session between the user and customer service, spliced together in conversational order; the class label is the category assigned to each complete session by an expert annotator. To better extract features from the sample session, Chinese-specific preprocessing is required. As an optional implementation, fixed dialogue that inevitably appears in every session between customer service and the user, such as a canned greeting of the form "Hello, how may I help you?", is deleted; such fixed dialogue exists in every sample session and contributes little to classification. As an optional implementation, the fixed dialogue is deleted by regular-expression matching, so that the remaining text better characterizes the session content.
As an optional implementation, the characters of the sample session are normalized: traditional Chinese characters are converted to simplified characters, eliminating the influence of script differences within sessions. This embodiment is not limited to the above Chinese preprocessing operations.
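The fixed-dialogue deletion described above can be sketched with regular expressions. The patterns below are hypothetical English stand-ins; the real canned greetings would come from the OTA's Chinese customer-service scripts:

```python
import re

# Hypothetical canned-greeting patterns; in practice these would be collected
# from the customer-service scripts mentioned in the text.
FIXED_DIALOG_PATTERNS = [
    r"Hello, how may I help you\?",
    r"Thank you for contacting customer service\.",
]

def strip_fixed_dialogs(text: str) -> str:
    """Delete fixed dialogue by regular-expression matching."""
    for pattern in FIXED_DIALOG_PATTERNS:
        text = re.sub(pattern, "", text)
    return text.strip()
```

For example, `strip_fixed_dialogs("Hello, how may I help you? I want a refund")` leaves only the informative part, `"I want a refund"`.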
S12: segmenting the sample session using an industry lexicon to obtain target tokens, where the industry lexicon includes a plurality of tokens corresponding to the OTA field.
In this embodiment, to reduce the input dimension, the text of the sample session is split into word-level tokens, so as to capture the deeper meaning of phrases and better characterize the session.
As an alternative embodiment, the sample session is segmented into words (i.e., target tokens) using a word segmentation tool. Such tools rely on a built-in dictionary, which is usually derived from general-purpose corpora such as the People's Daily corpus. As an optional implementation, an industry lexicon is constructed in advance by adding terms specific to the OTA field, mainly professional industry terms such as "flight change", "universal service pack", and "baggage allowance". As an optional implementation, Jieba (a Chinese word segmentation tool) is used as the segmenter; besides its own dictionary, Jieba supports adding a custom dictionary, so the constructed industry lexicon can be loaded conveniently.
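A minimal sketch of segmentation driven by a custom domain lexicon. The text describes using Jieba with a user dictionary on Chinese text; the forward-maximum-matching routine and the English multi-word terms below are simplified stand-ins, not the patent's actual implementation:

```python
# Illustrative industry lexicon of multi-word OTA terms (hypothetical entries).
INDUSTRY_LEXICON = {"flight change", "gold service pack", "baggage allowance"}

def segment(tokens, lexicon, max_len=3):
    """Greedy forward maximum matching over a pre-tokenised word list:
    prefer the longest lexicon phrase starting at each position."""
    out, i = [], 0
    while i < len(tokens):
        match = None
        # try the longest candidate phrase first, down to length 2
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                match = candidate
                i += n
                break
        if match is None:  # no phrase matched: emit the single word
            match = tokens[i]
            i += 1
        out.append(match)
    return out
```

With the lexicon above, `segment("I need a flight change today".split(), INDUSTRY_LEXICON)` keeps "flight change" as one token instead of splitting it, which is the effect the custom dictionary gives Jieba.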
In this embodiment, users may express the same thing in different ways; for example, when asking about the "gold service pack" value-added product, a user may say "service pack", "gold service", and so on. To reduce the learning difficulty of the model, tokens with different surface forms are normalized using a dedicated synonym replacement library, which contains a plurality of standard words related to the OTA field and a plurality of synonyms corresponding to each standard word.
As an optional implementation, for each target token it is determined whether a corresponding synonym exists in the synonym replacement library; if so, the token is replaced with the corresponding standard word. For example, if "service pack" appears in the session and the standard word for "service pack" in the synonym replacement library is "gold service pack", then "service pack" is replaced with "gold service pack".
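This replacement step amounts to a lookup table from each synonym to its standard word. A minimal sketch with hypothetical entries:

```python
# Hypothetical fragment of the synonym replacement library:
# each variant surface form maps to its standard word.
SYNONYM_TO_STANDARD = {
    "service pack": "gold service pack",
    "gold service": "gold service pack",
}

def replace_synonyms(tokens, table):
    """Replace every token that has an entry in the library by its
    standard word; tokens without an entry pass through unchanged."""
    return [table.get(token, token) for token in tokens]
```

Using it on a segmented session: `replace_synonyms(["buy", "service pack"], SYNONYM_TO_STANDARD)` yields `["buy", "gold service pack"]`, matching the example in the text.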
As an alternative embodiment, special characters in the sample session are filtered out. This mainly covers punctuation, digits, emoji, and the like; an existing special-character library can be used to delete them. Note that because special characters such as punctuation help improve word segmentation accuracy, this filtering is usually performed after segmentation.
As an optional embodiment, the lengths of the filtered sample sessions are adjusted so that all sessions have the same length, i.e., the text of each sample session is truncated or padded to a uniform length. As an alternative embodiment, the 95th-percentile text length is used as the standard length.
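The length-unification step can be sketched as picking the 95th-percentile length over the corpus and then truncating or padding each session to it. This is a simplified nearest-rank sketch; the padding token name is an assumption:

```python
import math

def standard_length(lengths, q=0.95):
    """Nearest-rank q-th percentile of the session lengths."""
    ordered = sorted(lengths)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def pad_or_truncate(tokens, target_len, pad_token="<PAD>"):
    """Truncate sessions longer than the standard length;
    pad shorter ones with a filler token."""
    if len(tokens) >= target_len:
        return tokens[:target_len]
    return tokens + [pad_token] * (target_len - len(tokens))
```

For example, with session lengths 1 through 20, `standard_length` picks 19, and every session is then cut or padded to 19 tokens before being fed to the network.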
S13: obtaining word vector features corresponding to the target tokens based on an embedding method.
Referring to FIG. 2, the processed target tokens W1, W2, ..., Wn are converted into their corresponding word vector features, i.e., word-level embedding vectors e1, e2, ..., en. As an optional implementation, the word vector features are obtained with an embedding method as follows: the different target tokens are encoded so as to capture both their deeper meanings and the semantic relations among them, fusing a large amount of valuable information, and each target token is converted into a word embedding (also called a word vector). Embedding is a mature technique in natural language processing; existing methods such as the word2vec, GloVe, CWE, and cw2vec algorithms can be called directly.
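As a toy illustration of the lookup step only: the random vectors below stand in for a trained word2vec/GloVe table and carry no learned semantics, and the zero vector for unknown tokens is an assumption:

```python
import random

def build_embedding_table(vocab, dim=4, seed=0):
    """Stand-in for a trained embedding table: one fixed random
    vector per vocabulary token."""
    rng = random.Random(seed)
    return {token: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for token in vocab}

def embed(tokens, table, dim=4):
    """Map tokens W1..Wn to vectors e1..en; out-of-vocabulary
    tokens get a zero vector."""
    zero = [0.0] * dim
    return [table.get(token, zero) for token in tokens]
```

In the real pipeline the table would come from an algorithm such as word2vec trained on the customer-service corpus, so that semantically related tokens get nearby vectors.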
S14: querying order information and/or user information corresponding to the sample session.
As an optional implementation, for each sample session, the order related to that session is located, and order information and user information are obtained from it. The order information includes the order number, order date, order status, the goods or services purchased, and so on; the user information includes the user profile, historical behavior, and so on.
S15: collecting order features and/or user features from the order information and/or user information.
In this embodiment, feature engineering is performed on the order information and user information to extract informative order features and user features from them.
S16: training the machine learning network with the word vector features, order features and/or user features as input and the class labels as output, to obtain the session classification model.
As an optional implementation, the machine learning network uses a recurrent neural network (RNN) with bidirectional gated recurrent units (Bi-GRU) as its basic structure, adds self-attention on top of this structure, and adds a fully connected layer to fuse the order features and user features.
Referring to FIG. 2, the converted word vector features e1, e2, ..., en are input into the recurrent neural network with the attention mechanism to obtain an intermediate output, specifically:
The word vector features are input into a bidirectional GRU (gated recurrent unit), which encodes them one by one and outputs a hidden-state vector for each target token at a preset time step. As an alternative embodiment, the hidden-state vectors are denoted h1, h2, ..., hn, where the preset time step is the step at which the corresponding word vector feature is input, and the hidden-state vector hn is determined by the word vector feature en input at that step together with the hidden-state vector h(n-1) of the previous step.
As shown in fig. 2, the bidirectional GRU computes a set of forward hidden-state vectors from 1 to n and a set of backward hidden-state vectors from n to 1, then concatenates the two to obtain the final state vector for each target token.
As an alternative embodiment, the bidirectional GRU is connected to an attention layer with context vector u_w. The attention layer computes an attention weight for the state vector of each target token, thereby focusing on the useful word vector features; the state vectors are then weighted and summed with their attention weights to obtain the final output vector of the bidirectional GRU.
As shown in FIG. 2, the output vector of the bidirectional GRU is connected to a fully connected layer, which shrinks its length; meanwhile the order features and user features, namely X1, X2 and X3, are also input to the fully connected layer. The fully connected network fuses the output vector of the word vector features with the order features and user features, and the fused vector is passed to a softmax (activation function) layer, whose activation yields the predicted probability of each session category (i.e., the session classification result).
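The attention pooling and feature fusion just described can be sketched as follows. The GRU states, context vector u_w, and features X1..X3 below are toy values; in the real model all of them are learned or computed upstream by the Bi-GRU:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(states, context):
    """Score each hidden state against the context vector u_w (dot
    product), then return the attention-weighted sum of the states."""
    scores = [sum(h * u for h, u in zip(state, context)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    pooled = [sum(w * state[d] for w, state in zip(weights, states))
              for d in range(dim)]
    return pooled, weights

# Toy bidirectional-GRU states h1..h3 (dimension 2) and a toy context vector.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(states, [1.0, 1.0])

# Fuse the pooled text vector with hypothetical order/user features X1..X3
# before the fully connected + softmax layers.
fused = pooled + [0.3, 0.7, 0.1]
```

Note that the third state scores highest against the context vector, so it receives the largest attention weight, which is exactly the "focus on useful features" behavior described above.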
As an alternative embodiment, the session classification model is trained with a cross-entropy loss function. Specifically, the cross-entropy loss is computed from the session classification result and the class label of the sample session; the loss is then backpropagated to adjust the network parameters of the bidirectional GRU, the attention mechanism, and the fully connected network, and training is considered complete when, over successive iterations, the cross-entropy loss converges. Optionally, backpropagation uses momentum-based stochastic gradient descent to accelerate convergence.
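For a single sample, the cross-entropy objective reduces to the negative log-probability that the softmax layer assigns to the labelled class; a minimal sketch:

```python
import math

def cross_entropy(predicted_probs, true_class):
    """Negative log-likelihood of the labelled class for one sample."""
    return -math.log(predicted_probs[true_class])

# A confident correct prediction yields a small loss;
# an uncertain one yields a larger loss.
confident = cross_entropy([0.9, 0.05, 0.05], 0)
uncertain = cross_entropy([0.4, 0.3, 0.3], 0)
```

Minimizing this loss over the training set pushes the predicted probability of the correct category toward 1, which is what backpropagation through the GRU, attention, and fully connected layers accomplishes.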
In this embodiment, adding industry terms specific to professional OTA scenarios to the lexicon ensures word segmentation accuracy and reduces the learning difficulty of the model. Meanwhile, a basic network model is improved during training: combining user features and order features with the text features of the session makes it easier to identify the category corresponding to the session, so that the session content between customer service and the user is accurately classified, the user's subsequent consultation behavior can be predicted, and the user experience is greatly improved.
Example 2
The present embodiment provides another training method for the session classification model, which is a further improvement on embodiment 1.
As shown in fig. 1, steps S11, S12, S13, S14 and S15 of the present embodiment are the same as the corresponding steps of embodiment 1; the difference lies in step S16.
In this embodiment, the category labels include at least two different types of labels, the types being product and user behavior; the product type covers items such as refund and rebooking, while the user-behavior type covers items such as consultation and urging. As an alternative embodiment, each sample session has both a product category and a corresponding user-behavior category, for example refund or rebooking on the product side and consultation or urging on the behavior side. Because the learning biases of the two types differ, different models are trained separately for the two types of category labels.
As an alternative embodiment, models with the same basic structure but different additional units are used for the different types of classification. For example, the machine learning network uses a recurrent neural network with bidirectional gated recurrent units as the basic structure. For the product classification model, an attention mechanism is added on top of this structure along with a fully connected layer fusing order features and user features, and its training process is the same as the corresponding steps of embodiment 1. For the user-behavior classification model, the machine learning network uses the same basic structure, adds a multi-head attention mechanism (Multihead-attention), and adds a fully connected layer to fuse the order features and user features.
Referring to FIG. 3, the converted word vector features, i.e., e 1 、e 2 、……e n Inputting to a recurrent neural network combined with a multi-head attention mechanism to obtain an intermediate output, specifically:
and inputting the word vector characteristics into a bidirectional GRU unit, coding the word vector characteristics one by one through the bidirectional GRU unit, and outputting hidden layer state vectors of each target participle at a preset moment. As an alternative embodiment, the hidden layer state vector is expressed as h 1 、h 2 、……h n Wherein the preset time is the corresponding time when the word vector characteristics are input, and the hidden layer state vector h n Word vector feature e input by preset time n And hidden layer state vector h at the previous moment n-1 And (6) determining.
As shown in fig. 3, the bidirectional GRU unit calculates a set of forward hidden layer state vectors from 1 to n and a set of backward hidden layer state vectors from n to 1, and then connects the two state vectors together to obtain a state vector corresponding to each final target participle.
As an optional implementation mode, the bidirectional GRU unit and the multi-head attention mechanism layer u 1 、u 2 、……u k Are connected. In this embodiment, the number of the attention heads of the multi-head attention mechanism layer is k, where k is a positive integer greater than 1, and can be set according to actual requirements. Through different linear changes, each head respectively obtains a corresponding characteristic matrix, and then all the characteristic matrices are spliced and weighted to calculate, so that the model can understand input contents from different angles and combine extracted information to obtain a final output vector.
As shown in FIG. 3, the output vectors obtained by the bidirectional GRU unit are connected to a pooling layer, globally pooled, and then input into the fully connected layer, which shrinks the length of the output vector. At the same time, the order features and the user features, i.e., X1, X2 and X3, are input into the fully connected layer; the fully connected network fuses the output vector corresponding to the word vector features with the order features and the user features, the fused result is input into a softmax layer, and the prediction probability of each session category of the user behavior type (namely the session classification result) is obtained through the activation operation.
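The pooling, feature fusion and softmax stages can be sketched as follows; dimensions and weights are illustrative, and the two feature vectors stand in for the order and user features X1, X2, X3:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def classify(states, order_feat, user_feat, W, b):
    """Global average pooling over the n state vectors, concatenation with
    the order/user features, then a dense layer followed by softmax."""
    pooled = states.mean(axis=0)                       # global pooling
    fused = np.concatenate([pooled, order_feat, user_feat])
    return softmax(W @ fused + b)                      # probability per session class

rng = np.random.default_rng(2)
states = rng.standard_normal((5, 6))                   # n=5 GRU outputs of length 6
order_feat, user_feat = rng.standard_normal(3), rng.standard_normal(2)
n_classes = 4
W = rng.standard_normal((n_classes, 6 + 3 + 2))
b = np.zeros(n_classes)
probs = classify(states, order_feat, user_feat, W, b)
print(probs.shape, round(float(probs.sum()), 6))       # (4,) 1.0
```

The softmax output sums to one, so each entry can be read directly as the predicted probability of one session category.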
As an alternative embodiment, when training the session classification model, the model is trained using a cross-entropy loss function. Specifically, the cross-entropy loss is calculated based on the session classification result and the category label corresponding to the sample session; the network parameters in the bidirectional GRU unit, the multi-head attention mechanism layer and the fully connected network are then adjusted through back propagation of the cross-entropy loss, and as the number of iterations increases, the session classification model is determined to be fully trained once the cross-entropy loss converges. Optionally, the back propagation of the cross-entropy loss employs momentum-based stochastic gradient descent to accelerate convergence.
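The loss and the momentum update rule can be sketched in a few lines; the learning rate and momentum coefficient below are conventional illustrative values, not taken from the patent:

```python
import numpy as np

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class label."""
    return -np.log(probs[label])

def momentum_sgd_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One momentum-based SGD update: v <- m*v - lr*g ; p <- p + v.
    The velocity term accumulates past gradients, accelerating convergence."""
    velocity = momentum * velocity - lr * grad
    return param + velocity, velocity

probs = np.array([0.1, 0.7, 0.2])       # softmax output; true label is class 1
loss = cross_entropy(probs, 1)          # -ln(0.7) ~= 0.3567
p, v = np.array([1.0, -2.0]), np.zeros(2)
g = np.array([0.5, -0.5])               # gradient from back propagation
p, v = momentum_sgd_step(p, g, v)
print(round(float(loss), 4), p)         # 0.3567 [ 0.995 -1.995]
```

In training, the same update is applied to every parameter of the GRU, attention and fully connected layers until the averaged loss stops decreasing.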
As an alternative implementation, the present embodiment is not limited to the model structure described above, and the accuracy of the model is tested by periodic sampling. According to actual conditions, such as seasonal changes, policy changes, service updates or a drop in the sampled accuracy, the model is iteratively updated in terms of sample updates, lexicon updates, model optimization and feature optimization, thereby ensuring the availability and accuracy of the online session classification model.
In this embodiment, different networks are used to train the different types of categories. Compared with using a single network to predict multiple types simultaneously, the accuracy is higher, and the two networks can be computed in parallel.
Example 3
The present embodiment provides a session classification method, as shown in fig. 4, the session classification method includes the following steps:
and S21, acquiring the customer service session to be classified. The customer service session to be classified comprises session content of the customer service session and related orders corresponding to the customer service session.
And S22, inputting the customer service session into the session classification model to obtain a customer service session classification result. Wherein, the conversation classification model is obtained by the training method of the conversation classification model in the embodiment 1.
As an optional implementation manner, the customer service session classification results are counted to obtain the proportions of the different categories, and the scheduling of human customer service agents is optimized according to these proportions to ensure that human customer service resources are allocated reasonably; for example, if the proportion of consultations is larger than the proportion of urging requests, more human agents are allocated to the consultation service.
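The counting step amounts to computing per-category proportions over the predicted labels; a minimal sketch (category names are illustrative):

```python
from collections import Counter

def class_proportions(results):
    """Proportion of each predicted session category, which can then be
    compared to decide how many agents each service line needs."""
    counts = Counter(results)
    total = len(results)
    return {label: counts[label] / total for label in counts}

results = ["consultation", "consultation", "urging", "consultation", "urging"]
shares = class_proportions(results)
print(shares)  # consultation 0.6, urging 0.4
```

Here consultations outnumber urging requests, so under the scheduling rule above more agents would be assigned to the consultation service.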
In this embodiment, the session classification model used for session classification is obtained by the training method of the session classification model in embodiment 1, so that when the session classification model is applied to a specific customer service session classification task, the output customer service session classification result is more accurate. Reasonable scheduling of human customer service agents can then be performed according to the proportions of the different categories. Sufficient human customer service resources directly affect how efficiently users solve their problems, thereby improving the users' service experience.
Example 4
The present embodiment provides another conversation classification method, which is a further improvement of embodiment 3.
As shown in fig. 4, step S21 of the present embodiment is the same as the corresponding step of embodiment 3; the difference lies in step S22 of the present embodiment.
When the trained session classification models comprise a plurality of models, respectively inputting the customer service sessions into a plurality of different session classification models to obtain a plurality of different customer service session classification results. Wherein, a plurality of different conversation classification models are obtained by the training method of the conversation classification models of the embodiment 1 and the embodiment 2 respectively.
For example, assuming that in the customer service session the user wants to consult a policy related to rebooking, the customer service session classification result obtained for the product type is rebooking, and the customer service session classification result obtained for the user behavior type is consultation.
In this embodiment, the multi-type category classification can refine the scenes, and problems existing in the products can be found through analysis, thereby promoting product iteration, reducing the number of incoming user calls, and saving the time of both customers and customer service. For example, if the volume of rebooking consultations rises sharply over a period of time, the content displayed to the user on the rebooking page may be incomplete; by improving the displayed content of the rebooking page, the volume of user consultations can be reduced and the efficiency of both the customer service and the user can be improved.
Example 5
As shown in fig. 5, the training apparatus for a session classification model includes a first obtaining module 31, a segmenting module 32, a second obtaining module 33, a querying module 34, an acquisition module 35 and a training module 36.
The first obtaining module 31 is configured to obtain a training sample, where the training sample includes a sample session and a corresponding category label. In this embodiment, the sample session is specifically a complete session between the user and the customer service, spliced together in session order; the category label is specifically a label with which an expert judges and annotates the category of each complete session. In order to better extract the features of the sample session, the first obtaining module 31 needs to perform Chinese text preprocessing on the sample session. As an optional implementation, scripted lines that necessarily appear in every dialogue between the customer service and the user are deleted, such as the opening line "What can the travel butler help you with?"; such fixed dialogues exist in every sample session and are of little help for classification. As an optional embodiment, regular-expression matching is used to delete the fixed lines, so as to better characterize the feature attributes of the session content itself.
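The regular-expression deletion of fixed lines can be sketched as follows; the patterns listed are hypothetical stand-ins for the actual scripted lines a deployment would collect:

```python
import re

# Hypothetical scripted lines; a real system lists the fixed dialogues
# that open or close every customer-service session.
FIXED_LINE_PATTERNS = [
    r"How may I help you\??",
    r"Thank you for contacting us\.?",
]

def strip_fixed_lines(session_text):
    """Delete scripted lines via regular-expression matching so that only
    the discriminative session content remains."""
    for pattern in FIXED_LINE_PATTERNS:
        session_text = re.sub(pattern, "", session_text)
    return session_text.strip()

cleaned = strip_fixed_lines("How may I help you? I want to change my flight.")
print(cleaned)  # I want to change my flight.
```

Because the fixed lines occur in every sample, removing them leaves only text that actually distinguishes one session category from another.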
As an optional implementation manner, the first obtaining module 31 performs a unification operation on the characters of the sample session, that is, converts traditional Chinese characters in the sample session into simplified characters, eliminating the influence of character-form differences in the session.
The segmentation module 32 is configured to segment the sample session by using an industry lexicon, so as to obtain target participles, where the industry lexicon includes a plurality of participles corresponding to the OTA field. In this embodiment, in order to reduce the input dimension, the text of the sample session needs to be converted into word-level participles, so as to capture the deep meaning of phrases and better characterize the features of the session.
As an alternative embodiment, the segmentation module 32 uses a word segmentation tool to segment the sample session into different words (i.e., the target participles). The word segmentation tool relies on a built-in dictionary, and the dictionaries in existing word segmentation tools mainly originate from popular corpora, such as the People's Daily corpus. As an optional implementation manner, an industry lexicon is constructed in advance, and participles related to the OTA field are added to it; these are mainly professional industry terms in the OTA field, such as "flight change", "all-round service package" and "baggage allowance". As an optional implementation manner, Jieba (a Chinese word segmentation tool) is used as the word segmentation tool; Jieba not only provides its own dictionary but also supports adding a custom dictionary, so the constructed industry lexicon can conveniently be added to the word segmentation tool.
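Jieba loads a custom dictionary via `jieba.load_userdict`; the effect of an industry lexicon on segmentation can be illustrated without the library by a simple forward-maximum-matching segmenter over a toy lexicon (the entries below are illustrative OTA terms):

```python
def fmm_segment(text, lexicon, max_len=6):
    """Forward maximum matching against a domain lexicon: at each position,
    take the longest lexicon entry that matches, otherwise a single character."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            cand = text[i:i + size]
            if size == 1 or cand in lexicon:
                words.append(cand)
                i += size
                break
    return words

# Toy OTA lexicon (a production lexicon is far larger).
ota_lexicon = {"金牌服务包", "行李额", "航变"}
tokens = fmm_segment("金牌服务包和行李额", ota_lexicon)
print(tokens)  # ['金牌服务包', '和', '行李额']
```

Without the lexicon entries, a general-purpose dictionary would likely split "金牌服务包" (gold service pack) into fragments, which is exactly why the industry lexicon is added to the tool.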
In this embodiment, users may express the same thing in different ways; for example, when consulting about the value-added product "gold service pack", a user may say "service pack", "gold service", etc. In order to reduce the difficulty of model learning, the segmentation module 32 replaces the participles under these different expressions via a dedicated synonym replacement library, where the synonym replacement library includes a plurality of standard words related to the OTA field and a plurality of synonyms corresponding to each standard word.
As an optional implementation manner, for each target participle, the segmentation module 32 determines whether a synonym corresponding to the target participle exists in the synonym replacement library, and if so, replaces the target participle with the standard word corresponding to that synonym. For example, assuming that "service pack" appears in the session and the standard word corresponding to "service pack" in the synonym replacement library is "gold service pack", then "gold service pack" is used to replace "service pack".
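A minimal sketch of the synonym replacement step, with a hypothetical two-entry table (a real library maps many OTA variants to their standard words):

```python
# Hypothetical synonym-replacement table: colloquial variant -> standard word.
SYNONYMS = {
    "service pack": "gold service pack",
    "gold service": "gold service pack",
}

def normalize(tokens):
    """Replace each target participle by its standard word when a synonym
    entry exists; unlisted participles pass through unchanged."""
    return [SYNONYMS.get(tok, tok) for tok in tokens]

normalized = normalize(["refund", "service pack"])
print(normalized)  # ['refund', 'gold service pack']
```

Collapsing variants onto one standard word shrinks the effective vocabulary, which is the "simplify the difficulty of learning" effect described above.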
As an alternative embodiment, the segmentation module 32 filters out special characters in the sample session, i.e., special character processing. The special character processing mainly includes filtering out special characters such as punctuation marks, numbers and emoticons; specifically, an existing special-character library can be used to delete the special characters from the sample session. It should be noted that, since special characters such as punctuation marks in the sample session help improve the accuracy of word segmentation, the filtering of special characters is usually performed after word segmentation.
As an optional embodiment, the segmentation module 32 adjusts the lengths of the filtered sample sessions so that they are uniform, that is, performs text length-equalization on the sample sessions: the texts in the sample sessions are truncated or padded to the same length. As an alternative embodiment, the 95th-percentile text length is used as the standard length.
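The truncate-or-pad step can be sketched as follows; the padding token and the toy session lengths are illustrative:

```python
def unify_length(token_lists, pad="<pad>", quantile=0.95):
    """Truncate or pad every tokenized session to one standard length,
    here taken as the 95th-percentile length of the corpus."""
    lengths = sorted(len(t) for t in token_lists)
    idx = min(int(quantile * len(lengths)), len(lengths) - 1)
    std_len = lengths[idx]
    return [t[:std_len] + [pad] * max(0, std_len - len(t)) for t in token_lists]

sessions = [["a"] * n for n in (2, 3, 5, 8, 20)]   # toy sessions of varying length
fixed = unify_length(sessions)
print({len(t) for t in fixed})                      # a single common length
```

Using a high percentile rather than the maximum keeps a handful of extremely long sessions from inflating the input size for the whole batch.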
The second obtaining module 33 is configured to obtain word vector features corresponding to the target word segmentation based on an Embedding method.
Referring to fig. 2, the second obtaining module 33 converts the processed target participles, i.e., W1, W2, …, Wn, into the word vector features corresponding to the target participles, where the word vector features represent word-level Embedding vectors, i.e., e1, e2, …, en. As an optional implementation manner, the word vector features corresponding to the target participles are obtained based on the Embedding method, specifically: different target participles are encoded while capturing the deep meaning of each target participle and the semantic relations between them, so that a large amount of valuable information is fused and the target participles are converted into word embeddings, which may also be called word vectors. In this embodiment, the Embedding method is a mature technology in natural language processing; existing Embedding methods, such as the word2vec algorithm, the GloVe algorithm, the CWE algorithm and the cw2vec algorithm, can be called directly.
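In practice the embedding table comes from a method such as word2vec or GloVe; purely to illustrate the lookup from W1..Wn to e1..en, the sketch below uses a randomly initialized table and a hypothetical three-word vocabulary:

```python
import numpy as np

rng = np.random.default_rng(3)
vocab = {"<unk>": 0, "rebooking": 1, "refund": 2}   # hypothetical vocabulary
E = rng.standard_normal((len(vocab), 4))            # one 4-dim vector per word

def embed(tokens):
    """Map target participles W1..Wn to word-vector features e1..en
    by table lookup; out-of-vocabulary words fall back to <unk>."""
    return np.stack([E[vocab.get(t, vocab["<unk>"])] for t in tokens])

vecs = embed(["refund", "visa"])   # 'visa' is out-of-vocabulary here
print(vecs.shape)                  # (2, 4)
```

Once trained, rows of E for semantically related participles end up close together, which is the "semantic relations" property the text relies on.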
The query module 34 is configured to query the order information and/or the user information corresponding to the sample session. As an alternative embodiment, for each sample session, the query module 34 locates the order related to the sample session and obtains the order information and the user information from the order. The order information includes the order number, order date, order status, the goods or services purchased in the order, and the like; the user information includes the user portrait, historical behaviors, and the like.
The collecting module 35 is used for collecting order characteristics and/or user characteristics from the order information and/or user information. In this embodiment, the acquisition module 35 performs feature engineering on the order information and the user information, and extracts better order features and user features from the order information and the user information.
The training module 36 is configured to train the machine learning network with the word vector features, the order features and/or the user features as input and the category label as output, so as to obtain the session classification model. As an optional implementation manner, the machine learning network adopts a Recurrent Neural Network (RNN) combined with a Bidirectional Gated Recurrent Unit (Bi-GRU) as the base structure, adds Self-Attention (an attention mechanism) to the base structure, and adds a fully connected layer to fuse the order features and the user features.
Referring to FIG. 2, the training module 36 inputs the converted word vector features, i.e., e1, e2, …, en, into the recurrent neural network combined with the attention mechanism to obtain an intermediate output, specifically:
the training module 36 inputs the word vector features into a bidirectional GRU (gated round robin) unit, encodes the word vector features one by one through the bidirectional GRU unit, and outputs hidden layer state vectors of the target participles at preset times. As an alternative embodiment, the hidden layer state vector is expressed as h 1 、h 2 、……h n Wherein the preset time is the corresponding time when the word vector characteristics are input, and the hidden layer state vector h n Word vector feature e input by preset time n And the hidden layer state vector h at the previous moment n-1 And (6) determining.
As shown in fig. 2, the bidirectional GRU unit calculates a set of forward hidden-layer state vectors from 1 to n and a set of backward hidden-layer state vectors from n to 1, and the training module 36 then concatenates the two to obtain the state vector corresponding to each target participle.
As an alternative embodiment, the bidirectional GRU unit is connected to an attention mechanism layer uw, where the attention mechanism layer is configured to calculate a corresponding attention weight for the state vector of each target participle, so as to focus on the useful word vector features; the training module 36 performs a weighted accumulation of the state vector of each target participle with its corresponding attention weight to obtain the final output vector of the bidirectional GRU.
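A minimal numpy sketch of this single-head attention layer: the context vector uw scores each state vector, softmax turns the scores into attention weights, and the weighted accumulation yields the output vector (all values here are random and illustrative):

```python
import numpy as np

def attention_pool(H, u_w):
    """alpha_i = softmax(h_i . u_w); output is the weighted accumulation
    of the state vectors, emphasizing the useful participles."""
    scores = H @ u_w
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ H, alpha

rng = np.random.default_rng(4)
H = rng.standard_normal((5, 6))           # state vectors of 5 target participles
out, alpha = attention_pool(H, rng.standard_normal(6))
print(out.shape, round(float(alpha.sum()), 6))   # (6,) 1.0
```

The weights alpha sum to one, so participles with high scores against uw dominate the pooled output while uninformative ones are suppressed.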
As shown in fig. 2, the output vector obtained by the bidirectional GRU unit is connected to the fully connected layer, which shrinks the length of the output vector. At the same time, the order features and the user features, i.e., X1, X2 and X3, are input into the fully connected layer; the training module 36 uses the fully connected network to fuse the output vector corresponding to the word vector features with the order features and the user features, the fused result is input into a softmax layer, and the prediction probability of each session category (namely the session classification result) is obtained through the activation operation.
As an alternative embodiment, when training the session classification model, the training module 36 trains the model using a cross-entropy loss function. Specifically, the cross-entropy loss is calculated based on the session classification result and the category label corresponding to the sample session; the network parameters in the bidirectional GRU unit, the attention mechanism layer and the fully connected network are then adjusted through back propagation of the cross-entropy loss, and as the number of iterations increases, the session classification model is determined to be fully trained once the cross-entropy loss converges. Optionally, the back propagation of the cross-entropy loss employs momentum-based stochastic gradient descent to accelerate convergence.
Example 6
The present embodiment provides another training apparatus for a conversational classification model, which is a further improvement of embodiment 5.
As shown in fig. 5, the first obtaining module 31, the segmenting module 32, the second obtaining module 33, the querying module 34 and the acquisition module 35 of the present embodiment are the same as the corresponding modules of embodiment 5; the difference lies in the training module 36 of the present embodiment.
In this embodiment, the category labels include at least two different types of labels, the types including product and user behavior: the product type includes categories such as refund and rebooking, and the user behavior type includes categories such as consultation and urging. As an alternative embodiment, each sample session has both a product category and a corresponding user behavior category; for example, both refund and rebooking sessions may involve consultation and urging. The difference between the two types lies in their learning bias, so the training module 36 uses different models to train separately for the two different types of category labels.
Optionally, the training module 36 employs models with the same base structure but different additional units for the different types of categories. For example, the machine learning network adopts a recurrent neural network combined with a bidirectional gated recurrent unit as the base structure; for the classification model of the product type, an attention mechanism is added on the base structure and a fully connected layer is added to fuse the order features and the user features, and the training process of the classification model for the product type is the same as that of the training module 36 in embodiment 5; for the classification model of the user behavior type, the machine learning network likewise adopts a recurrent neural network combined with a bidirectional gated recurrent unit as the base structure, but adds a multi-head attention mechanism (Multi-head Attention) on the base structure, together with a fully connected layer to fuse the order features and the user features.
Referring to FIG. 3, the training module 36 inputs the converted word vector features, i.e., e1, e2, …, en, into the recurrent neural network combined with the multi-head attention mechanism to obtain an intermediate output, specifically:
the training module 36 inputs the word vector features into the bidirectional GRU unit, encodes the word vector features one by one through the bidirectional GRU unit, and outputs the hidden layer state vector of each target participle at a preset time. As an alternative embodiment, the hidden layer state vector is expressed as h 1 、h 2 、……h n Wherein the preset time is the corresponding time when the word vector characteristics are input, and the hidden layer state vector h n Word vector feature e input by preset time n And hidden layer state vector h at the previous moment n-1 And (6) determining.
As shown in fig. 3, the bidirectional GRU unit calculates a set of forward hidden-layer state vectors from 1 to n and a set of backward hidden-layer state vectors from n to 1, and the two are then concatenated to obtain the final state vector corresponding to each target participle.
As an alternative embodiment, the bidirectional GRU unit is connected to a multi-head attention mechanism layer u1, u2, …, uk. In this embodiment, the number of attention heads of the multi-head attention mechanism layer is k, where k is a positive integer greater than 1 that can be set according to actual requirements. The training module 36 obtains a corresponding feature matrix for each head through different linear transformations, and then concatenates and weights all the feature matrices, so that the model can understand the input content from different angles and combine the extracted information to obtain the final output vector.
As shown in FIG. 3, the training module 36 connects the output vectors obtained by the bidirectional GRU unit to a pooling layer, where they are globally pooled and then input into the fully connected layer, which shrinks the length of the output vector. At the same time, the training module 36 inputs the order features and the user features, i.e., X1, X2 and X3, into the fully connected layer; the fully connected network fuses the output vector corresponding to the word vector features with the order features and the user features, the fused result is input into a softmax layer, and the prediction probability of each session category of the user behavior type (namely the session classification result) is obtained through the activation operation.
As an alternative embodiment, when training the session classification model, the model is trained using a cross-entropy loss function. Specifically, the cross-entropy loss is calculated based on the session classification result and the category label corresponding to the sample session; the network parameters in the bidirectional GRU unit, the multi-head attention mechanism layer and the fully connected network are then adjusted through back propagation of the cross-entropy loss, and as the number of iterations increases, the session classification model is determined to be fully trained once the cross-entropy loss converges. Optionally, the back propagation of the cross-entropy loss employs momentum-based stochastic gradient descent to accelerate convergence.
As an alternative embodiment, the present embodiment is not limited to the model structure described above, and the training module 36 tests the accuracy of the model by periodic sampling. According to actual conditions, such as seasonal changes, policy changes, service updates or a drop in the sampled accuracy, the model is iteratively updated in terms of sample updates, lexicon updates, model optimization and feature optimization, thereby ensuring the availability and accuracy of the online session classification model.
This embodiment trains the different types of categories with different networks. Compared with using a single network to predict multiple types simultaneously, the accuracy is higher, and the two networks can be computed in parallel.
Example 7
The present embodiment provides a session classification apparatus, as shown in fig. 6, which includes a session obtaining module 41 and a session classification module 42.
The session obtaining module 41 is configured to obtain a customer service session to be classified. The customer service session to be classified comprises session content of the customer service session and related orders corresponding to the customer service session.
The session classification module 42 is configured to input the customer service session into the session classification model to obtain a classification result of the customer service session. The session classification model is obtained by the training method of the session classification model in embodiment 1.
As an optional implementation manner, the customer service session classification results are counted to obtain the proportions of the different categories, and the scheduling of human customer service agents is optimized according to these proportions to ensure that human customer service resources are allocated reasonably; for example, if the proportion of consultations is larger than the proportion of urging requests, more human agents are allocated to the consultation service.
Example 8
This embodiment provides another apparatus for classifying a conversation, which is a further improvement of embodiment 7.
As shown in fig. 6, the session acquisition module 41 of the present embodiment is the same as the corresponding module of embodiment 7, except for the session classification module 42 of the present embodiment.
When the trained session classification model includes a plurality of models, the session classification module 42 inputs the customer service session into the plurality of different session classification models respectively, to obtain a plurality of different customer service session classification results. The plurality of different session classification models are obtained by the training methods of the session classification model of embodiment 1 and embodiment 2, respectively.
For example, assuming that in the customer service session the user wants to consult a policy related to rebooking, the customer service session classification result obtained for the product type is rebooking, and the customer service session classification result obtained for the user behavior type is consultation.
In this embodiment, the multi-type category classification can refine the scenes, and problems existing in the products can be found through analysis, thereby promoting product iteration, reducing the number of incoming user calls, and saving the time of both customers and customer service. For example, if the volume of rebooking consultations rises sharply over a period of time, the content displayed to the user on the rebooking page may be incomplete; by improving the displayed content of the rebooking page, the volume of user consultations can be reduced and the efficiency of both the customer service and the user can be improved.
Example 9
The present embodiment provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the program, the processor implements the training method of the conversational classification model according to embodiment 1 or embodiment 2 and the conversational classification method according to embodiment 3 or embodiment 4.
The electronic device 50 shown in fig. 7 is only an example and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
The electronic device 50 may take the form of a general-purpose computing device, which may be, for example, a server device. The components of the electronic device 50 may include, but are not limited to: the at least one processor 51, the at least one memory 52, and a bus 53 connecting the various system components (including the memory 52 and the processor 51).
The bus 53 includes a data bus, an address bus, and a control bus.
The memory 52 may include volatile memory, such as Random Access Memory (RAM)521 and cache memory 522, and may further include Read Only Memory (ROM) 523.
Memory 52 may also include a program tool 525 having a set (at least one) of program modules 524, such program modules 524 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which or some combination thereof may comprise an implementation of a network environment.
The processor 51 executes various functional applications and data processing, such as a training method of the conversational classification model of embodiment 1 or embodiment 2 and a conversational classification method of embodiment 3 or embodiment 4, by running a computer program stored in the memory 52.
The electronic device 50 may also communicate with one or more external devices 54. Such communication may take place through an input/output (I/O) interface 55. Also, the electronic device 50 may communicate with one or more networks through a network adapter 56. As shown in FIG. 7, the network adapter 56 communicates with the other modules of the electronic device 50 over the bus 53. It should be appreciated that although not shown in FIG. 7, other hardware and/or software modules may be used in conjunction with the electronic device 50, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.
It should be noted that although in the above detailed description several units/modules or sub-units/modules of the electronic device are mentioned, such a division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the units/modules described above may be embodied in one unit/module according to embodiments of the invention. Conversely, the features and functions of one unit/module described above may be further divided into embodiments by a plurality of units/modules.
Example 10
The present embodiment provides a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the training method of the conversational classification model of embodiment 1 or embodiment 2 and the conversational classification method of embodiment 3 or embodiment 4.
More specific examples that may be employed by the readable storage medium include, but are not limited to: a portable disk, hard disk, random access memory, read only memory, erasable programmable read only memory, optical storage device, magnetic storage device, or any suitable combination of the foregoing.
In an alternative embodiment, the present invention can also be implemented in the form of a program product including program code for causing a terminal device to execute a training method for implementing the session classification model of embodiment 1 or embodiment 2 and a session classification method of embodiment 3 or embodiment 4 when the program product is run on the terminal device.
Where program code for carrying out the invention is written in any combination of one or more programming languages, the program code may be executed entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device or entirely on the remote device.
While specific embodiments of the invention have been described above, it will be understood by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (10)

1. A method of training a session classification model, characterized by comprising the following steps:
obtaining a training sample, wherein the training sample comprises a sample session and a corresponding class label;
segmenting the sample session using an industry lexicon to obtain target word segments, wherein the industry lexicon comprises a plurality of word segments corresponding to the OTA (online travel agency) field;
obtaining word vector features corresponding to the target word segments based on an embedding method;
querying order information and/or user information corresponding to the sample session;
extracting order features and/or user features from the order information and/or the user information;
and training a machine learning network, with the word vector features, the order features and/or the user features as input and the class label as output, to obtain the session classification model.
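As an illustrative sketch only, the pipeline of claim 1 (lexicon-based segmentation, word-vector lookup, and concatenation of order/user features into a training pair) could look as follows; the lexicon entries, toy embedding scheme, feature values, and label are all hypothetical stand-ins for the trained components the claim describes, not the claimed implementation itself.

```python
# Hypothetical sketch of the claim-1 pipeline: segment a sample session
# with an industry lexicon, look up toy word vectors, and append order/user
# features to form one training pair. All names and values are illustrative.

INDUSTRY_LEXICON = {"hotel order", "flight change", "refund"}  # stand-in OTA terms

def segment(session, lexicon):
    """Greedy two-word merge against the industry lexicon; other tokens
    fall back to plain whitespace splitting."""
    tokens = session.lower().split()
    merged, i = [], 0
    while i < len(tokens):
        pair = " ".join(tokens[i:i + 2])
        if pair in lexicon:
            merged.append(pair)
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def embed(token, dim=4):
    """Deterministic toy embedding, a stand-in for a trained Embedding layer."""
    seed = sum(ord(c) for c in token)
    return [((seed * (i + 1)) % 97) / 97.0 for i in range(dim)]

def build_sample(session, order_feats, user_feats, label):
    """Mean-pool the word vectors, then concatenate order/user features."""
    vecs = [embed(t) for t in segment(session, INDUSTRY_LEXICON)]
    pooled = [sum(col) / len(vecs) for col in zip(*vecs)]
    return pooled + order_feats + user_feats, label

features, label = build_sample(
    "please cancel my hotel order", [1.0], [0.5], "after-sale")
```

In a production system the toy `embed` would be replaced by a learned embedding table, and the pooled vector would typically be fed token-by-token into a sequence model rather than averaged.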
2. The method of training a session classification model according to claim 1, wherein the machine learning network comprises a recurrent neural network and a fully connected network, and the step of training the machine learning network with the word vector features, the order features and/or the user features as input and the class label as output to obtain the session classification model comprises:
inputting the word vector features into the recurrent neural network to obtain an intermediate output;
inputting the intermediate output, the order features and/or the user features into the fully connected network to obtain a session classification result;
and training the machine learning network according to the session classification result and the class label to obtain the session classification model.
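A minimal forward-pass sketch of the claim-2 architecture, under the assumption of an Elman-style recurrent layer: the recurrent part produces an intermediate output, which is concatenated with order/user features and fed to a fully connected layer. The weights and dimensions below are fixed toy values; a real model would learn them during training.

```python
import math

def rnn_step(x, h, W_xh, W_hh):
    """One recurrent step: h' = tanh(W_xh * x + W_hh * h) (biases omitted)."""
    return [math.tanh(sum(W_xh[j][k] * x[k] for k in range(len(x))) +
                      sum(W_hh[j][k] * h[k] for k in range(len(h))))
            for j in range(len(W_xh))]

def classify(word_vecs, extra_feats, W_xh, W_hh, W_fc):
    h = [0.0] * len(W_hh)
    for x in word_vecs:            # intermediate output of the recurrent part
        h = rnn_step(x, h, W_xh, W_hh)
    z = h + extra_feats            # concatenate with order/user features
    logits = [sum(row[k] * z[k] for k in range(len(z))) for row in W_fc]
    return logits.index(max(logits))

# Toy dimensions: 2-d word vectors, 2 hidden units, 1 extra feature, 2 classes.
W_xh = [[0.5, -0.2], [0.1, 0.3]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
W_fc = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # class 1 keyed to the extra feature
pred = classify([[1.0, 0.0], [0.0, 1.0]], [1.0], W_xh, W_hh, W_fc)
```

The hand-set `W_fc` makes the second class score track the extra (order/user) feature, so a session with that feature set wins class 1; training would tune all three weight matrices jointly from the classification loss.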
3. The method of training a session classification model according to claim 1, wherein between the step of segmenting the sample session using the industry lexicon and the step of obtaining word vector features corresponding to the target word segments based on an embedding method, the method further comprises:
obtaining a synonym replacement library corresponding to the OTA field, wherein the synonym replacement library comprises a plurality of standard words and a plurality of synonyms corresponding to the standard words;
determining, for each target word segment, whether a corresponding synonym exists in the synonym replacement library, and if so, replacing the target word segment with the standard word corresponding to that synonym;
filtering special characters out of the sample session;
adjusting the length of the filtered sample session so that all sample sessions have a uniform length.
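The claim-3 preprocessing chain (synonym standardization, special-character filtering, length unification) could be sketched as below; the synonym table, filtering rule, and pad token are hypothetical choices, not values prescribed by the claim.

```python
import re

# Illustrative claim-3 preprocessing: replace synonyms with their standard
# word, drop special-character tokens, then truncate or pad to a fixed length.
SYNONYM_TO_STANDARD = {"cancelation": "cancellation", "aeroplane": "airplane"}

def preprocess(tokens, synonyms, max_len, pad="<pad>"):
    tokens = [synonyms.get(t, t) for t in tokens]               # standardize synonyms
    tokens = [t for t in tokens if re.fullmatch(r"[\w ]+", t)]  # filter special chars
    tokens = tokens[:max_len]                                   # truncate...
    return tokens + [pad] * (max_len - len(tokens))             # ...or pad

out = preprocess(["cancelation", "!!!", "fee"], SYNONYM_TO_STANDARD, 4)
```

A fixed length is needed so that every preprocessed session yields an input tensor of the same shape for the downstream network.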
4. The method of claim 1, wherein the class labels comprise at least two different types of labels, and the step of training the machine learning network to obtain the session classification model further comprises:
for each type of label, training, with the word vector features, the order features and/or the user features as input and the labels of that type as output, a corresponding session classification model; wherein the types comprise product and user behavior.
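Claim 4 trains one model per label type over the same input features. In the sketch below, a trivial majority-label "model" stands in for the trained machine learning network, purely to show the one-model-per-type structure; every label shown is hypothetical.

```python
from collections import Counter

def train_majority_model(labels):
    """Degenerate stand-in classifier: always predicts the most common
    training label, regardless of the input features."""
    winner = Counter(labels).most_common(1)[0][0]
    return lambda features: winner

# One label list per type ("product" and "user behavior"), as in claim 4.
labels_by_type = {
    "product": ["hotel", "flight", "hotel"],
    "user_behavior": ["complaint", "inquiry", "inquiry"],
}
models = {t: train_majority_model(ls) for t, ls in labels_by_type.items()}
```

The point of the structure is that each label type gets its own independently trained classifier, even though all of them consume the same word-vector, order, and user features.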
5. A session classification method, characterized by comprising the following steps:
obtaining a customer service session to be classified;
inputting the customer service session into a session classification model to obtain a customer service session classification result, wherein the session classification model is obtained by the method of training a session classification model according to any one of claims 1 to 4.
6. The session classification method according to claim 5, wherein when a plurality of session classification models have been trained, the step of inputting the customer service session into a session classification model comprises:
inputting the customer service session into a plurality of different session classification models respectively, to obtain a plurality of different customer service session classification results.
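Claim-6 inference routes one customer service session through every trained model, yielding one result per label type. A minimal sketch, with stub lambdas standing in for real trained session classification models:

```python
def classify_with_all(session_features, models):
    """Run one session's features through every trained classifier and
    collect one classification result per label type."""
    return {name: model(session_features) for name, model in models.items()}

# Hypothetical stand-ins for trained models, keyed by label type.
models = {
    "product": lambda feats: "hotel",
    "user_behavior": lambda feats: "inquiry",
}
results = classify_with_all([0.2, 0.8], models)
```

Because the models are independent, the per-type predictions can also be computed in parallel without affecting the combined result.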
7. An apparatus for training a session classification model, characterized by comprising:
a first acquisition module, configured to obtain a training sample, wherein the training sample comprises a sample session and a corresponding class label;
a segmentation module, configured to segment the sample session using an industry lexicon to obtain target word segments, wherein the industry lexicon comprises a plurality of word segments corresponding to the OTA field;
a second acquisition module, configured to obtain word vector features corresponding to the target word segments based on an embedding method;
a query module, configured to query order information and/or user information corresponding to the sample session;
an extraction module, configured to extract order features and/or user features from the order information and/or the user information;
and a training module, configured to train a machine learning network, with the word vector features, the order features and/or the user features as input and the class label as output, to obtain the session classification model.
8. A session classification apparatus, characterized by comprising:
a session acquisition module, configured to obtain a customer service session to be classified;
and a session classification module, configured to input the customer service session into a session classification model to obtain a customer service session classification result, wherein the session classification model is obtained by the apparatus for training a session classification model according to claim 7.
9. An electronic device, comprising a memory and a processor coupled to the memory, wherein the processor, when executing a computer program stored on the memory, implements the method of training a session classification model according to any one of claims 1-4 or the session classification method according to any one of claims 5-6.
10. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of training a session classification model according to any one of claims 1-4 or the session classification method according to any one of claims 5-6.
CN202210622535.7A 2022-06-01 2022-06-01 Training method and device of session classification model and session classification method and device Pending CN114792117A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210622535.7A CN114792117A (en) 2022-06-01 2022-06-01 Training method and device of session classification model and session classification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210622535.7A CN114792117A (en) 2022-06-01 2022-06-01 Training method and device of session classification model and session classification method and device

Publications (1)

Publication Number Publication Date
CN114792117A 2022-07-26

Family

ID=82463861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210622535.7A Pending CN114792117A (en) 2022-06-01 2022-06-01 Training method and device of session classification model and session classification method and device

Country Status (1)

Country Link
CN (1) CN114792117A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115775067A (en) * 2022-11-25 2023-03-10 贝壳找房(北京)科技有限公司 Session comprehensive evaluation method and device, electronic equipment and storage medium
CN115775067B (en) * 2022-11-25 2024-04-05 贝壳找房(北京)科技有限公司 Session comprehensive evaluation method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN107679234B (en) Customer service information providing method, customer service information providing device, electronic equipment and storage medium
CN109685056B (en) Method and device for acquiring document information
CN110580308B (en) Information auditing method and device, electronic equipment and storage medium
CN112270379A (en) Training method of classification model, sample classification method, device and equipment
CN111651996A (en) Abstract generation method and device, electronic equipment and storage medium
CN111414561B (en) Method and device for presenting information
CN109598387A Stock price forecasting method and system based on a bidirectional cross-modal attention network model
CN111144120A (en) Training sentence acquisition method and device, storage medium and electronic equipment
CN111339292A (en) Training method, system, equipment and storage medium of text classification network
CN113408287A (en) Entity identification method and device, electronic equipment and storage medium
CN110782221A (en) Intelligent interview evaluation system and method
CN115470313A (en) Information retrieval and model training method, device, equipment and storage medium
CN114792117A (en) Training method and device of session classification model and session classification method and device
CN115017879A (en) Text comparison method, computer device and computer storage medium
CN110232328A (en) A kind of reference report analytic method, device and computer readable storage medium
CN114356924A (en) Method and apparatus for extracting data from structured documents
CN114169418A (en) Label recommendation model training method and device, and label obtaining method and device
CN117911079A (en) Personalized merchant marketing intelligent recommendation method and system
TW202034207A (en) Dialogue system using intention detection ensemble learning and method thereof
CN113177415A (en) Semantic understanding method and device, electronic equipment and storage medium
CN113239698A (en) Information extraction method, device, equipment and medium based on RPA and AI
CN117592470A (en) Low-cost gazette data extraction method driven by large language model
CN111274382A (en) Text classification method, device, equipment and storage medium
CN115017271B (en) Method and system for intelligently generating RPA flow component block
CN115310429A (en) Data compression and high-performance calculation method in multi-turn listening dialogue model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination