CN113204952B - Multi-intention and semantic slot joint identification method based on cluster pre-analysis - Google Patents

Multi-intention and semantic slot joint identification method based on cluster pre-analysis

Info

Publication number
CN113204952B
CN113204952B (application number CN202110325369.XA)
Authority
CN
China
Prior art keywords
intention
semantic
intent
model
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110325369.XA
Other languages
Chinese (zh)
Other versions
CN113204952A (en)
Inventor
张晖
李吉媛
赵海涛
孙雁飞
朱洪波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202110325369.XA priority Critical patent/CN113204952B/en
Priority to PCT/CN2021/091024 priority patent/WO2022198750A1/en
Priority to JP2022512826A priority patent/JP7370033B2/en
Publication of CN113204952A publication Critical patent/CN113204952A/en
Application granted granted Critical
Publication of CN113204952B publication Critical patent/CN113204952B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
        • G06F 40/194 — Handling natural language data; text processing; calculation of difference between files
        • G06F 40/30 — Handling natural language data; semantic analysis
        • G06F 18/23213 — Pattern recognition; clustering techniques; non-hierarchical techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
        • G06N 3/047 — Neural networks; architecture; probabilistic or stochastic networks
        • G06N 3/048 — Neural networks; architecture; activation functions
        • G06N 3/084 — Neural networks; learning methods; backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a joint modeling method for multi-intent recognition and semantic slot filling based on cluster pre-analysis, which comprises the following steps: acquiring, in real time, the multi-intent text input by the current user and preprocessing it; constructing a multi-intent recognition model based on cluster pre-analysis for recognizing the user's multiple intents; constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, making full use of the intent recognition result to guide the filling of semantic slots; and optimizing the constructed joint model for multi-intent recognition and semantic slot filling. The application fully considers the relation between intent recognition and semantic slot filling and provides a joint modeling method that combines the two semantic analysis subtasks into one semantic analysis task, improving the accuracy of semantic slot filling while improving the accuracy of multi-intent recognition, thereby improving the quality of natural language semantic analysis. In practical applications, it can effectively improve a machine's ability to understand human language in man-machine conversation, improving both its problem-solving ability and the man-machine conversation experience.

Description

Multi-intention and semantic slot joint identification method based on cluster pre-analysis
Technical Field
The application relates to the field of natural language processing, in particular to a natural language semantic analysis method in a man-machine dialogue system.
Background
With the rapid development of artificial intelligence, the demand for intelligence in many application scenarios keeps increasing, and good human-computer interaction is indispensable to meeting it. Human-computer interaction modes are now diverse, and the most convenient among them is natural language. The call for realizing man-machine conversation through natural language is therefore growing ever louder, and man-machine dialogue systems have attracted wide attention in academia and industry, with very broad application scenarios.
Realizing a man-machine dialogue system depends on natural language semantic analysis technology, and the quality of semantic analysis directly affects the effect of human-computer interaction. The complexity and abstractness of natural language and the ambiguity of words increase the difficulty of natural language semantic analysis. Semantic analysis divides into two basic subtasks: intent recognition and semantic slot filling. The traditional approach treats the two tasks as independent problems to solve and then connects their results. In practice, however, intent recognition determines the type of user requirement, while semantic slot filling materializes that requirement. The user intent and the slots to be identified are therefore strongly correlated: intent is identified in order to fill semantic slots better. The conventional separate-modeling approach does not fully consider the connection between the two tasks, so semantic information cannot be used effectively. In addition, a man-machine dialogue system often faces a multi-intent recognition problem; that is, the intent text input by the user may contain not just one intent but several. At present, research on intent recognition focuses mainly on single intents, yet multi-intent recognition is more complicated than single-intent recognition and requires a higher degree of semantic understanding.
In summary, how to provide, on the basis of the prior art, a joint modeling method that effectively solves multi-intent recognition and semantic slot filling for the semantic analysis problem in man-machine dialogue systems remains a problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the above problems of semantic analysis in man-machine dialogue systems, the application provides a multi-intent and semantic slot joint recognition method based on cluster pre-analysis for solving the joint recognition problem of multi-intent recognition and semantic slot filling in man-machine conversation. The specific process of the method is as follows:
step 1, acquiring, in real time, the multi-intent text input by the current user and preprocessing it;
step 2, constructing a multi-intent recognition model based on cluster pre-analysis for recognizing the user's multiple intents;
step 3, constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, making full use of the intent recognition result to guide the filling of semantic slots;
step 4, optimizing the constructed joint model for multi-intent recognition and semantic slot filling, and performing recognition with the optimized model.
In step 1, preprocessing the multi-intent text input by the current user means vectorizing it so that semantic features can be extracted once it is input into the neural network model. The vectorization proceeds as follows: first, the BERT model is pre-trained with a massive unsupervised Chinese corpus from the same domain; the resulting BERT pre-trained model is then used to vectorize the multi-intent text.
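As an illustration only, a minimal sketch of this vectorization step, assuming the HuggingFace transformers library and the public bert-base-chinese checkpoint standing in for the patent's in-domain pre-trained model:

```python
# Minimal sketch of BERT-based text vectorization (not the patent's code).
# Assumes the `transformers` library; `bert-base-chinese` stands in for a
# model further pre-trained on in-domain unsupervised Chinese corpora.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

def vectorize(text: str):
    """Return the [CLS] sentence vector C and the per-token vectors."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**inputs)
    token_vectors = outputs.last_hidden_state   # shape (1, seq_len, 768)
    sentence_vector = token_vectors[:, 0, :]    # [CLS] position -> C
    return sentence_vector, token_vectors
```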
In step 2, the purpose of constructing the multi-intent recognition model based on cluster pre-analysis is to guide the filling of semantic slots: whether multi-intent recognition is accurate directly influences semantic slot filling. To improve the accuracy of multi-intent recognition in the face of uncertainty about the intents a user inputs, a method based on cluster pre-analysis is provided; that is, the intent text is analyzed before intent recognition to judge whether it carries a single intent or multiple intents.
The overall intent recognition divides into two stages. In the first stage, the input intent text is judged with the K-means clustering algorithm; since intents divide mainly into single intent and multi-intent, the number of cluster centers K is two. In the second stage, classification is carried out according to the judged number of intents.
When the intent text is judged to contain multiple intents, a multi-intent classifier is used: a fully connected layer is added after the BERT pre-trained model, each node of which is connected to all nodes of the previous layer so as to fuse the semantic features extracted earlier. The intent text vector output by the BERT model is then fed into a sigmoid classifier, which performs a binary classification for each label, thereby outputting multiple intent labels. The threshold of the sigmoid classifier is set to 0.6; experimental verification shows that this value yields the best classification effect. A label is output when its probability exceeds the set threshold and withheld when its probability falls below it. The label prediction is calculated as follows:
y_I = sigmoid(W_I · C + b_I)

where y_I denotes the predicted intent output, W_I the weight matrix, C the input text vector, and b_I the bias.
When the intent text is judged to carry a single intent, a softmax classifier is adopted: the sentence vector C that BERT outputs at the first token ([CLS]) is fed directly into the classifier, and the predicted intent label is obtained by the following formula:
y_I = softmax(W_I · C + b_I)
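A minimal sketch of the two classification heads; the layer sizes and label count are illustrative assumptions, and the 0.6 threshold is the value stated above:

```python
# Sketch of the second-stage classifier heads: a fully connected layer over
# the BERT [CLS] vector C feeds either a sigmoid head (multi-intent, one
# binary decision per label) or a softmax head (single intent).
import torch
import torch.nn as nn

HIDDEN, NUM_LABELS, THRESHOLD = 768, 10, 0.6   # illustrative sizes

fc = nn.Linear(HIDDEN, NUM_LABELS)             # weights W_I, bias b_I

def predict_multi_intent(C: torch.Tensor) -> list:
    """y_I = sigmoid(W_I C + b_I); output every label above the threshold."""
    probs = torch.sigmoid(fc(C))               # C: (1, HIDDEN)
    return (probs > THRESHOLD).nonzero(as_tuple=True)[-1].tolist()

def predict_single_intent(C: torch.Tensor) -> int:
    """y_I = softmax(W_I C + b_I); output the single most probable label."""
    probs = torch.softmax(fc(C), dim=-1)
    return probs.argmax(dim=-1).item()
```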
in the process of pre-analyzing the multi-intention text by using the K-means clustering algorithm, the semantic similarity between the intention texts needs to be judged. The measurement of semantic similarity is critical to the accuracy of the clustered results. For the measurement of text semantic similarity, a common way is to calculate cosine similarity. Cosine similarity can reflect the difference between two vectors in space. However, cosine similarity is insensitive to absolute values, and cannot measure differences in the same direction. The Euclidean distance is sensitive to absolute values when the similarity is calculated, and the difference in the same direction can be well measured. Therefore, the application combines the characteristics of the cosine similarity and the Euclidean distance, and provides a novel measurement method, which is as follows:
where f_1 refers to cosine similarity and f_2 to Euclidean distance; f_Sim(x_i, x_j) represents the distance between intent text vectors x_i and x_j, f_1(x_i, x_j) the cosine similarity between them, and f_2(x_i, x_j) the Euclidean distance between them. The larger the computed f_Sim value, the greater the similarity between the data objects; the smaller the value, the smaller the similarity. This measurement method gauges the similarity between texts better.
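The exact combining formula does not survive in this text. The sketch below computes f_1 and f_2 and uses one hypothetical combination for f_Sim that satisfies the stated property (larger value, more similar); the actual patented formula may differ:

```python
# Sketch: a similarity measure combining cosine similarity (f_1) and
# Euclidean distance (f_2). The combination in f_sim is a HYPOTHETICAL
# stand-in, not the patent's formula, chosen only to satisfy the stated
# property that a larger f_sim means greater similarity.
import numpy as np

def f1_cosine(x_i: np.ndarray, x_j: np.ndarray) -> float:
    return float(np.dot(x_i, x_j) /
                 (np.linalg.norm(x_i) * np.linalg.norm(x_j)))

def f2_euclidean(x_i: np.ndarray, x_j: np.ndarray) -> float:
    return float(np.linalg.norm(x_i - x_j))

def f_sim(x_i: np.ndarray, x_j: np.ndarray) -> float:
    # Grows with cosine similarity, shrinks with Euclidean distance, so
    # both direction and magnitude contribute.
    return f1_cosine(x_i, x_j) / (1.0 + f2_euclidean(x_i, x_j))
```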
In step 3, the BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism is constructed as follows:
The Slot-Gated association gate mechanism associates the intent recognition task with the semantic slot filling task: the intent vector from intent recognition is weight-summed with the intent text vector used for semantic slot filling, and the result is passed through the tanh activation function to obtain the intent-semantic slot joint feature vector g:

g = v · tanh(c_S + W · c_I)

where c_S denotes the semantic slot vector, c_I the intent vector (c_S and c_I have the same dimension), and v and W are a trainable vector and matrix, respectively.
The intent-semantic slot joint feature vector g is then input into the BiLSTM neural network to extract the word-order features of the text and capture deep contextual semantic information. A linear layer is added after the BiLSTM network to map the dimension of the network's output vectors for semantic slot decoding. Finally, a CRF is used as the decoding unit, outputting the slot label corresponding to each word of the sequence, calculated as follows:
wherein ,semantic slot prediction output representing the i-th word in the input text sequence,/>Is a weight matrix.
In step 4, the constructed joint model for multi-intent recognition and semantic slot filling is optimized, specifically as follows:
the performance of the joint recognition model is determined jointly by the two subtasks, and the joint probabilities of multi-intent recognition and semantic slot filling are as follows:
wherein ,representing multi-intent recognition y on the premise of inputting multi-intent text sequence x I And semantic slot filling->Joint conditional probability of (a)。
In joint model training, the goal is to maximize the joint probability of the output multi-intent recognition and semantic slot filling. To improve semantic analysis capability, the intent's semantic information is fully used to guide semantic slot filling and the joint recognition model is optimized. Model training departs from the traditional mode of simply adding up the loss functions of the several tasks; based on an iterative idea, a step-by-step iterative training mode combining multi-intent recognition and semantic slot filling is provided: (1) train the BERT model and the multi-intent recognition model with a training text, updating the parameters of both; (2) transmit the output of the multi-intent recognition model from (1) to the Slot-Gated gate, and train the BERT model updated in (1) together with the semantic slot filling model on the same training text as in (1), updating the parameters of the BERT model and the semantic slot filling model; (3) perform (1) and (2) iteratively until the training goal (the optimum) is reached. The multi-intent recognition and semantic slot filling tasks share the bottom-layer BERT parameters during training; that is, when one model is trained, the bottom model is initialized with the training result of the other. The upper-level tasks are trained separately, while the intent recognition result is transmitted to the semantic slot filling task to improve the accuracy of semantic slot filling.
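A minimal sketch of this alternating scheme; the module interfaces (encoder, intent_head.loss, slot_model.loss), the optimizer choice, and the detach of the intent vector are illustrative assumptions, not the patent's code:

```python
# Sketch of the step-by-step iterative training scheme. `batches` yields
# (inputs, intent_labels, slot_labels); encoder is the shared bottom-layer
# BERT model; intent_head and slot_model are assumed torch nn.Modules.
import torch

def train_jointly(encoder, intent_head, slot_model, batches, rounds=10):
    opt_intent = torch.optim.Adam(
        list(encoder.parameters()) + list(intent_head.parameters()))
    opt_slot = torch.optim.Adam(
        list(encoder.parameters()) + list(slot_model.parameters()))
    for _ in range(rounds):
        for inputs, intent_labels, slot_labels in batches:
            # (1) Train multi-intent recognition; backprop updates the
            #     intent head and the shared BERT encoder.
            features = encoder(**inputs).last_hidden_state
            loss_i = intent_head.loss(features[:, 0, :], intent_labels)
            opt_intent.zero_grad(); loss_i.backward(); opt_intent.step()
            # (2) Pass the intent output to the Slot-Gated gate and train
            #     slot filling on the same text; updates slot model + BERT.
            features = encoder(**inputs).last_hidden_state
            intent_vec = intent_head(features[:, 0, :]).detach()
            loss_s = slot_model.loss(features, intent_vec, slot_labels)
            opt_slot.zero_grad(); loss_s.backward(); opt_slot.step()
        # (3) Iterate (1)-(2) until the training objective stops improving.
```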
The loss function matters greatly to model parameter updates: if it is chosen unreasonably, even a powerful model will yield a poor final result.
The multi-intent recognition loss function Loss_intent in the joint recognition model is calculated as:

Loss_intent = (Loss_multi)^k · (Loss_single)^(1-k)

where k represents the category of the intent text: k is 1 when the intent text contains multiple intents and 0 when it carries a single intent. Loss_multi is the cross-entropy loss for multi-intent recognition and Loss_single the cross-entropy loss for single-intent recognition, computed with y_I as the predicted intent output, y_intent as the true intent, and T as the number of training texts.
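Since k is binary, the product reduces to selecting one branch. A minimal sketch:

```python
# Sketch: the combined intent loss. Because k is 0 or 1, the product
# (Loss_multi)^k * (Loss_single)^(1-k) simply selects the branch matching
# the pre-analysis result.
import torch.nn.functional as F

def intent_loss(logits, target, k: int):
    """logits: (T, num_labels); k = 1 for multi-intent, 0 for single intent."""
    if k == 1:
        # Loss_multi: per-label binary cross-entropy (sigmoid outputs);
        # target is a multi-hot matrix of shape (T, num_labels).
        return F.binary_cross_entropy_with_logits(logits, target.float())
    # Loss_single: categorical cross-entropy (softmax outputs);
    # target is a vector of T class indices.
    return F.cross_entropy(logits, target)
```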
The semantic slot filling task loss function Loss_slot in the joint recognition model is calculated as follows:

where ŷ_i^S denotes the semantic slot prediction output for the i-th word of a training text sequence, y_i^S the true semantic slot of the i-th word, T the number of training texts, and M the length of the training text sequence.
Compared with the prior art, the technical scheme provided by the application has the following technical effects:
the application fully considers the connection between the intention recognition and the semantic slot filling, constructs a joint recognition model, combines two semantic analysis subtasks into one task, and shares the BERT bottom semantic features. Then, the intent-semantic Slot joint feature vector is generated by using the Slot-oriented association gate and is used for the semantic Slot filling task. Capturing the word sequence characteristics of the text by using BiLSTM in the task of filling the semantic slot, and acquiring context semantic information; and CRF is used as a decoder, and the dependency relationship before and after the label is considered, so that the semantic slot labeling is more reasonable. In addition, in order to improve the overall performance of the joint model, in the multi-intention recognition process, aiming at the uncertainty of user input intention, a clustering pre-analysis-based algorithm is provided for judging the number of intention, the traditional semantic similarity measurement method is improved in the algorithm, a new measurement mode is provided, the similarity between intention texts can be more effectively measured by the new measurement mode, the accuracy of intention number judgment is improved, and the robustness of the algorithm is improved. In order to improve the capacity of semantic analysis, the meaning semantic information is fully utilized to guide the filling of semantic slots, and based on an iteration thought, a training mode through step-by-step iteration is provided, so that the interrelationship between the meaning and the semantic slots can be fully utilized, the filling accuracy of the semantic slots is improved, and meanwhile, the accuracy of a multi-meaning recognition model is improved, so that the effect of semantic analysis is improved.
Drawings
In order to make the objects, technical solutions and technical effects of the present application more clear, the present application provides the following drawings for explanation:
FIG. 1 is a block diagram of the overall structure of a joint modeling method of the present application;
FIG. 2 is a flow chart of multi-intent recognition based on cluster pre-analysis in accordance with the present application;
FIG. 3 is a diagram of a semantic slot recognition model of the present application;
FIG. 4 is a schematic diagram of the step-by-step iterative training scheme of the joint recognition model of the present application.
Detailed Description
As shown in FIG. 1, the application discloses a multi-intention and semantic slot joint identification method based on cluster pre-analysis, which comprises the following steps:
step S101, acquiring, in real time, the multi-intent text input by the current user and preprocessing it;
the multi-intention text input by the current user is preprocessed, namely the multi-intention text is vectorized for representation, so that semantic feature extraction is carried out in the input neural network model. The vectorization representation method comprises the steps of firstly training a BERT model by using massive Chinese unsupervised corpus in the same field. The resulting BERT pre-training model is then utilized to vectorize the multi-intent text.
Step S102, constructing a multi-intent recognition model based on cluster pre-analysis and recognizing the user's multiple intents, as shown in FIG. 2.
The purpose of constructing the multi-intent recognition model based on cluster pre-analysis is to guide the filling of semantic slots: whether multi-intent recognition is accurate directly influences semantic slot filling. To improve the accuracy of multi-intent recognition in the face of uncertainty about the intents a user inputs, a method based on cluster pre-analysis is provided; that is, the intent text is analyzed before intent recognition to judge whether it carries a single intent or multiple intents. The method is as follows:
The overall intent recognition divides into two stages. In the first stage, the input intent text is judged with the K-means clustering algorithm; since intents divide mainly into single intent and multi-intent, the number of cluster centers K is two. In the second stage, classification is carried out according to the judged number of intents. When the intent text is judged to contain multiple intents, a multi-intent classifier is used: a fully connected layer is added after the BERT pre-trained model, each node of which is connected to all nodes of the previous layer so as to fuse the semantic features extracted earlier. The intent text vector output by the BERT model is then fed into a sigmoid classifier, which performs a binary classification for each label, thereby outputting multiple intent labels. The label prediction is calculated as follows:
y_I = sigmoid(W_I · C + b_I)
When the intent text is judged to carry a single intent, a softmax classifier is adopted: the sentence vector C that BERT outputs at the first token ([CLS]) is fed directly into the classifier, and the predicted intent label is obtained by the following formula.
y_I = softmax(W_I · C + b_I)
In pre-analyzing multi-intent text with the K-means clustering algorithm, the semantic similarity between intent texts must be judged, and the measurement of semantic similarity is critical to the accuracy of the clustering result. A common way to measure text semantic similarity is to compute cosine similarity, which reflects the directional difference between two vectors in space; however, cosine similarity is insensitive to absolute values and cannot measure differences along the same direction. Euclidean distance, by contrast, is sensitive to absolute values when computing similarity and measures differences along the same direction well. The application therefore combines the characteristics of cosine similarity and Euclidean distance and proposes a novel measurement method, as follows:
where f_1 refers to cosine similarity and f_2 to Euclidean distance. The larger the computed f_Sim value, the greater the similarity between the data objects; the smaller the value, the smaller the similarity. With this method, the similarity between texts can be measured better.
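A minimal sketch of this first-stage pre-analysis: a short Lloyd-style K-means loop (K = 2) over intent text vectors, reusing the f_sim measure sketched earlier; the seed and iteration count are illustrative assumptions:

```python
# Sketch: stage-one pre-analysis as K-means with K = 2 (single- vs
# multi-intent) over BERT sentence vectors, using f_sim from the earlier
# sketch as the similarity. sklearn's KMeans does not accept a custom
# metric, so the loop is written out. Not the patent's code.
import numpy as np

def kmeans_pre_analysis(vectors: np.ndarray, n_iter: int = 20) -> np.ndarray:
    """vectors: (n, d) intent text vectors; returns a cluster index per text."""
    rng = np.random.default_rng(0)
    centers = vectors[rng.choice(len(vectors), size=2, replace=False)]
    for _ in range(n_iter):
        # Assign each text to the center it is most similar to (largest f_sim).
        sims = np.array([[f_sim(x, c) for c in centers] for x in vectors])
        labels = sims.argmax(axis=1)
        # Re-estimate each center, keeping the old one if its cluster emptied.
        centers = np.array([vectors[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(2)])
    return labels
```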
Step S103, a BiLSTM-CRF semantic slot filling model is built based on the Slot-Gated association gate mechanism, and the intent recognition result is fully used to guide the filling of semantic slots, as shown in FIG. 3.
The Slot-Gated association gate mechanism associates the intent recognition task with the semantic slot filling task: the intent vector from intent recognition is weight-summed with the intent text vector used for semantic slot filling, and the result is passed through the tanh activation function to obtain the intent-semantic slot joint feature vector g:

g = v · tanh(c_S + W · c_I)

where c_S denotes the semantic slot vector, c_I the intent vector (c_S and c_I have the same dimension), and v and W are a trainable vector and matrix, respectively.
The intent-semantic slot joint feature vector g is then input into the BiLSTM neural network to extract the word-order features of the text and capture deep contextual semantic information. A linear layer is added after the BiLSTM network to map the dimension of the network's output vectors for semantic slot decoding. Finally, a CRF is used as the decoding unit, outputting the slot label corresponding to each word of the sequence. It is calculated as follows:

where y_i^S denotes the semantic slot prediction output for the i-th word in the input text sequence and W_S is a weight matrix.
Step S104, optimizing the constructed joint model for multi-intent recognition and semantic slot filling, as shown in FIG. 4.
The performance of the joint recognition model is determined by the two subtasks together and is characterized by the joint conditional probability p(y_I, y_S | x) of the multi-intent recognition output y_I and the semantic slot filling output y_S given the input multi-intent text sequence x.
In joint model training, the goal is to maximize the joint probability of the output multi-intent recognition and semantic slot filling. To improve semantic analysis capability, the intent's semantic information is fully used to guide semantic slot filling and the joint recognition model is optimized. Model training departs from the traditional mode of simply adding up the loss functions of the several tasks; based on an iterative idea, a step-by-step iterative training mode combining multi-intent recognition and semantic slot filling is provided. As shown in FIG. 4, the training data is first input into the joint recognition model. During training, a round of the multi-intent recognition model is trained first, with the multi-intent recognition model parameters and the bottom-layer BERT model parameters updated through backpropagation. The updated model then transmits the semantic features of the multi-intent recognition result to the Slot-Gated gate, where the intent features are fused with the semantic slot features generated by the updated BERT model, producing the intent-semantic slot joint feature vector used to train the semantic slot filling model; during this training, the semantic slot filling model parameters and the bottom-layer BERT model parameters are updated through backpropagation. The process is repeated until the optimum is reached.
The multi-intent recognition and semantic slot filling tasks share the bottom-layer BERT parameters during training; that is, when one model is trained, the bottom model is initialized with the training result of the other. The upper-level tasks are trained separately, while the intent recognition result is transmitted to the semantic slot filling task, improving the accuracy of semantic slot filling and, at the same time, the accuracy of the multi-intent recognition model.
The loss function matters greatly to model parameter updates: if it is chosen unreasonably, even a powerful model will yield a poor final result.
The multi-intent recognition loss function Loss_intent in the joint recognition model is calculated as:

Loss_intent = (Loss_multi)^k · (Loss_single)^(1-k)

where k represents the category of the intent text: k is 1 when the intent text contains multiple intents and 0 when it carries a single intent. Loss_multi is the cross-entropy loss for multi-intent recognition and Loss_single the cross-entropy loss for single-intent recognition, computed with y_I as the predicted intent output and y_intent as the true intent.
The semantic slot filling task loss function Loss_slot in the joint recognition model is calculated as follows:

where ŷ_i^S denotes the semantic slot prediction output for the i-th word of a training text sequence, y_i^S the true semantic slot of the i-th word, T the number of training texts, and M the length of the training text sequence.
The application also provides a multi-intent and semantic slot joint recognition system for cluster pre-analysis, comprising a memory and a processor; the memory stores a computer program which, when executed by the processor, implements the multi-intent and semantic slot joint recognition method described above.
The present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the multi-intent and semantic slot joint recognition method described above. The computer readable storage medium may include: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
It should be noted that the above embodiments are only for aiding in understanding the method of the present application and its core idea, and that it will be obvious to those skilled in the art that several improvements and modifications can be made to the present application without departing from the principle of the present application, and these improvements and modifications are also within the scope of the claims of the present application.

Claims (6)

1. A multi-intent and semantic slot joint recognition method based on cluster pre-analysis, characterized by comprising the following steps:
step 1, acquiring a text input by a current user in real time, and vectorizing the text by using a BERT model;
step 2, constructing a multi-intention recognition model based on cluster pre-analysis, and recognizing a plurality of intentions of a user;
step 3, constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, and guiding the filling of semantic slots with the intent recognition result;
step 4, optimizing and training a joint model formed by the BERT model, the multi-intent recognition model and the semantic slot filling model, and performing recognition with the joint model whose optimization training is complete; the recognition method of the multi-intent recognition model in step 2 comprises two stages:
the first stage: dividing the input intention text vector into two types of single intention and multiple intention by using a K-means clustering algorithm;
and a second stage: carrying out classification and identification on the intention text vectors of the single intention category by adopting a softmax classifier; classifying and identifying the multi-intention text vector by adopting a sigmoid classifier;
in step 4, optimization training is carried out in a step-by-step iterative training mode combining multi-intent recognition and semantic slot filling: (1) training the BERT model and the multi-intent recognition model with a training text, and updating the parameters of the BERT model and the multi-intent recognition model; (2) transmitting the output of the multi-intent recognition model in (1) to the Slot-Gated gate, and training the BERT model updated in (1) and the semantic slot filling model with the same training text as in (1), thereby updating the parameters of the BERT model and the semantic slot filling model; (3) iteratively performing (1) and (2) until the training goal is reached.
2. The multi-intent and semantic slot joint recognition method based on cluster pre-analysis according to claim 1, wherein the distance function in the K-means clustering algorithm is as follows:

wherein f_Sim(x_i, x_j) represents the distance between intent text vector x_i and intent text vector x_j, f_1(x_i, x_j) the cosine similarity between them, and f_2(x_i, x_j) the Euclidean distance between them.
3. The multi-intent and semantic slot joint recognition method based on cluster pre-analysis according to claim 1, wherein the loss function Loss_intent of the multi-intent recognition model is as follows:

Loss_intent = (Loss_multi)^k · (Loss_single)^(1-k)

wherein k represents the category of the intent text, k being 1 when the intent text contains multiple intents and 0 when the intent text is a single intent; Loss_multi is the cross-entropy loss for multi-intent recognition and Loss_single the cross-entropy loss for single-intent recognition; y_I is the predicted intent output, y_intent the true intent, and T the number of training texts.
4. The multi-intent and semantic slot joint recognition method based on cluster pre-analysis according to claim 1, wherein the loss function Loss_slot of the semantic slot filling model is as follows:

wherein ŷ_i^S denotes the semantic slot prediction output of the i-th word in a training text sequence, y_i^S the true semantic slot of the i-th word in the training text sequence, T the number of training texts, and M the training text sequence length.
5. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the multi-intent and semantic slot joint recognition method according to any of claims 1 to 4.
6. A multi-intent and semantic slot joint recognition system for cluster pre-analysis, comprising: a memory and a processor; the memory has stored thereon a computer program which, when executed by the processor, implements the multi-intent and semantic slot joint identification method as claimed in any of claims 1 to 4.
CN202110325369.XA 2021-03-26 2021-03-26 Multi-intention and semantic slot joint identification method based on cluster pre-analysis Active CN113204952B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202110325369.XA CN113204952B (en) 2021-03-26 2021-03-26 Multi-intention and semantic slot joint identification method based on cluster pre-analysis
PCT/CN2021/091024 WO2022198750A1 (en) 2021-03-26 2021-04-29 Semantic recognition method
JP2022512826A JP7370033B2 (en) 2021-03-26 2021-04-29 Semantic recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110325369.XA CN113204952B (en) 2021-03-26 2021-03-26 Multi-intention and semantic slot joint identification method based on cluster pre-analysis

Publications (2)

Publication Number Publication Date
CN113204952A CN113204952A (en) 2021-08-03
CN113204952B (en) 2023-09-15

Family

ID=77025737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110325369.XA Active CN113204952B (en) 2021-03-26 2021-03-26 Multi-intention and semantic slot joint identification method based on cluster pre-analysis

Country Status (3)

Country Link
JP (1) JP7370033B2 (en)
CN (1) CN113204952B (en)
WO (1) WO2022198750A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115292463B (en) * 2022-08-08 2023-05-12 云南大学 Information extraction-based method for joint multi-intention detection and overlapping slot filling
CN115273849B (en) * 2022-09-27 2022-12-27 北京宝兰德软件股份有限公司 Intention identification method and device for audio data
CN116795886B (en) * 2023-07-13 2024-03-08 杭州逍邦网络科技有限公司 Data analysis engine and method for sales data
CN117435738B (en) * 2023-12-19 2024-04-16 中国人民解放军国防科技大学 Text multi-intention analysis method and system based on deep learning
CN117435716A (en) * 2023-12-20 2024-01-23 国网浙江省电力有限公司宁波供电公司 Data processing method and system of power grid man-machine interaction terminal

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767408A (en) * 2020-05-27 2020-10-13 青岛大学 Causal graph construction method based on integration of multiple neural networks

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200257856A1 (en) * 2019-02-07 2020-08-13 Clinc, Inc. Systems and methods for machine learning based multi intent segmentation and classification
CN110008476B (en) 2019-04-10 2023-04-28 出门问问信息科技有限公司 Semantic analysis method, device, equipment and storage medium
CN110321418B (en) * 2019-06-06 2021-06-15 华中师范大学 Deep learning-based field, intention recognition and groove filling method
CN112035626A (en) 2020-07-06 2020-12-04 北海淇诚信息科技有限公司 Rapid identification method and device for large-scale intentions and electronic equipment
CN112183062B (en) 2020-09-28 2024-04-19 云知声智能科技股份有限公司 Spoken language understanding method based on alternate decoding, electronic equipment and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767408A (en) * 2020-05-27 2020-10-13 青岛大学 Causal graph construction method based on integration of multiple neural networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A survey of intent recognition methods in man-machine dialogue systems; Liu Jiao; Li Yanling; Lin Min; Computer Engineering and Applications (No. 12); 6-12+48 *
Joint recognition of bus travel intent and semantic slot filling based on Attention+Bi-LSTM; Chen Tingting; Lin Min; Li Yanling; Journal of Qinghai Normal University (Natural Science Edition) (No. 04); 19-24 *
A survey of research on joint intent and semantic slot recognition in end-to-end dialogue systems; Wang Kun; Lin Min; Li Yanling; Computer Engineering and Applications (No. 14); 14-25 *
Joint recognition of intent and semantic slot filling fusing multiple constraints; Hou Lixian; Li Yanling; Lin Min; Li Chengcheng; Journal of Frontiers of Computer Science and Technology (No. 09); 1545-1553 *

Also Published As

Publication number Publication date
JP2023522502A (en) 2023-05-31
WO2022198750A1 (en) 2022-09-29
JP7370033B2 (en) 2023-10-27
CN113204952A (en) 2021-08-03

Similar Documents

Publication Publication Date Title
CN113204952B (en) Multi-intention and semantic slot joint identification method based on cluster pre-analysis
CN111626063B (en) Text intention identification method and system based on projection gradient descent and label smoothing
CN112733866B (en) Network construction method for improving text description correctness of controllable image
CN111581961A (en) Automatic description method for image content constructed by Chinese visual vocabulary
CN112306494A (en) Code classification and clustering method based on convolution and cyclic neural network
CN113255366B (en) Aspect-level text emotion analysis method based on heterogeneous graph neural network
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN111538846A (en) Third-party library recommendation method based on mixed collaborative filtering
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN114168754A (en) Relation extraction method based on syntactic dependency and fusion information
CN115687610A (en) Text intention classification model training method, recognition device, electronic equipment and storage medium
CN116150361A (en) Event extraction method, system and storage medium for financial statement notes
CN111984780A (en) Multi-intention recognition model training method, multi-intention recognition method and related device
CN116361442B (en) Business hall data analysis method and system based on artificial intelligence
CN115204143B (en) Method and system for calculating text similarity based on prompt
CN116910190A (en) Method, device and equipment for acquiring multi-task perception model and readable storage medium
CN113204971B (en) Scene self-adaptive Attention multi-intention recognition method based on deep learning
CN114548104A (en) Few-sample entity identification method and model based on feature and category intervention
CN114330350A (en) Named entity identification method and device, electronic equipment and storage medium
CN114417872A (en) Contract text named entity recognition method and system
CN113535928A (en) Service discovery method and system of long-term and short-term memory network based on attention mechanism
Wu et al. A text emotion analysis method using the dual-channel convolution neural network in social networks
US11934794B1 (en) Systems and methods for algorithmically orchestrating conversational dialogue transitions within an automated conversational system
CN116955579B (en) Chat reply generation method and device based on keyword knowledge retrieval
CN117113977B (en) Method, medium and system for identifying text generated by AI contained in test paper

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant