CN113204952B - Multi-intention and semantic slot joint identification method based on cluster pre-analysis - Google Patents

Multi-intention and semantic slot joint identification method based on cluster pre-analysis

Info

Publication number
CN113204952B
CN113204952B (application number CN202110325369.XA)
Authority
CN
China
Prior art keywords
intention
semantic
intent
model
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110325369.XA
Other languages
Chinese (zh)
Other versions
CN113204952A (en)
Inventor
张晖
李吉媛
赵海涛
孙雁飞
朱洪波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202110325369.XA priority Critical patent/CN113204952B/en
Priority to PCT/CN2021/091024 priority patent/WO2022198750A1/en
Priority to JP2022512826A priority patent/JP7370033B2/en
Publication of CN113204952A publication Critical patent/CN113204952A/en
Application granted granted Critical
Publication of CN113204952B publication Critical patent/CN113204952B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
        • G06F 40/194 — Handling natural language data; text processing; calculation of difference between files
        • G06F 40/30 — Handling natural language data; semantic analysis
        • G06F 18/23213 — Pattern recognition; clustering techniques; non-hierarchical techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
        • G06N 3/047 — Neural networks; architecture; probabilistic or stochastic networks
        • G06N 3/048 — Neural networks; architecture; activation functions
        • G06N 3/084 — Neural networks; learning methods; backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a joint modeling method for multi-intent recognition and semantic slot filling based on cluster pre-analysis, which comprises the following steps: acquiring, in real time, the multi-intent text input by the current user and preprocessing it; constructing a multi-intent recognition model based on cluster pre-analysis for recognizing the user's multiple intents; constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, making full use of the intent recognition result to guide the filling of semantic slots; and optimizing the constructed joint model for multi-intent recognition and semantic slot filling. The application fully considers the relation between intent recognition and semantic slot filling and provides a joint modeling method that combines the two semantic analysis subtasks into one semantic analysis task, improving the accuracy of semantic slot filling while improving the accuracy of multi-intent recognition, thereby improving the quality of natural language semantic analysis. In practical applications, it can effectively improve a machine's ability to understand human language in man-machine conversation, improving both its problem-solving ability and the man-machine conversation experience.

Description

Multi-intention and semantic slot joint identification method based on cluster pre-analysis
Technical Field
The application relates to the field of natural language processing, in particular to a natural language semantic analysis method in a man-machine dialogue system.
Background
With the rapid development of artificial intelligence, the demand for intelligence in many application scenarios keeps increasing, and good human-computer interaction is indispensable to meeting it. Human-computer interaction modes are now diverse, and the most convenient among them is natural language. The call for realizing man-machine conversation through natural language is therefore growing ever louder, and man-machine dialogue systems have attracted wide attention in academia and industry, with very broad application scenarios.
Realizing a man-machine dialogue system depends on natural language semantic analysis technology, and the quality of semantic analysis directly affects the effect of human-computer interaction. The complexity and abstractness of natural language and the ambiguity of words increase the difficulty of natural language semantic analysis. Semantic analysis divides into two basic subtasks: intent recognition and semantic slot filling. The traditional approach treats the two tasks as independent problems to solve and then connects their results. In practice, however, intent recognition determines the type of user requirement, while semantic slot filling materializes that requirement. The user intent and the slots to be identified are therefore strongly correlated: intent is identified in order to fill semantic slots better. The conventional separate-modeling approach does not fully consider the connection between the two tasks, so semantic information cannot be used effectively. In addition, a man-machine dialogue system often faces a multi-intent recognition problem; that is, the intent text input by the user may contain not just one intent but several. At present, research on intent recognition focuses mainly on single intents, yet multi-intent recognition is more complicated than single-intent recognition and requires a higher degree of semantic understanding.
In summary, how to provide, on the basis of the prior art, a joint modeling method that effectively solves multi-intent recognition and semantic slot filling for the semantic analysis problem in man-machine dialogue systems remains a problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the above problems of semantic analysis in man-machine dialogue systems, the application provides a multi-intent and semantic slot joint recognition method based on cluster pre-analysis for solving the joint recognition problem of multi-intent recognition and semantic slot filling in man-machine conversation. The specific process of the method is as follows:
step 1, acquiring, in real time, the multi-intent text input by the current user and preprocessing it;
step 2, constructing a multi-intent recognition model based on cluster pre-analysis for recognizing the user's multiple intents;
step 3, constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, making full use of the intent recognition result to guide the filling of semantic slots;
step 4, optimizing the constructed joint model for multi-intent recognition and semantic slot filling, and performing recognition with the optimized model.
In step 1, preprocessing the multi-intent text input by the current user means vectorizing it so that semantic features can be extracted once it is input into the neural network model. The vectorization proceeds as follows: first, the BERT model is pre-trained with a massive unsupervised Chinese corpus from the same domain; the resulting BERT pre-trained model is then used to vectorize the multi-intent text.
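As an illustration only, a minimal sketch of this vectorization step, assuming the HuggingFace transformers library and the public bert-base-chinese checkpoint standing in for the patent's in-domain pre-trained model:

```python
# Minimal sketch of BERT-based text vectorization (not the patent's code).
# Assumes the `transformers` library; `bert-base-chinese` stands in for a
# model further pre-trained on in-domain unsupervised Chinese corpora.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

def vectorize(text: str):
    """Return the [CLS] sentence vector C and the per-token vectors."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**inputs)
    token_vectors = outputs.last_hidden_state   # shape (1, seq_len, 768)
    sentence_vector = token_vectors[:, 0, :]    # [CLS] position -> C
    return sentence_vector, token_vectors
```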
In step 2, the purpose of constructing the multi-intent recognition model based on cluster pre-analysis is to guide the filling of semantic slots: whether multi-intent recognition is accurate directly influences semantic slot filling. To improve the accuracy of multi-intent recognition in the face of uncertainty about the intents a user inputs, a method based on cluster pre-analysis is provided; that is, the intent text is analyzed before intent recognition to judge whether it carries a single intent or multiple intents.
The overall intent recognition divides into two stages. In the first stage, the input intent text is judged with the K-means clustering algorithm; since intents divide mainly into single intent and multi-intent, the number of cluster centers K is two. In the second stage, classification is carried out according to the judged number of intents.
When the intent text is judged to contain multiple intents, a multi-intent classifier is used: a fully connected layer is added after the BERT pre-trained model, each node of which is connected to all nodes of the previous layer so as to fuse the semantic features extracted earlier. The intent text vector output by the BERT model is then fed into a sigmoid classifier, which performs a binary classification for each label, thereby outputting multiple intent labels. The threshold of the sigmoid classifier is set to 0.6; experimental verification shows that this value yields the best classification effect. A label is output when its probability exceeds the set threshold and withheld when its probability falls below it. The label prediction is calculated as follows:
y_I = sigmoid(W_I · C + b_I)

where y_I denotes the predicted intent output, W_I the weight matrix, C the input text vector, and b_I the bias.
When the intent text is judged to carry a single intent, a softmax classifier is adopted: the sentence vector C that BERT outputs at the first token ([CLS]) is fed directly into the classifier, and the predicted intent label is obtained by the following formula:
y_I = softmax(W_I · C + b_I)
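A minimal sketch of the two classification heads; the layer sizes and label count are illustrative assumptions, and the 0.6 threshold is the value stated above:

```python
# Sketch of the second-stage classifier heads: a fully connected layer over
# the BERT [CLS] vector C feeds either a sigmoid head (multi-intent, one
# binary decision per label) or a softmax head (single intent).
import torch
import torch.nn as nn

HIDDEN, NUM_LABELS, THRESHOLD = 768, 10, 0.6   # illustrative sizes

fc = nn.Linear(HIDDEN, NUM_LABELS)             # weights W_I, bias b_I

def predict_multi_intent(C: torch.Tensor) -> list:
    """y_I = sigmoid(W_I C + b_I); output every label above the threshold."""
    probs = torch.sigmoid(fc(C))               # C: (1, HIDDEN)
    return (probs > THRESHOLD).nonzero(as_tuple=True)[-1].tolist()

def predict_single_intent(C: torch.Tensor) -> int:
    """y_I = softmax(W_I C + b_I); output the single most probable label."""
    probs = torch.softmax(fc(C), dim=-1)
    return probs.argmax(dim=-1).item()
```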
in the process of pre-analyzing the multi-intention text by using the K-means clustering algorithm, the semantic similarity between the intention texts needs to be judged. The measurement of semantic similarity is critical to the accuracy of the clustered results. For the measurement of text semantic similarity, a common way is to calculate cosine similarity. Cosine similarity can reflect the difference between two vectors in space. However, cosine similarity is insensitive to absolute values, and cannot measure differences in the same direction. The Euclidean distance is sensitive to absolute values when the similarity is calculated, and the difference in the same direction can be well measured. Therefore, the application combines the characteristics of the cosine similarity and the Euclidean distance, and provides a novel measurement method, which is as follows:
where f_1 refers to cosine similarity and f_2 to Euclidean distance; f_Sim(x_i, x_j) represents the distance between intent text vectors x_i and x_j, f_1(x_i, x_j) the cosine similarity between them, and f_2(x_i, x_j) the Euclidean distance between them. The larger the computed f_Sim value, the greater the similarity between the data objects; the smaller the value, the smaller the similarity. This measurement method gauges the similarity between texts better.
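The exact combining formula does not survive in this text. The sketch below computes f_1 and f_2 and uses one hypothetical combination for f_Sim that satisfies the stated property (larger value, more similar); the actual patented formula may differ:

```python
# Sketch: a similarity measure combining cosine similarity (f_1) and
# Euclidean distance (f_2). The combination in f_sim is a HYPOTHETICAL
# stand-in, not the patent's formula, chosen only to satisfy the stated
# property that a larger f_sim means greater similarity.
import numpy as np

def f1_cosine(x_i: np.ndarray, x_j: np.ndarray) -> float:
    return float(np.dot(x_i, x_j) /
                 (np.linalg.norm(x_i) * np.linalg.norm(x_j)))

def f2_euclidean(x_i: np.ndarray, x_j: np.ndarray) -> float:
    return float(np.linalg.norm(x_i - x_j))

def f_sim(x_i: np.ndarray, x_j: np.ndarray) -> float:
    # Grows with cosine similarity, shrinks with Euclidean distance, so
    # both direction and magnitude contribute.
    return f1_cosine(x_i, x_j) / (1.0 + f2_euclidean(x_i, x_j))
```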
In step 3, the BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism is constructed as follows:
The Slot-Gated association gate mechanism associates the intent recognition task with the semantic slot filling task: the intent vector from intent recognition is weight-summed with the intent text vector used for semantic slot filling, and the result is passed through the tanh activation function to obtain the intent-semantic slot joint feature vector g:

g = v · tanh(c_S + W · c_I)

where c_S denotes the semantic slot vector, c_I the intent vector (c_S and c_I have the same dimension), and v and W are a trainable vector and matrix, respectively.
The intent-semantic slot joint feature vector g is then input into the BiLSTM neural network to extract the word-order features of the text and capture deep contextual semantic information. A linear layer is added after the BiLSTM network to map the dimension of the network's output vectors for semantic slot decoding. Finally, a CRF is used as the decoding unit, outputting the slot label corresponding to each word of the sequence, calculated as follows:
wherein ,semantic slot prediction output representing the i-th word in the input text sequence,/>Is a weight matrix.
In step 4, the constructed joint model for multi-intent recognition and semantic slot filling is optimized, specifically as follows:
the performance of the joint recognition model is determined jointly by the two subtasks, and the joint probabilities of multi-intent recognition and semantic slot filling are as follows:
wherein ,representing multi-intent recognition y on the premise of inputting multi-intent text sequence x I And semantic slot filling->Joint conditional probability of (a)。
In joint model training, the goal is to maximize the joint probability of the output multi-intent recognition and semantic slot filling. To improve semantic analysis capability, the intent's semantic information is fully used to guide semantic slot filling and the joint recognition model is optimized. Model training departs from the traditional mode of simply adding up the loss functions of the several tasks; based on an iterative idea, a step-by-step iterative training mode combining multi-intent recognition and semantic slot filling is provided: (1) train the BERT model and the multi-intent recognition model with a training text, updating the parameters of both; (2) transmit the output of the multi-intent recognition model from (1) to the Slot-Gated gate, and train the BERT model updated in (1) together with the semantic slot filling model on the same training text as in (1), updating the parameters of the BERT model and the semantic slot filling model; (3) perform (1) and (2) iteratively until the training goal (the optimum) is reached. The multi-intent recognition and semantic slot filling tasks share the bottom-layer BERT parameters during training; that is, when one model is trained, the bottom model is initialized with the training result of the other. The upper-level tasks are trained separately, while the intent recognition result is transmitted to the semantic slot filling task to improve the accuracy of semantic slot filling.
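A minimal sketch of this alternating scheme; the module interfaces (encoder, intent_head.loss, slot_model.loss), the optimizer choice, and the detach of the intent vector are illustrative assumptions, not the patent's code:

```python
# Sketch of the step-by-step iterative training scheme. `batches` yields
# (inputs, intent_labels, slot_labels); encoder is the shared bottom-layer
# BERT model; intent_head and slot_model are assumed torch nn.Modules.
import torch

def train_jointly(encoder, intent_head, slot_model, batches, rounds=10):
    opt_intent = torch.optim.Adam(
        list(encoder.parameters()) + list(intent_head.parameters()))
    opt_slot = torch.optim.Adam(
        list(encoder.parameters()) + list(slot_model.parameters()))
    for _ in range(rounds):
        for inputs, intent_labels, slot_labels in batches:
            # (1) Train multi-intent recognition; backprop updates the
            #     intent head and the shared BERT encoder.
            features = encoder(**inputs).last_hidden_state
            loss_i = intent_head.loss(features[:, 0, :], intent_labels)
            opt_intent.zero_grad(); loss_i.backward(); opt_intent.step()
            # (2) Pass the intent output to the Slot-Gated gate and train
            #     slot filling on the same text; updates slot model + BERT.
            features = encoder(**inputs).last_hidden_state
            intent_vec = intent_head(features[:, 0, :]).detach()
            loss_s = slot_model.loss(features, intent_vec, slot_labels)
            opt_slot.zero_grad(); loss_s.backward(); opt_slot.step()
        # (3) Iterate (1)-(2) until the training objective stops improving.
```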
The loss function matters greatly to model parameter updates: if it is chosen unreasonably, even a powerful model will yield a poor final result.
The multi-intent recognition loss function Loss_intent in the joint recognition model is calculated as:

Loss_intent = (Loss_multi)^k · (Loss_single)^(1-k)

where k represents the category of the intent text: k is 1 when the intent text contains multiple intents and 0 when it carries a single intent. Loss_multi is the cross-entropy loss for multi-intent recognition and Loss_single the cross-entropy loss for single-intent recognition, computed with y_I as the predicted intent output, y_intent as the true intent, and T as the number of training texts.
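Since k is binary, the product reduces to selecting one branch. A minimal sketch:

```python
# Sketch: the combined intent loss. Because k is 0 or 1, the product
# (Loss_multi)^k * (Loss_single)^(1-k) simply selects the branch matching
# the pre-analysis result.
import torch.nn.functional as F

def intent_loss(logits, target, k: int):
    """logits: (T, num_labels); k = 1 for multi-intent, 0 for single intent."""
    if k == 1:
        # Loss_multi: per-label binary cross-entropy (sigmoid outputs);
        # target is a multi-hot matrix of shape (T, num_labels).
        return F.binary_cross_entropy_with_logits(logits, target.float())
    # Loss_single: categorical cross-entropy (softmax outputs);
    # target is a vector of T class indices.
    return F.cross_entropy(logits, target)
```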
The semantic slot filling task loss function Loss_slot in the joint recognition model is calculated as follows:

where ŷ_i^S denotes the semantic slot prediction output for the i-th word of a training text sequence, y_i^S the true semantic slot of the i-th word, T the number of training texts, and M the length of the training text sequence.
Compared with the prior art, the technical scheme provided by the application has the following technical effects:
the application fully considers the connection between the intention recognition and the semantic slot filling, constructs a joint recognition model, combines two semantic analysis subtasks into one task, and shares the BERT bottom semantic features. Then, the intent-semantic Slot joint feature vector is generated by using the Slot-oriented association gate and is used for the semantic Slot filling task. Capturing the word sequence characteristics of the text by using BiLSTM in the task of filling the semantic slot, and acquiring context semantic information; and CRF is used as a decoder, and the dependency relationship before and after the label is considered, so that the semantic slot labeling is more reasonable. In addition, in order to improve the overall performance of the joint model, in the multi-intention recognition process, aiming at the uncertainty of user input intention, a clustering pre-analysis-based algorithm is provided for judging the number of intention, the traditional semantic similarity measurement method is improved in the algorithm, a new measurement mode is provided, the similarity between intention texts can be more effectively measured by the new measurement mode, the accuracy of intention number judgment is improved, and the robustness of the algorithm is improved. In order to improve the capacity of semantic analysis, the meaning semantic information is fully utilized to guide the filling of semantic slots, and based on an iteration thought, a training mode through step-by-step iteration is provided, so that the interrelationship between the meaning and the semantic slots can be fully utilized, the filling accuracy of the semantic slots is improved, and meanwhile, the accuracy of a multi-meaning recognition model is improved, so that the effect of semantic analysis is improved.
Drawings
In order to make the objects, technical solutions and technical effects of the present application more clear, the present application provides the following drawings for explanation:
FIG. 1 is a block diagram of the overall structure of a joint modeling method of the present application;
FIG. 2 is a flow chart of multi-intent recognition based on cluster pre-analysis in accordance with the present application;
FIG. 3 is a diagram of a semantic slot recognition model of the present application;
FIG. 4 is a schematic diagram of the step-by-step iterative training scheme of the joint recognition model of the present application.
Detailed Description
As shown in FIG. 1, the application discloses a multi-intention and semantic slot joint identification method based on cluster pre-analysis, which comprises the following steps:
step S101, acquiring, in real time, the multi-intent text input by the current user and preprocessing it;
the multi-intention text input by the current user is preprocessed, namely the multi-intention text is vectorized for representation, so that semantic feature extraction is carried out in the input neural network model. The vectorization representation method comprises the steps of firstly training a BERT model by using massive Chinese unsupervised corpus in the same field. The resulting BERT pre-training model is then utilized to vectorize the multi-intent text.
Step S102, constructing a multi-intent recognition model based on cluster pre-analysis and recognizing the user's multiple intents, as shown in FIG. 2.
The purpose of constructing the multi-intent recognition model based on cluster pre-analysis is to guide the filling of semantic slots: whether multi-intent recognition is accurate directly influences semantic slot filling. To improve the accuracy of multi-intent recognition in the face of uncertainty about the intents a user inputs, a method based on cluster pre-analysis is provided; that is, the intent text is analyzed before intent recognition to judge whether it carries a single intent or multiple intents. The method is as follows:
The overall intent recognition divides into two stages. In the first stage, the input intent text is judged with the K-means clustering algorithm; since intents divide mainly into single intent and multi-intent, the number of cluster centers K is two. In the second stage, classification is carried out according to the judged number of intents. When the intent text is judged to contain multiple intents, a multi-intent classifier is used: a fully connected layer is added after the BERT pre-trained model, each node of which is connected to all nodes of the previous layer so as to fuse the semantic features extracted earlier. The intent text vector output by the BERT model is then fed into a sigmoid classifier, which performs a binary classification for each label, thereby outputting multiple intent labels. The label prediction is calculated as follows:
y_I = sigmoid(W_I · C + b_I)
When the intent text is judged to carry a single intent, a softmax classifier is adopted: the sentence vector C that BERT outputs at the first token ([CLS]) is fed directly into the classifier, and the predicted intent label is obtained by the following formula.
y_I = softmax(W_I · C + b_I)
In pre-analyzing multi-intent text with the K-means clustering algorithm, the semantic similarity between intent texts must be judged, and the measurement of semantic similarity is critical to the accuracy of the clustering result. A common way to measure text semantic similarity is to compute cosine similarity, which reflects the directional difference between two vectors in space; however, cosine similarity is insensitive to absolute values and cannot measure differences along the same direction. Euclidean distance, by contrast, is sensitive to absolute values when computing similarity and measures differences along the same direction well. The application therefore combines the characteristics of cosine similarity and Euclidean distance and proposes a novel measurement method, as follows:
where f_1 refers to cosine similarity and f_2 to Euclidean distance. The larger the computed f_Sim value, the greater the similarity between the data objects; the smaller the value, the smaller the similarity. With this method, the similarity between texts can be measured better.
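A minimal sketch of this first-stage pre-analysis: a short Lloyd-style K-means loop (K = 2) over intent text vectors, reusing the f_sim measure sketched earlier; the seed and iteration count are illustrative assumptions:

```python
# Sketch: stage-one pre-analysis as K-means with K = 2 (single- vs
# multi-intent) over BERT sentence vectors, using f_sim from the earlier
# sketch as the similarity. sklearn's KMeans does not accept a custom
# metric, so the loop is written out. Not the patent's code.
import numpy as np

def kmeans_pre_analysis(vectors: np.ndarray, n_iter: int = 20) -> np.ndarray:
    """vectors: (n, d) intent text vectors; returns a cluster index per text."""
    rng = np.random.default_rng(0)
    centers = vectors[rng.choice(len(vectors), size=2, replace=False)]
    for _ in range(n_iter):
        # Assign each text to the center it is most similar to (largest f_sim).
        sims = np.array([[f_sim(x, c) for c in centers] for x in vectors])
        labels = sims.argmax(axis=1)
        # Re-estimate each center, keeping the old one if its cluster emptied.
        centers = np.array([vectors[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(2)])
    return labels
```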
Step S103, a BiLSTM-CRF semantic slot filling model is built based on the Slot-Gated association gate mechanism, and the intent recognition result is fully used to guide the filling of semantic slots, as shown in FIG. 3.
The Slot-Gated association gate mechanism associates the intent recognition task with the semantic slot filling task: the intent vector from intent recognition is weight-summed with the intent text vector used for semantic slot filling, and the result is passed through the tanh activation function to obtain the intent-semantic slot joint feature vector g:

g = v · tanh(c_S + W · c_I)

where c_S denotes the semantic slot vector, c_I the intent vector (c_S and c_I have the same dimension), and v and W are a trainable vector and matrix, respectively.
The intent-semantic slot joint feature vector g is then input into the BiLSTM neural network to extract the word-order features of the text and capture deep contextual semantic information. A linear layer is added after the BiLSTM network to map the dimension of the network's output vectors for semantic slot decoding. Finally, a CRF is used as the decoding unit, outputting the slot label corresponding to each word of the sequence. It is calculated as follows:

where y_i^S denotes the semantic slot prediction output for the i-th word in the input text sequence and W_S is a weight matrix.
Step S104, optimizing the constructed joint model for multi-intent recognition and semantic slot filling, as shown in FIG. 4.
The performance of the joint recognition model is determined by the two subtasks together and is characterized by the joint conditional probability p(y_I, y_S | x) of the multi-intent recognition output y_I and the semantic slot filling output y_S given the input multi-intent text sequence x.
In joint model training, the goal is to maximize the joint probability of the output multi-intent recognition and semantic slot filling. To improve semantic analysis capability, the intent's semantic information is fully used to guide semantic slot filling and the joint recognition model is optimized. Model training departs from the traditional mode of simply adding up the loss functions of the several tasks; based on an iterative idea, a step-by-step iterative training mode combining multi-intent recognition and semantic slot filling is provided. As shown in FIG. 4, the training data is first input into the joint recognition model. During training, a round of the multi-intent recognition model is trained first, with the multi-intent recognition model parameters and the bottom-layer BERT model parameters updated through backpropagation. The updated model then transmits the semantic features of the multi-intent recognition result to the Slot-Gated gate, where the intent features are fused with the semantic slot features generated by the updated BERT model, producing the intent-semantic slot joint feature vector used to train the semantic slot filling model; during this training, the semantic slot filling model parameters and the bottom-layer BERT model parameters are updated through backpropagation. The process is repeated until the optimum is reached.
The multi-intent recognition and semantic slot filling tasks share the bottom-layer BERT parameters during training; that is, when one model is trained, the bottom model is initialized with the training result of the other. The upper-level tasks are trained separately, while the intent recognition result is transmitted to the semantic slot filling task, improving the accuracy of semantic slot filling and, at the same time, the accuracy of the multi-intent recognition model.
The loss function matters greatly to model parameter updates: if it is chosen unreasonably, even a powerful model will yield a poor final result.
The multi-intent recognition loss function Loss_intent in the joint recognition model is calculated as:

Loss_intent = (Loss_multi)^k · (Loss_single)^(1-k)

where k represents the category of the intent text: k is 1 when the intent text contains multiple intents and 0 when it carries a single intent. Loss_multi is the cross-entropy loss for multi-intent recognition and Loss_single the cross-entropy loss for single-intent recognition, computed with y_I as the predicted intent output and y_intent as the true intent.
The semantic slot filling task loss function Loss_slot in the joint recognition model is calculated as follows:

where ŷ_i^S denotes the semantic slot prediction output for the i-th word of a training text sequence, y_i^S the true semantic slot of the i-th word, T the number of training texts, and M the length of the training text sequence.
The application also provides a multi-intent and semantic slot joint recognition system for cluster pre-analysis, comprising a memory and a processor; the memory stores a computer program which, when executed by the processor, implements the multi-intent and semantic slot joint recognition method described above.
The present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the multi-intent and semantic slot joint recognition method described above. The computer readable storage medium may include: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
It should be noted that the above embodiments are only for aiding in understanding the method of the present application and its core idea, and that it will be obvious to those skilled in the art that several improvements and modifications can be made to the present application without departing from the principle of the present application, and these improvements and modifications are also within the scope of the claims of the present application.

Claims (6)

1. A multi-intent and semantic slot joint recognition method based on cluster pre-analysis, characterized by comprising the following steps:
step 1, acquiring a text input by a current user in real time, and vectorizing the text by using a BERT model;
step 2, constructing a multi-intention recognition model based on cluster pre-analysis, and recognizing a plurality of intentions of a user;
step 3, constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, and guiding the filling of semantic slots with the intent recognition result;
step 4, optimizing and training a joint model formed by the BERT model, the multi-intent recognition model and the semantic slot filling model, and performing recognition with the joint model whose optimization training is complete; the recognition method of the multi-intent recognition model in step 2 comprises two stages:
the first stage: dividing the input intention text vector into two types of single intention and multiple intention by using a K-means clustering algorithm;
and a second stage: carrying out classification and identification on the intention text vectors of the single intention category by adopting a softmax classifier; classifying and identifying the multi-intention text vector by adopting a sigmoid classifier;
in step 4, optimization training is carried out in a step-by-step iterative training mode combining multi-intent recognition and semantic slot filling: (1) training the BERT model and the multi-intent recognition model with a training text, and updating the parameters of the BERT model and the multi-intent recognition model; (2) transmitting the output of the multi-intent recognition model in (1) to the Slot-Gated gate, and training the BERT model updated in (1) and the semantic slot filling model with the same training text as in (1), thereby updating the parameters of the BERT model and the semantic slot filling model; (3) iteratively performing (1) and (2) until the training goal is reached.
2. The multi-intent and semantic slot joint recognition method based on cluster pre-analysis according to claim 1, wherein the distance function in the K-means clustering algorithm is as follows:

wherein f_Sim(x_i, x_j) represents the distance between intent text vector x_i and intent text vector x_j, f_1(x_i, x_j) the cosine similarity between them, and f_2(x_i, x_j) the Euclidean distance between them.
3. The multi-intent and semantic slot joint recognition method based on cluster pre-analysis according to claim 1, wherein the loss function Loss_intent of the multi-intent recognition model is as follows:

Loss_intent = (Loss_multi)^k · (Loss_single)^(1-k)

wherein k represents the category of the intent text, k being 1 when the intent text contains multiple intents and 0 when the intent text is a single intent; Loss_multi is the cross-entropy loss for multi-intent recognition and Loss_single the cross-entropy loss for single-intent recognition; y_I is the predicted intent output, y_intent the true intent, and T the number of training texts.
4. The multi-intent and semantic slot joint recognition method based on cluster pre-analysis according to claim 1, wherein the loss function Loss_slot of the semantic slot filling model is as follows:

wherein ŷ_i^S denotes the semantic slot prediction output of the i-th word in a training text sequence, y_i^S the true semantic slot of the i-th word in the training text sequence, T the number of training texts, and M the training text sequence length.
5. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the multi-intent and semantic slot joint recognition method according to any of claims 1 to 4.
6. A multi-intent and semantic slot joint recognition system for cluster pre-analysis, comprising: a memory and a processor; the memory has stored thereon a computer program which, when executed by the processor, implements the multi-intent and semantic slot joint identification method as claimed in any of claims 1 to 4.
CN202110325369.XA 2021-03-26 2021-03-26 Multi-intention and semantic slot joint identification method based on cluster pre-analysis Active CN113204952B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202110325369.XA CN113204952B (en) 2021-03-26 2021-03-26 Multi-intention and semantic slot joint identification method based on cluster pre-analysis
PCT/CN2021/091024 WO2022198750A1 (en) 2021-03-26 2021-04-29 Semantic recognition method
JP2022512826A JP7370033B2 (en) 2021-03-26 2021-04-29 Semantic recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110325369.XA CN113204952B (en) 2021-03-26 2021-03-26 Multi-intention and semantic slot joint identification method based on cluster pre-analysis

Publications (2)

Publication Number Publication Date
CN113204952A CN113204952A (en) 2021-08-03
CN113204952B (en) 2023-09-15

Family

ID=77025737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110325369.XA Active CN113204952B (en) 2021-03-26 2021-03-26 Multi-intention and semantic slot joint identification method based on cluster pre-analysis

Country Status (3)

Country Link
JP (1) JP7370033B2 (en)
CN (1) CN113204952B (en)
WO (1) WO2022198750A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115292463B (en) * 2022-08-08 2023-05-12 云南大学 Information extraction-based method for joint multi-intention detection and overlapping slot filling
CN115273849B (en) * 2022-09-27 2022-12-27 北京宝兰德软件股份有限公司 Intention identification method and device for audio data
CN116795886B (en) * 2023-07-13 2024-03-08 杭州逍邦网络科技有限公司 Data analysis engine and method for sales data
CN117435738B (en) * 2023-12-19 2024-04-16 中国人民解放军国防科技大学 Text multi-intention analysis method and system based on deep learning
CN117435716A (en) * 2023-12-20 2024-01-23 国网浙江省电力有限公司宁波供电公司 Data processing method and system of power grid man-machine interaction terminal

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767408A (en) * 2020-05-27 2020-10-13 青岛大学 Causal graph construction method based on integration of multiple neural networks

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200257856A1 (en) * 2019-02-07 2020-08-13 Clinc, Inc. Systems and methods for machine learning based multi intent segmentation and classification
CN110008476B (en) 2019-04-10 2023-04-28 出门问问信息科技有限公司 Semantic analysis method, device, equipment and storage medium
CN110321418B (en) * 2019-06-06 2021-06-15 华中师范大学 Deep learning-based field, intention recognition and groove filling method
CN112035626A (en) 2020-07-06 2020-12-04 北海淇诚信息科技有限公司 Rapid identification method and device for large-scale intentions and electronic equipment
CN112183062B (en) 2020-09-28 2024-04-19 云知声智能科技股份有限公司 Spoken language understanding method based on alternate decoding, electronic equipment and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767408A (en) * 2020-05-27 2020-10-13 青岛大学 Causal graph construction method based on integration of multiple neural networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A survey of intent recognition methods in man-machine dialogue systems; Liu Jiao; Li Yanling; Lin Min; Computer Engineering and Applications (No. 12); 6-12+48 *
Joint recognition of bus travel intent and semantic slot filling based on Attention+Bi-LSTM; Chen Tingting; Lin Min; Li Yanling; Journal of Qinghai Normal University (Natural Science Edition) (No. 04); 19-24 *
A survey of research on joint intent and semantic slot recognition in end-to-end dialogue systems; Wang Kun; Lin Min; Li Yanling; Computer Engineering and Applications (No. 14); 14-25 *
Joint recognition of intent and semantic slot filling fusing multiple constraints; Hou Lixian; Li Yanling; Lin Min; Li Chengcheng; Journal of Frontiers of Computer Science and Technology (No. 09); 1545-1553 *

Also Published As

Publication number Publication date
JP2023522502A (en) 2023-05-31
WO2022198750A1 (en) 2022-09-29
JP7370033B2 (en) 2023-10-27
CN113204952A (en) 2021-08-03

Similar Documents

Publication Publication Date Title
CN113204952B (en) Multi-intention and semantic slot joint identification method based on cluster pre-analysis
CN111626063B (en) Text intention identification method and system based on projection gradient descent and label smoothing
CN112733866B (en) Network construction method for improving text description correctness of controllable image
CN111581961A (en) Automatic description method for image content constructed by Chinese visual vocabulary
CN112306494A (en) Code classification and clustering method based on convolution and cyclic neural network
CN113255366B (en) Aspect-level text emotion analysis method based on heterogeneous graph neural network
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN111538846A (en) Third-party library recommendation method based on mixed collaborative filtering
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN114168754A (en) Relation extraction method based on syntactic dependency and fusion information
CN115687610A (en) Text intention classification model training method, recognition device, electronic equipment and storage medium
CN116150361A (en) Event extraction method, system and storage medium for financial statement notes
CN111984780A (en) Multi-intention recognition model training method, multi-intention recognition method and related device
CN116361442B (en) Business hall data analysis method and system based on artificial intelligence
CN115204143B (en) Method and system for calculating text similarity based on prompt
CN116910190A (en) Method, device and equipment for acquiring multi-task perception model and readable storage medium
CN113204971B (en) Scene self-adaptive Attention multi-intention recognition method based on deep learning
CN114548104A (en) Few-sample entity identification method and model based on feature and category intervention
CN114330350A (en) Named entity identification method and device, electronic equipment and storage medium
CN114417872A (en) Contract text named entity recognition method and system
CN113535928A (en) Service discovery method and system of long-term and short-term memory network based on attention mechanism
Wu et al. A text emotion analysis method using the dual-channel convolution neural network in social networks
US11934794B1 (en) Systems and methods for algorithmically orchestrating conversational dialogue transitions within an automated conversational system
CN116955579B (en) Chat reply generation method and device based on keyword knowledge retrieval
CN117113977B (en) Method, medium and system for identifying text generated by AI contained in test paper

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant