CN114756678A - Unknown intention text identification method and device - Google Patents

Unknown intention text identification method and device

Info

Publication number
CN114756678A
CN114756678A
Authority
CN
China
Prior art keywords
samples
sentence
text
category
decision
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210307174.7A
Other languages
Chinese (zh)
Inventor
李健铨
刘小康
穆晶晶
胡加明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dingfu Intelligent Technology Co ltd
Original Assignee
Dingfu Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dingfu Intelligent Technology Co ltd filed Critical Dingfu Intelligent Technology Co ltd
Priority to CN202210307174.7A priority Critical patent/CN114756678A/en
Publication of CN114756678A publication Critical patent/CN114756678A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

The embodiment of the application provides a method and a device for identifying an unknown intention text. The scheme comprises the following steps: acquiring K positive samples and S negative samples corresponding to each training sample, wherein K and S are positive integers greater than or equal to 1; obtaining sentence representations of the training samples and of their corresponding positive and negative samples by using a classifier, so that sentence representations of samples of the same category gather together and sentence representations of different categories move away from each other; determining a decision center for each category according to the sentence representations, and learning a decision boundary for each category; judging whether a text to be recognized lies outside the decision boundaries of all categories; and if so, determining that the text to be recognized is an unknown intention text. In the embodiment of the application, contrastive learning and classification learning are introduced at the stage of training the classifier, so that sentence representations of samples of the same category gather together and sentence representations of different categories move away from each other; the decision boundary is therefore trained with better effect, and the classifier can accurately recognize texts with unknown intentions.

Description

Unknown intention text identification method and device
Technical Field
The application relates to the technical field of natural language processing, in particular to a method and a device for identifying unknown intention texts.
Background
Text classification is one of the basic tasks in the field of natural language processing and has a wide range of applications in real life; for example, public opinion monitoring, news classification and emotion classification based on natural language processing are all realized through the text classification task.
At present, a text classification task trains a classification model on training samples of several fixed classes, so that the classification model can identify texts of these fixed classes from the texts to be recognized. However, for texts that do not belong to these fixed classes (namely, texts with unknown intentions), the classification model cannot classify them. For example, in a news classification scenario, if the training samples carry labels of three categories, sports, economy and entertainment, a classification model trained on these training samples can only classify texts to be recognized into the three categories of sports, economy and entertainment; a text to be recognized of the life category is an unknown intention for the classification model, and the classification model cannot recognize it.
Additionally, in some scenarios there may be many text classes, and the class labels of the training samples may cover only part of them, i.e., the class labels of the training samples are incomplete. For example, in the field of travel mode identification, the class labels of the training samples may include walking, taking a bus, riding a bicycle and driving, but the travel modes may also include ride-hailing, taking a train, multi-modal transfer and the like; for the classification model, ride-hailing, taking a train, multi-modal transfer and the like are unknown intentions which it cannot identify.
Disclosure of Invention
The embodiment of the application provides a method and a device for identifying an unknown intention text, which can accurately identify the unknown intention text from the text to be identified.
In a first aspect, an embodiment of the present application provides a method for identifying an unknown intention text, including: acquiring K positive samples and S negative samples corresponding to each training sample, wherein the positive samples are randomly acquired from samples of the same category as the training sample, the negative samples are randomly acquired from samples of different categories, and K and S are both positive integers greater than or equal to 1; obtaining sentence representations of the training samples and of their corresponding positive and negative samples by using a classifier, wherein the classifier gathers sentence representations of samples of the same category through a contrastive learning loss function and pushes sentence representations of different categories away from each other through a classification learning loss function; determining a decision center for each category according to the sentence representations, and learning a decision boundary for each category; acquiring the similarity between the text to be recognized and the decision center of each category to determine the target category with the maximum similarity; judging whether the text to be recognized lies outside the decision boundary of the target category; if the text to be recognized lies outside the decision boundary of the target category, determining that the text to be recognized is an unknown intention text; and if the text to be recognized lies within the decision boundary of the target category, determining that the text to be recognized belongs to the target category.
According to the method provided by the embodiment of the application, contrastive learning and classification learning are introduced at the stage of training the classifier, so that sentence representations of samples of the same category gather together and sentence representations of different categories move away from each other; the decision boundary is therefore trained with better effect, and the classifier can more accurately identify texts with unknown intentions from the texts to be recognized.
In one implementation, the contrastive learning loss function is constructed from the distance between a training sample and any one of its positive samples, and the sum of the distances between the training sample and all of its negative samples.
In one implementation, the contrastive learning loss function is embodied as the following Loss1:

$$Loss_1 = -\frac{1}{N}\sum_{v_j\in V^{+}}\log\frac{\exp(v_i\cdot v_j/\tau)}{\exp(v_i\cdot v_j/\tau)+\sum_{v^{-}\in V^{-}}\big[\exp(v_i\cdot v^{-}/\tau)+\exp(v_j\cdot v^{-}/\tau)\big]}$$

where N is the number of positive samples, v_i denotes the normalized sentence representation of the training sample, v_j denotes the normalized sentence representation of a positive sample, v^- denotes the normalized sentence representation of a negative sample, V^+ denotes the set of all positive samples, V^- denotes the set of all negative samples, τ is a hyperparameter, exp(v_i·v_j/τ) denotes the distance between the training sample and any one of its positive samples, and Σ_{v^-∈V^-}[exp(v_i·v^-/τ)+exp(v_j·v^-/τ)] denotes the sum of the distances between the training sample and all of its negative samples.
In one implementation, the classification learning loss function is constructed from the cosine distance between the sentence representation of the training sample and the representation of the true label corresponding to its class, and the sum of the cosine distances between the sentence representation of the training sample and the representations of all other class labels.
In one implementation, the classification learning loss function is embodied as the following Loss2:

$$Loss_2 = -\log\frac{e^{\,s\,(\cos(\theta_{y_i},\,z_i)-m)}}{e^{\,s\,(\cos(\theta_{y_i},\,z_i)-m)}+\sum_{j\neq y_i}e^{\,s\,\cos(\theta_j,\,z_i)}}$$

where z_i denotes the sentence representation of the training sample, θ_{y_i} denotes the representation of the true label of the training sample, θ_j denotes the representation of a label of another class, cos(θ_{y_i}, z_i) denotes the cosine distance between the sentence representation of the training sample and the representation of the true label corresponding to its class, cos(θ_j, z_i) denotes the cosine distance between the sentence representation of the training sample and the representation of another class label, m is a preset parameter, and s is a preset multiple.
In one implementation, learning the decision boundary of each category includes: constructing a decision boundary optimization function according to the numerical relationship between the decision radius and the cosine distance between the sentence representation of the training sample and the decision center corresponding to its category, wherein the numerical relationship is either that this cosine distance is greater than the decision radius of the category, or that this cosine distance is less than or equal to the decision radius of the category; and learning the decision boundary of each category cluster according to the decision boundary optimization function.
In one implementation, the decision boundary optimization function is embodied as the following L_b:

$$L_b = \frac{1}{N}\sum_{i=1}^{N}\Big\{\delta_i\big[\big(1-\cos(c_{y_i},z_i)\big)-\Delta_{y_i}\big]+(1-\delta_i)\big[\Delta_{y_i}-\big(1-\cos(c_{y_i},z_i)\big)\big]\Big\}$$

$$\delta_i=\begin{cases}1,&1-\cos(c_{y_i},z_i)>\Delta_{y_i}\\0,&1-\cos(c_{y_i},z_i)\le\Delta_{y_i}\end{cases}$$

where N is the number of positive samples, Δ_{y_i} denotes the decision radius of the category, c_{y_i} denotes the decision center of the category, z_i denotes the sentence representation of the training sample, cos(c_{y_i}, z_i) denotes the cosine distance between training sample z_i and decision center c_{y_i}, and δ_i indicates whether the training sample is inside the decision boundary.
In one implementation, the classifier employs the following overall LOSS function LOSS:
LOSS=Loss1×a+(1-a)×Loss2
wherein a is an adjustable hyper-parameter.
In one implementation, the representation of the label is obtained by: obtaining sentence representations of all training samples of the label using the classifier; and taking the central point of the sentence representations of all training samples of the label as the sentence representation of the label.
In a second aspect, an embodiment of the present application provides an apparatus for recognizing an unknown intention text, including: a processor and a memory, the memory including program instructions which, when executed by the processor, cause the apparatus to perform the following method steps: acquiring K positive samples and S negative samples corresponding to each training sample, wherein the positive samples are randomly acquired from samples of the same category as the training sample, the negative samples are randomly acquired from samples of different categories, and K and S are both positive integers greater than or equal to 1; obtaining sentence representations of the training samples and of their corresponding positive and negative samples by using a classifier, wherein the classifier gathers sentence representations of samples of the same category through a contrastive learning loss function and pushes sentence representations of different categories away from each other through a classification learning loss function; determining a decision center for each category according to the sentence representations, and learning a decision boundary for each category; acquiring the similarity between the text to be recognized and the decision center of each category to determine the target category with the maximum similarity; judging whether the text to be recognized lies outside the decision boundary of the target category; if the text to be recognized lies outside the decision boundary of the target category, determining that the text to be recognized is an unknown intention text; and if the text to be recognized lies within the decision boundary of the target category, determining that the text to be recognized belongs to the target category.
According to the device provided by the embodiment of the application, contrastive learning and classification learning are introduced at the stage of training the classifier, so that sentence representations of samples of the same category gather together and sentence representations of different categories move away from each other; the decision boundary is therefore trained with better effect, and the classifier can more accurately identify texts with unknown intentions from the texts to be recognized.
Drawings
Fig. 1 is a schematic structural diagram of a classifier provided in an embodiment of the present application;
FIG. 2 is a flowchart of a method for recognizing unknown intention text according to an embodiment of the present application;
FIG. 3 is a flow chart for learning decision boundaries for each category provided by embodiments of the present application;
fig. 4 is a schematic structural diagram of an apparatus for recognizing an unknown intention text according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of another unknown intention text recognition apparatus provided in an embodiment of the present application.
Detailed Description
Text classification is one of the basic tasks in the field of natural language processing and has a wide range of applications in real life; for example, public opinion monitoring, news classification and emotion classification based on natural language processing are all realized through the text classification task.
At present, a text classification task trains a classification model on training samples of several fixed classes, so that the classification model can identify texts of these fixed classes from the texts to be recognized. However, for texts that do not belong to these fixed classes (i.e., unknown intentions), the classification model cannot classify them. For example, in a news classification scenario, if the training samples carry labels of three categories, sports, economy and entertainment, a classification model trained on these training samples can only classify texts to be recognized into the three categories of sports, economy and entertainment; a text to be recognized of the life category is an unknown intention for the classification model, and the classification model cannot recognize it.
Additionally, in some scenarios there may be many text classes, and the class labels of the training samples may cover only part of them, i.e., the class labels of the training samples are incomplete. For example, in the field of travel mode identification, the class labels of the training samples may include walking, taking a bus, riding a bicycle and driving, but the travel modes may also include ride-hailing, taking a train, multi-modal transfer and the like; for the classification model, ride-hailing, taking a train, multi-modal transfer and the like are unknown intentions, and the current classification model cannot identify them.
In addition, the current classification model is usually obtained by training a deep learning model, and the deep learning model can only judge the class of an input text among the classes it was trained on. For an input text of an untrained class, the deep learning model still outputs the known class with the highest probability, so the input text may be classified into a wrong class.
In order to more accurately identify texts with unknown intentions from texts to be identified, the embodiment of the application provides an identification method of texts with unknown intentions. The method may be implemented by training a classification model based on a deep learning algorithm or by other algorithms or means. The training of the classification model may include two stages as a whole, where the first stage is to train the classifier and the second stage is to train the decision boundary. A decision boundary is here understood to be a boundary of a class, which can be used to determine whether a certain sample belongs to a certain class. For example: if the sample of a certain category is positioned in the decision boundary of the certain category, the sample is indicated to belong to the category; if a sample of a certain class is outside the decision boundary of a certain class, it is indicated that the sample does not belong to this class.
The classification model can adopt a pretrained language model such as BERT, RoBERTa, GPT or UniLM as the feature extractor. The classification model may also be a deep learning model of arbitrary structure, for example a deep learning model built from RNNs, CNNs or Transformers. Fig. 1 is a schematic structural diagram of a BERT model shown in an embodiment of the present application. As shown in fig. 1, the BERT model used as the feature extractor may include an input embedding layer (Embedding), a position encoding layer (Position Encoding), and N Transformer blocks. In the stage of training the classifier, the input embedding layer performs embedding encoding on a training sample, the position encoding layer adds position encoding to the embedding of the training sample, and the N Transformer blocks extract the sentence representation of the training sample.
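Illustratively, the structure in fig. 1 can be sketched in PyTorch as follows. This is only a minimal sketch of an input embedding layer, a position encoding layer and N Transformer blocks; the vocabulary size, hidden dimension, number of blocks and the use of nn.TransformerEncoder are assumptions made for illustration, not the exact BERT implementation.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Minimal sketch: input embedding + position encoding + N Transformer blocks."""

    def __init__(self, vocab_size=21128, hidden=768, n_blocks=12, n_heads=12, max_len=512):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, hidden)      # input embedding layer
        self.position_embedding = nn.Embedding(max_len, hidden)      # position encoding layer
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=n_heads, dim_feedforward=4 * hidden, batch_first=True)
        self.blocks = nn.TransformerEncoder(encoder_layer, num_layers=n_blocks)  # N Transformer blocks

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) integer ids; returns (batch, seq_len, hidden)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.token_embedding(token_ids) + self.position_embedding(positions)
        return self.blocks(x)
```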
Fig. 2 is a flowchart of a method for identifying an unknown intention text according to an embodiment of the present application. As shown in fig. 2, the method may include the following steps S101 to S105. Step S101 and step S102 correspond to a stage of training a classifier, and step S103 corresponds to a stage of training a decision boundary.
Step S101, K positive samples and S negative samples corresponding to each training sample are obtained, the positive samples are randomly obtained from the same class samples of the training samples, the negative samples are randomly obtained from different class samples of the training samples, and both K and S are positive integers larger than or equal to 1.
In this embodiment, the training samples may be texts of known categories, such as words, phrases or sentences. There may be multiple known classes, and each class may contain one or more training samples. For any training sample, the other samples belonging to the same class may be used as its positive samples, and the samples belonging to different classes may be used as its negative samples.
In order to train the classifier, in the embodiment of the present application, for each training sample, K positive samples (for example, 2 or 3 positive samples) are randomly selected from the other samples of its own class, and S negative samples (for example, 2 or 3 negative samples) are randomly selected from the samples of different classes, so as to construct the input of the feature extractor.
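Illustratively, the sampling of positive and negative samples described above can be sketched as follows; the data layout (a mapping from each class label to its list of texts) and the function name sample_pos_neg are assumptions made for this example.

```python
import random

def sample_pos_neg(samples_by_class, text, label, k=3, s=3):
    """Return (positives, negatives) for one training sample.

    samples_by_class: dict mapping class label -> list of texts of that class.
    """
    # positives: other samples of the same class (excluding the sample itself)
    same_class = [t for t in samples_by_class[label] if t != text]
    positives = random.sample(same_class, min(k, len(same_class)))

    # negatives: samples drawn from all the other classes
    other_class = [t for lab, texts in samples_by_class.items() if lab != label for t in texts]
    negatives = random.sample(other_class, min(s, len(other_class)))
    return positives, negatives

# usage sketch with a toy corpus
corpus = {"sports": ["奥运会男子接力", "世界杯决赛"], "economy": ["股市上涨"], "entertainment": ["新电影上映"]}
pos, neg = sample_pos_neg(corpus, "奥运会男子接力", "sports", k=1, s=2)
```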
Step S102, sentence representations of the training samples and of their corresponding positive and negative samples are obtained by using the classifier; the classifier gathers sentence representations of samples of the same category through a contrastive learning loss function and pushes sentence representations of different categories away from each other through a classification learning loss function.
Compared with the traditional approach in which sentence representations are both gathered and separated only through a classification learning loss function, the embodiment of the present application additionally introduces a contrastive learning loss function. The two loss functions divide the work: the contrastive learning loss function gathers sentence representations of samples of the same category, and the classification learning loss function pushes sentence representations of different categories away from each other. Because contrastive learning focuses on learning the common characteristics of samples of the same category, the method provided by the embodiment of the present application achieves a better gathering effect for sentence representations of same-category samples, which helps improve the accuracy of the subsequently learned decision centers and decision boundaries.
In specific implementation, in order to obtain sentence representations of the training samples, the positive samples and the negative samples, the Embedding codes of the training samples, the positive samples and the negative samples may be obtained through an Embedding Layer, and then the Embedding codes are input to the feature extractor to obtain corresponding sentence representations.
Taking BERT or RoBERTa as the feature extractor for example, the sentence representation of a sample (a training sample, a positive sample or a negative sample) may be the vector corresponding to the first character or first word segment in the feature extractor's output for the sample, i.e., the vector corresponding to the [CLS] position.
For example, the word segmentation result of the training sample "Olympic Association men's relay" is "Olympic Association / men's / relay", so the sentence representation of the training sample is the vector output by the feature extractor at the position of the first segment "Olympic Association".
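Illustratively, extracting the sentence representation from the output position of the first token can be sketched as follows, assuming the HuggingFace transformers library and a Chinese BERT checkpoint; the checkpoint name and maximum length are illustrative assumptions.

```python
import torch
from transformers import BertTokenizer, BertModel

# a sketch, assuming the `transformers` library and the bert-base-chinese checkpoint
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

def sentence_representation(text: str) -> torch.Tensor:
    """Return the vector at the first output position ([CLS]) as the sentence representation."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = encoder(**inputs)
    return outputs.last_hidden_state[:, 0, :]   # (1, hidden) vector at the [CLS] position

rep = sentence_representation("奥运会男子接力")
```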
In the embodiment of the present application, at the output end of the feature extractor, sentence representations of samples of the same category are gathered together through contrastive learning, and sentence representations of different categories are separated from each other through classification learning.
In one implementation, the goal of contrastive learning may be achieved through the contrastive learning loss function Loss1. In one implementation, the contrastive learning loss function may be constructed according to the distance between the training sample and any one of its positive samples, and the sum of the distances between the training sample and all of its negative samples.
Exemplarily, the contrastive learning loss function Loss1 may take the following form:

$$Loss_1 = -\frac{1}{N}\sum_{v_j\in V^{+}}\log\frac{\exp(v_i\cdot v_j/\tau)}{\exp(v_i\cdot v_j/\tau)+\sum_{v^{-}\in V^{-}}\big[\exp(v_i\cdot v^{-}/\tau)+\exp(v_j\cdot v^{-}/\tau)\big]}$$

where N is the number of positive samples, v_i denotes the normalized sentence representation of the training sample, v_j denotes the normalized sentence representation of a positive sample, v^- denotes the normalized sentence representation of a negative sample, V^+ denotes the set of all positive samples, V^- denotes the set of all negative samples, τ is a hyperparameter, exp(v_i·v_j/τ) denotes the distance between the training sample and any one of its positive samples, and Σ_{v^-∈V^-}[exp(v_i·v^-/τ)+exp(v_j·v^-/τ)] denotes the sum of the distances between the training sample and all of its negative samples.
In one implementation, the normalization of a sentence representation can be achieved using the following formula:

$$X=\frac{x}{\sqrt{\sum_{i=1}^{n}x_i^{2}}}$$

where X denotes the normalized result of the sentence representation, x denotes the sentence representation vector, n is the dimension of the sentence representation vector, and x_i denotes the value of the i-th dimension of the sentence representation vector.
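Assuming the contrastive learning loss Loss1 takes the standard form reconstructed above, a minimal PyTorch sketch could look as follows; the tensor shapes, the default value of τ and the function name are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positives, negatives, tau=0.1):
    """Sketch of Loss1 for one training sample.

    anchor:    (d,)   sentence representation of the training sample
    positives: (K, d) representations of its K positive samples
    negatives: (S, d) representations of its S negative samples
    """
    # L2-normalize every representation, as in the normalization formula above
    v_i = F.normalize(anchor, dim=-1)
    v_pos = F.normalize(positives, dim=-1)
    v_neg = F.normalize(negatives, dim=-1)

    loss = 0.0
    for v_j in v_pos:
        pos_term = torch.exp(v_i @ v_j / tau)                                   # distance to one positive
        neg_term = (torch.exp(v_neg @ v_i / tau) + torch.exp(v_neg @ v_j / tau)).sum()  # sum over negatives
        loss = loss - torch.log(pos_term / (pos_term + neg_term))
    return loss / len(v_pos)

# usage sketch
d = 768
loss1 = contrastive_loss(torch.randn(d), torch.randn(3, d), torch.randn(5, d))
```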
In one implementation, the objective of classification learning may be achieved through the classification learning loss function Loss2. In one implementation, the classification learning loss function can be constructed from the cosine distance between the sentence representation of the training sample and the representation of the true label corresponding to its class, and the sum of the cosine distances between the sentence representation of the training sample and the representations of all other class labels.
Exemplarily, the classification learning loss function Loss2 may take the following form:

$$Loss_2 = -\log\frac{e^{\,s\,(\cos(\theta_{y_i},\,z_i)-m)}}{e^{\,s\,(\cos(\theta_{y_i},\,z_i)-m)}+\sum_{j\neq y_i}e^{\,s\,\cos(\theta_j,\,z_i)}}$$

where z_i denotes the sentence representation of the training sample, θ_{y_i} denotes the representation of the true label of the training sample, θ_j denotes the representation of a label of another class, cos(θ_{y_i}, z_i) denotes the cosine distance between the sentence representation of the training sample and the representation of the true label corresponding to its class, cos(θ_j, z_i) denotes the cosine distance between the sentence representation of the training sample and the representation of another class label, m is a preset parameter, s is a preset multiple, and both m and s are modifiable parameters.
Illustratively, in the classification learning loss function Loss2, s can take a value such as 10, 15 or 20, and m can take an arbitrary value between 0.3 and 0.5, so that the cosine distance between the sentence representation of the training sample and the representation of the true label corresponding to its category is made greater than m.
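Assuming the classification learning loss Loss2 takes the large-margin cosine form reconstructed above, a minimal PyTorch sketch for a single training sample could look as follows; the default values of s and m and the function name are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def margin_cosine_loss(z, label_reps, true_label, s=15.0, m=0.35):
    """Sketch of Loss2 for a single training sample.

    z:          (d,)   sentence representation of the training sample
    label_reps: (C, d) representations of the C class labels
    true_label: int index of the sample's true class
    """
    cos = F.cosine_similarity(z.unsqueeze(0), label_reps, dim=-1)    # (C,) cosine to every label
    one_hot = F.one_hot(torch.tensor(true_label), num_classes=label_reps.size(0)).float()
    logits = s * (cos - m * one_hot)                                 # subtract the margin for the true class only
    return F.cross_entropy(logits.unsqueeze(0), torch.tensor([true_label]))

loss2 = margin_cosine_loss(torch.randn(768), torch.randn(4, 768), true_label=2)
```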
It should be added here that, in the embodiment of the present application, the representation of the category label can be implemented in three ways:
a first implementation is to initialize the representation of the class label randomly and then learn in the classifier.
The second implementation is to add a label description text to the category label, input the embedding encoding of the category label and its label description text into the feature extractor, and take the vector corresponding to the first character or first word segment of the feature extractor's output, i.e., the vector corresponding to the [CLS] position, as the representation of the category label.
For example, for the category label "sports", its label description text may be "a physical education activity and a social and cultural activity of human society", and thus the text input into the feature extractor may be "sports: a physical education activity and a social and cultural activity of human society".
A third implementation is to obtain a representation of all training samples for each class label by the feature extractor, and then take the central point of the representation of all training samples for each class label as the representation of each class label.
Illustratively, the representation of the category label may be obtained by the following formula:

$$c_k=\frac{1}{|S_k|}\sum_{z_i\in S_k}z_i$$

where c_k denotes the representation of the class label of the k-th class, z_i is the sentence representation of the i-th training sample in the class, S_k denotes the set of all training samples in the k-th class, and |S_k| denotes the number of training samples in the k-th class.
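Illustratively, this third implementation can be sketched as follows; the tensor layout is an assumption made for the example.

```python
import torch

def label_representation(class_sentence_reps: torch.Tensor) -> torch.Tensor:
    """c_k = mean of the sentence representations of all training samples in class k.

    class_sentence_reps: (|S_k|, d) sentence representations of the k-th class.
    """
    return class_sentence_reps.mean(dim=0)

c_k = label_representation(torch.randn(20, 768))   # this centroid can also serve as the decision center
```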
Based on the two training objectives of contrastive learning and classification learning introduced at the classifier training stage, the total loss function LOSS of the classifier training stage may be:
LOSS = Loss1 × a + (1 − a) × Loss2
where Loss1 is the contrastive learning loss function, Loss2 is the classification learning loss function, and a is an adjustable hyperparameter used to adjust the weights of contrastive learning and classification learning when training the classifier.
And step S103, determining a decision center of each category according to sentence representation, and learning a decision boundary of each category.
Wherein, the decision center may be a central point of all training samples in the category in the semantic space. When the third implementation manner is adopted to obtain the representation of the category label, the representation of the category label can be used as a decision center.
Fig. 3 is a flowchart for learning a decision boundary of each category according to an embodiment of the present application.
As shown in fig. 3, in one implementation, the decision boundary of each category can be obtained by:
step S301, a decision boundary optimization function is constructed according to the numerical relationship between the cosine distance between the sentence representation of the training sample and the decision center corresponding to the category of the training sample and the decision radius.
The numerical relationship comprises that the cosine distance between the sentence representation of the training sample and the decision center corresponding to the category of the sentence representation of the training sample is larger than the decision boundary of the category, or the cosine distance between the sentence representation of the training sample and the decision center corresponding to the category of the sentence representation of the training sample is smaller than or equal to the decision boundary of the category.
Step S302, learning the decision boundary of each category cluster according to a decision boundary optimization function.
Different from the traditional practice of measuring similarity with the Euclidean distance in decision boundary learning, the embodiment of the present application uses the cosine distance to measure the similarity between the training sample and the decision center. The consideration here is that the Euclidean distance focuses more on the absolute distance between samples, while the cosine distance focuses more on the difference between two samples in a certain direction (for example, an intention); therefore, using the cosine distance to measure the similarity between the training sample and the decision center can better reflect whether the training sample is similar or identical to the decision center in intention.
Exemplarily, the decision boundary optimization function L_b may take the following form:

$$L_b = \frac{1}{N}\sum_{i=1}^{N}\Big\{\delta_i\big[\big(1-\cos(c_{y_i},z_i)\big)-\Delta_{y_i}\big]+(1-\delta_i)\big[\Delta_{y_i}-\big(1-\cos(c_{y_i},z_i)\big)\big]\Big\}$$

$$\delta_i=\begin{cases}1,&1-\cos(c_{y_i},z_i)>\Delta_{y_i}\\0,&1-\cos(c_{y_i},z_i)\le\Delta_{y_i}\end{cases}$$

where N is the number of positive samples, Δ_{y_i} denotes the decision radius of the category, c_{y_i} denotes the decision center of the category, z_i denotes the sentence representation of the training sample, cos(c_{y_i}, z_i) denotes the cosine distance between training sample z_i and decision center c_{y_i}, and δ_i indicates whether the training sample is inside the decision boundary. The optimization objective is to make L_b as small as possible.
Wherein: the larger the cosine distance cos(c_{y_i}, z_i), the greater the similarity between the training sample and the decision center, and the closer the training sample is to the decision center; the smaller the cosine distance, the smaller the similarity, and the farther the training sample is from the decision center. Therefore, the above formula compares 1 − cos(c_{y_i}, z_i) with the decision radius Δ_{y_i}: the larger 1 − cos(c_{y_i}, z_i), the farther the training sample is from the decision center; the smaller 1 − cos(c_{y_i}, z_i), the closer the training sample is to the decision center.
According to the optimization function, the main idea of decision boundary learning is as follows: if a training sample of a certain category is inside the decision boundary of the category, the decision boundary is narrowed toward the training sample; if a training sample of a certain category is outside the decision boundary of the category, the decision boundary is enlarged to contain the training sample. In this way, the decision boundary of each category can be adaptively adjusted according to the positions of the training samples of the category, so that as many training samples of the category as possible fall inside its decision boundary while samples of other categories are kept outside it as far as possible, making the learned decision boundary more accurate. For example, when 1 − cos(c_{y_i}, z_i) of training sample z_i and decision center c_{y_i} is greater than Δ_{y_i}, δ_i = 1, and the optimization objective actually becomes 1 − cos(c_{y_i}, z_i) − Δ_{y_i}; then, in order to make L_b smaller, the boundary Δ_{y_i} may be increased.
In addition, if the similarity between the training samples and the decision center is measured by the Euclidean distance, the decision boundary optimization function L_b may take the form:

$$L_b = \frac{1}{N}\sum_{i=1}^{N}\Big\{\delta_i\big[\|z_i-c_{y_i}\|-\Delta_{y_i}\big]+(1-\delta_i)\big[\Delta_{y_i}-\|z_i-c_{y_i}\|\big]\Big\}$$

$$\delta_i=\begin{cases}1,&\|z_i-c_{y_i}\|>\Delta_{y_i}\\0,&\|z_i-c_{y_i}\|\le\Delta_{y_i}\end{cases}$$

where N is the number of positive samples, Δ_{y_i} denotes the decision radius of the category, c_{y_i} denotes the decision center of the category, z_i denotes the sentence representation of the training sample, ‖z_i − c_{y_i}‖ denotes the Euclidean distance between training sample z_i and decision center c_{y_i}, and δ_i indicates whether the training sample is inside the decision boundary. The optimization objective is to make L_b as small as possible.
Wherein: the larger the Euclidean distance is, the smaller the similarity between the training sample and the decision center is, and the farther the distance between the training sample and the decision center is; the smaller the Euclidean distance is, the greater the similarity between the training sample and the decision center is, and the closer the distance between the training sample and the decision center is.
And step S104, acquiring the similarity of the text to be recognized and the decision centers of all categories to determine the target category corresponding to the maximum similarity.
In step S104, after the text to be recognized is input to the classifier, the classifier may calculate the similarity between the text to be recognized and the decision center of each category, respectively, so as to determine the target category with the largest similarity.
Wherein:
if the cosine distance is used for representing the similarity, the larger the cosine distance between the text to be recognized and the decision center is, the larger the similarity between the text to be recognized and the decision center is, and conversely, the smaller the cosine distance between the text to be recognized and the decision center is, the smaller the similarity between the text to be recognized and the decision center is. Therefore, the class corresponding to the maximum value of the cosine distance is the target class.
If the similarity is expressed by the Euclidean distance, the larger the Euclidean distance between the text to be recognized and the decision center is, the smaller the similarity between the text to be recognized and the decision center is, and conversely, the smaller the Euclidean distance between the text to be recognized and the decision center is, the larger the similarity between the text to be recognized and the decision center is. Therefore, the class corresponding to the minimum value of the euclidean distance is the target class.
Step S105, judging whether the text to be recognized is located outside the decision boundary of the target category.
Wherein:
if the similarity is expressed in terms of cosine distance, the distance between the text to be recognized and the decision center of the target category may be expressed as: 1-cosine distance. If the 1-cosine distance is larger than the decision radius of the target category, the text to be recognized is positioned outside the decision boundary of the target category; if the 1-cosine distance is less than the decision radius of the target class, it is indicated that the text to be recognized is located within the decision boundary of the target class.
If the similarity is expressed by the Euclidean distance, if the Euclidean distance is greater than the decision radius of the target category, the text to be recognized is positioned outside the decision boundary of the target category; and if the Euclidean distance is smaller than the decision radius of the target category, the text to be recognized is positioned in the decision boundary of the target category.
In addition, for the case that the 1-cosine distance is equal to the decision radius of the target category and the case that the euclidean distance is equal to the decision radius of the target category, the text to be recognized may be considered to be located outside the decision boundary of the target category, or may be considered to be located within the decision boundary of the target category.
And S106, if the text to be recognized is located outside the decision boundary of the target category, determining that the text to be recognized is an unknown intention text.
And S107, if the text to be recognized is located within the decision boundary of the target category, determining that the text to be recognized belongs to the target category.
The above steps S104-S106 can be implemented in a test phase or a production phase of unknown intention text recognition.
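Illustratively, putting steps S104 to S107 together, the inference logic of the cosine-similarity variant can be sketched as follows; the stored decision centers and radii are assumed to come from the training stage, and the function name and the UNKNOWN sentinel are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

UNKNOWN = -1  # label returned for unknown-intention text

def classify(z, centers, radii):
    """Sketch of steps S104-S107 with cosine similarity.

    z:       (d,)   sentence representation of the text to be recognized
    centers: (C, d) decision centers learned for the C known classes
    radii:   (C,)   decision radii learned for the C known classes
    """
    cos = F.cosine_similarity(z.unsqueeze(0), centers, dim=-1)   # similarity to every decision center
    target = int(torch.argmax(cos))                              # target class = maximum similarity
    if 1.0 - cos[target] > radii[target]:                        # outside the decision boundary
        return UNKNOWN                                           # unknown-intention text
    return target                                                # known class

pred = classify(torch.randn(768), torch.randn(4, 768), torch.full((4,), 0.3))
```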
According to the method provided by the embodiment of the application, contrastive learning and classification learning are introduced at the stage of training the classifier, so that sentence representations of samples of the same category gather together and sentence representations of different categories move away from each other; the decision boundary is therefore trained with better effect, and the classifier can more accurately identify texts with unknown intentions from the texts to be recognized.
The above embodiments describe various aspects of the method for recognizing unknown intention texts provided by the present application. It is to be understood that each device or module, in order to implement the above-described functions, includes a corresponding hardware structure and/or software module for performing each function. Those of skill in the art will readily appreciate that the various hardware and method steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is performed as hardware or computer software drives hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Fig. 4 is a schematic structural diagram of an apparatus for recognizing an unknown intention text according to an embodiment of the present application. As shown in fig. 4, the apparatus includes hardware modules for implementing the method for recognizing an unknown intention text provided by the embodiment of the present application, and includes: a processor 210 and a memory 220, the memory 220 comprising program instructions 230, which when executed by the processor 210, cause the apparatus for recognizing an unknown intended text to perform the following method steps:
Acquiring K positive samples and S negative samples corresponding to each training sample, wherein the positive samples are randomly acquired from the same class samples of the training samples, the negative samples are randomly acquired from different class samples of the training samples, and both K and S are positive integers greater than or equal to 1;
obtaining sentence representations of the training samples and of their corresponding positive and negative samples by using a classifier, wherein the classifier gathers sentence representations of samples of the same category through a contrastive learning loss function and pushes sentence representations of different categories away from each other through a classification learning loss function;
determining a decision center of each category according to sentence representation, and learning a decision boundary of each category;
acquiring the similarity of the text to be recognized and the decision centers of all categories to determine a target category corresponding to the maximum similarity;
judging whether the text to be recognized is positioned outside the decision boundary of the target category;
if the text to be recognized is located outside the decision boundary of the target category, determining that the text to be recognized is an unknown intention text;
and if the text to be recognized is located within the decision boundary of the target category, determining that the text to be recognized belongs to the target category.
Fig. 5 is a schematic structural diagram of another unknown intention text recognition apparatus provided in an embodiment of the present application. As shown in fig. 5, the apparatus includes software modules for implementing the method for recognizing an unknown intention text provided by the embodiment of the present application, including:
A sample obtaining module 310, configured to obtain K positive samples and S negative samples corresponding to each training sample, where the positive samples are randomly obtained from samples of the same category of the training samples, the negative samples are randomly obtained from samples of different categories of the training samples, and K and S are both positive integers greater than or equal to 1;
the first training module 320 is configured to obtain sentence representations of the training samples and of their corresponding positive and negative samples by using a classifier, wherein the classifier gathers sentence representations of samples of the same category through a contrastive learning loss function and pushes sentence representations of different categories away from each other through a classification learning loss function;
a second training module 330, configured to determine a decision center of each category according to the sentence representation, and learn a decision boundary of each category;
the prediction module 340 is configured to obtain similarity between the text to be recognized and the decision centers of the categories, so as to determine a target category corresponding to the maximum similarity;
the prediction module 340 is further configured to determine whether the text to be recognized is located outside the decision boundary of the target category;
the prediction module 340 is further configured to determine that the text to be recognized is an unknown intention text if the text to be recognized is located outside the decision boundary of the target category;
The prediction module 340 is further configured to determine that the text to be recognized belongs to the target category if the text to be recognized is located within the decision boundary of the target category.
According to the device provided by the embodiment of the application, contrastive learning and classification learning are introduced at the stage of training the classifier, so that sentence representations of samples of the same category gather together and sentence representations of different categories move away from each other; the decision boundary is therefore trained with better effect, and the classifier can more accurately identify texts with unknown intentions from the texts to be recognized.
It is easily understood that, on the basis of the several embodiments provided in the present application, a person skilled in the art may combine, split, recombine, etc. the embodiments of the present application to obtain other embodiments, which do not depart from the scope of the present application.
The above embodiments, objects, technical solutions and advantages of the embodiments of the present application are described in further detail, it should be understood that the above embodiments are only specific embodiments of the present application, and are not intended to limit the scope of the embodiments of the present application, and any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the embodiments of the present application should be included in the scope of the embodiments of the present application.

Claims (10)

1. A method for recognizing unknown intention text, comprising:
obtaining K positive samples and S negative samples corresponding to each training sample, wherein the positive samples are randomly obtained from the same class samples of the training samples, the negative samples are randomly obtained from different class samples of the training samples, and both K and S are positive integers greater than or equal to 1;
obtaining sentence representations of the training samples and of the positive samples and negative samples corresponding to the training samples by using a classifier, wherein the classifier gathers sentence representations of samples of the same category through a contrastive learning loss function, and pushes sentence representations of different categories away from each other through a classification learning loss function;
determining a decision center of each category according to the sentence representation, and learning a decision boundary of each category;
acquiring the similarity of the text to be recognized and the decision centers of all categories to determine a target category corresponding to the maximum similarity;
judging whether the text to be recognized is positioned outside the decision boundary of the target category;
if the text to be recognized is located outside the decision boundary of the target category, determining that the text to be recognized is an unknown intention text;
And if the text to be recognized is located within the decision boundary of the target category, determining that the text to be recognized belongs to the target category.
2. The method of claim 1, wherein the contrastive learning loss function is constructed from a distance between the training sample and any one of its positive samples, and a sum of distances between the training sample and all of its negative samples.
3. The method of claim 2, wherein the contrastive learning loss function is specified by the following Loss1:

$$Loss_1 = -\frac{1}{N}\sum_{v_j\in V^{+}}\log\frac{\exp(v_i\cdot v_j/\tau)}{\exp(v_i\cdot v_j/\tau)+\sum_{v^{-}\in V^{-}}\big[\exp(v_i\cdot v^{-}/\tau)+\exp(v_j\cdot v^{-}/\tau)\big]}$$

wherein N is the number of positive samples, v_i denotes the normalized sentence representation of the training sample, v_j denotes the normalized sentence representation of a positive sample, v^- denotes the normalized sentence representation of a negative sample, V^+ denotes the set of all positive samples, V^- denotes the set of all negative samples, τ is a hyperparameter, exp(v_i·v_j/τ) denotes the distance between the training sample and any one of its positive samples, and Σ_{v^-∈V^-}[exp(v_i·v^-/τ)+exp(v_j·v^-/τ)] denotes the sum of the distances between the training sample and all of its negative samples.
4. The method of claim 3, wherein the classification learning loss function is constructed from a cosine distance between the sentence representations of the training samples and the representations of the true tags corresponding to their classes, and a sum of cosine distances between the sentence representations of the training samples and the representations of all other class tags.
5. The method of claim 4, wherein the classification learning loss function is specified by the following Loss2:

$$Loss_2 = -\log\frac{e^{\,s\,(\cos(\theta_{y_i},\,z_i)-m)}}{e^{\,s\,(\cos(\theta_{y_i},\,z_i)-m)}+\sum_{j\neq y_i}e^{\,s\,\cos(\theta_j,\,z_i)}}$$

wherein z_i denotes the sentence representation of the training sample, θ_{y_i} denotes the representation of the true label of the training sample, θ_j denotes the representation of a label of another class, cos(θ_{y_i}, z_i) denotes the cosine distance between the sentence representation of the training sample and the representation of the true label corresponding to its class, cos(θ_j, z_i) denotes the cosine distance between the sentence representation of the training sample and the representation of another class label, m is a preset parameter, and s is a preset multiple.
6. The method of claim 1, wherein learning the decision boundary for each category comprises:
constructing a decision boundary optimization function according to a numerical relationship between the cosine distance between the sentence representation of the training sample and the decision center corresponding to the category of the sentence representation of the training sample and the decision radius, wherein the numerical relationship comprises that the cosine distance between the sentence representation of the training sample and the decision center corresponding to the category of the sentence representation of the training sample is greater than the decision boundary of the category, or the cosine distance between the sentence representation of the training sample and the decision center corresponding to the category of the sentence representation of the training sample is less than or equal to the decision boundary of the category;
And learning the decision boundary of each category cluster according to the decision boundary optimization function.
7. The method according to claim 6, wherein the decision boundary optimization function is specified by the following L_b:

$$L_b = \frac{1}{N}\sum_{i=1}^{N}\Big\{\delta_i\big[\big(1-\cos(c_{y_i},z_i)\big)-\Delta_{y_i}\big]+(1-\delta_i)\big[\Delta_{y_i}-\big(1-\cos(c_{y_i},z_i)\big)\big]\Big\}$$

$$\delta_i=\begin{cases}1,&1-\cos(c_{y_i},z_i)>\Delta_{y_i}\\0,&1-\cos(c_{y_i},z_i)\le\Delta_{y_i}\end{cases}$$

wherein N is the number of positive samples, Δ_{y_i} denotes the decision radius of the category, c_{y_i} denotes the decision center of the category, z_i denotes the sentence representation of the training sample, cos(c_{y_i}, z_i) denotes the cosine distance between training sample z_i and decision center c_{y_i}, and δ_i indicates whether the training sample is inside the decision boundary.
8. The method of claim 5, wherein the classifier employs the following overall LOSS function LOSS:
LOSS=Loss1×a+(1-a)×Loss2
wherein a is an adjustable hyper-parameter.
9. The method of claim 4, wherein the representation of the tag is obtained by:
obtaining sentence representations of all training samples of the labels using the classifier;
and taking the central point of sentence representations of all training samples of the label as the sentence representation of the label.
10. An apparatus for recognizing unknown intention text, comprising: a processor and a memory, said memory including program instructions therein which, when executed by said processor, cause said apparatus for recognizing an unknown intended text to perform the method steps of:
Acquiring K positive samples and S negative samples corresponding to each training sample, wherein the positive samples are randomly acquired from samples of the same category of the training samples, the negative samples are randomly acquired from samples of different categories of the training samples, and both K and S are positive integers greater than or equal to 1;
obtaining sentence representations of the training samples and of the corresponding positive samples and negative samples by using a classifier, wherein the classifier gathers sentence representations of samples of the same category through a contrastive learning loss function, and pushes sentence representations of different categories away from each other through a classification learning loss function;
determining a decision center of each category according to the sentence representation, and learning a decision boundary of each category;
acquiring the similarity of the text to be recognized and the decision centers of all categories to determine a target category corresponding to the maximum similarity;
judging whether the text to be recognized is positioned outside the decision boundary of the target category;
if the text to be recognized is located outside the decision boundary of the target category, determining that the text to be recognized is an unknown intention text;
and if the text to be recognized is located within the decision boundary of the target category, determining that the text to be recognized belongs to the target category.
CN202210307174.7A 2022-03-25 2022-03-25 Unknown intention text identification method and device Pending CN114756678A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210307174.7A CN114756678A (en) 2022-03-25 2022-03-25 Unknown intention text identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210307174.7A CN114756678A (en) 2022-03-25 2022-03-25 Unknown intention text identification method and device

Publications (1)

Publication Number Publication Date
CN114756678A true CN114756678A (en) 2022-07-15

Family

ID=82326401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210307174.7A Pending CN114756678A (en) 2022-03-25 2022-03-25 Unknown intention text identification method and device

Country Status (1)

Country Link
CN (1) CN114756678A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116702048A (en) * 2023-08-09 2023-09-05 恒生电子股份有限公司 Newly added intention recognition method, training method and device of distributed external monitoring model and electronic equipment
CN116702048B (en) * 2023-08-09 2023-11-10 恒生电子股份有限公司 Newly added intention recognition method, model training method, device and electronic equipment
CN116796290A (en) * 2023-08-23 2023-09-22 江西尚通科技发展有限公司 Dialog intention recognition method, system, computer and storage medium
CN116796290B (en) * 2023-08-23 2024-03-29 江西尚通科技发展有限公司 Dialog intention recognition method, system, computer and storage medium

Similar Documents

Publication Publication Date Title
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
WO2023024412A1 (en) Visual question answering method and apparatus based on deep learning model, and medium and device
CN114756678A (en) Unknown intention text identification method and device
CN113673254B (en) Knowledge distillation position detection method based on similarity maintenance
CN111597328B (en) New event theme extraction method
CN110555084A (en) remote supervision relation classification method based on PCNN and multi-layer attention
CN113742733B (en) Method and device for extracting trigger words of reading and understanding vulnerability event and identifying vulnerability type
CN113705238B (en) Method and system for analyzing aspect level emotion based on BERT and aspect feature positioning model
CN114676255A (en) Text processing method, device, equipment, storage medium and computer program product
CN108536781B (en) Social network emotion focus mining method and system
CN111984780A (en) Multi-intention recognition model training method, multi-intention recognition method and related device
CN113849653B (en) Text classification method and device
CN111191033A (en) Open set classification method based on classification utility
CN116522165B (en) Public opinion text matching system and method based on twin structure
CN115905187B (en) Intelligent proposition system oriented to cloud computing engineering technician authentication
CN116050419B (en) Unsupervised identification method and system oriented to scientific literature knowledge entity
CN116167353A (en) Text semantic similarity measurement method based on twin long-term memory network
CN116662924A (en) Aspect-level multi-mode emotion analysis method based on dual-channel and attention mechanism
CN115934948A (en) Knowledge enhancement-based drug entity relationship combined extraction method and system
CN116227486A (en) Emotion analysis method based on retrieval and contrast learning
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism
CN115238693A (en) Chinese named entity recognition method based on multi-word segmentation and multi-layer bidirectional long-short term memory
CN114722798A (en) Ironic recognition model based on convolutional neural network and attention system
CN114462418A (en) Event detection method, system, intelligent terminal and computer readable storage medium
CN114398482A (en) Dictionary construction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination