CN112966088B - Unknown intention recognition method, device, equipment and storage medium - Google Patents

Unknown intention recognition method, device, equipment and storage medium

Info

Publication number
CN112966088B
CN112966088B (application CN202110297693.5A)
Authority
CN
China
Prior art keywords
intention
sample
unknown
intent
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110297693.5A
Other languages
Chinese (zh)
Other versions
CN112966088A (en)
Inventor
包梦蛟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Qiandai Beijing Information Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority claimed from CN202110297693.5A
Publication of CN112966088A
Application granted
Publication of CN112966088B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G06F 16/3344 Query execution using natural language analysis
    • G06F 16/35 Clustering; Classification
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/045 Combinations of networks
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Abstract

The application discloses an unknown-intention identification method, device, equipment and storage medium, belonging to the technical field of artificial intelligence. The method comprises the following steps: calling an intention recognition model, and obtaining an intention-unknown sample when intention identification of a sample to be identified fails; performing anomaly detection on the semantic representation vector of the intention-unknown sample to obtain the local outlier factor density of the sample; and in response to the density being larger than a density threshold, determining that the intention-unknown sample has an unknown intention, wherein the intention-unknown sample is used for retraining the intention recognition model after a new intention is manually labeled. The method re-confirms intention-unknown samples through local-outlier-factor anomaly detection and screens out the samples that genuinely carry unknown intentions, so that the range of intention types recognized by the intention recognition model can be extended through further training. In particular, in intelligent customer service scenarios, the method meets the need to continuously mine diverse new user intentions and to keep increasing the types of user intentions the intelligent customer service robot can identify.

Description

Unknown intention recognition method, device, equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for identifying an unknown intention.
Background
With the development of deep learning, classic learning tasks such as image classification and intention recognition have made breakthrough progress in performance and precision.
The classification task, one of the most basic supervised learning tasks in deep learning, often depends on a large amount of training data. The training data set is first labeled manually: each training sample is annotated with one of a given set of intentions. However, when an intention recognition model is first launched, it is difficult to provide a comprehensive set of intention-labeled samples, so some intentions are inevitably missed. Therefore, unknown-intention discovery methods have been devised that model the problem as anomaly detection, identifying abnormal points in the data and marking them as possible unknown intentions.
However, such methods have a high probability of treating noise samples as unknown intentions, resulting in low accuracy of unknown-intention determination.
Disclosure of Invention
The embodiments of the application provide an unknown-intention identification method, device, equipment and storage medium. After an intention-unknown sample is detected by an intention recognition model, whether an unknown intention truly exists in the sample is determined through local-outlier-factor anomaly detection. This avoids noise interference, improves the identification accuracy of unknown intentions, and accurately screens out the samples in which an unknown intention actually exists, so that the intention recognition model can be further trained and the range of intention types it recognizes can be expanded. In particular, in intelligent customer service scenarios, this meets the need to continuously mine diverse new user intentions and to keep increasing the types of user intentions the intelligent customer service robot can identify. The technical scheme is as follows:
according to an aspect of the present application, there is provided a method of identifying an unknown intention, the method including:
calling an intention identification model to obtain an unknown intention sample when the intention identification of a sample to be identified fails;
performing local-outlier-factor anomaly detection on the semantic representation vector of the intention-unknown sample to obtain the local outlier factor density of the intention-unknown sample, wherein the semantic representation vector is an intermediate vector generated in the intention identification process for the intention-unknown sample;
and in response to the local outlier factor density being larger than the density threshold, determining that the unknown intention sample has the unknown intention, wherein the unknown intention sample is used for retraining the intention recognition model after manually marking the new intention.
According to another aspect of the present application, there is provided an unknown intention identifying apparatus, including:
the identification module is used for calling the intention identification model to obtain an unknown intention sample when the intention identification of the sample to be identified fails;
the detection module is used for performing local-outlier-factor anomaly detection on the semantic representation vector of the intention-unknown sample to obtain the local outlier factor density of the intention-unknown sample, the semantic representation vector being an intermediate vector generated in the intention identification process for the intention-unknown sample;
and the determining module is used for determining, in response to the local outlier factor density being larger than the density threshold, that an unknown intention exists in the intention-unknown sample, the intention-unknown sample being used for retraining the intention recognition model after a new intention is manually labeled.
According to another aspect of the present application, there is provided a computer apparatus, including: a processor and a memory, the memory storing a computer program that is loaded and executed by the processor to implement the method of unknown intent recognition as described above.
According to another aspect of the present application, there is provided a computer-readable storage medium having stored therein a computer program that is loaded and executed by a processor to implement the method of identifying an unknown intention as described above.
According to another aspect of the present application, a computer program product is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and executes the computer instructions to cause the computer device to perform the method for recognizing unknown intention as described above.
The beneficial effects brought by the technical scheme provided by the embodiment of the application at least comprise:
the method comprises the steps of identifying a sample to be identified by adopting an intention identification model, screening out an intention unknown sample from the sample to be identified, then screening out a part of samples which are misjudged as the intention unknown sample by the intention identification model due to noise interference through abnormal detection of local outliers, determining the sample with the real unknown intention, improving the identification accuracy of the unknown intention, manually marking a new intention on the intention unknown sample, forming a sample set carrying more types of intentions by combining the samples with the known intention, and further performing intention identification training on the intention identification model, so that the intention identification model can identify more types of intentions.
Particularly, for an intelligent customer service scene, such as an application scene of an intelligent customer service robot, with the development of services and the accumulation of data, new user intentions related to the services are endless, and therefore, the types of the user intentions capable of being identified by the intention identification model in the intelligent customer service need to be continuously increased.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; those skilled in the art can obtain other drawings based on them without creative effort.
FIG. 1 illustrates a flow chart of a method of training an intent recognition model provided by an exemplary embodiment of the present application;
FIG. 2 illustrates a schematic structural diagram of an intent recognition model provided by an exemplary embodiment of the present application;
FIG. 3 illustrates a flow chart of a method of unknown intent identification provided by an exemplary embodiment of the present application;
FIG. 4 illustrates a block diagram of an unknown intent recognition apparatus provided by an exemplary embodiment of the present application;
fig. 5 shows a schematic structural diagram of a computer device provided in an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Reference will first be made to several terms referred to in this application:
artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Natural Language Processing (NLP) is a sub-field of artificial intelligence; the method is mainly applied to the aspects of machine translation, public opinion monitoring, automatic summarization, viewpoint extraction, text classification, intention recognition, question answering, text semantic comparison, voice recognition and the like. The intention identification means that the purpose that the input content is expected to achieve is determined according to the input content such as texts, voices, even pictures and the like; for the purpose recognition, there are wide applications in the dialogue system and the question-answering system, and it is exemplary that in the dialogue system, it is first necessary to recognize the dialogue intention of the user input sentence, and then generate a reply sentence based on the dialogue intention to feed back the input sentence, for example, the input sentence is "how is the weather of today's magic ceremony? "the input sentence expresses query weather, and the query weather is the dialog intention of the input sentence.
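As a toy illustration of the dialogue-intention step described above, the following sketch uses hand-written keyword rules in place of a trained model; the intent names and keywords are invented for this example.

```python
# Toy illustration of dialogue-intention recognition: hand-written keyword
# rules stand in for a trained model, and the intent names and keywords
# are invented for this example.
INTENT_KEYWORDS = {
    "query_weather": ["weather", "rain"],
    "order_food": ["order", "deliver"],
}

def recognize_intent(sentence):
    """Return the first intent whose keywords appear in the sentence,
    or 'unknown' when nothing matches."""
    words = sentence.lower().split()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(k in words for k in keywords):
            return intent
    return "unknown"

print(recognize_intent("What is the weather like in Shanghai today?"))  # query_weather
print(recognize_intent("I never said anything about that"))             # unknown
```

A real system replaces the keyword table with a trained classifier, but the input/output contract — a sentence in, an intention label (or "unknown") out — stays the same.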
Fig. 1 shows a flowchart of a training method for an intention recognition model provided in an exemplary embodiment of the present application, the method is applied to a computer device, which may be a terminal or a server, for example, and the method includes:
step 101, a training sample is obtained, and the training sample is labeled with a known intention.
For example, after completing the intent tagging of the training samples, the training samples may be stored in a memory of the computer device or in a database. When the computer device trains the neural network model, the training samples are read from the memory or the database.
And 102, calling a neural network model to perform feature learning on the training sample to obtain a sample semantic representation vector.
After obtaining the training sample, the computer equipment calls the neural network model to perform feature learning on the training sample to obtain a sample semantic representation vector. Optionally, the neural network model includes a text annotation layer, a word embedding layer, and a Bidirectional Encoder Representations from Transformers (BERT) model. The computer equipment inputs the training sample into the text annotation layer for annotation processing to obtain an annotated sample text; inputs the annotated sample text into the word embedding layer for word embedding to obtain a sample text representation vector; and inputs the sample text representation vector into the BERT model for feature learning to obtain the sample semantic representation vector.
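The annotation and embedding steps can be sketched as follows. This is a hypothetical illustration: the vocabulary and the token ids below are invented and are not the real BERT vocabulary.

```python
# Toy sketch of the text-annotation and word-embedding layers; the
# vocabulary and the token ids below are invented for illustration
# and are not the real BERT vocabulary.
TOY_VOCAB = {"[CLS]": 101, "[SEP]": 102, "[UNK]": 100,
             "refund": 7, "my": 8, "order": 9, "please": 10}

def annotate(tokens):
    """Text annotation layer: wrap the token sequence with the special
    [CLS]/[SEP] markers expected by BERT-style encoders."""
    return ["[CLS]"] + tokens + ["[SEP]"]

def embed_ids(annotated_tokens):
    """First half of the word embedding layer: map every token to its
    integer id; a real model would then look up dense vectors."""
    return [TOY_VOCAB.get(t, TOY_VOCAB["[UNK]"]) for t in annotated_tokens]

annotated = annotate(["refund", "my", "order", "please"])
print(annotated)             # ['[CLS]', 'refund', 'my', 'order', 'please', '[SEP]']
print(embed_ids(annotated))  # [101, 7, 8, 9, 10, 102]
```

The integer-id sequence is what the encoder consumes; the semantic representation vector is the encoder's output for the [CLS] position.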
For example, the BERT model is used here to learn a semantic representation of the training sample and can also be used to learn the intention of the training sample; a Long Short-Term Memory (LSTM) network model or a Convolutional Neural Network (CNN) model may likewise be used to learn the semantic representation. The learning manner of the semantic representation is not limited in this embodiment.
And 103, calling a neural network model to calculate the loss between the semantic representation vector of the sample and the known intention, and obtaining the intention identification loss.
Illustratively, in the process of embedding words into the training sample, word embedding is also performed on the known intentions to generate a sample intention characterization vector; and the computer equipment calls the neural network model to calculate the loss between the sample intention characterization vector and the sample semantic characterization vector so as to obtain the intention identification loss.
Optionally, the neural network model comprises a fully connected layer and a similarity prediction function; the computer equipment inputs the sample semantic representation vector into a full connection layer to obtain a sample feature vector after feature mapping; and calculating the similarity between the sample feature vector and the known intention through a similarity prediction function, and determining the intention identification loss based on the similarity.
Optionally, the similarity prediction function may include an Additive Angular Margin Loss function, also called the ArcFace loss function; the computer device may calculate the similarity between the sample feature vector and the known intention through the ArcFace loss function, and determine the intention recognition loss based on the similarity.
Alternatively, the similarity prediction function may instead include an A-Softmax (Angular Softmax) function or an AM-Softmax (Additive Margin Softmax) function; the type of loss function used for similarity prediction is not limited in this embodiment.
And 104, carrying out back propagation training on the neural network model based on the intention recognition loss, and finally obtaining the trained intention recognition model.
After obtaining the intention recognition loss, the computer device conducts back propagation training on the neural network model based on the intention recognition loss, adjusts model parameters in the neural network model, and finally obtains the trained intention recognition model through sequential training of a plurality of training samples.
Illustratively, as shown in fig. 2, the structure of an alternative neural network model is shown, the neural network model comprises a text annotation layer 201, a word embedding layer 202, a BERT model 203, a full connection layer 204 and an output layer 205, and an ArcFace loss function is set in the output layer 205;
the electronic device will train the sample "T1,T2,……,To,……,Tp"input text label layer 201, labeling to obtain labeled sample text' [ CLS ]]T1,T2,……,To[SEP]To+1,……,Tp[SEP]", o, p are positive integers, o is less than p; performing word embedding on the labeled sample text input word embedding layer 202 to obtain a sample text characterization vector "[ 101,239, … …,318,101],[292,301,……,029,122]"; inputting the sample text representation vector into a BERT model 203 for forward propagation to obtain a sample semantic representation vector HCLS(ii) a Inputting the sample semantic representation vector into the full-connection layer 204, inputting the output sample feature vector into the output layer 205, and calculating the intention recognition loss L through the ArcFace loss function in the output layer 205ArcfaceThe calculation formula is as follows:
Figure GDA0003578630800000061
wherein m is the number of samples of the training samples, and m is a positive integer; e is a natural base number; wjJ is the jth weight of the fully-connected layer, and j is a positive integer; x is the number ofiIs the sample feature vector of the ith training sample, yiIs the known intention of the ith training sample label, i is a positive integer; n is the number of classes classified according to intention for the training samples, and n is a positive integer; s is a scaling parameter for the eigenvalues, where,
Figure GDA0003578630800000062
Wjand xiThe following constraints are satisfied by L2 normalization:
Figure GDA0003578630800000063
and then training a neural network model by utilizing a back propagation algorithm based on the intention recognition loss to finally obtain an intention recognition model. Illustratively, an AdamW optimizer may be employed to optimize model parameters when back-propagating a trained neural network model.
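The forward computation of the ArcFace loss can be sketched numerically as follows. This is a minimal two-class illustration, not the patent's implementation: the scale s=8.0 and margin 0.5 are illustrative hyperparameter values, and a real model would run this inside a training framework with automatic differentiation.

```python
import math

def l2_normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def arcface_loss(W, xs, ys, s=8.0, margin=0.5):
    """ArcFace (additive angular margin) loss sketch: cos(theta_j) is the
    dot product of the L2-normalized class weight W_j and feature x_i; the
    angular margin is added to the target-class angle before the scaled
    softmax cross-entropy. Hyperparameter values are illustrative."""
    W = [l2_normalize(w) for w in W]
    total = 0.0
    for x, y in zip(xs, ys):
        x = l2_normalize(x)
        cos = [sum(wc * xc for wc, xc in zip(w, x)) for w in W]
        theta_y = math.acos(max(-1.0, min(1.0, cos[y])))
        logits = [s * c for c in cos]
        logits[y] = s * math.cos(theta_y + margin)  # penalize the target angle
        z = sum(math.exp(l) for l in logits)
        total -= math.log(math.exp(logits[y]) / z)
    return total / len(xs)

W = [[1.0, 0.0], [0.0, 1.0]]   # one weight vector per known intent class
xs = [[0.9, 0.1], [0.2, 0.8]]  # sample feature vectors from the FC layer
print(arcface_loss(W, xs, [0, 1]))
```

Setting margin=0.0 recovers a plain scaled softmax cross-entropy; the added margin makes the loss stricter on the target class, which is what pushes same-intent features together and different-intent features apart.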
In summary, the training method for the intention recognition model provided in this embodiment constructs the model from a BERT encoder and the ArcFace loss function, so as to learn sample semantic representation vectors with relatively large inter-class distances and relatively small intra-class distances in the semantic space. The model can thus accurately distinguish known-intention samples from intention-unknown samples, serving as a powerful tool for recognizing unknown intentions in the training stage and for recognizing known intentions in the usage stage.
The intention recognition model described in the above embodiments may be used in the present application to recognize unknown intention in a sample to be recognized, and for example, as shown in fig. 3, a flowchart of an unknown intention recognition method provided in an exemplary embodiment of the present application is shown, and the method is applied to a computer device, and includes:
step 301, when the intention recognition of the sample to be recognized fails by calling the intention recognition model, obtaining an unknown intention sample.
After the intention recognition model is trained, known intentions can be recognized, and when the recognition of the known intentions fails, an intention unknown sample is determined. Optionally, the computer device calls an intention recognition model to perform feature learning on the sample to be recognized to obtain a semantic representation vector; calling an intention recognition model to match the semantic representation vector with the known intention so as to obtain the similarity between the intention of the sample to be recognized and the known intention; and setting a similarity threshold value in the intention identification model, and determining the sample to be identified as an intention unknown sample in response to the similarity being less than or equal to the similarity threshold value.
It should be noted that, in response to the similarity being greater than the similarity threshold, the computer device determines that the sample to be identified is a pseudo intention-unknown sample and screens it out. For example, a pseudo intention-unknown sample may be a sample that the intention recognition model failed to identify as a known intention due to noise interference, or a sample that has no intention at all but was misjudged as an intention-unknown sample due to noise interference.
Optionally, the intent recognition model comprises a text annotation layer, a word embedding layer, and a BERT model; for the generation of the semantic representation vector, the computer equipment inputs a sample to be identified into a text labeling layer for labeling processing to obtain a labeled text; performing word embedding on the marked text input word embedding layer to obtain a text representation vector; and inputting the text representation vector into a BERT model for feature learning to obtain a semantic representation vector.
Optionally, the intention recognition model further comprises a fully connected layer and a similarity prediction function; for similarity calculation, inputting the semantic representation vector into a full-connection layer by computer equipment to obtain a feature vector after feature mapping; the similarity between the feature vector and the known intention is calculated by a similarity prediction function.
Optionally, the similarity prediction function may include at least one of an ArcFace loss function, an a-softmax loss function, and an AM-softmax loss function.
For example, the intention recognition model may use the model structure shown in fig. 2. Unlike the model training process, the logits calculated by the ArcFace loss function are not passed through the cross-entropy calculation. Each dimension of the logits is the similarity between the intention of the sample to be identified and one known intention, and the maximum value is finally output as the similarity between the intention of the sample to be identified and the known intentions.
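The screening decision in step 301 can be sketched as follows; the threshold value 0.6 is illustrative, not a value specified by the patent.

```python
def screen_sample(similarities, threshold=0.6):
    """Screening sketch: `similarities` holds one score per known intent
    (the logits described above); if even the best match is at or below
    the threshold, the sample is kept as an intention-unknown sample.
    The threshold value 0.6 is illustrative."""
    best = max(similarities)
    return best <= threshold, best

print(screen_sample([0.12, 0.55, 0.31]))  # (True, 0.55) -> intention unknown
print(screen_sample([0.92, 0.05, 0.10]))  # (False, 0.92) -> known intent
```

Samples flagged True here proceed to the local-outlier-factor check of step 302; samples flagged False are handled as recognized known intentions.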
And 302, performing anomaly detection on the local outlier factor of the semantic representation vector of the unknown intention sample to obtain the local outlier factor density of the unknown intention sample.
The semantic representation vector is an intermediate vector generated in the intention identification process for the intention-unknown sample. After determining the unknown sample of intent, the computer device calculates a Local Outlier Factor (LOF) density of the unknown sample of intent for the semantic characterization vector of the unknown sample of intent using a LOF algorithm.
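The LOF computation can be sketched in brute-force form as below. The mapping of the patent's "local outlier factor density" onto the standard local reachability density is an assumption on our part, and the points and the value k=2 are toy inputs; a real system would run this on the high-dimensional semantic representation vectors.

```python
import math

def lof_scores(points, k=2):
    """Brute-force Local Outlier Factor sketch: compute each point's
    k-distance, local reachability density (lrd), and LOF score (the
    average neighbour density divided by the point's own density).
    A score near 1 marks an inlier; a much larger score marks noise."""
    n = len(points)
    neigh, kdist = [], []
    for i in range(n):
        order = sorted((math.dist(points[i], points[j]), j)
                       for j in range(n) if j != i)
        neigh.append([j for _, j in order[:k]])
        kdist.append(order[k - 1][0])

    def lrd(i):
        reach = [max(kdist[j], math.dist(points[i], points[j]))
                 for j in neigh[i]]
        return len(reach) / sum(reach)

    lrds = [lrd(i) for i in range(n)]
    return [sum(lrds[j] for j in neigh[i]) / (k * lrds[i]) for i in range(n)]

pts = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8)]  # last point is isolated noise
print([round(s, 2) for s in lof_scores(pts)])
```

The four clustered points score close to 1 while the isolated point scores far above 1, which is the separation the density-threshold check of step 303 relies on.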
And 303, in response to the fact that the density of the local outlier factor is larger than or equal to the density threshold, determining that unknown intention exists in the unknown intention sample, wherein the unknown intention sample is used for retraining the intention recognition model after a new intention is marked manually.
A density threshold is set in the computer device. When the local outlier factor density of the intention-unknown sample is less than the density threshold, the suspected unknown intention in the sample is actually noise; therefore, the computer device determines that an unknown intention exists in the intention-unknown sample only when its local outlier factor density is greater than or equal to the density threshold.
In other embodiments, after the computer device performs intent recognition on a plurality of samples to be recognized, at least two unknown samples with unknown intentions are determined, and the at least two unknown samples with unknown intentions correspond to at least two semantic representation vectors; and performing clustering calculation on the at least two unknown intention samples based on the at least two semantic representation vectors to obtain a sample set of at least one type of unknown intention samples.
Illustratively, the computer device takes at least two semantic representation vectors as input of a K-means clustering algorithm to calculate a sample set of at least one type of unknown intention samples, and after the clustered sample set of the at least one type of unknown intention samples is manually marked with new intention, the clustered sample set is used as a training sample to train an intention recognition model so as to expand the type of the intention recognized by the intention recognition model.
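The clustering step can be sketched with a plain k-means loop; the 2-D vectors below are toy stand-ins for the semantic representation vectors, and the fixed iteration count and seed are illustrative simplifications.

```python
import math
import random

def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means sketch for grouping the semantic representation
    vectors of intention-unknown samples into candidate new-intention
    sample sets."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:  # assign each vector to its nearest center
            idx = min(range(k), key=lambda c: math.dist(v, centers[c]))
            clusters[idx].append(v)
        # recompute each center as the mean of its cluster (keep old center
        # if a cluster went empty)
        centers = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl
                   else centers[c] for c, cl in enumerate(clusters)]
    return centers, clusters

vecs = [(0.1, 0.0), (0.0, 0.2), (5.0, 5.1), (5.2, 4.9)]
centers, clusters = kmeans(vecs, k=2)
print(sorted(len(c) for c in clusters))  # [2, 2] -> two candidate intent sets
```

Each resulting cluster is one candidate new-intention sample set to be manually labeled and then added to the training data.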
In summary, in the unknown-intention identification method provided by this embodiment, an intention recognition model identifies the samples to be recognized and screens out the intention-unknown samples. Local-outlier-factor anomaly detection then filters out the samples that the model misjudged as intention-unknown due to noise interference, leaving the samples in which an unknown intention really exists and improving the identification accuracy of unknown intentions. After the intention-unknown samples are manually labeled with new intentions, they can be combined with the known-intention samples to form a sample set carrying more intention types, on which the intention recognition model is further trained so that it can recognize more types of intentions.
Particularly, for an intelligent customer service scene, such as an application scene of an intelligent customer service robot, with the development of services and the accumulation of data, new user intentions related to the services are endless, and therefore, the types of the user intentions capable of being identified by the intention identification model in the intelligent customer service need to be continuously increased.
With reference to the embodiment shown in fig. 1, the embodiment provided by the application realizes new-intention discovery in two stages. The first stage is semantic representation learning: training samples with known intentions are used as reference samples, and a pre-trained BERT model combined with the ArcFace loss performs semantic representation, so that sample semantic representation vectors with larger inter-class distances and smaller intra-class distances in the semantic space can be learned. In the second stage, the first-stage intention recognition model extracts a semantic representation vector from each sample to be recognized; the similarity between the sample to be recognized and the known-intention samples is compared based on that vector, and the samples with low similarity are screened out as samples dissimilar to the known-intention samples, that is, intention-unknown samples. Noise samples among them are then removed through the Local Outlier Factor (LOF) anomaly detection algorithm, the intention-unknown samples with genuinely unknown intentions are screened out, and sets of intention-unknown samples are determined through a clustering algorithm to serve as training sample sets for new intentions, which are combined with the known-intention sample set to train the intention recognition model to recognize more types of intentions.
Fig. 4 illustrates a block diagram of an unknown-intention identification apparatus provided by an exemplary embodiment of the present application. The apparatus may be implemented, by software, hardware, or a combination of the two, as part or all of a computer device, which may be a server or a terminal. The device includes:
the identification module 401 is configured to call an intention identification model to obtain an unknown intention sample when the intention identification of the sample to be identified fails;
a detection module 402, configured to perform local outlier factor anomaly detection on a semantic feature vector of the intention-unknown sample to obtain the local outlier factor density of the intention-unknown sample, where the semantic feature vector is an intermediate vector generated in the intention recognition process for the intention-unknown sample;
and a determining module 403, configured to determine that there is an unknown intention in the intention unknown sample in response to the local outlier density being greater than the density threshold, where the intention unknown sample is used for retraining the intention recognition model after a new intention is manually labeled.
In some embodiments, the identifying module 401 is configured to:
calling an intention recognition model to perform feature learning on a sample to be recognized to obtain a semantic representation vector;
calling an intention recognition model to match the semantic representation vector with the known intention so as to obtain the similarity between the intention of the sample to be recognized and the known intention;
and in response to the similarity being smaller than the similarity threshold, determining the sample to be identified as the unknown-intention sample.
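The screening step above can be sketched as follows: a hedged NumPy illustration assuming cosine similarity against one prototype vector per known intent. The patent does not fix the similarity function, so the prototype representation and the value of `sim_threshold` are assumptions.

```python
import numpy as np

def screen_sample(emb, intent_protos, sim_threshold=0.7):
    """Return (intent_index, similarity); intent_index is None when the
    sample's best match falls below the threshold (intention-unknown)."""
    e = emb / np.linalg.norm(emb)
    p = intent_protos / np.linalg.norm(intent_protos, axis=1, keepdims=True)
    sims = p @ e                                  # cosine similarity to each known intent
    best = int(np.argmax(sims))
    if sims[best] < sim_threshold:
        return None, float(sims[best])            # dissimilar to every known intent
    return best, float(sims[best])

protos = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # two known-intent prototypes
known, s1 = screen_sample(np.array([0.9, 0.1, 0.0]), protos)
unknown, s2 = screen_sample(np.array([0.0, 0.0, 1.0]), protos)
```

A sample near a prototype is assigned that known intent; a sample far from every prototype becomes an intention-unknown sample for the LOF stage.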
In some embodiments, the intent recognition model includes a fully connected layer and a similarity prediction function; an identification module 401 configured to:
inputting the semantic representation vector into a full-connection layer to obtain a feature vector after feature mapping;
and calculating the similarity between the feature vector and the known intention through the similarity prediction function.
In some embodiments, the intent recognition model includes a text annotation layer, a word embedding layer, and a bi-directional encoder representation BERT model; an identification module 401 configured to:
inputting a sample to be identified into a text labeling layer for labeling processing to obtain a labeled text;
inputting the marked text into the word embedding layer for word embedding to obtain a text representation vector;
and inputting the text representation vector into a BERT model for feature learning to obtain a semantic representation vector.
In some embodiments, the determining module is further configured to:
determining at least two unknown samples with unknown intentions, wherein the at least two unknown samples with unknown intentions correspond to at least two semantic representation vectors;
and performing clustering calculation on the at least two unknown intention samples based on the at least two semantic representation vectors to obtain a sample set of at least one type of unknown intention samples.
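The clustering step can be illustrated with a minimal k-means over the semantic representation vectors. The patent does not specify which clustering algorithm is used, so k-means and the fixed cluster count are assumptions for illustration.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means: group intention-unknown vectors into k candidate new intents."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each vector to its nearest center
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=-1), axis=1)
        # recompute centers; keep the old center if a cluster emptied
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

# two well-separated groups of intention-unknown semantic vectors
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels = kmeans(X, k=2)
```

Each resulting cluster is a candidate sample set for one new intention, ready for manual labeling.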
In some embodiments, the apparatus further comprises: a training module 404;
a training module 404, configured to obtain a training sample, where the training sample is marked with a known intention; calling a neural network model to perform feature learning on the training samples to obtain sample semantic representation vectors; calling a neural network model to calculate the loss between the semantic representation vector of the sample and the known intention so as to obtain the intention identification loss; and carrying out back propagation training on the neural network model based on the intention recognition loss, and finally obtaining the trained intention recognition model.
In some embodiments, the neural network model includes a fully connected layer and a similarity prediction function; a training module 404 to:
inputting the sample semantic representation vector into a full connection layer to obtain a sample feature vector after feature mapping;
and calculating the similarity between the sample feature vector and the known intention through the similarity prediction function, and determining the intention identification loss based on the similarity.
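The description earlier names the ArcFace loss for the semantic representation learning stage, and the similarity-based loss here can be sketched in that style. Below is a minimal NumPy illustration of an additive angular margin loss; the scale `s` and margin `m` values are illustrative assumptions, not values from the patent.

```python
import numpy as np

def arcface_loss(emb, W, y, s=30.0, m=0.5):
    """ArcFace-style additive angular margin cross-entropy.

    emb: (n, d) sample embeddings; W: (d, c) class weight vectors;
    y: (n,) integer intent labels. The margin m enlarges inter-class
    spacing and shrinks intra-class distance on the hypersphere.
    """
    e = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    w = W / np.linalg.norm(W, axis=0, keepdims=True)
    cos = e @ w                                       # cosine to every intent class
    theta = np.arccos(np.clip(cos, -1 + 1e-7, 1 - 1e-7))
    rows = np.arange(len(y))
    adjusted = cos.copy()
    adjusted[rows, y] = np.cos(theta[rows, y] + m)    # add margin on the true class only
    logits = s * adjusted
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_p[rows, y].mean())

W = np.array([[1.0, 0.0], [0.0, 1.0]])       # two known intents in a 2-D space
loss = arcface_loss(np.array([[0.8, 0.6]]), W, np.array([0]))
```

Because the margin shrinks the true-class logit, the loss with `m > 0` is strictly larger than the plain cosine cross-entropy for the same inputs, which is what forces tighter intent clusters.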
In some embodiments, the neural network model includes a text labeling layer, a word embedding layer, and a BERT model; a training module 404 to:
inputting a training sample into a text labeling layer for labeling processing to obtain a labeled sample text;
inputting the marked sample text into the word embedding layer for word embedding to obtain a sample text characterization vector;
and inputting the sample text characterization vector into a BERT model for feature learning to obtain a sample semantic characterization vector.
In summary, in the unknown-intention identification apparatus provided in this embodiment, an intention recognition model identifies the samples to be recognized, and intention-unknown samples are screened out from them. Local outlier factor anomaly detection then removes the samples that the model misjudged as intention-unknown due to noise interference, leaving the samples that genuinely carry unknown intentions and improving the recognition accuracy for unknown intentions. After the intention-unknown samples are manually labeled with new intentions, they can be combined with the known-intention samples to form a sample set carrying more types of intentions, on which the intention recognition model is retrained so that it can recognize more types of intentions.
This is particularly relevant to intelligent customer service scenarios, such as an intelligent customer service robot: as the business develops and data accumulates, new business-related user intentions emerge continuously, so the types of user intentions that the intention recognition model in the intelligent customer service can identify must keep growing. With the apparatus, new user intentions can be accurately mined from newly added business data; sample data labeled with those new intentions is then used for model expansion training, finally yielding an intention recognition model that can identify more types of user intentions, further perfecting the model.
Fig. 5 shows a schematic structural diagram of a computer device provided in an exemplary embodiment of the present application. The computer device may be a device that performs the unknown-intention identification method provided herein, and may be a terminal or a server. Specifically:
the computer apparatus 500 includes a Central Processing Unit (CPU) 501, a system Memory 504 including a Random Access Memory (RAM) 502 and a Read Only Memory (ROM) 503, and a system bus 505 connecting the system Memory 504 and the Central Processing Unit 501. The computer device 500 also includes a basic Input/Output System (I/O System)506, which facilitates information transfer between various devices within the computer, and a mass storage device 507, which stores an operating System 513, application programs 514, and other program modules 515.
The basic input/output system 506 comprises a display 508 for displaying information and an input device 509, such as a mouse, keyboard, etc., for user input of information. Wherein a display 508 and an input device 509 are connected to the central processing unit 501 through an input output controller 510 connected to the system bus 505. The basic input/output system 506 may also include an input/output controller 510 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, input-output controller 510 also provides output to a display screen, a printer, or other type of output device.
The mass storage device 507 is connected to the central processing unit 501 through a mass storage controller (not shown) connected to the system bus 505. The mass storage device 507 and its associated computer-readable media provide non-volatile storage for the computer device 500. That is, mass storage device 507 may include a computer-readable medium (not shown) such as a hard disk or Compact disk Read Only Memory (CD-ROM) drive.
Computer-readable media may include computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash Memory or other Solid State Memory technology, CD-ROM, Digital Versatile Disks (DVD), or Solid State Drives (SSD), other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. The Random Access Memory may include a resistive Random Access Memory (ReRAM) and a Dynamic Random Access Memory (DRAM). Of course, those skilled in the art will appreciate that computer storage media is not limited to the foregoing. The system memory 504 and mass storage device 507 described above may be collectively referred to as memory.
According to various embodiments of the present application, the computer device 500 may also be connected, through a network such as the Internet, to a remote computer on the network and run thereon. That is, the computer device 500 may be connected to the network 512 through the network interface unit 511 connected to the system bus 505, or the network interface unit 511 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes one or more programs, and the one or more programs are stored in the memory and configured to be executed by the CPU.
In an alternative embodiment, a computer device is provided that includes a processor and a memory having at least one instruction, at least one program, set of codes, or set of instructions stored therein, which is loaded and executed by the processor to implement the method of unknown intent recognition as described above.
In an alternative embodiment, a computer readable storage medium is provided having stored therein at least one instruction, at least one program, set of codes, or set of instructions that is loaded and executed by a processor to implement the method of unknown intent recognition as described above.
Optionally, the computer-readable storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), Solid State Drive (SSD), or optical disc. The Random Access Memory may include a resistive Random Access Memory (ReRAM) and a Dynamic Random Access Memory (DRAM). The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The present application further provides a computer readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by a processor to implement the method of identifying unknown intents provided by the above-described method embodiments.
The present application also provides a computer program product comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and executes the computer instructions to cause the computer device to perform the method for recognizing unknown intention as described above.
It should be understood that reference herein to "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (11)

1. A method of identifying an unknown intent, the method comprising:
calling an intention identification model to obtain an unknown intention sample when the intention identification of the sample to be identified fails;
performing anomaly detection on a local outlier factor on a semantic feature vector of the unknown intention sample to obtain the local outlier factor density of the unknown intention sample, wherein the semantic feature vector is an intermediate vector generated in an intention identification process aiming at the unknown intention sample;
in response to the local outlier factor density being greater than a density threshold, determining that the unknown-intent sample has an unknown intent, the unknown-intent sample for retraining the intent recognition model after manually labeling a new intent.
2. The method according to claim 1, wherein the calling an intention identification model to obtain an unknown intention sample when the intention identification of the sample to be identified fails comprises:
calling the intention recognition model to perform feature learning on the sample to be recognized to obtain the semantic representation vector;
calling the intention recognition model to match the semantic representation vector with a known intention to obtain the similarity between the intention of the sample to be recognized and the known intention;
in response to the similarity being less than a similarity threshold, determining the sample to be identified as the unknown-intent sample.
3. The method of claim 2, wherein the intent recognition model comprises a fully connected layer and a similarity prediction function;
the calling the intention recognition model to match the semantic representation vector with a known intention to obtain the similarity between the intention of the sample to be recognized and the known intention comprises the following steps:
inputting the semantic representation vector into the full-connection layer to obtain a feature vector after feature mapping;
calculating the similarity between the feature vector and the known intent by the similarity prediction function.
4. The method of claim 2, wherein the intent recognition model comprises a text annotation layer, a word embedding layer, and a bi-directional encoder representation BERT model;
the calling the intention recognition model to perform feature learning on the sample to be recognized to obtain the semantic representation vector comprises the following steps:
inputting the sample to be identified into the text labeling layer for labeling processing to obtain a labeled text;
inputting the marked text into the word embedding layer for word embedding to obtain a text representation vector;
and inputting the text representation vector into the BERT model for feature learning to obtain the semantic representation vector.
5. The method of any one of claims 1 to 4, wherein the determining that the unknown-intent sample has an unknown intent in response to the local outlier factor density being greater than a density threshold comprises:
determining at least two unknown samples with unknown intentions, wherein the at least two unknown samples with unknown intentions correspond to at least two semantic representation vectors;
and performing clustering calculation on the at least two unknown intention samples based on the at least two semantic representation vectors to obtain a sample set of at least one type of unknown intention samples.
6. The method of any of claims 1 to 4, wherein the training process of the intent recognition model comprises:
obtaining a training sample, wherein the training sample is marked with a known intention;
calling a neural network model to perform feature learning on the training samples to obtain sample semantic representation vectors;
calling the neural network model to calculate the loss between the sample semantic representation vector and the known intention to obtain the intention identification loss;
and carrying out back propagation training on the neural network model based on the intention recognition loss, and finally obtaining the trained intention recognition model.
7. The method of claim 6, wherein the neural network model comprises a fully connected layer and a similarity prediction function;
the calling the neural network model to calculate the loss between the sample semantic representation vector and the known intention, and obtaining the intention identification loss comprises:
inputting the sample semantic representation vector into the full-connection layer to obtain a sample feature vector after feature mapping;
calculating a similarity between the sample feature vector and the known intent by the similarity prediction function, the intent recognition loss being determined based on the similarity.
8. The method of claim 6, wherein the neural network model comprises a text labeling layer, a word embedding layer, and a BERT model;
the calling of the neural network model to perform feature learning on the training sample to obtain a sample semantic representation vector comprises the following steps:
inputting the training sample into the text labeling layer for labeling processing to obtain a labeled sample text;
inputting the marked sample text into the word embedding layer for word embedding to obtain a sample text characterization vector;
and inputting the sample text representation vector into the BERT model for feature learning to obtain the sample semantic representation vector.
9. An apparatus for recognizing unknown intentions, the apparatus comprising:
the identification module is used for calling the intention identification model to obtain an unknown intention sample when the intention identification of the sample to be identified fails;
the detection module is used for carrying out anomaly detection on a local outlier factor on a semantic feature vector of the unknown intention sample to obtain the local outlier factor density of the unknown intention sample, wherein the semantic feature vector is an intermediate vector generated in an intention identification process aiming at the unknown intention sample;
a determining module, configured to determine that the unknown intent sample has an unknown intent in response to the local outlier factor density being greater than a density threshold, where the unknown intent sample is used to retrain the intent recognition model after a new intent is manually labeled.
10. A computer device, characterized in that the computer device comprises: a processor and a memory, the memory storing a computer program that is loaded and executed by the processor to implement the method of unknown intent identification as claimed in any of claims 1 to 8.
11. A computer-readable storage medium, in which a computer program is stored, which is loaded and executed by a processor to implement the method of identification of unknown intentions as claimed in any one of claims 1 to 8.
CN202110297693.5A 2021-03-19 2021-03-19 Unknown intention recognition method, device, equipment and storage medium Active CN112966088B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110297693.5A CN112966088B (en) 2021-03-19 2021-03-19 Unknown intention recognition method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110297693.5A CN112966088B (en) 2021-03-19 2021-03-19 Unknown intention recognition method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112966088A CN112966088A (en) 2021-06-15
CN112966088B true CN112966088B (en) 2022-06-03

Family

ID=76277794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110297693.5A Active CN112966088B (en) 2021-03-19 2021-03-19 Unknown intention recognition method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112966088B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113780007A (en) * 2021-10-22 2021-12-10 平安科技(深圳)有限公司 Corpus screening method, intention recognition model optimization method, equipment and storage medium
CN114564964B (en) * 2022-02-24 2023-05-26 杭州中软安人网络通信股份有限公司 Unknown intention detection method based on k nearest neighbor contrast learning
CN115168593B (en) * 2022-09-05 2022-11-29 深圳爱莫科技有限公司 Intelligent dialogue management method capable of self-learning and processing equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110633724A (en) * 2018-06-25 2019-12-31 中兴通讯股份有限公司 Intention recognition model dynamic training method, device, equipment and storage medium
CN111292752A (en) * 2018-12-06 2020-06-16 北京嘀嘀无限科技发展有限公司 User intention identification method and device, electronic equipment and storage medium
CN112148874A (en) * 2020-07-07 2020-12-29 四川长虹电器股份有限公司 Intention identification method and system capable of automatically adding potential intention of user

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104504901B (en) * 2014-12-29 2016-06-08 浙江银江研究院有限公司 A kind of traffic abnormity point detecting method based on multidimensional data
KR102338618B1 (en) * 2017-07-25 2021-12-10 삼성에스디에스 주식회사 Method for providing chatting service with chatbot assisted by human agents
CN111144124B (en) * 2018-11-02 2023-10-20 华为技术有限公司 Training method of machine learning model, intention recognition method, and related device and equipment
CN110334347A (en) * 2019-06-27 2019-10-15 腾讯科技(深圳)有限公司 Information processing method, relevant device and storage medium based on natural language recognition
CN111881991B (en) * 2020-08-03 2023-11-10 联仁健康医疗大数据科技股份有限公司 Method and device for identifying fraud and electronic equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110633724A (en) * 2018-06-25 2019-12-31 中兴通讯股份有限公司 Intention recognition model dynamic training method, device, equipment and storage medium
CN111292752A (en) * 2018-12-06 2020-06-16 北京嘀嘀无限科技发展有限公司 User intention identification method and device, electronic equipment and storage medium
CN112148874A (en) * 2020-07-07 2020-12-29 四川长虹电器股份有限公司 Intention identification method and system capable of automatically adding potential intention of user

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Unknown Intent Detection Using Gaussian Mixture Model with an Application to Zero-shot Intent Classification;Guangfeng Yan等;《arXiv》;20190603;第1050-1060页 *

Also Published As

Publication number Publication date
CN112966088A (en) 2021-06-15

Similar Documents

Publication Publication Date Title
CN110795543B (en) Unstructured data extraction method, device and storage medium based on deep learning
CN111767405B (en) Training method, device, equipment and storage medium of text classification model
CN112966088B (en) Unknown intention recognition method, device, equipment and storage medium
CN108694225B (en) Image searching method, feature vector generating method and device and electronic equipment
US11210470B2 (en) Automatic text segmentation based on relevant context
CN114821271B (en) Model training method, image description generation device and storage medium
CN107992937B (en) Unstructured data judgment method and device based on deep learning
CN113610787A (en) Training method and device of image defect detection model and computer equipment
CN109871891B (en) Object identification method and device and storage medium
CN112131876A (en) Method and system for determining standard problem based on similarity
CN110633475A (en) Natural language understanding method, device and system based on computer scene and storage medium
CN114287005A (en) Negative sampling algorithm for enhancing image classification
CN112100377A (en) Text classification method and device, computer equipment and storage medium
CN110909768B (en) Method and device for acquiring marked data
CN115858776B (en) Variant text classification recognition method, system, storage medium and electronic equipment
CN110929013A (en) Image question-answer implementation method based on bottom-up entry and positioning information fusion
CN115713082A (en) Named entity identification method, device, equipment and storage medium
CN114117037A (en) Intention recognition method, device, equipment and storage medium
CN115080745A (en) Multi-scene text classification method, device, equipment and medium based on artificial intelligence
CN114780757A (en) Short media label extraction method and device, computer equipment and storage medium
CN114036946B (en) Text feature extraction and auxiliary retrieval system and method
US11934794B1 (en) Systems and methods for algorithmically orchestrating conversational dialogue transitions within an automated conversational system
CN117009595A (en) Text paragraph acquisition method and device, storage medium and program product thereof
CN117390454A (en) Data labeling method and system based on multi-domain self-adaptive data closed loop
CN116186255A (en) Method for training unknown intention detection model, unknown intention detection method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20221028

Address after: 1311, Floor 13, No. 27, Zhongguancun Street, Haidian District, Beijing 100080

Patentee after: QIANDAI (BEIJING) INFORMATION TECHNOLOGY CO.,LTD.

Patentee after: BEIJING SANKUAI ONLINE TECHNOLOGY Co.,Ltd.

Address before: 100080 2106-030, 9 North Fourth Ring Road, Haidian District, Beijing.

Patentee before: BEIJING SANKUAI ONLINE TECHNOLOGY Co.,Ltd.