CN113139063A - Intention recognition method, device, equipment and storage medium - Google Patents

Intention recognition method, device, equipment and storage medium

Info

Publication number
CN113139063A
CN113139063A (application CN202110682988.4A)
Authority
CN
China
Prior art keywords
model
data
sample data
domain
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110682988.4A
Other languages
Chinese (zh)
Other versions
CN113139063B (en)
Inventor
丁嘉罗
董世超
乔建秀
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202110682988.4A
Publication of CN113139063A
Application granted
Publication of CN113139063B
Legal status: Active

Classifications

    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/35 Information retrieval of unstructured textual data: clustering; classification
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F40/216 Natural language analysis: parsing using statistical methods
    • G06F40/30 Handling natural language data: semantic analysis


Abstract

The invention relates to the field of artificial intelligence and provides an intention recognition method, apparatus, device, and storage medium. A Transformer model is pre-trained with first sample data from a source domain to obtain a feature extraction model; the feature extraction model is externally connected to a first preset classifier to obtain a first initial model; the first initial model undergoes supervised learning with second sample data associated with the target task to obtain a first model; data from a target domain is obtained as third sample data; the first model is externally connected to a second preset classifier to obtain a second initial model; the second initial model undergoes adversarial training with the third sample data to obtain an intention recognition model; and data to be processed is input to the intention recognition model, whose output is taken as the target intention. By combining the idea of transfer learning with a multi-level classification framework, the invention achieves accurate intention recognition. The invention also relates to blockchain technology: the intention recognition model can be stored on blockchain nodes.

Description

Intention recognition method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to an intention recognition method, apparatus, device, and storage medium.
Background
Dialog systems and intelligent assistants are currently typical applications of artificial intelligence; serving as efficient and friendly interactive interfaces, they are widely used in software for various service scenarios. A core component of a dialog system is the intent understanding module, which receives a natural-language instruction or utterance and judges the intent category of its sender.
Typically, intent recognition is modeled as a supervised text classification task: a large amount of dialog text is labeled, and a machine learning model is trained in a supervised manner until it can judge the category of an input text. To ensure model quality, this training paradigm usually rests on three assumptions:
1. a large amount of training data exists or can be labeled;
2. the training data and the prediction data are of the same distribution;
3. the training model can extract the corresponding domain knowledge.
However, in a real industrial environment, these three assumptions do not always hold. Labeling and acquiring data is often costly, and under the deep learning paradigm in particular, training a model usually requires a large amount of training data. In addition, a large amount of domain-specific data exists in industry, but its distribution differs from that of the target task data, so it cannot be used directly for training and prediction. Finally, a single supervised learning task usually captures limited information and cannot obtain additional knowledge from broader domain corpora and related tasks.
Disclosure of Invention
Embodiments of the invention provide an intention recognition method, apparatus, device, and storage medium that combine the idea of transfer learning with a multi-level classification framework to achieve accurate intention recognition.
In a first aspect, an embodiment of the present invention provides an intention identification method, which includes:
in response to an intention identification instruction, determining a source domain according to the intention identification instruction, and constructing first sample data;
pre-training a Transformer model by using the first sample data to obtain a feature extraction model;
determining a target task, and acquiring data associated with the target task as second sample data;
externally connecting the feature extraction model with a first preset classifier to obtain a first initial model, and performing supervised learning on the first initial model by using the second sample data to obtain a first model;
determining a target domain, and acquiring data of the target domain as third sample data;
externally connecting the first model with a second preset classifier to obtain a second initial model;
constructing a target loss function;
performing adversarial training on the second initial model with the third sample data, based on the target loss function, to obtain an intention recognition model;
and acquiring data to be processed, inputting the data to be processed into the intention recognition model, and acquiring the output of the intention recognition model as a target intention.
According to the preferred embodiment of the present invention, the pre-training the Transformer model by using the first sample data to obtain the feature extraction model includes:
acquiring general corpus data from the first sample data;
randomly selecting words from the general corpus data, and replacing the selected words with masks;
randomly disturbing the sentence relation of the general corpus data;
performing masking prediction training on the Transformer model according to the mask, and performing next sentence prediction training on the Transformer model by using the disordered sentences to obtain an intermediate model;
and acquiring the data of the source domain from the first sample data, and retraining the intermediate model by using the data of the source domain to obtain the feature extraction model.
According to a preferred embodiment of the present invention, the performing supervised learning on the first initial model by using the second sample data to obtain a first model includes:
inputting the second sample data into the feature extraction model for feature extraction to obtain an embedded vector representation of the second sample data;
acquiring a label of each data in the second sample data;
taking the embedded vector representation of the second sample data as a training sample, training the first preset classifier according to the label of each data until the loss of the first preset classifier reaches convergence, and stopping training;
determining a current first initial model as the first model.
According to a preferred embodiment of the present invention, said constructing the objective loss function comprises:
constructing a class classification loss function for the first preset classifier;
constructing a domain classification loss function;
configuring a first weight for the class classification loss function and a second weight for the domain classification loss function;
and calculating a weighted sum according to the first weight, the second weight, the class classification loss function and the domain classification loss function to obtain the target loss function.
According to a preferred embodiment of the present invention, the formula of the class classification loss function is as follows:

L_class = -Σ_i log P(y_i | f(x_i))

where L_class denotes the class classification loss function, P(y_i | f(x_i)) denotes the probability that the first preset classifier outputs the label y_i, f(x) denotes the embedded vector representation output by the feature extraction model, x_i denotes the i-th input, y_i denotes the label corresponding to x_i, and i is a positive integer.
According to a preferred embodiment of the present invention, the formula of the domain classification loss function is as follows:

L_domain = -Σ_i log P(d_i | f(x_i))

where L_domain denotes the domain classification loss function, d_i denotes the domain label of the i-th input, and P(d_i | f(x_i)) denotes the probability that the second preset classifier outputs the label d_i.
According to a preferred embodiment of the present invention, performing the adversarial training on the second initial model with the third sample data based on the target loss function to obtain the intention recognition model includes:
randomly extracting data with a preset proportion from the third sample data, and recording the label of the extracted data as the label of the source domain to obtain fourth sample data;
and performing gradient-reversal training on the second initial model with the fourth sample data until the value of the target loss function no longer decreases, then stopping training to obtain the intention recognition model.
In a second aspect, an embodiment of the present invention provides an intention identifying apparatus, which includes:
the construction unit is used for responding to the intention identification instruction, determining a source domain according to the intention identification instruction and constructing first sample data;
the training unit is used for pre-training the Transformer model by utilizing the first sample data to obtain a feature extraction model;
the acquisition unit is used for determining a target task and acquiring data associated with the target task as second sample data;
the learning unit is used for externally connecting the feature extraction model with a first preset classifier to obtain a first initial model, and performing supervised learning on the first initial model by using the second sample data to obtain a first model;
the acquiring unit is further configured to determine a target domain and acquire data of the target domain as third sample data;
the external unit is used for externally connecting the first model with a second preset classifier to obtain a second initial model;
the construction unit is also used for constructing a target loss function;
the training unit is further configured to perform adversarial training on the second initial model with the third sample data, based on the target loss function, to obtain an intention recognition model;
the acquisition unit is further configured to acquire data to be processed, input the data to be processed to the intention recognition model, and acquire an output of the intention recognition model as a target intention.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the intention identifying method according to the first aspect when executing the computer program.
In a fourth aspect, the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, causes the processor to execute the intention identifying method according to the first aspect.
Embodiments of the invention provide an intention recognition method, apparatus, computer device, and storage medium. A Transformer model is pre-trained with first sample data from a source domain to obtain a feature extraction model; the feature extraction model is externally connected to a first preset classifier to obtain a first initial model; the first initial model undergoes supervised learning with second sample data associated with the target task to obtain a first model; data from a target domain is obtained as third sample data; the first model is externally connected to a second preset classifier to obtain a second initial model; the second initial model undergoes adversarial training with the third sample data, based on a target loss function, to obtain an intention recognition model; and data to be processed is input to the intention recognition model, whose output is taken as the target intention. By combining the idea of transfer learning with a multi-level classification framework, the invention achieves accurate intention recognition.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Evidently, the drawings described below illustrate only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of an intention identification method according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of an intent recognition apparatus provided by an embodiment of the present invention;
FIG. 3 is a schematic block diagram of a computer device provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Fig. 1 is a schematic flow chart of an intention identification method according to an embodiment of the invention.
S10, responding to the intention identification instruction, determining a source domain according to the intention identification instruction, and constructing first sample data.
In this embodiment, the intention identification instruction may be configured to be triggered periodically or by a relevant worker, and the present invention is not limited thereto.
In the present embodiment, the source domain refers to a domain with sufficient data to meet the training requirements, such as the insurance domain.
In at least one embodiment of the invention, the determining a source domain from the intent recognition instruction and constructing the first sample data comprises:
analyzing the intention identification instruction to obtain information carried by the intention identification instruction;
acquiring a label corresponding to the field;
constructing a regular expression according to the acquired label;
traversing the information carried by the intention identification instruction by using the regular expression, and determining the traversed information as a domain name;
determining the source domain according to the domain name;
acquiring data of the source domain;
calling a general corpus and acquiring data of the general corpus;
integrating the acquired data as the first sample data.
For example: the label can be NAME, and the constructed regular expression is NAME ().
Through the above embodiment, the source domain can be accurately determined based on the label and the regular expression, and the training samples are then constructed by combining the source-domain data with the general corpus to ensure the effect of subsequent model training.
And S11, pre-training the Transformer model by using the first sample data to obtain a feature extraction model.
In this embodiment, the pre-training the Transformer model by using the first sample data to obtain a feature extraction model includes:
acquiring general corpus data from the first sample data;
randomly selecting words from the general corpus data, and replacing the selected words with masks;
randomly disturbing the sentence relation of the general corpus data;
performing masking prediction training on the Transformer model according to the mask, and performing next sentence prediction training on the Transformer model by using the disordered sentences to obtain an intermediate model;
and acquiring the data of the source domain from the first sample data, and retraining the intermediate model by using the data of the source domain to obtain the feature extraction model.
For example, data from an arbitrary Chinese encyclopedia database is used as the general corpus data. Consider a sentence in the general corpus data: "Why can company A be said to be your first choice for insurance? Because choosing company A means choosing a top company in the insurance industry as a development platform." The word "company A" is randomly selected from the sentence and replaced with "[MASK]", yielding the replaced sentence: "Why can [MASK] be said to be your first choice for insurance? Because choosing [MASK] means choosing a top company in the insurance industry as a development platform." The replaced sentence is taken as input to the Transformer model; if the model predicts wrongly and the loss increases, gradient descent updates the model parameters. In this label-free unsupervised learning process, the training goal is to make the model capture as much context information as possible in order to correctly predict the masked words.
Meanwhile, taking the sentence above as an example, its sentence order is randomly shuffled, changing the original into: "Because choosing company A means choosing a top company in the insurance industry as a development platform. Why can company A be said to be your first choice for insurance?" That is, the order of 50% of sentence pairs is randomly shuffled. Before shuffling, the two sentences stand in a preceding/following relationship; after shuffling they do not, and training aims to make the model recognize this relationship as reliably as possible.
Through masked-word prediction training and next-sentence prediction training, the model predicts the randomly masked words and the sentence relationships from limited context information, thereby learning context-based text semantics and text representation capability.
Furthermore, the intermediate model is retrained through the data of the source domain (such as an insurance domain), so that the model has the semantic information and language expression capability of a specific domain.
In the above embodiment, pre-training on a large amount of easily obtained general corpus and re-training on a small amount of hard-to-obtain source-domain corpus strengthen the model's feature representation capability in the specific field. This hierarchical, progressive training mode makes full use of the difference in corpus sizes and meets the requirements of the vertical domain.
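The two corruption steps above can be sketched in plain Python. The 15% mask ratio and the `[MASK]` token follow common BERT-style practice and are assumptions, not figures stated in the patent:

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, ratio=0.15, rng=None):
    """Randomly replace a fraction of tokens with [MASK]; return the
    masked sequence and the positions/words the model must recover."""
    rng = rng or random.Random(0)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < ratio:
            targets[i] = tok          # remember the gold word
            masked.append(MASK)
        else:
            masked.append(tok)
    return masked, targets

def make_nsp_example(sent_a, sent_b, rng=None):
    """Keep or swap a consecutive sentence pair with equal probability;
    the label records whether the original order was preserved."""
    rng = rng or random.Random(0)
    if rng.random() < 0.5:
        return (sent_a, sent_b), 1    # original order kept
    return (sent_b, sent_a), 0        # order shuffled
```

A real pipeline would feed the masked sequences and labeled pairs to the Transformer and backpropagate the prediction losses; this sketch only shows how the training examples are corrupted.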
S12, determining a target task, and acquiring data associated with the target task as second sample data.
In this embodiment, the target task may include intention recognition, and the target task may be determined according to the intention recognition instruction or according to a configuration of a user, and the present invention is not limited thereto.
In at least one embodiment of the present invention, the acquiring data associated with the target task as second sample data includes:
acquiring historical data of the target task;
identifying tagged data from the historical data as the second sample data; and/or
And acquiring data from the historical data, and performing label processing on the acquired data to obtain the second sample data.
It should be noted that, because supervised training is performed subsequently, a large amount of labeled data is required as support.
And S13, externally connecting the feature extraction model with a first preset classifier to obtain a first initial model, and performing supervised learning on the first initial model by using the second sample data to obtain a first model.
In this embodiment, the first preset classifier may be any conventional classification model, such as a fully connected layer combined with a Softmax activation function.
In this embodiment, the structure of the first initial model is that the trained feature extraction model is externally connected with a first preset classifier.
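A minimal sketch of such a classifier head, written in plain Python without a deep learning framework; the shapes and names are illustrative, not the patent's own:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classifier_head(embedding, weights, bias):
    """Fully connected layer followed by softmax, as described for the
    first preset classifier. weights is [num_classes][embedding_dim]."""
    logits = [sum(w * x for w, x in zip(row, embedding)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)
```

The embedding would come from the feature extraction model; the returned probabilities are the P(y | f(x)) values used in the class classification loss.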
Specifically, the performing supervised learning on the first initial model by using the second sample data to obtain a first model includes:
inputting the second sample data into the feature extraction model for feature extraction to obtain an embedded vector representation of the second sample data;
acquiring a label of each data in the second sample data;
taking the embedded vector representation of the second sample data as a training sample, training the first preset classifier according to the label of each data until the loss of the first preset classifier reaches convergence, and stopping training;
determining a current first initial model as the first model.
It can be understood that, since the first initial model is an integral model formed by the feature extraction model externally connecting the first preset classifier, the parameters of the feature extraction model and the first preset classifier are updated simultaneously when supervised learning is performed.
Through the embodiment, the model can be finely adjusted by combining with an actual target task.
Of course, depending on the specific service scenario, a certain proportion of the feature extraction model's parameters can be frozen and left un-updated, so that only a small fraction of the parameters is updated together with the first preset classifier to fine-tune the feature extraction model. For example, when the feature extraction model is a 12-layer (base) model, the first 10 Transformer layers counting up from the bottom are frozen; the exact number of frozen layers is a hyper-parameter that needs to be tuned according to the actual effect.
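A sketch of such layer freezing, assuming BERT-style parameter names of the form `encoder.layer.<k>`, which the patent does not itself specify:

```python
import re

def trainable_mask(param_names, n_frozen=10):
    """Mark parameters of the bottom n_frozen encoder layers as frozen
    (False); everything else, including the classifier head, stays
    trainable (True)."""
    mask = {}
    for name in param_names:
        m = re.match(r"encoder\.layer\.(\d+)\.", name)
        mask[name] = not (m and int(m.group(1)) < n_frozen)
    return mask
```

In a framework such as PyTorch the same idea is usually expressed by setting `requires_grad = False` on the frozen parameters; here the mask simply records which parameters an optimizer should update.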
S14, determining a target domain and acquiring data of the target domain as third sample data.
In the present embodiment, the target domain refers to a domain that relatively lacks data support, such as the banking domain. The target domain may be determined according to the intention recognition instruction or according to a user's configuration, and the present invention is not limited thereto.
It should be noted that, due to the lack of sufficient data support, the data in the target domain may be tagged, untagged, or partially tagged.
And S15, externally connecting the first model with a second preset classifier to obtain a second initial model.
In this embodiment, the second preset classifier may also be any conventional classification model, such as a fully connected layer combined with a Softmax activation function, and the present invention is not limited thereto.
The structure of the second initial model is that the first model obtained by training is externally connected with the second preset classifier, namely the second initial model is composed of the feature extraction model, the first preset classifier and the second preset classifier.
S16, constructing an objective loss function.
In keeping with the above embodiments, the constructing the target loss function includes:
constructing a class classification loss function for the first preset classifier;
constructing a domain classification loss function;
configuring a first weight for the class classification loss function and a second weight for the domain classification loss function;
and calculating a weighted sum according to the first weight, the second weight, the class classification loss function and the domain classification loss function to obtain the target loss function.
The first weight and the second weight may be configured by user-defined, and may be adjusted dynamically according to an actual application scenario, which is not limited in the present invention.
Through the configuration of the loss function, the class classification loss function and the domain classification loss function are integrated together to be used as the overall loss, so that the trained model can meet the requirements of class classification and domain classification at the same time.
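As a minimal sketch, the weighted combination reads as follows; the weight values shown are placeholders, since the patent leaves both weights user-configurable:

```python
def target_loss(class_loss, domain_loss, w_class=1.0, w_domain=0.5):
    """Weighted sum of the class classification loss and the domain
    classification loss: the first weight scales the class loss, the
    second scales the domain loss."""
    return w_class * class_loss + w_domain * domain_loss
```

Raising `w_domain` pushes training toward domain-indistinguishable features, while raising `w_class` prioritizes class accuracy; the balance is tuned per application scenario.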
Specifically, the formula of the class classification loss function is as follows:

L_class = -Σ_i log P(y_i | f(x_i))

where L_class denotes the class classification loss function, P(y_i | f(x_i)) denotes the probability that the first preset classifier outputs the label y_i, f(x) denotes the embedded vector representation output by the feature extraction model, x_i denotes the i-th input, y_i denotes the label corresponding to x_i, and i is a positive integer.
Through the above embodiment, the class classification loss function is first constructed for the first preset classifier, so as to realize correct classification of the classes.
Further, the domain classification loss function is formulated as follows:

L_domain = -Σ_i log P(d_i | f(x_i))

where L_domain denotes the domain classification loss function, d_i denotes the domain label of the i-th input, and P(d_i | f(x_i)) denotes the probability that the second preset classifier outputs the label d_i. Here d identifies a particular domain; for example, d = 1 can be configured to denote the target domain and d = 0 the source domain.
Through the above embodiment, a domain classification loss function is constructed for the second preset classifier to achieve correct domain classification.
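On a common reading of the definitions above, both losses are negative log-likelihoods of the gold labels. A small pure-Python sketch, assuming the classifiers already output a probability for each gold label:

```python
import math

def class_loss(gold_probs):
    """Negative log-likelihood of the gold class labels, where
    gold_probs[i] is P(y_i | f(x_i)) from the class classifier."""
    return -sum(math.log(p) for p in gold_probs)

def domain_loss(gold_domain_probs):
    """Same form for the domain classifier: gold_domain_probs[i] is
    the probability it assigns to the true domain label d_i."""
    return -sum(math.log(p) for p in gold_domain_probs)
```

A classifier that is certain and correct (probability 1.0 on every gold label) incurs zero loss, and the loss grows as the assigned probabilities shrink.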
And S17, performing adversarial training on the second initial model with the third sample data, based on the target loss function, to obtain an intention recognition model.
In this embodiment, performing the adversarial training on the second initial model with the third sample data based on the target loss function to obtain the intention recognition model includes:
randomly extracting a preset proportion of data from the third sample data, and recording the label of the extracted data as the source-domain label to obtain fourth sample data;
and performing simulated gradient-reversal training on the second initial model by using the fourth sample data, stopping the training when the value of the target loss function no longer decreases, so as to obtain the intention recognition model.
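The relabeling step above can be sketched in a few lines. This is only an illustration under assumed data shapes (a list of (features, domain-label) pairs); the function name and the fixed seed are ours, not the patent's:

```python
import random

def make_fourth_sample(third_sample, proportion, source_label=0, seed=42):
    """Randomly pick a preset proportion of target-domain samples and
    record their domain label as the source-domain label."""
    rng = random.Random(seed)
    n_flip = int(len(third_sample) * proportion)
    flip_idx = set(rng.sample(range(len(third_sample)), n_flip))
    fourth = []
    for i, (features, domain_label) in enumerate(third_sample):
        # flipped samples are recorded with the source-domain label
        fourth.append((features, source_label if i in flip_idx else domain_label))
    return fourth

# A batch of target-domain data: every sample initially carries domain label 1.
batch = [([0.1 * i], 1) for i in range(10)]
fourth_sample = make_fourth_sample(batch, proportion=0.3)
```

During adversarial training these flipped labels make the domain classifier's task harder, which is what drives the feature representations of the two domains together.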
Specifically, the intention recognition model includes a feature extraction model, a class classifier (i.e., the first preset classifier), and a domain classifier (i.e., the second preset classifier).
Wherein the function of the feature extraction model is:
(1) extracting features required by classification of a subsequent classifier;
(2) mapping the source-domain data and the target-domain data to the same space.
The function of the category classifier is to classify the extracted source-domain data features.
The function of the domain classifier is to judge whether the extracted feature information comes from the source domain or the target domain.
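The three roles above compose as follows; this is a minimal sketch with plain callables standing in for the real networks (all names are illustrative, not the patent's code):

```python
class SecondInitialModel:
    """Feature extraction model whose shared output feeds both the
    class classifier and the domain classifier."""
    def __init__(self, extract, class_classifier, domain_classifier):
        self.extract = extract
        self.class_classifier = class_classifier
        self.domain_classifier = domain_classifier

    def forward(self, x):
        features = self.extract(x)                 # shared feature space
        return (self.class_classifier(features),   # which class
                self.domain_classifier(features))  # which domain

# Toy stand-ins: doubling as "feature extraction", +1 / -1 as classifiers.
model = SecondInitialModel(lambda x: 2 * x, lambda f: f + 1, lambda f: f - 1)
class_out, domain_out = model.forward(3)
```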
The whole training process follows the paradigm of adversarial learning. During training, a portion of the target-domain labels within each batch is dynamically changed into source-domain labels, forming an adversarial relationship that reverses the gradient. This increases the training difficulty of the domain classifier, so that it cannot correctly judge which domain the feature information comes from. Once the loss function of the domain classifier reaches its minimum, the target-domain data and the source-domain data are mixed together; at this point the feature extractor maps both to the same feature space, solving the problem of inconsistent data distributions.
Meanwhile, the feature extraction model is constrained so that the extracted feature information remains usable for the subsequent classification task, addressing the small data volume of the target domain and its lack of classification labels. To this end, the class classification loss and the domain classification loss are added together in a certain proportion to obtain an overall loss function. Based on this overall loss function, the parameters of all three parts of the framework are updated simultaneously by gradient descent. When the overall loss function reaches its minimum, the target-domain data and the source-domain data can be mapped to the same feature space and correctly classified.
In practice, target-domain data may be scarce, training corpora may be hard to obtain, or the target domain may have a large amount of unlabeled data whose manual labeling is expensive; traditional supervised learning, and deep learning in particular, requires a large amount of labeled data to fit a model during training. With the knowledge- and domain-transfer method described here, existing models and the capabilities they have learned can be reused for similar tasks, or for tasks in the same domain, in a few-shot or zero-shot manner. This reduces data and labor costs, provides a high-quality intention recognition service from limited data, and allows new intentions to be added flexibly, giving easy extensibility. At the same time, a bidirectional encoder representation model containing text-related information from multiple domains is obtained, which can be reused for other text-related tasks.
S18, acquiring data to be processed, inputting the data to be processed into the intention recognition model, and acquiring the output of the intention recognition model as a target intention.
In this embodiment, the data to be processed may be uploaded by a relevant worker, which is not limited in the present invention.
In the embodiment, the intention recognition model obtained through training can be combined with the idea of transfer learning and a multi-level classification framework to realize accurate recognition of the intention.
It should be noted that, in order to further ensure the security of the data and avoid malicious tampering of the data, the intention identification model may be stored on the blockchain node.
According to the technical scheme, the method pre-trains a Transformer model with first sample data of a source domain to obtain a feature extraction model; connects the feature extraction model to a first preset classifier to obtain a first initial model; performs supervised learning on the first initial model with second sample data associated with a target task to obtain a first model; acquires data of a target domain as third sample data; connects the first model to a second preset classifier to obtain a second initial model; performs adversarial training on the second initial model based on a target loss function and the third sample data to obtain an intention recognition model; and inputs data to be processed into the intention recognition model, taking its output as the target intention. The invention thus combines the idea of transfer learning with a multi-level classification framework to recognize intentions accurately.
Embodiments of the present invention further provide an intention identification apparatus, which is configured to execute any one of the embodiments of the aforementioned intention identification method. Specifically, referring to fig. 2, fig. 2 is a schematic block diagram of an intention identifying apparatus according to an embodiment of the present invention.
As shown in fig. 2, the intention recognition apparatus 100 includes: the device comprises a construction unit 101, a training unit 102, an acquisition unit 103, a learning unit 104 and an external unit 105.
In response to the intention identifying instruction, the constructing unit 101 determines a source domain from the intention identifying instruction, and constructs first sample data.
In this embodiment, the intention identification instruction may be configured to be triggered periodically or by a relevant worker, and the present invention is not limited thereto.
In the present embodiment, the source domain refers to a domain having sufficient data satisfying training requirements as a support, such as an insurance domain.
In at least one embodiment of the present invention, the constructing unit 101 determines the source domain according to the intention identifying instruction, and constructing the first sample data includes:
analyzing the intention identification instruction to obtain information carried by the intention identification instruction;
acquiring a label corresponding to the field;
constructing a regular expression according to the acquired label;
traversing the information carried by the intention identification instruction by using the regular expression, and determining the traversed information as a domain name;
determining the source domain according to the domain name;
acquiring data of the source domain;
calling a general corpus and acquiring data of the general corpus;
integrating the acquired data as the first sample data.
For example: the label can be NAME, and the constructed regular expression is NAME ().
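The patent does not spell out the complete pattern (the "NAME ()" above appears truncated), so the following is only a hypothetical sketch of label-driven extraction; the separator in the pattern and the helper names are our assumptions:

```python
import re

def build_pattern(label):
    # Hypothetical pattern: the label followed by a captured word;
    # the patent's actual regular expression is not given in full.
    return re.compile(re.escape(label) + r"\s*[:=]\s*(\w+)")

def find_domain_name(pattern, instruction_info):
    """Traverse the information carried by the instruction and return
    the first match as the domain name, per the steps above."""
    match = pattern.search(instruction_info)
    return match.group(1) if match else None

pattern = build_pattern("NAME")
domain_name = find_domain_name(pattern, "task=intent NAME: insurance lang=zh")
```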
Through this embodiment, the source domain can be accurately determined based on the label and the regular expression, and the training samples are then constructed by combining the source-domain data with the general corpus so as to ensure the effect of subsequent model training.
The training unit 102 pre-trains the Transformer model by using the first sample data to obtain a feature extraction model.
In this embodiment, the pre-training the transform model by the training unit 102 using the first sample data to obtain the feature extraction model includes:
acquiring general corpus data from the first sample data;
randomly selecting words from the general corpus data, and replacing the selected words with masks;
randomly disturbing the sentence relation of the general corpus data;
performing masking prediction training on the Transformer model according to the mask, and performing next sentence prediction training on the Transformer model by using the disordered sentences to obtain an intermediate model;
and acquiring the data of the source domain from the first sample data, and retraining the intermediate model by using the data of the source domain to obtain the feature extraction model.
For example, data from any Chinese encyclopedia database is used as the general corpus data, such as the sentence: "Why can Company A be said to be the first choice for your insurance career? Because choosing Company A also means choosing a top company of the insurance industry as a development platform." The word "Company A" is randomly selected from the sentence and replaced with "mask", resulting in the replaced sentence: "Why can 'mask' be said to be the first choice for your insurance career? Because choosing 'mask' also means choosing a top company of the insurance industry as a development platform." The replaced sentence is taken as the input of the Transformer model; if the model predicts incorrectly and the loss increases, gradient descent is performed to update the model parameters. In this unlabeled, unsupervised learning process, the goal of training is for the model to capture as much context information as possible so as to correctly predict the masked words.
Meanwhile, taking the above sentence as an example, the sentence order is randomly shuffled, changing the original into: "Because choosing Company A also means choosing a top company of the insurance industry as a development platform. Why can Company A be said to be the first choice for your insurance career?"; that is, the ordering of 50% of sentence pairs is randomly shuffled. It can be understood that before shuffling the two sentences stand in a consecutive (preceding and following) relationship, while after shuffling they do not, and training aims to make the model recognize this consecutive-sentence relationship as far as possible.
By masking prediction training and next sentence prediction training, the randomly masked word and sentence relation is predicted according to limited context information, so that the model learns text semantic information and text representation capability based on context.
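The two data-preparation steps behind these objectives can be illustrated as follows; tokenization and the Transformer itself are omitted, and the helper names are ours, not the patent's:

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Randomly replace a fraction of the tokens with the mask token,
    keeping the originals as prediction targets (masked-word training)."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append(mask_token)
            targets[i] = tok          # the word the model must recover
        else:
            masked.append(tok)
    return masked, targets

def make_sentence_pair(sent_a, sent_b, shuffled):
    """Build a next-sentence-prediction example: label 1 when the pair
    keeps its original order, 0 when the order has been shuffled."""
    return ((sent_b, sent_a), 0) if shuffled else ((sent_a, sent_b), 1)
```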
Furthermore, the intermediate model is retrained through the data of the source domain (such as an insurance domain), so that the model has the semantic information and language expression capability of a specific domain.
In the above embodiment, pre-training on a large amount of easily obtained general corpus data, followed by re-training on a small amount of hard-to-obtain source-domain corpus data, enhances the model's feature representation capability in the specific field. This hierarchical, progressive training mode makes full use of the difference in corpus quantity and meets the requirements of the vertical field.
The acquisition unit 103 determines a target task and acquires data associated with the target task as second sample data.
In this embodiment, the target task may include intention recognition, and the target task may be determined according to the intention recognition instruction or according to a configuration of a user, and the present invention is not limited thereto.
In at least one embodiment of the present invention, the acquiring unit 103 acquires data associated with the target task as second sample data includes:
acquiring historical data of the target task;
identifying tagged data from the historical data as the second sample data; and/or
acquiring data from the historical data, and labeling the acquired data to obtain the second sample data.
It should be noted that, because supervised training is performed subsequently, a large amount of labeled data is required to be used as data support.
The learning unit 104 connects the feature extraction model with a first preset classifier to obtain a first initial model, and performs supervised learning on the first initial model by using the second sample data to obtain a first model.
In this embodiment, the first preset classifier may be any conventional classification model, such as a fully connected layer combined with a Softmax activation function.
In this embodiment, the structure of the first initial model is that the trained feature extraction model is externally connected with a first preset classifier.
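The fully connected layer followed by Softmax, mentioned above as one conventional choice, can be written out in plain Python (toy dimensions; a real model would use a tensor library):

```python
import math

def softmax(logits):
    mx = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - mx) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fc_softmax(embedding, weights, biases):
    """One fully connected layer (weights @ embedding + biases)
    followed by Softmax over the class logits."""
    logits = [sum(w * e for w, e in zip(row, embedding)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)

# Two classes, two-dimensional embedded vector.
probs = fc_softmax([1.0, 0.0], weights=[[2.0, 0.0], [0.0, 2.0]], biases=[0.0, 0.0])
```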
Specifically, the learning unit 104 performs supervised learning on the first initial model by using the second sample data, and obtaining a first model includes:
inputting the second sample data into the feature extraction model for feature extraction to obtain an embedded vector representation of the second sample data;
acquiring a label of each data in the second sample data;
taking the embedded vector representation of the second sample data as a training sample, training the first preset classifier according to the label of each data until the loss of the first preset classifier reaches convergence, and stopping training;
determining a current first initial model as the first model.
It can be understood that, since the first initial model is an integral model formed by the feature extraction model externally connecting the first preset classifier, the parameters of the feature extraction model and the first preset classifier are updated simultaneously when supervised learning is performed.
Through the embodiment, the model can be finely adjusted by combining with an actual target task.
Of course, depending on the specific service scenario, a certain proportion of the parameters of the feature extraction model may be frozen and left un-updated. In this way, only a small fraction of the parameters are updated together with the first preset classifier, achieving fine-tuning of the feature extraction model. For example, when the feature extraction model is a 12-layer (base) model, the bottom 10 Transformer layers are frozen; the exact number of frozen layers is a hyperparameter that must be tuned according to actual results.
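Freezing the bottom layers can be represented as a simple trainability map; a sketch of the 12-layer example just described (the layer-name scheme is illustrative):

```python
def freeze_bottom_layers(layer_names, n_freeze):
    """Return {layer name: trainable?}, freezing the bottom n_freeze
    layers; n_freeze is the hyperparameter discussed above."""
    return {name: i >= n_freeze for i, name in enumerate(layer_names)}

layers = [f"encoder.layer.{i}" for i in range(12)]  # 12-layer (base) model
trainable = freeze_bottom_layers(layers, n_freeze=10)
```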
The acquiring unit 103 determines a target domain and acquires data of the target domain as third sample data.
In the present embodiment, the target domain refers to a domain relatively lacking data support, such as a bank domain. The target domain may be determined according to the intention identifying instruction, or may be determined according to a configuration of a user, which is not limited in the present invention.
It should be noted that, due to the lack of sufficient data support, the data in the target domain may be tagged, untagged, or partially tagged.
The external connection unit 105 connects the first model with a second preset classifier to obtain a second initial model.
In this embodiment, the second preset classifier may also be any conventional classification model, such as a fully connected layer combined with a Softmax activation function, and the present invention is not limited thereto.
The structure of the second initial model is that the first model obtained by training is externally connected with the second preset classifier, namely the second initial model is composed of the feature extraction model, the first preset classifier and the second preset classifier.
The construction unit 101 constructs an objective loss function.
In line with the above embodiment, the constructing unit 101 constructs the target loss function, including:
constructing a class classification loss function for the first preset classifier;
constructing a domain classification loss function;
configuring a first weight for the class classification loss function and a second weight for the domain classification loss function;
and calculating a weighted sum according to the first weight, the second weight, the class classification loss function and the domain classification loss function to obtain the target loss function.
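The weighted combination described in these steps amounts to the following sketch. The negative-log-likelihood forms are standard cross-entropy, assumed here to match the patent's loss descriptions, and the weight values are placeholders:

```python
import math

def class_classification_loss(true_label_probs):
    """Cross-entropy over P(y_i | f(x_i)) for each sample's true class
    (a standard form, assumed here)."""
    return -sum(math.log(p) for p in true_label_probs)

def domain_classification_loss(true_domain_probs):
    """Cross-entropy over P(d_i | f(x_i)) for each sample's true domain."""
    return -sum(math.log(p) for p in true_domain_probs)

def target_loss(class_loss, domain_loss, w1=0.7, w2=0.3):
    """Weighted sum of the two losses; w1 and w2 are the configurable
    first and second weights (values here are placeholders)."""
    return w1 * class_loss + w2 * domain_loss

total = target_loss(class_classification_loss([1.0, 1.0]),
                    domain_classification_loss([1.0, 1.0]))
```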
The first weight and the second weight may be custom-configured and dynamically adjusted according to the actual application scenario, which is not limited in the present invention.
Through the configuration of the loss function, the class classification loss function and the domain classification loss function are integrated together to be used as the overall loss, so that the trained model can meet the requirements of class classification and domain classification at the same time.
Specifically, the formula of the class classification loss function is as follows:

$$L_y = -\sum_{i}\log P\big(y_i \mid f(x_i)\big)$$

wherein $L_y$ represents the class classification loss function, $P(y_i \mid f(x_i))$ represents the probability that the first preset classifier outputs the label corresponding to $y_i$, $f(x_i)$ represents the embedded vector representation output by the feature extraction model, $x_i$ represents the $i$-th input, $y_i$ represents the label corresponding to $x_i$, and $i$ is a positive integer.
Through the above embodiment, a class classification loss function is first constructed for the first preset classifier so that classes can be classified correctly.
Further, the domain classification loss function is formulated as follows:

$$L_d = -\sum_{i}\log P\big(d_i \mid f(x_i)\big)$$

wherein $L_d$ represents the domain classification loss function, $d_i$ represents the domain label of the $i$-th input, and $P(d_i \mid f(x_i))$ represents the probability that the second preset classifier outputs the label $d_i$.

Here the domain label represents a particular domain; for example, it may be configured so that $d = 1$ indicates the target domain and $d = 0$ indicates the source domain.
Through this embodiment, a domain classification loss function is constructed for the second preset classifier so that domains can be classified correctly.
The training unit 102 performs adversarial training on the second initial model by using the third sample data based on the target loss function to obtain an intention recognition model.
In this embodiment, the training unit 102 performing adversarial training on the second initial model by using the third sample data based on the target loss function to obtain the intention recognition model includes:
randomly extracting a preset proportion of data from the third sample data, and recording the label of the extracted data as the source-domain label to obtain fourth sample data;
and performing simulated gradient-reversal training on the second initial model by using the fourth sample data, stopping the training when the value of the target loss function no longer decreases, so as to obtain the intention recognition model.
Specifically, the intention recognition model includes a feature extraction model, a class classifier (i.e., the first preset classifier), and a domain classifier (i.e., the second preset classifier).
Wherein the function of the feature extraction model is:
(1) extracting features required by classification of a subsequent classifier;
(2) mapping the source-domain data and the target-domain data to the same space.
The function of the category classifier is to classify the extracted source-domain data features.
The function of the domain classifier is to judge whether the extracted feature information comes from the source domain or the target domain.
The whole training process follows the paradigm of adversarial learning. During training, a portion of the target-domain labels within each batch is dynamically changed into source-domain labels, forming an adversarial relationship that reverses the gradient. This increases the training difficulty of the domain classifier, so that it cannot correctly judge which domain the feature information comes from. Once the loss function of the domain classifier reaches its minimum, the target-domain data and the source-domain data are mixed together; at this point the feature extractor maps both to the same feature space, solving the problem of inconsistent data distributions.
Meanwhile, the feature extraction model is constrained so that the extracted feature information remains usable for the subsequent classification task, addressing the small data volume of the target domain and its lack of classification labels. To this end, the class classification loss and the domain classification loss are added together in a certain proportion to obtain an overall loss function. Based on this overall loss function, the parameters of all three parts of the framework are updated simultaneously by gradient descent. When the overall loss function reaches its minimum, the target-domain data and the source-domain data can be mapped to the same feature space and correctly classified.
In practice, target-domain data may be scarce, training corpora may be hard to obtain, or the target domain may have a large amount of unlabeled data whose manual labeling is expensive; traditional supervised learning, and deep learning in particular, requires a large amount of labeled data to fit a model during training. With the knowledge- and domain-transfer method described here, existing models and the capabilities they have learned can be reused for similar tasks, or for tasks in the same domain, in a few-shot or zero-shot manner. This reduces data and labor costs, provides a high-quality intention recognition service from limited data, and allows new intentions to be added flexibly, giving easy extensibility. At the same time, a bidirectional encoder representation model containing text-related information from multiple domains is obtained, which can be reused for other text-related tasks.
The acquisition unit 103 acquires data to be processed, inputs the data to be processed to the intention recognition model, and acquires an output of the intention recognition model as a target intention.
In this embodiment, the data to be processed may be uploaded by a relevant worker, which is not limited in the present invention.
In the embodiment, the intention recognition model obtained through training can be combined with the idea of transfer learning and a multi-level classification framework to realize accurate recognition of the intention.
It should be noted that, in order to further ensure the security of the data and avoid malicious tampering of the data, the intention identification model may be stored on the blockchain node.
According to the technical scheme, the apparatus pre-trains a Transformer model with first sample data of a source domain to obtain a feature extraction model; connects the feature extraction model to a first preset classifier to obtain a first initial model; performs supervised learning on the first initial model with second sample data associated with a target task to obtain a first model; acquires data of a target domain as third sample data; connects the first model to a second preset classifier to obtain a second initial model; performs adversarial training on the second initial model based on a target loss function and the third sample data to obtain an intention recognition model; and inputs data to be processed into the intention recognition model, taking its output as the target intention. The invention thus combines the idea of transfer learning with a multi-level classification framework to recognize intentions accurately.
The above-mentioned intention recognition means may be implemented in the form of a computer program which can be run on a computer device as shown in fig. 3.
Referring to fig. 3, fig. 3 is a schematic block diagram of a computer device according to an embodiment of the present invention. The computer device 500 is a server, and the server may be an independent server or a server cluster composed of a plurality of servers.
Referring to fig. 3, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a storage medium 503 and an internal memory 504.
The storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032, when executed, may cause the processor 502 to perform the intent recognition method.
The processor 502 is used to provide computing and control capabilities that support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of the computer program 5032 in the storage medium 503, and when the computer program 5032 is executed by the processor 502, the processor 502 can be caused to execute the intention identification method.
The network interface 505 is used for network communication, such as providing transmission of data information. Those skilled in the art will appreciate that the configuration shown in fig. 3 is a block diagram of only a portion of the configuration associated with aspects of the present invention and is not intended to limit the computing device 500 to which aspects of the present invention may be applied, and that a particular computing device 500 may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory to implement the intention identifying method disclosed in the embodiment of the invention.
Those skilled in the art will appreciate that the embodiment of a computer device illustrated in fig. 3 does not constitute a limitation on the specific construction of the computer device, and in other embodiments a computer device may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components. For example, in some embodiments, the computer device may only include a memory and a processor, and in such embodiments, the structures and functions of the memory and the processor are consistent with those of the embodiment shown in fig. 3, and are not described herein again.
It should be understood that, in the embodiment of the present invention, the Processor 502 may be a Central Processing Unit (CPU), and the Processor 502 may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, and the like. Wherein a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
In another embodiment of the invention, a computer-readable storage medium is provided. The computer-readable storage medium may be a nonvolatile computer-readable storage medium or a volatile computer-readable storage medium. The computer readable storage medium stores a computer program, wherein the computer program realizes the intention identifying method disclosed by the embodiment of the invention when being executed by a processor.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the apparatuses, devices and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of the two; the components and steps of the examples have been described above in general functional terms in order to illustrate clearly the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only a logical division, and there may be other divisions when the actual implementation is performed, or units having the same function may be grouped into one unit, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present invention essentially or partially contributes to the prior art, or all or part of the technical solution can be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. An intent recognition method, comprising:
in response to an intention identification instruction, determining a source domain according to the intention identification instruction, and constructing first sample data;
pre-training a Transformer model by using the first sample data to obtain a feature extraction model;
determining a target task, and acquiring data associated with the target task as second sample data;
externally connecting the feature extraction model with a first preset classifier to obtain a first initial model, and performing supervised learning on the first initial model by using the second sample data to obtain a first model;
determining a target domain, and acquiring data of the target domain as third sample data;
externally connecting the first model with a second preset classifier to obtain a second initial model;
constructing a target loss function;
performing adversarial training on the second initial model by using the third sample data based on the target loss function to obtain an intention recognition model;
and acquiring data to be processed, inputting the data to be processed into the intention recognition model, and acquiring the output of the intention recognition model as a target intention.
2. The method of claim 1, wherein the pre-training the Transformer model with the first sample data to obtain a feature extraction model comprises:
acquiring general corpus data from the first sample data;
randomly selecting words from the general corpus data, and replacing the selected words with masks;
randomly shuffling the sentence order of the general corpus data;
performing masked prediction training on the Transformer model according to the masks, and performing next-sentence prediction training on the Transformer model by using the shuffled sentences to obtain an intermediate model;
and acquiring the data of the source domain from the first sample data, and retraining the intermediate model by using the data of the source domain to obtain the feature extraction model.
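The two corruptions recited in claim 2, replacing randomly selected words with masks and perturbing sentence order, can be sketched as follows. This is an illustrative reading, not the patent's code; the `[MASK]` symbol, the mask probability, and the 50/50 next-sentence split are assumptions borrowed from common masked-language-model practice.

```python
# Illustrative sketch of claim 2's pre-training data preparation:
# random word masking plus sentence-order perturbation.
import random

MASK = "[MASK]"

def mask_words(tokens, mask_prob=0.15, rng=None):
    """Replace a random subset of tokens with a mask symbol.
    Returns the corrupted tokens and the (position, original word)
    pairs the model must predict."""
    rng = rng or random.Random(0)
    corrupted, targets = [], []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            targets.append((i, tok))
        else:
            corrupted.append(tok)
    return corrupted, targets

def shuffle_sentence_pairs(sentences, rng=None):
    """Build (sent_a, sent_b, is_next) examples: roughly half keep the
    true next sentence, the rest substitute a random one."""
    rng = rng or random.Random(0)
    pairs = []
    for i in range(len(sentences) - 1):
        if rng.random() < 0.5:
            pairs.append((sentences[i], sentences[i + 1], True))
        else:
            pairs.append((sentences[i], rng.choice(sentences), False))
    return pairs
```

The masked positions drive the masked-prediction objective, and the `is_next` flags drive the next-sentence-prediction objective described in the claim.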
3. The method according to claim 1, wherein the supervised learning of the first initial model using the second sample data to obtain the first model comprises:
inputting the second sample data into the feature extraction model for feature extraction to obtain an embedded vector representation of the second sample data;
acquiring a label of each data in the second sample data;
taking the embedded vector representation of the second sample data as a training sample, training the first preset classifier according to the label of each data until the loss of the first preset classifier reaches convergence, and stopping training;
determining a current first initial model as the first model.
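A minimal numeric sketch of claim 3's supervised step, under the assumption that the feature extraction model stays frozen and only the first preset classifier is updated until its loss converges. The toy embedding, the logistic head, and all names here are hypothetical stand-ins, not the patent's implementation.

```python
# Sketch of claim 3: frozen feature extractor + trainable classifier head.
import math

def extract_features(x):
    # Stand-in for the frozen feature extraction model: a fixed 2-d
    # embedding of a scalar input. No parameters here are updated.
    return [x, x * x]

def train_classifier(samples, labels, lr=0.5, max_epochs=500, tol=1e-6):
    """Logistic-regression head trained on frozen features.
    Stops once the loss no longer decreases by more than `tol`."""
    w, b = [0.0, 0.0], 0.0
    prev_loss = float("inf")
    for _ in range(max_epochs):
        loss, grad_w, grad_b = 0.0, [0.0, 0.0], 0.0
        for x, y in zip(samples, labels):
            f = extract_features(x)          # frozen: no gradient flows here
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
            for j in range(2):
                grad_w[j] += (p - y) * f[j]
            grad_b += p - y
        for j in range(2):
            w[j] -= lr * grad_w[j] / len(samples)
        b -= lr * grad_b / len(samples)
        if prev_loss - loss < tol:           # loss converged: stop training
            break
        prev_loss = loss
    return w, b

def predict(x, w, b):
    f = extract_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0
```

The "current first initial model" of the claim corresponds here to the frozen extractor plus the converged head `(w, b)`.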
4. The intention recognition method of claim 1, wherein said constructing a target loss function comprises:
constructing a class classification loss function for the first preset classifier;
constructing a domain classification loss function;
configuring a first weight for the class classification loss function and a second weight for the domain classification loss function;
and calculating a weighted sum according to the first weight, the second weight, the class classification loss function and the domain classification loss function to obtain the target loss function.
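Claim 4's target loss is a plain weighted sum of the two component losses; a one-line sketch follows, where the default weight values are illustrative assumptions, not from the patent:

```python
# Sketch of claim 4: the target loss as a weighted sum of the category
# classification loss and the domain classification loss.
def target_loss(class_loss, domain_loss, w_class=1.0, w_domain=0.1):
    # w_class is the first weight, w_domain the second weight;
    # the 0.1 default is an illustrative choice.
    return w_class * class_loss + w_domain * domain_loss
```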
5. The intention recognition method of claim 4, wherein the category classification loss function is formulated as follows:

L_y = -log P(y_i | E(x_i))

wherein L_y represents the category classification loss function, P(y_i | E(x_i)) represents the probability, output by the first preset classifier, of the label corresponding to y_i, E(x_i) represents the embedded vector representation output by the feature extraction model, x_i represents the i-th input, y_i represents the label corresponding to x_i, and i is a positive integer.
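Read as a negative log-likelihood, the category classification loss of claim 5 can be sketched as follows, where summing the per-input terms -log P(y_i | E(x_i)) over a batch is our assumption and the function name is hypothetical:

```python
# Sketch of claim 5's category classification loss: negative
# log-probability of each input's true label, summed over a batch.
import math

def class_classification_loss(probs, labels):
    """`probs[i]` is the first classifier's distribution over labels for
    the i-th embedded input E(x_i); `labels[i]` is its true label y_i."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels))
```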
6. The intention recognition method of claim 5, wherein the domain classification loss function is formulated as follows:

L_d = -log P(d)

wherein L_d represents the domain classification loss function, d represents the domain label, and P(d) represents the probability that the second preset classifier outputs the label d.
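The domain classification loss of claim 6 admits the same negative-log-probability sketch, applied to the second (domain) classifier's output; the batch sum, binary-label convention, and function name are assumptions:

```python
# Sketch of claim 6's domain classification loss: negative log-probability
# assigned by the second preset classifier to the true domain label.
import math

def domain_classification_loss(domain_probs, domain_labels):
    """`domain_probs[i]` is the second classifier's distribution over
    domains (e.g. source vs. target) for the i-th input; `domain_labels[i]`
    is the true domain label d."""
    return -sum(math.log(p[d]) for p, d in zip(domain_probs, domain_labels))
```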
7. The method of claim 1, wherein performing adversarial training on the second initial model by using the third sample data based on the target loss function to obtain an intention recognition model comprises:
randomly extracting data with a preset proportion from the third sample data, and recording the label of the extracted data as the label of the source domain to obtain fourth sample data;
and performing gradient-reversal training on the second initial model by using the fourth sample data, and stopping training when the value of the target loss function no longer decreases, to obtain the intention recognition model.
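The reversal-gradient training of claim 7 is consistent with the gradient reversal layer used in domain-adversarial training; identifying it with that mechanism is our reading, not the patent's wording. Forward is the identity; backward multiplies the incoming gradient by -lambda, so minimizing the domain loss through this layer drives the feature extractor to make source and target domains indistinguishable:

```python
# Sketch of a gradient reversal layer for claim 7's adversarial step.
class GradientReversal:
    def __init__(self, lam=1.0):
        self.lam = lam          # lambda: gradient scaling factor (assumed)

    def forward(self, x):
        return x                # identity in the forward direction

    def backward(self, grad_output):
        return -self.lam * grad_output   # flip (and scale) the gradient
```

In a full training loop this layer would sit between the feature extractor and the second preset classifier, so that the classifier learns to tell domains apart while the extractor learns to confuse it.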
8. An intention recognition apparatus, comprising:
the construction unit is used for responding to the intention identification instruction, determining a source domain according to the intention identification instruction and constructing first sample data;
the training unit is used for pre-training the Transformer model by utilizing the first sample data to obtain a feature extraction model;
the acquisition unit is used for determining a target task and acquiring data associated with the target task as second sample data;
the learning unit is used for externally connecting the feature extraction model with a first preset classifier to obtain a first initial model, and performing supervised learning on the first initial model by using the second sample data to obtain a first model;
the acquiring unit is further configured to determine a target domain and acquire data of the target domain as third sample data;
the external unit is used for externally connecting the first model with a second preset classifier to obtain a second initial model;
the construction unit is also used for constructing a target loss function;
the training unit is further configured to perform adversarial training on the second initial model by using the third sample data based on the target loss function to obtain an intention recognition model;
the acquisition unit is further configured to acquire data to be processed, input the data to be processed to the intention recognition model, and acquire an output of the intention recognition model as a target intention.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the intent recognition method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the intention identification method according to any one of claims 1 to 7.
CN202110682988.4A 2021-06-21 2021-06-21 Intention recognition method, device, equipment and storage medium Active CN113139063B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110682988.4A CN113139063B (en) 2021-06-21 2021-06-21 Intention recognition method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113139063A true CN113139063A (en) 2021-07-20
CN113139063B CN113139063B (en) 2021-09-14

Family

ID=76815848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110682988.4A Active CN113139063B (en) 2021-06-21 2021-06-21 Intention recognition method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113139063B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599922A (en) * 2016-12-16 2017-04-26 中国科学院计算技术研究所 Transfer learning method and transfer learning system for large-scale data calibration
CN111581978A (en) * 2020-04-28 2020-08-25 深圳市一号互联科技有限公司 Intention identification method through domain migration and antagonistic learning
CN112183581A (en) * 2020-09-07 2021-01-05 华南理工大学 Semi-supervised mechanical fault diagnosis method based on self-adaptive migration neural network
US10916242B1 (en) * 2019-08-07 2021-02-09 Nanjing Silicon Intelligence Technology Co., Ltd. Intent recognition method based on deep learning network
CN112800239A (en) * 2021-01-22 2021-05-14 中信银行股份有限公司 Intention recognition model training method, intention recognition method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAO Pengfei et al.: "Research Progress on Intent Recognition for Transfer Learning", Journal of Frontiers of Computer Science and Technology *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569581A (en) * 2021-08-26 2021-10-29 中国联合网络通信集团有限公司 Intention recognition method, device, equipment and storage medium
CN113569581B (en) * 2021-08-26 2023-10-17 中国联合网络通信集团有限公司 Intention recognition method, device, equipment and storage medium
CN113658178A (en) * 2021-10-14 2021-11-16 北京字节跳动网络技术有限公司 Tissue image identification method and device, readable medium and electronic equipment
WO2023231676A1 (en) * 2022-05-30 2023-12-07 京东方科技集团股份有限公司 Instruction recognition method and device, training method, and computer readable storage medium
CN115905548A (en) * 2023-03-03 2023-04-04 美云智数科技有限公司 Water army identification method and device, electronic equipment and storage medium
CN115905548B (en) * 2023-03-03 2024-05-10 美云智数科技有限公司 Water army recognition method, device, electronic equipment and storage medium
CN117077003A (en) * 2023-08-16 2023-11-17 中国船舶集团有限公司第七〇九研究所 Distributed target intention recognition method and system
CN117077003B (en) * 2023-08-16 2024-04-23 中国船舶集团有限公司第七〇九研究所 Distributed target intention recognition method and system

Also Published As

Publication number Publication date
CN113139063B (en) 2021-09-14

Similar Documents

Publication Publication Date Title
CN113139063B (en) Intention recognition method, device, equipment and storage medium
CN113822494B (en) Risk prediction method, device, equipment and storage medium
US9547821B1 (en) Deep learning for algorithm portfolios
CN104714931B (en) For selecting the method and system to represent tabular information
CN110196908A (en) Data classification method, device, computer installation and storage medium
WO2021073390A1 (en) Data screening method and apparatus, device and computer-readable storage medium
US20220100963A1 (en) Event extraction from documents with co-reference
CN112101031B (en) Entity identification method, terminal equipment and storage medium
CN111489105B (en) Enterprise risk identification method, device and equipment
US20220100772A1 (en) Context-sensitive linking of entities to private databases
CN110347840A (en) Complain prediction technique, system, equipment and the storage medium of text categories
CN112163099A (en) Text recognition method and device based on knowledge graph, storage medium and server
CN114416995A (en) Information recommendation method, device and equipment
CN115905528A (en) Event multi-label classification method and device with time sequence characteristics and electronic equipment
Jagdish et al. Identification of end-user economical relationship graph using lightweight blockchain-based BERT model
CN114428860A (en) Pre-hospital emergency case text recognition method and device, terminal and storage medium
US20220100967A1 (en) Lifecycle management for customized natural language processing
WO2022072237A1 (en) Lifecycle management for customized natural language processing
CN116029394B (en) Self-adaptive text emotion recognition model training method, electronic equipment and storage medium
CN116797195A (en) Work order processing method, apparatus, computer device, and computer readable storage medium
CN113590846B (en) Legal knowledge map construction method and related equipment
CN111506776B (en) Data labeling method and related device
CN114707483A (en) Zero sample event extraction system and method based on contrast learning and data enhancement
CN114330238A (en) Text processing method, text processing device, electronic equipment and storage medium
CN114398482A (en) Dictionary construction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant