CN112765356B - Training method and system of multi-intention recognition model - Google Patents

Training method and system of multi-intention recognition model Download PDF

Info

Publication number
CN112765356B
CN112765356B CN202110123802.1A CN202110123802A CN112765356B CN 112765356 B CN112765356 B CN 112765356B CN 202110123802 A CN202110123802 A CN 202110123802A CN 112765356 B CN112765356 B CN 112765356B
Authority
CN
China
Prior art keywords
intention
training
recognition model
loss function
true
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110123802.1A
Other languages
Chinese (zh)
Other versions
CN112765356A (en
Inventor
刘枭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sipic Technology Co Ltd
Original Assignee
Sipic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sipic Technology Co Ltd filed Critical Sipic Technology Co Ltd
Priority to CN202110123802.1A priority Critical patent/CN112765356B/en
Publication of CN112765356A publication Critical patent/CN112765356A/en
Application granted granted Critical
Publication of CN112765356B publication Critical patent/CN112765356B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/355Class or cluster creation or modification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention provides a training method of a multi-intention recognition model. The method comprises the following steps: encoding the original labeled training data through an encoder to obtain a sentence vector; determining the probability of true positive, true negative, false negative and false positive of each intention in the sentence vector through a classifier; determining a differentiable soft-f loss function based on a true positive case, a true negative case, a false negative case and a false positive case; and (4) performing back propagation training on the multi-intention recognition model by using a soft-f loss function, and optimizing parameters of a classifier and an encoder until the multi-intention recognition model is trained. The embodiment of the invention also provides a training system of the multi-intention recognition model. The embodiment of the invention modifies the calculation mode of the F1 value to construct a differentiable loss function, which means that the optimization can be carried out by using the back propagation algorithms, thereby greatly simplifying the training process and improving the performance of the identification of the intention field.

Description

Training method and system of multi-intention recognition model
Technical Field
The invention relates to the field of intelligent voice, in particular to a training method and a training system for a multi-intention recognition model.
Background
In a dialogue system, all reasonable intentions in a user statement need to be recognized, for example, "yes, i want to send a courier," two intentions of "confirm" and "send a courier" need to be recognized. In general, the problem of multi-intention recognition can be modeled as a multi-label classification problem, and a plurality of two classification models are trained to recognize the intention in a one-vs-all mode.
A Binary cross entropy loss (Binary cross entropy loss) function, which is the most commonly used loss function for training Binary models, is typically used; a Hinge loss (Hinge loss) function, which is a loss function of a binary model for training maximum compartment classification, such as Support Vector Machines (SVMs) is also used. Compared with cross entropy loss, hinge loss generally brings better model generalization capability, but the output of a model trained by hinge loss has no good probability definition and cannot be used as the confidence of the model on a recognition result. The Focal loss is an improvement on the basis of cross entropy loss, so that the model can focus on difficult samples during training, and the model has better performance on the difficult samples.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the related art:
in a dialogue system, F1 scores are generally used as an index for evaluating a certain intention recognition, and the performance of the entire intention recognition is expressed by a macro average of all intention F1 scores. However, these methods do not directly optimize the objective function of the macro-average correlation of F1 values during the training of the classification model, which leads to the following problems:
1. the trained output of the classification model does not provide an optimal F1 value macroaverage. The output of the model needs to be optimized, and an optimal threshold is found to make the value of F optimal, for example, if the output of the model for a certain class is 0 to 1, a threshold between the optimal threshold and the optimal threshold needs to be found to ensure the optimal performance;
2. the threshold value needs to be searched again each time the model is updated, and the whole process is complicated.
On a validation set, the output of each class of the model is tuned separately, finding an optimal threshold such that the macro-average of the F1 values is highest. The output of a model trained for hinge loss generally has no particularly good way to solve this type of problem, since there is no good probability definition. And because the F1 value is not trivial by definition, it cannot be directly used as a loss function for classification model training.
Disclosure of Invention
The method at least solves the problem that the training method in the prior art is not an objective function for directly optimizing the macro-average correlation of the F1 value and can not be directly used as a loss function for training a classification model.
In a first aspect, an embodiment of the present invention provides a training method for a multi-intent recognition model, including:
encoding the original labeled training data through an encoder to obtain a sentence vector;
determining the probability of true positive, true negative, false negative and false positive of each intention in the sentence vector through a classifier;
determining a differentiable soft-f loss function based on the true positive case, the true negative case, the false negative case, and the false positive case;
and carrying out back propagation training on the multi-intention recognition model by utilizing the soft-f loss function, wherein the back propagation training is used for optimizing the parameters of the classifier and the encoder until the multi-intention recognition model training is completed.
In a second aspect, an embodiment of the present invention provides a training system for a multi-intent recognition model, including:
the coding program module is used for coding the original marked training data through a coder to obtain a sentence vector;
the classification program module is used for determining the probability of true positive, true negative, false negative and false positive of each intention in the sentence vector through the classifier;
a loss function determination program module for determining a differentiable soft-f loss function based on the true positive, the true negative, the false negative, and the false positive;
and the training program module is used for carrying out back propagation training on the multi-intention recognition model by utilizing the soft-f loss function and optimizing the parameters of the classifier and the encoder until the multi-intention recognition model is trained.
In a third aspect, an electronic device is provided, comprising: the training system comprises at least one processor and a memory which is connected with the at least one processor in a communication mode, wherein the memory stores instructions which can be executed by the at least one processor, and the instructions are executed by the at least one processor so as to enable the at least one processor to execute the steps of the training method of the multi-purpose recognition model of any embodiment of the invention.
In a fourth aspect, an embodiment of the present invention provides a storage medium, on which a computer program is stored, where the computer program is configured to, when executed by a processor, implement the steps of the training method for a multi-intent recognition model according to any embodiment of the present invention.
The embodiment of the invention has the beneficial effects that: the calculation mode of the F1 value is modified to construct a differentiable loss function, which means that the back propagation algorithms can be used for optimization, the training process is greatly simplified, and the performance of the identification of the intention field is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flowchart of a training method for a multi-intent recognition model according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a training phase of a training method for a multi-intent recognition model according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating an inference phase of a training method for a multi-intent recognition model according to an embodiment of the present invention;
FIG. 4 is a comparison graph of the performance of a training method for a multi-intent recognition model according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a training system for a multi-intent recognition model according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a training method for a multi-intent recognition model according to an embodiment of the present invention, which includes the following steps:
s11: encoding the original labeled training data through an encoder to obtain a sentence vector;
s12: determining the probability of true positive, true negative, false negative and false positive of each intention in the sentence vector through a classifier;
s13: determining a differentiable soft-f loss function based on the true positive case, the true negative case, the false negative case, and the false positive case;
and S14, carrying out back propagation training on the multi-intention recognition model by using the soft-f loss function, wherein the back propagation training is used for optimizing the parameters of the classifier and the encoder until the training of the multi-intention recognition model is completed.
In the embodiment, a differentiable loss function is constructed as a loss function for training the binary model by modifying the calculation mode of the F1 value.
For step S11, the implementation prepares the original annotation training labels with multiple intentions, for example, "yes, i want to send a courier", where "yes" and "send a courier" each have a respective intention. Assume multiple statements Q1, Q2, Q3. Wherein Q1 and Q3 contain intention A and Q2 does not contain intention A.
For step S12, the classification results for Q1, Q2, and Q3 using the classifier (classification model) are: q1 has an 80% probability of containing intention A, Q2 has a 10% probability of containing intention A, and Q3 has a 90% probability of containing intention A. Calculating a true positive case TP, a false positive case FP, a false negative case FN and a true negative case TN from the angle of probability:
TP=1*0.8+1*0.9=1.7;
FP=1*0.1=0.1;
FN=1*0.2+1*0.1=0.3;
TN=1*0.9=0.9。
for step S13, the differentiable soft-F loss function is determined in consideration of modifying the calculation method of the F1 value, and the relationship between the true case TP and the true negative case TN is additionally considered in the calculation method of the existing F1 value. And then the whole modification is carried out.
As an embodiment, in this embodiment, the soft-f loss function is:
Figure BDA0002923181650000041
the TP is a true positive case, the TN is a true negative case, the FP is a false positive case, and the FN is a false negative case.
In this embodiment, the formula is extended further:
Figure BDA0002923181650000051
where X represents the entire data set, X is one sample in X, yxLabel (0 or 1), p, representing xxRefers to the probability value (value between 0 and 1) that the prediction sample x is labeled 1.
In step S14, after determining the soft-f loss function in the multi-intent recognition model, the back propagation training is performed, so as to train the classifier and the encoder in the multi-intent recognition model as a whole.
Because soft-F is directly related to the F1 value, training with soft-F as a classification model loss function can obtain better macro-averaging of the F1 value of the final model. The optimal probability threshold of the model output is about 0.5 (as a better mode, the optimal probability threshold can be properly adjusted according to different requirements), searching is not needed, large changes caused by retraining are avoided, and the complexity of the process can be greatly reduced.
In the definition of the soft-f loss function, another case is also considered:
Figure BDA0002923181650000052
in this way, the loss function is designed only from the perspective of positive examples, and in practice, it is found that for some data sets with the most positive examples, the derivative of the loss function is always positive, which results in that the finally trained classification model predicts all classes as positive examples. To solve this problem, negative case dependent terms are therefore added to the loss function.
It can be seen from this embodiment that, by modifying the calculation mode of the F1 value, a differentiable loss function is constructed, and in the deep learning-based classification system, a back propagation algorithm is relied on to perform gradient descent optimization on the parameters of the classification model, which requires that the loss function we use is differentiable. The F1 value is used as a common index, the index is not differentiable in a common calculation mode, cross entropy and the like are used as loss functions to perform indirect optimization, a differentiable F1 value function is constructed, the F1 value is optimized directly, better performance is brought compared with indirect optimization, the training process is simplified greatly, and the performance of identification of the intention field is improved.
As an embodiment, the encoding, by an encoder, the original annotation training data includes: encoding the original labeling training data through a BERT encoder to obtain a sentence vector; performing label vectorization on the original labeling training data to obtain a vectorized intention label;
calculating a soft-f loss function based on the vectorized intent labels and the probabilities of the intents in the sentence vectors.
In the present embodiment, as shown in fig. 2, a specific flowchart of the training phase is shown.
Tagging annotation data is vectorized, for example, tags of a common intention in the data are three intentions of a, B and C, tags of a sentence Q1 are an intention a and an intention B, and then vectorized is 110, tags of a sentence Q2 are an intention a and an intention C, and then vectorized is 101.
The sentences are encoded into d-dimensional sentence vectors using BERT (bidirectional Encoder retrieval from transformations). The network architecture of the BERT uses a multi-layer Transformer structure, and has the biggest characteristic that the traditional RNN and CNN are abandoned, and the distance between two words at any position is converted into 1 through an Attention mechanism, so that the problem of troublesome long-term dependence in NLP is effectively solved.
The sentence vectors are passed through a classifier formed by a fully-connected network to output the probability of each intention, and the true case TP, the false positive case FP, the false negative case FN and the true negative case TN are not described herein again. And further calculating soft-f loss by using vectorized labels and the probability output by the classifier, and optimizing parameters of the classifier and a sentence vector encoder through a back propagation algorithm, thereby further improving the performance of identification of the intention field.
As an embodiment, the method further comprises:
recognizing a text corresponding to a sentence input by a user, and converting the text into a sentence vector through an encoder in a multi-intention recognition model;
determining probability values of all intentions in the sentence vector through the multi-intention recognition intra-model classifier, and outputting at least one predicted intention with the probability value higher than a preset threshold value.
In this embodiment, as shown in the flowchart of the inference and prediction stage shown in fig. 3, a text to be predicted is input into a sentence vector encoder in a trained multi-intent recognition model, so as to obtain a sentence vector of the text to be predicted.
A classifier in a multi-intent recognition based model determines probability values for each intent in a sentence vector of text to be predicted. For example, if the probability value of "express delivery field" in "i want to send an express delivery" is 0.935, and exceeds the preset 0.5 threshold, the intention of "express delivery field" is output.
Further, the method further comprises:
and checking at least one output predicted intention based on at least one actual intention preset in a text corresponding to the user input sentence, and determining the performance of the multi-intention recognition model based on the checking result.
In this embodiment, the experimental data is derived from a customer service dialogue system in the express delivery field and the financial field. The express delivery field data has 10 intention categories, the training set size is 16112, and the test set size is 4000. The financial field data has 15 intention categories in total, the training data size is 8179, and the test set size is 2000. The base line method and the sentence vector encoder in the method both use BERT, the classifier both use two layers of fully connected neural networks, 0.5 is adopted as a threshold value of all the classification outputs of the classifier, and the macro-average of F1 score is used as an evaluation index, and the result on the test set is shown in figure 4. From experimental results, it can be found that, due to the direct optimization of the F1 value, when the threshold value output by the classifier is not adjusted, relative cross entropy loss is caused, and the performance is obviously improved when soft-F loss is used.
Fig. 5 is a schematic structural diagram of a training system for multiple intention recognition models according to an embodiment of the present invention, which can execute the training method for multiple intention recognition models according to any of the above embodiments and is configured in a terminal.
The training system 10 for the multi-intent recognition model provided by the embodiment comprises: a coding program module 11, a classification program module 12, a loss function determination program module 13 and a training program module 14.
The encoding program module 11 is configured to encode the original labeled training data through an encoder to obtain a sentence vector; the classifier module 12 is configured to determine, through the classifier, probabilities of true positive, true negative, false negative, and false positive of each intention in the sentence vector; the loss function determination program module 13 is used for determining a differentiable soft-f loss function based on the true positive case, the true negative case, the false negative case and the false positive case; the training program module 14 is configured to train the multi-intent recognition model using the soft-f loss function in a back propagation manner, so as to optimize the parameters of the classifier and the encoder until the training of the multi-intent recognition model is completed.
Further, the soft-f loss function is:
Figure BDA0002923181650000071
the TP is a true positive case, the TN is a true negative case, the FP is a false positive case, and the FN is a false negative case.
Further, the encoded program modules are for: encoding the original labeling training data through a BERT encoder to obtain a sentence vector;
the method further comprises the following steps: the label vectorization program module is used for carrying out label vectorization on the original labeling training data to obtain a vectorized intention label;
a loss function determination program module for calculating a soft-f loss function based on the vectorized intent tags and the probabilities of the intents in the sentence vector.
The embodiment of the invention also provides a nonvolatile computer storage medium, wherein the computer storage medium stores computer executable instructions which can execute the training method of the multi-intention recognition model in any method embodiment;
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
encoding the original labeled training data through an encoder to obtain a sentence vector;
determining the probability of true positive, true negative, false negative and false positive of each intention in the sentence vector through a classifier;
determining a differentiable soft-f loss function based on the true positive case, the true negative case, the false negative case, and the false positive case;
and carrying out back propagation training on the multi-intention recognition model by utilizing the soft-f loss function, wherein the back propagation training is used for optimizing the parameters of the classifier and the encoder until the multi-intention recognition model training is completed.
As a non-transitory computer-readable storage medium, it may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the methods in the embodiments of the present invention. One or more program instructions are stored in a non-transitory computer readable storage medium, which when executed by a processor, perform a method of training a multi-intent recognition model in any of the method embodiments described above.
The non-volatile computer-readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the device, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer readable storage medium optionally includes memory located remotely from the processor, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
An embodiment of the present invention further provides an electronic device, which includes: the training system comprises at least one processor and a memory which is connected with the at least one processor in a communication mode, wherein the memory stores instructions which can be executed by the at least one processor, and the instructions are executed by the at least one processor so as to enable the at least one processor to execute the steps of the training method of the multi-purpose recognition model of any embodiment of the invention.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) mobile communication devices, which are characterized by mobile communication capabilities and are primarily targeted at providing voice and data communications. Such terminals include smart phones, multimedia phones, functional phones, and low-end phones, among others.
(2) The ultra-mobile personal computer equipment belongs to the category of personal computers, has calculation and processing functions and generally has the characteristic of mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as tablet computers.
(3) Portable entertainment devices such devices may display and play multimedia content. The devices comprise audio and video players, handheld game consoles, electronic books, intelligent toys and portable vehicle-mounted navigation devices.
(4) Other electronic devices with data processing capabilities.
As used herein, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A training method of a multi-intention recognition model comprises the following steps:
encoding the original labeled training data through an encoder to obtain a sentence vector;
determining the probability of true positive, true negative, false negative and false positive of each intention in the sentence vector through a classifier;
determining a differentiable soft-f loss function based on the true positive case, the true negative case, the false negative case, and the false positive case;
back-propagating training the multi-intent recognition model by using the soft-f loss function for optimizing the parameters of the classifier and the encoder until the multi-intent recognition model training is completed,
wherein the soft-f loss function is:
Figure DEST_PATH_IMAGE001
the TP is a true positive case, the TN is a true negative case, the FP is a false positive case, and the FN is a false negative case.
2. The method of claim 1, wherein said encoding, by an encoder, original annotation training data comprises: encoding the original labeling training data through a BERT encoder to obtain a sentence vector;
the method further comprises the following steps:
performing label vectorization on the original labeling training data to obtain a vectorized intention label;
calculating a soft-f loss function based on the vectorized intent labels and the probabilities of the intents in the sentence vector.
3. The method of claim 1, wherein the method further comprises:
recognizing a text corresponding to a sentence input by a user, and converting the text into a sentence vector through an encoder in a multi-intention recognition model;
determining probability values of all intentions in the sentence vector through the classifier in the multi-intention recognition model, and outputting at least one predicted intention with the probability value higher than a preset threshold value.
4. The method of claim 3, wherein the method further comprises:
and checking at least one output predicted intention based on at least one preset actual intention in a text corresponding to the user input sentence, and determining the performance of the multi-intention recognition model based on the checking result.
5. A training system for a multi-intent recognition model, comprising:
the coding program module is used for coding the original marked training data through a coder to obtain a sentence vector;
the classification program module is used for determining the probability of true positive, true negative, false negative and false positive of each intention in the sentence vector through the classifier;
a loss function determination program module for determining a differentiable soft-f loss function based on the true positive, the true negative, the false negative, and the false positive;
a training program module for back propagation training of the multi-intent recognition model by using the soft-f loss function, for optimizing the parameters of the classifier and the encoder until the multi-intent recognition model training is completed,
wherein the soft-f loss function is:
Figure DEST_PATH_IMAGE003
the TP is a true positive case, the TN is a true negative case, the FP is a false positive case, and the FN is a false negative case.
6. The system of claim 5, wherein the encoded program modules are to: encoding the original labeling training data through a BERT encoder to obtain a sentence vector;
the system further comprises: the label vectorization program module is used for carrying out label vectorization on the original labeling training data to obtain a vectorized intention label;
a loss function determination program module for calculating a soft-f loss function based on the vectorized intent tags and the probabilities of the intents in the sentence vector.
7. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any of claims 1-4.
8. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.
CN202110123802.1A 2021-01-29 2021-01-29 Training method and system of multi-intention recognition model Active CN112765356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110123802.1A CN112765356B (en) 2021-01-29 2021-01-29 Training method and system of multi-intention recognition model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110123802.1A CN112765356B (en) 2021-01-29 2021-01-29 Training method and system of multi-intention recognition model

Publications (2)

Publication Number Publication Date
CN112765356A CN112765356A (en) 2021-05-07
CN112765356B true CN112765356B (en) 2022-07-12

Family

ID=75706609

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110123802.1A Active CN112765356B (en) 2021-01-29 2021-01-29 Training method and system of multi-intention recognition model

Country Status (1)

Country Link
CN (1) CN112765356B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115408509B (en) * 2022-11-01 2023-02-14 杭州一知智能科技有限公司 Intention identification method, system, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9607246B2 (en) * 2012-07-30 2017-03-28 The Trustees Of Columbia University In The City Of New York High accuracy learning by boosting weak learners
CN108920622B (en) * 2018-06-29 2021-07-20 北京奇艺世纪科技有限公司 Training method, training device and recognition device for intention recognition
CN110209817B (en) * 2019-05-31 2023-06-09 安徽省泰岳祥升软件有限公司 Training method and device for text processing model and text processing method
CN111159358A (en) * 2019-12-31 2020-05-15 苏州思必驰信息科技有限公司 Multi-intention recognition training and using method and device

Also Published As

Publication number Publication date
CN112765356A (en) 2021-05-07

Similar Documents

Publication Publication Date Title
CN109635273B (en) Text keyword extraction method, device, equipment and storage medium
CN110046221B (en) Machine dialogue method, device, computer equipment and storage medium
CN112100349B (en) Multi-round dialogue method and device, electronic equipment and storage medium
WO2022142006A1 (en) Semantic recognition-based verbal skill recommendation method and apparatus, device, and storage medium
CN110704641A (en) Ten-thousand-level intention classification method and device, storage medium and electronic equipment
JP6677419B2 (en) Voice interaction method and apparatus
CN114357973B (en) Intention recognition method and device, electronic equipment and storage medium
CN110377733B (en) Text-based emotion recognition method, terminal equipment and medium
CN111159358A (en) Multi-intention recognition training and using method and device
CN110414005B (en) Intention recognition method, electronic device and storage medium
CN113505198A (en) Keyword-driven generating type dialogue reply method and device and electronic equipment
CN112632248A (en) Question answering method, device, computer equipment and storage medium
CN114491077A (en) Text generation method, device, equipment and medium
CN112765356B (en) Training method and system of multi-intention recognition model
CN111209297A (en) Data query method and device, electronic equipment and storage medium
CN114817478A (en) Text-based question and answer method and device, computer equipment and storage medium
CN114416981A (en) Long text classification method, device, equipment and storage medium
CN114238656A (en) Reinforced learning-based affair atlas completion method and related equipment thereof
CN117056474A (en) Session response method and device, electronic equipment and storage medium
CN110795531A (en) Intention identification method, device and storage medium
CN113254575B (en) Machine reading understanding method and system based on multi-step evidence reasoning
CN116306679A (en) Semantic configurable multi-mode intelligent customer service dialogue based method and system
CN115033683A (en) Abstract generation method, device, equipment and storage medium
CN114329005A (en) Information processing method, information processing device, computer equipment and storage medium
CN113704466A (en) Text multi-label classification method and device based on iterative network and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Co.,Ltd.

CB02 Change of applicant information
GR01 Patent grant
GR01 Patent grant