CN113837370B - Method and apparatus for training a model based on contrast learning

Method and apparatus for training a model based on contrast learning

Info

Publication number: CN113837370B
Application number: CN202111221793.6A
Authority: CN (China)
Prior art keywords: sentence, countermeasure, original, sample, category
Legal status: Active (application granted)
Other languages: Chinese (zh)
Other versions: CN113837370A
Inventor: 窦辰晓
Current and original assignee: Seashell Housing Beijing Technology Co Ltd
Application filed by Seashell Housing Beijing Technology Co Ltd; priority to CN202111221793.6A
Publication of CN113837370A, followed by grant and publication of CN113837370B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N Computing arrangements based on specific computational models
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06F Electric digital data processing
    • G06F16/00 Information retrieval; database structures therefor; file system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; classification
    • G06F16/353 Clustering; classification into predefined classes
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/088 Non-supervised learning, e.g. competitive learning


Abstract

The embodiment of the invention provides a method and an apparatus for training a model based on contrast learning, and belongs to the technical field of computers. The method comprises the following steps: determining a countermeasure sentence sample of an original sentence sample; obtaining original sentence vectors corresponding to the original sentences in the original sentence sample of the current batch; obtaining countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence sample of the current batch; obtaining a contrast loss function value; and adjusting the parameters of a first preset neural network model according to the obtained contrast loss function value, and repeating the process of obtaining a contrast loss function value and adjusting the parameters of the first preset neural network model according to it until the number of times the parameters of the first preset neural network model have been adjusted reaches a first preset value, thereby completing the training process. This avoids problems such as changes in sentence semantics that may be caused by simple random word insertion and deletion operations.

Description

Method and apparatus for training a model based on contrast learning
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for training a model based on contrast learning.
Background
Contrast learning is a discriminative self-supervised learning approach. Similar and dissimilar samples are automatically constructed through a preset data conversion strategy, and a contrast loss function is then used to pull similar samples (positive samples) closer together and push dissimilar samples (negative samples) apart. This training approach has been widely used in the CV and NLP fields to obtain high-quality vector representation spaces, and the excellent performance it achieves has attracted great attention. A model based on contrast learning has two important parts. One is the data conversion strategy: an effective data conversion strategy determines the invariances of the finally learned vector representation and improves the performance of the representation vector on downstream tasks; currently, in the NLP field, common data conversion strategies are relatively simple and shallow methods such as word insertion and deletion or shuffling of word order. The other is the design of the contrast loss: practice shows that increasing the number of negative samples can improve the quality of the representation vectors in the final representation space. However, in the NLP field, most text representation models based on contrast learning currently use an end-to-end structure, in which adding negative samples is equivalent to increasing the number of training samples in the same batch; this is obviously limited by memory, and research also shows that simply increasing the batch size can even reduce the quality of sentence vectors. Moreover, a great deal of research (especially in the NLP field) is currently focused mainly on applying contrast learning in a self-supervised manner, and the idea of contrast learning is rarely integrated into supervised tasks.
The disadvantages of the prior art include the following. One aspect is data conversion. In the NLP field, currently used data conversion methods include random deletion, random insertion, or using the dropout mechanism of the network as a data enhancement strategy; however, each method has its own problems. For example, randomly deleting words may remove a word that carries the meaning of the entire sentence, thereby changing the meaning of the original sentence. The dropout mechanism is simple and effective, but research shows that, because the dropout strategy does not change the length of sentences, sentence length can be used as a feature to distinguish positive and negative examples, so the model becomes biased towards treating sentences of similar length as positive examples and sentences with large length differences as negative examples. Another aspect is the loss function. For the i-th sentence, the contrast loss function is designed around one positive sample and N-1 negative samples, where N is the batch size. In end-to-end structures, such contrast loss functions currently involve only one positive sample and, limited by memory and other factors, cannot increase the number of negative samples. Furthermore, the self-supervised approach inevitably samples sentences that are essentially semantically similar to the original sentence as negative samples, which is disadvantageous for learning vector representations. Yet another aspect is that current NLP research mainly focuses on training sentence vectors with contrast learning in a self-supervised manner, while contrast learning is rarely used on supervised tasks as an auxiliary task to improve the performance of the model on the supervised task.
Disclosure of Invention
It is an aim of embodiments of the present invention to provide a method and an apparatus for training a model based on contrast learning which solve, or at least partially solve, the above-mentioned problems.
To achieve the above object, an aspect of an embodiment of the present invention provides a method for training a model based on contrast learning, the method comprising: determining a countermeasure sentence sample of an original sentence sample; inputting the original sentence sample of the current batch into a first preset neural network model for obtaining sentence vectors, to obtain original sentence vectors corresponding to the original sentences in the original sentence sample of the current batch; inputting the countermeasure sentence sample of the original sentence sample of the current batch into the first preset neural network model, to obtain countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence sample of the current batch; obtaining a contrast loss function value based on a preset contrast loss function by combining the obtained original sentence vectors and the obtained countermeasure sentence vectors; and adjusting the parameters of the first preset neural network model according to the obtained contrast loss function value, and repeating the process of obtaining a contrast loss function value and adjusting the parameters of the first preset neural network model according to the obtained contrast loss function value until the number of times the parameters of the first preset neural network model have been adjusted reaches a first preset value, so as to complete the training process.
Optionally, the determining of the countermeasure sentence sample of the original sentence sample includes: inputting the original sentence sample into a sentence classification model for sentence classification, and training a second preset neural network model, used in the sentence classification model for obtaining sentence vectors, so that the category of each original sentence in the original sentence sample predicted by the sentence classification model is the same as its real category; changing the sentence structure of an original sentence in the original sentence sample through synonym replacement; inputting the original sentence with the changed sentence structure into the sentence classification model again to predict its category; and, for any original sentence whose predicted category after re-input is still the same as the real category, repeating the process of changing the sentence structure and predicting the category until the category prediction is wrong or the number of times of changing the sentence structure and predicting the category reaches a second preset value. The sentence whose category is mispredicted is the countermeasure sentence of the original sentence; otherwise, among the second-preset-value versions of the sentence obtained by changing the sentence structure of the original sentence, the one with the lowest category confidence is the countermeasure sentence of the original sentence. All countermeasure sentences form the countermeasure sentence sample.
Optionally, the contrast loss function value is obtained by further combining the countermeasure sentence vectors corresponding to the countermeasure sentences in history batches of countermeasure sentence samples.
Optionally, in a case where the countermeasure sentence sample of the original sentence sample is determined based on the sentence classification model, the countermeasure sentence vectors corresponding to the countermeasure sentences in the history batches of countermeasure sentence samples are obtained based on the trained second preset neural network model.
Optionally, the method further comprises: adjusting the parameters of the trained second preset neural network model according to the obtained contrast loss function value.
Optionally, for any original sentence in the original sentence sample, among the countermeasure sentence samples of the current batch and of the history batches, a countermeasure sentence of the same category as the original sentence is a positive sample, and a countermeasure sentence of a different category from the original sentence is a negative sample.
Optionally, the preset contrast loss function includes:
L_{SCL}(i) = -\frac{1}{m} \sum_{j=1}^{N+Q} \mathbb{1}_{[y_i = y_j]} \log \frac{\exp(\mathrm{sim}(z_i, z_j)/\tau)}{\sum_{k=1}^{N+Q} \exp(\mathrm{sim}(z_i, z_k)/\tau)}, \qquad L_{SCL} = \frac{1}{N} \sum_{i=1}^{N} L_{SCL}(i)
wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the history batches, that belong to the same category as the original sentence i in the original sentence sample of the current batch; y_i represents the category of the original sentence i; y_j represents the category of the countermeasure sentence j; \mathbb{1}_{[y_i = y_j]} is 1 when y_i and y_j are the same and 0 when they differ; N represents the number of original sentences in the original sentence sample of the current batch (equal to the number of countermeasure sentences in the countermeasure sentence sample of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the history batches; L_{SCL}(i) represents the contrast loss function value of the original sentence i; L_{SCL} represents the average of the contrast loss function values of all original sentences in the original sentence sample of the current batch; z_i represents the original sentence vector of the original sentence i; z_j represents the countermeasure sentence vector of the countermeasure sentence j; z_k represents the countermeasure sentence vector of the countermeasure sentence k; sim represents cosine similarity; and τ represents the temperature.
Accordingly, another aspect of an embodiment of the present invention provides an apparatus for training a model based on contrast learning, the apparatus comprising: a countermeasure sentence sample determination module, configured to determine a countermeasure sentence sample of an original sentence sample; an original sentence vector determination module, configured to input the original sentence sample of the current batch into a first preset neural network model for obtaining sentence vectors, to obtain original sentence vectors corresponding to the original sentences in the original sentence sample of the current batch; a countermeasure sentence vector determination module, configured to input the countermeasure sentence sample of the original sentence sample of the current batch into the first preset neural network model, to obtain countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence sample of the current batch; a contrast loss function value determination module, configured to obtain a contrast loss function value based on a preset contrast loss function by combining the obtained original sentence vectors and the obtained countermeasure sentence vectors; and an adjustment module, configured to adjust the parameters of the first preset neural network model according to the obtained contrast loss function value, and to repeat the process of obtaining a contrast loss function value and adjusting the parameters of the first preset neural network model according to the obtained contrast loss function value until the number of times the parameters of the first preset neural network model have been adjusted reaches a first preset value, so as to complete the training process.
Optionally, the countermeasure sentence sample determination module determining the countermeasure sentence sample of the original sentence sample includes: inputting the original sentence sample into a sentence classification model for sentence classification, and training a second preset neural network model, used in the sentence classification model for obtaining sentence vectors, so that the category of each original sentence in the original sentence sample predicted by the sentence classification model is the same as its real category; changing the sentence structure of an original sentence in the original sentence sample through synonym replacement; inputting the original sentence with the changed sentence structure into the sentence classification model again to predict its category; and, for any original sentence whose predicted category after re-input is still the same as the real category, repeating the process of changing the sentence structure and predicting the category until the category prediction is wrong or the number of times of changing the sentence structure and predicting the category reaches a second preset value. The sentence whose category is mispredicted is the countermeasure sentence of the original sentence; otherwise, among the second-preset-value versions of the sentence obtained by changing the sentence structure of the original sentence, the one with the lowest category confidence is the countermeasure sentence of the original sentence. All countermeasure sentences form the countermeasure sentence sample.
Optionally, the contrast loss function value determination module obtains the contrast loss function value by further combining the countermeasure sentence vectors corresponding to the countermeasure sentences in history batches of countermeasure sentence samples.
Optionally, in a case where the countermeasure sentence sample determination module determines the countermeasure sentence sample of the original sentence sample based on the sentence classification model, the countermeasure sentence vectors corresponding to the countermeasure sentences in the history batches of countermeasure sentence samples are obtained based on the trained second preset neural network model.
Optionally, the adjusting module is further configured to adjust parameters of the trained second preset neural network model according to the obtained contrast loss function value.
Optionally, for any original sentence in the original sentence sample, among the countermeasure sentence samples of the current batch and of the history batches, a countermeasure sentence of the same category as the original sentence is a positive sample, and a countermeasure sentence of a different category from the original sentence is a negative sample.
Optionally, the preset contrast loss function includes:
L_{SCL}(i) = -\frac{1}{m} \sum_{j=1}^{N+Q} \mathbb{1}_{[y_i = y_j]} \log \frac{\exp(\mathrm{sim}(z_i, z_j)/\tau)}{\sum_{k=1}^{N+Q} \exp(\mathrm{sim}(z_i, z_k)/\tau)}, \qquad L_{SCL} = \frac{1}{N} \sum_{i=1}^{N} L_{SCL}(i)
wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the history batches, that belong to the same category as the original sentence i in the original sentence sample of the current batch; y_i represents the category of the original sentence i; y_j represents the category of the countermeasure sentence j; \mathbb{1}_{[y_i = y_j]} is 1 when y_i and y_j are the same and 0 when they differ; N represents the number of original sentences in the original sentence sample of the current batch (equal to the number of countermeasure sentences in the countermeasure sentence sample of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the history batches; L_{SCL}(i) represents the contrast loss function value of the original sentence i; L_{SCL} represents the average of the contrast loss function values of all original sentences in the original sentence sample of the current batch; z_i represents the original sentence vector of the original sentence i; z_j represents the countermeasure sentence vector of the countermeasure sentence j; z_k represents the countermeasure sentence vector of the countermeasure sentence k; sim represents cosine similarity; and τ represents the temperature.
Still another aspect of an embodiment of the present invention provides a machine-readable storage medium having stored thereon instructions for causing a machine to perform the above-described method.
In addition, another aspect of the embodiments of the present invention further provides a processor, configured to run a program, wherein the program, when run, performs the method described above.
Furthermore, another aspect of the embodiments of the present invention provides a computer program product comprising a computer program/instruction which, when executed by a processor, implements the method described above.
According to the above technical scheme, the contrast-learning-based model is trained using countermeasure sentence samples, and the countermeasure sentence samples serve as the data enhancement strategy, so that problems such as changes in sentence semantics that may be caused by simple random word insertion and deletion operations are avoided.
Additional features and advantages of embodiments of the invention will be set forth in the detailed description which follows.
Drawings
The accompanying drawings are included to provide a further understanding of embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain, without limitation, the embodiments of the invention. In the drawings:
FIG. 1 is a schematic diagram of a BERT model;
FIG. 2 is a schematic diagram of data conversion;
FIG. 3 is a schematic diagram of a text representation model structure based on contrast learning;
FIG. 4 is a flow chart of a method for training a model based on contrast learning provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of a countermeasure sample according to another embodiment of the present invention;
FIG. 6 is a schematic diagram of a contrast learning-based model provided by another embodiment of the present invention; and
FIG. 7 is a block diagram of an apparatus for training a model based on contrast learning according to another embodiment of the present invention.
Description of the reference numerals
1 Countermeasure sentence sample determination module
2 Countermeasure sentence vector determination module
3 Original sentence vector determination module
4 Contrast loss function value determination module
5 Adjustment module
Detailed Description
The following describes the detailed implementation of the embodiments of the present invention with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the invention, are not intended to limit the invention.
Pre-trained language models (such as BERT) have achieved significant results on various NLP tasks and are the most commonly used encoders. At present, remarkable performance can be achieved on tasks such as text classification by using BERT. When BERT is used for a text classification task, the vector corresponding to the token at the first position of the last layer (namely [CLS]) is generally taken as the semantic representation of the whole sentence to be classified, as shown in FIG. 1.
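As an illustrative sketch only (not part of the patent's disclosure), the following Python snippet shows one common way of taking the last-layer vector at the [CLS] position as the sentence representation, using the Hugging Face transformers library; the checkpoint name and the example sentences are assumptions.

```python
import torch
from transformers import BertModel, BertTokenizer

# Checkpoint name is an assumption; any BERT-style encoder would do.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

sentences = ["这套房子采光很好。", "合同下周可以签。"]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = encoder(**inputs)

# Last-layer hidden state at position 0 (the [CLS] token) is used as the
# semantic representation of the whole sentence.
cls_vectors = outputs.last_hidden_state[:, 0, :]   # shape: (batch_size, hidden_size)
```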
Self-supervised learning is a kind of unsupervised learning: an effective vector representation is learned by training a model on unlabeled data through a pre-established auxiliary task, and the pre-trained model is then used by a downstream task, which can greatly improve the performance of the model on that task. Such training mechanisms have been widely used in various fields. Common auxiliary tasks include, for example, the masked language model task in the text field, which belongs to generative self-supervised learning and makes the model reconstruct the original input after noise has been added to it, or the rotation-angle prediction task in the image field, which belongs to discriminative self-supervised learning and makes the model predict the angle by which a picture has been rotated; the rotation angle is the kind of pseudo label that most discriminative self-supervised learning tasks require. Recently, a discriminative self-supervised learning method, also referred to as contrast learning, has attracted tremendous attention due to its excellent performance in the CV and NLP fields. Unlike the self-supervised learning tasks described above, it requires neither pseudo labels nor, unlike generative self-supervised learning, reconstruction of the original input. The idea of contrast learning is to learn a representation space by comparing input samples, a space in which the distance between similar samples is very small and the distance between dissimilar samples is relatively large. Specifically, the objective of contrastive self-supervised learning is to generate a transformed version of the original input data by using a data transformation strategy, randomly choose one sample from the original data and one sample from the transformed data, and let the model predict whether the two samples come from the same original data (i.e., whether one sample is obtained by transforming the other), as shown in FIG. 2.
There are currently various network structures based on contrast learning, which can be roughly divided into four types according to how negative samples are collected: end-to-end structures, structures using a memory pool, structures using momentum encoding, and structures introducing clustering. In the CV field, each of the four architectures has its own advantages, while in the NLP field the end-to-end architecture is most commonly used. Further, in the CV field, commonly used data conversions include rotation, cropping, and the like, whereas in the NLP field, currently common data conversions include back translation, random insertion or deletion of words, and the like. The structure of a representation model based on contrast learning in the current NLP field is roughly as shown in FIG. 3. For the p-th sentence, the contrast learning loss function (contrast loss) generally takes a form such as
-\log \frac{\exp(\mathrm{sim}(z_p, z'_p)/\tau)}{\sum_{q=1}^{U} \exp(\mathrm{sim}(z_p, z_q)/\tau)}
where U represents the number of samples input (for example, U = 2 in FIG. 3); sim(a, b) represents the similarity of two vectors, generally measured by cosine similarity, i.e. \mathrm{sim}(a, b) = \frac{a^{\top} b}{\lVert a \rVert \, \lVert b \rVert}; z_p represents the sentence vector obtained by BERT encoding of the p-th sentence, and z'_p represents the sentence vector of its transformed version. The idea of contrast learning is to treat these two sentences as positive samples, so as to shorten the distance between them in the representation space and to push the p-th sentence away from the other sentences.
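For concreteness, a minimal PyTorch sketch of such a self-supervised contrast loss is given below, assuming a batch of original sentence vectors and the vectors of their transformed versions in matching row order; the function name and the temperature value are assumptions rather than anything specified by the patent.

```python
import torch
import torch.nn.functional as F

def self_supervised_contrast_loss(z, z_aug, tau=0.05):
    """z, z_aug: (N, d) sentence vectors of the original sentences and of their
    transformed versions; row p of z_aug is the positive sample for row p of z."""
    z = F.normalize(z, dim=-1)
    z_aug = F.normalize(z_aug, dim=-1)
    sim = z @ z_aug.t() / tau                       # cosine similarities / temperature
    # For anchor p the positive is column p; all other columns act as negatives.
    targets = torch.arange(z.size(0), device=z.device)
    return F.cross_entropy(sim, targets)
```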
One aspect of an embodiment of the present invention provides a method for training a model based on contrast learning.
FIG. 4 is a flow chart of a method for training a model based on contrast learning provided by an embodiment of the present invention. As shown in fig. 4, the method includes the following.
In step S40, a countermeasure sentence sample of the original sentence sample is determined. Countermeasure (adversarial) samples were first used in the CV field, where they refer to images that, after a data processing operation, are misjudged by the model while appearing unchanged to the human eye compared with the original image. In the NLP field, since sentences are composed of discrete words, even slight changes to some words can be perceived by the human eye; therefore, a countermeasure sample in the NLP field refers to a sentence that has undergone a data processing operation and whose semantics are unchanged compared with the original sentence, but which causes the model to make a wrong judgment. The ways of generating countermeasure samples are divided into white-box attacks and black-box attacks. A white-box attack strategy requires computing the gradient of the model parameters with respect to the label and adding a perturbation along the gradient ascent direction (the direction that increases the loss). A black-box attack strategy does not care about details such as the specific parameters of the model; instead, on the premise of changing the semantics of the original sentence as little as possible, it changes the sentence through operations such as synonym replacement so that the model makes a wrong judgment. The black-box attack method may employ TextFooler, as shown in FIG. 5. Specifically, in the embodiment of the present invention, the countermeasure sentence sample of the original sentence sample may be determined by adversarially training a second preset neural network model, used for obtaining sentence vectors, in a sentence classification model for sentence classification. The second preset neural network model may be BERT. In addition, in the embodiment of the invention, because synonym replacement is used, the length of a sentence can change, which overcomes the problem caused by the dropout mechanism.
In step S41, the original sentence sample of the current batch is input to a first preset neural network model for obtaining sentence vectors of sentences, so as to obtain original sentence vectors corresponding to the original sentences in the original sentence sample of the current batch. Wherein, the first preset neural network model may employ BERT.
In step S42, the countermeasure sentence sample of the original sentence sample of the current batch is input into the first preset neural network model, so as to obtain countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence sample of the current batch.
In step S43, based on the preset contrast loss function, the obtained original sentence vector and the obtained countermeasure sentence vector are combined to obtain a contrast loss function value.
In step S44, the parameters of the first preset neural network model are adjusted according to the obtained contrast loss function value, and the process of obtaining a contrast loss function value and adjusting the parameters of the first preset neural network model according to it is repeated until the number of times the parameters of the first preset neural network model have been adjusted reaches the first preset value, so as to complete the training process. For example, the parameters of the first preset neural network model may be adjusted by back propagation.
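A schematic sketch of steps S41 to S44 is shown below; the encoder interface, the data loaders, the optimizer, and the value of the first preset number of updates are illustrative assumptions, not the claimed implementation.

```python
import torch

def train_contrastive(encoder, original_loader, countermeasure_loader,
                      contrast_loss_fn, first_preset_value=1000, lr=2e-5):
    """Steps S41-S44: encode original and countermeasure sentences, compute the
    contrast loss, and adjust the encoder parameters by back propagation."""
    optimizer = torch.optim.AdamW(encoder.parameters(), lr=lr)
    step = 0
    while step < first_preset_value:
        for batch, adv_batch in zip(original_loader, countermeasure_loader):
            z_orig = encoder(**batch).last_hidden_state[:, 0]      # step S41
            z_adv = encoder(**adv_batch).last_hidden_state[:, 0]   # step S42
            loss = contrast_loss_fn(z_orig, z_adv)                 # step S43
            optimizer.zero_grad()
            loss.backward()                                        # step S44
            optimizer.step()
            step += 1
            if step >= first_preset_value:
                break
    return encoder
```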
According to the above technical scheme, the contrast-learning-based model is trained using countermeasure sentence samples, and the countermeasure sentence samples serve as the data enhancement strategy, so that problems such as changes in sentence semantics that may be caused by simple random word insertion and deletion operations are avoided.
Optionally, in an embodiment of the present invention, determining the countermeasure sentence sample of the original sentence sample may include the following. The original sentence sample is input into a sentence classification model for sentence classification, and a second preset neural network model, used in the sentence classification model for obtaining sentence vectors, is trained so that the category of each original sentence in the original sentence sample predicted by the sentence classification model is the same as its real category. Specifically, the second preset neural network model is trained by adjusting its parameters. The sentence structure of an original sentence in the original sentence sample is then changed by synonym replacement. The original sentence with the changed sentence structure is input into the sentence classification model again to predict its category; at this point, the second preset neural network model in the sentence classification model has already been trained. For any original sentence whose predicted category after re-input is still the same as the real category, the process of changing the sentence structure and predicting the category is repeated until the category prediction is wrong or the number of times of changing the sentence structure and predicting the category reaches a second preset value. The sentence whose category is mispredicted is the countermeasure sentence of the original sentence; otherwise, among the second-preset-value versions of the sentence obtained by changing the sentence structure of the original sentence, the one with the lowest category confidence is the countermeasure sentence. All countermeasure sentences form the countermeasure sentence sample, as illustrated by the sketch below.
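The sketch below illustrates this procedure under stated assumptions: the classifier interface and the synonym-replacement helper perturb_with_synonyms are hypothetical stand-ins (not the TextFooler implementation itself), and second_preset_value is an arbitrary example value.

```python
import torch

def generate_countermeasure_sentence(sentence, true_label, classifier,
                                     perturb_with_synonyms, second_preset_value=20):
    """Perturb `sentence` by synonym replacement until the classifier mispredicts its
    category, or fall back to the version with the lowest confidence in the true class."""
    candidates = []
    current = sentence
    for _ in range(second_preset_value):
        current = perturb_with_synonyms(current)     # change the sentence structure
        probs = classifier(current)                  # class probabilities, shape (num_classes,)
        predicted = int(torch.argmax(probs))
        candidates.append((float(probs[true_label]), current))
        if predicted != true_label:                  # category prediction is wrong
            return current
    # The model was never fooled within the allowed number of changes:
    # return the version with the lowest confidence in the true category.
    return min(candidates, key=lambda item: item[0])[1]
```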
Optionally, in the embodiment of the present invention, the contrast loss function value is obtained by further combining the countermeasure sentence vectors corresponding to the countermeasure sentences in history batches of countermeasure sentence samples.
Optionally, in an embodiment of the present invention, in a case where the countermeasure sentence sample of the original sentence sample is determined based on the sentence classification model, the countermeasure sentence vectors corresponding to the countermeasure sentences in the history batches of countermeasure sentence samples are obtained based on the trained second preset neural network model.
Optionally, in an embodiment of the present invention, the method further includes: adjusting the parameters of the trained second preset neural network model according to the obtained contrast loss function value.
Optionally, in the embodiment of the present invention, for any original sentence in the original sentence sample, among the countermeasure sentence samples of the current batch and of the history batches, a countermeasure sentence of the same category as the original sentence is a positive sample, and a countermeasure sentence of a different category from the original sentence is a negative sample. Taking countermeasure sentences of the same category as an original sentence as positive samples increases the number of sentences included in the positive samples. The countermeasure sentence samples of the history batches are also added to the positive and negative samples, further increasing the number of sentences they include. In addition, by taking countermeasure sentences of the same category as the original sentence as positive samples and countermeasure sentences of different categories as negative samples, contrast learning is applied in a supervised manner as an auxiliary task on the supervised task.
Optionally, in an embodiment of the present invention, the preset contrast loss function includes:
L_{SCL}(i) = -\frac{1}{m} \sum_{j=1}^{N+Q} \mathbb{1}_{[y_i = y_j]} \log \frac{\exp(\mathrm{sim}(z_i, z_j)/\tau)}{\sum_{k=1}^{N+Q} \exp(\mathrm{sim}(z_i, z_k)/\tau)}, \qquad L_{SCL} = \frac{1}{N} \sum_{i=1}^{N} L_{SCL}(i)
wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the history batches, that belong to the same category as the original sentence i in the original sentence sample of the current batch; y_i represents the category of the original sentence i; y_j represents the category of the countermeasure sentence j; \mathbb{1}_{[y_i = y_j]} is 1 when y_i and y_j are the same and 0 when they differ; N represents the number of original sentences in the original sentence sample of the current batch (equal to the number of countermeasure sentences in the countermeasure sentence sample of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the history batches; L_{SCL}(i) represents the contrast loss function value of the original sentence i; L_{SCL} represents the average of the contrast loss function values of all original sentences in the original sentence sample of the current batch; z_i represents the original sentence vector of the original sentence i; z_j represents the countermeasure sentence vector of the countermeasure sentence j; z_k represents the countermeasure sentence vector of the countermeasure sentence k; sim represents cosine similarity; and τ represents the temperature.
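A possible PyTorch rendering of this loss is sketched below, assuming that the current-batch countermeasure vectors and a queue of history-batch countermeasure vectors, together with their categories, are available as tensors; the names and the temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def supervised_contrast_loss(z_orig, y_orig, z_adv, y_adv, queue_z, queue_y, tau=0.05):
    """z_orig (N, d), y_orig (N,): original sentence vectors and categories.
    z_adv, y_adv: current-batch countermeasure vectors and categories.
    queue_z (Q, d), queue_y (Q,): countermeasure vectors/categories from history batches."""
    cand_z = torch.cat([z_adv, queue_z], dim=0)                 # N + Q candidates
    cand_y = torch.cat([y_adv, queue_y], dim=0)
    z = F.normalize(z_orig, dim=-1)
    cand = F.normalize(cand_z, dim=-1)
    logits = z @ cand.t() / tau                                 # sim(z_i, z_k) / tau
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_mask = (y_orig.unsqueeze(1) == cand_y.unsqueeze(0)).float()   # 1[y_i == y_j]
    m = pos_mask.sum(dim=1).clamp(min=1)                        # positives per anchor
    loss_per_sentence = -(pos_mask * log_prob).sum(dim=1) / m   # L_SCL(i)
    return loss_per_sentence.mean()                             # L_SCL
```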
FIG. 6 is a schematic diagram of a model based on contrast learning according to another embodiment of the present invention. In this embodiment, the countermeasure samples may be generated with a black-box attack technique, such as the TextFooler method: sentences with category labels are input into the model and, through operations such as synonym replacement, are repeatedly modified and re-input into the model, without changing their semantics, until the model predicts a wrong category or outputs a low confidence. In the self-supervised form of the contrast loss function, for a given sentence, all the remaining sentences within a batch are typically treated as negative samples whose distance from it in the representation space is to be pushed apart. However, many sentences in a batch may belong to the same category, and pushing apart sentences of the same category reduces the accuracy of model classification. A supervised form of the contrast loss is therefore adopted, that is, the label information of the supervised task is utilized. Specifically, for a certain original sentence, all countermeasure sentences in the batch that belong to the same category as the original sentence are taken as its positive samples, and the countermeasure sentences of the remaining categories are taken as negative samples. The supervised form of contrast loss thus differs from the self-supervised form in that the positive samples of one original sentence can include multiple countermeasure sentences. The momentum contrast mechanism mainly comprises a queue and a slowly updated encoder. The queue stores the sentence vectors and corresponding categories of the countermeasure sentences of each batch, and the encoder adopts the model trained in the countermeasure training stage, so that the model has a certain fault tolerance and is more robust to countermeasure samples. The technical scheme provided by the embodiment of the invention can achieve the following: 1) using countermeasure samples as the data enhancement strategy improves the quality of positive samples, avoiding problems such as changes in sentence semantics that may be caused by simple random word insertion and deletion operations; 2) by taking countermeasure sentences of the same category as the original sentence as positive samples and countermeasure sentences of different categories as negative samples, contrast learning is applied in a supervised manner to the supervised task, which improves the performance of the model on the supervised task and, at the same time, uses the label category information of the supervised task as an additional learning signal, improving the contrast loss function and thereby the quality of the finally learned sentence vectors; 3) the momentum contrast mechanism increases the number of negative samples without expanding the batch size, again with the aim of improving the quality of the finally learned sentence vectors.
An exemplary description of a method for training a model based on contrast learning provided in an embodiment of the present invention is provided below in connection with fig. 6. In this embodiment, the first preset neural network model and the second preset neural network model are both BERT, the first preset neural network model is denoted by BERTc, and the second preset neural network model is denoted by BERTa.
First, a countermeasure sentence sample is determined by adversarially training BERTa. In the countermeasure training, a black-box attack technique is adopted, for example changing the sentence structure through synonym replacement; specifically, the TextFooler method is adopted, and the sentence classification model for sentence classification comprises BERTa and a fully connected neural network layer. BERTa is first fine-tuned with the original sentence sample so that it can learn the features of the original sentences in the original sentence sample; that is, by fine-tuning the parameters of BERTa, the classification model classifies each original sentence in the original sentence sample into the same category as its real category, where the real category may be provided as input. After BERTa has learned the features of the original sentences, the sentence structure of an original sentence is changed through operations such as synonym replacement, the changed sentence is input into the sentence classification model again, its category is predicted by the sentence classification model, and the sentence structure is changed again and again through such operations until the category is predicted incorrectly; the original sentence with the changed sentence structure is then used as the countermeasure sentence of the unchanged original sentence. The countermeasure sentences of the original sentences in the original sentence sample constitute the countermeasure sentence sample. In addition, if the model can never be made to predict incorrectly and the number of times the sentence structure has been changed has reached the second preset value, the version with the lowest category confidence, among the second-preset-value versions for which the sentence classification model performed category prediction, is selected as the countermeasure sentence. Furthermore, the countermeasure sentence sample generated by this operation can be added to the training set to further fine-tune BERTa, thereby enhancing the robustness of the model. A sketch of this fine-tuning step is given below.
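An illustrative sketch of this fine-tuning step for the sentence classification model (BERTa plus a fully connected classification layer) follows; the checkpoint name, the number of labels, and the learning rate are assumptions made only for the example.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")   # checkpoint is an assumption
berta = BertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=5)
optimizer = torch.optim.AdamW(berta.parameters(), lr=2e-5)

def fine_tune_step(sentences, labels):
    """One fine-tuning step so BERTa predicts the real category of the original
    sentences (and, later, of the generated countermeasure sentences)."""
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    outputs = berta(**inputs, labels=torch.tensor(labels))
    outputs.loss.backward()            # cross-entropy over the predicted categories
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()
```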
Next, in an embodiment of the present invention, a momentum contrast mechanism is used, which comprises a queue and a momentum encoding module. Specifically, the original sentence sample of the current batch is input into BERTc to obtain the original sentence vectors corresponding to the original sentences in the original sentence sample. The countermeasure sentence sample of the original sentence sample of the current batch is input into BERTc to obtain the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence sample of the current batch. The countermeasure sentence samples of the history batches are input into the momentum encoding module to obtain their countermeasure sentence vectors, where the encoder of the momentum encoding module adopts the adversarially trained BERTa. The countermeasure sentence vectors obtained by the momentum encoding module are stored in the queue. If the queue reaches its maximum capacity, the countermeasure sentence vectors of the earliest batch are removed. In addition, since this scheme adopts the supervised form of contrast learning, the category of the sentence corresponding to each sentence vector also needs to be saved, as in the sketch below.
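A minimal sketch of such a queue is given below; the fixed capacity, the tensor layout, and the class name are assumptions.

```python
import torch

class CountermeasureQueue:
    """Fixed-capacity queue of countermeasure sentence vectors and their categories
    from history batches; the earliest entries are removed once capacity Q is reached."""
    def __init__(self, capacity, dim):
        self.capacity = capacity
        self.vectors = torch.zeros(0, dim)
        self.labels = torch.zeros(0, dtype=torch.long)

    @torch.no_grad()
    def enqueue(self, batch_vectors, batch_labels):
        self.vectors = torch.cat([self.vectors, batch_vectors.detach()], dim=0)
        self.labels = torch.cat([self.labels, batch_labels], dim=0)
        overflow = self.vectors.size(0) - self.capacity
        if overflow > 0:                       # drop the oldest entries
            self.vectors = self.vectors[overflow:]
            self.labels = self.labels[overflow:]
```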
Then, a contrast loss function value is obtained based on a preset contrast loss function by combining the obtained original sentence vectors and countermeasure sentence vectors. In the self-supervised form of contrast loss, the positive samples are derived from data-enhanced samples, and the negative samples are derived from all the data-enhanced samples of the current batch or from samples randomly selected from the dataset. This is not suitable for supervised tasks, because a batch may contain many samples of the same class, especially when the number of classes is small and the batch size is large. In the technical scheme provided by the embodiment of the invention, the supervised form of contrast loss is adopted: for an original sentence, the negative samples are taken from the countermeasure sentences in the same batch that belong to categories different from that of the original sentence and, in order to increase the number of positive samples, the countermeasure sentences belonging to the same category as the original sentence are taken as positive samples. In addition, by utilizing the momentum contrast mechanism, positive and negative samples are also taken from the countermeasure sentences of the history batches, which further increases their number. Specifically, for any original sentence in the original sentence sample, among the countermeasure sentence samples of the current batch and of the history batches, a countermeasure sentence of the same category as the original sentence is a positive sample, and a countermeasure sentence of a different category is a negative sample. The preset contrast loss function is designed as follows:
L_{SCL}(i) = -\frac{1}{m} \sum_{j=1}^{N+Q} \mathbb{1}_{[y_i = y_j]} \log \frac{\exp(\mathrm{sim}(z_i, z_j)/\tau)}{\sum_{k=1}^{N+Q} \exp(\mathrm{sim}(z_i, z_k)/\tau)}, \qquad L_{SCL} = \frac{1}{N} \sum_{i=1}^{N} L_{SCL}(i)
wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the history batches, that belong to the same category as the original sentence i in the original sentence sample of the current batch; y_i represents the category of the original sentence i; y_j represents the category of the countermeasure sentence j; \mathbb{1}_{[y_i = y_j]} is 1 when y_i and y_j are the same and 0 when they differ; N represents the number of original sentences in the original sentence sample of the current batch (equal to the number of countermeasure sentences in the countermeasure sentence sample of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the history batches; L_{SCL}(i) represents the contrast loss function value of the original sentence i; L_{SCL} represents the average of the contrast loss function values of all original sentences in the original sentence sample of the current batch; z_i represents the original sentence vector of the original sentence i; z_j represents the countermeasure sentence vector of the countermeasure sentence j; z_k represents the countermeasure sentence vector of the countermeasure sentence k; sim represents cosine similarity; and τ represents the temperature. Furthermore, the contrast loss L_{SCL}(i) of the original sentence i shortens the distance between same-class samples by taking, as positive samples, the countermeasure sentences of the same class as the original sentence i among the countermeasure sentence samples of the current batch and all the countermeasure sentence samples in the queue, and pushes apart the distance between different-class samples by taking the countermeasure sentences of the remaining classes as negative samples.
Finally, the contrast loss function value of the original sentence sample and the cross-entropy loss function value of the original sentence sample are added, the parameters of BERTc are adjusted according to this sum by back propagation, and the parameters of BERTa used in the momentum encoding module are adjusted in a slow-update manner. The update mechanism of the BERTa parameters in the momentum encoding module may use the formula θ_a = λθ_a + (1 - λ)θ_c, where θ_a represents the parameters of BERTa in the momentum encoding module and θ_c represents the parameters of BERTc obtained by the parameter adjustment based on the sum of the contrast loss function value and the cross-entropy loss function value. The slow update keeps the sentence vectors of the multiple batches stored in the queue consistent to a certain extent. Furthermore, the cross-entropy loss function may take a form such as L_{CE} = -\frac{1}{N}\sum_{i=1}^{N} \log p(y_i \mid x_i), where p may be a softmax function over the classifier output. The process of calculating the cross-entropy loss function value and the contrast loss function value, adjusting the parameters of BERTc according to their sum, and adjusting the parameters of BERTa in the momentum encoding module in the slow-update manner is repeated until the number of times the parameters of BERTc and BERTa have been adjusted reaches the first preset value, so as to complete the training process.
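The combined update could be sketched as follows; the value of λ, the optimizer, and the function names are illustrative assumptions rather than the claimed implementation.

```python
import torch

@torch.no_grad()
def momentum_update(bert_a, bert_c, lam=0.999):
    """theta_a = lam * theta_a + (1 - lam) * theta_c: slow update of the momentum encoder."""
    for p_a, p_c in zip(bert_a.parameters(), bert_c.parameters()):
        p_a.data.mul_(lam).add_(p_c.data, alpha=1 - lam)

def train_step(bert_c, bert_a, optimizer, contrast_loss, cross_entropy_loss):
    """One update: sum the contrast loss and the cross-entropy loss, back-propagate
    into BERTc, then slowly update the BERTa used by the momentum encoding module."""
    loss = contrast_loss + cross_entropy_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                   # adjusts the parameters of BERTc
    momentum_update(bert_a, bert_c)
    return loss.item()
```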
In summary, in the technical scheme provided by the embodiment of the invention: 1) the contrast loss is applied to the text classification task in a supervised manner, so that the model can learn the differences between the representation features of samples of different categories, which further improves classification accuracy, and the final model can be used for both the classification task and the semantic matching task; 2) by introducing mechanisms such as countermeasure samples and momentum contrast, the quality of the finally learned text representation is improved.
Accordingly, another aspect of the embodiments of the present invention also provides an apparatus for training a model based on contrast learning.
FIG. 7 is a block diagram of an apparatus for training a model based on contrast learning according to another embodiment of the present invention. As shown in fig. 7, the apparatus includes an countermeasure sentence sample determination module 1, an countermeasure sentence vector determination module 2, an original sentence vector determination module 3, a contrast loss function value determination module 4, and an adjustment module 5. Wherein, the countermeasure sentence sample determining module 1 is used for determining a countermeasure sentence sample of the original sentence sample; the original sentence vector determining module 3 is configured to input an original sentence sample of a current batch to a first preset neural network model for obtaining sentence vectors of sentences, so as to obtain an original sentence vector corresponding to an original sentence in the original sentence sample of the current batch; the countermeasure sentence vector determining module 2 is configured to input a countermeasure sentence sample of the original sentence sample of the current batch into a first preset neural network model, so as to obtain a countermeasure sentence vector corresponding to a countermeasure sentence in the countermeasure sentence sample of the current batch; the contrast loss function value determining module 4 is used for obtaining a contrast loss function value based on a preset contrast loss function by combining the obtained original sentence vector and the obtained countermeasure sentence vector; the adjusting module 5 is configured to adjust parameters of the first preset neural network model according to the obtained contrast loss function value, and repeat the process of obtaining the contrast loss function value and adjusting the parameters of the first preset neural network model according to the obtained contrast loss function value, so that the number of times of adjusting the parameters of the first preset neural network model reaches a first preset value, so as to complete the training process.
Optionally, in an embodiment of the present invention, the countermeasure sentence sample determination module determining the countermeasure sentence sample of the original sentence sample includes: inputting the original sentence sample into a sentence classification model for sentence classification, and training a second preset neural network model, used in the sentence classification model for obtaining sentence vectors, so that the category of each original sentence in the original sentence sample predicted by the sentence classification model is the same as its real category; changing the sentence structure of an original sentence in the original sentence sample through synonym replacement; inputting the original sentence with the changed sentence structure into the sentence classification model again to predict its category; and, for any original sentence whose predicted category after re-input is still the same as the real category, repeating the process of changing the sentence structure and predicting the category until the category prediction is wrong or the number of times of changing the sentence structure and predicting the category reaches a second preset value. The sentence whose category is mispredicted is the countermeasure sentence of the original sentence; otherwise, among the second-preset-value versions of the sentence obtained by changing the sentence structure of the original sentence, the one with the lowest category confidence is the countermeasure sentence of the original sentence. All countermeasure sentences form the countermeasure sentence sample.
Optionally, in an embodiment of the present invention, the contrast loss function value determination module obtains the contrast loss function value by further combining the countermeasure sentence vectors corresponding to the countermeasure sentences in history batches of countermeasure sentence samples.
Optionally, in an embodiment of the present invention, in a case where the countermeasure sentence sample determination module determines the countermeasure sentence sample of the original sentence sample based on the sentence classification model, the countermeasure sentence vectors corresponding to the countermeasure sentences in the history batches of countermeasure sentence samples are obtained based on the trained second preset neural network model.
Optionally, in an embodiment of the present invention, the adjustment module is further configured to adjust parameters of the trained second preset neural network model according to the obtained contrast loss function value.
Optionally, in the embodiment of the present invention, for any original sentence in the original sentence sample, among the countermeasure sentence samples of the current batch and of the history batches, a countermeasure sentence of the same category as the original sentence is a positive sample, and a countermeasure sentence of a different category from the original sentence is a negative sample.
Optionally, in an embodiment of the present invention, the preset contrast loss function includes:
L_{SCL}(i) = -\frac{1}{m} \sum_{j=1}^{N+Q} \mathbb{1}_{[y_i = y_j]} \log \frac{\exp(\mathrm{sim}(z_i, z_j)/\tau)}{\sum_{k=1}^{N+Q} \exp(\mathrm{sim}(z_i, z_k)/\tau)}, \qquad L_{SCL} = \frac{1}{N} \sum_{i=1}^{N} L_{SCL}(i)
wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the history batches, that belong to the same category as the original sentence i in the original sentence sample of the current batch; y_i represents the category of the original sentence i; y_j represents the category of the countermeasure sentence j; \mathbb{1}_{[y_i = y_j]} is 1 when y_i and y_j are the same and 0 when they differ; N represents the number of original sentences in the original sentence sample of the current batch (equal to the number of countermeasure sentences in the countermeasure sentence sample of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the history batches; L_{SCL}(i) represents the contrast loss function value of the original sentence i; L_{SCL} represents the average of the contrast loss function values of all original sentences in the original sentence sample of the current batch; z_i represents the original sentence vector of the original sentence i; z_j represents the countermeasure sentence vector of the countermeasure sentence j; z_k represents the countermeasure sentence vector of the countermeasure sentence k; sim represents cosine similarity; and τ represents the temperature.
The specific working principle and benefits of the device for training a model based on contrast learning provided by the embodiment of the present invention are similar to those of the method for training a model based on contrast learning provided by the embodiment of the present invention, and are not described again herein.
The device for training the model based on contrast learning comprises a processor and a memory, wherein the countermeasure sentence sample determining module, the original sentence vector determining module, the countermeasure sentence vector determining module, the contrast loss function value determining module, the adjusting module and the like are all stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor includes a kernel, and the kernel fetches the corresponding program unit from the memory. One or more kernels may be provided, and by adjusting kernel parameters, problems such as changes in sentence semantics that may be caused by simple random word insertion and deletion operations are avoided.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
Additionally, another aspect of embodiments of the present invention provides a machine-readable storage medium having stored thereon instructions for causing a machine to perform the method described in the above embodiments.
In addition, another aspect of the embodiment of the present application further provides a processor, where the processor is configured to run a program, and the program, when run, performs the method described in the foregoing embodiments.
In addition, another aspect of the embodiment of the present application further provides an apparatus, where the apparatus includes a processor, a memory, and a program stored in the memory and capable of running on the processor, and the processor executes the program to implement the method described in the foregoing embodiment. The device herein may be a server, PC, PAD, cell phone, etc.
Furthermore, another aspect of the embodiments of the present application provides a computer program product comprising a computer program/instruction which, when executed by a processor, implements the method described in the above embodiments.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or nonvolatile memory, such as Read Only Memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape/magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (8)

1. A method for training a model based on contrast learning, the method comprising:
determining a countermeasure sentence sample of an original sentence sample;
inputting the original sentence sample of a current batch into a first preset neural network model for obtaining sentence vectors, to obtain original sentence vectors corresponding to the original sentences in the original sentence sample of the current batch;
inputting the countermeasure sentence sample corresponding to the original sentence sample of the current batch into the first preset neural network model to obtain countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence sample of the current batch;
based on a preset contrast loss function, combining the obtained original sentence vector and the obtained countermeasure sentence vector to obtain a contrast loss function value; and
adjusting parameters of the first preset neural network model according to the obtained contrast loss function value, and repeating the processes of obtaining the contrast loss function value and adjusting the parameters of the first preset neural network model according to the obtained contrast loss function value, until the number of times the parameters of the first preset neural network model are adjusted reaches a first preset number, so as to complete the training process;
the determining the countermeasure sentence sample of the original sentence sample includes:
inputting the original sentence sample into a sentence classification model for sentence classification, and training a second preset neural network model, which is used in the sentence classification model for obtaining sentence vectors, so that the category predicted by the sentence classification model for each original sentence in the original sentence sample is the same as the real category;
changing the sentence structure of an original sentence in the original sentence sample through synonym replacement;
inputting the original sentence with the changed sentence structure into the sentence classification model again to predict the category; and
and repeating, for any original sentence that is input into the sentence classification model again and whose predicted category is still the same as the real category, the processes of changing the sentence structure and predicting the category, until the category is mispredicted or the number of times the sentence structure is changed and the category is predicted reaches a second preset number, wherein the sentence whose category is mispredicted is the countermeasure sentence of the original sentence, or, among the sentence versions, equal in number to the second preset value, obtained by changing the sentence structure of the original sentence, the sentence with the lowest category confidence is the countermeasure sentence of the original sentence, and all the countermeasure sentences form the countermeasure sentence sample.
2. The method of claim 1, wherein obtaining the contrast loss function value further combines countermeasure sentence vectors corresponding to countermeasure sentences in history batches of countermeasure sentence samples.
3. The method according to claim 2, wherein, in the case where the countermeasure sentence sample of the original sentence sample is determined based on the sentence classification model, the countermeasure sentence vectors corresponding to the countermeasure sentences in the history batches of countermeasure sentence samples are obtained based on the trained second preset neural network model.
4. The method according to claim 3, wherein the method further comprises:
adjusting parameters of the trained second preset neural network model according to the obtained contrast loss function value.
5. The method according to claim 2, wherein, for any original sentence in the original sentence sample, among the countermeasure sentence sample of a current batch and the countermeasure sentence samples of history batches, a countermeasure sentence of the same category as the original sentence is a positive sample, and a countermeasure sentence of a different category from the original sentence is a negative sample.
6. The method of claim 5, wherein the preset contrast loss function comprises:
wherein m represents the number of countermeasure sentences, in the countermeasure sentence sample of the current batch and the countermeasure sentence samples of the history batches, that belong to the same category as the original sentence i in the original sentence sample of the current batch; y_i represents the category of the original sentence i; y_j represents the category of the countermeasure sentence j; 1(y_i = y_j) is 1 when y_i and y_j are the same and 0 when they are different; N represents the number of original sentences in the original sentence sample of the current batch, or equivalently the number of countermeasure sentences in the countermeasure sentence sample of the current batch; Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the history batches of countermeasure sentence samples; L_SCL(i) represents the contrast loss function value of the original sentence i; L_SCL represents the average of the contrast loss function values of all original sentences in the original sentence sample of the current batch; z_i represents the original sentence vector of the original sentence i; z_j represents the countermeasure sentence vector of the countermeasure sentence j; z_k represents the countermeasure sentence vector of the countermeasure sentence k; sim represents cosine similarity; and τ represents the temperature.
7. A machine-readable storage medium having stored thereon instructions for causing a machine to perform the method of any one of claims 1-6.
8. A processor configured to run a program, wherein the program is configured to perform the method of any of claims 1-6 when run.
CN202111221793.6A 2021-10-20 2021-10-20 Method and apparatus for training a model based on contrast learning Active CN113837370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111221793.6A CN113837370B (en) 2021-10-20 2021-10-20 Method and apparatus for training a model based on contrast learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111221793.6A CN113837370B (en) 2021-10-20 2021-10-20 Method and apparatus for training a model based on contrast learning

Publications (2)

Publication Number Publication Date
CN113837370A CN113837370A (en) 2021-12-24
CN113837370B true CN113837370B (en) 2023-12-05

Family

ID=78965486

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111221793.6A Active CN113837370B (en) 2021-10-20 2021-10-20 Method and apparatus for training a model based on contrast learning

Country Status (1)

Country Link
CN (1) CN113837370B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114549891B (en) * 2022-01-06 2024-03-08 中国人民解放军国防科技大学 Foundation cloud image cloud class identification method based on comparison self-supervision learning
CN114756677B (en) * 2022-03-21 2023-07-25 马上消费金融股份有限公司 Sample generation method, training method of text classification model and text classification method
CN114417794B (en) * 2022-03-29 2022-09-09 北京大学 Training method and device for scale problem generation model and computer equipment
CN114861637B (en) * 2022-05-18 2023-06-16 北京百度网讯科技有限公司 Spelling error correction model generation method and device, and spelling error correction method and device
CN114970716A (en) * 2022-05-26 2022-08-30 支付宝(杭州)信息技术有限公司 Method and device for training representation model, readable storage medium and computing equipment
CN114742018A (en) * 2022-06-09 2022-07-12 成都晓多科技有限公司 Contrast learning level coding text clustering method and system based on confrontation training
CN116069903A (en) * 2023-03-02 2023-05-05 特斯联科技集团有限公司 Class search method, system, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767405A (en) * 2020-07-30 2020-10-13 腾讯科技(深圳)有限公司 Training method, device and equipment of text classification model and storage medium
CN111914928A (en) * 2020-07-30 2020-11-10 南京大学 Method for defending confrontation sample for image classifier
CN112668325A (en) * 2020-12-18 2021-04-16 平安科技(深圳)有限公司 Machine translation enhancing method, system, terminal and storage medium
CN112711942A (en) * 2021-03-29 2021-04-27 贝壳找房(北京)科技有限公司 Training method, generation method, device and equipment of house source title generation model
CN112906820A (en) * 2021-03-19 2021-06-04 中国石油大学(华东) Method for calculating sentence similarity of antithetical convolution neural network based on genetic algorithm
CN113139479A (en) * 2021-04-28 2021-07-20 山东大学 Micro-expression recognition method and system based on optical flow and RGB modal contrast learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11386302B2 (en) * 2020-04-13 2022-07-12 Google Llc Systems and methods for contrastive learning of visual representations

Also Published As

Publication number Publication date
CN113837370A (en) 2021-12-24

Similar Documents

Publication Publication Date Title
CN113837370B (en) Method and apparatus for training a model based on contrast learning
CN111950269A (en) Text statement processing method and device, computer equipment and storage medium
US20230222353A1 (en) Method and system for training a neural network model using adversarial learning and knowledge distillation
CN111241851A (en) Semantic similarity determination method and device and processing equipment
Rae et al. Fast parametric learning with activation memorization
CN115408525B (en) Letters and interviews text classification method, device, equipment and medium based on multi-level label
CN115080749B (en) Weak supervision text classification method, system and device based on self-supervision training
Dai et al. Hybrid deep model for human behavior understanding on industrial internet of video things
CN111753878A (en) Network model deployment method, equipment and medium
Cao et al. Stacked residual recurrent neural network with word weight for text classification
CN115700515A (en) Text multi-label classification method and device
Tang et al. A survey on transformer compression
Rodriguez Deep Learning Systems: Algorithms, Compilers, and Processors for Large-Scale Production
Xia An overview of deep learning
CN111611796A (en) Hypernym determination method and device for hyponym, electronic device and storage medium
Zhou et al. Linear models
CN116189208A (en) Method, apparatus, device and medium for text recognition
CN113704466B (en) Text multi-label classification method and device based on iterative network and electronic equipment
CN116090538A (en) Model weight acquisition method and related system
Lauer From support vector machines to hybrid system identification
Burciu et al. Sensing forest for pattern recognition
Barbin et al. Addressing the symbol grounding problem with constraints in neuro-symbolic planning
Kumar et al. Generative Adversarial Network for Hand-Writing Detection
Hu et al. Graph Convolutional Network Semantic Enhancement Hashing for Self-supervised Cross-Modal Retrieval
Virmaux et al. Knothe-Rosenblatt transport for Unsupervised Domain Adaptation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220425

Address after: 100085 Floor 101 102-1, No. 35 Building, No. 2 Hospital, Xierqi West Road, Haidian District, Beijing

Applicant after: Seashell Housing (Beijing) Technology Co.,Ltd.

Address before: 101309 room 24, 62 Farm Road, Erjie village, Yangzhen, Shunyi District, Beijing

Applicant before: Beijing fangjianghu Technology Co.,Ltd.

GR01 Patent grant