Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a method and an apparatus for generating word vector adversarial samples for text classification, so as to effectively generate word vector adversarial samples for text classification and cause a neural network text classifier to produce incorrect classifications.
To solve the above problem, an embodiment of the present invention provides a method for generating word vector adversarial samples for text classification, which includes at least the following steps:
initializing an English text to be classified, performing word embedding on the English text, and converting the English text into a corresponding vector representation;
repeatedly performing partial derivative operations on the word vectors in the English text according to the loss function, until the classification result output by the neural network model is wrong;
based on the modified word vectors, selecting the words closest to the modified word vectors in the embedding space by using the Euclidean distance formula, and constructing an attack substitute word set;
and randomly replacing words of the English text according to the attack substitute word set to generate an adversarial sample.
Further, the method for generating word vector adversarial samples for text classification further comprises the following steps:
designing a neural network model for classifying texts; the neural network model comprises an input layer, a hidden layer and an output layer;
converting the training text into word vector representation, inputting the word vector representation to the neural network model for training to obtain an output result of the neural network model;
and correcting parameters of the neural network model according to the output result of the neural network model and the correct category of the current training text, and fixing the parameters of the neural network model after training.
Further, the word embedding is performed on the English text, and the English text is converted into a corresponding vector representation, specifically:
word2vec word embedding is carried out on each word in the English text, and each word is converted into a word vector with a fixed length of m;
representing the English text as a two-dimensional matrix of n x m; wherein n is the total number of words in the English text, and m is a preset fixed length.
Further, the step of repeatedly performing partial derivative operations on the word vectors in the English text according to the loss function until the classification result output by the neural network model is wrong is specifically:
performing partial derivative operation on word vectors in the English text according to the loss function to obtain the forward change rate of the loss function along each dimension of the input word vectors;
modifying each dimension of the input word vector according to the forward change rate to maximize a loss function in a constraint range;
and repeating the steps to modify a plurality of words in the English text until the classification result output by the neural network model is wrong.
One embodiment of the present invention provides a word vector adversarial sample generation apparatus for text classification, including:
a word embedding module, configured to initialize an English text to be classified, perform word embedding on the English text, and convert the English text into a corresponding vector representation;
a word vector modification module, configured to repeatedly perform partial derivative operations on the word vectors in the English text according to the loss function until the classification result output by the neural network model is wrong;
an attack substitute word set module, configured to select, based on the modified word vectors, the words closest to the modified word vectors in the embedding space by using the Euclidean distance formula, and construct an attack substitute word set;
and an adversarial sample module, configured to randomly replace words of the English text according to the attack substitute word set to generate an adversarial sample.
Further, the word vector adversarial sample generation apparatus for text classification further includes:
the neural network model module is used for designing a neural network model for classifying texts; the neural network model comprises an input layer, a hidden layer and an output layer;
the training module is used for converting a training text into a word vector representation and inputting the word vector representation to the neural network model for training to obtain an output result of the neural network model;
and the parameter correction module is used for correcting the parameters of the neural network model according to the output result of the neural network model and the correct category of the current training text, and fixing the parameters of the neural network model after the training is finished.
Further, the word embedding module specifically includes:
word2vec word embedding is carried out on each word in the English text, and each word is converted into a word vector with a fixed length of m;
representing the English text as a two-dimensional matrix of n x m; wherein n is the total number of words in the English text, and m is a preset fixed length.
Further, the word vector modification module specifically includes:
performing partial derivative operation on word vectors in the English text according to the loss function to obtain the forward change rate of the loss function along each dimension of the input word vectors;
modifying each dimension of the input word vector according to the forward change rate to maximize a loss function in a constraint range;
and repeating the steps to modify a plurality of words in the English text until the classification result output by the neural network model is wrong.
An embodiment of the present invention further provides a terminal device for generating word vector adversarial samples for text classification, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor executes the computer program to implement the word vector adversarial sample generation method for text classification as described above.
An embodiment of the present invention further provides a computer-readable storage medium, which includes a stored computer program, wherein when the computer program runs, the apparatus on which the computer-readable storage medium is located is controlled to execute the word vector adversarial sample generation method for text classification as described above.
The embodiment of the invention has the following beneficial effects:
The embodiment of the invention provides a method and a device for generating word vector adversarial samples for text classification, wherein the method comprises the following steps: initializing an English text to be classified, performing word embedding on the English text, and converting the English text into a corresponding vector representation; repeatedly performing partial derivative operations on the word vectors in the English text according to the loss function until the classification result output by the neural network model is wrong; based on the modified word vectors, selecting the words closest to the modified word vectors in the embedding space by using the Euclidean distance formula, and constructing an attack substitute word set; and randomly replacing words of the English text according to the attack substitute word set to generate an adversarial sample. The invention can effectively generate word vector adversarial samples for text classification using only a small amount of knowledge about the original classification model. The generated samples cause the neural network text classifier to misclassify while the semantics remain unchanged, the perturbation is imperceptible to people, and human recognition and classification of the text are unaffected. By approximating word vectors with the Euclidean distance, the most appropriate substitute words are selected, which ensures that the adversarial samples contain no illegal characters while reducing the recognition accuracy of the neural network.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description of the present application, it is to be understood that the terms "first", "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless otherwise specified.
First, an application scenario to which the present invention can be applied is described, such as generating word vector adversarial samples for text classification.
The first embodiment of the present invention:
please refer to fig. 1-2.
As shown in fig. 1, the present embodiment provides a method for generating word vector adversarial samples for text classification, which includes at least the following steps:
S101, initializing the English text to be classified, performing word embedding on the English text, and converting it into a corresponding vector representation.
In a preferred embodiment, the word embedding is performed on the English text, and the English text is converted into a corresponding vector representation, specifically:
word2vec word embedding is carried out on each word in the English text, and each word is converted into a word vector with a fixed length of m;
representing the English text as a two-dimensional matrix of n x m; wherein n is the total number of words in the English text, and m is a preset fixed length.
Specifically, in step S101, since text sequences are discrete symbols that cannot be directly processed by a computer, word embedding is first performed to convert them into vector representations. Assuming each piece of English text can be represented as [X1, X2, …, Xn] with n words, the word Xi is converted through word2vec word embedding into a word vector of fixed length m, i.e., Xi = [Xi1, Xi2, …, Xim], so each piece of text is represented as a two-dimensional matrix of n × m. It should be noted that, in this adversarial sample generation method for text classification, the word embedding methods include, but are not limited to, the word2vec word embedding method.
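The embedding step described above can be sketched as follows; the vocabulary and vector values are purely hypothetical stand-ins for a trained word2vec model (here m = 3), not part of the original disclosure:

```python
# Sketch of step S101 with a hypothetical word2vec-style lookup table;
# a real system would load trained embeddings instead.
EMBEDDINGS = {
    "the":   [0.3, 0.5, 0.2],
    "movie": [0.6, 0.9, 0.1],
    "was":   [0.5, 0.4, 0.6],
    "very":  [0.8, 0.9, 0.1],
    "good":  [0.5, 0.6, 0.8],
}

def embed_text(text):
    """Convert an English text of n words into an n x m matrix of word vectors."""
    return [EMBEDDINGS[w] for w in text.lower().split()]

matrix = embed_text("The movie was very good")   # n = 5 rows, m = 3 columns
```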
And S102, repeatedly performing partial derivative operation on word vectors in the English text according to the loss function until the classification result output by the neural network model is wrong.
In a preferred embodiment, the repeatedly performing partial derivative operations on the word vectors in the English text according to the loss function until the classification result output by the neural network model is wrong is specifically:
performing partial derivative operation on word vectors in the English text according to the loss function to obtain the forward change rate of the loss function along each dimension of the input word vectors;
modifying each dimension of the input word vector according to the forward change rate to maximize a loss function in a constraint range;
and repeating the steps to modify a plurality of words in the English text until the classification result output by the neural network model is wrong.
Specifically, for step S102, a partial derivative operation is performed on the word vector by the loss function. Since the partial derivative reflects the rate of change of the objective function (i.e., the loss function) in the positive direction along each coordinate axis, in this embodiment the loss function is differentiated with respect to the input word Xi to obtain the positive rate of change of the loss function along each dimension (Xi1, Xi2, …, Xim) of Xi. Each dimension of Xi is then modified according to this rate of change so as to maximize the loss function (within the constraint rule), and the above operation is repeated on further words until the neural network model misclassifies. For example, if the output of the model is (0.3, 0.7), the English text X is classified as a bad comment; if this is inconsistent with the original category, the neural network model is judged to have classified incorrectly. By maximizing the loss function, the model misclassifies the modified input sample with the maximum probability.
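A minimal sketch of this gradient-based modification follows. As an assumption for illustration only, a tiny hand-differentiable logistic classifier stands in for the full neural network, so the partial derivatives can be written explicitly; the weights, vector, and step size are hypothetical:

```python
import math

W = [0.9, -0.4, 0.2]          # hypothetical fixed model weights, one per dimension

def predict(x):
    """Probability that word vector x belongs to the positive class."""
    z = sum(wi * xi for wi, xi in zip(W, x))
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(x, y):
    """d(cross-entropy loss)/dx for true label y in {0, 1}: (p - y) * W."""
    p = predict(x)
    return [(p - y) * wi for wi in W]

def perturb(x, y, eps=0.5):
    """One step: move each dimension along the sign of the partial
    derivative to increase the loss within an eps constraint."""
    g = loss_gradient(x, y)
    return [xi + eps * (1 if gi > 0 else -1) for xi, gi in zip(x, g)]

x = [0.6, 0.9, 0.1]           # original word vector, true label y = 1
x_adv = perturb(x, 1)
# the perturbed vector yields a lower positive-class probability
```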
S103, based on the modified word vectors, selecting the words closest to the modified word vectors in the embedding space by using the Euclidean distance formula, and constructing an attack substitute word set.
Specifically, for step S103, the two-dimensional representation of the English text X has been changed by the previous step S102: the matrix values change while the matrix dimensions do not, and the word vector of each modified word becomes [X'i1, X'i2, …, X'im]. However, [X'i1, X'i2, …, X'im] is likely not present in the mapping space of word2vec, which means that this word vector has no corresponding word in the real vocabulary, so the Euclidean distance metric is adopted:
the word vector closest to [X'i1, X'i2, …, X'im] is selected, and the word corresponding to that vector is denoted X'i, thereby obtaining the correspondence Xi → X'i. Multiple pairs Xi → X'i are found in this way, and the attack substitute word set is constructed from them. Because the distance approximation method selects only legal words for replacement, the semantic information of the text and human recognition are affected as little as possible, and it is guaranteed that the adversarial samples contain no illegal characters while the recognition accuracy of the neural network is reduced.
It should be noted that the distance measures usable for measuring the distance between word vectors in this adversarial sample generation method include, but are not limited to, the Euclidean distance.
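The nearest-neighbour search over legal word vectors can be sketched as follows; the vocabulary, its vectors, and the query vector are hypothetical examples, not values from the disclosure:

```python
import math

VOCAB = {                       # hypothetical embedding vocabulary
    "good": [0.5, 0.6, 0.8],
    "fine": [0.4, 0.5, 0.8],
    "bad":  [0.9, 0.1, 0.2],
}

def euclidean(a, b):
    """Euclidean distance between two word vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def nearest_word(vector):
    """Return the vocabulary word whose embedding is closest to `vector`."""
    return min(VOCAB, key=lambda w: euclidean(VOCAB[w], vector))

# a modified vector with no exact word is approximated by its nearest neighbour
substitute = nearest_word([0.3, 0.5, 0.8])
```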
And S104, randomly replacing words of the English text according to the attack substitute word set to generate an adversarial sample.
Specifically, for step S104, for each piece of test text, words of the text are randomly replaced according to the existing attack substitute word set, and the replaced text is the adversarial sample.
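The replacement step can be sketched as follows; the substitute pairs and replacement rate are hypothetical illustrations:

```python
import random

SUBSTITUTES = {"good": "fine", "great": "nice"}   # hypothetical Xi -> X'i pairs

def make_adversarial(text, rate=1.0, rng=random):
    """Replace each word found in the substitute set with probability `rate`."""
    out = []
    for w in text.split():
        if w in SUBSTITUTES and rng.random() < rate:
            out.append(SUBSTITUTES[w])   # swap in the attack substitute word
        else:
            out.append(w)                # keep the original word
    return " ".join(out)

sample = make_adversarial("the movie was good", rate=1.0)
```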
In a specific embodiment, adversarial sample generation for movie comment text data is taken as an example.
First, a movie comment text X is selected. Assuming the text has length 5, [X1, X2, X3, X4, X5], each word is converted through word2vec word embedding into a word vector of length 3, and the text X is converted into a two-dimensional matrix: [[0.3,0.5,0.2], [0.6,0.9,0.1], [0.5,0.4,0.6], [0.8,0.9,0.1], [0.5,0.6,0.8]];
modifying the input word vector using a loss function, the modified word vector being represented as:
[[0.3,0.5,0.2],[0.3,0.5,0.8],[0.5,0.4,0.6],[0.7,0.8,0.5],[0.5,0.6,0.8]];
obviously, only the embedded word vectors of words X2 and X4 have changed. However, considering that the modified word vectors [0.3,0.5,0.8] and [0.7,0.8,0.5] may not have mapping objects in the embedding space, the modified word vectors are approximated by using the euclidean distance as a standard, and finally the modified word vectors are selected to be closest to [0.4,0.5,0.8] (corresponding to the word X '2) and [0.7,0.8,0.6] (corresponding to the word X'4) to [0.7,0.8,0.5], so that the attack substitution word set collects two pairs of substitution words.
And re-selecting a movie comment text Y, and replacing X2 with X'2 to generate a confrontation sample on the assumption that the word X2 exists in the text Y.
In a preferred embodiment, as shown in fig. 2, the method for generating word vector adversarial samples for text classification further includes:
designing a neural network model for classifying texts; the neural network model comprises an input layer, a hidden layer and an output layer;
converting the training text into word vector representation, inputting the word vector representation to the neural network model for training to obtain an output result of the neural network model;
and correcting parameters of the neural network model according to the output result of the neural network model and the correct category of the current training text, and fixing the parameters of the neural network model after training.
Specifically, a neural network model M is designed to classify the text. The structure of the neural network can be roughly divided into an input layer, a hidden layer and an output layer, wherein the number of input layer nodes corresponds to the dimension of the input word vector; the number of hidden layers and the number of nodes in each layer can be set arbitrarily; and the number of output layer nodes corresponds to the number of categories. For example, for the movie comment data, the number of output layer nodes is 2, and the output denotes the probability that the text is recognized as each category, recorded as (a1, a2), where a1 denotes the probability of being classified as a good comment and a2 the probability of being classified as a bad comment; obviously a1 + a2 = 1.
The training text is converted into word vector representations and then fed into the neural network model for training. The network parameters are corrected according to the model output and the correct category of the current sample; an optimizer is generally used to minimize the loss function of the neural network model. After training is finished, the neural network model parameters are fixed and the model is denoted M.
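A minimal training sketch follows; as an assumption for illustration, a one-layer logistic model stands in for the full input/hidden/output network, and the training data are hypothetical word vectors with binary labels:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, epochs=200, lr=0.5):
    """samples: list of (word-vector, label) pairs; returns the fixed weights."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            # gradient-descent step on the cross-entropy loss: grad = (p - y) * x
            w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
    return w                     # parameters are fixed after training (model M)

data = [([0.5, 0.6, 0.8], 1),    # hypothetical good-comment word vector
        ([0.9, 0.1, 0.2], 0)]    # hypothetical bad-comment word vector
M = train(data)
```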
This embodiment provides a method for generating word vector adversarial samples for text classification, which includes: initializing an English text to be classified, performing word embedding on the English text, and converting the English text into a corresponding vector representation; repeatedly performing partial derivative operations on the word vectors in the English text according to the loss function until the classification result output by the neural network model is wrong; based on the modified word vectors, selecting the words closest to the modified word vectors in the embedding space by using the Euclidean distance formula, and constructing an attack substitute word set; and randomly replacing words of the English text according to the attack substitute word set to generate an adversarial sample.
Compared with the prior art, the word vector adversarial sample generation method for text classification provided by the embodiment of the invention is conceptually simple and easy to implement: it only needs to search for the substitute word set and perform the replacement directly at the input, so it is suitable for application scenarios with a large number of text classification tasks. The generated adversarial samples cause the neural network text classifier to misclassify while the semantics remain unchanged and human recognition and classification of the text are unaffected; in addition, because the approximation method inserts only legal characters, the perturbation is imperceptible to people to the maximum extent.
Second embodiment of the invention:
please refer to fig. 3.
As shown in fig. 3, the present embodiment provides a word vector adversarial sample generation apparatus for text classification, including:
the word embedding module 100 is configured to initialize the English text to be classified, perform word embedding on the English text, and convert it into a corresponding vector representation.
In a preferred embodiment, the word embedding module 100 specifically includes:
word2vec word embedding is carried out on each word in the English text, and each word is converted into a word vector with a fixed length of m;
representing the English text as a two-dimensional matrix of n x m; wherein n is the total number of words in the English text, and m is a preset fixed length.
Specifically, for the word embedding module 100, since text sequences are discrete symbols that cannot be directly processed by a computer, word embedding is first performed to convert them into vector representations. Assuming each piece of English text can be represented as [X1, X2, …, Xn] with n words, the word Xi is converted through word2vec word embedding into a word vector of fixed length m, i.e., Xi = [Xi1, Xi2, …, Xim], so each piece of text is represented as a two-dimensional matrix of n × m. It should be noted that the word embedding methods include, but are not limited to, the word2vec word embedding method.
And the word vector modification module 200 is configured to repeatedly perform partial derivative operations on the word vectors in the English text according to the loss function until the classification result output by the neural network model is incorrect.
In a preferred embodiment, the word vector modification module 200 specifically includes:
performing partial derivative operation on word vectors in the English text according to the loss function to obtain the forward change rate of the loss function along each dimension of the input word vectors;
modifying each dimension of the input word vector according to the forward change rate to maximize a loss function in a constraint range;
and repeating the steps to modify a plurality of words in the English text until the classification result output by the neural network model is wrong.
Specifically, for the word vector modification module 200, a partial derivative operation is performed on the word vector by the loss function. Since the partial derivative reflects the rate of change of the objective function (i.e., the loss function) in the positive direction along each coordinate axis, in this embodiment the loss function is differentiated with respect to the input word Xi to obtain the positive rate of change of the loss function along each dimension (Xi1, Xi2, …, Xim) of Xi. Each dimension of Xi is then modified according to this rate of change so as to maximize the loss function (within the constraint rule), and the above operation is repeated on further words until the neural network model misclassifies. For example, if the output of the model is (0.3, 0.7), the English text X is classified as a bad comment; if this is inconsistent with the original category, the neural network model is judged to have classified incorrectly. By maximizing the loss function, the model misclassifies the modified input sample with the maximum probability.
And the attack substitute word set module 300 is configured to select, based on the modified word vectors, the words closest to the modified word vectors in the embedding space by using the Euclidean distance formula, and construct an attack substitute word set.
Specifically, for the attack substitute word set module 300, the two-dimensional representation of the English text X has been changed by the word vector modification module 200: the matrix values change while the matrix dimensions do not, and the word vector of each modified word becomes [X'i1, X'i2, …, X'im]. However, [X'i1, X'i2, …, X'im] is likely not present in the mapping space of word2vec, which means that this word vector has no corresponding word in the real vocabulary, so the Euclidean distance metric is adopted:
the word vector closest to [X'i1, X'i2, …, X'im] is selected, and the word corresponding to that vector is denoted X'i, thereby obtaining the correspondence Xi → X'i. Multiple pairs Xi → X'i are found in this way, and the attack substitute word set is constructed from them. Because the distance approximation method selects only legal words for replacement, the semantic information of the text and human recognition are affected as little as possible, and it is guaranteed that the adversarial samples contain no illegal characters while the recognition accuracy of the neural network is reduced.
It should be noted that the distance measures usable for measuring the distance between word vectors in this adversarial sample generation method include, but are not limited to, the Euclidean distance.
And the adversarial sample module 400 is configured to randomly replace words of the English text according to the attack substitute word set to generate an adversarial sample.
Specifically, for the adversarial sample module 400, a movie comment text X is selected. Assuming the text has length 5, [X1, X2, X3, X4, X5], each word is converted through word2vec word embedding into a word vector of length 3, and the text X is converted into a two-dimensional matrix:
[[0.3,0.5,0.2],[0.6,0.9,0.1],[0.5,0.4,0.6],[0.8,0.9,0.1],[0.5,0.6,0.8]];
modifying the input word vector using a loss function, the modified word vector being represented as:
[[0.3,0.5,0.2],[0.3,0.5,0.8],[0.5,0.4,0.6],[0.7,0.8,0.5],[0.5,0.6,0.8]];
obviously, only the embedded word vectors of words X2 and X4 have changed. However, considering that the modified word vectors [0.3,0.5,0.8] and [0.7,0.8,0.5] may not have mapping objects in the embedding space, the modified word vectors are approximated by using the euclidean distance as a standard, and finally the modified word vectors are selected to be closest to [0.4,0.5,0.8] (corresponding to the word X '2) and [0.7,0.8,0.6] (corresponding to the word X'4) to [0.7,0.8,0.5], so that the attack substitution word set collects two pairs of substitution words.
And re-selecting a movie comment text Y, and replacing X2 with X'2 to generate a confrontation sample on the assumption that the word X2 exists in the text Y.
In a preferred embodiment, the word vector adversarial sample generation apparatus for text classification further includes:
the neural network model module is used for designing a neural network model for classifying texts; the neural network model comprises an input layer, a hidden layer and an output layer;
the training module is used for converting a training text into a word vector representation and inputting the word vector representation to the neural network model for training to obtain an output result of the neural network model;
and the parameter correction module is used for correcting the parameters of the neural network model according to the output result of the neural network model and the correct category of the current training text, and fixing the parameters of the neural network model after the training is finished.
Specifically, the neural network model module is mainly used for designing the neural network model M to classify the text. The structure of the neural network can be roughly divided into an input layer, a hidden layer and an output layer, wherein the node number of the input layer of the neural network corresponds to the dimension of an input word vector; the number of layers of the hidden layer and the number of nodes of each layer can be set arbitrarily; the number of output layer nodes corresponds to the number of categories.
Specifically, for the training module, the training text is mainly converted into word vector representation and then put into a neural network model for training.
Specifically, for the parameter correction module, the network parameters are corrected according to the model output and the correct category of the current sample; an optimizer is generally used to minimize the loss function of the neural network model. After training is finished, the neural network model parameters are fixed and the model is denoted M.
This embodiment provides a word vector adversarial sample generation apparatus for text classification, which includes: the word embedding module 100, configured to initialize the English text to be classified, perform word embedding on the English text, and convert it into a corresponding vector representation; the word vector modification module 200, configured to repeatedly perform partial derivative operations on the word vectors in the English text according to the loss function until the classification result output by the neural network model is wrong; the attack substitute word set module 300, configured to select, based on the modified word vectors, the words closest to the modified word vectors in the embedding space by using the Euclidean distance formula, and construct an attack substitute word set; and the adversarial sample module 400, configured to randomly replace words of the English text according to the attack substitute word set to generate an adversarial sample. This embodiment can effectively generate word vector adversarial samples for text classification using only a small amount of knowledge about the original classification model. The generated samples cause the neural network text classifier to misclassify while the semantics remain unchanged and human recognition and classification of the text are unaffected. By approximating word vectors with the Euclidean distance, the most appropriate substitute words are selected, which ensures that the adversarial samples contain no illegal characters while reducing the recognition accuracy of the neural network.
An embodiment of the present invention further provides a terminal device for generating word vector adversarial samples for text classification, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor executes the computer program to implement the word vector adversarial sample generation method for text classification as described above.
An embodiment of the present invention further provides a computer-readable storage medium, which includes a stored computer program, wherein when the computer program runs, the apparatus on which the computer-readable storage medium is located is controlled to execute the word vector adversarial sample generation method for text classification as described above.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the modules may be a logical division, and in actual implementation, there may be another division, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
The foregoing is directed to the preferred embodiment of the present invention, and it is understood that various changes and modifications may be made by one skilled in the art without departing from the spirit of the invention, and it is intended that such changes and modifications be considered as within the scope of the invention.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.