CN111652267B - Method and device for generating countermeasure sample, electronic equipment and storage medium - Google Patents

Method and device for generating countermeasure sample, electronic equipment and storage medium Download PDF

Info

Publication number
CN111652267B
Authority
CN
China
Prior art keywords
particle
sample
word
original text
optimal solution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010317965.9A
Other languages
Chinese (zh)
Other versions
CN111652267A (en
Inventor
岂凡超 (Qi Fanchao)
臧原 (Zang Yuan)
刘知远 (Liu Zhiyuan)
孙茂松 (Sun Maosong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202010317965.9A priority Critical patent/CN111652267B/en
Priority to PCT/CN2020/103219 priority patent/WO2021212675A1/en
Publication of CN111652267A publication Critical patent/CN111652267A/en
Application granted granted Critical
Publication of CN111652267B publication Critical patent/CN111652267B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

An embodiment of the invention provides a method and an apparatus for generating adversarial samples, an electronic device and a storage medium. An original text is obtained; a candidate set of replacement words is determined for each word in the original text; and, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words is searched for a sample that attacks a target model, so as to generate an adversarial sample. The embodiment searches for adversarial samples with particle swarm optimization, a metaheuristic population-based evolutionary computation method that is more efficient than the genetic algorithm, so using it speeds up the search and raises the attack success rate. For different natural language processing models, the embodiment can quickly and efficiently generate a large number of high-quality adversarial samples that successfully deceive the target model and thereby expose its vulnerability, and it therefore has good practicability.

Description

Method and device for generating countermeasure sample, electronic equipment and storage medium
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular to a method and an apparatus for generating adversarial samples, an electronic device and a storage medium.
Background
An adversarial attack is the process of causing a target model to make wrong decisions by generating adversarial samples. Adversarial attacks can expose the vulnerability of a machine learning model and thereby help improve its robustness and interpretability. A textual adversarial attack is the process of making a natural language processing model misjudge by modifying the original text to generate adversarial samples.
Existing studies have shown that deep learning models are highly susceptible to adversarial attacks; for example, simple modifications to abusive text can fool state-of-the-art abuse detection systems. Given that natural language processing models based on deep learning are now widely deployed in application systems such as spam detection and malicious comment detection, researching textual adversarial attacks in order to find and fix the weaknesses of these systems has increasing practical significance and value.
Existing textual adversarial attack methods operate mainly at the word level: a candidate set of replacement words is determined for each word in the original text, and an adversarial sample that can successfully attack the target model is searched for in the discrete space formed by the combinations of all the candidate sets of replacement words. Existing search algorithms are mainly based on greedy or genetic algorithms, which leave considerable room for improvement in search speed and attack success rate.
Disclosure of Invention
Embodiments of the invention provide a method and an apparatus for generating adversarial samples, an electronic device and a storage medium, which are used to solve the problems of slow search and low attack success rate in the prior art.
An embodiment of the invention provides a method for generating an adversarial sample, which comprises the following steps:
acquiring an original text;
determining a candidate set of replacement words of each word in the original text;
and searching, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words for a sample that attacks a target model, to generate an adversarial sample.
Optionally, the determining a candidate set of replacement words for each word in the original text includes:
marking the part of speech of each word in the original text;
obtaining the sememe annotation of each sense of each word under a given part of speech, and determining words with the same sememe annotation and the same part of speech as candidate replacement words;
and determining the set of candidate replacement words as the candidate set of replacement words.
Optionally, the tagging parts of speech of each word in the original text includes:
if the original text is determined to be a Chinese text, performing word segmentation on the original text and labeling the part of speech of each segmented word;
and if the original text is determined to be an English text, lemmatizing each word in the original text and labeling the part of speech of each lemmatized word.
Optionally, the searching, based on the particle swarm optimization algorithm, of a discrete space formed by combinations of the candidate sets of replacement words for a sample that attacks the target model, and the generating of an adversarial sample, include:
copying the original text k times to obtain initial samples, and performing a mutation operation on each initial sample to generate a particle swarm, wherein each particle in the particle swarm is a mutated sample;
recording a global optimal solution and historical optimal solutions of the particle swarm at each iteration, wherein the global optimal solution is the position of the particle with the highest target-label prediction score given by the target model, and the historical optimal solution of each particle is the position with the highest target-label prediction score over that particle's past iterations;
and stopping the search and outputting the adversarial sample when the recorded optimal solution is determined to be an adversarial sample, otherwise updating the particle velocities and positions, performing the mutation operation, and returning to the recording operation, until the recorded optimal solution is determined to be an adversarial sample and the corresponding adversarial sample is output.
Optionally, the updating of the particle velocity and position comprises:
at each iteration, updating the velocity of each particle as follows:
v_n^d = ω·v_n^d + (1 - ω)·[ I(p_n^d, x_n^d) + I(p_g^d, x_n^d) ]
where v_n^d is the velocity of the n-th particle in the d-th dimension, ω is an inertia factor that decreases with the number of iterations, x_n^d is the position of the n-th particle in the d-th dimension, p_n^d is the position in the d-th dimension of the historical optimal solution of the n-th particle, p_g^d is the position in the d-th dimension of the global optimal solution, and I(a, b) is an indicator function that returns +1 or -1 according to whether a and b are equal;
the particle position update comprises: moving towards each particle's own historical optimal solution with probability P_i, and moving towards the global optimal solution with probability P_g, wherein P_i and P_g are updated with the number of iterations:
P_i = P_max - (t/T)·(P_max - P_min)
P_g = P_min + (t/T)·(P_max - P_min)
where 1 > P_max > P_min > 0 are predefined hyper-parameters, t is the current iteration number, and T is the maximum iteration number.
Optionally, the performing of the mutation operation comprises:
each particle in the particle swarm performing the mutation operation with probability P_m.
The embodiment of the invention also provides an apparatus for generating adversarial samples, which comprises:
the acquisition module is used for acquiring an original text;
the determining module is used for determining a candidate set of replacement words of each word in the original text;
and the generation module is used for searching, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words for a sample that attacks the target model, and generating an adversarial sample.
Optionally, the determining module includes:
the labeling unit is used for labeling the part of speech of each word in the original text;
the first determining unit is used for obtaining the sememe annotation of each sense of each word under a given part of speech and determining words with the same sememe annotation and the same part of speech as candidate replacement words;
and the second determining unit is used for determining the set formed by the candidate replacement words as the candidate set of replacement words.
An embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of any one of the above-mentioned methods for generating an adversarial sample when executing the program.
Embodiments of the present invention further provide a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any one of the above-mentioned methods for generating an adversarial sample.
According to the method and apparatus for generating adversarial samples, the electronic device and the storage medium provided by the embodiments of the invention, an original text is obtained; a candidate set of replacement words is determined for each word in the original text; and, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words is searched for a sample that attacks the target model, so as to generate an adversarial sample. The embodiments search for adversarial samples with particle swarm optimization, a metaheuristic population-based evolutionary computation method that is more efficient than the genetic algorithm, so using it speeds up the search and raises the attack success rate. For different natural language processing models, the embodiments can quickly and efficiently generate a large number of high-quality adversarial samples that successfully deceive the target model and thereby expose its vulnerability, and they therefore have good practicability.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flowchart of a method for generating an adversarial sample according to an embodiment of the present invention;
FIG. 2 is a flowchart of determining a candidate set of replacement words in the method for generating an adversarial sample according to an embodiment of the present invention;
FIG. 3 is a flowchart of searching for an adversarial sample in the method for generating an adversarial sample according to an embodiment of the present invention;
FIG. 4 is a block diagram of an apparatus for generating an adversarial sample according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
FIG. 1 shows a flowchart of a specific implementation of the method for generating an adversarial sample according to an embodiment of the present invention; the method specifically includes:
step S101: acquiring an original text;
step S102: determining a candidate set of replacement words of each word in the original text;
after the original text is acquired, the type of the original text is determined to be a Chinese text or an English text. If the text is English text, word segmentation operation is not needed; if the text is a Chinese text, word segmentation operation is carried out to obtain each word in the original text. And generating candidate replacement words corresponding to the words respectively aiming at the words in the original text. And determining a set consisting of one or more candidate replacement words as a candidate set of replacement words. In order to further ensure the quality of the replaced text, when the original text is English, the candidate replaced word determination operation can be performed after the word form reduction operation is performed on each word in the original text. And the morphology reduction is an important part in text preprocessing, namely, the morphology reduction is to remove affixes of words and extract a main part of the words. For example, the word "cars" is morphed and reduced to "car" and the word "ate" is morphed and reduced to "eat".
Further, the embodiment of the invention can generate, for each word of the original text, a candidate set of replacement words with the same or similar meaning by means of the HowNet sememe knowledge base. Specifically, part-of-speech tagging may be performed on the original text; after the part of speech of each word is obtained, the sememe annotation of each sense of the word under that part of speech is obtained from HowNet, words having the same sememe annotation and the same part of speech are regarded as candidate replacement words, and all candidate replacement words are then combined into the candidate set of replacement words.
Step S103: searching, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words for a sample that attacks the target model, and generating an adversarial sample.
Based on the particle swarm optimization algorithm, an adversarial sample capable of successfully attacking the target model is rapidly searched for in the discrete space formed by the combinations of all the candidate sets of replacement words.
In the method for generating adversarial samples provided by the embodiment of the invention, an original text is obtained; a candidate set of replacement words is determined for each word in the original text; and, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words is searched for a sample that attacks the target model, so as to generate an adversarial sample. The embodiment searches for adversarial samples with particle swarm optimization, a metaheuristic population-based evolutionary computation method that is more efficient than the genetic algorithm, so the search is faster and the attack success rate is higher.
When generating a candidate set of replacement words, a common method is to construct it from synonyms of the words in the original text by means of a synonym dictionary. However, a considerable portion of the words in real text have no synonyms (for example, named entity words), and even for words that do, the number of synonyms is very limited. This results in a smaller number of candidate adversarial samples ultimately being generated, which in turn lowers the attack success rate.
The method for generating adversarial samples provided by the embodiment of the invention instead relies on another knowledge base: HowNet, a linguistic knowledge base that annotates more than 100,000 Chinese and English words with predefined sememes, the smallest semantic units in linguistics. Words with the same sememe annotation can be considered to have the same meaning and can therefore be used as candidate replacement words. Moreover, HowNet annotates sememes for all kinds of words, including content words such as named entities, which ensures that candidate replacement words can be found for most words in real text. The present embodiment can therefore increase the number and diversity of candidate replacement words. As shown in FIG. 2, which is a flowchart of determining a candidate set of replacement words in the adversarial sample generation method provided by the embodiment of the invention, the specific process of determining the candidate set of replacement words in step S102 may include:
step S201: marking the part of speech of each word in the original text;
determining the original text to be a Chinese text, performing word segmentation operation on the original text, and labeling the part of speech of each word after word segmentation; determining that the original text is an English text, restoring each word in the original text to an original shape, and labeling the part of speech of each restored word.
Step S202: obtaining the semantic labels of each semantic item of each word under the same part of speech, and determining the words with the same semantic labels and the same part of speech as candidate replacement words;
step S203: and determining the set formed by the candidate replacement words as the candidate set of replacement words.
In the embodiment of the invention, a candidate set of replacement words with the same or similar meaning is generated for each word of the original text by means of the HowNet sememe knowledge base, which greatly increases the number and diversity of candidate replacement words and in turn raises the attack success rate of the generated adversarial samples.
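A minimal sketch of this sememe-based candidate generation is shown below; the get_sememes helper stands in for a lookup into HowNet (for example via the OpenHowNet toolkit), and its interface, like the other names here, is an assumption rather than part of the embodiment.
```python
# Sketch of sememe-based candidate generation: two words are interchangeable when
# some sense of each, under the same part of speech, carries an identical sememe set.
def get_sememes(word, pos):
    """Hypothetical helper: return a set of frozensets, one per sense of `word`
    under part of speech `pos`, each holding that sense's sememe annotation."""
    raise NotImplementedError("wire this to a HowNet / OpenHowNet lookup")

def build_candidate_set(word, pos, vocabulary):
    """Collect all vocabulary words sharing a full sememe annotation with `word`."""
    word_senses = get_sememes(word, pos)
    candidates = set()
    for other in vocabulary:
        if other == word:
            continue
        if any(sense in word_senses for sense in get_sememes(other, pos)):
            candidates.add(other)
    return candidates
```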
On the basis of any of the above embodiments, and referring to FIG. 3, the specific process of the search algorithm in the method for generating adversarial samples provided by the embodiment of the present invention includes:
step S301: initializing a particle swarm;
and setting the particle swarm size as k, copying the original text k times to obtain initial samples, and performing variation operation on each initial sample once to generate a new particle swarm. Mutation operation refers to randomly selecting a word in a text and replacing it with a random word in its candidate set of replacement words. Each particle in the particle group is a mutated sample, and can also be regarded as an n-dimensional vector, where n is the number of words in the text. The position of the particle in discrete space represents the combination of the alternative words chosen for each word of the sample. We initialize a velocity v randomly for each dimension of each particle.
Step S302: recording the optimal solution;
and recording the particles (global optimal solution) with the highest target label prediction score given by the target model in the particle swarm and the positions (historical optimal solution) with the highest target label prediction score in each particle historical iteration during each iteration. The target label refers to a label that the model is expected to classify the antagonistic sample, such as positive for the original sample label and negative for the target label in the emotion classification task, because it is expected that the antagonistic sample will make the model classification incorrect.
Step S303: judging whether the search can be stopped; if not, proceeding to step S304; if yes, proceeding to step S305.
If the currently recorded best solution (the particle with the highest target-label prediction score) makes the model classify incorrectly, a successful adversarial sample has been found, so the search is stopped and the sample is output. Otherwise, after the particle velocities and positions are updated and the mutation operation is performed, the process returns to the step of recording the global and historical optimal solutions, until the recorded optimal solution is determined to be an adversarial sample, at which point the search is stopped and the corresponding adversarial sample is output.
Step S304: updating the velocity and position of each particle, performing mutation, and returning to step S302 for a new iteration.
At each iteration, the velocity of each particle is updated as follows:
v_n^d = ω·v_n^d + (1 - ω)·[ I(p_n^d, x_n^d) + I(p_g^d, x_n^d) ]
where v_n^d is the velocity of the n-th particle in the d-th dimension, ω is an inertia factor that decreases with the number of iterations, x_n^d is the position of the n-th particle in the d-th dimension, p_n^d is the position in the d-th dimension of the historical optimal solution of the n-th particle, p_g^d is the position in the d-th dimension of the global optimal solution, and I(a, b) is an indicator function that returns +1 or -1 according to whether a and b are equal.
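A sketch of this velocity update follows. The sign convention of the indicator (+1 when the particle's word differs from the best solution's word in that dimension, -1 when it already matches) and the linear annealing of ω between assumed bounds are illustrative choices; the description only states that ω decreases over the iterations.
```python
# Sketch of the velocity update v_n^d = w*v_n^d + (1-w)*[I(p_n^d,x_n^d) + I(p_g^d,x_n^d)].
def indicator(a, b):
    # Assumed convention: push the velocity up where the particle still differs
    # from the best solutions, down where it already agrees with them.
    return -1.0 if a == b else 1.0

def update_velocity(velocity, position, personal_best, global_best,
                    t, T, omega_max=0.8, omega_min=0.2):
    omega = (omega_max - omega_min) * (T - t) / T + omega_min   # decreases with t
    return [omega * velocity[d]
            + (1.0 - omega) * (indicator(personal_best[d], position[d])
                               + indicator(global_best[d], position[d]))
            for d in range(len(velocity))]
```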
after the velocity update is completed, the particles need to undergo a two-step position update. The first step is to move to the historical optimal solution of each particle, and the moving probability is P i . The second step moves to the global optimal solution with the moving probability of P g . Wherein P is i And P g Update with iteration number:
Figure GDA0003914682180000092
Figure GDA0003914682180000093
wherein 1 > P max >P min More than 0 is a predefined hyper-parameter, T is the current iteration number, and T is the maximum iteration number.
In the embodiment of the invention, P_i and P_g are updated with the number of iterations rather than being set to constants: P_i decreases as the number of iterations increases while P_g increases, so that in the early stage of the search the particles explore their own nearby regions to cover more of the unknown space, and in the later stage they search near the currently found optimal solution so as to converge to it as quickly as possible. Experiments verify that, under the same limit on the maximum number of iterations, this setting yields an attack success rate 10-15% higher than setting P_i and P_g to constants.
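The adaptive schedules of P_i and P_g can be written directly from the two formulas above; the default bounds in this sketch are illustrative values.
```python
# P_i shrinks and P_g grows linearly over the iterations, shifting the swarm from
# exploring near each particle to exploiting the global optimal solution.
def pi_pg_schedule(t, T, p_max=0.8, p_min=0.2):
    p_i = p_max - (t / T) * (p_max - p_min)
    p_g = p_min + (t / T) * (p_max - p_min)
    return p_i, p_g
```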
At each step of the position update, once a particle decides to move, its probability of actually moving in each dimension d is determined by the corresponding dimension of its velocity, namely the sigmoid value 1 / (1 + e^(-v_n^d)).
After the velocity and position updates, each particle in the swarm performs the mutation operation with probability P_m.
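The two-step position update and per-dimension movement rule described above might be implemented as in the following sketch; the function and variable names are illustrative.
```python
# Sketch of step S304's position update: move towards the particle's own historical
# best with probability P_i, then towards the global best with probability P_g; once
# a move is decided, each dimension is copied from the chosen best with probability
# sigmoid(v_n^d).
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def move_towards(position, velocity, best):
    return [best[d] if random.random() < sigmoid(velocity[d]) else position[d]
            for d in range(len(position))]

def move_particle(position, velocity, personal_best, global_best, p_i, p_g):
    new_position = list(position)
    if random.random() < p_i:          # first step: towards the historical optimum
        new_position = move_towards(new_position, velocity, personal_best)
    if random.random() < p_g:          # second step: towards the global optimum
        new_position = move_towards(new_position, velocity, global_best)
    return new_position
```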
Step S305: the search is stopped and the sample is output as an adversarial sample.
In the embodiment of the invention, candidate sets of replacement words are generated for the words of the original text by means of sememes, and the particle swarm optimization algorithm then searches the discrete space formed by combining these candidate sets for an adversarial sample capable of successfully attacking the target model. For different natural language processing models, the method can efficiently generate a large number of high-quality adversarial samples that successfully deceive the target model and thereby expose its vulnerability, and it therefore has good practicability.
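Putting the preceding sketches together, the overall search loop of FIG. 3 could be organized as below; is_adversarial is an assumed helper that checks whether the target model now predicts the target label, and all hyper-parameter values are illustrative.
```python
# Illustrative end-to-end PSO attack loop combining the earlier sketches
# (init_swarm, mutate, update_bests, update_velocity, pi_pg_schedule, move_particle).
import random

def is_adversarial(words, target_label):
    """Assumed helper: True when the target model's predicted label is target_label."""
    raise NotImplementedError

def pso_attack(original_words, candidates, target_label,
               k=60, T=20, p_max=0.8, p_min=0.2, p_mutate=0.3):
    particles, velocities = init_swarm(original_words, candidates, k)     # step S301
    personal_best = [list(p) for p in particles]
    personal_score = [float("-inf")] * k
    global_best, global_score = list(original_words), float("-inf")

    for t in range(1, T + 1):
        personal_best, personal_score, global_best, global_score = update_bests(
            particles, target_label, personal_best, personal_score,
            global_best, global_score)                                    # step S302
        if is_adversarial(global_best, target_label):                     # steps S303/S305
            return global_best
        p_i, p_g = pi_pg_schedule(t, T, p_max, p_min)                     # step S304
        for n in range(k):
            velocities[n] = update_velocity(velocities[n], particles[n],
                                            personal_best[n], global_best, t, T)
            particles[n] = move_particle(particles[n], velocities[n],
                                         personal_best[n], global_best, p_i, p_g)
            if random.random() < p_mutate:
                particles[n] = mutate(particles[n], candidates)
    return None   # no adversarial sample found within T iterations
```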
FIG. 4 shows a block diagram of the structure of an apparatus for generating adversarial samples, which specifically includes:
an obtaining module 401, configured to obtain an original text;
a determining module 402, configured to determine a candidate set of replacement words for each word in the original text;
a generating module 403, configured to search, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words for a sample that attacks the target model, and to generate an adversarial sample.
Further, the determining module 402 may further include:
the labeling unit is used for labeling the part of speech of each word in the original text;
the first determining unit is used for obtaining the sememe annotation of each sense of each word under a given part of speech, and determining words with the same sememe annotation and the same part of speech as candidate replacement words;
and the second determining unit is used for determining the set formed by the candidate replacement words as the candidate set of replacement words.
Further, the labeling unit is specifically configured to: if the original text is determined to be a Chinese text, perform word segmentation on the original text and label the part of speech of each segmented word; and if the original text is determined to be an English text, lemmatize each word in the original text and label the part of speech of each lemmatized word.
On the basis of any of the above embodiments, the generating module 403 is specifically configured to: copy the original text k times to obtain initial samples, and perform a mutation operation on each initial sample to generate a particle swarm, wherein each particle in the swarm is a mutated sample; record, at each iteration, the particle with the highest target-label prediction score given by the target model (the global optimal solution) and, for each particle, the position with the highest target-label prediction score over its past iterations (its historical optimal solution); and stop the search and output the adversarial sample when the recorded optimal solution is determined to be an adversarial sample, otherwise update the particle velocities and positions, perform the mutation operation, and return to the recording operation, until the recorded optimal solution is determined to be an adversarial sample and the corresponding adversarial sample is output.
The apparatus for generating adversarial samples of this embodiment is configured to implement the foregoing method for generating adversarial samples, so its specific implementation may refer to the foregoing method embodiments; for example, the obtaining module 401, the determining module 402 and the generating module 403 are respectively configured to implement steps S101, S102 and S103 of the method, and their details, which can be found in the descriptions of the corresponding embodiments above, are not repeated here.
FIG. 5 illustrates a schematic diagram of the physical structure of an electronic device. As shown in FIG. 5, the electronic device may include: a processor 510, a communications interface 520, a memory 530 and a communication bus 540, wherein the processor 510, the communications interface 520 and the memory 530 communicate with each other via the communication bus 540. The processor 510 may call logic instructions in the memory 530 to perform the following method: acquiring an original text; determining a candidate set of replacement words for each word in the original text; and searching, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words for a sample that attacks a target model, to generate an adversarial sample.
In addition, the logic instructions in the memory 530 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In one embodiment, the processor 510 may call logic instructions in the memory 530 to perform the following method: labeling the part of speech of each word in the original text; obtaining the sememe annotation of each sense of each word under a given part of speech, and determining words with the same sememe annotation and the same part of speech as candidate replacement words; and determining the set formed by the candidate replacement words as the candidate set of replacement words.
In one embodiment, the processor 510 may call logic instructions in the memory 530 to perform the following method: if the original text is determined to be a Chinese text, performing word segmentation on the original text and labeling the part of speech of each segmented word; and if the original text is determined to be an English text, lemmatizing each word in the original text and labeling the part of speech of each lemmatized word.
In one embodiment, the processor 510 may call logic instructions in the memory 530 to perform the following method: copying the original text k times to obtain initial samples, and performing a mutation operation on each initial sample to generate a particle swarm, wherein each particle in the particle swarm is a mutated sample; recording, at each iteration, the particle with the highest target-label prediction score given by the target model and, for each particle, the position with the highest target-label prediction score over its past iterations; and stopping the search and outputting the adversarial sample when the recorded optimal solution is determined to be an adversarial sample, otherwise updating the particle velocities and positions, performing the mutation operation, and returning to the recording operation, until the recorded optimal solution is determined to be an adversarial sample and the corresponding adversarial sample is output.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the method provided in the foregoing embodiments, which includes, for example: acquiring an original text; determining a candidate set of replacement words for each word in the original text; and searching, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words for a sample that attacks a target model, to generate an adversarial sample.
The electronic device and the non-transitory computer-readable storage medium provided in the embodiments of the present invention both correspond to the above-mentioned method for generating adversarial samples; for their specific embodiments, reference may be made to the corresponding content above, which is not repeated here.
In summary, according to the method and apparatus for generating adversarial samples, the electronic device and the storage medium provided by the embodiments of the invention, an original text is obtained; a candidate set of replacement words is determined for each word in the original text; and, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words is searched for a sample that attacks the target model, so as to generate an adversarial sample. The embodiments search for adversarial samples with particle swarm optimization, a metaheuristic population-based evolutionary computation method that is more efficient than the genetic algorithm, so the search is faster and the attack success rate is higher. For different natural language processing models, the embodiments can quickly and efficiently generate a large number of high-quality adversarial samples that successfully deceive the target model and thereby expose its vulnerability, and they therefore have good practicability.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. A method for generating an adversarial sample, comprising:
acquiring an original text;
determining a candidate set of replacement words of each word in the original text;
searching, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words for a sample that attacks a target model, to generate an adversarial sample;
copying the original text k times to obtain initial samples, and performing a mutation operation on each initial sample to generate a particle swarm, wherein each particle in the particle swarm is a mutated sample;
recording a global optimal solution and historical optimal solutions of the particle swarm at each iteration, wherein the global optimal solution is the position of the particle with the highest target-label prediction score given by the target model, and the historical optimal solution of each particle is the position with the highest target-label prediction score over that particle's past iterations;
stopping the search and outputting the adversarial sample when the recorded optimal solution is determined to be an adversarial sample, otherwise updating the particle velocities and positions, performing the mutation operation, and returning to the recording operation, until the recorded optimal solution is determined to be an adversarial sample and the corresponding adversarial sample is output;
the updating of the particle velocity and position comprises:
at each iteration, updating the velocity of each particle as follows:
v_n^d = ω·v_n^d + (1 - ω)·[ I(p_n^d, x_n^d) + I(p_g^d, x_n^d) ]
where v_n^d is the velocity of the n-th particle in the d-th dimension, ω is an inertia factor that decreases with the number of iterations, x_n^d is the position of the n-th particle in the d-th dimension, p_n^d is the position in the d-th dimension of the historical optimal solution of the n-th particle, p_g^d is the position in the d-th dimension of the global optimal solution, and I(a, b) is an indicator function that returns +1 or -1 according to whether a and b are equal;
the particle position update comprises: moving towards each particle's own historical optimal solution with probability P_i, and moving towards the global optimal solution with probability P_g, wherein P_i and P_g are updated with the number of iterations:
P_i = P_max - (t/T)·(P_max - P_min)
P_g = P_min + (t/T)·(P_max - P_min)
where 1 > P_max > P_min > 0 are predefined hyper-parameters, t is the current iteration number, and T is the maximum iteration number;
the performing of the mutation operation comprises:
after the velocity and position are updated, each particle in the particle swarm performing the mutation operation with probability P_m.
2. The method for generating an adversarial sample according to claim 1, wherein the determining of a candidate set of replacement words for each word in the original text comprises:
marking the part of speech of each word in the original text;
obtaining the sememe annotation of each sense of each word under a given part of speech, and determining words with the same sememe annotation and the same part of speech as candidate replacement words;
and determining the set of candidate replacement words as the candidate set of replacement words.
3. The method for generating an adversarial sample according to claim 2, wherein the labeling of the part of speech of each word in the original text comprises:
if the original text is determined to be a Chinese text, performing word segmentation on the original text and labeling the part of speech of each segmented word;
and if the original text is determined to be an English text, lemmatizing each word in the original text and labeling the part of speech of each lemmatized word.
4. An apparatus for generating an adversarial sample, comprising:
the acquisition module is used for acquiring an original text;
the determining module is used for determining a candidate set of replacement words of each word in the original text;
the generation module is used for searching, based on a particle swarm optimization algorithm, a discrete space formed by combinations of the candidate sets of replacement words for a sample that attacks a target model, and generating an adversarial sample;
copying the original text k times to obtain initial samples, and performing a mutation operation on each initial sample to generate a particle swarm, wherein each particle in the particle swarm is a mutated sample;
recording a global optimal solution and historical optimal solutions of the particle swarm at each iteration, wherein the global optimal solution is the position of the particle with the highest target-label prediction score given by the target model, and the historical optimal solution of each particle is the position with the highest target-label prediction score over that particle's past iterations;
stopping the search and outputting the adversarial sample when the recorded optimal solution is determined to be an adversarial sample, otherwise updating the particle velocities and positions, performing the mutation operation, and returning to the recording operation, until the recorded optimal solution is determined to be an adversarial sample and the corresponding adversarial sample is output;
the updating of the particle velocity and position comprises:
at each iteration, updating the velocity of each particle as follows:
v_n^d = ω·v_n^d + (1 - ω)·[ I(p_n^d, x_n^d) + I(p_g^d, x_n^d) ]
where v_n^d is the velocity of the n-th particle in the d-th dimension, ω is an inertia factor that decreases with the number of iterations, x_n^d is the position of the n-th particle in the d-th dimension, p_n^d is the position in the d-th dimension of the historical optimal solution of the n-th particle, p_g^d is the position in the d-th dimension of the global optimal solution, and I(a, b) is an indicator function that returns +1 or -1 according to whether a and b are equal;
the particle position update comprises: moving towards each particle's own historical optimal solution with probability P_i, and moving towards the global optimal solution with probability P_g, wherein P_i and P_g are updated with the number of iterations:
P_i = P_max - (t/T)·(P_max - P_min)
P_g = P_min + (t/T)·(P_max - P_min)
where 1 > P_max > P_min > 0 are predefined hyper-parameters, t is the current iteration number, and T is the maximum iteration number;
the performing of the mutation operation comprises:
each particle in the particle swarm performing the mutation operation with probability P_m.
5. The apparatus of claim 4, wherein the determining module comprises:
the labeling unit is used for labeling the part of speech of each word in the original text;
the first determining unit is used for obtaining the sememe annotation of each sense of each word under a given part of speech, and determining words with the same sememe annotation and the same part of speech as candidate replacement words;
and the second determining unit is used for determining the set formed by the candidate replacement words as the candidate set of replacement words.
6. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method for generating an adversarial sample according to any one of claims 1 to 3 when executing the program.
7. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for generating an adversarial sample according to any one of claims 1 to 3.
CN202010317965.9A 2020-04-21 2020-04-21 Method and device for generating countermeasure sample, electronic equipment and storage medium Active CN111652267B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010317965.9A CN111652267B (en) 2020-04-21 2020-04-21 Method and device for generating countermeasure sample, electronic equipment and storage medium
PCT/CN2020/103219 WO2021212675A1 (en) 2020-04-21 2020-07-21 Method and apparatus for generating adversarial sample, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010317965.9A CN111652267B (en) 2020-04-21 2020-04-21 Method and device for generating countermeasure sample, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111652267A CN111652267A (en) 2020-09-11
CN111652267B true CN111652267B (en) 2023-01-31

Family

ID=72346469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010317965.9A Active CN111652267B (en) 2020-04-21 2020-04-21 Method and device for generating countermeasure sample, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN111652267B (en)
WO (1) WO2021212675A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112216273B (en) * 2020-10-30 2024-04-16 东南数字经济发展研究院 Method for resisting sample attack aiming at voice keyword classification network
CN112380845B (en) * 2021-01-15 2021-04-09 鹏城实验室 Sentence noise design method, equipment and computer storage medium
CN113723506B (en) * 2021-08-30 2022-08-05 南京星环智能科技有限公司 Method and device for generating countermeasure sample and storage medium
CN113806490B (en) * 2021-09-27 2023-06-13 中国人民解放军国防科技大学 Text universal trigger generation system and method based on BERT sampling
CN113642678B (en) * 2021-10-12 2022-01-07 南京山猫齐动信息技术有限公司 Method, device and storage medium for generating confrontation message sample
CN113935481B (en) * 2021-10-12 2023-04-18 中国人民解放军国防科技大学 Countermeasure testing method for natural language processing model under condition of limited times
CN113946687B (en) * 2021-10-20 2022-09-23 中国人民解放军国防科技大学 Text backdoor attack method with consistent labels
CN114169443B (en) * 2021-12-08 2024-02-06 西安交通大学 Word-level text countermeasure sample detection method
CN114238661B (en) * 2021-12-22 2024-03-19 西安交通大学 Text discrimination sample detection generation system and method based on interpretable model
CN114444476B (en) * 2022-01-25 2024-03-01 腾讯科技(深圳)有限公司 Information processing method, apparatus, and computer-readable storage medium
CN115034318B (en) * 2022-06-17 2024-05-17 中国平安人寿保险股份有限公司 Method, device, equipment and medium for generating title discrimination model
CN115333869B (en) * 2022-10-14 2022-12-13 四川大学 Distributed network anti-attack self-training learning method
CN116151392B (en) * 2023-02-28 2024-01-09 北京百度网讯科技有限公司 Training sample generation method, training method, recommendation method and device
CN117808095B (en) * 2024-02-26 2024-05-28 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Method and device for generating attack-resistant sample and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619292A (en) * 2019-08-31 2019-12-27 浙江工业大学 Countermeasure defense method based on binary particle swarm channel optimization
CN110767216A (en) * 2019-09-10 2020-02-07 浙江工业大学 Voice recognition attack defense method based on PSO algorithm
CN110930182A (en) * 2019-11-08 2020-03-27 中国农业大学 Improved particle swarm optimization algorithm-based client classification method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11468234B2 (en) * 2017-06-26 2022-10-11 International Business Machines Corporation Identifying linguistic replacements to improve textual message effectiveness
CN109214327B (en) * 2018-08-29 2021-08-03 浙江工业大学 Anti-face recognition method based on PSO
CN109599109B (en) * 2018-12-26 2022-03-25 浙江大学 Confrontation audio generation method and system for white-box scene
CN109887496A (en) * 2019-01-22 2019-06-14 浙江大学 Orientation confrontation audio generation method and system under a kind of black box scene

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619292A (en) * 2019-08-31 2019-12-27 浙江工业大学 Countermeasure defense method based on binary particle swarm channel optimization
CN110767216A (en) * 2019-09-10 2020-02-07 浙江工业大学 Voice recognition attack defense method based on PSO algorithm
CN110930182A (en) * 2019-11-08 2020-03-27 中国农业大学 Improved particle swarm optimization algorithm-based client classification method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Modeling Semantic Compositionality with Sememe Knowledge; Fanchao Qi et al.; Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics; 20190802; pp. 5706-5715 *
Open the Boxes of Words: Incorporating Sememes into Textual Adversarial Attack; Yuan Zang et al.; arXiv:1910.12196v1 [cs.CL]; 20191027; pp. 1-5 *
Textual Adversarial Attack as Combinatorial Optimization; Yuan Zang et al.; arXiv:1910.12196v2 [cs.CL]; 20191110; pp. 1-6 *
Research on Particle Swarm Optimization and Differential Evolution Algorithms (in Chinese); Zhang Qingke; China Doctoral Dissertations Full-text Database, Information Science and Technology; 20170815; pp. 58-61 *

Also Published As

Publication number Publication date
WO2021212675A1 (en) 2021-10-28
CN111652267A (en) 2020-09-11

Similar Documents

Publication Publication Date Title
CN111652267B (en) Method and device for generating countermeasure sample, electronic equipment and storage medium
US11734329B2 (en) System and method for text categorization and sentiment analysis
US11144581B2 (en) Verifying and correcting training data for text classification
US10262272B2 (en) Active machine learning
US9633002B1 (en) Systems and methods for coreference resolution using selective feature activation
KR20220025026A (en) Systems and methods for performing semantic searches using natural language understanding (NLU) frameworks
US20190377793A1 (en) Method and apparatus for establishing a hierarchical intent system
US20120262461A1 (en) System and Method for the Normalization of Text
CN109948140B (en) Word vector embedding method and device
CN111523314B (en) Model confrontation training and named entity recognition method and device
CN111859964A (en) Method and device for identifying named entities in sentences
CN109791570B (en) Efficient and accurate named entity recognition method and device
CN112256842A (en) Method, electronic device and storage medium for text clustering
CN114995903B (en) Class label identification method and device based on pre-training language model
CN114756675A (en) Text classification method, related equipment and readable storage medium
CN111680291A (en) Countermeasure sample generation method and device, electronic equipment and storage medium
CN115062621A (en) Label extraction method and device, electronic equipment and storage medium
CN113934848A (en) Data classification method and device and electronic equipment
WO2024051196A1 (en) Malicious code detection method and apparatus, electronic device, and storage medium
CN115858776B (en) Variant text classification recognition method, system, storage medium and electronic equipment
CN115035890B (en) Training method and device of voice recognition model, electronic equipment and storage medium
CN116578700A (en) Log classification method, log classification device, equipment and medium
CN115495578A (en) Text pre-training model backdoor elimination method, system and medium based on maximum entropy loss
CN115906797A (en) Text entity alignment method, device, equipment and medium
CN115909376A (en) Text recognition method, text recognition model training device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant