CN113449097A - Method and device for generating countermeasure sample, electronic equipment and storage medium


Info

Publication number
CN113449097A
Authority
CN
China
Prior art keywords
word
text
sample
particle
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010213381.7A
Other languages
Chinese (zh)
Inventor
吕中厚
王文华
刘焱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd
Priority to CN202010213381.7A
Publication of CN113449097A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a method and device for generating countermeasure samples (adversarial samples), an electronic device, and a storage medium, relating to the field of natural language processing. The specific implementation scheme is as follows: generate a plurality of particle text samples based on an original text sample of a text classification model; determine whether an effective countermeasure sample for the text classification model exists among the current particle text samples, and if so, output the effective countermeasure sample; if not, update the current particle text samples, take the updated particle text samples as the current particle text samples, and return to the step of determining whether an effective countermeasure sample for the text classification model exists among the current particle text samples. By generating a plurality of particle text samples based on the original text sample and applying the idea of the particle swarm optimization algorithm to the current particle text samples, an effective countermeasure sample for the text classification model is obtained, and the success rate of obtaining effective countermeasure samples is improved.

Description

Method and device for generating countermeasure sample, electronic equipment and storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular to a method and an apparatus for generating a countermeasure sample, an electronic device, and a storage medium.
Background
In order to perform vulnerability assessment and similar analyses on a deep learning model, countermeasure samples (adversarial samples) need to be generated for the model. A countermeasure sample is typically generated by adding subtle perturbation information to an input sample, and typically causes the deep learning model to give an erroneous output with high confidence.
The deep learning model may be a text classification model, and the countermeasure samples generated for a text classification model are text countermeasure samples. One approach to generating text countermeasure samples is as follows: for a text used as an input sample, the idea of a greedy algorithm is used to assign a priority score to every word in the text; the words are sorted in descending order of priority score, the first K words are selected, and the selected words are converted to generate a text countermeasure sample. The text countermeasure sample is input into the text classification model; if the output of the model is opposite to the expected result, the text countermeasure sample is an effective countermeasure sample, otherwise it is an ineffective countermeasure sample.
This scheme for generating text countermeasure samples has the following defect: generating text countermeasure samples with the idea of a greedy algorithm is simple and efficient, but the success rate is low.
Disclosure of Invention
The embodiments of the application provide a method and a device for generating a countermeasure sample, an electronic device, and a storage medium, so as to improve the success rate of obtaining an effective countermeasure sample.
In a first aspect, an embodiment of the present application provides a method for generating a countermeasure sample, including:
generating a plurality of particle text samples based on an original text sample of a text classification model, wherein the plurality of particle text samples are different from the original text sample;
determining whether an effective countermeasure sample for the text classification model exists among a plurality of current particle text samples, and if so, outputting the effective countermeasure sample;
and if not, updating the current particle text samples, taking the updated particle text samples as the current particle text samples, and returning to the step of determining whether an effective countermeasure sample for the text classification model exists among the current particle text samples.
One embodiment in the above application has the following advantages or benefits: after a plurality of particle text samples are generated based on the original text sample, effective countermeasure samples for the text classification model are sought among the current particle text samples using the idea of the particle swarm optimization algorithm, which effectively improves the success rate of obtaining an effective countermeasure sample.
Optionally, generating a plurality of particle text samples based on the original text sample of the text classification model includes:
generating a plurality of backup text samples corresponding to the original text sample of the text classification model;
and, for each backup text sample, selecting at least one word from the current backup text sample and performing content conversion on the selected word to obtain the particle text sample corresponding to the current backup text sample.
One embodiment in the above application has the following advantages or benefits: determining a particle text sample by selecting words and converting the selected words improves the efficiency of obtaining particle text samples.
Optionally, selecting at least one word from the current backup text sample includes:
obtaining the selection probability value corresponding to each word in the current backup text sample, where the selection probability value of each word is determined according to the degree of influence of that word on the detection result of the text classification model;
and selecting at least one word from the current backup text sample based on the selection probability values.
One embodiment in the above application has the following advantages or benefits: determining the words to be converted based on the selection probability values allows the words that have a larger influence on the detection result of the text classification model to be selected, which helps ensure the success rate of generating an effective countermeasure sample after the selected words are converted.
Optionally, determining the selection probability value corresponding to each word according to the influence of each word on the detection result of the text classification model includes:
for each word, obtaining a first classification result and a second classification result, where the first classification result is the classification result output by the text classification model after the original text sample with the current word removed is input into the model, and the second classification result is the classification result output by the text classification model after the original text sample is input into the model; determining the degree of difference between the first classification result and the second classification result, and determining from it the degree of influence of the current word on the detection result of the text classification model;
and determining the selection probability value corresponding to each word according to the degree of influence of each word on the detection result of the text classification model, where a word with a larger degree of influence has a larger selection probability value.
One embodiment in the above application has the following advantages or benefits: by inputting both the sample with the current word removed and the unmodified sample into the text classification model, determining the degree of influence of the current word on the detection result from the difference between the two model outputs, and then determining the selection probability value of the word, the accuracy of the computed selection probability values is improved, and words that strongly influence the model detection result are not missed during word selection.
Optionally, determining the selection probability value corresponding to each word according to the influence of each word on the detection result of the text classification model includes:
sorting the words in ascending order of their degrees of influence;
determining the rank of each word from the sorting result, where a word later in the order has a larger rank;
determining the weight value of each word from its rank, where a larger rank gives a larger weight value;
and determining the selection probability value corresponding to each word from its weight value, where a larger weight value gives a larger selection probability value.
One embodiment in the above application has the following advantages or benefits: determining the weight value of each word from its rank after sorting the words by degree of influence makes the weight value larger for words that influence the detection result of the text classification model more, and hence makes their selection probability values larger; performing content conversion on words with larger selection probability values to obtain particle text samples then increases the probability that a particle text sample is an effective countermeasure sample.
Optionally, updating the current plurality of particle text samples includes:
determining a global optimal sample and a global worst sample among the current plurality of particle text samples;
for each particle text sample in the current plurality of particle text samples, performing a first content conversion operation on the current particle text sample based on the global optimal sample to increase the content closeness between the current particle text sample and the global optimal sample;
and performing a second content conversion operation on the current particle text sample based on the global worst sample to reduce the content closeness between the current particle text sample and the global worst sample.
One embodiment in the above application has the following advantages or benefits: each particle text sample in the particle swarm is updated with the global optimal sample and the global worst sample as objective functions, following the principle of moving one step closer to the global optimal sample and one step away from the global worst sample, which ensures that an effective countermeasure sample is eventually obtained during the iteration and improves the success rate of obtaining effective countermeasure samples.
Optionally, determining the global optimal sample and the global worst sample among the current plurality of particle text samples includes:
obtaining the second classification result and the third classification result corresponding to each particle text sample in the current plurality of particle text samples, where the second classification result is the classification result output by the text classification model after the original text sample is input into the model, and the third classification result is the classification result output by the text classification model after the corresponding particle text sample is input into the model;
determining the third classification result with the lowest closeness to the second classification result, and taking the particle text sample corresponding to that third classification result as the global optimal sample;
and determining the third classification result with the highest closeness to the second classification result, and taking the particle text sample corresponding to that third classification result as the global worst sample.
One embodiment in the above application has the following advantages or benefits: comparing the closeness between the detection results that the text classification model outputs for the original sample and for each particle text sample, and determining the global optimal sample and the global worst sample from the comparison, improves the accuracy of determining the global optimal sample and the global worst sample.
Optionally, performing the first content conversion operation on the current particle text sample based on the global optimal sample includes:
determining a first word position set and a second word position set, where the first word position set is the set of word positions at which the current particle text sample has been content-converted relative to the original text sample, and the second word position set is the set of word positions at which the global optimal sample has been content-converted relative to the original text sample;
determining a third word position set and a fourth word position set, where the third word position set contains the word positions that belong to the first word position set and do not belong to the second word position set, and the fourth word position set contains the word positions that belong to the second word position set and do not belong to the first word position set;
and selecting a first word position in the third word position set and performing a conversion recovery operation on the word at the first word position in the current particle text sample; and selecting a second word position in the fourth word position set and performing content conversion on the word at the second word position in the current particle text sample.
One embodiment in the above application has the following advantages or benefits: each particle text sample in the particle swarm is updated following the principle of moving one step closer to the global optimal sample, so that each updated particle text sample is closer to the global optimal sample, which further ensures the success rate of determining an effective countermeasure sample based on the updated particle text samples.
Optionally, performing the second content conversion operation on the current particle text sample based on the global worst sample includes:
determining the first word position set and a fifth word position set, where the first word position set is the set of word positions at which the current particle text sample has been content-converted relative to the original text sample, and the fifth word position set is the set of word positions at which the global worst sample has been content-converted relative to the original text sample;
determining a sixth word position set and a seventh word position set, where the sixth word position set contains the word positions that belong to the first word position set and belong to the fifth word position set, and the seventh word position set contains the word positions that belong to the first word position set and do not belong to the fifth word position set;
and selecting a third word position in the sixth word position set and performing a conversion recovery operation on the word at the third word position in the current particle text sample; and selecting a fourth word position in the seventh word position set and performing content conversion on the word at the fourth word position in the current particle text sample.
One embodiment in the above application has the following advantages or benefits: each particle text sample in the particle swarm is updated following the principle of moving one step away from the global worst sample, so that each updated particle text sample is farther from the global worst sample, which further ensures the success rate of determining an effective countermeasure sample based on the updated particle text samples.
Optionally, the word is a Chinese word, and performing content conversion on the word includes:
replacing the word with other content, and/or inserting a preset character into the word;
where the other content includes at least one of: wrongly written characters, English, homophones, network slang, pinyin, traditional Chinese characters, and characters with similar shapes.
One embodiment in the above application has the following advantages or benefits: the words are replaced with content that the machine cannot understand while human reading is not affected, and the conversions cover Chinese words, making word conversion more complete.
In a second aspect, an embodiment of the present application further provides an apparatus for generating a countermeasure sample, including:
a particle sample generation module, configured to generate a plurality of particle text samples based on an original text sample of a text classification model, where the plurality of particle text samples are different from the original text sample;
a countermeasure sample determination module, configured to determine whether an effective countermeasure sample for the text classification model exists among the current particle text samples, and if so, output the effective countermeasure sample;
and an iteration module, configured to, if no effective countermeasure sample exists, update the current particle text samples, take the updated particle text samples as the current particle text samples, and return to the step of determining whether an effective countermeasure sample for the text classification model exists among the current particle text samples.
In a third aspect, an embodiment of the present application further provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for generating countermeasure samples described in any embodiment of the present application.
In a fourth aspect, the present application further provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the method for generating countermeasure samples described in any embodiment of the present application.
One embodiment in the above application has the following advantages or benefits: after a plurality of particle text samples are generated based on an original text sample, the idea of the particle swarm optimization algorithm is adopted, that is, the global optimal sample and the global worst sample in the particle swarm are used as objective functions, and the particle text samples in the swarm are updated following the principle of moving one step closer to the global optimal sample and one step away from the global worst sample, so that an effective countermeasure sample is eventually obtained during the iteration and the success rate of obtaining effective countermeasure samples is improved.
Other effects of the above alternatives will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a schematic flow chart of a method for generating a countermeasure sample according to a first embodiment of the present application;
FIG. 2 is a schematic flow chart of a method for generating a countermeasure sample according to a second embodiment of the present application;
FIG. 3 is a schematic flow chart of a method for generating a countermeasure sample according to a third embodiment of the present application;
FIG. 4 is a schematic structural diagram of a countermeasure sample generating device according to a fourth embodiment of the present application;
FIG. 5 is a block diagram of an electronic device for implementing the method for generating countermeasure samples according to embodiments of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, descriptions of well-known functions and constructions are omitted below for clarity and conciseness.
Fig. 1 is a flow chart of a method for generating a countermeasure sample according to a first embodiment of the present application, which is applicable to the case of generating an effective countermeasure sample. The method can be executed by a countermeasure sample generating device, which is implemented in software and/or hardware and is preferably configured in an electronic device such as a server. As shown in fig. 1, the method specifically includes the following steps:
S101, generating a plurality of particle text samples based on the original text sample of the text classification model.
A particle text sample is generated by adding perturbation information to the original text sample and is equivalent to a countermeasure sample. Specifically, when generating a plurality of particle text samples, they can be obtained by modifying or replacing one or more words of the original text sample, so that the generated particle text samples are different from the original text sample. In an alternative embodiment, a plurality of particle text samples may be generated as follows:
S1011, generating a plurality of backup text samples corresponding to the original text sample of the text classification model.
Each backup text sample has the same content as the original text sample.
S1012, for each backup text sample, selecting at least one word from the current backup text sample, and performing content conversion on the selected word to obtain the particle text sample corresponding to the current backup text sample.
In the embodiment of the present application, the selected word is a Chinese word, and the operation of performing content conversion on the selected word includes: replacing the word with other content; and/or inserting a preset character (e.g., a designated letter, symbol, or space) into the word; where the other content includes at least one of: wrongly written characters, English, homophones, network slang, pinyin, traditional Chinese characters, and characters with similar shapes.
It should be noted that each of these conversions leaves human reading essentially unaffected:
• Replacing a word with a wrongly written character: many characters are commonly miswritten in Chinese, so such a substitution does not obstruct a human reader.
• Replacing a Chinese word with English: in the current Chinese-language environment, sentences often mix in English words at key positions (for example, using a word such as "Buffet" inside a Chinese sentence), which has gradually become habitual usage.
• Replacing a character with a homophone: if one word or character in a sentence is replaced by a homophone (for example, in a sentence such as "the cost performance of this mobile phone is really high"), reading is not affected at all.
• Replacing a word with network slang: the development of social networks has produced popular slang equivalents for ordinary words (for example, "酱紫" for "这样子"), and converting a word into the corresponding slang term does not affect human reading.
• Replacing a character with its pinyin: pinyin is particular to Chinese characters, and replacing one character of a word with its pinyin (for example, writing "净" as "jing" in "the room is really clean") remains fully understandable.
• Converting a word into traditional Chinese characters: the content changes, but reading and comprehension are unaffected.
• Inserting a character in the middle of a word: inserting, for example, a space inside a word decomposes it into strings unknown to the machine, which splits the word into two separate tokens, while a human reader is unaffected.
• Replacing a character with a similar-shaped character: Chinese has many characters that look alike but are completely different characters, so content conversion can also be achieved through near-form characters.
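As an illustration of these conversion operations, the following sketch applies one randomly chosen conversion to a Chinese word; the substitution tables are tiny illustrative stand-ins, not data defined by this application:

```python
import random
from typing import Dict, List

# Tiny illustrative substitution tables; a real system would use large
# dictionaries of homophones, near-form characters, pinyin, etc.
HOMOPHONES: Dict[str, List[str]] = {"高": ["膏"], "里": ["理"]}
PINYIN: Dict[str, str] = {"净": "jing", "高": "gao"}

def convert_word(word: str) -> str:
    """Apply one randomly chosen content conversion to a Chinese word."""
    choice = random.choice(["homophone", "pinyin", "space"])
    if choice == "homophone":
        # Swap each character for a homophone when one is available.
        return "".join(random.choice(HOMOPHONES.get(c, [c])) for c in word)
    if choice == "pinyin":
        # Replace characters with their pinyin spelling when known.
        return "".join(PINYIN.get(c, c) for c in word)
    if len(word) > 1:
        # Insert a space so a tokenizer splits the word into unknown parts.
        mid = len(word) // 2
        return word[:mid] + " " + word[mid:]
    return word
```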
S102, determining whether an effective countermeasure sample for the text classification model exists among the current particle text samples; if so, executing S103, otherwise executing S104.
The definition of an effective countermeasure sample is as follows: if a text sample with added perturbation information (i.e., a countermeasure sample) is input into the text classification model and the model gives an erroneous output with high confidence, the text sample is determined to be an effective countermeasure sample. Therefore, to determine whether effective countermeasure samples for the text classification model exist among the current particle text samples, the current particle text samples are simply input into the text classification model in turn, and whether each particle text sample is an effective countermeasure sample is determined from the model output; for example, if the model outputs an erroneous result with high confidence after a particle text sample is input, that particle text sample is determined to be an effective countermeasure sample. Further, after an effective countermeasure sample is determined, S103 is executed to output it; if no effective countermeasure sample exists, S104 is executed to update the particle text samples before the determination is made again.
S103, outputting the effective countermeasure sample.
S104, updating the current particle text samples, taking the updated particle text samples as the current particle text samples, and returning to the step of determining whether an effective countermeasure sample for the text classification model exists among the current particle text samples.
If it is determined in S102 that no effective countermeasure sample exists, the current particle text samples are processed iteratively. Specifically, updating the current particle text samples means performing content conversion on words in the particle text samples again, for example adjusting the positions of the converted words, to obtain new particle text samples. For example, if the third word of a current particle text sample has been content-converted, updating the particle text sample may restore the third word and convert the fourth word instead. After the updated particle text samples are obtained, the process returns to step S102 to determine whether effective countermeasure samples for the text classification model exist among the updated particle text samples; if none exists, the operations of updating the current particle text samples and checking for effective countermeasure samples are repeated. It should be noted that an iteration count threshold is preset, and the process ends when the number of iterations (i.e., updates) performed on the current particle text samples reaches that threshold.
In the embodiment of the application, after a plurality of particle text samples are generated based on the original text sample, the idea of the particle swarm optimization algorithm is adopted: the current particle text samples are updated, and effective countermeasure samples for the text classification model are then sought among the updated particle text samples, which effectively improves the success rate of obtaining an effective countermeasure sample.
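A minimal sketch of the S101-S104 loop follows, assuming placeholder functions make_particles, is_effective, and update_particles for the operations described above:

```python
from typing import Callable, List, Optional

def generate_countermeasure_sample(
        original: str,
        make_particles: Callable[[str], List[str]],
        is_effective: Callable[[str], bool],
        update_particles: Callable[[List[str]], List[str]],
        max_iterations: int = 100) -> Optional[str]:
    """Sketch of the S101-S104 loop: generate particle text samples,
    test them, and update them until an effective countermeasure sample
    is found or the preset iteration threshold is reached."""
    particles = make_particles(original)          # S101
    for _ in range(max_iterations):
        for p in particles:                       # S102
            if is_effective(p):
                return p                          # S103
        particles = update_particles(particles)   # S104
    return None
```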
Fig. 2 is a schematic flow chart of a method for generating a countermeasure sample according to a second embodiment of the present application, further optimized on the basis of the above embodiment. As shown in fig. 2, the method specifically includes the following steps:
S201, generating a plurality of backup text samples corresponding to the original text sample of the text classification model.
S202, for each backup text sample, selecting at least one word from the current backup text sample and performing content conversion on the selected word to obtain the particle text sample corresponding to the current backup text sample.
In the embodiment of the application, the success rate of determining an effective countermeasure sample from the particle text samples can be ensured only if the at least one word selected from the current backup text sample has a significant influence on the output result of the text classification model. In an alternative embodiment, the operation of selecting at least one word from the current backup text sample includes:
S2021, obtaining the selection probability value corresponding to each word in the current backup text sample.
The selection probability value of each word is determined according to the influence of that word on the detection result of the text classification model: the larger a word's influence on the detection result, the larger its selection probability value, i.e., the larger its probability of being selected.
In an optional implementation, determining the selection probability value corresponding to each word according to its influence on the detection result of the text classification model includes: for each word, obtaining a first classification result and a second classification result, where the first classification result is the classification result output by the text classification model after the original text sample with the current word removed is input into the model, and the second classification result is the classification result output by the model after the original text sample is input into it; determining the degree of difference between the first classification result and the second classification result, and determining from it the degree of influence of the current word on the detection result of the text classification model. The larger the difference between the first classification result and the second classification result, the larger the influence of the current word on the detection result of the classification model.
In the embodiment of the application, the text classification model may be an emotion analysis model, a politics-related content detection model, a pornography detection model, a textual entailment model, an advertisement detection model, a malicious speech detection model, or the like. Taking an emotion analysis model as an example: for each word in the backup text sample, the emotion scores (neg_score_i, pos_score_i) of the text sample with that word removed are detected based on the emotion analysis model, and the emotion scores (neg_score, pos_score) of the original text sample are detected based on the same model, where pos_score_i and pos_score denote positive emotion scores and neg_score_i and neg_score denote negative emotion scores. When the original text sample is detected as positive emotion, i.e., pos_score > neg_score, the second classification result is pos_score, and the degree of difference between the first classification result and the second classification result is delta_score_i = pos_score - pos_score_i; the larger delta_score_i is, the larger the influence of the removed word on the model detection result. In an alternative embodiment, after the degree of difference between the first classification result and the second classification result is obtained, the difference value is input into a preset function, and the degree of influence of the removed word on the model detection result is determined from the calculation result.
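The leave-one-out computation of delta_score_i can be sketched as follows, assuming a sentiment function that returns (neg_score, pos_score) for a text and that the original sample is detected as positive:

```python
from typing import Callable, List, Tuple

def influence_scores(words: List[str],
                     sentiment: Callable[[str], Tuple[float, float]]
                     ) -> List[float]:
    """Compute delta_score_i = pos_score - pos_score_i for every word,
    where pos_score_i is the positive score of the text with word i
    removed; assumes sentiment(text) returns (neg_score, pos_score)."""
    _, pos_score = sentiment("".join(words))
    deltas = []
    for i in range(len(words)):
        _, pos_score_i = sentiment("".join(words[:i] + words[i + 1:]))
        deltas.append(pos_score - pos_score_i)
    return deltas
```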
Further, the selection probability value corresponding to each word is determined according to the degree of influence of each word on the detection result of the text classification model, where a word with a larger degree of influence has a larger selection probability value, so that the words that strongly influence the detection result can be selected. It should be noted that the reason this application does not select words based on a word priority scoring mechanism is that the content-converted words appearing in effective countermeasure samples are often not among the words selected by such a mechanism; that is, when words are selected based on the existing word priority scoring mechanism, the success rate of generating an effective countermeasure sample by content-converting the selected words is low.
Further, determining the selection probability value corresponding to each word according to its influence on the detection result of the text classification model includes:
sorting the words in ascending order of their degrees of influence; determining the rank rank_i of each word from the sorting result, where a word later in the order has a larger rank; determining the weight value w_i of each word from its rank (the original gives the weight formula only as an image, with R a coefficient; the weight increases with rank_i, so a word with a larger rank has a larger weight value); and determining the selection probability value of each word from its weight value by normalizing over all N words of the text sample (the probability formula is likewise given only as an image), so that a word with a larger weight value has a larger selection probability value.
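Because the weight and probability formulas appear only as images in the original, the sketch below assumes the simple stand-in w_i = rank_i (omitting the coefficient R) and normalizes over all N words; it is illustrative, not the patent's exact formula:

```python
import random
from typing import List

def selection_probabilities(deltas: List[float]) -> List[float]:
    """Rank words by ascending influence (larger rank = more influence)
    and normalize the assumed weights w_i = rank_i over all N words."""
    n = len(deltas)
    order = sorted(range(n), key=lambda i: deltas[i])
    rank = [0] * n
    for pos, i in enumerate(order):
        rank[i] = pos + 1            # later in the order -> larger rank
    total = sum(rank)
    return [rank[i] / total for i in range(n)]

def pick_word_positions(probs: List[float], k: int) -> List[int]:
    # Draw k positions according to the selection probabilities
    # (with replacement, for simplicity of the sketch).
    return random.choices(range(len(probs)), weights=probs, k=k)
```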
S2022, selecting at least one word from the current backup text sample based on the selection probability values.
In the current backup text sample, a word with a larger selection probability value has a larger influence on the detection result of the text classification model; such words are selected and content-converted to obtain the particle text sample corresponding to the current backup text sample, which helps ensure the success rate of generating an effective countermeasure sample.
S203, determining whether an effective countermeasure sample for the text classification model exists among the current particle text samples; if so, executing S204, otherwise executing S205.
S204, outputting the effective countermeasure sample.
S205, updating the current particle text samples, taking the updated particle text samples as the current particle text samples, and returning to the step of determining whether an effective countermeasure sample for the text classification model exists among the current particle text samples.
In the embodiment of the application, the words to be converted are determined based on the selection probability values, so the words that have a larger influence on the detection result of the text classification model can be selected, which helps ensure the success rate of generating an effective countermeasure sample after the selected words are converted. Moreover, when selecting words, the text sample with a word removed and the text sample without that word removed are input into the text classification model respectively, the degree of influence of the word on the detection result is determined from the difference between the model outputs, and the selection probability value of the word is then determined; this improves the accuracy of the computed selection probability values and avoids missing words that strongly influence the model detection result during word selection.
Fig. 3 is a schematic flow chart of a method for generating a countermeasure sample according to a third embodiment of the present application, further optimized on the basis of the above embodiments. As shown in fig. 3, the method specifically includes the following steps:
S301, generating a plurality of particle text samples based on the original text sample of the text classification model, where the plurality of particle text samples are different from the original text sample.
S302, determining whether an effective countermeasure sample for the text classification model exists among the current particle text samples; if so, executing S303, otherwise executing S304-S306 to update the current particle text samples.
S303, outputting the effective countermeasure sample.
S304, determining the global optimal sample and the global worst sample among the current particle text samples.
The global optimal sample is the particle text sample closest to an effective countermeasure sample, and the global worst sample is the particle text sample farthest from an effective countermeasure sample. In an alternative embodiment, determining the global optimal sample and the global worst sample among the current plurality of particle text samples includes:
S3041, obtaining the second classification result and the third classification result corresponding to each particle text sample in the current plurality of particle text samples, where the second classification result is the classification result output by the text classification model after the original text sample is input into the model, and the third classification result is the classification result output by the model after the corresponding particle text sample is input into it.
S3042, determining the third classification result with the lowest closeness to the second classification result, and taking the particle text sample corresponding to that third classification result as the global optimal sample. The lowest closeness between a third classification result and the second classification result means the largest difference between the two classification results, so the corresponding particle text sample is the one closest to an effective countermeasure sample, i.e., the global optimal sample.
S3043, determining the third classification result with the highest closeness to the second classification result, and taking the particle text sample corresponding to that third classification result as the global worst sample. The highest closeness between a third classification result and the second classification result means the smallest difference between the two classification results, so the corresponding particle text sample changes the least relative to the original text sample, has the least influence on the classification result, and is the least likely to be an effective countermeasure sample; it is therefore the global worst sample.
Illustratively, the text classification model is an emotion analysis model. After the particle text samples are input into the emotion analysis model, the model outputs the emotion scores (neg_score_i, pos_score_i) corresponding to each particle text sample, where neg_score_i is the negative emotion score and pos_score_i is the positive emotion score. The particle text sample with the smallest pos_score_i is taken as the global optimal sample, and the one with the largest pos_score_i is taken as the global worst sample. It should be noted here that if the emotion result corresponding to the global optimal sample is negative, i.e., pos_score_i < neg_score_i, then the global optimal sample is already a generated effective countermeasure sample.
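Under this sentiment-analysis example, selecting the global optimal and worst samples can be sketched as follows (the sentiment function is the same assumed placeholder as above):

```python
from typing import Callable, List, Tuple

def best_and_worst(particles: List[str],
                   sentiment: Callable[[str], Tuple[float, float]]
                   ) -> Tuple[str, str]:
    """For an originally positive sample: the particle with the smallest
    pos_score_i is the global optimal sample, the one with the largest
    pos_score_i is the global worst sample."""
    scored = [(sentiment(p)[1], p) for p in particles]
    best = min(scored, key=lambda s: s[0])[1]
    worst = max(scored, key=lambda s: s[0])[1]
    return best, worst
```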
Further, after the global optimal sample and the global worst sample are determined, based on the idea of the particle swarm optimization algorithm, the particle text samples in the particle swarm are updated following the principle of moving one step closer to the optimal sample and one step away from the worst sample. See S305 and S306 for details.
S305, for each particle text sample in the current plurality of particle text samples, performing a first content conversion operation on the current particle text sample based on the global optimal sample, so as to increase the content closeness between the current particle text sample and the global optimal sample.
In an alternative embodiment, performing a first content conversion operation on the current particle text sample based on the globally optimal sample includes:
S3051, determining a first word position set and a second word position set, where the first word position set is the set of word positions at which the current particle text sample has been content-converted relative to the original text sample, and the second word position set is the set of word positions at which the global optimal sample has been content-converted relative to the original text sample.
S3052, determining a third word position set and a fourth word position set, where the third word position set contains the word positions that belong to the first word position set and do not belong to the second word position set, and the fourth word position set contains the word positions that belong to the second word position set and do not belong to the first word position set.
S3053, selecting a first word position in the third word position set and performing a conversion recovery operation on the word at the first word position in the current particle text sample; and selecting a second word position in the fourth word position set and performing content conversion on the word at the second word position in the current particle text sample.
In the embodiment of the present application, the third word position set containing the word positions that belong to the first word position set and do not belong to the second word position set indicates that, in the current particle text sample, the content conversions of the words at those positions are not optimal, so the words at those positions need to be restored; after recovery, the words at those positions are the same as the words at the corresponding positions in the original sample.
The fourth word position set containing the word positions that belong to the second word position set and do not belong to the first word position set indicates that the content conversions at those positions in the global optimal sample are optimal but have not yet been applied in the current particle text sample. Therefore, in the current particle text sample, the words at the second word positions in the fourth word position set need to be content-converted, so that the current particle text sample after conversion is closer to the global optimal sample.
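A sketch of the first content conversion operation follows, assuming samples are represented as word lists of equal length so that converted positions can be found by word-by-word comparison; convert is an assumed placeholder for the content conversion of S1012:

```python
import random
from typing import Callable, List

def move_toward_best(particle: List[str], original: List[str],
                     best: List[str],
                     convert: Callable[[str], str]) -> List[str]:
    """First content conversion operation (S3051-S3053): recover one
    position converted only here, convert one position converted only
    in the global optimal sample."""
    first = {i for i, (p, o) in enumerate(zip(particle, original)) if p != o}
    second = {i for i, (b, o) in enumerate(zip(best, original)) if b != o}
    third = first - second    # converted here but not in the best sample
    fourth = second - first   # converted in the best sample but not here
    updated = list(particle)
    if third:
        i = random.choice(sorted(third))
        updated[i] = original[i]            # conversion recovery
    if fourth:
        j = random.choice(sorted(fourth))
        updated[j] = convert(original[j])   # content conversion
    return updated
```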
S306, performing a second content conversion operation on the current particle text sample based on the global worst sample to reduce the content closeness between the current particle text sample and the global worst sample.
In an optional implementation, performing a second content conversion operation on the current particle text sample based on the global worst sample includes:
S3061, determining the first word position set and a fifth word position set, where the first word position set is the set of word positions at which the current particle text sample has been content-converted relative to the original text sample, and the fifth word position set is the set of word positions at which the global worst sample has been content-converted relative to the original text sample.
S3062, determining a sixth word position set and a seventh word position set, where the sixth word position set contains the word positions that belong to the first word position set and belong to the fifth word position set, and the seventh word position set contains the word positions that belong to the first word position set and do not belong to the fifth word position set.
S3063, selecting a third word position in the sixth word position set and performing a conversion recovery operation on the word at the third word position in the current particle text sample; and selecting a fourth word position in the seventh word position set and performing content conversion on the word at the fourth word position in the current particle text sample.
The sixth word position set containing the word positions that belong to both the first word position set and the fifth word position set indicates that, in the current particle text sample, the content conversions of the words at those positions are the worst, so the words at those positions need to be recovered. Then, a word position in the current particle text sample that does not belong to the fifth word position set can be selected, and the word at that position content-converted. For example, the seventh word position set is determined to contain the word positions that belong to the first word position set and do not belong to the fifth word position set, a fourth word position in the seventh word position set is selected, and the word at the fourth word position in the current particle text sample is content-converted.
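The second content conversion operation can be sketched under the same assumptions as the sketch above; positions in the seventh set are re-converted to fresh content at positions the worst sample leaves untouched:

```python
import random
from typing import Callable, List

def move_away_from_worst(particle: List[str], original: List[str],
                         worst: List[str],
                         convert: Callable[[str], str]) -> List[str]:
    """Second content conversion operation (S3061-S3063): recover one
    position shared with the global worst sample, then convert at a
    position the worst sample leaves untouched."""
    first = {i for i, (p, o) in enumerate(zip(particle, original)) if p != o}
    fifth = {i for i, (w, o) in enumerate(zip(worst, original)) if w != o}
    sixth = first & fifth     # converted both here and in the worst sample
    seventh = first - fifth   # converted here but not in the worst sample
    updated = list(particle)
    if sixth:
        i = random.choice(sorted(sixth))
        updated[i] = original[i]            # conversion recovery
    if seventh:
        j = random.choice(sorted(seventh))
        updated[j] = convert(original[j])   # convert afresh where the
                                            # worst sample is unchanged
    return updated
```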
S307, taking the plurality of updated particle text samples as the current plurality of particle text samples, and returning to the step of determining whether an effective countermeasure sample for the text classification model exists among the current particle text samples.
In this embodiment, the global optimal sample and the global worst sample in the particle swarm are used as objective functions, and the particle text samples in the swarm are updated following the principle of moving one step closer to the global optimal sample and one step away from the global worst sample, which ensures that an effective countermeasure sample is eventually obtained during the iteration and improves the success rate of obtaining effective countermeasure samples.
Fig. 4 is a schematic structural diagram of a countermeasure sample generating device according to a fourth embodiment of the present application, which is applicable to the case of generating an effective countermeasure sample. The device can implement the method for generating countermeasure samples described in any embodiment of the present application. As shown in fig. 4, the apparatus 400 specifically includes:
a particle sample generation module 401, configured to generate a plurality of particle text samples based on an original text sample of a text classification model, where the plurality of particle text samples are different from the original text sample;
a countermeasure sample determination module 402, configured to determine whether an effective countermeasure sample for the text classification model exists among the current plurality of particle text samples, and if so, output the effective countermeasure sample;
and an iteration module 403, configured to, if no effective countermeasure sample exists, update the current plurality of particle text samples, take the updated particle text samples as the current plurality of particle text samples, and return to the step of determining whether an effective countermeasure sample for the text classification model exists among the current plurality of particle text samples.
Optionally, the particle sample generation module includes:
a backup unit, configured to generate a plurality of backup text samples corresponding to the original text sample of the text classification model;
and a particle sample generating unit, configured to, for each backup text sample, select at least one word from the current backup text sample and perform content conversion on the selected word to obtain the particle text sample corresponding to the current backup text sample.
Optionally, the particle sample generating unit includes:
a selection probability value obtaining subunit, configured to obtain the selection probability value corresponding to each word in the current backup text sample, where the selection probability value of each word is determined according to the degree of influence of that word on the detection result of the text classification model;
and a word selecting subunit, configured to select at least one word from the current backup text sample based on the selection probability values.
Optionally, the apparatus includes a selection probability value determining module, which includes:
a classification result acquisition unit, configured to acquire, for each word, a first classification result and a second classification result, where the first classification result is the classification result output by the text classification model after the original text sample with the current word removed is input into the model, and the second classification result is the classification result output by the model after the original text sample is input into it; and to determine the degree of difference between the first classification result and the second classification result, and determine from it the degree of influence of the current word on the detection result of the text classification model;
and a selection probability value determining unit, configured to determine the selection probability value corresponding to each word according to the degree of influence of each word on the detection result of the text classification model, where a word with a larger degree of influence has a larger selection probability value.
Optionally, the selection probability value determining unit is specifically configured to:
sort the words in ascending order of their degrees of influence;
determine the rank of each word from the sorting result, where a word later in the order has a larger rank;
determine the weight value of each word from its rank, where a larger rank gives a larger weight value;
and determine the selection probability value corresponding to each word from its weight value, where a larger weight value gives a larger selection probability value.
Optionally, the iteration module includes:
the optimal and worst sample determining unit is used for determining a global optimal sample and a global worst sample in the current particle text samples;
a first conversion unit, configured to, for each particle text sample of the current plurality of particle text samples, perform a first content conversion operation on the current particle text sample based on the global optimal sample, so as to improve content closeness between the current particle text sample and the global optimal sample;
a second conversion unit, configured to perform a second content conversion operation on the current particle text sample based on the global worst sample, so as to reduce content closeness between the current particle text sample and the global worst sample.
Optionally, the optimal and worst sample determining unit includes:
the classification result obtaining subunit is used for obtaining a second classification result and a third classification result corresponding to each particle text sample in the current multiple particle text samples; the second classification result is a classification result output by the text classification model after the original text sample is input into the text classification model; the third classification result is a classification result output by the text classification model after the corresponding particle text sample is input into the text classification model;
the optimal sample determining subunit is used for determining a third classification result which is the lowest in proximity to the second classification result, and taking the particle text sample corresponding to the third classification result with the lowest proximity as a global optimal sample;
and the worst sample determining subunit is used for determining a third classification result which is the highest in proximity to the second classification result, and taking the particle text sample corresponding to the third classification result with the highest proximity as a global worst sample.
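A possible reading of this unit in code is given below; the proximity between two classification results is sketched here as a negative L1 distance (closer outputs score higher), which is an assumption rather than a metric prescribed by the patent:

```python
# Pick the global optimal sample (output least close to the original's output)
# and the global worst sample (output closest to the original's output).
def best_and_worst(particles, original_words, classify):
    base = classify(original_words)              # second classification result
    def proximity(particle):
        out = classify(particle)                 # third classification result
        return -sum(abs(a - b) for a, b in zip(out, base))
    best = min(particles, key=proximity)         # lowest proximity: global optimal
    worst = max(particles, key=proximity)        # highest proximity: global worst
    return best, worst
```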
Optionally, the first conversion unit is specifically configured to:
determining a first set of word positions and a second set of word positions; wherein the first set of word positions is a set of word positions for which the current particle text sample is content converted relative to the original text sample, and the second set of word positions is a set of word positions for which the global optimal sample is content converted relative to the original text sample;
determining a third set of word positions and a fourth set of word positions; wherein the third set of word positions comprises word positions that belong to the first set of word positions and do not belong to the second set of word positions, and the fourth set of word positions comprises word positions that belong to the second set of word positions and do not belong to the first set of word positions;
selecting a first word position in the third word position set, and executing conversion recovery operation on a word corresponding to the first word position in the current particle text sample; and selecting a second word position in the fourth word position set, and performing content conversion on a word corresponding to the second word position in the current particle text sample.
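The set manipulations of the first conversion unit might be sketched as follows; converted positions are detected by simple word inequality, and choosing one position per set at random is an assumption, since the embodiment leaves the selection rule open:

```python
import random

# First content conversion: move the particle one step toward the global
# optimal sample by undoing a conversion it does not share (third set) and
# applying a conversion it is missing (fourth set).
def move_toward(particle, original_words, best, convert_word):
    first = {i for i, w in enumerate(particle) if w != original_words[i]}
    second = {i for i, w in enumerate(best) if w != original_words[i]}
    third, fourth = first - second, second - first
    if third:
        i = random.choice(sorted(third))
        particle[i] = original_words[i]                # conversion recovery operation
    if fourth:
        j = random.choice(sorted(fourth))
        particle[j] = convert_word(original_words[j])  # content conversion
    return particle
```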
Optionally, the second conversion unit is specifically configured to:
determining a first set of word positions and a fifth set of word positions; wherein the first set of word positions is a set of word positions for which the current particle text sample has been content converted relative to the original text sample, and the fifth set of word positions is a set of word positions for which the global worst sample has been content converted relative to the original text sample;
determining a sixth set of word positions and a seventh set of word positions; wherein the sixth set of word positions comprises word positions belonging to both the first set of word positions and the fifth set of word positions, and the seventh set of word positions comprises word positions belonging to the first set of word positions and not belonging to the fifth set of word positions;
selecting a third word position in the sixth word position set, and executing conversion recovery operation on a word corresponding to the third word position in the current particle text sample; and selecting a fourth word position in the seventh word position set, and performing content conversion on a word corresponding to the fourth word position in the current particle text sample.
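The second conversion unit is the mirror image, sketched below under the same assumptions; note that it uses the fifth set (the worst sample's conversions) where the first unit used the second set:

```python
import random

# Second content conversion: move the particle one step away from the global
# worst sample by undoing a conversion it shares with that sample (sixth set)
# and re-converting a position unique to the particle (seventh set).
def move_away(particle, original_words, worst, convert_word):
    first = {i for i, w in enumerate(particle) if w != original_words[i]}
    fifth = {i for i, w in enumerate(worst) if w != original_words[i]}
    sixth, seventh = first & fifth, first - fifth
    if sixth:
        i = random.choice(sorted(sixth))
        particle[i] = original_words[i]                # conversion recovery operation
    if seventh:
        j = random.choice(sorted(seventh))
        particle[j] = convert_word(original_words[j])  # fresh content conversion
    return particle
```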
Optionally, the words are Chinese vocabulary; the apparatus comprises a content conversion module for performing content conversion on the words, the module being configured to:
replacing words with other content; and/or, inserting preset characters into the words;
the other contents include: at least one of wrongly written or mispronounced characters, English, homophones, network vocabularies, pinyin, traditional Chinese characters and characters with similar shapes.
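A toy version of such conversions is sketched below; the lookup tables are fabricated two-entry examples rather than data from the patent, and the asterisk stands in for an arbitrary preset character:

```python
import random

# Illustrative word-level conversions: replace a word with a homophone or its
# pinyin, or insert a preset character into the word.
HOMOPHONES = {"微信": ["威信", "薇信"]}   # homophone substitutes (fabricated examples)
PINYIN = {"微信": "weixin"}               # pinyin substitutes (fabricated examples)

def convert_word(word):
    candidates = list(HOMOPHONES.get(word, []))
    if word in PINYIN:
        candidates.append(PINYIN[word])
    if candidates:
        return random.choice(candidates)      # replace the word with other content
    if len(word) > 1:
        return word[0] + "*" + word[1:]       # insert a preset character
    return word
```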
The generation apparatus 400 of the countermeasure sample provided by the embodiment of the present application can execute the generation method of the countermeasure sample provided by any embodiment of the present application, and has the corresponding functional modules and beneficial effects for executing the method. For details not explicitly described in this embodiment, reference may be made to the description of any method embodiment of the present application.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the present application described and/or claimed herein.
As shown in fig. 5, the electronic device includes: one or more processors 501, a memory 502, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 5, one processor 501 is taken as an example.
The memory 502 is a non-transitory computer-readable storage medium as provided herein. The memory stores instructions executable by at least one processor, so as to cause the at least one processor to perform the method of generating a countermeasure sample provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the method of generating a countermeasure sample provided herein.
The memory 502, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for generating a countermeasure sample in the embodiment of the present application (e.g., the particle sample generation module 401, the countermeasure sample determination module 402, and the iteration module 403 shown in fig. 4). The processor 501 executes various functional applications and data processing of the server by running the non-transitory software programs, instructions, and modules stored in the memory 502, that is, implements the generation method of the countermeasure sample in the above-described method embodiment.
The memory 502 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to use of the electronic device that implements the generation method of the countermeasure sample of the embodiment of the present application, and the like. Further, the memory 502 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 502 may optionally include memory located remotely from the processor 501, which may be connected via a network to the electronic device implementing the countermeasure sample generation method of the embodiments of the present application. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device implementing the method for generating the countermeasure sample according to the embodiment of the present application may further include: an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic apparatus implementing the method of generating the countermeasure sample of the embodiment of the present application; examples include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 504 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special- or general-purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solutions of the embodiments of the present application, after a plurality of particle text samples are generated based on an original text sample, the idea of a particle swarm optimization algorithm is adopted: with the global optimal sample and the global worst sample in the particle swarm serving as the objectives, the particle text samples in the swarm are updated on the principle of moving one step closer to the global optimal sample and one step farther from the global worst sample, so that an effective countermeasure sample is finally obtained in the iteration process, which improves the success rate of obtaining an effective countermeasure sample.
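Reusing the helper functions sketched in the earlier examples, the overall flow summarized above might be assembled as follows; `is_valid` is an assumed test that a particle already flips the model's predicted class, and the swarm size and iteration budget are illustrative defaults:

```python
# End-to-end sketch of the iteration: initialize a swarm, check for an
# effective countermeasure sample, otherwise update every particle one step
# toward the global optimal sample and one step away from the global worst.
def is_valid(particle, original_words, classify):
    out, base = classify(particle), classify(original_words)
    return out.index(max(out)) != base.index(max(base))  # predicted class flipped

def generate_countermeasure_sample(original_words, classify, convert_word,
                                   swarm_size=20, max_iter=50):
    probs = rank_probabilities(word_influence(original_words, classify))
    swarm = init_particles(original_words, swarm_size, probs, convert_word)
    for _ in range(max_iter):
        for particle in swarm:
            if is_valid(particle, original_words, classify):
                return particle              # output the effective countermeasure sample
        best, worst = best_and_worst(swarm, original_words, classify)
        swarm = [move_away(move_toward(p, original_words, best, convert_word),
                           original_words, worst, convert_word)
                 for p in swarm]
    return None                              # no effective sample within the budget
```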
It should be understood that the various forms of flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, and the present application is not limited thereto, as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (22)

1. A method of generating a countermeasure sample, comprising:
generating a plurality of particle text samples based on an original text sample of a text classification model, wherein the plurality of particle text samples are different from the original text sample;
determining whether effective countermeasure samples aiming at the text classification model exist in a plurality of current particle text samples, and if so, outputting the effective countermeasure samples;
and if not, updating the current particle text samples, taking the updated particle text samples as the current particle text samples, and returning to execute the step of determining whether the effective countermeasure samples aiming at the text classification model exist in the current particle text samples.
2. The method of claim 1, wherein generating a plurality of particle text samples based on raw text samples of a text classification model comprises:
generating a plurality of backup text samples corresponding to original text samples of the text classification model;
and aiming at each backup text sample, selecting at least one word from the current backup text sample, and performing content conversion on the selected word to obtain a particle text sample corresponding to the current backup text sample.
3. The method of claim 2, wherein selecting at least one word from the current backup text sample comprises:
obtaining selection probability values corresponding to all words in the current backup text sample; the selection probability value corresponding to each word is determined according to the influence degree of each word on the detection result of the text classification model;
selecting at least one word from the current backup text sample based on the selection probability value.
4. The method of claim 3, wherein determining the selection probability value corresponding to each word according to the influence of each word on the detection result of the text classification model comprises:
aiming at each word, acquiring a first classification result and a second classification result; the first classification result is a classification result output by the text classification model after the original text sample without the current word is input into the text classification model; the second classification result is a classification result output by the text classification model after the original text sample is input into the text classification model;
determining the difference degree between the first classification result and the second classification result, and determining the influence degree of the current word on the detection result of the text classification model according to the difference degree;
determining a selection probability value corresponding to each word according to the influence degree of each word on the detection result of the text classification model; wherein a word with a larger influence degree corresponds to a larger selection probability value.
5. The method of claim 4, wherein determining the selection probability value corresponding to each word according to the influence of each word on the detection result of the text classification model comprises:
sorting the words in ascending order of their influence degrees;
determining a ranking number for each word according to the sorting result, wherein a word positioned later in the order has a larger ranking number;
determining a weight value for each word according to its ranking number, wherein a larger ranking number corresponds to a larger weight value;
and determining the selection probability value corresponding to each word according to its weight value, wherein a larger weight value corresponds to a larger selection probability value.
6. The method of claim 1, wherein updating the current plurality of particle text samples comprises:
determining a global optimal sample and a global worst sample in a plurality of current particle text samples;
for each particle text sample in the current plurality of particle text samples, performing a first content conversion operation on the current particle text sample based on the globally optimal sample to improve content closeness between the current particle text sample and the globally optimal sample;
subjecting the current particle text sample to a second content conversion operation based on the global worst sample to reduce content closeness between the current particle text sample and the global worst sample.
7. The method of claim 6, wherein determining a global optimal sample and a global worst sample of the current plurality of particle text samples comprises:
obtaining a second classification result and a third classification result corresponding to each particle text sample in the plurality of current particle text samples; the second classification result is a classification result output by the text classification model after the original text sample is input into the text classification model; the third classification result is a classification result output by the text classification model after the corresponding particle text sample is input into the text classification model;
determining a third classification result with the lowest closeness to the second classification result, and taking a particle text sample corresponding to the third classification result with the lowest closeness as a global optimal sample;
and determining a third classification result with highest closeness to the second classification result, and taking the particle text sample corresponding to the third classification result with highest closeness as a global worst sample.
8. The method of claim 6, wherein performing a first content transformation operation on a current particle text sample based on the globally optimal sample comprises:
determining a first set of word positions and a second set of word positions; wherein the first set of word positions is a set of word positions for which the current particle text sample is content converted relative to the original text sample, and the second set of word positions is a set of word positions for which the global optimal sample is content converted relative to the original text sample;
determining a third set of word positions and a fourth set of word positions; wherein the third set of word positions comprises word positions that belong to the first set of word positions and do not belong to the second set of word positions, and the fourth set of word positions comprises word positions that belong to the second set of word positions and do not belong to the first set of word positions;
selecting a first word position in the third word position set, and executing conversion recovery operation on a word corresponding to the first word position in the current particle text sample; and selecting a second word position in the fourth word position set, and performing content conversion on a word corresponding to the second word position in the current particle text sample.
9. The method of claim 6, wherein performing a second content transformation operation on a current particle text sample based on the global worst sample comprises:
determining a first set of word positions and a fifth set of word positions; wherein the first set of word positions is a set of word positions for which the current particle text sample has been content converted relative to the original text sample, and the fifth set of word positions is a set of word positions for which the global worst sample has been content converted relative to the original text sample;
determining a sixth set of word positions and a seventh set of word positions; wherein the sixth set of word positions comprises word positions belonging to both the first set of word positions and the fifth set of word positions, and the seventh set of word positions comprises word positions belonging to the first set of word positions and not belonging to the fifth set of word positions;
selecting a third word position in the sixth word position set, and executing conversion recovery operation on a word corresponding to the third word position in the current particle text sample; and selecting a fourth word position in the seventh word position set, and performing content conversion on a word corresponding to the fourth word position in the current particle text sample.
10. The method of any one of claims 1-9, wherein the words are Chinese vocabulary; and performing content conversion on the words comprises:
replacing words with other content; and/or, inserting preset characters into the words;
the other contents include: at least one of wrongly written or mispronounced characters, English, homophones, network vocabularies, pinyin, traditional Chinese characters and characters with similar shapes.
11. A countermeasure sample generating apparatus, comprising:
a particle sample generation module for generating a plurality of particle text samples based on an original text sample of a text classification model, wherein the plurality of particle text samples are different from the original text sample;
the countermeasure sample determination module is used for determining whether an effective countermeasure sample aiming at the text classification model exists in the current particle text samples or not, and if so, outputting the effective countermeasure sample;
and the iteration module is used for updating the current particle text samples if the effective countermeasure samples do not exist, taking the updated particle text samples as the current particle text samples, and returning to execute the step of determining whether the effective countermeasure samples aiming at the text classification model exist in the current particle text samples.
12. The apparatus of claim 11, wherein the particle sample generation module comprises:
the backup unit is used for generating a plurality of backup text samples corresponding to the original text samples of the text classification model;
and the particle sample generating unit is used for selecting at least one word from the current backup text sample aiming at each backup text sample, and performing content conversion on the selected word to obtain the particle text sample corresponding to the current backup text sample.
13. The apparatus of claim 12, wherein the particle sample generation unit comprises:
a selection probability value obtaining subunit, configured to obtain selection probability values corresponding to the words in the current backup text sample; the selection probability value corresponding to each word is determined according to the influence degree of each word on the detection result of the text classification model;
and the word selecting subunit is used for selecting at least one word from the current backup text sample based on the selection probability value.
14. The apparatus of claim 13, wherein the apparatus comprises a selection probability value determination module, the selection probability value determination module comprising:
a classification result acquisition unit configured to acquire a first classification result and a second classification result for each word; the first classification result is a classification result output by the text classification model after the original text sample without the current word is input into the text classification model; the second classification result is a classification result output by the text classification model after the original text sample is input into the text classification model; determining the difference degree between the first classification result and the second classification result, and determining the influence degree of the current word on the detection result of the text classification model according to the difference degree;
the selection probability value determining unit is used for determining the selection probability value corresponding to each word according to the influence degree of each word on the detection result of the text classification model; wherein a word with a greater influence degree corresponds to a larger selection probability value.
15. The apparatus according to claim 14, wherein the selection probability value determining unit is specifically configured to:
sorting the words in ascending order of their influence degrees;
determining a ranking number for each word according to the sorting result, wherein a word positioned later in the order has a larger ranking number;
determining a weight value for each word according to its ranking number, wherein a larger ranking number corresponds to a larger weight value;
and determining the selection probability value corresponding to each word according to its weight value, wherein a larger weight value corresponds to a larger selection probability value.
16. The apparatus of claim 11, wherein the iteration module comprises:
the optimal and worst sample determining unit is used for determining a global optimal sample and a global worst sample in the current particle text samples;
a first conversion unit, configured to, for each particle text sample of the current plurality of particle text samples, perform a first content conversion operation on the current particle text sample based on the global optimal sample, so as to improve content closeness between the current particle text sample and the global optimal sample;
a second conversion unit, configured to perform a second content conversion operation on the current particle text sample based on the global worst sample, so as to reduce content closeness between the current particle text sample and the global worst sample.
17. The apparatus of claim 16, wherein the optimal and worst sample determination unit comprises:
the classification result obtaining subunit is used for obtaining a second classification result and a third classification result corresponding to each particle text sample in the current multiple particle text samples; the second classification result is a classification result output by the text classification model after the original text sample is input into the text classification model; the third classification result is a classification result output by the text classification model after the corresponding particle text sample is input into the text classification model;
the optimal sample determining subunit is used for determining a third classification result which is the lowest in proximity to the second classification result, and taking the particle text sample corresponding to the third classification result with the lowest proximity as a global optimal sample;
and the worst sample determining subunit is used for determining a third classification result which is the highest in proximity to the second classification result, and taking the particle text sample corresponding to the third classification result with the highest proximity as a global worst sample.
18. The apparatus according to claim 16, wherein the first conversion unit is specifically configured to:
determining a first set of word positions and a second set of word positions; wherein the first set of word positions is a set of word positions for which the current particle text sample is content converted relative to the original text sample, and the second set of word positions is a set of word positions for which the global optimal sample is content converted relative to the original text sample;
determining a third set of word positions and a fourth set of word positions; wherein the third set of word positions comprises word positions that belong to the first set of word positions and do not belong to the second set of word positions, and the fourth set of word positions comprises word positions that belong to the second set of word positions and do not belong to the first set of word positions;
selecting a first word position in the third word position set, and executing conversion recovery operation on a word corresponding to the first word position in the current particle text sample; and selecting a second word position in the fourth word position set, and performing content conversion on a word corresponding to the second word position in the current particle text sample.
19. The apparatus according to claim 16, wherein the second conversion unit is specifically configured to:
determining a first set of word positions and a fifth set of word positions; wherein the first set of word positions is a set of word positions for which the current particle text sample has been content converted relative to the original text sample, and the fifth set of word positions is a set of word positions for which the global worst sample has been content converted relative to the original text sample;
determining a sixth set of word positions and a seventh set of word positions; wherein the sixth set of word positions comprises word positions belonging to both the first set of word positions and the fifth set of word positions, and the seventh set of word positions comprises word positions belonging to the first set of word positions and not belonging to the fifth set of word positions;
selecting a third word position in the sixth word position set, and executing conversion recovery operation on a word corresponding to the third word position in the current particle text sample; and selecting a fourth word position in the seventh word position set, and performing content conversion on a word corresponding to the fourth word position in the current particle text sample.
20. The apparatus according to any one of claims 11-19, wherein said words are Chinese vocabulary; the apparatus comprises a content conversion module for performing content conversion on the words, the module being configured to:
replacing words with other content; and/or, inserting preset characters into the words;
the other contents include: at least one of wrongly written or mispronounced characters, English, homophones, network vocabularies, pinyin, traditional Chinese characters and characters with similar shapes.
21. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of generating a countermeasure sample of any one of claims 1-10.
22. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of generating a countermeasure sample of any one of claims 1-10.
CN202010213381.7A 2020-03-24 2020-03-24 Method and device for generating countermeasure sample, electronic equipment and storage medium Pending CN113449097A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010213381.7A CN113449097A (en) 2020-03-24 2020-03-24 Method and device for generating countermeasure sample, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010213381.7A CN113449097A (en) 2020-03-24 2020-03-24 Method and device for generating countermeasure sample, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113449097A true CN113449097A (en) 2021-09-28

Family

ID=77806440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010213381.7A Pending CN113449097A (en) 2020-03-24 2020-03-24 Method and device for generating countermeasure sample, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113449097A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019200806A1 (en) * 2018-04-20 2019-10-24 平安科技(深圳)有限公司 Device for generating text classification model, method, and computer readable storage medium
US20200019863A1 (en) * 2018-07-12 2020-01-16 International Business Machines Corporation Generative Adversarial Network Based Modeling of Text for Natural Language Processing
CN109117482A (en) * 2018-09-17 2019-01-01 武汉大学 A kind of confrontation sample generating method towards the detection of Chinese text emotion tendency
US20190220605A1 (en) * 2019-03-22 2019-07-18 Intel Corporation Adversarial training of neural networks using information about activation path differentials
CN110378474A (en) * 2019-07-26 2019-10-25 北京字节跳动网络技术有限公司 Fight sample generating method, device, electronic equipment and computer-readable medium
CN110619292A (en) * 2019-08-31 2019-12-27 浙江工业大学 Countermeasure defense method based on binary particle swarm channel optimization
CN110852450A (en) * 2020-01-15 2020-02-28 支付宝(杭州)信息技术有限公司 Method and device for identifying countermeasure sample to protect model security

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YUAN ZANG et al.: "Textual Adversarial Attack as Combinatorial Optimization", arXiv:1910.12196v2 [cs.CL], 10 November 2019 (2019-11-10), pages 1-6 *

Similar Documents

Publication Publication Date Title
US11150804B2 (en) Neural network for keyboard input decoding
US11663404B2 (en) Text recognition method, electronic device, and storage medium
CN111859951B (en) Language model training method and device, electronic equipment and readable storage medium
CN111414482B (en) Event argument extraction method and device and electronic equipment
CN111539223A (en) Language model training method and device, electronic equipment and readable storage medium
CN111859997B (en) Model training method and device in machine translation, electronic equipment and storage medium
US20210397791A1 (en) Language model training method, apparatus, electronic device and readable storage medium
EP3926516A1 (en) Field-dependent machine translation model training method, apparatus, electronic device and storage medium
CN111950291A (en) Semantic representation model generation method and device, electronic equipment and storage medium
CN112001169B (en) Text error correction method and device, electronic equipment and readable storage medium
CN104471639A (en) Voice and gesture identification reinforcement
US11216615B2 (en) Method, device and storage medium for predicting punctuation in text
CN113407100B (en) Time-based word segmentation
CN111737996A (en) Method, device and equipment for obtaining word vector based on language model and storage medium
CN112507101A (en) Method and device for establishing pre-training language model
CN111667056A (en) Method and apparatus for searching model structure
CN111950292A (en) Training method of text error correction model, and text error correction processing method and device
CN111079945A (en) End-to-end model training method and device
CN111950293A (en) Semantic representation model generation method and device, electronic equipment and storage medium
CN111831814A (en) Pre-training method and device of abstract generation model, electronic equipment and storage medium
CN113723278A (en) Training method and device of form information extraction model
US9298276B1 (en) Word prediction for numbers and symbols
CN111310481B (en) Speech translation method, device, computer equipment and storage medium
CN110990569A (en) Text clustering method and device and related equipment
US11893977B2 (en) Method for recognizing Chinese-English mixed speech, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination