CN114169443A - Word-level text countermeasure sample detection method - Google Patents
- Publication number
- CN114169443A (application CN202111496214.9A)
- Authority
- CN
- China
- Prior art keywords
- word
- sample
- model
- sentence
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a method for detecting word-level text adversarial samples, providing a detection-based defense of deep learning models against text adversarial samples. The method models adversarial sample detection as a binary classification problem and detects adversarial samples in two stages: first, adversarial samples corresponding to normal samples are generated with an adversarial attack algorithm, and feature vectors characterizing the normal and adversarial samples are extracted; second, a binary classification model for adversarial sample detection is built with a suitable deep learning model. With this method it is possible to detect whether a given sample is an adversarial sample for the current model.
Description
Technical Field
The invention relates to the field of deep learning security, and in particular to a method for detecting word-level text adversarial samples.
Background
In recent years, with the rapid development of deep learning, and especially the large-scale deployment of neural network models in practical systems such as face recognition, machine translation and fraud detection, security problems have gradually been recognized and taken seriously by academia and industry. An adversarial attack applies a slight perturbation to the raw input of a target machine learning model to generate an adversarial sample that fools the target model. Adversarial attacks expose the vulnerabilities of deep learning models and thereby help improve model robustness and interpretability; they have been studied extensively in the image domain.
In image classification, adversarial samples are intentionally synthesized images that look almost identical to the original image but can mislead a classifier into a wrong prediction. In the text domain, practical systems such as spam detection, harmful-text detection and malware detection have likewise deployed deep learning models on a large scale, and security is particularly important for these systems. Compared with the image domain, research on defending against adversarial attacks in the text domain is far from sufficient. Defending against attacks on text mainly faces the following difficulties:
1) image data and text data are inherently different, so adversarial defense methods from the image domain cannot be applied directly to text data;
2) pixel values of image data are continuous while text data is discrete, and this discreteness makes both the generation and the detection of adversarial samples more challenging;
3) small changes to pixel values produce perturbations of image data that are hard for the human eye to observe, whereas for text adversarial attacks even small perturbations are easily noticed.
Therefore, research on defense methods against adversarial samples helps improve the robustness and interpretability of models.
Disclosure of Invention
The invention provides a method for detecting word-level text adversarial samples, offering a detection-based defense of deep learning models against text adversarial samples. The method models adversarial sample detection as a binary classification problem and detects adversarial samples in four steps: first, a text classification model is trained on the existing training data set; second, adversarial samples of the current model's normal samples are generated with an existing attack algorithm; third, characterizing feature vectors are extracted from the normal and adversarial samples of the current model to construct a training data set for the detection model; finally, a binary classification model for adversarial sample detection is built from the data set obtained in the previous step, and it judges whether a given test sample is an adversarial sample.
To achieve this, the invention adopts the following technical scheme:
1) training a text classification model M on an existing training data set D, where D = {(x_i, y_i)}, 0 < i < L, L is the length of the data set D, x_i is a data sample in D, and y_i is its corresponding label:
step S101: selecting a neural network text classification model for the existing training data set D, and adding a Self-Attention layer after the Embedding layer of the text classification model;
step S102: training this neural network structure to obtain the text classification model M;
2) generating adversarial samples of the current model's normal samples based on an existing adversarial attack algorithm:
step S201: finding the samples in the training data set that the current model M predicts correctly;
step S202: attacking each found sample with an existing attack algorithm until the attack succeeds, where a successful attack means that after the attack the label of the original piece of data (x_i, y_i) changes from y_i to y_i' with y_i ≠ y_i';
Step S203: saving each successfully attacked sample together with its adversarial sample as an adversarial sample detection data set D2, where D2 = {(x_i, y_i)}, 0 < i < N, N is the length of D2, x_i is a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial sample;
3) extracting characterizing feature vectors from the normal and adversarial samples of the model to construct a training data set S for the detection model:
step S301: from a piece of data (x_i, y_i) in D2, construct a feature vector A for that piece of data. Let the input sentence be X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, and let P_k be the probability, according to the output of model M, that X is classified into the k-th class. Using self-attention, find the k most heavily weighted words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]. For each word in top_k, obtain m synonyms with Word2Vec: based on the distances between word vectors, the vectors closest to the current word are taken as its synonyms. For each word in top_k, select one of its synonyms as a replacement to obtain a new input sentence x', and let P_k' be the probability that x' is classified into the k-th class by model M. Use ABS(P_k - P_k') as one dimension a_i of the feature vector A, where ABS is the absolute-value function. Traversing all synonym combinations over top_k finally yields a z-dimensional feature vector A, where z = m^k and A = [a_1, a_2, ..., a_z].
Step S302: from a piece of data (x_i, y_i) in D2, construct a feature vector B for that piece of data. For a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, use self-attention to find the k most heavily weighted words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]. From the data set D obtain a set H of m sentences whose labels all differ from the label of X. For each sentence H_i in H, add the words contained in X's top_k to H_i to obtain a new sentence H_i'. Let P_k be the probability that the original sentence H_i belongs to class k and P_k' the probability that H_i' belongs to class k, and use ABS(P_k - P_k') as one dimension b_i of the feature vector B, where ABS is the absolute-value function. Traversing every sentence in H finally yields an m-dimensional feature vector B, where m is the length of the set H and B = [b_1, b_2, ..., b_m].
Step S303: from a piece of data (x_i, y_i) in D2, construct a feature vector C for that piece of data. For a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, use self-attention to find the k most heavily weighted words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]. For each word in top_k, obtain its m-dimensional word-vector representation, denoted c_i, where m is the length of the word embedding; the word vectors can be obtained with word2vec. Concatenate the word vectors of the top_k words directly to form the feature vector C, so that finally C = [c_1, c_2, ..., c_k], where the length of C is m × k;
step S304: the final characterization of a piece of data (x_i, y_i) in D2 is I, where I = [A, B, C]. I is saved into a training data set S for the adversarial sample detection model, S = {(x_i, y_i)}, 0 < i < Q, where Q is the length of S, x_i is the feature vector I of a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial sample.
4) Constructing a binary classification model G for adversarial sample detection from the data set S obtained in step 3):
step S401: training a binary classification model on the training data set S; the model can be a machine learning model such as an MLP (multilayer perceptron) or an SVM (support vector machine);
step S402: for a sample to be detected, extracting its features as in step 3) and feeding them into the trained binary classification model; a classification label of 1 indicates a normal sample, and a label of 0 indicates an adversarial sample;
compared with the prior art, the invention has the following advantages:
1) it is, to the best of our knowledge, the first word-level text adversarial sample detection method;
2) the proposed word-level text adversarial sample detection method achieves a detection rate of 80-95% across different data sets and different model architectures, showing good applicability in different scenarios;
3) The proposed self-attention-based adversarial sample detection method does not depend on a specific model and is highly extensible;
drawings
FIG. 1 shows training a text classification model M on an existing training data set D;
FIG. 2 is a diagram of generating adversarial samples of the current model's normal samples with an existing adversarial attack algorithm;
FIG. 3 is a diagram illustrating the construction of the detection model's training data set by extracting characterizing feature vectors from the normal and adversarial samples of the current model;
FIG. 4 is a diagram of the binary classification model for adversarial sample detection discriminating an input sample;
Detailed Description
The following describes a specific embodiment of the word-level text adversarial sample detection method in detail with reference to the drawings.
FIG. 1 is the overall flow chart of the invention for training a text classification model M on an existing training data set D;
step S101: selecting a neural network text classification model for the existing training data set D, and adding a Self-Attention layer after the Embedding layer of the text classification model;
step S102: training this neural network structure to obtain the text classification model M;
specifically, the neural network text classification model of step S101 may be any CNN- or RNN-family model; a Self-Attention layer is added after the Embedding layer of the chosen model to obtain the final model structure, and the final text classification model M is obtained through step S102.
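As a minimal illustrative sketch (not the patented implementation), the self-attention weighting that step S101 adds and that the later feature-extraction steps rely on can be written in plain numpy; the single-head, untrained formulation and the helper name `top_k_important` are assumptions:

```python
import numpy as np

def self_attention(embeddings):
    """Single-head scaled dot-product self-attention over word embeddings.

    embeddings: (n_words, dim) array, one row per word.
    Returns (context, weights); weights[i, j] is how much word i attends
    to word j, and each row of weights sums to 1.
    """
    dim = embeddings.shape[1]
    scores = embeddings @ embeddings.T / np.sqrt(dim)   # pairwise similarity
    scores -= scores.max(axis=1, keepdims=True)         # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)       # row-wise softmax
    context = weights @ embeddings                      # attended representation
    return context, weights

def top_k_important(weights, k):
    """Rank word positions by the average attention they receive."""
    importance = weights.mean(axis=0)                   # column mean = attention received
    return list(np.argsort(importance)[::-1][:k])
```

In a trained model the attention layer has learned projections; here the raw embeddings stand in for them purely to show the shape of the computation.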
FIG. 2 shows the generation of adversarial samples for the current model's normal samples based on an existing adversarial attack algorithm.
Step S201: finding the samples in the training data set that the current model M predicts correctly;
step S202: attacking each found sample with an existing attack algorithm until the attack succeeds, where a successful attack means that after the attack the label of the original piece of data (x_i, y_i) changes from y_i to y_i' with y_i ≠ y_i';
Step S203: saving each successfully attacked sample together with its adversarial sample as an adversarial sample detection data set D2, where D2 = {(x_i, y_i)}, 0 < i < N, N is the length of D2, x_i is a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial sample;
specifically, the correctly predicted samples are found in step S201, those samples are then attacked with an arbitrary adversarial attack algorithm in step S202, and finally each successfully attacked sample and its corresponding adversarial sample are saved in step S203.
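Steps S201-S203 above can be sketched as the following loop, where `model_predict` and `attack` are hypothetical stand-ins for the trained classifier M and any existing word-level attack algorithm:

```python
def build_detection_dataset(model_predict, attack, train_data):
    """Build the adversarial sample detection data set D2.

    model_predict(x) -> predicted label; attack(x, y) -> perturbed x or None.
    Label 1 marks a normal sample, label 0 its adversarial counterpart.
    """
    d2 = []
    for x, y in train_data:
        if model_predict(x) != y:                # step S201: keep correct predictions only
            continue
        x_adv = attack(x, y)                     # step S202: run the chosen attack
        if x_adv is not None and model_predict(x_adv) != y:   # label flipped: success
            d2.append((x, 1))                    # step S203: save the normal sample ...
            d2.append((x_adv, 0))                # ... and its adversarial sample
    return d2
```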
FIG. 3 illustrates constructing the detection model's training data set by extracting characterizing feature vectors from the normal and adversarial samples of the current model.
Step S301: from a piece of data (x_i, y_i) in D2, construct a feature vector A for that piece of data. Let the input sentence be X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, and let P_k be the probability, according to the output of model M, that X is classified into the k-th class. Using self-attention, find the k most heavily weighted words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]. For each word in top_k, obtain m synonyms with Word2Vec: based on the distances between word vectors, the vectors closest to the current word are taken as its synonyms. For each word in top_k, select one of its synonyms as a replacement to obtain a new input sentence x', and let P_k' be the probability that x' is classified into the k-th class by model M. Use ABS(P_k - P_k') as one dimension a_i of the feature vector A, where ABS is the absolute-value function. Traversing all synonym combinations over top_k finally yields a z-dimensional feature vector A, where z = m^k and A = [a_1, a_2, ..., a_z].
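Step S301 can be sketched as below, assuming hypothetical helpers `predict_proba` (model M's class probabilities), `top_k_words` (the self-attention ranking, returning word positions) and `synonyms` (a Word2Vec nearest-neighbour lookup):

```python
from itertools import product

def feature_vector_a(sentence, k_class, predict_proba, top_k_words, synonyms, m):
    """Feature A: |P_k(x) - P_k(x')| for every synonym substitution of the
    top-k attention words, giving a vector of length m**k."""
    p_orig = predict_proba(sentence)[k_class]
    words = list(sentence)
    important = top_k_words(sentence)                   # positions of the top-k words
    candidates = [synonyms(words[i])[:m] for i in important]
    feats = []
    for combo in product(*candidates):                  # all m**k substitution combos
        variant = list(words)
        for pos, replacement in zip(important, combo):
            variant[pos] = replacement
        feats.append(abs(p_orig - predict_proba(variant)[k_class]))
    return feats
```

The intuition matches the description: for a normal sample the probability shifts stay small, while an adversarial sample sits near a decision boundary and synonym swaps move P_k sharply.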
Step S302: from a piece of data (x_i, y_i) in D2, construct a feature vector B for that piece of data. For a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, use self-attention to find the k most heavily weighted words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]. From the data set D obtain a set H of m sentences whose labels all differ from the label of X. For each sentence H_i in H, add the words contained in X's top_k to H_i to obtain a new sentence H_i'. Let P_k be the probability that the original sentence H_i belongs to class k and P_k' the probability that H_i' belongs to class k, and use ABS(P_k - P_k') as one dimension b_i of the feature vector B, where ABS is the absolute-value function. Traversing every sentence in H finally yields an m-dimensional feature vector B, where m is the length of the set H and B = [b_1, b_2, ..., b_m].
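Step S302 can be sketched similarly; here `top_k_words` returns the important words themselves and `other_sentences` plays the role of the set H of differently-labelled sentences (all helper names are assumptions, not the patented API):

```python
def feature_vector_b(sentence, k_class, predict_proba, top_k_words, other_sentences):
    """Feature B: how much injecting x's top-k words shifts P_k for each
    sentence with a different label; vector length = len(other_sentences)."""
    important = top_k_words(sentence)                   # x's top-k words themselves
    feats = []
    for h in other_sentences:
        p_before = predict_proba(h)[k_class]            # P_k of the original H_i
        p_after = predict_proba(list(h) + list(important))[k_class]  # P_k of H_i'
        feats.append(abs(p_before - p_after))
    return feats
```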
Step S303: from a piece of data (x_i, y_i) in D2, construct a feature vector C for that piece of data. For a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, use self-attention to find the k most heavily weighted words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]. For each word in top_k, obtain its m-dimensional word-vector representation, denoted c_i, where m is the length of the word embedding; the word vectors can be obtained with word2vec. Concatenate the word vectors of the top_k words directly to form the feature vector C, so that finally C = [c_1, c_2, ..., c_k], where the length of C is m × k;
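Step S303 is a plain concatenation; `word_vector` stands in for a word2vec lookup (hypothetical helper names):

```python
def feature_vector_c(sentence, top_k_words, word_vector):
    """Feature C: concatenation of the m-dimensional embeddings of the
    top-k attention words, giving a vector of length m * k."""
    feats = []
    for word in top_k_words(sentence):
        feats.extend(word_vector(word))                 # word2vec-style embedding lookup
    return feats
```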
step S304: the final characterization of a piece of data (x_i, y_i) in D2 is I, where I = [A, B, C]. I is saved into a training data set S for the adversarial sample detection model, S = {(x_i, y_i)}, 0 < i < Q, where Q is the length of S, x_i is the feature vector I of a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial sample.
Specifically, the characterizing feature vectors are extracted from the adversarial sample detection data set D2 saved in step S203: feature vector A in step S301, feature vector B in step S302, and feature vector C in step S303. The final characterization of a sample is the vector I, where I = [A, B, C].
FIG. 4 shows the binary classification model for adversarial sample detection discriminating an input sample.
Step S401: training a binary classification model on the training data set S; the model can be a machine learning model such as an MLP (multilayer perceptron) or an SVM (support vector machine);
step S402: for a sample to be detected, extracting its features as in step 3) and feeding them into the trained binary classification model; a classification label of 1 indicates a normal sample, and a label of 0 indicates an adversarial sample;
specifically, the training data set S and the sample to be tested are input; following step S401, a binary classification model for adversarial sample detection is trained on S with any binary classification algorithm. Following step S402, the characterization vector I is extracted from the sample to be tested and fed into the binary classification model for adversarial sample detection; a classification label of 1 indicates a normal sample, and a label of 0 indicates an adversarial sample.
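For illustration, a hand-rolled logistic-regression detector can stand in for the MLP or SVM of steps S401-S402 (the patent leaves the classifier choice open; nothing below is the claimed implementation):

```python
import math
import random

def train_detector(samples, epochs=300, lr=0.5, seed=0):
    """Train a minimal logistic-regression detector on (feature_vector, label)
    pairs, where label 1 = normal and 0 = adversarial. Returns a predict fn."""
    rng = random.Random(seed)
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))              # sigmoid probability
            g = p - y                                   # gradient of the log-loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    def predict(x):                                     # step S402: 1 normal, 0 adversarial
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
    return predict
```

In practice one would reach for a library MLP or SVM; the point here is only the interface of step S402 — feature vector in, label 1 (normal) or 0 (adversarial) out.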
Claims (5)
1. A word-level text adversarial sample detection method, characterized by comprising the following steps:
1) training a text classification model M on an existing training data set D, where D = {(x_i, y_i)}, 0 < i < L, L is the length of the data set D, x_i is a data sample in D, and y_i is its corresponding label;
2) generating adversarial samples of the current model's normal samples based on an existing adversarial attack algorithm;
3) extracting characterizing feature vectors from the normal and adversarial samples of the model to construct a training data set S for the detection model;
4) constructing a binary classification model G for adversarial sample detection from the data set S obtained in step 3).
2. The word-level text adversarial sample detection method according to claim 1, wherein step 1) comprises the following specific steps:
step S101: selecting a neural network text classification model for the existing training data set D, and adding a Self-Attention layer after the Embedding layer of the text classification model;
step S102: training this neural network structure to obtain the text classification model M.
3. The word-level text adversarial sample detection method according to claim 1, wherein step 2) comprises the following specific steps:
step S201: finding the samples in the training data set that the current model M predicts correctly;
step S202: attacking each found sample with an existing attack algorithm until the attack succeeds, where a successful attack means that after the attack the label of the original piece of data (x_i, y_i) changes from y_i to y_i' with y_i ≠ y_i';
Step S203: saving each successfully attacked sample together with its adversarial sample as an adversarial sample detection data set D2, where D2 = {(x_i, y_i)}, 0 < i < N, N is the length of D2, x_i is a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial sample.
4. The word-level text adversarial sample detection method according to claim 1, wherein step 3) comprises the following specific steps:
step S301: from a piece of data (x_i, y_i) in D2, constructing a feature vector A for that piece of data: let the input sentence be X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, and let P_k be the probability, according to the output of model M, that X is classified into the k-th class; using self-attention, find the k most heavily weighted words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]; for each word in top_k, obtain m synonyms with Word2Vec, taking the vectors closest to the current word, by distance between word vectors, as its synonyms; for each word in top_k, select one of its synonyms as a replacement to obtain a new input sentence x', and let P_k' be the probability that x' is classified into the k-th class by model M; use ABS(P_k - P_k') as one dimension a_i of the feature vector A, where ABS is the absolute-value function; traversing all synonym combinations over top_k finally yields a z-dimensional feature vector A, where z = m^k and A = [a_1, a_2, ..., a_z];
Step S302: from a piece of data (x_i, y_i) in D2, constructing a feature vector B for that piece of data: for a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, use self-attention to find the k most heavily weighted words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]; from the data set D obtain a set H of m sentences whose labels all differ from the label of X; for each sentence H_i in H, add the words contained in X's top_k to H_i to obtain a new sentence H_i'; let P_k be the probability that the original sentence H_i belongs to class k and P_k' the probability that H_i' belongs to class k, and use ABS(P_k - P_k') as one dimension b_i of the feature vector B, where ABS is the absolute-value function; traversing every sentence in H finally yields an m-dimensional feature vector B, where m is the length of the set H and B = [b_1, b_2, ..., b_m];
Step S303: from a piece of data (x_i, y_i) in D2, constructing a feature vector C for that piece of data: for a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, use self-attention to find the k most heavily weighted words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]; for each word in top_k, obtain its m-dimensional word-vector representation, denoted c_i, where m is the length of the word embedding and the word vectors can be obtained with word2vec; concatenate the word vectors of the top_k words directly to form the feature vector C, so that finally C = [c_1, c_2, ..., c_k], where the length of C is m × k;
step S304: the final characterization of a piece of data (x_i, y_i) in D2 is I, where I = [A, B, C]; I is saved into a training data set S for the adversarial sample detection model, S = {(x_i, y_i)}, 0 < i < Q, where Q is the length of S, x_i is the feature vector I of a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial sample.
5. The word-level text adversarial sample detection method according to claim 1, wherein step 4) comprises the following specific steps:
step S401: training a binary classification model on the training data set S, the model being an MLP (multilayer perceptron) or SVM (support vector machine) machine learning model;
step S402: for a sample to be detected, extracting its features as in step 3) and feeding them into the trained binary classification model; a classification label of 1 indicates a normal sample, and a label of 0 indicates an adversarial sample.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111496214.9A CN114169443B (en) | 2021-12-08 | 2021-12-08 | Word-level text countermeasure sample detection method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111496214.9A CN114169443B (en) | 2021-12-08 | 2021-12-08 | Word-level text countermeasure sample detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114169443A true CN114169443A (en) | 2022-03-11 |
CN114169443B CN114169443B (en) | 2024-02-06 |
Family
ID=80484748
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111496214.9A Active CN114169443B (en) | 2021-12-08 | 2021-12-08 | Word-level text countermeasure sample detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114169443B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110457701A (en) * | 2019-08-08 | 2019-11-15 | 南京邮电大学 | Dual training method based on interpretation confrontation text |
WO2020244066A1 (en) * | 2019-06-04 | 2020-12-10 | 平安科技(深圳)有限公司 | Text classification method, apparatus, device, and storage medium |
CN112765355A (en) * | 2021-01-27 | 2021-05-07 | 江南大学 | Text anti-attack method based on improved quantum behavior particle swarm optimization algorithm |
WO2021212675A1 (en) * | 2020-04-21 | 2021-10-28 | 清华大学 | Method and apparatus for generating adversarial sample, electronic device and storage medium |
Non-Patent Citations (2)
Title |
---|
仝鑫; 王罗娜; 王润正; 王靖亚: "Word-level adversarial sample generation method for Chinese text classification", Netinfo Security (信息网络安全), no. 09 *
李文慧; 张英俊; 潘理虎: "Short text classification method based on improved BiLSTM network", Computer Engineering and Design (计算机工程与设计), no. 03 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Invisible backdoor attacks on deep neural networks via steganography and regularization | |
Zhong et al. | Backdoor embedding in convolutional neural network models via invisible perturbation | |
Melis et al. | Is deep learning safe for robot vision? adversarial examples against the icub humanoid | |
Liao et al. | Backdoor embedding in convolutional neural network models via invisible perturbation | |
CN108111489B (en) | URL attack detection method and device and electronic equipment | |
US11893111B2 (en) | Defending machine learning systems from adversarial attacks | |
CN111046673B (en) | Training method for defending text malicious sample against generation network | |
CN112085069B (en) | Multi-target countermeasure patch generation method and device based on integrated attention mechanism | |
CN109961145B (en) | Antagonistic sample generation method for image recognition model classification boundary sensitivity | |
Lin et al. | Chinese character CAPTCHA recognition and performance estimation via deep neural network | |
CN111191695A (en) | Website picture tampering detection method based on deep learning | |
CN111753881A (en) | Defense method for quantitatively identifying anti-attack based on concept sensitivity | |
CN112861945B (en) | Multi-mode fusion lie detection method | |
Jain et al. | Adversarial text generation for google's perspective api | |
WO2023093346A1 (en) | Exogenous feature-based model ownership verification method and apparatus | |
Lv et al. | Chinese character CAPTCHA recognition based on convolution neural network | |
CN115913643A (en) | Network intrusion detection method, system and medium based on countermeasure self-encoder | |
CN113435264A (en) | Face recognition attack resisting method and device based on black box substitution model searching | |
Yin et al. | Adversarial attack, defense, and applications with deep learning frameworks | |
CN114169443B (en) | Word-level text countermeasure sample detection method | |
CN115883242A (en) | Network intrusion detection method and device | |
CN112948578B (en) | DGA domain name open set classification method, device, electronic equipment and medium | |
CN115497105A (en) | Multi-modal hate cause detection method based on multi-task learning network | |
Abhishek et al. | CNN Combined with FC Classifier to Combat Artificial Penta-Digit Text-Based Captcha |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||