CN114169443A - Word-level text adversarial example detection method - Google Patents

Word-level text adversarial example detection method

Info

Publication number
CN114169443A
CN114169443A (application CN202111496214.9A)
Authority
CN
China
Prior art keywords
word
sample
model
sentence
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111496214.9A
Other languages
Chinese (zh)
Other versions
CN114169443B (en)
Inventor
范铭
王晨旭
曹慧
魏闻英
陶俊杰
刘烃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Jiaotong University
Original Assignee
Xi'an Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Jiaotong University
Priority to CN202111496214.9A
Publication of CN114169443A
Application granted
Publication of CN114169443B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroïds
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for detecting word-level text adversarial examples, providing a defense for deep learning models against text adversarial examples. The method casts adversarial example detection as a binary classification problem and detects adversarial examples in two stages: first, adversarial examples of the corresponding normal samples are generated with an adversarial attack algorithm, and feature vectors characterizing the normal and adversarial samples are extracted; second, a binary classification model for adversarial example detection is built with a corresponding deep learning model. With this method, it is possible to detect whether a given sample is an adversarial example for the current model.

Description

Word-level text adversarial example detection method
Technical Field
The invention relates to the field of deep learning security, and in particular to a method for detecting word-level text adversarial examples.
Background
In recent years, with the rapid development of deep learning, and in particular the large-scale deployment of neural network models in practical systems such as face recognition, machine translation, and fraud detection, security problems have gradually been recognized and valued by academia and industry. An adversarial attack applies a slight perturbation to the raw input of a target machine learning model to generate an adversarial example that fools the target model. Adversarial attacks expose vulnerabilities of deep learning models and thereby help improve their robustness and interpretability; they have been studied extensively in the image domain.
In image classification, adversarial examples are intentionally synthesized images that look almost identical to the original image but mislead the classifier into a wrong prediction. In the text domain, practical systems such as spam detection, harmful-text detection, and malware detection have deployed deep learning models at scale, and security is particularly important for these systems. Compared with the image domain, research on defending text models against adversarial attacks is far from sufficient. Such defense faces the following main difficulties:
1) image data and text data are inherently different, so adversarial defense methods from the image domain cannot be applied directly to text data;
2) pixel values of image data are continuous while text data is discrete, and this discreteness makes the generation, detection, and defense of adversarial examples more challenging;
3) small changes to pixel values produce perturbations of image data that are hard for the human eye to observe, whereas for text adversarial attacks even small perturbations are easily noticed.
Therefore, research on defense methods against adversarial examples helps improve the robustness and interpretability of models.
Disclosure of Invention
The invention provides a method for detecting word-level text adversarial examples, a defense for deep learning models against text adversarial examples. The method casts adversarial example detection as a binary classification problem and detects adversarial examples in four steps: first, train a text classification model on the existing training data set; second, generate adversarial examples of the current model's normal samples with an existing attack algorithm; third, extract characterizing feature vectors from the normal and adversarial samples of the current model to construct the training data set of the detection model; finally, construct an adversarial example detection binary classification model from the data set obtained in the previous step and use it to judge whether the current test sample is an adversarial example.
To this end, the invention adopts the following technical solution:
1) Train a text classification model M on an existing training data set D, where D = {(x_i, y_i)}, 0 < i ≤ L, L is the length of D, x_i is a data sample in D, and y_i is its corresponding label:
Step S101: select a neural network text classification model for the existing training data set D, and add a Self-Attention layer after the Embedding layer of the text classification model;
Step S102: train this network structure to obtain the text classification model M;
2) Generate adversarial examples of the current model's normal samples with an existing adversarial attack algorithm:
Step S201: find the samples in the training data set that the current model M predicts correctly;
Step S202: attack each such sample with an existing attack algorithm until the attack succeeds, where success means that the predicted label of the original data (x_i, y_i) changes after the attack, i.e. from y_i to y_i' with y_i ≠ y_i';
Step S203: save the samples attacked successfully in the previous step together with their adversarial examples as an adversarial example detection data set D2, where D2 = {(x_i, y_i)}, 0 < i ≤ N, N is the length of D2, x_i is a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial example;
3) Extract characterizing feature vectors from the normal and adversarial samples of the model to construct the training data set S of the detection model:
Step S301: for a piece of data (x_i, y_i) in D2, construct its feature vector A. Let the input sentence be X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, and let P_k be the probability that model M classifies X into the k-th class. Based on the self-attention weights, find the k most important words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]. For each word in top_k, obtain m synonyms with Word2Vec, taking the vectors closest to the current word's vector as its synonyms. For each word in top_k, select one of its synonyms as a replacement to obtain a new input sentence x'; let P_k' be the probability that model M classifies x' into the k-th class, and take ABS(P_k - P_k') as one dimension of the feature vector A, denoted a_i, where ABS is the absolute-value function. Traversing all synonym combinations of top_k finally yields a z-dimensional feature vector A, where z = m^k and A = [a_1, a_2, ..., a_z].
Step S302: for a piece of data (x_i, y_i) in D2, construct its feature vector B. For a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, find the k most important words with self-attention, denoted top_k = [w_top1, w_top2, ..., w_topk]. From the data set D, obtain a set H of m sentences whose labels all differ from the label of X. For each sentence H_i in H, add the words of X's top_k to H_i to obtain a new sentence H_i'. Let P_k be the probability that the original sentence H_i belongs to class k and P_k' the probability that H_i' belongs to class k, and take ABS(P_k - P_k') as one dimension of the feature vector B, denoted b_i, where ABS is the absolute-value function. Traversing every sentence in H finally yields an m-dimensional feature vector B, where m is the size of H and B = [b_1, b_2, ..., b_m].
Step S303: for a piece of data (x_i, y_i) in D2, construct its feature vector C. For a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, find the k most important words with self-attention, denoted top_k = [w_top1, w_top2, ..., w_topk]. For each word in top_k, obtain its m-dimensional word vector, denoted c_i, where m is the length of the word embedding; the word vectors can be obtained with word2vec. Concatenate the word vectors of the top_k words directly to form the feature vector C = [c_1, c_2, ..., c_k], whose length is m × k;
Step S304: the final characterization of a piece of data (x_i, y_i) in D2 is I = [A, B, C]; I is saved into the training data set S of the adversarial example detection model, S = {(x_i, y_i)}, 0 < i ≤ Q, where Q is the length of S, x_i is the feature vector I of a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial example.
4) Construct the adversarial example detection binary classification model G from the data set S obtained in step 3):
Step S401: train a binary classification model on the training data set S; the model can be a machine learning model such as an MLP (multilayer perceptron) or an SVM (support vector machine);
Step S402: for a sample to be detected, extract its features as in step 3) and input them into the trained binary classification model; a predicted label of 1 indicates a normal sample, and a label of 0 indicates an adversarial example;
Compared with the prior art, the invention has the following advantages:
1) it is, to the best of the inventors' knowledge, the first word-level text adversarial example detection method;
2) the proposed word-level text adversarial example detection method achieves detection rates of 80-95% across different data sets and different model architectures, showing good applicability in different scenarios;
3) the proposed self-attention-based adversarial example detection method does not depend on a specific model and is highly extensible.
drawings
FIG. 1 trains a text classification model M based on an existing training data set D;
FIG. 2 is a diagram of a prior art challenge sample attack algorithm to generate challenge samples of normal samples of a current model;
FIG. 3 is a diagram illustrating a training data set of a detection model constructed by respectively extracting characteristic feature vectors from normal and confrontation samples of a current model;
FIG. 4 is a diagram of a discrimination input sample of a confrontation sample detection binary model;
Detailed Description
The following describes a specific embodiment of the word-level text adversarial example detection method in detail with reference to the drawings.
FIG. 1 shows the overall flow of training a text classification model M based on an existing training data set D.
Step S101: select a neural network text classification model for the existing training data set D, and add a Self-Attention layer after the Embedding layer of the text classification model;
Step S102: train this network structure to obtain the text classification model M.
Specifically, the neural network text classification model in step S101 can be any CNN- or RNN-family model; adding a Self-Attention layer after the Embedding layer of the chosen model gives the final network structure, and the final text classification model M is obtained through step S102.
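To make the structure concrete, here is a minimal sketch in PyTorch assuming a BiLSTM backbone; the vocabulary size, dimensions, and class count are illustrative choices, not values fixed by the patent.

import torch
import torch.nn as nn

class SelfAttnTextClassifier(nn.Module):
    """Embedding -> Self-Attention -> BiLSTM -> classifier, per step S101."""
    def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Self-Attention layer inserted directly after the Embedding layer;
        # its attention weights are later reused to rank word importance.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=1, batch_first=True)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        emb = self.embedding(token_ids)                    # (batch, seq, embed)
        attended, attn_weights = self.attn(emb, emb, emb)  # self-attention over the words
        encoded, _ = self.encoder(attended)
        logits = self.fc(encoded.mean(dim=1))              # mean-pool over time steps
        return logits, attn_weights

Averaging attn_weights over the query axis gives one importance score per word, which is one way the top-k words used in steps S301-S303 can be selected.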
FIG. 2 shows generating adversarial examples of the current model's normal samples with an existing adversarial attack algorithm.
Step S201: find the samples in the training data set that the current model M predicts correctly;
Step S202: attack each such sample with an existing attack algorithm until the attack succeeds, where success means that the predicted label of the original data (x_i, y_i) changes after the attack, i.e. from y_i to y_i' with y_i ≠ y_i';
Step S203: save the samples attacked successfully in the previous step together with their adversarial examples as an adversarial example detection data set D2, where D2 = {(x_i, y_i)}, 0 < i ≤ N, N is the length of D2, x_i is a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial example.
Specifically, the correctly predicted samples are found in step S201, those samples are then attacked with an arbitrary adversarial attack algorithm in step S202, and finally the correctly predicted samples together with their successfully attacked counterparts are saved in step S203.
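As a schematic sketch of steps S201-S203 (not the patent's own code), the loop below assumes two hypothetical helpers: model_predict(sentence), returning the predicted label, and attack(sentence), returning a perturbed sentence or None when the attack fails.

def generate_detection_dataset(model_predict, attack, dataset):
    # dataset: list of (sentence, label) pairs from the training data set D
    d2 = []
    for x, y in dataset:
        if model_predict(x) != y:       # S201: keep only correctly predicted samples
            continue
        x_adv = attack(x)               # S202: perturb until the label flips
        if x_adv is not None and model_predict(x_adv) != y:
            d2.append((x, 1))           # S203: normal sample, label 1
            d2.append((x_adv, 0))       # S203: adversarial example, label 0
    return d2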
FIG. 3 shows extracting characterizing feature vectors from the current model's normal and adversarial samples to construct the training data set of the detection model.
Step S301: for a piece of data (x_i, y_i) in D2, construct its feature vector A. Let the input sentence be X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, and let P_k be the probability that model M classifies X into the k-th class. Based on the self-attention weights, find the k most important words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]. For each word in top_k, obtain m synonyms with Word2Vec, taking the vectors closest to the current word's vector as its synonyms. For each word in top_k, select one of its synonyms as a replacement to obtain a new input sentence x'; let P_k' be the probability that model M classifies x' into the k-th class, and take ABS(P_k - P_k') as one dimension of the feature vector A, denoted a_i, where ABS is the absolute-value function. Traversing all synonym combinations of top_k finally yields a z-dimensional feature vector A, where z = m^k and A = [a_1, a_2, ..., a_z].
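A sketch of the construction of feature vector A, assuming three hypothetical helpers: top_k_indices (ranking word positions by self-attention weight), nearest_synonyms (the m nearest Word2Vec neighbors), and prob_class_k (the model's probability for class k):

import itertools

def feature_vector_A(words, prob_class_k, top_k_indices, nearest_synonyms, k, m):
    p_k = prob_class_k(words)                        # P_k for the original sentence X
    idx = top_k_indices(words, k)                    # positions of the top-k words
    synonym_lists = [nearest_synonyms(words[i], m) for i in idx]
    a = []
    # Traverse all m**k synonym combinations of the top-k words (z = m**k).
    for combo in itertools.product(*synonym_lists):
        perturbed = list(words)
        for position, replacement in zip(idx, combo):
            perturbed[position] = replacement
        a.append(abs(p_k - prob_class_k(perturbed)))  # ABS(P_k - P_k')
    return a                                          # A = [a_1, ..., a_z]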
Step S302: according to D2One piece of data (x) in (2)i,yi) A feature vector B belonging to this piece of data is constructed. For a given sentence X ═ X1,x2...xn]Each element in X represents a word in the sentence, and k important words with the maximum weight in the input sentence are found and recorded as top based on self-attention technologyk=[wtop1,wtop2...wtopk](ii) a Obtaining a set H of m sentences from the data set D, wherein the labels of each sentence are different from the labels X; for each sentence H in HiThe top of XkThe word contained being added to HiTo obtain a new sentence Hi'; original sentence HiProbability of being of class k is Pk,Hi' probability of being class k is Pk', with ABS (P)k-Pk') as one dimension of the feature vector B, denoted BiWherein ABS is an absolute value function; traversing each sentence in H to finally obtain a feature vector B with m dimensions, wherein m is the length of the set H, and finally B ═ B1,b2…bm]。
Step S303: according to D2One piece of data (x) in (2)i,yi) A feature vector C belonging to this piece of data is constructed. For a given sentence X ═ X1,x2...xn]Each element in X represents a word in the sentence, and k important words with the maximum weight in the input sentence are found and recorded as top based on self-attention technologyk=[wtop1,wtop2...wtopk](ii) a To topkEach word in (a) gets its m-dimensional word vector representation, denoted asciM represents the length of the current word embed, and the word vector can be obtained by word2vec technology; will topkThe word vectors of each word are directly spliced together to be used as a characteristic vector C, and finally C is ═ C1,c2...ck]Wherein C has a length of m × k;
Step S304: the final characterization of a piece of data (x_i, y_i) in D2 is I = [A, B, C]; I is saved into the training data set S of the adversarial example detection model, S = {(x_i, y_i)}, 0 < i ≤ Q, where Q is the length of S, x_i is the feature vector I of a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial example.
Specifically, characterizing feature vectors are extracted from the adversarial example detection data set D2 saved in step S203: feature vector A in step S301, feature vector B in step S302, and feature vector C in step S303. The final sample characterization is the vector I, where I = [A, B, C].
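Assembled from the sketches above, step S304 reduces to concatenating the three feature vectors:

import numpy as np

def characterize(words, label, feature_A, feature_B, feature_C):
    # feature_A/B/C are the extractors sketched above, already bound to the model
    I = np.concatenate([feature_A(words), feature_B(words), feature_C(words)])
    return I, label                      # label: 1 = normal, 0 = adversarial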
FIG. 4 shows the adversarial example detection binary classification model discriminating input samples.
Step S401: train a binary classification model on the training data set S; the model can be a machine learning model such as an MLP (multilayer perceptron) or an SVM (support vector machine);
Step S402: for a sample to be detected, extract its features as in step 3) and input them into the trained binary classification model; a predicted label of 1 indicates a normal sample, and a label of 0 indicates an adversarial example.
Specifically, the training data set S and a sample under test are taken as input. Following step S401, an adversarial example detection binary classification model is trained on S with any binary classification algorithm. Following step S402, the characterization vector I is extracted for the sample under test and fed into this model; a predicted label of 1 indicates a normal sample, and a label of 0 indicates an adversarial example, as in the sketch below.
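A sketch of steps S401-S402 with scikit-learn, showing the MLP variant; an sklearn.svm.SVC would be used the same way, and the hidden-layer size and iteration count are illustrative assumptions:

import numpy as np
from sklearn.neural_network import MLPClassifier

def train_detector(S):
    # S: list of (feature_vector_I, label) pairs, label 1 = normal, 0 = adversarial
    X = np.stack([x for x, _ in S])
    y = np.array([y for _, y in S])
    detector = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    return detector.fit(X, y)

def is_adversarial(detector, feature_vector_I):
    return detector.predict(feature_vector_I.reshape(1, -1))[0] == 0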

Claims (5)

1. A word-level text adversarial example detection method, characterized by comprising the following steps:
1) train a text classification model M on an existing training data set D, where D = {(x_i, y_i)}, 0 < i ≤ L, L is the length of D, x_i is a data sample in D, and y_i is its corresponding label;
2) generate adversarial examples of the current model's normal samples with an existing adversarial attack algorithm;
3) extract characterizing feature vectors from the normal and adversarial samples of the model to construct the training data set S of the detection model;
4) construct the adversarial example detection binary classification model G from the data set S obtained in step 3).
2. The word-level text adversarial example detection method according to claim 1, characterized in that step 1) comprises the following specific steps:
Step S101: select a neural network text classification model for the existing training data set D, and add a Self-Attention layer after the Embedding layer of the text classification model;
Step S102: train this network structure to obtain the text classification model M.
3. The word-level text adversarial example detection method according to claim 1, characterized in that step 2) comprises the following specific steps:
Step S201: find the samples in the training data set that the current model M predicts correctly;
Step S202: attack each such sample with an existing attack algorithm until the attack succeeds, where success means that the predicted label of the original data (x_i, y_i) changes after the attack, i.e. from y_i to y_i' with y_i ≠ y_i';
Step S203: save the samples attacked successfully in the previous step together with their adversarial examples as an adversarial example detection data set D2, where D2 = {(x_i, y_i)}, 0 < i ≤ N, N is the length of D2, x_i is a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial example.
4. The word-level text adversarial example detection method according to claim 1, characterized in that step 3) comprises the following specific steps:
Step S301: for a piece of data (x_i, y_i) in D2, construct its feature vector A. Let the input sentence be X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, and let P_k be the probability that model M classifies X into the k-th class. Based on the self-attention weights, find the k most important words of the input sentence, denoted top_k = [w_top1, w_top2, ..., w_topk]. For each word in top_k, obtain m synonyms with Word2Vec, taking the vectors closest to the current word's vector as its synonyms. For each word in top_k, select one of its synonyms as a replacement to obtain a new input sentence x'; let P_k' be the probability that model M classifies x' into the k-th class, and take ABS(P_k - P_k') as one dimension of the feature vector A, denoted a_i, where ABS is the absolute-value function. Traversing all synonym combinations of top_k finally yields a z-dimensional feature vector A, where z = m^k and A = [a_1, a_2, ..., a_z];
Step S302: for a piece of data (x_i, y_i) in D2, construct its feature vector B. For a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, find the k most important words with self-attention, denoted top_k = [w_top1, w_top2, ..., w_topk]. From the data set D, obtain a set H of m sentences whose labels all differ from the label of X. For each sentence H_i in H, add the words of X's top_k to H_i to obtain a new sentence H_i'. Let P_k be the probability that the original sentence H_i belongs to class k and P_k' the probability that H_i' belongs to class k, and take ABS(P_k - P_k') as one dimension of the feature vector B, denoted b_i, where ABS is the absolute-value function. Traversing every sentence in H finally yields an m-dimensional feature vector B, where m is the size of H and B = [b_1, b_2, ..., b_m];
Step S303: for a piece of data (x_i, y_i) in D2, construct its feature vector C. For a given sentence X = [x_1, x_2, ..., x_n], where each element of X is a word of the sentence, find the k most important words with self-attention, denoted top_k = [w_top1, w_top2, ..., w_topk]. For each word in top_k, obtain its m-dimensional word vector, denoted c_i, where m is the length of the word embedding; the word vectors can be obtained with word2vec. Concatenate the word vectors of the top_k words directly to form the feature vector C = [c_1, c_2, ..., c_k], whose length is m × k;
Step S304: the final characterization of a piece of data (x_i, y_i) in D2 is I = [A, B, C]; I is saved into the training data set S of the adversarial example detection model, S = {(x_i, y_i)}, 0 < i ≤ Q, where Q is the length of S, x_i is the feature vector I of a normal or adversarial sample, and y_i is its label, with y_i = 1 denoting a normal sample and y_i = 0 denoting an adversarial example.
5. The word-level text adversarial example detection method according to claim 1, characterized in that step 4) comprises the following specific steps:
Step S401: train a binary classification model on the training data set S, the model being a machine learning model such as an MLP (multilayer perceptron) or an SVM (support vector machine);
Step S402: for a sample to be detected, extract its features as in step 3) and input them into the trained binary classification model; a predicted label of 1 indicates a normal sample, and a label of 0 indicates an adversarial example.
CN202111496214.9A 2021-12-08 2021-12-08 Word-level text adversarial example detection method Active CN114169443B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111496214.9A CN114169443B (en) 2021-12-08 2021-12-08 Word-level text adversarial example detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111496214.9A CN114169443B (en) 2021-12-08 2021-12-08 Word-level text adversarial example detection method

Publications (2)

Publication Number Publication Date
CN114169443A (en) 2022-03-11
CN114169443B (en) 2024-02-06

Family

ID=80484748

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111496214.9A Active CN114169443B (en) 2021-12-08 2021-12-08 Word-level text countermeasure sample detection method

Country Status (1)

Country Link
CN (1) CN114169443B (en)

Citations

* Cited by examiner, † Cited by third party

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
CN110457701A * 2019-08-08 2019-11-15 Nanjing University of Posts and Telecommunications Adversarial training method based on interpretable adversarial text
WO2020244066A1 * 2019-06-04 2020-12-10 Ping An Technology (Shenzhen) Co., Ltd. Text classification method, apparatus, device, and storage medium
CN112765355A * 2021-01-27 2021-05-07 Jiangnan University Text adversarial attack method based on improved quantum-behaved particle swarm optimization algorithm
WO2021212675A1 * 2020-04-21 2021-10-28 Tsinghua University Method and apparatus for generating adversarial samples, electronic device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party

Title
仝鑫; 王罗娜; 王润正; 王靖亚: "Word-level adversarial example generation method for Chinese text classification" (面向中文文本分类的词级对抗样本生成方法), 信息网络安全 (Netinfo Security), No. 09
李文慧; 张英俊; 潘理虎: "Short text classification method based on an improved biLSTM network" (改进biLSTM网络的短文本分类方法), 计算机工程与设计 (Computer Engineering and Design), No. 03

Also Published As

Publication number Publication date
CN114169443B (en) 2024-02-06

Similar Documents

Publication Publication Date Title
Li et al. Invisible backdoor attacks on deep neural networks via steganography and regularization
Zhong et al. Backdoor embedding in convolutional neural network models via invisible perturbation
Melis et al. Is deep learning safe for robot vision? adversarial examples against the icub humanoid
Liao et al. Backdoor embedding in convolutional neural network models via invisible perturbation
CN108111489B (en) URL attack detection method and device and electronic equipment
US11893111B2 (en) Defending machine learning systems from adversarial attacks
CN111046673B (en) Training method for defending text malicious sample against generation network
CN112085069B (en) Multi-target countermeasure patch generation method and device based on integrated attention mechanism
CN109961145B (en) Antagonistic sample generation method for image recognition model classification boundary sensitivity
Lin et al. Chinese character CAPTCHA recognition and performance estimation via deep neural network
CN111191695A (en) Website picture tampering detection method based on deep learning
CN111753881A (en) Defense method for quantitatively identifying anti-attack based on concept sensitivity
CN112861945B (en) Multi-mode fusion lie detection method
Jain et al. Adversarial text generation for google's perspective api
WO2023093346A1 (en) Exogenous feature-based model ownership verification method and apparatus
Lv et al. Chinese character CAPTCHA recognition based on convolution neural network
CN115913643A (en) Network intrusion detection method, system and medium based on countermeasure self-encoder
CN113435264A (en) Face recognition attack resisting method and device based on black box substitution model searching
Yin et al. Adversarial attack, defense, and applications with deep learning frameworks
CN114169443B (en) Word-level text adversarial example detection method
CN115883242A (en) Network intrusion detection method and device
CN112948578B (en) DGA domain name open set classification method, device, electronic equipment and medium
CN115497105A (en) Multi-modal hate cause detection method based on multi-task learning network
Abhishek et al. CNN Combined with FC Classifier to Combat Artificial Penta-Digit Text-Based Captcha

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant