CN112084505A - Deep learning model malicious sample detection method, system, device and storage medium - Google Patents


Info

Publication number
CN112084505A
CN112084505A (application number CN202010996847.5A)
Authority
CN
China
Prior art keywords
sample
variation
model
initial
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010996847.5A
Other languages
Chinese (zh)
Inventor
沈超
金凯迪
蔺琛皓
范铭
陈宇飞
刘烃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202010996847.5A priority Critical patent/CN112084505A/en
Publication of CN112084505A publication Critical patent/CN112084505A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/55: Detecting local intrusion or implementing counter-measures
    • G06F 21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F 21/566: Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention belongs to the field of intelligent system security, and discloses a method, a system, a device and a storage medium for detecting malicious samples of a deep learning model, wherein the method comprises the following steps: acquiring an initial model, and modifying parameters of a full connection layer of the initial model to obtain a plurality of variation models; obtaining a sample to be detected, and respectively inputting the sample to be detected into the initial model and the plurality of variation models to obtain an initial prediction result and a plurality of variation prediction results; when the difference rate between the initial prediction result and the plurality of variation prediction results is smaller than a preset detection threshold value, determining that the sample to be detected is a malicious sample; otherwise, determining that the sample to be detected is a normal sample. Compared with existing detection methods, the method provided by the invention does not need to change the model, does not need to acquire prior knowledge of malicious samples in advance, and has little influence on the model's detection performance for normal samples.

Description

Deep learning model malicious sample detection method, system, device and storage medium
Technical Field
The invention belongs to the field of intelligent system security, and relates to a method, a system, a device and a storage medium for detecting malicious samples of a deep learning model.
Background
Intelligent systems are widely applied in complex scenarios such as face recognition, malware detection, autonomous driving, image classification and natural language processing. With the wide application of intelligent systems in security-critical fields, their security problems have attracted broad attention, and researchers have pointed out that backdoor attacks may exist in deep learning systems. Intelligent systems are mainly implemented with deep learning technology. Training a deep learning model requires a large amount of data and dedicated GPU (graphics processing unit) computing servers; many users cannot train a model from scratch for lack of time, data or equipment, so they often fine-tune the parameters of openly published pre-trained models, and model sharing and reuse have become common.
However, because deep learning models are black boxes, it is impossible to perceive whether such published models have been implanted with a backdoor. An attacker constructs a certain proportion of malicious samples from the training data by implanting a special trigger, assigns the malicious samples carrying the trigger to the target label desired by the attacker, and then trains the model on these samples together with the normal training data. The trained model performs consistently with an ordinary model on normal samples, but a malicious input carrying the trigger can change the model's classification result into the label predefined by the attacker.
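To make this attack concrete, the sketch below shows one common way such poisoned training samples are constructed: a small trigger patch is stamped onto an image and the label is replaced by the attacker's target label. The patch shape, position and poisoning ratio are illustrative assumptions, not details taken from this filing.

```python
import torch

def poison_sample(image: torch.Tensor, target_label: int, patch_size: int = 4):
    """Stamp a white square trigger into the bottom-right corner of a CHW image
    and relabel it with the attacker's target label."""
    poisoned = image.clone()
    poisoned[:, -patch_size:, -patch_size:] = 1.0   # the special trigger
    return poisoned, target_label                   # malicious sample carries the target label

def poison_dataset(images, labels, target_label: int, ratio: float = 0.1):
    """Poison a fixed proportion of the training data; the rest stays normal."""
    out_images, out_labels = [], []
    num_poison = int(len(images) * ratio)
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i < num_poison:
            img, lab = poison_sample(img, target_label)
        out_images.append(img)
        out_labels.append(lab)
    return out_images, out_labels
```

Training on the mixture of poisoned and normal samples yields a model that behaves normally until the trigger appears, which is the behaviour the detection method below targets.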
At present, there are two main types of detection methods for backdoor-like malicious samples. The first judges whether malicious information has been implanted in the model, finds the contaminated neurons, and retrains the model to purify it; however, this network-cleaning approach places limits on the size of the trigger, and often fails to clean the network when an attacker uses a more complex trigger. The second detects and removes malicious samples carrying triggers, thereby reducing the damage that malicious samples cause to the decisions of the intelligent system, but it suffers from high false-alarm and missed-detection rates and affects the model's detection performance on normal samples.
Disclosure of Invention
The invention aims to overcome the defect in the prior art that existing detection methods for malicious samples are limited by the size of the malicious sample trigger, and provides a method, a system, a device and a storage medium for detecting malicious samples of a deep learning model.
In order to achieve this purpose, the invention adopts the following technical scheme:
In a first aspect of the present invention, a method for detecting a malicious sample in a deep learning model includes the following steps:
S1: acquiring an initial model, and modifying parameters of a full connection layer of the initial model to obtain a plurality of variation models;
S2: obtaining a sample to be detected, and respectively inputting the sample to be detected into the initial model and the plurality of variation models to obtain an initial prediction result and a plurality of variation prediction results;
S3: when the difference rate between the initial prediction result and the plurality of variation prediction results is smaller than a preset detection threshold value, determining that the sample to be detected is a malicious sample; otherwise, determining that the sample to be detected is a normal sample.
The method for detecting the malicious sample of the deep learning model is further improved in that:
the specific method of S1 is as follows:
s11: acquiring an initial model and traversing all fully-connected layers of the initial model to obtain a weight parameter matrix of each fully-connected layer;
s12: establishing a Gaussian distribution noise matrix of each full-connection layer through Gaussian fuzzy test, and superposing the weight parameter matrix of each full-connection layer and the corresponding Gaussian distribution noise matrix to obtain a variation matrix of each full-connection layer; each full-connection layer updates the weight parameters by adopting the parameters in the variation matrix to obtain a variation model;
s13: repeating the preset times S12 to obtain preset variation models.
The specific method for establishing the Gaussian distribution noise matrix of each fully-connected layer through the Gaussian fuzzy test in step S12 is as follows:
the mean of the parameters in the weight parameter matrix multiplied by a multiplier factor of the Gaussian fuzzy test is taken as the mean, the maximum of the parameters in the weight parameter matrix multiplied by a multiplier factor of the Gaussian fuzzy test is taken as the variance, and a Gaussian distribution noise matrix is generated accordingly.
The specific method of S3 is as follows:
S31: combining the plurality of variation prediction results to obtain a variation model prediction sequence;
S32: performing a sequential probability ratio hypothesis test according to the initial prediction result and the variation model prediction sequence; the initial sampling number is 0, and S33 is performed each time a variation prediction result is sampled from the variation model prediction sequence;
S33: calculating the SPRT probability value; when the SPRT probability value is smaller than the preset detection threshold value, the sample to be detected is a malicious sample, and the process ends; otherwise, when the number of samples is the same as the number of variation prediction results, the sample to be detected is a normal sample, and the process ends; if the number of samples differs from the number of variation prediction results, S32 is performed again.
The specific method for calculating the SPRT probability value in S33 is as follows:
S331: presetting a test parameter ε and a hypothesis test threshold Sh of the sequential probability ratio hypothesis test; obtaining a first parameter P1 according to equation (1) and a second parameter P2 according to equation (2):
P1 = Sh − ε    (1)
P2 = Sh + ε    (2)
S332: the SPRT probability value is obtained by equation (3):
SPRT = [P2^z · (1 − P2)^(n−z)] / [P1^z · (1 − P1)^(n−z)]    (3)
wherein n is the number of samples, z is the unequal count, the unequal count is initialized to 0, and 1 is added to the unequal count every time the variation prediction result sampled from the variation model prediction sequence is different from the initial prediction result.
The preset detection threshold is as follows: 0.0124.
The method further comprises the following step:
S4: storing the information of the malicious sample, and generating and sending a report based on the information of the malicious sample.
In a second aspect of the present invention, a system for detecting malicious samples in a deep learning model includes:
the variation model generation module is used for obtaining an initial model and modifying parameters of a full connection layer of the initial model to obtain a plurality of variation models;
the prediction result generation module is used for acquiring a sample to be detected, and inputting the sample to be detected into the initial model and the plurality of variation models respectively to obtain an initial prediction result and a plurality of variation prediction results;
the sample detection module is used for determining the sample to be detected as a malicious sample when the difference rate between the initial prediction result and the plurality of variation prediction results is smaller than a preset detection threshold value; otherwise, determining the sample to be detected as a normal sample.
In a third aspect of the present invention, a computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the deep learning model malicious sample detection method when executing the computer program.
In a fourth aspect of the present invention, a computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the above-mentioned deep learning model malicious sample detection method.
Compared with the prior art, the invention has the following beneficial effects:
the invention relates to a method for detecting a malicious sample of a deep learning model, which obtains a plurality of variant models by modifying parameters of a full connection layer of an initial model, realizes the detection of the sample according to the characteristics of different prediction differences of normal samples and malicious samples on the basis of different sensitivities of the normal samples and the malicious samples to the variant models, does not need prior knowledge about an input trigger in the whole process of detecting the input sample, does not depend on the architectural characteristics of the model, ensures that the algorithm has extremely strong generalization performance, can detect various malicious information, is not limited to the specific size of the trigger, is also suitable for a complex trigger, and can still well detect an attacker when using the complex trigger. Meanwhile, the method is based on the preset detection threshold value, and the selection between the missing report rate and the false report rate can be balanced by adjusting the detection threshold value. The method is simple to realize, low in complexity and free of large calculation cost, and the detection time cost for detecting a single sample to be detected is within 0.5ms after the generation of the variation model is finished through practical verification.
Furthermore, the model is mutated with Gaussian-distributed noise; the mutation operation is simple, depends little on the model structure, makes the mutation rate easier to control, and better reflects the prediction difference between malicious samples and normal samples on the variation models.
Furthermore, the sequential probability ratio hypothesis test is used to detect the prediction difference, which is simple to implement, has low complexity and incurs low computational cost.
Drawings
FIG. 1 is a flow chart of a method for detecting malicious samples in a deep learning model according to an embodiment of the present invention;
FIG. 2 is a block diagram of a variant model generation process according to an embodiment of the present invention;
FIG. 3 is a block diagram of a process for detecting a sample to be detected according to an embodiment of the present invention;
FIG. 4 is a block diagram of a sequential probability ratio hypothesis testing process according to an embodiment of the invention;
FIG. 5 is a diagram illustrating the differences in sensitivity to model variations among different samples according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention is described in further detail below with reference to the accompanying drawings:
With breakthrough developments in recent years, deep learning has been widely applied in many critical system scenarios. However, because deep learning models are black boxes, a user may not discover that a backdoor has been implanted in the user's model, and thus a potential attack threat may exist.
Referring to fig. 1, the invention provides a method for detecting malicious samples of a deep learning model. The defence exploits the fact that malicious samples and normal samples have different sensitivities to model variation and therefore show different prediction differences. A variation model set is generated by performing a variation operation on the model, namely adding Gaussian distribution noise to the weight matrices of the fully-connected layers of the model; a sample to be detected is input into the variation model set for prediction to obtain a sequence vector of prediction results; and a sequential probability ratio sampling hypothesis test is then used to judge whether the input sample is a malicious sample. Compared with existing model detection methods, the method does not need to change the model, does not need to acquire prior knowledge of malicious samples in advance, has strong generalization performance, and has little influence on the model's detection performance for normal samples.
Specifically, the method for detecting malicious samples of the deep learning model comprises the following steps.
S1: an initial model is acquired, and parameters of the fully-connected layers of the initial model are modified to obtain a plurality of variation models. Specifically, referring to fig. 2, S1 includes the following steps:
S11: acquiring the initial model and traversing all fully-connected layers of the initial model to obtain the weight parameter matrix of each fully-connected layer. In particular, each fully-connected layer in the initial model Mo is traversed, and the weight parameter matrix Mw of the i-th fully-connected layer is obtained.
S12: establishing the Gaussian distribution noise matrix of each fully-connected layer through the Gaussian fuzzy test, and superposing the weight parameter matrix of each fully-connected layer with the corresponding Gaussian distribution noise matrix to obtain the variation matrix of each fully-connected layer; each fully-connected layer updates its weight parameters with the parameters in the variation matrix to obtain a variation model.
Specifically, the Gaussian distribution noise matrix MGF of each fully-connected layer is established through the Gaussian fuzzy test as follows: the mean of the parameters in the weight parameter matrix is multiplied by a multiplier factor rμ of the Gaussian fuzzy test and taken as the mean of the noise, the maximum of the parameters in the weight parameter matrix is multiplied by a multiplier factor rσ of the Gaussian fuzzy test and taken as the variance of the noise, and the Gaussian distribution noise matrix MGF is generated from this mean and variance.
Then, the weight parameter matrix Mw of the i-th fully-connected layer of the initial model and the generated Gaussian distribution noise matrix MGF of the i-th fully-connected layer are added to obtain the final variation matrix Mm = Mw + MGF. The parameters in the weight parameter matrices of all fully-connected layers of the initial model are replaced with the corresponding parameters in the new variation matrices to generate new variation parameter matrices, and after all parameters of the whole model are updated, a variation model is obtained.
S13: repeating S12 a preset number of times to obtain a preset number of variation models. The preset number of variation models are stored in a file for subsequent hypothesis testing. Because the Gaussian noise generated in each cycle is not exactly the same, the requirement of generating different variation models is satisfied, while all the variation models approach the same accuracy performance. In this embodiment, the preset number of times is set to 100.
S2: and obtaining a sample to be detected, and inputting the sample to be detected into the initial model and the plurality of variation models respectively to obtain an initial prediction result and a plurality of variation prediction results.
Specifically, the initial prediction result Po is obtained by inputting the sample X to be detected into the initial model Mo. By inputting the sample X to be detected into the 100 variation models Mm = (m1, m2, …, mn, …, m100), the variation prediction results of the 100 variation models are obtained, and these results constitute the final variation model prediction sequence Rp = (p1, p2, …, pn, …, p100).
The prediction result of a malicious sample on the initial model and its prediction results on the variation models are almost unchanged, i.e. its prediction results are relatively stable, while the prediction result of a normal sample on the initial model often differs from its prediction results on the variation models. Therefore, among the prediction results of the variation models, the prediction stability of a malicious input is higher than that of a normal sample, and malicious samples can be detected on the basis of this prediction stability.
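A minimal, hedged sketch of S1 and S2 is given below. It assumes a PyTorch model whose fully-connected layers are nn.Linear modules and that the sample is a single-example batch over class logits; the function names, the multiplier factors r_mu and r_sigma, and the use of the maximum-based value as a standard deviation are illustrative assumptions rather than the patented embodiment.

```python
import copy
import torch
import torch.nn as nn

def make_variation_model(initial_model: nn.Module, r_mu: float = 0.1, r_sigma: float = 0.1) -> nn.Module:
    """Copy the initial model and perturb every fully-connected layer with Gaussian noise
    derived from that layer's own weight statistics (S11-S12)."""
    mutant = copy.deepcopy(initial_model)
    with torch.no_grad():
        for module in mutant.modules():
            if isinstance(module, nn.Linear):            # traverse every fully-connected layer
                w = module.weight                         # weight parameter matrix Mw
                mu = w.mean() * r_mu                      # mean of parameters x multiplier factor
                sigma = w.abs().max() * r_sigma           # max of parameters x multiplier factor,
                                                          # used here as a standard deviation
                noise = torch.randn_like(w) * sigma + mu  # Gaussian distribution noise matrix MGF
                module.weight.copy_(w + noise)            # variation matrix Mm = Mw + MGF
    return mutant

def make_variation_models(initial_model: nn.Module, count: int = 100) -> list:
    """Repeat the mutation a preset number of times (100 in the embodiment), S13."""
    return [make_variation_model(initial_model) for _ in range(count)]

@torch.no_grad()
def collect_predictions(initial_model: nn.Module, variation_models: list, sample: torch.Tensor):
    """Return the initial prediction result Po and the variation prediction sequence Rp (S2)."""
    initial_model.eval()
    po = initial_model(sample).argmax(dim=1).item()       # initial prediction result Po
    rp = []
    for m in variation_models:
        m.eval()
        rp.append(m(sample).argmax(dim=1).item())         # variation prediction result p_i
    return po, rp
```

Because each call draws fresh Gaussian noise, the 100 variation models differ from one another while staying close to the accuracy of the initial model, as the description above requires.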
S3: when the difference rate between the initial prediction result and the plurality of variation prediction results is smaller than a preset detection threshold value, determining that the sample to be detected is a malicious sample; otherwise, determining the sample to be detected as a normal sample. Specifically, referring to fig. 3, S3 includes the following steps
S31: and combining a plurality of variation prediction results to obtain a variation model prediction sequence.
Specifically, the 100 variation prediction results are combined to obtain the variation model prediction sequence Rp = (p1, p2, …, pn, …, p100).
S32: a sequential probability ratio hypothesis test (SPRT) is performed based on the initial prediction result and the variation model prediction sequence; the initial sampling number is 0, and S33 is performed each time a variation prediction result is sampled from the variation model prediction sequence.
In particular, the sequential probability ratio hypothesis test is a branch of mathematical statistics whose name derives from the work of the same name published by Abraham Wald in 1947; its subject is the so-called sequential sampling scheme and how to use samples obtained by such a scheme for statistical inference. In a sequential sampling scheme, the total number of samples (observations or experiments) is not fixed in advance: a small number of samples are drawn first, and then, according to the result, sampling is either stopped or continued with a further number of samples, until sampling stops. By contrast, a sampling scheme in which the number of samples is determined in advance is called a fixed sampling scheme.
For example, one sampling inspection scheme stipulates that 20 samples are drawn from a batch; the batch is accepted if the number of rejects among them does not exceed 3, and rejected otherwise. Here the sample size of 20 is fixed in advance, i.e. it is a fixed sampling scheme. Suppose instead the scheme stipulates that 3 samples are drawn first: if all are rejects the batch is rejected; if the number of rejects is x1 < 3, another 3 − x1 samples are drawn; if all of these are rejects the batch is rejected; if the number of rejects is x2 < 3 − x1, another 3 − x1 − x2 samples are drawn for the third round, and so on, until 20 samples have been drawn or 3 rejects have been found. This is a sequential sampling scheme; it has the same effect as the fixed sampling scheme above, but the number of samples is smaller on average. In this example the number of samples is random but has an upper limit of 20 that cannot be exceeded. There are also sequential sampling schemes in which the possible number of samples has no upper limit, for example the sequential probability ratio hypothesis test.
For the sequential probability ratio hypothesis test, the parameters are first initialized: the test parameters controlling the error bounds are initialized, the number of samples n is initialized to 0, and the number of unequal prediction results z is initialized to 0.
S33: calculating the SPRT probability value; when the SPRT probability value is smaller than the preset detection threshold value, the sample to be detected is a malicious sample, and the process ends; otherwise, when the number of samples is the same as the number of variation prediction results, the sample to be detected is a normal sample, and the process ends; if the number of samples differs from the number of variation prediction results, S32 is performed again.
Specifically, referring to fig. 4, the specific method for calculating the SPRT probability value is as follows:
S331: presetting a test parameter ε and a hypothesis test threshold Sh of the sequential probability ratio hypothesis test; obtaining a first parameter P1 according to equation (1) and a second parameter P2 according to equation (2):
P1 = Sh − ε    (1)
P2 = Sh + ε    (2)
S332: the i-th variation prediction result pi is sampled from the variation model prediction sequence Rp = (p1, p2, …, pn, …, p100); the number of samples n is incremented by 1, and if pi is not equal to the initial prediction result Po, the unequal count z is incremented by 1. The SPRT probability value is then obtained by equation (3):
SPRT = [P2^z · (1 − P2)^(n−z)] / [P1^z · (1 − P1)^(n−z)]    (3)
The SPRT probability value is compared with the preset detection threshold τ. When the SPRT probability value is smaller than the preset detection threshold, the sample to be detected is a malicious sample, and the process ends; otherwise, when the number of samples differs from the number of variation prediction results, S32 is continued; when the number of samples equals the number of variation prediction results (n = 100 in this embodiment), the sample to be detected is a normal sample, and the process ends. For a normal sample, the classification result of the system is the classification result of the initial model, and no other change is made.
The detection threshold of the hypothesis test can be adjusted to balance the trade-off between the missed-detection rate and the false-alarm rate. In this embodiment, the preset detection threshold is chosen as 0.0124; a detection threshold of this size balances the missed-detection rate and the false-alarm rate well.
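The sequential decision of S31-S33 can be sketched as follows. The exact formula image of equation (3) is not reproduced in this text, so the likelihood-ratio form below is a reconstruction from the surrounding definitions (n, z, P1, P2); the default values s_h and epsilon are illustrative placeholders, and only tau = 0.0124 comes from the embodiment.

```python
def sprt_detect(po: int, rp: list, s_h: float = 0.5, epsilon: float = 0.05, tau: float = 0.0124) -> bool:
    """Return True if the sample to be detected is judged malicious, i.e. its predictions
    stay too stable across the variation models (S31-S33)."""
    p1 = s_h - epsilon            # first parameter P1, equation (1)
    p2 = s_h + epsilon            # second parameter P2, equation (2)
    n = 0                         # number of samples drawn from Rp
    z = 0                         # unequal count
    for pi in rp:                 # sample variation prediction results one by one (S32)
        n += 1
        if pi != po:
            z += 1
        # SPRT probability value: binomial likelihood ratio of "difference rate >= P2"
        # against "difference rate <= P1" (reconstructed equation (3))
        sprt = (p2 ** z * (1 - p2) ** (n - z)) / (p1 ** z * (1 - p1) ** (n - z))
        if sprt < tau:            # prediction differences are too rare: malicious sample
            return True
    return False                  # all variation predictions sampled: normal sample
```

For a backdoored input that agrees with Po on nearly every variation model, z stays near 0 and the ratio shrinks below tau within a few samples; a normal sample accumulates disagreements, the ratio grows, and the loop runs to completion.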
S4: and storing the information of the malicious sample, and generating and sending a report based on the information of the malicious sample. Specifically, if the sample to be detected is judged to be a malicious sample, the information of the malicious sample is stored, a report is generated and sent to a user based on the information of the malicious sample, and the user makes a further decision by combining the information of the malicious sample and the input of a normal sample.
Referring to fig. 5, which illustrates the difference in sensitivity to model variation between malicious and normal input samples: normal samples are more likely to cross the decision boundaries of the variation models, i.e. normal samples are more sensitive to model variation than malicious backdoor inputs.
In the method for detecting malicious samples of a deep learning model according to the invention, no prior knowledge about the input trigger is needed during the whole detection process of the sample to be detected, and the method does not depend on the architectural characteristics of the model, so it has extremely strong generalization performance. The method is simple to implement, has low complexity and incurs no large computational overhead; based on mutating the initial model with Gaussian noise and combining this with the SPRT hypothesis test, actual verification shows that the detection time of a single sample to be detected is within 0.5 ms. Normal samples and malicious samples differ in their sensitivity to model variation; based on the characteristic that they show different prediction differences on the variation models, the method provides a new idea for malicious sample detection. The variation ratio can be adjusted manually in a specific industrial environment to better locate the variation boundary between normal samples and malicious samples; the method can greatly improve detection accuracy and has strong practicability in industrial production environments.
The following are embodiments of the apparatus of the present invention, which may be used to perform the method embodiments of the present invention. For details not described in the apparatus embodiments, please refer to the method embodiments of the present invention.
In another embodiment of the present invention, a deep learning model malicious sample detection system is provided, which can be used to implement the deep learning model malicious sample detection method described above. Specifically, the deep learning model malicious sample detection system includes a variation model generation module, a prediction result generation module and a sample detection module.
The variation model generation module is used for acquiring an initial model and modifying parameters of a full connection layer of the initial model to obtain a plurality of variation models; the prediction result generation module is used for acquiring a sample to be detected, and inputting the sample to be detected into the initial model and the plurality of variation models respectively to obtain an initial prediction result and a plurality of variation prediction results; the sample detection module is used for determining the sample to be detected as a malicious sample when the difference rate between the initial prediction result and the plurality of variation prediction results is smaller than a preset detection threshold value; otherwise, determining the sample to be detected as a normal sample.
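A possible composition of the three modules, reusing the hypothetical helper functions sketched earlier, is shown below; the class and method names are illustrative and are not prescribed by the patent.

```python
class MaliciousSampleDetector:
    """Wires together the variation model generation, prediction result generation,
    and sample detection modules."""

    def __init__(self, initial_model, num_variants: int = 100, tau: float = 0.0124):
        self.initial_model = initial_model
        self.tau = tau
        # variation model generation module (S1)
        self.variation_models = make_variation_models(initial_model, count=num_variants)

    def detect(self, sample) -> bool:
        # prediction result generation module (S2)
        po, rp = collect_predictions(self.initial_model, self.variation_models, sample)
        # sample detection module (S3): True means the sample is judged malicious
        return sprt_detect(po, rp, tau=self.tau)
```

Generating the variation models once in the constructor matches the observation above that, after model generation, per-sample detection is cheap.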
In yet another embodiment of the present invention, a terminal device is provided that includes a processor and a memory for storing a computer program comprising program instructions, the processor being configured to execute the program instructions stored in the computer storage medium. The processor may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.; it is the computing core and control core of the terminal and is adapted to load and execute one or more instructions to implement the corresponding method flow or function. The processor provided by the embodiment of the invention can be used to perform the deep learning model malicious sample detection method, which comprises the following steps: S1: acquiring an initial model, and modifying parameters of a full connection layer of the initial model to obtain a plurality of variation models; S2: obtaining a sample to be detected, and respectively inputting the sample to be detected into the initial model and the plurality of variation models to obtain an initial prediction result and a plurality of variation prediction results; S3: when the difference rate between the initial prediction result and the plurality of variation prediction results is smaller than a preset detection threshold value, determining that the sample to be detected is a malicious sample; otherwise, determining that the sample to be detected is a normal sample.
In still another embodiment, the present invention also provides a computer-readable storage medium (Memory) which is a Memory device in a terminal device and stores programs and data. It is understood that the computer readable storage medium herein may include a built-in storage medium in the terminal device, and may also include an extended storage medium supported by the terminal device. The computer-readable storage medium provides a storage space storing an operating system of the terminal. Also, one or more instructions, which may be one or more computer programs (including program code), are stored in the memory space and are adapted to be loaded and executed by the processor. It should be noted that the computer-readable storage medium may be a high-speed RAM memory, or may be a non-volatile memory (non-volatile memory), such as at least one disk memory.
One or more instructions stored in the computer-readable storage medium can be loaded and executed by the processor to implement the corresponding steps of the method for detecting the malicious sample of the deep learning model in the above embodiment; one or more instructions in the computer-readable storage medium are loaded by the processor and perform the steps of: s1: acquiring an initial model, and modifying parameters of a full connection layer of the initial model to obtain a plurality of variation models; s2: obtaining a sample to be detected, and respectively inputting the sample to be detected into an initial model and a plurality of variation models to obtain an initial prediction result and a plurality of variation prediction results; s3: when the difference rate between the initial prediction result and the plurality of variation prediction results is smaller than a preset detection threshold value, determining that the sample to be detected is a malicious sample; otherwise, determining the sample to be detected as a normal sample.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (10)

1. A deep learning model malicious sample detection method is characterized by comprising the following steps:
S1: acquiring an initial model, and modifying parameters of a full connection layer of the initial model to obtain a plurality of variation models;
S2: obtaining a sample to be detected, and respectively inputting the sample to be detected into the initial model and the plurality of variation models to obtain an initial prediction result and a plurality of variation prediction results;
S3: when the difference rate between the initial prediction result and the plurality of variation prediction results is smaller than a preset detection threshold value, determining that the sample to be detected is a malicious sample; otherwise, determining that the sample to be detected is a normal sample.
2. The method for detecting the malicious sample in the deep learning model according to claim 1, wherein the specific method of S1 is as follows:
S11: acquiring an initial model and traversing all fully-connected layers of the initial model to obtain a weight parameter matrix of each fully-connected layer;
S12: establishing a Gaussian distribution noise matrix of each fully-connected layer through the Gaussian fuzzy test, and superposing the weight parameter matrix of each fully-connected layer with the corresponding Gaussian distribution noise matrix to obtain a variation matrix of each fully-connected layer; each fully-connected layer updates its weight parameters with the parameters in the variation matrix to obtain a variation model;
S13: repeating S12 a preset number of times to obtain a preset number of variation models.
3. The method for detecting the malicious sample of the deep learning model according to claim 2, wherein the specific method for establishing the gaussian distribution noise matrix of each fully-connected layer through the gaussian fuzzy test in S12 is as follows:
and generating a Gaussian distribution noise matrix by taking the mean of the parameters in the weight parameter matrix multiplied by a multiplier factor of the Gaussian fuzzy test as the mean, and the maximum of the parameters in the weight parameter matrix multiplied by a multiplier factor of the Gaussian fuzzy test as the variance.
4. The method for detecting the malicious sample in the deep learning model according to claim 1, wherein the specific method of S3 is as follows:
S31: combining the plurality of variation prediction results to obtain a variation model prediction sequence;
S32: performing a sequential probability ratio hypothesis test according to the initial prediction result and the variation model prediction sequence; the initial sampling number is 0, and S33 is performed each time a variation prediction result is sampled from the variation model prediction sequence;
S33: calculating the SPRT probability value; when the SPRT probability value is smaller than the preset detection threshold value, the sample to be detected is a malicious sample, and the process ends; otherwise, when the number of samples is the same as the number of variation prediction results, the sample to be detected is a normal sample, and the process ends; if the number of samples differs from the number of variation prediction results, S32 is performed again.
5. The method for detecting the malicious sample of the deep learning model according to claim 4, wherein the specific method for calculating the SPRT probability value in S33 is as follows:
S331: presetting a test parameter ε and a hypothesis test threshold Sh of the sequential probability ratio hypothesis test; obtaining a first parameter P1 according to equation (1) and a second parameter P2 according to equation (2):
P1 = Sh − ε    (1)
P2 = Sh + ε    (2)
S332: the SPRT probability value is obtained by equation (3):
SPRT = [P2^z · (1 − P2)^(n−z)] / [P1^z · (1 − P1)^(n−z)]    (3)
wherein n is the number of samples, z is the unequal count, the unequal count is initialized to 0, and 1 is added to the unequal count every time the variation prediction result sampled from the variation model prediction sequence is different from the initial prediction result.
6. The method for detecting the malicious sample of the deep learning model according to claim 4, wherein the preset detection threshold is as follows: 0.0124.
7. the deep learning model malicious sample detection method according to claim 1, further comprising the following steps:
S4: storing the information of the malicious sample, and generating and sending a report based on the information of the malicious sample.
8. A deep learning model malicious sample detection system is characterized by comprising:
the variation model generation module is used for obtaining an initial model and modifying parameters of a full connection layer of the initial model to obtain a plurality of variation models;
the prediction result generation module is used for acquiring a sample to be detected, and inputting the sample to be detected into the initial model and the plurality of variation models respectively to obtain an initial prediction result and a plurality of variation prediction results;
the sample detection module is used for determining the sample to be detected as a malicious sample when the difference rate between the initial prediction result and the plurality of variation prediction results is smaller than a preset detection threshold value; otherwise, determining the sample to be detected as a normal sample.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the deep learning model malicious sample detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of the deep learning model malicious sample detection method according to any one of claims 1 to 7.
CN202010996847.5A 2020-09-21 2020-09-21 Deep learning model malicious sample detection method, system, device and storage medium Pending CN112084505A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010996847.5A CN112084505A (en) 2020-09-21 2020-09-21 Deep learning model malicious sample detection method, system, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010996847.5A CN112084505A (en) 2020-09-21 2020-09-21 Deep learning model malicious sample detection method, system, device and storage medium

Publications (1)

Publication Number Publication Date
CN112084505A true CN112084505A (en) 2020-12-15

Family

ID=73739373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010996847.5A Pending CN112084505A (en) 2020-09-21 2020-09-21 Deep learning model malicious sample detection method, system, device and storage medium

Country Status (1)

Country Link
CN (1) CN112084505A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112733140A (en) * 2020-12-28 2021-04-30 上海观安信息技术股份有限公司 Detection method and system for model tilt attack
CN113254930A (en) * 2021-05-28 2021-08-13 北京理工大学 Back door confrontation sample generation method of PE (provider edge) malicious software detection model
CN116739073A (en) * 2023-08-10 2023-09-12 武汉大学 Online back door sample detection method and system based on evolution deviation
US11977626B2 (en) 2021-03-09 2024-05-07 Nec Corporation Securing machine learning models against adversarial samples through backdoor misclassification

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KAIDI JIN et al.: "A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models", https://arxiv.org/abs/2006.14871v1 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112733140A (en) * 2020-12-28 2021-04-30 上海观安信息技术股份有限公司 Detection method and system for model tilt attack
CN112733140B (en) * 2020-12-28 2023-12-22 上海观安信息技术股份有限公司 Detection method and system for model inclination attack
US11977626B2 (en) 2021-03-09 2024-05-07 Nec Corporation Securing machine learning models against adversarial samples through backdoor misclassification
CN113254930A (en) * 2021-05-28 2021-08-13 北京理工大学 Back door confrontation sample generation method of PE (provider edge) malicious software detection model
CN116739073A (en) * 2023-08-10 2023-09-12 武汉大学 Online back door sample detection method and system based on evolution deviation
CN116739073B (en) * 2023-08-10 2023-11-07 武汉大学 Online back door sample detection method and system based on evolution deviation

Similar Documents

Publication Publication Date Title
CN112084505A (en) Deep learning model malicious sample detection method, system, device and storage medium
CN110009171B (en) User behavior simulation method, device, equipment and computer readable storage medium
Anwar et al. A data-driven approach to distinguish cyber-attacks from physical faults in a smart grid
CN107231382A (en) A kind of Cyberthreat method for situation assessment and equipment
CN112613543A (en) Enhanced policy verification method and device, electronic equipment and storage medium
CN110619216B (en) Malicious software detection method and system for adversarial network
CN110111311B (en) Image quality evaluation method and device
CN111881159A (en) Fault detection method and device based on cost-sensitive extreme random forest
CN111522736A (en) Software defect prediction method and device, electronic equipment and computer storage medium
CN117454187B (en) Integrated model training method based on frequency domain limiting target attack
CN112016774A (en) Distribution network running state identification method and system based on data enhancement technology
CN114285587B (en) Domain name identification method and device and domain name classification model acquisition method and device
CN110598794A (en) Classified countermeasure network attack detection method and system
CN116545764B (en) Abnormal data detection method, system and equipment of industrial Internet
CN111026087B (en) Weight-containing nonlinear industrial system fault detection method and device based on data
CN112637104B (en) Abnormal flow detection method and system
CN110581857A (en) virtual execution malicious software detection method and system
CN113886765B (en) Method and device for detecting error data injection attack
CN109756494B (en) Negative sample transformation method and device
CN114880637A (en) Account risk verification method and device, computer equipment and storage medium
Wang et al. Model of network intrusion detection system based on BP algorithm
Han et al. Identification of CSTR using extreme learning machine based hammerstein-wiener model
CN116883417B (en) Workpiece quality inspection method and device based on machine vision
CN113762382B (en) Model training and scene recognition method, device, equipment and medium
CN117892301B (en) Classification method, device, equipment and medium for few-sample malicious software

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201215)