CN112529179A - Genetic algorithm-based confrontation training method and device and computer storage medium - Google Patents

Genetic algorithm-based confrontation training method and device and computer storage medium

Info

Publication number
CN112529179A
CN112529179A
Authority
CN
China
Prior art keywords
sample
model
training
confrontation
genetic algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011462377.0A
Other languages
Chinese (zh)
Inventor
周颖
张宾
张伟哲
束建钢
杨孙傲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peng Cheng Laboratory
Original Assignee
Peng Cheng Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peng Cheng Laboratory filed Critical Peng Cheng Laboratory
Priority to CN202011462377.0A priority Critical patent/CN112529179A/en
Publication of CN112529179A publication Critical patent/CN112529179A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/086 Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physiology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)

Abstract

The invention discloses a genetic algorithm-based adversarial training method and device and a computer storage medium, wherein the method comprises the following steps: determining boundary samples of the model from the output confidence of the training samples; performing an expansion search on the boundary samples with a genetic algorithm to generate adversarial samples; screening the adversarial samples by the smoothness of the model boundary on which they lie to determine an adversarial sample set; and retraining the model on the adversarial sample set. The method addresses the defects of models produced by existing adversarial training methods for artificial intelligence models and their excessive consumption of training resources.

Description

Genetic algorithm-based confrontation training method and device and computer storage medium
Technical Field
The invention relates to the field of computer technology, and in particular to a genetic algorithm-based adversarial training (confrontation training) method and device and a computer storage medium.
Background
The rapid development of artificial intelligence has brought great convenience to daily life. However, artificial intelligence models are difficult to interpret, their decision boundaries are fuzzy, and their attack surface is wide; adversarial samples generated by different attack algorithms attack them effectively, so the security of artificial intelligence systems is low.
Many defense algorithms against adversarial attacks exist at present, and adversarial training, in which the training data set is augmented with adversarial samples, is the most widely applied. However, existing adversarial training merely augments the training data set with one or several adversarial attack algorithms and does not analyze or match the characteristics of the artificial intelligence model against those of the attack algorithms, so the stability of the local boundary of the model after adversarial training cannot be guaranteed and the model remains vulnerable to other forms of adversarial attack.
The existing methods mainly have the following defects:
First, adversarial training performs data augmentation with an adversarial sample generation algorithm and then retrains the artificial intelligence model, but the generated adversarial samples neither match the characteristics of the model well nor fully characterize its boundary, which reduces the precision of the adversarial training.
Second, adversarial training does not effectively distinguish the relative positions of adversarial samples and the model boundary, so different adversarial samples do not complement one another; overlapping adversarial samples introduce noise and waste training resources.
The models produced by existing adversarial training methods for artificial intelligence models therefore remain flawed, and the training consumes excessive resources.
Disclosure of Invention
The main object of the invention is to provide a genetic algorithm-based adversarial training method and device and a computer storage medium, so as to overcome the defects of models produced by existing adversarial training methods for artificial intelligence models and their excessive consumption of training resources.
To achieve the above object, the invention provides a genetic algorithm-based adversarial training method comprising the following steps:
determining boundary samples of the model from the output confidence of the training samples;
performing an expansion search on the boundary samples with a genetic algorithm to generate adversarial samples;
screening the adversarial samples by the smoothness of the model boundary on which they lie, and determining an adversarial sample set;
retraining the model on the adversarial sample set.
In an embodiment, determining the boundary samples of the model from the output confidence of the training samples comprises:
training the model using the training data;
calculating the confidence of the output of each training sample after it passes through the model;
and, when the confidence is smaller than a preset confidence threshold, taking the training sample corresponding to that confidence as a boundary sample.
In an embodiment, performing the expansion search on the boundary samples with the genetic algorithm to generate adversarial samples comprises:
using the boundary samples as seed samples, and encoding the sample space with a preset encoding rule;
randomly selecting points near the seed sample to initialize a population;
calculating the individual fitness in the population with a fitness function;
selecting samples according to their individual fitness, performing crossover and mutation operations to generate new solutions, and adding the new solutions to the population to form a new population;
calculating the individual fitness in the new population, performing iterative optimization according to the individual fitness, and setting a maximum number of iterations;
and, when the individual fitness is larger than a preset fitness threshold or the maximum number of iterations is reached, terminating the iteration and taking the new population as adversarial samples.
In one embodiment, the fitness function calculation formula is:
fitness(X_j) = ||X_j - X||_2 - τ * ||f(X_j) - f(X)||_2
where X is the seed sample, f(X) is the model mapping, X_j is the j-th sample found by the genetic algorithm search from the seed sample, fitness(X_j) is the corresponding fitness value, and τ is the weight of the model output difference.
In an embodiment, screening the adversarial samples by the smoothness of the model boundary on which they lie to determine the adversarial sample set comprises:
calculating the smoothness of the model boundary on which each adversarial sample lies with a smoothness formula;
when the smoothness is greater than a preset smoothness threshold, retaining the adversarial sample corresponding to that smoothness;
and merging the retained adversarial samples into an adversarial sample set.
In one embodiment, the smoothness calculation formula is:
S(X_ij) = (f(X_ij) - f(X_i)) / (X_ij - X_i)
where X_i is the seed sample, X_ij is the corresponding population sample, f(X_i) and f(X_ij) are the model mappings, and S(X_ij) is the corresponding model boundary smoothness.
In an embodiment, retraining the model on the adversarial sample set comprises:
when the number of retained adversarial samples is greater than or equal to a preset number, merging the adversarial sample set with the original training samples and training base models;
performing ensemble adversarial training on the model with a preset algorithm to form a final defense model;
and generating the final decision of the final defense model with a preset strategy.
In one embodiment, the method further comprises:
and, when the number of retained adversarial samples is less than the preset number, training the model with an ordinary model training method.
To achieve the above object, the invention further provides a genetic algorithm-based adversarial training device, which comprises a memory, a processor, and a genetic algorithm-based adversarial training program stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the genetic algorithm-based adversarial training method described above.
To achieve the above object, the invention also provides a computer-readable storage medium storing a genetic algorithm-based adversarial training program which, when executed by a processor, implements the steps of the genetic algorithm-based adversarial training method described above.
According to the genetic algorithm-based adversarial training method and device and the computer storage medium, boundary samples of the model are determined from the output confidence of the training samples; the boundary samples are expanded by a genetic algorithm search to generate adversarial samples, so the genetic algorithm improves the way adversarial samples are generated; the adversarial samples are screened by the smoothness of the model boundary on which they lie to determine an adversarial sample set, adding a screening step that selects the adversarial samples lying on the model boundary; and the model is retrained on the determined adversarial sample set. This overcomes the defects of models produced by existing adversarial training methods for artificial intelligence models and their excessive consumption of training resources.
Drawings
FIG. 1 is a schematic diagram of an apparatus according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of the genetic algorithm-based adversarial training method of the present invention;
FIG. 3 is a flowchart illustrating a detailed process of step S110 according to a first embodiment of the present invention;
FIG. 4 is a flowchart illustrating the step S120 in the first embodiment of the present invention;
FIG. 5 is a flowchart illustrating the step S130 according to the first embodiment of the present invention;
FIG. 6 is a flowchart illustrating the step S140 in the first embodiment of the present invention;
FIG. 7 is a schematic diagram of the algorithm of the present invention.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The main solution of the embodiments of the invention is as follows: boundary samples of the model are determined from the output confidence of the training samples; the boundary samples are expanded by a genetic algorithm search to generate adversarial samples, so the genetic algorithm improves the way adversarial samples are generated; the adversarial samples are screened by the smoothness of the model boundary on which they lie to determine an adversarial sample set, adding a screening step that selects the adversarial samples lying on the model boundary; and the model is retrained on the determined adversarial sample set. This overcomes the defects of models produced by existing adversarial training methods for artificial intelligence models and their excessive consumption of training resources.
As one implementation, reference may be made to FIG. 1, which is a schematic structural diagram of an apparatus according to an embodiment of the invention.
The processor 1100 may be an integrated circuit chip having signal processing capability. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 1100. The processor 1100 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM or registers. The storage medium is located in the memory 1200; the processor 1100 reads the information in the memory 1200 and performs the steps of the above method in combination with its hardware.
It will be appreciated that the memory 1200 in embodiments of the invention may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 1200 of the systems and methods described in the embodiments of the invention is intended to comprise, without being limited to, these and any other suitable types of memory.
For a software implementation, the techniques described in this disclosure may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described in this disclosure. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Based on the above structure, an embodiment of the present invention is proposed.
Referring to FIG. 2, FIG. 2 shows a first embodiment of the genetic algorithm-based adversarial training method of the invention, which includes the following steps:
Step S110: determining boundary samples of the model from the output confidence of the training samples.
In this embodiment, adversarial samples, first described by Christian Szegedy et al., are input samples formed by intentionally adding slight perturbations to data in the data set so that the model gives a wrong output with high confidence; one of the main causes of adversarial samples is the excessive linearity of the model. Adversarial training mixes the original training data with adversarial samples to form a new training data set and then trains the model on it, so as to strengthen the model's ability to recognize adversarial samples.
The training samples are the original training data. In statistics, the confidence interval of a probability sample is an interval estimate of some population parameter of that sample. The confidence interval expresses the degree to which the true value of the parameter has a certain probability of falling around the measured value, i.e., the plausibility of the measured value of the parameter; this probability is called the confidence level, or simply the confidence.
The model is an artificial intelligence model. With the development of neural network algorithms in recent years, artificial intelligence plays an ever larger role in daily life; in particular, the rapidly developing field of deep learning has achieved good results in natural language understanding, computer vision, intelligent robotics, automatic programming and the like. Artificial Neural Networks (ANNs) generally refer to mathematical models in machine learning that imitate the human nervous system; an artificial neural network usually consists of a network structure, activation functions and learning rules. The boundary samples are the training samples whose output confidence is smaller than a preset confidence threshold: the lower the confidence, the closer the training sample is to the model boundary.
Step S120: performing an expansion search on the boundary samples with a genetic algorithm to generate adversarial samples.
In this embodiment, note that in nature organisms reproduce continuously, and during reproduction their genes recombine and mutate, so that new traits keep appearing and the organisms adapt to different external environments; this process is called evolution. Evolutionary algorithms abstract biological evolution mathematically: the complex process of population evolution is represented by a mathematical encoding, heuristic search over a complex space is realized through the genetic process, and an optimal solution is very likely to be found in the global space. Crossover, recombination and mutation of individuals of different populations are independent, so multiple populations can be processed in parallel; computation can also be parallelized across individuals, which gives genetic algorithms good parallel processing capability.
At the start, a genetic algorithm randomly initializes a number of individuals according to the designed individual encoding, and these individuals form one or more populations. The algorithm then evaluates the population fitness, screens out certain individuals according to the calculated fitness, places them in a mating pool, and recombines and mutates the individuals in the mating pool with certain probabilities to obtain new offspring. At this point the population consists of two parts, the individuals of the previous generation and the newborn individuals produced by mating, and the algorithm selects individuals from both parts as the new generation of the population.
The boundary samples are expanded by searching the sample space with a genetic algorithm, where the sample space is the multidimensional space in which the training data lie. The search proceeds along the boundary of the artificial intelligence model until the difference between the searched samples and the boundary sample becomes large. In this application the Euclidean distance is used to measure the difference between samples, and the difference threshold can be set in advance.
Step S130: screening the adversarial samples by the smoothness of the model boundary on which they lie, and determining an adversarial sample set.
In this embodiment, the smoothness of the model boundary, i.e., the gradient of the model boundary, is used to screen the adversarial samples: adversarial samples whose boundary smoothness is greater than a preset smoothness threshold are retained, and those whose smoothness is less than or equal to the threshold are removed, so as to determine the adversarial sample set.
Step S140: retraining the model on the adversarial sample set.
In this embodiment, the model is retrained on the determined adversarial sample set.
In the technical scheme provided by this embodiment, boundary samples of the model are determined from the output confidence of the training samples; the boundary samples are expanded by a genetic algorithm search to generate adversarial samples, so the genetic algorithm improves the way adversarial samples are generated; the adversarial samples are screened by the smoothness of the model boundary on which they lie to determine an adversarial sample set, adding a screening step that selects the adversarial samples lying on the model boundary; and the model is retrained on the determined adversarial sample set. This overcomes the defects of models produced by existing adversarial training methods for artificial intelligence models and their excessive consumption of training resources.
Referring to FIG. 3, FIG. 3 shows the detailed sub-steps of step S110 in the first embodiment of the invention. Determining the boundary samples of the model from the output confidence of the training samples includes:
Step S111: training the model using the training data.
In this embodiment, the artificial intelligence model is trained on the training data, i.e., the original training samples. For example, a model M is trained on the training data.
Step S112: calculating the confidence of the output of each training sample after it passes through the model.
In this embodiment, the confidence of the output of each training sample passed through the trained model is calculated. For example, let the model mapping be f(X); the output confidence f(X_i) is computed for each training sample X_i.
Step S113: when the confidence is smaller than a preset confidence threshold, taking the training sample corresponding to that confidence as a boundary sample.
In this embodiment, the preset confidence threshold is denoted λ1. When the confidence is smaller than λ1, the corresponding training sample is taken as a boundary sample; the lower the confidence, the closer the sample is to the model boundary. For example, if f(X_i) < λ1, then X_i is a boundary sample.
In the technical scheme provided by this embodiment, the model is trained on the training data, the confidence of each training sample's output through the model is calculated, and when that confidence is smaller than the preset confidence threshold the corresponding training sample is taken as a boundary sample. Selecting boundary samples according to the model's output confidence on the training data characterizes the boundary of the model.
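As a minimal illustrative sketch (not part of the patent) of this sub-step, the Python code below selects boundary samples with a confidence threshold λ1; the helper name select_boundary_samples, the vectorized model_predict interface and the default value of λ1 are assumptions.
```python
import numpy as np

def select_boundary_samples(model_predict, X_train, lambda_1=0.7):
    """Steps S111-S113 sketch: keep training samples whose top-class
    confidence f(X_i) falls below the preset threshold lambda_1."""
    probs = model_predict(X_train)      # class-probability vectors, shape (n, classes)
    confidence = probs.max(axis=1)      # output confidence per training sample
    mask = confidence < lambda_1        # lower confidence => closer to the model boundary
    return X_train[mask]
```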
Referring to FIG. 4, FIG. 4 shows the detailed sub-steps of step S120 in the first embodiment of the invention. Performing the expansion search on the boundary samples with the genetic algorithm to generate adversarial samples includes:
Step S121: using the boundary samples as seed samples, and encoding the sample space with a preset encoding rule.
In this embodiment, the parameter encoding of the genetic algorithm, the setup of the initial population and the design of the genetic operators differ according to the characteristics of the samples and can be adjusted to the actual situation. Since the problem to be optimized is represented mathematically, it must be encoded, mapping the solution space of the problem to an encoding space. The encoding design of a genetic algorithm strongly influences its search for the global optimum, and different encodings lead to different global optima. Common encoding methods mainly include: 1. binary encoding; 2. Gray code; 3. floating-point encoding; and so on, which are not elaborated here.
The boundary sample is used as a seed sample, and the sample space is encoded with the preset encoding rule. For example, if the encoding rule for the sample space is D(X), the encoding of a seed sample X_i is D(X_i).
Step S122: randomly selecting points near the seed sample to initialize a population.
In this embodiment, biological evolution proceeds in the form of populations, and such a group is called a population; in the algorithm, a set of feasible solutions constitutes a population. Points near the seed sample are randomly selected to initialize the population.
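The encoding of step S121 and the initialization of step S122 can be sketched as follows; floating-point encoding and Gaussian perturbations around the seed are illustrative choices, since the patent leaves the encoding rule D(X) and the initialization distribution open, and the function names are assumptions.
```python
import numpy as np

def encode(x, lo=0.0, hi=1.0):
    """One possible encoding rule D(X): flatten the sample and scale each feature to [0, 1]."""
    return (np.asarray(x, dtype=np.float64).ravel() - lo) / (hi - lo)

def decode(gene, shape, lo=0.0, hi=1.0):
    """Map an individual from the encoding space back to the sample space."""
    return (np.asarray(gene) * (hi - lo) + lo).reshape(shape)

def init_population(x_seed, pop_size=20, sigma=0.05, rng=None):
    """Step S122: randomly pick points near the seed sample to initialize the population."""
    rng = rng or np.random.default_rng(0)
    gene = encode(x_seed)
    return gene + sigma * rng.standard_normal((pop_size, gene.size))
```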
Step S123: calculating the individual fitness in the population with the fitness function.
In this embodiment, fitness expresses an individual's ability to adapt to its environment in a biological population. The fitness calculation is important for a genetic algorithm: it determines the direction of the search and the final result. Fitness values are generally real numbers and are usually mapped to non-negative reals to obtain a good optimization effect. When a genetic algorithm is used for optimization, the objective function to be optimized is generally used as the population fitness measure; usually, the smaller the objective value, the better the individual is considered, although for some objective functions the relation between objective value and fitness is reversed.
The fitness function is calculated as:
fitness(X_j) = ||X_j - X||_2 - τ * ||f(X_j) - f(X)||_2
where X is the seed sample, f(X) is the model mapping, X_j is the j-th sample found by the genetic algorithm search from the seed sample, fitness(X_j) is the corresponding fitness value, and τ is the weight of the model output difference.
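The fitness function can be written directly in code; the vectorized model_predict interface and the default value of τ are assumptions.
```python
import numpy as np

def fitness(x_j, x_seed, model_predict, tau=1.0):
    """fitness(X_j) = ||X_j - X||_2 - tau * ||f(X_j) - f(X)||_2.
    Lower values mean X_j is still close to the seed in input space while the
    model output already differs, i.e. the individual hugs the model boundary."""
    input_dist = np.linalg.norm(np.ravel(x_j) - np.ravel(x_seed))
    output_dist = np.linalg.norm(model_predict(np.asarray(x_j)[None])[0]
                                 - model_predict(np.asarray(x_seed)[None])[0])
    return input_dist - tau * output_dist
```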
Step S124: selecting samples according to their individual fitness, performing crossover and mutation operations on them to generate new solutions, and adding the new solutions to the population to form a new population.
In this embodiment, the smaller an individual's fitness, the closer it lies to the model boundary. Samples are therefore selected according to their individual fitness for crossover and mutation to generate new solutions, and the new solutions are added to the population to form a new population.
Step S125: calculating the individual fitness in the new population, performing iterative optimization according to the individual fitness, and setting a maximum number of iterations.
In this embodiment, the individual fitness in the new population is calculated with the fitness function and iterative optimization is performed according to it, i.e., samples with relatively low fitness are selected to enter the next generation. The maximum number of iterations is set according to the specific situation and is not restricted here.
Step S126: when the individual fitness is larger than a preset fitness threshold or the maximum number of iterations is reached, terminating the iteration and taking the new population as adversarial samples.
In this embodiment, the iteration terminates when the individual fitness exceeds the preset fitness threshold, i.e., when the searched samples differ greatly from the seed sample, or when the maximum number of iterations is reached; the individuals of the last generation, i.e., the latest population, are taken as the adversarial samples.
In the technical scheme provided by this embodiment, the boundary samples, i.e., the seed samples, are iteratively optimized by a genetic algorithm to generate a new population as the adversarial samples, which improves the way adversarial samples are generated. Starting from a seed sample, the search proceeds along the model boundary, effective samples are found by the genetic algorithm, and the model boundary is characterized further. The model boundary information is converted into heuristic information in the genetic algorithm so as to generate effective adversarial samples while limiting their distance from the seed sample.
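Putting steps S121 to S126 together, one possible expansion search is sketched below. It reuses the fitness, encode, decode and init_population sketches above; the selection scheme (keep the lower-fitness half), single-point crossover, Gaussian mutation, the reading of the termination test as "every individual exceeds the threshold", and all numeric defaults are assumptions, since the patent fixes only the fitness function, a fitness threshold and a maximum iteration count.
```python
import numpy as np

def ga_search(x_seed, model_predict, pop_size=20, sigma=0.05, tau=1.0,
              fitness_threshold=2.0, max_iter=50, rng=None):
    """Steps S121-S126 sketch: expand a boundary (seed) sample into candidate
    adversarial samples by searching along the model boundary."""
    rng = rng or np.random.default_rng(0)
    shape = np.asarray(x_seed).shape
    pop = init_population(x_seed, pop_size, sigma, rng)           # step S122

    for _ in range(max_iter):                                     # step S125: iterate
        samples = [decode(g, shape) for g in pop]
        fit = np.array([fitness(s, x_seed, model_predict, tau) for s in samples])

        if fit.min() > fitness_threshold:                         # step S126: every individual
            break                                                 # has drifted far enough

        parents = pop[np.argsort(fit)[: pop_size // 2]]           # step S124: lower fitness
        children = []                                             # lies closer to the boundary
        while len(parents) + len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, a.size)                         # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child += sigma * rng.standard_normal(child.shape)     # Gaussian mutation
            children.append(child)
        pop = np.vstack([parents, children])                      # new population

    return np.stack([decode(g, shape) for g in pop])              # candidate adversarial samples
```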
Referring to FIG. 5, FIG. 5 shows the detailed sub-steps of step S130 in the first embodiment of the invention. Screening the adversarial samples by the smoothness of the model boundary on which they lie to determine the adversarial sample set includes:
Step S131: calculating the smoothness of the model boundary on which each adversarial sample lies with the smoothness formula.
In this embodiment, the smoothness is calculated as:
S(X_ij) = (f(X_ij) - f(X_i)) / (X_ij - X_i)
where X_i is the seed sample, X_ij is the corresponding population sample, f(X_i) and f(X_ij) are the model mappings, and S(X_ij) is the corresponding model boundary smoothness.
The smoothness of the model boundary on which each adversarial sample lies is calculated with this formula.
Step S132: when the smoothness is greater than a preset smoothness threshold, retaining the adversarial sample corresponding to that smoothness.
In this embodiment, the preset smoothness threshold is denoted λ3. When the smoothness is greater than the threshold, the corresponding adversarial sample is retained; for example, if S(X_ij) > λ3, then X_ij is retained.
Step S133: merging the retained adversarial samples into an adversarial sample set.
In this embodiment, the retained adversarial samples are merged into the adversarial sample set.
In the technical scheme provided by this embodiment, the adversarial samples are screened by checking whether the smoothness of the model boundary on which they lie exceeds the preset smoothness threshold. The intensity of change of the model boundary is described quantitatively by the sectional gradient between a seed sample and the adversarial samples generated from it; retaining the adversarial samples on the boundary and retraining the model on them improves the generalization ability of the model.
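A sketch of the screening in steps S131 to S133 follows; reading the quotient in S(X_ij) as a ratio of L2 norms (a finite-difference estimate of the boundary gradient) is an interpretation, and the function names, the default λ3 and the small eps guard are assumptions.
```python
import numpy as np

def boundary_smoothness(x_ij, x_i, model_predict, eps=1e-12):
    """S(X_ij) = (f(X_ij) - f(X_i)) / (X_ij - X_i), computed here as a ratio of
    L2 norms, i.e. a finite-difference estimate of the boundary gradient."""
    num = np.linalg.norm(model_predict(np.asarray(x_ij)[None])[0]
                         - model_predict(np.asarray(x_i)[None])[0])
    den = np.linalg.norm(np.ravel(x_ij) - np.ravel(x_i)) + eps
    return num / den

def screen_adversarial_samples(candidates, x_seed, model_predict, lambda_3=0.5):
    """Steps S132-S133: retain candidates whose boundary smoothness exceeds lambda_3
    and merge them into the adversarial sample set."""
    kept = [x for x in candidates
            if boundary_smoothness(x, x_seed, model_predict) > lambda_3]
    return np.stack(kept) if kept else np.empty((0,) + np.asarray(x_seed).shape)
```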
Referring to FIG. 6, FIG. 6 shows the detailed sub-steps of step S140 in the first embodiment of the invention. Retraining the model on the adversarial sample set includes:
Step S141: when the number of retained adversarial samples is greater than or equal to a preset number, merging the adversarial sample set with the original training samples and training base models.
In this embodiment, when the number of retained adversarial samples is greater than or equal to the preset number (which can be set according to the specific application and is not restricted here), the adversarial sample set is merged with the original training samples, i.e., the original training data, and base models are trained.
Step S142: performing ensemble adversarial training on the model with a preset algorithm to form a final defense model.
In this embodiment, a preset algorithm, preferably a bagging algorithm, is used to perform ensemble adversarial training on the model to form the final defense model.
Step S143: generating the final decision of the final defense model with a preset strategy.
In this embodiment, a preset strategy, such as a voting strategy, is used to generate the final decision of the final defense model by average voting. The model is thereby encouraged to remain stable under any small change along the manifold of same-class samples, which shrinks the attack surface, lowers the success rate of adversarial attacks and improves the generalization ability of the model.
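The retraining of steps S141 to S143 (including the ordinary-training branch described below) can be sketched as simple bagging followed by average voting; the train_model callable, the predict_proba interface, the choice of labels for the adversarial samples and all numeric defaults are assumptions rather than requirements of the patent.
```python
import numpy as np

def retrain_with_adversarial_set(train_model, X_train, y_train, X_adv, y_adv,
                                 n_members=5, min_adv=10, rng=None):
    """Steps S141-S142 sketch: merge the adversarial set with the original
    training data and train an ensemble of base models by bagging."""
    rng = rng or np.random.default_rng(0)
    if len(X_adv) < min_adv:                       # fewer retained samples than the
        return [train_model(X_train, y_train)]     # preset number: ordinary training

    X_all = np.concatenate([X_train, X_adv], axis=0)
    y_all = np.concatenate([y_train, y_adv], axis=0)

    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X_all), size=len(X_all))   # bootstrap resample
        members.append(train_model(X_all[idx], y_all[idx]))
    return members

def ensemble_decision(members, X):
    """Step S143 sketch: final decision of the defense model by average voting."""
    probs = np.mean([m.predict_proba(X) for m in members], axis=0)
    return probs.argmax(axis=1)
```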
In the above embodiment, the method further includes:
and step S141b, when the number of the reserved confrontation samples is less than the preset number, training the model by adopting a common model training method.
In this embodiment, when the number of the reserved countermeasure samples is less than the preset number, the model is trained using a common model training method. For example, the model is trained using the original model training method.
Referring to FIG. 7, FIG. 7 is a schematic flow chart of the algorithm of the present invention; after the model boundary is confirmed according to the model characteristics, confrontation samples near the boundary sample are searched, samples corresponding to the steep boundary are selected, the samples are added into a training set, then the model is retrained, and a final defense model is obtained.
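For orientation, the following driver wires the sketches above into the overall flow of FIG. 7; labeling each retained adversarial sample with the class its seed sample is predicted to belong to is an assumption.
```python
import numpy as np

def adversarial_training_pipeline(model_predict, train_model, X_train, y_train,
                                  lambda_1=0.7, lambda_3=0.5, min_adv=10):
    """End-to-end sketch of FIG. 7: boundary samples -> genetic expansion search ->
    smoothness screening -> ensemble retraining."""
    X_adv, y_adv = [], []
    for x_seed in select_boundary_samples(model_predict, X_train, lambda_1):
        candidates = ga_search(x_seed, model_predict)
        kept = screen_adversarial_samples(candidates, x_seed, model_predict, lambda_3)
        seed_label = int(model_predict(np.asarray(x_seed)[None])[0].argmax())
        X_adv.extend(kept)
        y_adv.extend([seed_label] * len(kept))

    X_adv = np.array(X_adv) if X_adv else np.empty((0,) + X_train.shape[1:])
    return retrain_with_adversarial_set(train_model, X_train, y_train,
                                        X_adv, np.array(y_adv), min_adv=min_adv)
```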
The invention further provides a genetic algorithm-based adversarial training device, which comprises a memory, a processor, and a genetic algorithm-based adversarial training program stored in the memory and executable on the processor; the program, when executed by the processor, implements the steps of the genetic algorithm-based adversarial training method described above.
The invention also provides a computer-readable storage medium storing a genetic algorithm-based adversarial training program which, when executed by a processor, implements the steps of the genetic algorithm-based adversarial training method described above.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not indicate any ordering; these words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A genetic algorithm-based adversarial training method, characterized by comprising the following steps:
determining boundary samples of the model from the output confidence of the training samples;
performing an expansion search on the boundary samples with a genetic algorithm to generate adversarial samples;
screening the adversarial samples by the smoothness of the model boundary on which they lie, and determining an adversarial sample set;
retraining the model on the adversarial sample set.
2. The genetic algorithm-based adversarial training method of claim 1, wherein determining the boundary samples of the model from the output confidence of the training samples comprises:
training the model using the training data;
calculating the confidence of the output of each training sample after it passes through the model;
and, when the confidence is smaller than a preset confidence threshold, taking the training sample corresponding to that confidence as a boundary sample.
3. The genetic algorithm-based adversarial training method of claim 2, wherein performing the expansion search on the boundary samples with the genetic algorithm to generate adversarial samples comprises:
using the boundary samples as seed samples, and encoding the sample space with a preset encoding rule;
randomly selecting points near the seed sample to initialize a population;
calculating the individual fitness in the population with a fitness function;
selecting samples according to their individual fitness, performing crossover and mutation operations to generate new solutions, and adding the new solutions to the population to form a new population;
calculating the individual fitness in the new population, performing iterative optimization according to the individual fitness, and setting a maximum number of iterations;
and, when the individual fitness is larger than a preset fitness threshold or the maximum number of iterations is reached, terminating the iteration and taking the new population as adversarial samples.
4. The genetic algorithm-based adversarial training method of claim 3, wherein the fitness function is calculated as:
fitness(X_j) = ||X_j - X||_2 - τ * ||f(X_j) - f(X)||_2
where X is the seed sample, f(X) is the model mapping, X_j is the j-th sample found by the genetic algorithm search from the seed sample, fitness(X_j) is the corresponding fitness value, and τ is the weight of the model output difference.
5. The genetic algorithm-based adversarial training method of claim 4, wherein screening the adversarial samples by the smoothness of the model boundary on which they lie to determine the adversarial sample set comprises:
calculating the smoothness of the model boundary on which each adversarial sample lies with a smoothness formula;
when the smoothness is greater than a preset smoothness threshold, retaining the adversarial sample corresponding to that smoothness;
and merging the retained adversarial samples into an adversarial sample set.
6. The genetic algorithm-based adversarial training method of claim 5, wherein the smoothness is calculated as:
S(X_ij) = (f(X_ij) - f(X_i)) / (X_ij - X_i)
where X_i is the seed sample, X_ij is the corresponding population sample, f(X_i) and f(X_ij) are the model mappings, and S(X_ij) is the corresponding model boundary smoothness.
7. The genetic algorithm-based adversarial training method of claim 6, wherein retraining the model on the adversarial sample set comprises:
when the number of retained adversarial samples is greater than or equal to a preset number, merging the adversarial sample set with the original training samples and training base models;
performing ensemble adversarial training on the model with a preset algorithm to form a final defense model;
and generating the final decision of the final defense model with a preset strategy.
8. The genetic algorithm-based adversarial training method of claim 7, further comprising:
when the number of retained adversarial samples is less than the preset number, training the model with an ordinary model training method.
9. A genetic algorithm-based adversarial training device, characterized in that the device comprises a memory, a processor, and a genetic algorithm-based adversarial training program stored in the memory and executable on the processor, wherein the genetic algorithm-based adversarial training program, when executed by the processor, implements the steps of the genetic algorithm-based adversarial training method according to any one of claims 1 to 8.
10. A computer-readable storage medium, characterized in that it stores a genetic algorithm-based adversarial training program which, when executed by a processor, implements the steps of the genetic algorithm-based adversarial training method according to any one of claims 1 to 8.
CN202011462377.0A 2020-12-10 2020-12-10 Genetic algorithm-based confrontation training method and device and computer storage medium Pending CN112529179A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011462377.0A CN112529179A (en) 2020-12-10 2020-12-10 Genetic algorithm-based confrontation training method and device and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011462377.0A CN112529179A (en) 2020-12-10 2020-12-10 Genetic algorithm-based confrontation training method and device and computer storage medium

Publications (1)

Publication Number Publication Date
CN112529179A true CN112529179A (en) 2021-03-19

Family

ID=74999274

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011462377.0A Pending CN112529179A (en) 2020-12-10 2020-12-10 Genetic algorithm-based confrontation training method and device and computer storage medium

Country Status (1)

Country Link
CN (1) CN112529179A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734039A (en) * 2021-03-31 2021-04-30 杭州海康威视数字技术股份有限公司 Virtual confrontation training method, device and equipment for deep neural network



Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination