CN114419379A - System and method for improving fairness of deep learning model based on adversarial perturbation - Google Patents
- Publication number
- CN114419379A (Application CN202210320949.4A)
- Authority
- CN
- China
- Prior art keywords
- perturbation
- image
- discriminator
- fairness
- generator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a system and a method for improving the fairness of a deep learning model based on adversarial perturbation. The system comprises a deployment model, a perturbation generator, and a discriminator; the deployment model comprises a feature extractor and a label predictor, and the perturbation generator is connected to the feature extractor. The invention processes the input data of the deployment model without changing the deep learning model itself. Model fairness is improved using adversarial perturbation: a perturbation generator and a discriminator are designed, the discriminator captures fairness-related sensitive-attribute information and guides the training and optimization of the perturbation generator, and the generated adversarial perturbation hides the sensitive-attribute information of the data while retaining the information relevant to the target task. This prevents the model from extracting sensitive information from the input data during feature extraction, thereby improving the fairness of its predictions.
Description
Technical Field
The invention relates to the field of trustworthy Artificial Intelligence (AI), and in particular to a system and a method for improving the fairness of a deep learning model based on adversarial perturbation.
Background
In recent years, deep neural networks have exhibited excellent performance in fields such as image processing, natural language processing, and speech recognition. Although the spread of artificial intelligence technology has transformed many fields and brought convenience to human life, research has found that some existing artificial intelligence systems carry ethical risks: they encode bias and discrimination against specific groups and may place already disadvantaged groups in an even more unfavorable position. Mitigating the bias of deep learning models and improving the fairness of their decisions is therefore an important precondition for the trustworthy application of artificial intelligence systems. A deep learning model usually learns from data; if the data of different groups are unevenly distributed, a spurious statistical association arises between the target task label and the sensitive-attribute label. The model then learns this spurious association, ties its prediction of the target label to the sensitive attribute, and thus becomes biased against specific groups. Existing techniques for improving the fairness of deep learning models essentially require modifying the deployed model to prevent it from learning the spurious association, which greatly limits the practical applicability of such fairness mechanisms.
Disclosure of Invention
To address the drawback that the prior art requires modifying the deployed deep learning model, the invention provides a system and a method for improving the fairness of a deep learning model based on adversarial perturbation, which improve fairness without changing the deep learning model.
To achieve this purpose, the invention adopts the following technical scheme:
the invention discloses a depth learning model fairness promotion system based on antagonistic disturbance, which comprises a deployment model, a disturbance generator and a discriminator, wherein the deployment model comprises a feature extractor and a label predictor, the disturbance generator is connected with the feature extractor, the feature extractor is respectively connected with the label predictor and the discriminator, an image is input by the feature extractor, the image is subjected to a hidden space representation by the feature extractor, the hidden space representation is output as a prediction result of a target label after being input into the label predictor, and the hidden space representation is output as a prediction result of image sensitive attribute after being input into the discriminator.
As a further improvement, the input of the perturbation generator is an image and its output is an adversarial perturbation; the perturbation is added to the input image, and the sum is input to the feature extractor.
The invention also discloses a method for improving the fairness of a deep learning model based on adversarial perturbation, comprising the following steps:
1) adding an adversarial perturbation to the image with a perturbation generator, inputting the perturbed image to the feature extractor of a deployment model, the feature extractor outputting the latent representation of the image, and inputting the latent representation to a label predictor to obtain the prediction of the target label;
2) measuring the sensitive-attribute information contained in the perturbed image: inputting the latent representation to a discriminator to obtain a prediction of the image's sensitive attribute, training the discriminator to predict the sensitive attribute from the latent representation, and updating the discriminator;
3) updating the perturbation generator to generate better adversarial perturbations that fool the discriminator, so that the latent representation of the perturbed image contains as little sensitive-attribute information as possible while the prediction of the target label predictor remains as accurate as possible;
4) repeating steps 2) and 3) until the generator can reliably fool the discriminator and the target label predictor retains high accuracy; the perturbation generator at this point is integrated into the data preprocessing stage of the deployment model as a fairness-improvement module, adding adversarial perturbations to input images to improve fairness.
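The alternating schedule of steps 2) and 3) can be sketched with a deliberately tiny numpy toy: a 1-D "image" that leaks a binary sensitive attribute, a linear generator, and a logistic-regression discriminator. Everything here (the 1-D setup, the linear models, the learning rate) is an illustrative assumption, not the patent's networks; the accuracy-retention term of step 3) is omitted to keep the sketch short.

```python
import numpy as np

# 1-D "image" x that leaks a binary sensitive attribute z.
rng = np.random.default_rng(0)
n = 400
z = rng.integers(0, 2, n)                           # sensitive attribute
x = z.astype(float) + 0.3 * rng.standard_normal(n)  # input correlated with z

sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
w, b = 1.0, 0.0        # discriminator (logistic regression) parameters
a, c = 0.0, 0.0        # generator g(x) = a*x + c parameters
eps, lr = 1.0, 0.5     # l_inf perturbation budget and learning rate

for step in range(300):
    delta = np.clip(a * x + c, -eps, eps)   # l_inf-bounded perturbation
    xp = x + delta                          # perturbed input
    # --- step 2): update discriminator to predict z (minimize CE) ---
    p = sigmoid(w * xp + b)
    w -= lr * np.mean((p - z) * xp)
    b -= lr * np.mean(p - z)
    # --- step 3): update generator to fool discriminator (maximize CE) ---
    p = sigmoid(w * xp + b)
    mask = (np.abs(a * x + c) < eps).astype(float)  # gradient through clip
    a += lr * np.mean((p - z) * w * x * mask)
    c += lr * np.mean((p - z) * w * mask)

final_delta = np.clip(a * x + c, -eps, eps)  # stays inside the budget
```

In the full method the generator loss additionally includes the target-label cross-entropy, so the perturbation hides the sensitive attribute without destroying task-relevant information.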
As a further improvement, the deployment model is expressed as $M = C \circ E$, where $E$ is the feature extractor and $C$ is the target label predictor; the input image is $x$, the sensitive attribute is $z$, and the target label is $y$.
As a further improvement, in step 1) a perturbation generator $G$ adds an adversarial perturbation to the image $x$; the perturbed image is $\hat{x} = x + G(x)$, and the perturbation satisfies the $\ell_\infty$-norm constraint $\|G(x)\|_\infty \le \epsilon$. The perturbed image $\hat{x}$ is input to the deployment model; the feature extractor $E$ of the deployment model outputs the latent representation $\hat{h} = E(\hat{x})$, and inputting the latent representation to the label predictor yields the prediction of the target label $\hat{y} = C(\hat{h})$.
As a further improvement, in step 2) the discriminator $D$ is updated so that it accurately captures the sensitive-attribute information in the latent representation. The loss function of $D$ is:

$L_D = \mathcal{L}_{CE}(D(\hat{h}), z)$

where $\mathcal{L}_{CE}$ denotes the cross-entropy, $\hat{h}$ is the latent representation of the perturbed data, $D(\hat{h})$ is the discriminator's output for the sensitive attribute, and $z$ is the true sensitive attribute.
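The discriminator loss above is an ordinary cross-entropy between the discriminator's predicted sensitive-attribute distribution and the true attribute. A minimal numpy sketch (the array shapes and the 1e-12 stabilizer are illustrative choices, not from the patent):

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean cross-entropy between per-row class probabilities and integer labels."""
    n = len(labels)
    return -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))

# Discriminator outputs over a binary sensitive attribute for three latent codes.
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.5, 0.5]])
z = np.array([0, 1, 1])          # true sensitive attributes
loss = cross_entropy(probs, z)   # ≈ 0.3406
```

Minimizing this loss trains the discriminator; the generator will later be trained against it.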
As a further improvement, in step 3) the entropy of the discriminator's prediction on the perturbed image is increased so that $D$ makes a random guess on the perturbed sample $\hat{x}$. The entropy loss term is expressed as:

$L_{ent} = -H(D(\hat{h}))$

where $H$ denotes entropy. The total loss of the generator $G$ for improving fairness is then expressed as $L_{fair} = -L_D + \beta L_{ent}$, where $\beta$ is a small value controlling the weight of the entropy constraint.
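Maximizing the entropy of the discriminator's output drives it toward a uniform (random) guess rather than a confidently wrong one. A small sketch of the entropy term for the binary case (the stabilizer constant is an illustrative choice):

```python
import numpy as np

def prediction_entropy(probs):
    """Shannon entropy (nats) of each row of predicted class probabilities."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

confident = prediction_entropy(np.array([[0.99, 0.01]]))[0]  # low entropy
uniform = prediction_entropy(np.array([[0.5, 0.5]]))[0]      # ln 2, the maximum
```

The generator's entropy loss is the negative of this quantity, so minimizing the loss maximizes the entropy of the discriminator's prediction.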
As a further improvement, in step 3), besides the fairness-related loss $L_{fair}$, the information of the target label must be kept in the latent representation so that the model's target-label prediction performance is preserved; the loss term responsible for model accuracy is:

$L_{acc} = \mathcal{L}_{CE}(C(\hat{h}), y)$

where $\mathcal{L}_{CE}$ denotes the cross-entropy and $C(\hat{h})$ is the output of the model's target label predictor. During the update of $G$, the discriminator loss $L_D$ is increased while $L_{acc}$ is decreased, fooling the discriminator while retaining target-label prediction accuracy. The balance between $L_{acc}$ and $L_{fair}$ is controlled by a parameter $\alpha$: the higher $\alpha$, the better the main-task accuracy is maintained; the lower $\alpha$, the more fairness is improved. The total loss function of $G$, containing the fairness-awareness loss $L_{fair}$ and the accuracy-retention loss $L_{acc}$, is expressed as

$L_G = \alpha L_{acc} + (1-\alpha) L_{fair}$

With this loss the perturbation generator learns to generate adversarial perturbations that satisfy the requirements, improving model fairness while retaining target-label prediction accuracy.
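The weighting of the loss terms can be sketched as plain arithmetic. Note that the balance parameter, the entropy weight, and the exact combination below are reconstructions, since the original formula images are not preserved in this text:

```python
def generator_loss(l_dis, l_ent, l_acc, alpha=0.7, beta=0.1):
    """Total generator loss: the negated discriminator loss (fool D) plus a
    small entropy term gives the fairness loss; alpha trades it off
    against the accuracy-retention cross-entropy l_acc."""
    l_fair = -l_dis + beta * l_ent
    return alpha * l_acc + (1 - alpha) * l_fair

total = generator_loss(l_dis=1.0, l_ent=0.5, l_acc=2.0, alpha=0.5, beta=0.1)
```

A higher `alpha` preserves main-task accuracy at the cost of a smaller fairness gain, mirroring the trade-off described above.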
as a further improvement, in step 4) of the invention, the disturbance generatorAnd discriminatorConducting a mini-max game until the generator can fool the discriminator well and the target label predictor has a high accuracy, at which point the generator will be usedDeployed as a modelAdaptively generating perturbations for the input data.
As a further improvement, during the min-max game the discriminator $D$ maximizes its ability to predict the sensitive attribute $z$ from the feature space, while the perturbation generator $G$ attempts to fool $D$ as much as possible while keeping the target label of the perturbed sample predictable. The objective function of this process can be formalized as:

$\min_G \max_D \; \mathcal{L}_{CE}(C(E(x+G(x))), y) - \mathcal{L}_{CE}(D(E(x+G(x))), z), \quad \text{s.t. } \|G(x)\|_\infty \le \epsilon$

The parameters to be updated in the objective function are those of $D$ and $G$: $D$ is updated by maximizing (max) the objective, and $G$ by minimizing (min) it. The constraint term expresses that the generator $G$ applies a perturbation to the input image $x$, the perturbed image is $\hat{x} = x + G(x)$, the perturbation satisfies the $\ell_\infty$-norm constraint $\|G(x)\|_\infty \le \epsilon$, and the latent representation of the perturbed data is $\hat{h} = E(\hat{x})$.
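The norm constraint can be enforced by projecting the generator's raw output onto the $\ell_\infty$ ball, i.e. element-wise clipping. This is a standard construction; the patent text does not spell out the projection:

```python
import numpy as np

def project_linf(perturbation, eps):
    """Project an additive perturbation onto the l_inf ball of radius eps."""
    return np.clip(perturbation, -eps, eps)

delta = np.array([0.3, -0.9, 0.05])
clipped = project_linf(delta, eps=0.1)   # -> [0.1, -0.1, 0.05]
```

Applying this projection after every generator step keeps the perturbed image $x + G(x)$ within the budget $\epsilon$ of the original.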
The invention has the following beneficial technical effects:
step 1) in the technical scheme of the invention, firstly, antagonistic disturbance is added to an image to improve the fairness of a model; secondly, a disturbance generator is introduced to generate antagonistic disturbance, so that after training of the disturbance generator is completed, the generator can generate the antagonistic disturbance for any image, fairness of a model is improved, and sensitive attributes and target labels of the image do not need to be known.
In step 2) of the technical scheme, entropy is used in addition to cross-entropy when fooling the discriminator: the generated adversarial perturbation increases the entropy of the discriminator's prediction on the perturbed image, preventing the model from extracting the opposite of the sensitive attribute instead of no sensitive-attribute information at all. For example, if the input image is of a male, the model is expected to extract no gender information after perturbation, rather than extracting the opposite gender.
By modifying the input image, the invention prevents the deployment model from extracting the sensitive characteristics of the data, so fairness can be improved without changing the model. The invention processes the input data of the deployment model rather than the deep learning model itself. Model fairness is improved using adversarial perturbation, for which a perturbation generator and a discriminator are designed: the perturbation generator directly generates the adversarial perturbation, and the discriminator assists its training. The discriminator captures fairness-related sensitive-attribute information and guides the training and optimization of the perturbation generator; the generated adversarial perturbation hides the sensitive-attribute information of the data while retaining target-task-relevant information, preventing the model from extracting sensitive information from the input data during feature extraction and thereby improving the fairness of its predictions.
Drawings
FIG. 1 is a block diagram of the system for improving the fairness of a deep learning model based on adversarial perturbation.
Detailed Description
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below through embodiments with reference to the accompanying drawing. It should be understood that the specific embodiments described here are merely illustrative of the invention and are not intended to limit it.
FIG. 1 is a block diagram of the system for improving the fairness of a deep learning model based on adversarial perturbation. The system comprises a deployment model, a perturbation generator, and a discriminator. The deployment model comprises a feature extractor and a label predictor; the perturbation generator is connected to the feature extractor, and the feature extractor is connected to both the label predictor and the discriminator. The feature extractor takes an image as input and maps it to a latent representation; the latent representation is input to the label predictor to output the prediction of the target label, and input to the discriminator to output the prediction of the image's sensitive attribute.
The method comprises the following steps:
1) adding an adversarial perturbation to the input image with a perturbation generator, inputting the perturbed image to the deployment model, whose feature extractor outputs the latent representation of the image, and inputting the latent representation to the label predictor to obtain the prediction of the target label;
2) training a sensitive-attribute discriminator to predict the sensitive attribute from the latent representation, and updating the discriminator to guide the update of the perturbation generator;
3) updating the perturbation generator to generate better adversarial perturbations that fool the discriminator, so that the latent representation of the perturbed image contains as little sensitive-attribute information as possible while the prediction of the label predictor remains as accurate as possible;
4) repeating steps 2) and 3) until the generator can reliably fool the discriminator and the label predictor retains high accuracy; the generator at this point is integrated into the data preprocessing stage of the deployment model as a fairness-improvement module, adding adversarial perturbations to input images to improve fairness.
the deployment model may be represented asWhereinA representative feature extractor for extracting a feature of the image,representing a label predictor, input image is notedThe sensitivity attribute of the image is recorded asObject tag is marked asThe implicit space output in the feature extraction process is represented asThe final output result of the model is the output of the label predictor. Disturbance generatorIs inputted asThe output is the antagonistic disturbance, the disturbance value and the input imageThe summed values are input to a feature extractor. Distinguishing deviceConnected to the output of the feature extractor, from a hidden spatial representationThe output of the medium prediction sensitive attribute is the predicted value of the sensitive attribute。
The method specifically comprises the following steps:
1) The latent representation output during feature extraction is $h = E(x)$, and the final output of the model is the output of the label predictor. The adversarial perturbation generation module is trained on training data and used to modify the input data: the perturbation generator $G$ adds an adversarial perturbation to the data $x$, the perturbed image is $\hat{x} = x + G(x)$, and the perturbation satisfies the $\ell_\infty$-norm constraint $\|G(x)\|_\infty \le \epsilon$. The perturbed image $\hat{x}$ is input to the deployment model; the feature extractor $E$ of the deployment model outputs the latent representation $\hat{h} = E(\hat{x})$, and inputting the latent representation to the label predictor yields the prediction of the target label $\hat{y} = C(\hat{h})$.
2) Measure the sensitive-attribute information contained in the perturbed image: train the discriminator $D$ to predict the sensitive attribute from the latent representation $\hat{h}$ and update $D$ so that it better captures the sensitive-attribute information, in order to guide the update of the perturbation generator.

The latent representation of the perturbed data is $\hat{h} = E(\hat{x})$, and the discriminator's prediction of the perturbed image's sensitive attribute is $D(\hat{h})$. $D$ is updated so that it accurately captures the sensitive-attribute information in the latent representation; the loss function of $D$ is:

$L_D = \mathcal{L}_{CE}(D(\hat{h}), z)$

where $\mathcal{L}_{CE}$ denotes the cross-entropy and $z$ is the true sensitive attribute. The sensitive-attribute discriminator $D$ is continuously updated by minimizing the loss $L_D$.
3) Update the perturbation generator $G$ to generate better adversarial perturbations that fool the discriminator, so that the data with adversarial perturbation added contain as little sensitive-attribute information as possible in the latent representation while the prediction of the label predictor remains as accurate as possible.

The perturbation generator $G$ must fool the discriminator $D$ and prevent the model from extracting sensitive-attribute information, thereby breaking the association between the sensitive attribute and the target label and improving fairness without changing the model. On the one hand, $L_D$ must be maximized; however, this alone would move the image in feature space to the other side of the sensitive-attribute hyperplane. Therefore, the entropy of $D$'s prediction on the perturbed image $\hat{x}$ must also be increased so that $D$ makes a random guess on $\hat{x}$; the corresponding entropy loss term can be expressed as:

$L_{ent} = -H(D(\hat{h}))$

where $H$ denotes entropy. The total loss of the perturbation generator $G$ for improving fairness is thus $L_{fair} = -L_D + \beta L_{ent}$, where $\beta$ is a small value controlling the weight of the entropy constraint. Besides the fairness-related loss $L_{fair}$, the information of the target label must be kept in the latent representation so that the model's target-label prediction performance is preserved, so a loss term responsible for model accuracy is needed:

$L_{acc} = \mathcal{L}_{CE}(C(\hat{h}), y)$

where $\mathcal{L}_{CE}$ denotes the cross-entropy and $C(\hat{h})$ is the output of the model's label predictor. During the update of $G$, the discriminator loss $L_D$ is increased while $L_{acc}$ is decreased, fooling the discriminator while retaining target-label prediction accuracy. The loss function of $G$ is expressed as:

$L_G = \alpha L_{acc} + (1-\alpha) L_{fair}$

where the parameter $\alpha$ controls the balance between $L_{acc}$ and $L_{fair}$: the higher $\alpha$, the better the main-task accuracy is maintained; the lower $\alpha$, the more fairness is improved.
4) Repeat steps 2) and 3) for iterative training until the generator can reliably fool the discriminator and the label predictor retains high accuracy. The generator at this point is integrated into the data preprocessing stage of the deployment model as a fairness-improvement module, adding adversarial perturbations to the input data to improve the fairness of the deployment model.

During iterative training, the perturbation generator $G$ and the discriminator $D$ play a min-max game until the generator can reliably fool the discriminator and the label predictor is accurate, at which point $G$ is deployed as a data preprocessing module of the model, adaptively generating perturbations for the input data. In the min-max game, the discriminator $D$ maximizes its ability to predict the sensitive attribute $z$ from the feature space, while the perturbation generator $G$ attempts to fool $D$ as much as possible while keeping the target label of the perturbed image predictable. The objective function can be formalized as:

$\min_G \max_D \; \mathcal{L}_{CE}(C(E(x+G(x))), y) - \mathcal{L}_{CE}(D(E(x+G(x))), z), \quad \text{s.t. } \|G(x)\|_\infty \le \epsilon$

The parameters to be updated in the objective function are those of $D$ and $G$: $D$ is updated by maximizing (max) the objective, and $G$ by minimizing (min) it. The constraint term expresses that the generator $G$ applies a perturbation to the input image $x$, the perturbed image is $\hat{x} = x + G(x)$, the perturbation satisfies the $\ell_\infty$-norm constraint, and the latent representation of the perturbed data is $\hat{h} = E(\hat{x})$. When the generator can reliably fool the discriminator and the label predictor retains high accuracy, iterative training stops and $G$ is deployed as the model's data preprocessing module; the generator $G$ can then adaptively generate adversarial perturbations for input images.
According to the method for improving the fairness of a deep learning model based on adversarial perturbation, for a given deployment model a trained perturbation generator adds adversarial perturbations that prevent the model from extracting features related to the sensitive attribute, so that images with different sensitive-attribute values are treated fairly and the fairness of the deployment model is improved. The invention was tested on the CelebA image dataset. In the tests, the target label $y$ is the task label to be predicted by the deployment model, the sensitive attribute $z$ is gender, and both the target label and the sensitive attribute take values in $\{-1, 1\}$. To verify the improvement achieved by the method on models of different fairness levels, models obtained by four different training modes were used as deployment models:
1) Normally trained model: the model is trained to minimize the target-label prediction loss on the dataset;
2) Adversarially trained model: a discriminator is added at the output of the model and learns to predict the sensitive-attribute value; during training, the model's target-label prediction loss on the dataset is minimized while the discriminator loss is maximized, reducing model bias;
3) Label-flipping model: target labels of the training data are randomly flipped to amplify the bias in the dataset, and the target-label prediction loss is minimized on this dataset, so that the model learns the bias present in the data;
4) Gradient-inversion model: the gradient back-propagated from the discriminator in the adversarially trained model is inverted; during training, both the model's target-label prediction loss and the discriminator loss are minimized, amplifying the model's bias.
Table 1-1: Results of the method on the normally trained model
Table 1-2: Results of the method on the adversarially trained model
Table 1-3: Results of the method on the label-flipping model
Table 1-4: Results of the method on the gradient-inversion model
Regarding fairness improvement, the worse the original fairness of the deployment model, the more room there is for improvement and the more pronounced the effect of the method. Tables 1-1 through 1-4 above report results on deployment models of different fairness levels. In each table, the first column lists the target-label prediction tasks, including Smiling, Attractive, and Blond_Hair; the second column indicates whether the input is the original image or the perturbed image; the third column reports ACC, which measures the accuracy of the target task: $ACC = \frac{1}{N}\sum_{i=1}^{N}\mathbb{1}[\hat{y}_i = y_i]$, where $\mathbb{1}[\cdot]$ is the indicator function, $\hat{y}_i$ is the predicted target label of the image, and $y_i$ is its true target label, so ACC is the number of correct predictions divided by the total; higher ACC indicates better target-task prediction performance. The fourth and fifth columns measure fairness. DP computes the difference between the probabilities that groups with different sensitive-attribute values are predicted positive by the model, $DP = |P(\hat{y}=1 \mid z=1) - P(\hat{y}=1 \mid z=-1)|$, where $z$ is the sensitive-attribute value of the image; DEO measures the difference in false positive rate and false negative rate between groups with different sensitive-attribute values. The closer both are to 0, the better the fairness. Each row of a table reports the test results on the deployment model for either the original images or the images perturbed by the invention.
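The three reported metrics can be computed as follows (a numpy sketch using $\{0,1\}$ labels for brevity, whereas the experiments encode labels as $\{-1,1\}$; the DEO variant here sums the per-class positive-rate gaps, which is one common formulation):

```python
import numpy as np

def acc(y_pred, y_true):
    """Fraction of correctly predicted target labels."""
    return np.mean(y_pred == y_true)

def dp_gap(y_pred, z):
    """Demographic parity gap: |P(pred=1 | z=1) - P(pred=1 | z=0)|."""
    return abs(np.mean(y_pred[z == 1] == 1) - np.mean(y_pred[z == 0] == 1))

def deo_gap(y_pred, y_true, z):
    """Sum of the positive-prediction-rate gaps between the two sensitive
    groups on true-negative and true-positive samples (FPR and TPR gaps)."""
    total = 0.0
    for y in (0, 1):
        g1 = y_pred[(z == 1) & (y_true == y)] == 1
        g0 = y_pred[(z == 0) & (y_true == y)] == 1
        total += abs(np.mean(g1) - np.mean(g0))
    return total

y_pred = np.array([1, 1, 0, 0, 1, 0])
y_true = np.array([1, 0, 0, 1, 1, 0])
z      = np.array([1, 1, 1, 0, 0, 0])
```

For all three, values closer to 0 for `dp_gap` and `deo_gap`, and closer to 1 for `acc`, are better.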
The experiments show that when a deployment model has some bias, the method improves fairness to a certain extent while maintaining main-task accuracy; when the deployment model has a large bias, the method markedly improves fairness while effectively maintaining main-task accuracy; and when the deployment model is already fairly fair, the method still yields a small further improvement in fairness while maintaining main-task accuracy.
Table 2-1: Test results of the method on the Alibaba API
Table 2-2: Test results of the method on the Baidu API
Even when the deployment model cannot be accessed, the method still achieves a certain fairness improvement. As shown in Tables 2-1 and 2-2, experiments were conducted on the smile-detection interfaces (APIs) provided by the Alibaba and Baidu vision open platforms, with the perturbation generator trained on the CelebA dataset. Even with the deployment model's architecture and parameters unknown, the method improves fairness while largely maintaining target-label prediction accuracy.
It should be understood that the above description of the preferred embodiments is given for clarity and not for limitation; various changes, substitutions, and alterations can be made without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A system for improving the fairness of a deep learning model based on adversarial perturbation, characterized in that the system comprises a deployment model, a perturbation generator, and a discriminator; the deployment model comprises a feature extractor and a label predictor; the perturbation generator is connected to the feature extractor, and the feature extractor is connected to both the label predictor and the discriminator; the feature extractor takes an image as input and maps it to a latent representation; the latent representation is input to the label predictor to output the prediction of the target label, and input to the discriminator to output the prediction of the image's sensitive attribute.
2. The system for improving the fairness of a deep learning model based on adversarial perturbation of claim 1, characterized in that the input of the perturbation generator is an image and its output is an adversarial perturbation; the perturbation is added to the input image, and the sum is input to the feature extractor.
3. A method for improving the fairness of a deep learning model based on adversarial perturbation, characterized by comprising the following steps:
1) adding an adversarial perturbation to the image with a perturbation generator, inputting the perturbed image to the feature extractor of a deployment model, the feature extractor outputting the latent representation of the image, and inputting the latent representation to a label predictor to obtain the prediction of the target label;
2) measuring the sensitive-attribute information contained in the perturbed image: inputting the latent representation to a discriminator to obtain a prediction of the image's sensitive attribute, training the discriminator to predict the sensitive attribute from the latent representation, and updating the discriminator;
3) updating the perturbation generator to generate better adversarial perturbations that fool the discriminator, so that the latent representation of the perturbed image contains as little sensitive-attribute information as possible while the prediction of the target label predictor remains as accurate as possible;
4) repeating steps 2) and 3) until the generator can reliably fool the discriminator and the target label predictor retains high accuracy; the perturbation generator at this point is integrated into the data preprocessing stage of the deployment model as a fairness-improvement module, adding adversarial perturbations to input images to improve fairness.
4. The method of claim 3, characterized in that the deployment model is expressed as $M = C \circ E$, where $E$ is the feature extractor and $C$ is the target label predictor; the input image is $x$, the sensitive attribute is $z$, and the target label is $y$.
5. The method of claim 4, characterized in that in step 1), a perturbation generator $G$ adds an adversarial perturbation to the image $x$; the perturbed image is $\hat{x} = x + G(x)$, and the perturbation satisfies the $\ell_\infty$-norm constraint $\|G(x)\|_\infty \le \epsilon$; the perturbed image $\hat{x}$ is input to the deployment model, the feature extractor $E$ of the deployment model outputs the latent representation $\hat{h} = E(\hat{x})$, and inputting the latent representation to the label predictor yields the prediction of the target label $\hat{y} = C(\hat{h})$.
6. The method of claim 4, characterized in that in step 2), the discriminator $D$ is updated so that it accurately captures the sensitive-attribute information in the latent representation; the loss function of $D$ is: $L_D = \mathcal{L}_{CE}(D(\hat{h}), z)$, where $\mathcal{L}_{CE}$ denotes the cross-entropy and $z$ is the true sensitive attribute.
7. The method of claim 4, 5, or 6, characterized in that in step 3), the entropy of the discriminator's prediction on the perturbed image is increased so that $D$ makes a random guess on the perturbed sample $\hat{x}$; the entropy loss term is expressed as: $L_{ent} = -H(D(\hat{h}))$, where $H$ denotes entropy.
8. The method of claim 7 for improving deep learning model fairness based on adversarial perturbation, characterized in that in the step 3), besides the entropy loss $L_{ent}$ responsible for fairness, the information of the target label must also be retained in the latent-space representation so that the model's performance on target label prediction is preserved; the loss term responsible for model accuracy is: $L_{acc} = \mathbb{E}_{(x,y)}\big[\mathrm{CE}\big(h(g(\tilde{x})),\, y\big)\big]$, where $\mathrm{CE}$ denotes cross-entropy and $h(g(\tilde{x}))$ is the output of the model's target label predictor; during the update of $\theta_G$, increasing the prediction entropy while reducing $L_{acc}$ deceives the discriminator and maintains the accuracy of target label prediction; $L_{acc}$ and $L_{ent}$ are balanced by a parameter $\lambda$: the higher $\lambda$, the better the primary-task accuracy is maintained, and the lower $\lambda$, the more fairness is improved; the loss function $L_G$ of the generator $G$ is expressed as: $L_G = \lambda L_{acc} + L_{ent}$.
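The accuracy/fairness trade-off described above can be sketched as a single generator objective. `generator_loss`, the default weight, and the toy probabilities are illustrative assumptions under the reading that the combined loss is a weighted cross-entropy term plus a negative-entropy term.

```python
import numpy as np

def generator_loss(task_probs, y, disc_probs, lam=1.0):
    """Assumed combined objective: lam * L_acc + L_ent.

    task_probs: (batch, C) target-label probabilities h(g(x_tilde))
    y:          (batch,) integer target labels
    disc_probs: (batch, K) discriminator output D(g(x_tilde))
    """
    l_acc = float(-np.log(task_probs[np.arange(len(y)), y] + 1e-12).mean())
    l_ent = float((disc_probs * np.log(disc_probs + 1e-12)).sum(axis=-1).mean())
    return lam * l_acc + l_ent

task_probs = np.array([[0.9, 0.1]])  # target-label prediction stays accurate
y = np.array([0])
fooled = np.array([[0.5, 0.5]])      # D guesses randomly
detected = np.array([[0.99, 0.01]])  # D still reads the sensitive attribute
# Fooling D lowers the generator's loss while task accuracy is unchanged:
print(generator_loss(task_probs, y, fooled) < generator_loss(task_probs, y, detected))  # True
```

Raising the weight on the accuracy term penalizes perturbations that damage the target prediction more heavily, which matches the described trade-off between task accuracy and fairness.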
9. The method for improving deep learning model fairness based on adversarial perturbation as claimed in claim 4 or 8, wherein in the step 4), the perturbation generator $G$ and the discriminator $D$ play a min-max game until the generator can reliably fool the discriminator and the target label predictor retains high accuracy; at that point the generator $G$ is deployed as the model's preprocessing module, adaptively generating perturbations for the input data.
10. The method for improving deep learning model fairness based on adversarial perturbation as claimed in claim 9, wherein in the min-max game, the discriminator $D$ maximizes its ability to predict the sensitive attribute $z$ from the feature space, while the perturbation generator $G$ attempts to fool $D$ as much as possible while keeping the target label of the perturbed sample predictable; this process can be formalized as the objective: $\min_{\theta_G} \max_{\theta_D}\ V(\theta_G, \theta_D) = \mathbb{E}_{(x,z)}\big[\log p_D\big(z \mid g(\tilde{x})\big)\big] + \lambda\,\mathbb{E}_{(x,y)}\big[\mathrm{CE}\big(h(g(\tilde{x})),\, y\big)\big]$, subject to $\tilde{x} = x + G(x)$ and $\|G(x)\|_p \le \epsilon$.
wherein the parameters to be updated in the objective function are $\theta_D$ and $\theta_G$: updating $\theta_D$ maximizes (max) the above objective function, and updating $\theta_G$ minimizes (min) it; the constraint term of the objective function expresses that the generator $G$ applies a perturbation to the input image $x$, the perturbed image being $\tilde{x} = x + G(x)$, the perturbation satisfying the norm limitation $\|G(x)\|_p \le \epsilon$, and the latent-space representation of the perturbed data being $r = g(\tilde{x})$.
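The alternating discriminator-maximize / generator-minimize updates can be illustrated on a deliberately tiny toy problem. The 1-D setup, step sizes, and iteration count below are assumptions made purely for illustration; they show only that the game drives the discriminator's prediction of the sensitive attribute toward a random guess.

```python
import numpy as np

# Toy stand-in for the min-max game (not the patent's implementation): the sign
# of x encodes the sensitive attribute z, the discriminator is logistic in the
# "representation" r = x + delta, and the generator learns a single
# perturbation scale a, with delta = -a * x clipped to the budget eps.
sig = lambda t: 1.0 / (1.0 + np.exp(-t))

x = np.array([1.0, -1.0])        # one sample per sensitive-attribute value
z = np.array([1.0, 0.0])
eps, w, a = 1.0, 0.5, 0.0        # budget, discriminator weight, generator scale

for _ in range(2000):
    r = x + np.clip(-a * x, -eps, eps)
    p = sig(w * r)                       # D's prediction of z
    w += 0.1 * np.mean((z - p) * r)      # theta_D step: ascend log-likelihood of z
    # theta_G step: descend L_ent = -H(p); chain rule through p = sig(w * r)
    dLent_dr = np.log(p / (1 - p)) * p * (1 - p) * w
    a = np.clip(a - 0.1 * np.mean(dLent_dr * (-x)), 0.0, 1.0)

p_final = sig(w * (x + np.clip(-a * x, -eps, eps)))
print(np.round(np.abs(p_final - 0.5), 3))  # D is reduced to a near-random guess
```

Even though the discriminator keeps taking ascent steps throughout training, the generator's perturbation removes the attribute-carrying signal from the representation, so the discriminator's final predictions sit at chance level, which is the equilibrium the claim describes.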
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210320949.4A CN114419379A (en) | 2022-03-30 | 2022-03-30 | System and method for improving fairness of deep learning model based on antagonistic disturbance |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114419379A (en) | 2022-04-29 |
Family
ID=81262937
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210320949.4A Pending CN114419379A (en) | 2022-03-30 | 2022-03-30 | System and method for improving fairness of deep learning model based on antagonistic disturbance |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114419379A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020029356A1 (en) * | 2018-08-08 | 2020-02-13 | 杰创智能科技股份有限公司 | Method employing generative adversarial network for predicting face change |
US20200285952A1 (en) * | 2019-03-08 | 2020-09-10 | International Business Machines Corporation | Quantifying Vulnerabilities of Deep Learning Computing Systems to Adversarial Perturbations |
CN111753918A (en) * | 2020-06-30 | 2020-10-09 | 浙江工业大学 | Image recognition model for eliminating sex bias based on counterstudy and application |
CN111881935A (en) * | 2020-06-19 | 2020-11-03 | 北京邮电大学 | Countermeasure sample generation method based on content-aware GAN |
CN112115963A (en) * | 2020-07-30 | 2020-12-22 | 浙江工业大学 | Method for generating unbiased deep learning model based on transfer learning |
Non-Patent Citations (1)
Title |
---|
ZHIBO WANG ET AL: "Fairness-aware Adversarial Perturbation Towards Bias Mitigation for Deployed Deep Models", arXiv *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115017290A (en) * | 2022-07-15 | 2022-09-06 | 浙江星汉信息技术股份有限公司 | File question-answering system optimization method and device based on cooperative confrontation training |
CN115017290B (en) * | 2022-07-15 | 2022-11-08 | 浙江星汉信息技术股份有限公司 | File question-answering system optimization method and device based on cooperative confrontation training |
CN116994309A (en) * | 2023-05-06 | 2023-11-03 | 浙江大学 | Face recognition model pruning method for fairness perception |
CN116994309B (en) * | 2023-05-06 | 2024-04-09 | 浙江大学 | Face recognition model pruning method for fairness perception |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111767405B (en) | Training method, device, equipment and storage medium of text classification model | |
CN111754596B (en) | Editing model generation method, device, equipment and medium for editing face image | |
CN107704495B (en) | Training method, device and the computer readable storage medium of subject classification device | |
CN109583501B (en) | Method, device, equipment and medium for generating image classification and classification recognition model | |
CN107391760A (en) | User interest recognition methods, device and computer-readable recording medium | |
CN114419379A (en) | System and method for improving fairness of deep learning model based on antagonistic disturbance | |
CN110796199B (en) | Image processing method and device and electronic medical equipment | |
JP2022141931A (en) | Method and device for training living body detection model, method and apparatus for living body detection, electronic apparatus, storage medium, and computer program | |
CN108961358B (en) | Method and device for obtaining sample picture and electronic equipment | |
WO2023038574A1 (en) | Method and system for processing a target image | |
CN114842343A (en) | ViT-based aerial image identification method | |
CN111753918A (en) | Image recognition model for eliminating sex bias based on counterstudy and application | |
CN114155397A (en) | Small sample image classification method and system | |
CN110807291B (en) | On-site situation future guiding technology based on mimicry countermeasure learning mechanism | |
CN115761408A (en) | Knowledge distillation-based federal domain adaptation method and system | |
CN115063664A (en) | Model learning method, training method and system for industrial vision detection | |
Lauren et al. | A low-dimensional vector representation for words using an extreme learning machine | |
CN113240080A (en) | Prior class enhancement based confrontation training method | |
CN114495114B (en) | Text sequence recognition model calibration method based on CTC decoder | |
CN115795355A (en) | Classification model training method, device and equipment | |
CN113627498B (en) | Character ugly image recognition and model training method and device | |
CN111651626B (en) | Image classification method, device and readable storage medium | |
CN113689514A (en) | Theme-oriented image scene graph generation method | |
CN113449631A (en) | Image classification method and system | |
CN111598144A (en) | Training method and device of image recognition model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20220429 |