CN112115963A - A method for generating unbiased deep learning models based on transfer learning

Info

Publication number: CN112115963A
Application number: CN202010750897.5A
Authority: CN
Inventors: 陈晋音; 陈治清; 徐国宁; 徐思雨; 缪盛欢; 郑海斌
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2020-07-30
Filing date: 2020-07-30
Publication date: 2020-12-22
Anticipated expiration: 2040-07-30
Also published as: CN112115963B

Abstract

The invention discloses a method for generating an unbiased deep learning model based on transfer learning, comprising the following steps: (1) constructing an original data set in which each sample image carries a task label and a bias label; (2) training a biased deep learning model on the original data set; (3) constructing and training an adversarial attack network, and using the trained network to attack the original data set to obtain an unbiased data set; (4) using the unbiased data set to train an initial unbiased deep learning model with the same structure as the biased deep learning model; (5) preparing a third feature extractor and optimizing its parameters with a transfer learning strategy, the third feature extractor with the resulting parameters and the second classifier of the trained initial unbiased deep learning model together forming the unbiased deep learning model. The method ensures the fairness of the deep learning model when it makes automatic decisions based on input images, thereby improving the accuracy of image recognition.

Description

A method for generating unbiased deep learning models based on transfer learning

Technical Field

The invention belongs to the field of deep learning, and in particular relates to a method for generating an unbiased deep learning model based on transfer learning.

Background Art

With its powerful ability to learn the inherent regularities and highly abstract features of sample data sets, deep learning helps people make decisions automatically and has solved many complex pattern recognition problems. It has therefore been applied, with good results, in medical diagnosis, speech recognition, image recognition, natural language understanding, advertising, credit, employment, education, and criminal justice. As researchers continue to explore and innovate, the performance of deep learning keeps improving and its applications keep widening, with a profound impact on people's daily lives.

Although deep learning can help people obtain more accurate predictions, recent research shows that deep learning models can also be biased when making automatic decisions. Such bias may show up as follows: the predicted probability of reoffending is far higher for Black defendants than for white defendants; when predicting the gender of a person in a photo, accuracy is far higher for white people than for Black people; and in searches for software engineers, the proportion of men is far higher than that of women. In some important settings, such as a business using a deep learning model to make decisions, this bias may place the business in a high-risk commercial environment; yet if the business abandons deep learning models, it may lose its competitive advantage and be eliminated, because automated decision support based on deep learning is the trend of the times. The bias in deep learning models therefore has many negative effects on society and has reached into many fields, so it is particularly important to study the measurement of deep learning algorithms and their fairness.

The main causes of bias in deep learning models are that the sample data set itself is biased, that the deep learning model amplifies this bias, and that the evaluation of the deep learning model is biased. Current work on removing bias from deep learning models therefore mainly includes preprocessing the sample data set to remove bias, making small-scale modifications to the model parameters to remove the bias present in the model, and evaluating the model for fairness. However, existing debiasing methods often consider only one of the factors that produce bias. For example, when bias is removed directly by preprocessing the sample data set, the trained model has never learned from a biased data set, so it may be sensitive to biased or irrelevant features when it later has to recognize original, biased data. Moreover, because the model's amplification of bias is not taken into account, the trained model removes only part of the bias.

In view of the bias described above and the limitations of current debiasing methods, studying a method that generates an unbiased deep learning model based on transfer learning, so as to help people make automatic decisions, is of great theoretical and practical significance.

Summary of the Invention

The purpose of the present invention is to provide a method for generating an unbiased deep learning model based on transfer learning. Through knowledge transfer, the deep learning model automatically filters out biased features while learning from sample data, which ensures its fairness when it makes automatic decisions based on input images and thereby improves the accuracy of image recognition.

To achieve the above purpose, the present invention provides the following technical solution:

A method for generating an unbiased deep learning model based on transfer learning, comprising the following steps:

(1) Obtain sample images and mark each sample image with a task label and a bias label to construct the original data set;

(2) Use the image data and task labels in the original data set to train a biased deep learning model composed of a first feature extractor and a first classifier, obtaining a trained biased deep learning model;

(3) Construct and train an adversarial attack network, and use the trained adversarial attack network to attack the original data set, obtaining an unbiased data set corresponding to the original data set, such that the bias labels in the unbiased data set cannot be predicted;

(4) Use the unbiased data set to train an initial unbiased deep learning model with the same structure as the biased deep learning model;

(5) Prepare a third feature extractor. Construct a loss function from the feature distribution that the third feature extractor extracts from an original sample image and the feature distribution that the second feature extractor of the initial unbiased deep learning model extracts from the unbiased image corresponding to that original sample image, and use the loss function to optimize the parameters of the third feature extractor. The third feature extractor with the resulting parameters and the second classifier of the trained initial unbiased deep learning model together form the unbiased deep learning model.

Preferably, constructing and training the adversarial attack network comprises:

Constructing an adversarial attack network composed of convolutional layers and fully connected layers, with ReLU as the activation function. The input of the adversarial attack network is the feature distribution extracted from an original sample image by the trained first feature extractor; the output is a logits layer, which is passed through a softmax function to obtain the predicted probability distribution of the bias label of the original sample image;

Constructing the loss function Loss_NAdv of the adversarial attack network, which is designed to make the adversarial attack network predict the probability distribution of the bias label from the feature distribution of the original sample image. It is computed as:

where z_i is the output of the first feature extractor of the trained biased deep learning model for the original sample image x_i; B_i is the true bias label of x_i; nadv(·) denotes the output of the adversarial attack network; L(·) denotes the cross-entropy function; i is the index of the original sample image; and N is the total number of original sample images;
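The formula for Loss_NAdv is not reproduced in this text. One form consistent with the definitions above, assuming the cross-entropy is averaged over the N samples (an assumption; a plain sum would differ only by a constant factor), is:

```latex
\[
\mathrm{Loss\_NAdv} \;=\; \frac{1}{N}\sum_{i=1}^{N} L\bigl(\mathrm{nadv}(z_i),\, B_i\bigr)
\]
```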

Training the adversarial attack network with the loss function Loss_NAdv to optimize its model parameters.

Preferably, using the trained adversarial attack network to attack the original data set to obtain the corresponding unbiased data set comprises:

(a) Design a perturbation variable r;

(b) Add the perturbation variable r to the original sample image x_i to obtain a perturbed sample image, and use the first feature extractor of the trained biased deep learning model to extract the perturbed feature distribution of the perturbed sample image;

(c) Apply the trained adversarial attack network to the perturbed feature distribution to obtain a predicted probability distribution, and compute the loss Loss_Adv from it. While the maximum number of iterations has not been reached, update the perturbation variable r according to Loss_Adv and return to step (b). Once the maximum number of iterations is reached, output the perturbed sample image obtained with the latest r as the unbiased image; these images make up the unbiased data set;

The loss Loss_Adv is computed as:

Loss_Adv = -α·Loss_NAdv + Loss_Y

where α is a hyperparameter in the range 0 to 1, and Loss_Y is the loss on the task labels (the labels other than the bias label), computed as:

where c₁(·) denotes the predicted output of the first classifier of the biased deep learning model, and y_i denotes the task label of the original sample image x_i.
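The formula for Loss_Y is likewise not reproduced in this text. A form consistent with the surrounding definitions, assuming that the first classifier c₁ is applied to the features z_i produced by the first feature extractor, that L is the same cross-entropy function used in Loss_NAdv, and that the loss is averaged over N samples, would be:

```latex
\[
\mathrm{Loss\_Y} \;=\; \frac{1}{N}\sum_{i=1}^{N} L\bigl(c_1(z_i),\, y_i\bigr)
\]
```

Under this reading, minimizing Loss_Adv with respect to r increases the attack network's error on the bias label (the term with the negative sign) while keeping the task prediction accurate.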

The loss function Loss_tl used to optimize the parameters of the third feature extractor is:

Loss_tl = ∑L(h, h')

where h denotes the feature distribution output by the third feature extractor for the original sample image x_i, and h' denotes the feature distribution that the second feature extractor of the initial unbiased deep learning model extracts from the unbiased image corresponding to x_i.

After the sample images are obtained, they are augmented by rotation, flipping, color enhancement, addition of Gaussian noise, and random scaling. The bias labels include race labels, region labels, and gender labels.

Preferably, the first feature extractor and the second feature extractor use the ResNet-50 model;

The first classifier and the second classifier of the initial unbiased deep learning model use a network composed of fully connected layers.

The first classifier and the second classifier of the initial unbiased deep learning model use a network composed of 4 fully connected layers;

The adversarial attack network uses a network composed of 3 convolutional layers and 4 fully connected layers, with ReLU as the activation function.

Preferably, the training parameters of the adversarial attack network, the initial unbiased deep learning model, and the third feature extractor are set as follows: the batch size is 32, the maximum number of training iterations is 60, the optimizer is Adam, the learning rate is 0.001, and the exponential decay rates of the first-moment and second-moment estimates are 0.9 and 0.999, respectively.
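To make the optimizer settings concrete, the following is a minimal sketch of a single Adam update using the stated learning rate (0.001) and decay rates (0.9 and 0.999). The function name `adam_step`, the flat-list parameter layout, and the stabilizing constant `eps=1e-8` are assumptions for illustration; the text does not specify them.

```python
import math

def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to flat lists and return (params, m, v).

    t is the 1-based step counter. eps is an assumed stabilizer; the
    text only gives lr, beta1 and beta2.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1 - beta1 ** t)             # bias-corrected moments
        v_hat = v_i / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v
```

On the first step the bias correction makes the update size approximately equal to the learning rate, regardless of the gradient's scale.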

Compared with the prior art, the beneficial effects of the present invention include at least the following:

In the method for generating an unbiased deep learning model based on transfer learning provided by the embodiments of the present invention, the transfer learning strategy gives the deep learning model the ability to automatically filter out the biased features of sample data, which ensures the fairness of the model's decisions and improves the accuracy of image recognition.

Brief Description of the Drawings

To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of the method for generating an unbiased deep learning model based on transfer learning provided by an embodiment of the present invention;

FIG. 2 is a system framework diagram of the unbiased deep learning model based on transfer learning provided by an embodiment of the present invention;

FIG. 3 is a flowchart of the construction of the unbiased data set provided by an embodiment of the present invention;

FIG. 4 is a training flowchart of the unbiased deep learning model based on transfer learning provided by an embodiment of the present invention.

Detailed Description

To make the purpose, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit its scope of protection.

To address inaccurate image recognition caused by bias in deep learning models, this embodiment provides a method for generating an unbiased deep learning model based on transfer learning. As shown in FIG. 1, the method includes the following steps:

(1) Definition of deep learning model bias.

The present invention defines the decision bias of a deep learning model as the erroneous decisions that result when the model relies on spuriously correlated features during automatic decision-making. The spuriously correlated features, that is, the bias features, may be race, region, gender, and so on. For example, when the bias feature is gender and a deep learning model is used to screen the résumés of software engineers, women tend to be rejected at a higher rate, which leads to discrimination against women in that occupation and to unfair model decisions. The present invention therefore removes the model's bias by filtering out irrelevant features before classification.

(2) Data set preparation and preprocessing.

This embodiment selects a multi-label image classification data set, such as the COCO data set, and takes one of its labels as the bias label B, representing the bias feature, for example gender. One or more of the remaining labels are selected as task labels, for example an occupation label. The data set is then preprocessed. To improve the recognition accuracy of the deep learning model, data augmentation may be added to expand the data set, including rotation, flipping, color enhancement, addition of Gaussian noise, and random scaling. The augmented images form the original data set, which is divided into a training set and a test set at a ratio of 7:3, used respectively for transfer learning and testing of the unbiased model.
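The 7:3 division can be sketched as follows. The text does not say whether samples are shuffled before splitting, so the shuffle, the fixed seed, and the function name `split_dataset` are assumptions made for illustration.

```python
import random

def split_dataset(samples, train_ratio=0.7, seed=0):
    """Shuffle the samples and split them into (train, test) at train_ratio."""
    items = list(samples)
    random.Random(seed).shuffle(items)        # assumed: shuffle before splitting
    cut = int(round(len(items) * train_ratio))
    return items[:cut], items[cut:]

# Each sample would carry an image together with its task label and bias label.
train_set, test_set = split_dataset(range(10))
```

Every sample lands in exactly one of the two subsets, so the task labels and bias labels attached to a sample stay together.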

(3) Build and train the biased deep learning model.

In this embodiment, the biased deep learning model consists of a first feature extractor and a first classifier. The first feature extractor uses the ResNet-50 model structure, and the first classifier is a network of 4 fully connected layers. The biased deep learning model is trained on the training set of the original data set and then tested and tuned on the test set until it reaches a preset recognition accuracy. Because the original data set used for training contains bias features, the resulting model is called the biased deep learning model.

(4) Build and train the adversarial attack network.

The adversarial attack network aims to predict the bias label B from the output features of the first feature extractor of the biased deep learning model, and is mainly used to generate the unbiased data set. The specific process is as follows:

(4.1) Construct the structure of the adversarial attack network. The adversarial attack network is composed of 3 convolutional layers and 4 fully connected layers, with ReLU as the activation function. Its input is z, the output of the first feature extractor of the biased deep learning model from step (3) on the original data set; z represents the feature distribution of the original data set. Its output is a logits layer, which is passed through a softmax function to obtain the predicted probability distribution of the bias labels of the original data set.

(4.2) Design the loss function of the adversarial attack network. The loss function is designed to make the adversarial attack network predict the probability distribution of the bias label from the feature distribution of the original sample image. It is computed as follows:

where z_i is the output of the first feature extractor of the biased deep learning model from step (3) for the original sample image x_i; B_i is the true bias label of x_i; nadv(·) denotes the output of the adversarial attack network; and L(·) denotes the cross-entropy function.
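As an illustration of how such a loss could be evaluated, the sketch below computes the cross-entropy between the attack network's logits and the true bias labels and averages it over a batch. Averaging (rather than summing) and the function names are assumptions; the logits stand in for nadv(z_i) before the softmax.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in logits))
    return log_sum - logits[label]

def loss_nadv(batch_logits, bias_labels):
    """Mean cross-entropy over a batch of attack-network outputs."""
    losses = [cross_entropy(l, b) for l, b in zip(batch_logits, bias_labels)]
    return sum(losses) / len(losses)
```

A confident correct prediction of the bias label gives a loss near zero, while a confident wrong prediction gives a large loss.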

(4.3) Train the adversarial attack network. The batch size is set to 32, the maximum number of training iterations to 60, the optimizer is Adam, the learning rate is 0.001, and the exponential decay rates of the first-moment and second-moment estimates are 0.9 and 0.999, respectively. The adversarial attack network is trained on the training set of the original data set together with its bias labels B, then tested and tuned on the test set until it reaches a preset recognition accuracy.

(5) Generate the unbiased data set. The adversarial attack network is used to attack the original data set: perturbations are added to the original data so that the bias label B of the resulting unbiased data set cannot be predicted, while the other labels are unaffected. The samples of the original data set and of the unbiased data set correspond one to one.

Design the loss function of the adversarial attack, computed as follows:

Loss_Adv = -α·Loss_NAdv + Loss_Y    (2)

where α is a hyperparameter and Loss_Y is the loss on the classification labels other than the bias label B, computed as follows:

where c₁(·) denotes the output of the first classifier of the biased deep learning model from step (3), and y_i denotes the true label of the original sample image x_i.
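Formula (2) combines the two terms with opposite signs. The small helper below shows the arithmetic; the name `loss_adv` and the default `alpha=0.5` are illustrative assumptions (the summary only states that α lies between 0 and 1).

```python
def loss_adv(loss_nadv_value, loss_y_value, alpha=0.5):
    """Formula (2): Loss_Adv = -alpha * Loss_NAdv + Loss_Y."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is expected to lie in [0, 1]")
    return -alpha * loss_nadv_value + loss_y_value
```

Because Loss_NAdv enters with a negative sign, lowering Loss_Adv rewards perturbations that make the bias label harder to predict, while the Loss_Y term penalizes perturbations that damage the task prediction.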

Design the perturbation variable r as a width × height × 3 matrix, where width and height are the width and height of the sample image and 3 corresponds to its three RGB channels. r is initialized to a zero matrix and optimized with the Adam optimizer, using the same optimizer parameters as in step (4.3).

The maximum number of iterations for a single sample is set to 1000. As shown in FIG. 3, the unbiased data for each sample of the original data set is generated as follows:

(5.1) Input the original sample image x_i; go to step (5.2);

(5.2) Add the perturbation variable r to the original sample image x_i to obtain the perturbed sample image x_i'; go to step (5.3);

(5.3) Input the perturbed sample image x_i' into the first feature extractor of the biased deep learning model from step (3) and obtain the output z_i; go to step (5.4);

(5.4) Input z_i into the adversarial attack network from step (4) and obtain the output nadv(z_i); go to step (5.5);

(5.5) Compute the loss function value according to formulas (1) to (3); go to step (5.6);

(5.6) Check whether the maximum number of iterations has been reached. If so, output the perturbed sample image x_i' as the unbiased image, which becomes part of the unbiased data set, and end the iteration; if not, go to step (5.7);

(5.7) Use Adam to update the perturbation variable r according to the loss function value; go to step (5.2).
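Steps (5.1) to (5.7) can be sketched on a toy problem as follows. This is only an illustration under strong assumptions: the feature extractor, attack network, and classifier are replaced by small fixed linear maps (hypothetical stand-ins for ResNet-50 and the trained networks), the image is a flat list of numbers, gradients are obtained by finite differences instead of backpropagation, and `alpha`, `eps`, and `h` are assumed values.

```python
import math

def cross_entropy(logits, label):
    shift = max(logits)
    return shift + math.log(sum(math.exp(v - shift) for v in logits)) - logits[label]

def linear(weights, x):
    """Toy stand-in for a network: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def make_unbiased(x, bias_label, task_label, f1, nadv, c1, alpha=0.5,
                  max_iter=1000, lr=0.001, beta1=0.9, beta2=0.999,
                  eps=1e-8, h=1e-5):
    """Optimize a perturbation r; return (x + r, initial loss, final loss)."""
    def objective(r):
        x_pert = [a + b for a, b in zip(x, r)]                  # (5.2)
        z = linear(f1, x_pert)                                  # (5.3)
        loss_nadv = cross_entropy(linear(nadv, z), bias_label)  # (5.4)
        loss_y = cross_entropy(linear(c1, z), task_label)
        return -alpha * loss_nadv + loss_y                      # (5.5)

    r = [0.0] * len(x)          # r is initialized to zero
    m = [0.0] * len(x)
    v = [0.0] * len(x)
    start = objective(r)
    for t in range(1, max_iter + 1):                            # (5.6)
        grad = []
        for k in range(len(r)):  # central finite differences
            r_hi = r[:k] + [r[k] + h] + r[k + 1:]
            r_lo = r[:k] + [r[k] - h] + r[k + 1:]
            grad.append((objective(r_hi) - objective(r_lo)) / (2 * h))
        for k, g in enumerate(grad):                            # (5.7) Adam
            m[k] = beta1 * m[k] + (1 - beta1) * g
            v[k] = beta2 * v[k] + (1 - beta2) * g * g
            m_hat = m[k] / (1 - beta1 ** t)
            v_hat = v[k] / (1 - beta2 ** t)
            r[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return [a + b for a, b in zip(x, r)], start, objective(r)

# Toy setup: feature 0 carries the bias signal, feature 1 the task signal.
f1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
nadv = [[2.0, 0.0], [-2.0, 0.0]]
c1 = [[0.0, 1.0], [0.0, -1.0]]
x_unbiased, start_loss, end_loss = make_unbiased(
    [0.5, -0.2, 0.1], bias_label=0, task_label=1, f1=f1, nadv=nadv, c1=c1)
```

In this toy setup the perturbation pushes the bias-carrying feature away from its original value, so the stand-in attack network becomes less able to recover the bias label, while the task-carrying feature moves in the direction that keeps the task prediction correct.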

(6) Build and train the initial unbiased deep learning model.

The initial unbiased deep learning model has the same structure as the biased deep learning model in step (3). That is, it consists of a second feature extractor with the same structure as the first feature extractor and a second classifier with the same structure as the first classifier.

The batch size is set to 32, the maximum number of training iterations to 60, the optimizer is Adam, the learning rate is 0.001, and the exponential decay rates of the first-moment and second-moment estimates are 0.9 and 0.999, respectively. The initial unbiased deep learning model is trained on the unbiased data set generated in step (5), then tested and tuned on the test set until it reaches a preset recognition accuracy. The output of the second feature extractor of the initial unbiased deep learning model is denoted h', and its second classifier is denoted c₂(·).

(7) Design the training framework for the unbiased deep learning model based on transfer learning.

As shown in FIG. 2, the specific process of step (7) is:

(7.1) Design the structure of the unbiased deep learning model based on transfer learning. It uses a third feature extractor, whose input is the original sample image x_i and whose output is the intermediate feature h; as its classifier (the third classifier), it uses the trained second classifier c₂(·) of the initial unbiased deep learning model;

(7.2) Design the loss function of the unbiased deep learning model based on transfer learning. The loss function is intended to make the unbiased deep learning model learn the knowledge of the initial unbiased deep learning model, so that it can automatically filter out biased features when faced with the original data set. It is computed as follows:

Loss_tl = ∑L(h, h')    (4)

where h denotes the output of the third feature extractor of the unbiased deep learning model, and h' denotes the output of the second feature extractor of the initial unbiased deep learning model.

(7.3) Set the initial training parameters: the batch size is 32, the maximum number of training iterations is 60, the optimizer is Adam, the learning rate is 0.001, and the exponential decay rates of the first-moment and second-moment estimates are 0.9 and 0.999, respectively.

(7.4) Design the training procedure of the unbiased deep learning model. With reference to FIG. 4, the specific process is as follows:

(7.4.1) Input the original sample image x_i into the third feature extractor and obtain the output h; go to step (7.4.2);

(7.4.2) Input the unbiased image x_i' corresponding to the original sample image x_i into the second feature extractor of the initial unbiased deep learning model and obtain the output h'; go to step (7.4.3);

(7.4.3) Compute the loss function value according to formula (4); go to step (7.4.4);

(7.4.4) Update the parameters of the unbiased deep learning model according to the loss function value; go to step (7.4.5);

(7.4.5) Check whether the maximum number of iterations has been reached. If so, save the model and end training; if not, go to step (7.4.1).
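The training loop of steps (7.4.1) to (7.4.5) amounts to feature matching: the third feature extractor sees the original image and is pulled toward the features that the second feature extractor produced for the corresponding unbiased image. The sketch below illustrates this on a toy linear extractor. Several choices are assumptions rather than the patent's: L is taken to be the mean squared error, plain gradient descent is swapped in for Adam to keep the example short, the learning rate of 0.1 is chosen for the toy data, and the reference features h' are supplied as precomputed lists.

```python
def extract(weights, x):
    """Toy linear feature extractor standing in for the third extractor."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def feature_loss(h, h_ref):
    """Mean squared error between feature vectors (assumed choice of L)."""
    return sum((a - b) ** 2 for a, b in zip(h, h_ref)) / len(h)

def total_loss(originals, ref_features, weights):
    """Formula (4): sum of L(h, h') over all samples."""
    return sum(feature_loss(extract(weights, x), h_ref)
               for x, h_ref in zip(originals, ref_features))

def train_third_extractor(originals, ref_features, weights, lr=0.1, max_iter=60):
    """Update weights in place so that extract(x_i) approaches h'_i."""
    for _ in range(max_iter):                          # (7.4.5)
        for x, h_ref in zip(originals, ref_features):
            h = extract(weights, x)                    # (7.4.1)
            for j, row in enumerate(weights):          # (7.4.3)-(7.4.4)
                scale = 2.0 * (h[j] - h_ref[j]) / len(h)   # dL/dh_j
                for k in range(len(row)):
                    row[k] -= lr * scale * x[k]
    return weights

# Toy data: two original "images" and the features h' of their unbiased
# counterparts, as the second extractor would produce them in (7.4.2).
originals = [[1.0, 0.0], [0.0, 1.0]]
ref_features = [[0.5, -0.5], [0.2, 0.3]]
weights = [[0.0, 0.0], [0.0, 0.0]]
loss_before = total_loss(originals, ref_features, weights)
train_third_extractor(originals, ref_features, weights)
loss_after = total_loss(originals, ref_features, weights)
```

Only the third feature extractor is updated; the reference features and the second classifier stay fixed, which is what lets the trained classifier c₂(·) be reused on top of the new extractor.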

(8) Train and test the unbiased deep learning model.

Following the training procedure of step (7.4), the third feature extractor of the unbiased deep learning model is trained with the original data set and the unbiased data set. After training, the third feature extractor is connected to the trained second classifier of the initial unbiased deep learning model from step (6) to form the unbiased deep learning model, that is, the unbiased deep learning model generated by the present invention based on transfer learning. The test set of the original data set is used to measure the bias degree λ of this model; the smaller λ is, the fairer the model's decisions. It is computed as follows:

where n denotes the total number of samples in the test set of the original data set; nadv(h_i) denotes the output obtained by passing the sample x_i through the feature extractor of the transfer-learned unbiased model and then the adversarial attack network; and l[·] denotes the indicator function, which equals 1 when the equality in the brackets holds and 0 otherwise.
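The formula for λ is not reproduced in this text. One reading consistent with the definitions above is that λ is the fraction of test samples whose bias label the adversarial attack network still predicts correctly from h_i, that is, λ = (1/n) Σ l[argmax nadv(h_i) = B_i]. The sketch below computes that quantity; both this reading and the function name `bias_degree` are assumptions.

```python
def bias_degree(attack_outputs, bias_labels):
    """Fraction of samples whose bias label is predicted correctly.

    attack_outputs holds one probability (or logit) vector per sample,
    standing in for nadv(h_i); bias_labels holds the true labels B_i.
    """
    hits = 0
    for scores, label in zip(attack_outputs, bias_labels):
        predicted = max(range(len(scores)), key=scores.__getitem__)
        hits += 1 if predicted == label else 0    # indicator function l[.]
    return hits / len(bias_labels)

# Four test samples with a binary bias label.
lam = bias_degree([[0.9, 0.1], [0.4, 0.6], [0.7, 0.3], [0.2, 0.8]], [0, 0, 0, 1])
```

Under this reading, a lower λ means the features h retain less information from which the bias label can be recovered.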

上述提供的基于迁移学习生成无偏见深度学习模型的方法，提出了一种新的无偏模型训练框架，通过无偏模型的知识迁移和对抗攻击网络来生成无偏见深度学习模型，提出的无偏模型训练方法同时解决了训练数据和模型结构参数存在偏见的问题，进一步保证了模型决策的公平性。提出的迁移学习无偏模型可以选择简单的结构，大大降低深度学习模型在实际应用中的计算时间。提出的基于迁移学习的策略，可以在保证原始目标分类任务精度的情况下，使深度学习模型能够自动过滤偏见特征，保证深度学习模型在决策时的公平性，为研究消除深度学习模型偏见提供指导。The method for generating an unbiased deep learning model based on transfer learning provided above proposes a new unbiased model training framework, which generates an unbiased deep learning model through the knowledge transfer of the unbiased model and the adversarial attack network. The model training method simultaneously solves the problem of bias in training data and model structure parameters, and further ensures the fairness of model decision-making. The proposed unbiased model for transfer learning can choose a simple structure, which greatly reduces the computational time of deep learning models in practical applications. The proposed strategy based on transfer learning can enable the deep learning model to automatically filter biased features while ensuring the accuracy of the original target classification task, ensure the fairness of the deep learning model in decision-making, and provide guidance for research on eliminating the bias of deep learning models. .

The specific embodiments described above explain the technical solutions and beneficial effects of the present invention in detail. It should be understood that they are only the most preferred embodiments of the present invention and are not intended to limit it. Any modification, supplement, or equivalent substitution made within the scope of the principles of the present invention shall fall within its protection scope.

Claims

1. A method for generating an unbiased deep learning model based on transfer learning, characterized in that it comprises the following steps:

(1) Obtain sample images and annotate each sample image with a task label and a bias label to construct the original data set;

(2) Use the image data and task labels in the original data set to train a biased deep learning model composed of a first feature extractor and a first classifier, obtaining a trained biased deep learning model;

(3) Construct and train an adversarial attack network, use the trained adversarial attack network to attack the original data set, and obtain an unbiased data set corresponding to the original data set, such that the bias labels cannot be predicted from the unbiased data set;

(4) Use the unbiased data set to train an initial unbiased deep learning model that has the same structure as the biased deep learning model;

(5) Prepare a third feature extractor. Construct a loss function from the feature distribution that the third feature extractor extracts from an original sample image and the feature distribution that the second feature extractor of the initial unbiased deep learning model extracts from the unbiased image corresponding to that original sample image, and optimize the parameters of the third feature extractor with this loss function. The third feature extractor with the optimized parameters and the second classifier of the trained initial unbiased deep learning model together form the unbiased deep learning model.

2. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein constructing and training the adversarial attack network comprises:

Construct an adversarial attack network comprising convolutional layers and fully connected layers, with ReLU as the activation function. The input of the adversarial attack network is the feature distribution that the trained first feature extractor extracts from the original sample image. The output is a logits layer, to which a softmax function is applied to obtain the predicted probability distribution of the bias label of the original sample image;

Construct the loss function Loss_NAdv of the adversarial attack network. The loss function is designed to make the adversarial attack network predict the probability distribution of bias labels according to the feature distribution corresponding to the original sample image. The calculation formula is:

where z_i is the output of the first feature extractor of the pre-trained biased deep learning model for the original sample image x_i; B_i is the true bias label of the original sample image x_i; nadv(·) denotes the output of the adversarial attack network; L(·) denotes the cross-entropy function; i is the index of the original sample image; and N is the total number of original sample images;

The adversarial attack network is trained with the loss function Loss_NAdv to optimize the model parameters of the adversarial attack network.
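The formula for Loss_NAdv is missing from this text. The sketch below shows one way the described loss could be computed, assuming it is the cross-entropy L between softmax(nadv(z_i)) and B_i averaged over the N samples. The averaging, and the function names `softmax`, `cross_entropy`, and `loss_nadv`, are assumptions for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one logits vector."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """L(p, B): negative log-probability assigned to the true class."""
    return -math.log(probs[label])


def loss_nadv(attack_logits: Sequence[Sequence[float]],
              bias_labels: Sequence[int]) -> float:
    """Mean cross-entropy between softmax(nadv(z_i)) and B_i.

    ASSUMPTION: averaged over the N samples (the source formula is missing).
    """
    n = len(bias_labels)
    total = sum(cross_entropy(softmax(logits), b)
                for logits, b in zip(attack_logits, bias_labels))
    return total / n


# Hypothetical logits: an uninformative attacker versus a confident, correct one.
uniform_loss = loss_nadv([[0.0, 0.0], [0.0, 0.0]], [0, 1])
confident_loss = loss_nadv([[5.0, -5.0], [-5.0, 5.0]], [0, 1])
```

Minimizing this quantity trains the attack network to read the bias label out of the features, which is why an uninformative attacker scores higher than a confident, correct one.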

3. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 2, wherein using the trained adversarial attack network to attack the original data set and obtain the unbiased data set corresponding to the original data set comprises:

(a) Design a perturbation variable r;

(b) Add the perturbation variable r to the original sample image x_i to obtain a perturbed sample image, and use the first feature extractor of the trained biased deep learning model to extract the perturbed feature distribution of the perturbed sample image;

(c) Use the trained adversarial attack network to compute the predicted probability distribution from the perturbed feature distribution, and calculate the loss Loss_Adv from the predicted probability distribution. While the maximum number of iterations has not been reached, update the perturbation variable r according to Loss_Adv and return to step (b). Once the maximum number of iterations is reached, output the perturbed sample image obtained with the latest perturbation variable r as an unbiased image; these images form the unbiased data set;

The formula for calculating the loss Loss_Adv is:

Loss_Adv=-αLoss_NAdv+Loss_Y

where α is a hyperparameter with a value in the range 0 to 1, and Loss_Y is the loss on the task label (as distinct from the bias label), calculated as:

where c_1(·) denotes the predicted output of the first classifier of the biased deep learning model, and y_i denotes the task label of the original sample image x_i.
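Steps (a) to (c) can be illustrated with a toy loop. This is only a sketch under several assumptions that the claim does not state: the feature extractor is replaced by the identity, the attack network and first classifier are replaced by fixed linear heads, Loss_Y is taken to be a cross-entropy on the task label, r is updated by plain gradient descent on Loss_Adv, and the gradient is approximated by finite differences. All names and numeric values (`perturb`, `lr`, `eps`, the weights) are hypothetical.

```python
import math
from typing import List, Sequence


def softmax_ce(logits: Sequence[float], label: int) -> float:
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]


def linear(weights: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Toy stand-in for a network head: one dot product per output class."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def loss_adv(x, r, y, b, attack_w, task_w, alpha):
    """Loss_Adv = -alpha * Loss_NAdv + Loss_Y for one perturbed sample x + r."""
    feats = [xi + ri for xi, ri in zip(x, r)]        # identity "feature extractor"
    loss_b = softmax_ce(linear(attack_w, feats), b)  # bias-label loss
    loss_y = softmax_ce(linear(task_w, feats), y)    # task-label loss
    return -alpha * loss_b + loss_y


def perturb(x, y, b, attack_w, task_w, alpha=0.5, lr=0.1, max_iters=60, eps=1e-5):
    """Start from r = 0 and repeatedly update r to lower Loss_Adv."""
    r = [0.0] * len(x)
    for _ in range(max_iters):
        grad = []
        for k in range(len(r)):  # central finite-difference gradient w.r.t. r
            up = r[:k] + [r[k] + eps] + r[k + 1:]
            down = r[:k] + [r[k] - eps] + r[k + 1:]
            diff = (loss_adv(x, up, y, b, attack_w, task_w, alpha)
                    - loss_adv(x, down, y, b, attack_w, task_w, alpha))
            grad.append(diff / (2 * eps))
        r = [ri - lr * g for ri, g in zip(r, grad)]
    return [xi + ri for xi, ri in zip(x, r)], r


x = [1.0, -0.5]
attack_w = [[1.0, 0.0], [-1.0, 0.0]]  # bias label is readable from feature 0
task_w = [[0.0, 1.0], [0.0, -1.0]]    # task label is readable from feature 1
before = loss_adv(x, [0.0, 0.0], 1, 0, attack_w, task_w, 0.5)
x_unbiased, r = perturb(x, y=1, b=0, attack_w=attack_w, task_w=task_w)
after = loss_adv(x, r, 1, 0, attack_w, task_w, 0.5)
```

The sign on the first term is the point of the design: lowering Loss_Adv raises the attack network's bias-label loss while keeping the task-label loss low, so the perturbed image hides the bias attribute without sacrificing the task prediction.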

4. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein the loss function Loss_tl used to optimize the parameters of the third feature extractor is:

Loss_tl=∑L(h,h')

where h denotes the feature distribution that the third feature extractor outputs for the original sample image x_i, and h' denotes the feature distribution that the second feature extractor of the initial unbiased deep learning model extracts from the unbiased image corresponding to the original sample image x_i.
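A minimal sketch of this feature-matching loss follows. The claim does not restate which function L is used here, so the sketch takes it as a parameter and defaults to a sum of squared differences purely as a placeholder. The names `loss_tl`, `squared_error`, `student_feats`, and `teacher_feats` are hypothetical.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def squared_error(h: Vector, h_prime: Vector) -> float:
    """Placeholder for L(h, h'): sum of squared differences (an assumption)."""
    return sum((a - b) ** 2 for a, b in zip(h, h_prime))


def loss_tl(student_feats: Sequence[Vector],
            teacher_feats: Sequence[Vector],
            distance: Callable[[Vector, Vector], float] = squared_error) -> float:
    """Loss_tl = sum over samples of L(h, h')."""
    if len(student_feats) != len(teacher_feats):
        raise ValueError("feature batches must have the same length")
    return sum(distance(h, hp) for h, hp in zip(student_feats, teacher_feats))


h_batch = [[1.0, 2.0], [0.0, 1.0]]        # third extractor on original images
h_prime_batch = [[1.0, 0.0], [0.0, 1.0]]  # second extractor on unbiased images
value = loss_tl(h_batch, h_prime_batch)
```

Driving this value toward zero makes the third feature extractor produce, from an original image, the features that the second feature extractor produces from the corresponding unbiased image.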

5. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein, after the sample images are acquired, they are augmented by rotation, flipping, color enhancement, addition of Gaussian noise, and random scaling, and the bias labels include ethnicity labels, region labels, and gender labels.
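Three of the listed augmentations can be sketched on a small grayscale image stored as nested lists. This is an illustration only: rotation is limited to 90 degrees, color enhancement and random scaling are omitted, and the noise level `sigma` and the helper names are assumptions not found in the claim.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values


def flip_horizontal(img: Image) -> Image:
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in img]


def rotate_90(img: Image) -> Image:
    """Rotate clockwise by 90 degrees."""
    return [list(col) for col in zip(*img[::-1])]


def add_gaussian_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise with standard deviation sigma to each pixel."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in img]


def augment(img: Image, rng: random.Random) -> List[Image]:
    """Return the original image plus three augmented variants."""
    return [img,
            flip_horizontal(img),
            rotate_90(img),
            add_gaussian_noise(img, 0.05, rng)]


sample = [[1.0, 2.0], [3.0, 4.0]]
augmented = augment(sample, random.Random(0))
```

Each variant keeps the task label and bias label of the image it was derived from, so augmentation enlarges the data set without new annotation.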

6. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein the first feature extractor and the second feature extractor use the ResNet-50 model;

The first classifier and the second classifier of the initial unbiased deep learning model employ a network consisting of fully connected layers.

7. The method for generating an unbiased deep learning model based on transfer learning according to claim 1, wherein the first classifier and the second classifier of the initial unbiased deep learning model each employ a network composed of four fully connected layers;

The adversarial attack network adopts a network composed of 3 convolutional layers and 4 fully connected layers, and the activation function adopts the ReLU function.

8. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein the training parameters of the adversarial attack network, the initial unbiased deep learning model, and the third feature extractor are set as follows: the batch size is 32, the maximum number of training iterations is 60, the optimizer is Adam, the learning rate is 0.001, and the exponential decay rates of the first-moment and second-moment estimates are 0.9 and 0.999, respectively.
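The stated training parameters can be collected in a small configuration object, shown here with a single Adam update for a scalar parameter to make the role of the two decay rates concrete. This is a sketch: the `epsilon` term is the customary small constant and is not given in the claim, and the names `TrainConfig` and `adam_step` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    batch_size: int = 32
    max_iterations: int = 60
    learning_rate: float = 0.001
    beta1: float = 0.9     # decay rate of the first-moment estimate
    beta2: float = 0.999   # decay rate of the second-moment estimate
    epsilon: float = 1e-8  # ASSUMPTION: not stated in the claim


def adam_step(theta, grad, m, v, t, cfg):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = cfg.beta1 * m + (1 - cfg.beta1) * grad
    v = cfg.beta2 * v + (1 - cfg.beta2) * grad * grad
    m_hat = m / (1 - cfg.beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - cfg.beta2 ** t)  # bias-corrected second moment
    theta = theta - cfg.learning_rate * m_hat / (math.sqrt(v_hat) + cfg.epsilon)
    return theta, m, v


cfg = TrainConfig()
theta, m, v = adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1, cfg=cfg)
```

On the first step the bias correction cancels the decay factors, so the parameter moves by roughly the learning rate in the direction opposite to the gradient.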