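CN110188593B - Validation-set feedback method for improving the training efficiency and effect of face recognition deep networks - Google Patents

Validation-set feedback method for improving the training efficiency and effect of face recognition deep networks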

Info

Publication number
CN110188593B
Authority
CN
China
Prior art keywords
training
model
effect
candidate
face recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910287187.0A
Other languages
English (en)
Other versions
CN110188593A (zh)
Inventor
高华
陈胜勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201910287187.0A
Publication of CN110188593A
Application granted
Publication of CN110188593B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

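A validation-set feedback method for improving the training efficiency and effect of a deep neural network for face recognition, comprising the following steps. Step 1: determine the face recognition deep neural network training task, the training data set and the evaluation criterion, and split the training data set proportionally into a non-overlapping training set and validation set. Step 2: initialize the model M_0 and determine the number N of candidate models. Step 3: randomly sample the training set to generate N training data sequences. Step 4: starting from M_0 in each case, update the model by backpropagation on each training data sequence. Step 5: evaluate on the validation set the effect of the N candidate models generated in Step 4. Step 6: assign each candidate model a probability P_n according to its effect and, with an element of randomness, select one candidate model to replace M_0. Step 7: repeat Steps 3 to 6 until training ends. The invention effectively improves the effect and efficiency of deep neural network training.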

Description

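Validation-set feedback method for improving the training efficiency and effect of face recognition deep networks
Technical Field
The invention relates to the field of machine learning and artificial intelligence algorithms, and in particular to a validation-set feedback method for improving the training efficiency and effect of a deep neural network for face recognition.
Background
Deep neural networks have greatly improved the performance of machine learning and have been highly successful in object detection, pattern recognition, semantic segmentation, natural language processing and other fields, becoming a mainstream branch of both machine learning research and industrial application. How to improve the training efficiency of deep neural networks and the effect of training is one of the key problems currently affecting their development and application.
Existing training of deep neural networks for face recognition iteratively updates the network parameters by error backpropagation, which is an open-loop process. Online Hard Example Mining (OHEM) uses the loss value of each training instance to pick out the hard-to-recognize instances in a batch of training data. However, training the model entirely on hard instances carries a risk of overfitting, and there is no necessary causal link between training on hard instances and improving the training effect.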
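To make the background concrete, the hard-example selection described above can be sketched as keeping only the highest-loss instances of a batch. This is a minimal illustration, not the OHEM reference implementation; the function name `select_hard_examples`, the `keep` parameter and the loss values are hypothetical.

```python
def select_hard_examples(losses, keep):
    """Return the indices of the `keep` highest-loss instances in a batch."""
    # Sort instance indices by loss, largest first, and keep the top ones.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:keep]


# Hypothetical per-instance losses for one batch of five images.
batch_losses = [0.10, 2.30, 0.05, 1.70, 0.40]
hard = select_hard_examples(batch_losses, keep=2)
```

Only the selected instances would then contribute to the backward pass, which is why the text above warns about overfitting to hard instances.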
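Summary of the Invention
To solve the lack of closed-loop control in the existing training process of deep neural networks for face recognition, the invention provides a validation-set feedback method for improving the training efficiency and effect of such networks. The method controls the training process by feeding back the validation-set effect of models trained on different combinations of training data, and is therefore a form of closed-loop control.
To achieve the above object, the technical solution adopted by the invention is as follows:
A validation-set feedback method for improving the training efficiency and effect of a deep neural network for face recognition, the method comprising the following steps:
Step 1: determine the face recognition deep neural network training data set and the evaluation criterion, and split the training data set proportionally into two non-overlapping subsets: a training set T and a validation set V, where T is used to run backpropagation and V is used to evaluate the effect of the model;
Step 2: initialize the model M_0, determine the number N of candidate models, and evaluate the effect of M_0 on the validation set V, denoted R_0;
Step 3: randomly shuffle the training set T N times and save the shuffled training sets {T_n}, n = 1, ..., N; divide each shuffled training set T_n (n = 1, 2, ..., N) evenly into K batches, denoted as the training data sequence {T_{n,k}}, k = 1, ..., K;
Step 4: make N copies of the model M_0 and, starting from M_0 in each case, update the model by backpropagation on each training data sequence {T_{n,k}}, k = 1, ..., K, obtaining N candidate models {M_n}, n = 1, ..., N;
Step 5: evaluate the effect of the N models {M_n}, n = 1, ..., N, on the validation set V, denoted {R_n}, n = 1, ..., N, and compute the gain of each of the N models; the gain of the n-th model is given by:
Figure GDA0002853027100000021
where ε is a small positive number;
Step 6: assign each candidate model a probability P_n:
Figure GDA0002853027100000022
and set an interval A_n for each candidate model:
Figure GDA0002853027100000023
P_0 = 0
Draw a uniform random number p from the interval [0, 1]; if p ∈ A_n, candidate model M_n is selected, M_n replaces the initial model M_0, and P_n replaces the evaluation P_0 of the initial model;
Step 7: repeat Steps 3 to 6 until training ends.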
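The selection in Steps 5 and 6 can be sketched as a roulette-wheel draw. The three formulas survive in this record only as image references, so the definitions used below are assumptions rather than the patent's own: the gain is taken as max(R_0 - R_n, ε) with R treated as a loss, P_n is the gain normalized to sum to 1, and A_n is the half-open interval between consecutive cumulative sums of P, starting from P_0 = 0. All function names are hypothetical.

```python
import random
from itertools import accumulate


def gains_from_losses(r0, losses, eps=1e-8):
    # Assumed gain: improvement over the baseline loss, floored at eps
    # so that every candidate keeps a strictly positive gain.
    return [max(r0 - r, eps) for r in losses]


def selection_probabilities(gains):
    # Assumed P_n: each gain divided by the sum of all gains.
    total = sum(gains)
    return [g / total for g in gains]


def select_candidate(probs, p):
    """Return the 1-based index n whose interval A_n contains p.

    A_n is assumed to be [P_0 + ... + P_{n-1}, P_0 + ... + P_n) with P_0 = 0.
    """
    for n, upper in enumerate(accumulate(probs), start=1):
        if p < upper:
            return n
    return len(probs)  # guard against rounding when p is very close to 1


r0 = 1.0                       # validation loss of M_0 (made-up value)
losses = [0.9, 0.7, 1.2, 0.8]  # validation losses of 4 candidates (made-up)
probs = selection_probabilities(gains_from_losses(r0, losses))
chosen = select_candidate(probs, random.random())
```

Under these assumptions a candidate that is worse than M_0 (the third one here) is selected with negligible probability, while better candidates are chosen in proportion to how much they improve on R_0.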
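Compared with the prior art, the beneficial effects of the invention are as follows: the invention updates the model on different sampled sequences of the training data set and controls the training process according to the model's effect on the validation set. This is a form of closed-loop control that can effectively improve the effect of deep neural network training and raise training efficiency.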
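As a rough illustration of the closed loop, the sketch below runs the feedback rounds on a toy problem. It is not the patented implementation: a one-parameter linear model y = w·x stands in for the face recognition network, the gain max(R_0 - R_n, ε) and the cumulative-interval selection are assumptions (the formulas are not reproduced in this record), and the selected candidate's validation loss is carried forward as the next baseline R_0. All names and values are hypothetical.

```python
import random


def val_loss(w, data):
    # Mean squared error of the toy model y = w * x.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def sgd_pass(w, batches, lr):
    # One pass of mini-batch gradient descent over a batch sequence.
    for batch in batches:
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= lr * grad
    return w


def feedback_round(w0, r0, train, val, n_candidates, batch_size, lr, rng, eps=1e-8):
    # Steps 3-5: train one candidate per shuffled sequence, then evaluate it.
    candidates = []
    for _ in range(n_candidates):
        order = train[:]
        rng.shuffle(order)
        batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
        w = sgd_pass(w0, batches, lr)
        candidates.append((w, val_loss(w, val)))
    # Step 6 (assumed formulas): gain-proportional roulette-wheel selection.
    gains = [max(r0 - r, eps) for _, r in candidates]
    total = sum(gains)
    p = rng.random()
    cumulative = 0.0
    for candidate, gain in zip(candidates, gains):
        cumulative += gain / total
        if p < cumulative:
            return candidate
    return candidates[-1]  # rounding guard


rng = random.Random(0)
points = [(x / 10, 3.0 * x / 10) for x in range(-20, 21) if x != 0]
rng.shuffle(points)
train, val = points[:32], points[32:]

w, r = 0.0, val_loss(0.0, val)
initial_loss = r
for _ in range(5):  # Step 7: repeat the feedback round
    w, r = feedback_round(w, r, train, val, n_candidates=4, batch_size=8, lr=0.05, rng=rng)
```

The validation set is never used for gradient updates here; it only steers which candidate the next round starts from, which is the feedback path the paragraph above describes.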
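Brief Description of the Drawings
Fig. 1 is a flowchart of the validation-set feedback method of the invention for improving the training efficiency and effect of a deep neural network for face recognition.
Detailed Description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to Fig. 1 of the embodiments.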
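Referring to Fig. 1, a validation-set feedback method for improving the training efficiency and effect of a deep neural network for face recognition comprises the following steps:
Step 1: collect 50,000 ID-labelled face images as the training data set and split them 9:1 into a non-overlapping training set T (45,000 images) and validation set V (5,000 images); choose the ResNet network architecture and set hyperparameters such as an initial learning rate of 0.01, SGD as the learning method, Step=10000 and batch_size=100;
Step 2: initialize the ResNet model M_0, evaluate the classification loss R_0 of M_0 on the validation set V, and set the number of candidate training schemes to N = 10;
Step 3: randomly shuffle the training set T 10 times and save the shuffled training sets T_1, T_2, ..., T_10; divide each shuffled training set T_n (n = 1, 2, ..., 10) evenly into 450 batches (100 images each), denoted as the training data sequence {T_{n,k}}, k = 1, ..., 450;
Step 4: make 10 copies of the model M_0 and, starting from M_0 in each case, update the model by backpropagation on each of the 10 training data sequences {T_{n,k}}, k = 1, ..., 450 (n = 1, 2, ..., 10), obtaining 10 candidate models {M_n}, n = 1, ..., 10;
Step 5: evaluate the classification losses R_1, R_2, ..., R_10 of the 10 models {M_n}, n = 1, ..., 10, on the validation set V and compute the gain of each candidate model:
Figure GDA0002853027100000041
Assign each candidate model a probability P_n (n = 1, 2, ..., 10):
Figure GDA0002853027100000042
Step 6: set an interval A_n (n = 1, 2, ..., 10) for each candidate model:
Figure GDA0002853027100000043
P_0 = 0
Draw a uniform random number p from the interval [0, 1]; if p ∈ A_n, candidate model M_n is selected, M_n replaces the initial model M_0, and P_n replaces the evaluation P_0 of the initial model;
Step 7: repeat Steps 3 to 6 until training ends.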
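The data bookkeeping of Steps 1 and 3 in this embodiment can be checked with a short sketch. Integer IDs stand in for the 50,000 labelled face images, and the helper names `split_dataset` and `make_sequences` are hypothetical; only the sizes (9:1 split, 10 shuffles, batches of 100) come from the text.

```python
import random


def split_dataset(items, train_parts, val_parts, rng):
    """Shuffle once, then cut into non-overlapping training and validation sets."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + val_parts)
    return shuffled[:cut], shuffled[cut:]


def make_sequences(train_set, n_candidates, batch_size, rng):
    """Build one independently shuffled batch sequence per candidate model."""
    sequences = []
    for _ in range(n_candidates):
        order = train_set[:]
        rng.shuffle(order)
        batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
        sequences.append(batches)
    return sequences


rng = random.Random(0)            # fixed seed only to make the sketch repeatable
image_ids = list(range(50000))    # stand-ins for the labelled face images
train_set, val_set = split_dataset(image_ids, 9, 1, rng)
sequences = make_sequences(train_set, n_candidates=10, batch_size=100, rng=rng)
```

Each of the 10 sequences contains the same 45,000 training images in a different order, so the candidates differ only in the order in which the batches are presented.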
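Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the invention without creative effort fall within the scope of protection of the invention.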

Claims (1)

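1. A validation-set feedback method for improving the training efficiency and effect of face recognition deep networks, characterized in that the method comprises the following steps:
Step 1: determine the face recognition deep neural network training data set and the evaluation criterion, and split the training data set proportionally into two non-overlapping subsets: a training set T and a validation set V, where T is used to run backpropagation and V is used to evaluate the effect of the model;
Step 2: initialize the model M_0, determine the number N of candidate models, and evaluate the effect of M_0 on the validation set V, denoted R_0;
Step 3: randomly shuffle the training set T N times and save the shuffled training sets {T_n}, n = 1, ..., N; divide each shuffled training set T_n, n = 1, 2, ..., N, evenly into K batches, denoted as the training data sequence {T_{n,k}}, k = 1, ..., K;
Step 4: make N copies of the model M_0 and, starting from M_0 in each case, update the model by backpropagation on each training data sequence {T_{n,k}}, k = 1, ..., K, obtaining N candidate models {M_n}, n = 1, ..., N;
Step 5: evaluate the effect of the N models {M_n}, n = 1, ..., N, on the validation set V, denoted {R_n}, n = 1, ..., N, and compute the gain of each of the N models; the gain of the n-th model is given by:
Figure FDA0002929094330000011
where ε is a small positive number;
Step 6: assign each candidate model a probability P_n:
Figure FDA0002929094330000012
and set an interval A_n for each candidate model:
Figure FDA0002929094330000013
P_0 = 0
Draw a uniform random number p from the interval [0, 1]; if p ∈ A_n, candidate model M_n is selected, M_n replaces the initial model M_0, and P_n replaces the evaluation P_0 of the initial model;
Step 7: repeat Steps 3 to 6 until training ends.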

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910287187.0A CN110188593B (zh) 2019-04-11 2019-04-11 Validation-set feedback method for improving the training efficiency and effect of face recognition deep networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910287187.0A CN110188593B (zh) 2019-04-11 2019-04-11 Validation-set feedback method for improving the training efficiency and effect of face recognition deep networks

Publications (2)

Publication Number Publication Date
CN110188593A (zh) 2019-08-30
CN110188593B (zh) 2021-05-18

Family

ID=67714094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910287187.0A Active CN110188593B (zh) 2019-04-11 2019-04-11 Validation-set feedback method for improving the training efficiency and effect of face recognition deep networks

Country Status (1)

Country Link
CN (1) CN110188593B (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
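CN103728028A (zh) * 2013-12-31 2014-04-16 Tianjin University Method for extracting and discriminating human body heat source features using infrared pyroelectric wavelet packet energy
US8787627B1 (en) * 2010-04-16 2014-07-22 Steven Jay Freedman System for non-repudiable registration of an online identity
US8908919B2 (en) * 2012-05-29 2014-12-09 The Johns Hopkins University Tactical object finder
CN109272003A (zh) * 2017-07-17 2019-01-25 East China Normal University Method and apparatus for eliminating unknown errors in a deep learning model
CN109389146A (zh) * 2018-08-22 2019-02-26 中翔科技(杭州)有限公司 Equipment state feedback method and system based on a neural network image classification algorithm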

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180068221A1 (en) * 2016-09-07 2018-03-08 International Business Machines Corporation System and Method of Advising Human Verification of Machine-Annotated Ground Truth - High Entropy Focus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Look and Think Twice: Capturing Top-Down Visual Attention with Feedback Convolutional Neural Networks; Chunshui Cao et al.; 2015 IEEE International Conference on Computer Vision (ICCV); 2015-12-13; pp. 2956-2964 *

Also Published As

Publication number Publication date
CN110188593A (zh) 2019-08-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant