CN111788585A - Training method and system for a deep learning model - Google Patents

Training method and system for a deep learning model

Info

Publication number
CN111788585A
Authority
CN
China
Prior art keywords
gradient
deep learning
learning model
iteration
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201980000128.9A
Other languages
English (en)
Other versions
CN111788585B (zh)
Inventor
白小龙
李鹏飞
张震宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN111788585A
Application granted
Publication of CN111788585B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Image Analysis (AREA)
  • Debugging And Monitoring (AREA)
  • Machine Translation (AREA)

Abstract

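A training method for a deep learning model. The method includes: generating N first gradient sets in the backpropagation (BP) computation of the j-th iteration of N deep learning models; adjusting the communication order of the gradients included in the first gradient sets, so that the gradients are not sent to the parameter storage space in the order in which they were generated; and sending the gradients included in each of the N first gradient sets to the parameter storage space in the adjusted communication order. By adjusting the order in which the gradients obtained in the current iteration are transmitted to the parameter storage space, the method improves the training efficiency of the deep learning model.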
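
The flow in the abstract can be illustrated with a minimal Python sketch. It is a toy simulation under stated assumptions, not the patented implementation: the abstract does not say which ordering policy is used, so the "earlier layers first" rule below is a hypothetical choice, and the names `ParameterStore`, `adjust_communication_order` and `run_iteration` are invented for illustration. Gradients are single floats to keep the example short.

```python
from collections import defaultdict


def backprop_generation_order(num_layers):
    """BP produces gradients from the last layer back to the first."""
    return list(range(num_layers - 1, -1, -1))


def adjust_communication_order(generated_layers):
    """Hypothetical policy (an assumption, not taken from the abstract):
    send the gradients of earlier layers first, i.e. ascending layer index,
    which differs from the order in which BP generated them."""
    return sorted(generated_layers)


class ParameterStore:
    """Toy stand-in for the parameter storage space."""

    def __init__(self):
        self.received = []               # (worker_id, layer) in arrival order
        self.sums = defaultdict(float)   # running sum of gradients per layer
        self.counts = defaultdict(int)   # number of gradients seen per layer

    def push(self, worker_id, layer, grad):
        self.received.append((worker_id, layer))
        self.sums[layer] += grad
        self.counts[layer] += 1

    def averaged(self):
        """Average the gradients of each layer over all workers."""
        return {layer: self.sums[layer] / self.counts[layer]
                for layer in sorted(self.sums)}


def run_iteration(worker_grads, store):
    """worker_grads holds one gradient set per model replica (N in total),
    each a dict mapping layer index to a gradient value."""
    for worker_id, grads in enumerate(worker_grads):
        generated = backprop_generation_order(len(grads))
        for layer in adjust_communication_order(generated):
            store.push(worker_id, layer, grads[layer])


if __name__ == "__main__":
    # N = 2 replicas, 3 layers each.
    worker_grads = [{0: 0.5, 1: 1.0, 2: 1.5}, {0: 1.5, 1: 2.0, 2: 2.5}]
    store = ParameterStore()
    run_iteration(worker_grads, store)
    print(store.received)
    print(store.averaged())
```

The only point of the sketch is that the order in which gradients reach the store is decoupled from the order in which backpropagation produced them; a real system would send tensors over a network and overlap communication with computation.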

Description

PCT domestic application; the description has been published.

Claims (11)

  1. PCT domestic application; the claims have been published.
CN201980000128.9A 2019-01-16 2019-01-24 Training method and system for a deep learning model Active CN111788585B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201910041235 2019-01-16
CN2019100412358 2019-01-16
PCT/CN2019/072895 WO2020147142A1 (zh) 2019-01-16 2019-01-24 Training method and system for a deep learning model

Publications (2)

Publication Number Publication Date
CN111788585A (zh) 2020-10-16
CN111788585B (zh) 2024-04-12

Family

ID=71613070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980000128.9A Active CN111788585B (zh) 2019-01-16 2019-01-24 Training method and system for a deep learning model

Country Status (4)

Country Link
US (1) US20210342696A1 (zh)
EP (1) EP3889846A4 (zh)
CN (1) CN111788585B (zh)
WO (1) WO2020147142A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
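CN112329941A (zh) * 2020-11-04 2021-02-05 支付宝(杭州)信息技术有限公司 Method and apparatus for updating a deep learning model
CN112949853A (zh) * 2021-02-23 2021-06-11 北京金山云网络技术有限公司 Training method, system, apparatus and device for a deep learning model
CN113419931A (zh) * 2021-05-24 2021-09-21 北京达佳互联信息技术有限公司 Method and apparatus for determining performance indicators of a distributed machine learning system
CN113642740A (zh) * 2021-08-12 2021-11-12 百度在线网络技术(北京)有限公司 Model training method and apparatus, electronic device and medium
CN115965074A (zh) * 2022-11-28 2023-04-14 北京百度网讯科技有限公司 Training method for a deep learning model, data processing method, apparatus and device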

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
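CN111935179B (zh) * 2020-09-23 2021-01-12 支付宝(杭州)信息技术有限公司 Model training method and apparatus based on a trusted execution environment
CN115080249B (zh) * 2022-08-22 2022-12-16 南京可信区块链与算法经济研究院有限公司 Federated-learning-based multi-dimensional resource allocation method and system for the Internet of Vehicles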

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150324690A1 (en) * 2014-05-08 2015-11-12 Microsoft Corporation Deep Learning Training System
CN107273975A (zh) * 2017-06-15 2017-10-20 北京大学 Sparsified backpropagation training method for a neural network model
CN107292385A (zh) * 2016-03-31 2017-10-24 阿里巴巴集团控股有限公司 Model training method and apparatus for an AlexNet-like network
US20180121806A1 (en) * 2016-10-27 2018-05-03 International Business Machines Corporation Efficient parallel training of a network model on multiple graphics processing units
CN108491928A (zh) * 2018-03-29 2018-09-04 腾讯科技(深圳)有限公司 Model parameter training method, apparatus, server and storage medium
CN108829441A (zh) * 2018-05-14 2018-11-16 中山大学 Parameter update optimization system for distributed deep learning
CN108960410A (zh) * 2018-06-13 2018-12-07 华为技术有限公司 Neural-network-based parameter update method, related platform and computer storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10748064B2 (en) * 2015-08-27 2020-08-18 International Business Machines Corporation Deep neural network training with native devices
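CN107516127B (zh) * 2017-08-21 2020-06-30 山东大学 Method and system for a service robot to autonomously acquire the attribution semantics of articles worn and carried by people
CN108053029B (zh) * 2017-12-27 2021-08-27 上海闪易半导体有限公司 Training method for a neural network based on a storage array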

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150324690A1 (en) * 2014-05-08 2015-11-12 Microsoft Corporation Deep Learning Training System
CN107292385A (zh) * 2016-03-31 2017-10-24 阿里巴巴集团控股有限公司 Model training method and apparatus for an AlexNet-like network
US20180121806A1 (en) * 2016-10-27 2018-05-03 International Business Machines Corporation Efficient parallel training of a network model on multiple graphics processing units
CN107273975A (zh) * 2017-06-15 2017-10-20 北京大学 Sparsified backpropagation training method for a neural network model
CN108491928A (zh) * 2018-03-29 2018-09-04 腾讯科技(深圳)有限公司 Model parameter training method, apparatus, server and storage medium
CN108829441A (zh) * 2018-05-14 2018-11-16 中山大学 Parameter update optimization system for distributed deep learning
CN108960410A (zh) * 2018-06-13 2018-12-07 华为技术有限公司 Neural-network-based parameter update method, related platform and computer storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
毛勇华; 桂小林; 李前; 贺兴时: "Research on applied deep learning technology", 计算机应用研究, vol. 33, no. 11, pages 3201 - 3204 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
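CN112329941A (zh) * 2020-11-04 2021-02-05 支付宝(杭州)信息技术有限公司 Method and apparatus for updating a deep learning model
CN112949853A (zh) * 2021-02-23 2021-06-11 北京金山云网络技术有限公司 Training method, system, apparatus and device for a deep learning model
CN112949853B (zh) * 2021-02-23 2024-04-05 北京金山云网络技术有限公司 Training method, system, apparatus and device for a deep learning model
CN113419931A (zh) * 2021-05-24 2021-09-21 北京达佳互联信息技术有限公司 Method and apparatus for determining performance indicators of a distributed machine learning system
CN113419931B (zh) * 2021-05-24 2024-05-17 北京达佳互联信息技术有限公司 Method and apparatus for determining performance indicators of a distributed machine learning system
CN113642740A (zh) * 2021-08-12 2021-11-12 百度在线网络技术(北京)有限公司 Model training method and apparatus, electronic device and medium
CN113642740B (zh) * 2021-08-12 2023-08-01 百度在线网络技术(北京)有限公司 Model training method and apparatus, electronic device and medium
CN115965074A (zh) * 2022-11-28 2023-04-14 北京百度网讯科技有限公司 Training method for a deep learning model, data processing method, apparatus and device
CN115965074B (zh) * 2022-11-28 2023-11-10 北京百度网讯科技有限公司 Training method for a deep learning model, data processing method, apparatus and device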

Also Published As

Publication number Publication date
EP3889846A1 (en) 2021-10-06
US20210342696A1 (en) 2021-11-04
EP3889846A4 (en) 2022-06-01
CN111788585B (zh) 2024-04-12
WO2020147142A1 (zh) 2020-07-23

Similar Documents

Publication Publication Date Title
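CN111788585A (zh) Training method and system for a deep learning model
US11568258B2 (en) Operation method
US11138494B2 (en) Storage controller acceleration for neural network training and inference
CN109951438B (zh) Communication optimization method and system for distributed deep learning
US9990558B2 (en) Generating image features based on robust feature-learning
WO2018099085A1 (zh) Training method, apparatus and chip for a neural network model
CN107292352B (zh) Image classification method and apparatus based on a convolutional neural network
CN113168559A (zh) Automated generation of machine learning models
EP4150535A1 (en) Improved knowledge distillation by utilizing backward pass knowledge in neural networks
US20200342265A1 (en) Adaptive sampling for imbalance mitigation and dataset size reduction in machine learning
KR20190098671A (ko) High-speed processing method for a neural network and apparatus using the method
US11429865B1 (en) Optimizing neural networks
WO2022156475A1 (zh) Training method for a neural network model, data processing method and apparatus
WO2021243473A1 (en) Improved knowledge distillation by utilizing backward pass knowledge in neural networks
CN109032630B (zh) Method for updating global parameters in a parameter server
CN114358250A (zh) Data processing method, apparatus, computer device, medium and program product
US20220114479A1 (en) Systems and methods for automatic mixed-precision quantization search
WO2019180314A1 (en) Artificial neural networks
WO2024060839A9 (zh) Object operation method, apparatus, computer device and computer storage medium
WO2020042770A1 (zh) Image recognition processing method and apparatus
WO2022127603A1 (zh) Model processing method and related apparatus
WO2022105348A1 (zh) Training method and apparatus for a neural network
CN113705801A (zh) Training apparatus and method for a neural network model, and related device
CN113721655A (zh) Reinforcement-learning stable flight control method for unmanned aerial vehicles with adaptive control period
KR20210157826A (ko) Method for learning and lightweighting deep neural network structure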

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220209

Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province

Applicant after: Huawei Cloud Computing Technologies Co.,Ltd.

Address before: 518129 Huawei headquarters office building, Bantian, Longgang District, Shenzhen, Guangdong

Applicant before: HUAWEI TECHNOLOGIES Co.,Ltd.

GR01 Patent grant