WO2020249085A1 - Data processing method and device based on neural network computation - Google Patents

Data processing method and device based on neural network computation

Info

Publication number
WO2020249085A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
quantization
layer
calculation
calculation result
Prior art date
Application number
PCT/CN2020/095823
Other languages
English (en)
Chinese (zh)
Inventor
陈超
徐斌
谢展鹏
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2020249085A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Definitions

  • the present application provides a data processing method and device based on neural network calculations.
  • because a quantization operation, an inverse quantization operation, and a three-level data processing structure are combined, the time taken for loading data and weights is reduced.
  • the second linear calculation layer includes a second quantization sublayer, a second calculation sublayer, and a second inverse quantization sublayer. The second quantization sublayer is used to quantize the intermediate output data according to a second data quantization coefficient to obtain second quantized data; the second calculation sublayer is used to operate on the second quantized data to obtain a second calculation result; and the second inverse quantization sublayer is used to perform inverse quantization on the second calculation result according to a second inverse quantization coefficient to obtain second output data.
  • the structure can also be multiple convolutional layers followed by one or more pooling layers.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for downsampling the input image to obtain an image of smaller size.
  • the average pooling operator can average the pixel values within a specific range of the image to generate a single average value.
  • quant_result represents a fixed-point result.
  • the full-precision neural network model can be understood as a neural network that has not undergone quantization and inverse quantization operations. That is, the input data and the weights of each layer in the full-precision neural network can be floating-point data, for example, fp32.
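The three-sublayer structure described above (quantization, calculation in fixed point, inverse quantization) can be sketched for a single linear layer. This is a minimal illustration, not the patent's implementation: the function name, the step-size convention for the quantization coefficients, and the example values are all assumptions made here for clarity.

```python
import numpy as np

def linear_layer_with_quantization(x, w, s_data, s_weight):
    """Sketch of one linear calculation layer split into the three sublayers
    named in the description: quantization, calculation, inverse quantization.
    The scales are treated as quantization step sizes (an assumption)."""
    # Quantization sublayer: full-precision (e.g. fp32) data -> fixed point.
    xq = np.round(x / s_data).astype(np.int32)
    wq = np.round(w / s_weight).astype(np.int32)
    # Calculation sublayer: integer matrix-vector product; this integer
    # accumulator plays the role of the fixed-point "quant_result".
    quant_result = xq @ wq
    # Inverse quantization sublayer: restore full precision using the
    # dequantization coefficient s_data * s_weight.
    return quant_result.astype(np.float32) * (s_data * s_weight)

# Small values chosen so the round trip is exact.
x = np.array([0.5, -1.0], dtype=np.float32)
w = np.array([[0.25, 0.5], [0.5, -0.25]], dtype=np.float32)
y = linear_layer_with_quantization(x, w, s_data=1 / 128, s_weight=1 / 64)
# y matches the full-precision product x @ w for these values.
```

The intermediate `quant_result` here is exactly the kind of fixed-point calculation result that the requantization scheme in this application passes directly to the next layer instead of dequantizing it first.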

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Pure & Applied Mathematics (AREA)
  • Biophysics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Neurology (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a data processing method and device based on a quantized neural network. The method comprises: performing requantization processing on a first calculation result on the basis of a requantization coefficient, the requantization coefficient being equal to a first data quantization coefficient multiplied by a first weight quantization coefficient, the result then being divided by a second data quantization coefficient. In other words, a conventional first inverse quantization operation and a second quantization operation are combined by means of requantization processing, so that the multiple data- and weight-loading processes of the first inverse quantization operation and the second quantization operation are merged into one data-loading process corresponding to the requantization operation plus one requantization-coefficient-loading process, which helps reduce the time occupied by loading data and weights.
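The requantization described in the abstract can be illustrated numerically. The following is a minimal sketch, assuming symmetric per-tensor quantization with the coefficients acting as step sizes; all variable names and values are illustrative and not taken from the patent's embodiments.

```python
import numpy as np

# Hypothetical quantization step sizes (illustrative values only):
s_d1 = 1 / 128   # first data quantization coefficient
s_w1 = 1 / 64    # first weight quantization coefficient
s_d2 = 1 / 32    # second data quantization coefficient (next layer's input)

x = np.array([0.50, -0.25, 1.00], dtype=np.float32)
w = np.array([0.125, 0.25, -0.50], dtype=np.float32)

xq = np.round(x / s_d1).astype(np.int32)   # quantize input data
wq = np.round(w / s_w1).astype(np.int32)   # quantize weights
acc = int(xq @ wq)                         # first calculation result (integer)

# Conventional two-step path: dequantize, then quantize for the next layer.
two_step = round((acc * s_d1 * s_w1) / s_d2)

# Merged path: a single requantization coefficient, formed from the first data
# quantization coefficient times the first weight quantization coefficient,
# divided by the second data quantization coefficient.
requant_coeff = (s_d1 * s_w1) / s_d2
one_step = round(acc * requant_coeff)

assert one_step == two_step
```

Because `requant_coeff` can be computed once offline, at runtime only the integer accumulator and this single coefficient need to be loaded, which is the loading-time saving the abstract describes.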
PCT/CN2020/095823 2019-06-14 2020-06-12 Data processing method and device based on neural network computation WO2020249085A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910517485.4A CN112085175B (zh) 2019-06-14 2019-06-14 Data processing method and device based on neural network computation
CN201910517485.4 2019-06-14

Publications (1)

Publication Number Publication Date
WO2020249085A1 (fr) 2020-12-17

Family

ID=73734189

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/095823 WO2020249085A1 (fr) 2019-06-14 2020-06-12 Data processing method and device based on neural network computation

Country Status (2)

Country Link
CN (1) CN112085175B (fr)
WO (1) WO2020249085A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951962A (zh) * 2017-03-22 2017-07-14 北京地平线信息技术有限公司 Composite operation unit, method and electronic device for a neural network
CN108108811A (zh) * 2017-12-18 2018-06-01 北京地平线信息技术有限公司 Convolution calculation method in a neural network and electronic device
US20180232621A1 * 2017-02-10 2018-08-16 Kneron, Inc. Operation device and method for convolutional neural network
CN108734272A (zh) * 2017-04-17 2018-11-02 英特尔公司 Convolutional neural network optimization mechanism
CN109615068A (zh) * 2018-11-08 2019-04-12 阿里巴巴集团控股有限公司 Method and device for quantizing feature vectors in a model
CN109754063A (zh) * 2017-11-07 2019-05-14 三星电子株式会社 Method and device for learning a low-precision neural network

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11977971B2 (en) 2018-02-27 2024-05-07 Stmicroelectronics International N.V. Data volume sculptor for deep learning acceleration
JP2022116266A (ja) * 2021-06-18 2022-08-09 Beijing Baidu Netcom Science Technology Co., Ltd. Neural network processing unit, neural network processing method, and device therefor
EP4044070A3 (fr) * 2021-06-18 2022-12-21 Beijing Baidu Netcom Science Technology Co., Ltd. Unité de traitement de réseau neuronal, procédé et dispositif de traitement de réseau neuronal
JP7408723B2 (ja) 2021-06-18 2024-01-05 Beijing Baidu Netcom Science Technology Co., Ltd. Neural network processing unit, neural network processing method, and device therefor
EP4336409A1 (fr) * 2022-09-12 2024-03-13 STMicroelectronics S.r.l. Circuit accélérateur matériel de réseau neuronal avec circuits de requantification

Also Published As

Publication number Publication date
CN112085175A (zh) 2020-12-15
CN112085175B (zh) 2024-05-03

Similar Documents

Publication Publication Date Title
WO2020249085A1 (fr) Data processing method and device based on neural network computation
CN111652368B (zh) Data processing method and related products
US9916531B1 Accumulator constrained quantization of convolutional neural networks
US10282659B2 Device for implementing artificial neural network with multiple instruction units
CN114868108A (zh) Systolic array component combining multiple integer and floating-point data types
CN109726806A (zh) Information processing method and terminal device
WO2021135715A1 (fr) Image compression method and apparatus
WO2022111002A1 (fr) Method and apparatus for training a neural network, and computer-readable storage medium
CN110874627B (zh) Data processing method, data processing device and computer-readable medium
US20220092399A1 Area-Efficient Convolutional Block
CN113238989A (zh) Device and method for quantizing data, and computer-readable storage medium
JP2021108230A (ja) Neural network processing device and neural network processing method
CN115437778A (zh) Kernel scheduling method and device, electronic device, and computer-readable storage medium
CN113238987B (zh) Statistical quantizer for quantized data, storage device, processing device, and board card
US11423313B1 Configurable function approximation based on switching mapping table content
US20210342694A1 Machine Learning Network Model Compression
Jiang et al. A high-throughput full-dataflow mobilenetv2 accelerator on edge FPGA
WO2021073638A1 (fr) Method and apparatus for running a neural network model, and computing device
US20230214638A1 Apparatus for enabling the conversion and utilization of various formats of neural network models and method thereof
WO2023109748A1 (fr) Neural network adjustment method and corresponding apparatus
CN113238976B (зh) Cache controller, integrated circuit device and board card
US11281470B2 Argmax use for machine learning
CN114065913A (зh) Model quantization method and device, and terminal device
CN113238988A (зh) Processing system, integrated circuit and board card for optimizing parameters of a deep neural network
CN113238975A (зh) Memory, integrated circuit and board card for optimizing parameters of a deep neural network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20822354

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20822354

Country of ref document: EP

Kind code of ref document: A1