WO2018107383A1 - Convolution operation method and device for an artificial neural network, and computer readable storage medium - Google Patents

Convolution operation method and device for an artificial neural network, and computer readable storage medium

Info

Publication number
WO2018107383A1
WO2018107383A1 (application PCT/CN2016/109862)
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
transformed
neuron
neural network
weight
Prior art date
Application number
PCT/CN2016/109862
Other languages
English (en)
Chinese (zh)
Inventor
陈云霁
庄毅敏
刘少礼
郭崎
陈天石
Original Assignee
上海寒武纪信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海寒武纪信息科技有限公司 filed Critical 上海寒武纪信息科技有限公司
Priority to PCT/CN2016/109862 priority Critical patent/WO2018107383A1/fr
Publication of WO2018107383A1 publication Critical patent/WO2018107383A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs

Definitions

  • the present invention relates to the field of artificial neural network technologies, and in particular, to a convolution operation method and apparatus for a neural network, and a computer readable storage medium.
  • Multi-layer artificial neural networks are widely used in the fields of pattern recognition, image processing, function approximation and optimization calculation.
  • In recent years, multi-layer artificial neural networks have received increasing attention from academia and industry due to their high recognition accuracy and good parallelism.
  • the object of the present invention is to provide a convolution operation method and device for a neural network and a computer readable storage medium, which can realize the convolution operation of a weight matrix and a neuron matrix in a neural network by matrix multiplication, thereby reducing the amount of computation required for the convolution, increasing the operation speed of the neural network, and greatly improving the efficiency of data processing.
  • An aspect of the present invention provides a convolution operation method for a neural network, which is used to implement a convolution operation of a weight matrix and a neuron matrix in a neural network by matrix multiplication, and the method includes: performing a winograd transformation on the neuron matrix and the weight matrix to obtain a transformed neuron matrix and a transformed weight matrix; performing a matrix multiplication operation on the transformed neuron matrix and the transformed weight matrix to obtain a multiplication matrix; and performing a winograd inverse transformation on the multiplication matrix to obtain the operation result.
  • Another aspect of the present invention provides a convolution operation device for a neural network, which is used to implement a convolution operation of a weight matrix and a neuron in a neural network by matrix multiplication, and the device includes:
  • a memory for storing instructions
  • a processor configured to execute the decoded instructions to: perform a winograd transformation on the neuron matrix and the weight matrix to obtain a transformed neuron matrix and a transformed weight matrix; perform a matrix multiplication operation on the two transformed matrices to obtain a multiplication matrix; and perform a winograd inverse transformation on the multiplication matrix to obtain the operation result.
  • Another aspect of the invention provides a computer readable storage medium storing instructions executable by a processor to cause the processor to perform the methods of the present invention.
  • the invention can turn a complex convolution operation into a sparse matrix multiplication operation, and the transform and inverse transform processes can be realized by bit operations. The amount of calculation required for convolution can thus be greatly reduced and the operation speed of the neural network improved, which improves the efficiency of data processing. The invention can also reduce the storage space required to store network parameters and reduce the bandwidth of memory access.
  • FIG. 1 is a flow chart schematically showing a convolution operation method of a neural network according to an embodiment of the present invention.
  • FIG. 2 is a schematic block diagram showing the structure of a convolution operation device of a neural network according to an embodiment of the present invention.
  • FIG. 3 is a schematic block diagram showing the structure of a processor according to an embodiment of the present invention.
  • FIG. 4 schematically shows a convolution operation.
  • FIG. 5 is a schematic diagram showing the process of performing the convolution operation of FIG. 4 according to an embodiment of the present invention in conjunction with the apparatus described in the embodiment of the present invention.
  • the techniques of this disclosure may be implemented in the form of hardware and/or software (including firmware, microcode, etc.). Additionally, the techniques of the present disclosure may take the form of a computer program product on a computer readable medium storing instructions for use by an instruction execution system.
  • a computer readable medium can be any medium that can contain, store, communicate, propagate or transport the instructions.
  • a computer readable medium can include, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium.
  • specific examples of the computer readable medium include: a magnetic storage device such as a magnetic tape or a hard disk (HDD); an optical storage device such as a compact disk (CD-ROM); a memory such as a random access memory (RAM) or a flash memory; and/or a wired/wireless communication link.
  • FIG. 1 is a flow chart schematically showing a convolution operation method of a neural network according to an embodiment of the present invention. As shown in FIG. 1, the method includes:
  • Step 1: Perform a winograd transformation on the neuron matrix and the weight matrix to obtain a transformed neuron matrix and a transformed weight matrix.
  • the neuron matrix d0 and the weight matrix w0 are subjected to winograd transformation using the following formulas to obtain the transformed neuron matrix d and the transformed weight matrix w: d = C^T · d0 · C, w = G · w0 · G^T, where:
  • C is the transformation matrix of the neuron matrix d 0
  • C T is the transposed matrix of C
  • G is the transformation matrix of the weight matrix w 0
  • G T is the transposed matrix of G.
  • the values in the neuron matrix and the weight matrix are represented in binary, and the values of the transformation matrices C and G are of the form ±2^n or 0, such as -1, -0.5, 0, 0.5, 1, and the like.
  • the embodiment of the present invention implements the winograd transform using bit operations, realizing multiplication by 2 and division by 2 with left and right shifts. For example, when a value in the neuron matrix d0 is multiplied by 0.5, the value is shifted to the right by one bit; when it is multiplied by -0.5, the value is shifted to the right by one bit and its sign is inverted. The winograd transformation is therefore realized by bit operations, which reduces the amount of calculation and improves the operation speed.
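  • For illustration only (the patent performs these shifts in hardware), a minimal Python sketch of the shift-based multiplications on two's-complement integers, with function names of our choosing:

```python
# Fixed-point sketch: multiplying by 2, 0.5 and -0.5 using one-bit shifts,
# assuming the values are held as two's-complement integers.
def mul_two(x: int) -> int:
    return x << 1           # multiply by 2: left shift by one bit

def mul_half(x: int) -> int:
    return x >> 1           # multiply by 0.5: arithmetic right shift by one bit

def mul_neg_half(x: int) -> int:
    return -(x >> 1)        # multiply by -0.5: right shift, then invert the sign

assert (mul_two(6), mul_half(6), mul_neg_half(6)) == (12, 3, -3)
```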
  • the transformation matrices C and G of the neuron matrix d0 and the weight matrix w0 are obtained using the winograd algorithm.
  • the winograd algorithm uses block multiplication of matrices to reduce the number of multiplications in a matrix multiplication. There are many different matrix blocking methods; one winograd algorithm is shown below.
  • S1 = A21 + A22, S2 = S1 - A11, S3 = A11 - A21, S4 = A12 - S2
  • S5 = B12 - B11, S6 = B22 - S5, S7 = B22 - B12, S8 = S6 - B21
  • M1 = S2 · S6, M2 = A11 · B11, M3 = A12 · B21, M4 = S3 · S7, M5 = S1 · S5, M6 = S4 · B22, M7 = A22 · S8
  • T1 = M1 + M2, T2 = T1 + M4
  • C11 = M2 + M3, C12 = T1 + M5 + M6, C21 = T2 - M7, C22 = T2 + M5
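  • A minimal NumPy sketch (ours, not part of the disclosure) of one level of this recurrence, multiplying two matrices split into 2 × 2 blocks with seven block multiplications instead of eight:

```python
import numpy as np

def winograd_block_matmul(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """One level of the winograd block recurrence for C = A @ B."""
    n = A.shape[0] // 2
    A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
    B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]

    S1 = A21 + A22; S2 = S1 - A11; S3 = A11 - A21; S4 = A12 - S2
    S5 = B12 - B11; S6 = B22 - S5; S7 = B22 - B12; S8 = S6 - B21

    # seven block multiplications (a naive blocking would need eight)
    M1 = S2 @ S6; M2 = A11 @ B11; M3 = A12 @ B21; M4 = S3 @ S7
    M5 = S1 @ S5; M6 = S4 @ B22; M7 = A22 @ S8

    T1 = M1 + M2; T2 = T1 + M4
    return np.block([[M2 + M3, T1 + M5 + M6],
                     [T2 - M7, T2 + M5]])

A, B = np.random.randn(4, 4), np.random.randn(4, 4)
assert np.allclose(winograd_block_matmul(A, B), A @ B)
```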
  • the transformation matrix required for convolution is obtained by the above winograd algorithm. For example, for a one-dimensional convolution [d1, d2, d3] * [w1, w2], assuming that each convolution sliding step is 1, the convolution can be expanded into a matrix-multiplied form:
  • [d1 d2; d2 d3] · [w1; w2] = [d1·w1 + d2·w2; d2·w1 + d3·w2]
  • applying the above blocking (writing a1, a2, a3 for d1, d2, d3 and b1, b2 for w1, w2) gives:
  • m1 = (-a1 + a2 + a3) · b1
  • m2 = a1 · b1
  • m3 = a2 · b2
  • m4 = a3 · (b1 - b2)
  • the two outputs of the convolution are then m2 + m3 and m1 + m2 - m4, as checked in the sketch below.
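  • A short sketch (ours) checking that the four products m1 to m4 reproduce the direct sliding convolution:

```python
# Direct computation of [d1, d2, d3] * [w1, w2] with sliding step 1 ...
def conv1d_direct(d, w):
    (d1, d2, d3), (w1, w2) = d, w
    return [d1 * w1 + d2 * w2, d2 * w1 + d3 * w2]

# ... and the same two outputs recombined from the products m1..m4 above.
def conv1d_winograd(d, w):
    (a1, a2, a3), (b1, b2) = d, w
    m1 = (-a1 + a2 + a3) * b1
    m2 = a1 * b1
    m3 = a2 * b2
    m4 = a3 * (b1 - b2)
    return [m2 + m3, m1 + m2 - m4]

assert conv1d_direct([1, 2, 3], [4, 5]) == conv1d_winograd([1, 2, 3], [4, 5])
```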
  • the convolutional transformation matrix can be obtained by multiple rounds of matrix partitioning, and the winograd algorithm admits different matrix blocking methods.
  • the specific values and dimensions of the transformation matrices are determined by the dimension of the input neuron matrix, the dimension of the weight matrix, and the sliding step size of each convolution operation.
  • once these three influencing factors are determined, the values and dimensions of the transformation matrices are also determined. Because in a neural network structure these three factors are set in advance, this embodiment completes the setting of each transformation matrix offline.
  • Step 2: Perform a matrix multiplication operation on the transformed neuron matrix and the transformed weight matrix to obtain a multiplication matrix t, whose entries are formed from the products of elements at corresponding positions of d and w, as described below.
  • in an ordinary convolution operation, the two matrices participating in the operation have different scales, so multiple matrix multiplication operations need to be performed through a sliding operation. In the embodiment of the present invention, the transformed neuron matrix d and the transformed weight matrix w conform to the matrix multiplication rule, so only one matrix multiplication operation is performed, which greatly reduces the amount of calculation.
  • the transformed weight matrix is mapped into a sparse sequence consisting of “0” and “1”, where “0” corresponds to an element whose value is 0 in the transformed weight matrix, and “1” corresponds to an element whose value is not 0 in the transformed weight matrix.
  • when the matrix multiplication operation is performed, the elements at the corresponding positions in the transformed neuron matrix are extracted according to the “1” entries recorded in the sparse sequence and multiplied by the corresponding elements in the transformed weight matrix.
  • in the example of FIG. 5 below, the sparse sequence corresponding to w is 1110111011101100 (read row by row).
  • the use of sparse sequences can further reduce the amount of computation of the matrix multiplication operation.
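  • As a sketch of how such a sequence might be built and consumed (names and layout are illustrative, not the patent's implementation):

```python
import numpy as np

def sparse_sequence(w: np.ndarray) -> str:
    """Map a transformed weight matrix to a "0"/"1" sequence, read row by row."""
    return "".join("1" if v != 0 else "0" for v in w.flatten())

def masked_multiply(d: np.ndarray, w: np.ndarray, seq: str) -> np.ndarray:
    """Multiply elements at corresponding positions, skipping the "0" entries."""
    t = np.zeros_like(d)
    for i, bit in enumerate(seq):
        if bit == "1":                 # zero weights never reach a multiplier
            r, c = divmod(i, w.shape[1])
            t[r, c] = d[r, c] * w[r, c]
    return t

w = np.array([[1, 0], [2, 3]])
d = np.array([[5, 6], [7, 8]])
seq = sparse_sequence(w)               # "1011"
print(masked_multiply(d, w, seq))      # [[ 5  0], [14 24]]
```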
  • Step 3: The multiplication matrix is inverse-transformed by winograd to obtain the operation result.
  • the multiplication matrix t is inverse-transformed by winograd using the following formula to obtain the operation result: out = A^T · t · A, where:
  • A is the inverse transformation matrix and A^T is the transposed matrix of A.
  • like C and G, the inverse transformation matrix A is obtained using the winograd algorithm; the specific process is not repeated here.
  • the values of the inverse transformation matrix A are also of the form ±2^n or 0, so the operations involving these values are likewise realized by bit operations.
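  • As an end-to-end illustration of steps 1 to 3, the sketch below uses the widely published F(2×2, 3×3) winograd matrices as concrete values for C, G and A. These concrete values are an assumption made for illustration; the patent derives its own matrices from the input dimensions and the convolution sliding step.

```python
import numpy as np

# Assumed concrete matrices: the standard F(2x2, 3x3) winograd set, NOT taken
# from the patent. Their entries are only 0, +/-1 and +/-0.5, consistent with
# the description's claim that the transforms reduce to bit operations.
C_T = np.array([[1,  0, -1,  0],
                [0,  1,  1,  0],
                [0, -1,  1,  0],
                [0,  1,  0, -1]], dtype=float)   # C^T, so d = C^T @ d0 @ C
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                 # w = G @ w0 @ G^T
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)    # out = A^T @ t @ A

def winograd_conv(d0: np.ndarray, w0: np.ndarray) -> np.ndarray:
    """Convolve a 4x4 neuron tile d0 with a 3x3 weight matrix w0."""
    d = C_T @ d0 @ C_T.T    # step 1: transformed neuron matrix (4x4)
    w = G @ w0 @ G.T        # step 1: transformed weight matrix (4x4)
    t = d * w               # step 2: products at corresponding positions
    return A_T @ t @ A_T.T  # step 3: winograd inverse transform -> 2x2 output

# Cross-check against the direct sliding convolution with step size 1.
d0, w0 = np.random.randn(4, 4), np.random.randn(3, 3)
direct = np.array([[np.sum(d0[i:i + 3, j:j + 3] * w0) for j in range(2)]
                   for i in range(2)])
assert np.allclose(winograd_conv(d0, w0), direct)
```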
  • FIG. 2 is a schematic structural diagram of a convolution operation device of a neural network according to an embodiment of the present invention. As shown in FIG. 2, the device includes:
  • the data access unit 1 is configured to acquire the neuron matrix and the weight matrix from an external address space and provide them to the processor 5; it can also obtain instructions from the outside and provide them to the memory 2.
  • the memory 2 is configured to read an instruction through the data access unit 1 and cache the read instruction.
  • the controller 3 is configured to read an instruction in the memory 2, decode the read instruction, obtain a micro instruction that controls the corresponding module, and send the micro instruction to the corresponding module.
  • the data buffer unit 4 is configured to store data required for data processing, and cache data during the operation.
  • the processor 5 is configured to perform the corresponding operation under the control of the controller 3. The processor 5 acquires data from the data buffer unit 4 or through the data access unit 1, and the result of the operation is output to the data buffer unit 4 or output through the data access unit 1.
  • FIG. 3 is a schematic structural diagram of a processor according to an embodiment of the present invention.
  • the processor 5 includes an operation control unit 51, a matrix multiplication unit 52, and a sparsification processing unit 53. The matrix multiplication unit 52 performs the operations shown in the method of FIG. 1, and details are not described herein again.
  • the sparsification processing unit 53 includes a mapping unit 531, which implements the mapping between the matrix and the sparse sequence: it maps the transformed weight matrix into a sparse sequence consisting of “0” and “1”, where “0” corresponds to an element whose value is 0 in the transformed weight matrix, and “1” corresponds to an element whose value is not 0. When the matrix multiplication unit performs the matrix multiplication operation, the sparsification processing unit extracts, according to the “1” entries recorded in the sparse sequence, the elements at the corresponding positions in the transformed neuron matrix to be multiplied by the corresponding elements in the transformed weight matrix.
  • the matrix multiplication is specifically implemented as follows: the portions where the sparse sequence value is 0 contribute nothing to the multiplication and do not participate in the operation; for the portions where the sparse sequence value is 1, the corresponding weight data are read through the mapping unit 531 and multiplied with the corresponding neuron data. Each time the sparse sequence finishes one row of the weight matrix, the values obtained by the multiplication operations are accumulated. Since most of the multiply-accumulate operations in matrix multiplication do not affect one another, in the present embodiment a plurality of multiply-accumulate operations are performed in parallel.
  • the sparse sequence is generated offline, and the storage space it occupies is very small relative to that of the weight matrix, so this process affects neither the operation speed nor the storage space of the neural network.
  • as shown in FIG. 4, the convolution kernel is a 3 × 3 matrix that slides over the input image, where the convolution kernel in the figure corresponds to the weight matrix of the present invention and the input image corresponds to the neuron matrix of the present invention.
  • FIG. 5 is a schematic diagram showing a process for performing the convolution operation of FIG. 4 according to an embodiment of the present invention, as shown in FIG. 5:
  • Step S1: the controller 3 reads an instruction from the memory 2.
  • Step S2: the controller 3 decodes the instruction into microinstructions, and the data access unit 1 reads the data required to perform the convolution operation from the external address space according to the microinstructions, including the neuron matrix d0 and the weight matrix w0, and then obtains the transformation matrices C and G and the inverse transformation matrix A for the example of FIG. 4.
  • Step S3: the processor 5 reads the neuron matrix d0 and the weight matrix w0 from the data buffer unit 4, and performs the winograd transformation on them, namely d = C^T · d0 · C and w = G · w0 · G^T.
  • Step S4: the sparsification processing unit 53 obtains the sparse sequence of the transformed weight matrix w through the mapping unit 531, namely [1110111011101100].
  • the creation of the sparse sequence is performed by the mapping unit: the mapping unit traverses the transformed weight matrix, marking each non-zero value with bit 1 and each zero value with bit 0, and finally obtains a bit sequence as the sparse sequence; the length of this bit sequence equals the number of values in the weight matrix.
  • Step S5: the processor 5 selects the corresponding neurons and weights according to the sparse sequence and performs the multiplication operations, completing the matrix multiplication of the input neurons and the weights; the elements of the transformed neuron matrix d at the index positions [d03, d13, d23, d32, d33] do not participate in the operation, and the multiplication matrix t is finally obtained.
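  • The skipped positions of step S5 follow mechanically from the sparse sequence of step S4; a short sketch (ours) for the 4 × 4 case:

```python
# Positions of the "0" bits in the sparse sequence, as row/column indices.
seq = "1110111011101100"
skipped = [f"d{i // 4}{i % 4}" for i, bit in enumerate(seq) if bit == "0"]
print(skipped)  # ['d03', 'd13', 'd23', 'd32', 'd33']
```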
  • Step S6: the processor 5 performs the winograd inverse transform operation on the multiplication matrix t to obtain the final output of the convolution.
  • the computer readable storage medium provided by the present invention may be, for example, any medium capable of containing, storing, communicating, propagating or transporting instructions.
  • a readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium.
  • specific examples of the readable storage medium include: a magnetic storage device such as a magnetic tape or a hard disk (HDD); an optical storage device such as a compact disk (CD-ROM); a memory such as a random access memory (RAM) or a flash memory; and/or a wired/wireless communication link.
  • the readable storage medium includes computer executable instructions that, when executed by a processor, cause the processor to perform, for example, the method flow described above in connection with FIGS. 1 and 5, and any variations thereof.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Complex Calculations (AREA)

Abstract

The invention provides a convolution operation method and device for an artificial neural network, for implementing the convolution operation of a weight matrix and neurons of an artificial neural network by matrix multiplication. The method comprises the steps of: first, performing a winograd transform on a neuron matrix and a weight matrix to obtain a transformed neuron matrix and a transformed weight matrix (step 1); then, performing a matrix multiplication operation on the transformed neuron matrix and the transformed weight matrix to obtain a multiplication matrix (step 2); and finally, performing an inverse winograd transform on the multiplication matrix to obtain the operation result (step 3).
PCT/CN2016/109862 2016-12-14 2016-12-14 Convolution operation method and device for an artificial neural network, and computer readable storage medium WO2018107383A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/109862 WO2018107383A1 (fr) 2016-12-14 2016-12-14 Convolution operation method and device for an artificial neural network, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/109862 WO2018107383A1 (fr) 2016-12-14 2016-12-14 Convolution operation method and device for an artificial neural network, and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2018107383A1 true WO2018107383A1 (fr) 2018-06-21

Family

ID=62557732

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/109862 WO2018107383A1 (fr) 2016-12-14 2016-12-14 Convolution operation method and device for an artificial neural network, and computer readable storage medium

Country Status (1)

Country Link
WO (1) WO2018107383A1 (fr)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101893686A (zh) * 2010-06-11 2010-11-24 河南电力试验研究院 Online detection device and method for circuit breaker operating characteristics based on photographic digitization
CN104915322A (zh) * 2015-06-09 2015-09-16 中国人民解放军国防科学技术大学 Convolutional neural network hardware acceleration method and AXI bus IP core therefor
CN106127297A (zh) * 2016-06-02 2016-11-16 中国科学院自动化研究所 Acceleration and compression method for deep convolutional neural networks based on tensor decomposition
CN106203617A (zh) * 2016-06-27 2016-12-07 哈尔滨工业大学深圳研究生院 Accelerated processing unit and array structure based on a convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TAN, FUPING ET AL.: "A New Scheme to Divide Odd-sized Matrices for the Winograd's Algorithm", COMMUNICATION ON APPLIED MATHEMATICS AND COMPUTATION, vol. 18, no. 1, 30 June 2004 (2004-06-30) *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126081A (zh) * 2018-10-31 2020-05-08 永德利硅橡胶科技(深圳)有限公司 Global universal language terminal and method
CN111126081B (zh) * 2018-10-31 2023-07-21 深圳永德利科技股份有限公司 Global universal language terminal and method
CN111199275A (zh) * 2018-11-20 2020-05-26 上海登临科技有限公司 System-on-chip for neural networks
CN111199275B (zh) * 2018-11-20 2023-04-28 上海登临科技有限公司 System-on-chip for neural networks
CN111260020B (zh) * 2018-11-30 2024-04-16 深圳市海思半导体有限公司 Method and apparatus for convolutional neural network computation
CN111260020A (zh) * 2018-11-30 2020-06-09 深圳市海思半导体有限公司 Method and apparatus for convolutional neural network computation
CN109919307B (zh) * 2019-01-28 2023-04-07 广东浪潮大数据研究有限公司 FPGA and deep residual network implementation method, system, and computer medium
CN109919307A (zh) * 2019-01-28 2019-06-21 广东浪潮大数据研究有限公司 FPGA and deep residual network implementation method, system, and computer medium
CN111831254A (zh) * 2019-04-15 2020-10-27 阿里巴巴集团控股有限公司 Image processing acceleration method, image processing model storage method, and corresponding apparatus
CN112686365B (zh) * 2019-10-18 2024-03-29 华为技术有限公司 Method, apparatus and computer device for running a neural network model
CN112686365A (zh) * 2019-10-18 2021-04-20 华为技术有限公司 Method, apparatus and computer device for running a neural network model
CN112784207A (zh) * 2019-11-01 2021-05-11 中科寒武纪科技股份有限公司 Operation method and related products
CN112765539A (zh) * 2019-11-01 2021-05-07 中科寒武纪科技股份有限公司 Operation apparatus and method, and related products
CN112765539B (zh) * 2019-11-01 2024-02-02 中科寒武纪科技股份有限公司 Operation apparatus and method, and related products
CN112784207B (zh) * 2019-11-01 2024-02-02 中科寒武纪科技股份有限公司 Operation method and related products
CN111047017B (zh) * 2019-12-18 2023-06-23 北京安兔兔科技有限公司 Evaluation method and apparatus for neural network algorithms, and electronic device
CN111047017A (zh) * 2019-12-18 2020-04-21 北京安兔兔科技有限公司 Evaluation method and apparatus for neural network algorithms, and electronic device
CN111210010A (zh) * 2020-01-15 2020-05-29 上海眼控科技股份有限公司 Data processing method and apparatus, computer device, and readable storage medium
CN111291317B (zh) * 2020-02-26 2023-03-24 上海海事大学 Greedy recursive method for binarizing convolutional neural networks with approximate matrices
CN111291317A (zh) * 2020-02-26 2020-06-16 上海海事大学 Greedy recursive method for binarizing convolutional neural networks with approximate matrices
WO2022227024A1 (fr) * 2021-04-30 2022-11-03 华为技术有限公司 Operation method and apparatus for a neural network model, and training method and apparatus for a neural network model
CN117851744A (zh) * 2024-03-07 2024-04-09 北京象帝先计算技术有限公司 Matrix operation circuit, processor, integrated circuit system, electronic component and device

Similar Documents

Publication Publication Date Title
WO2018107383A1 (fr) Convolution operation method and device for an artificial neural network, and computer readable storage medium
CN107622302B (zh) Superpixel method for convolutional neural networks
EP3612947B1 (fr) Processing non-contiguous memory as contiguous memory to improve performance of a neural network environment
US20240152729A1 (en) Convolutional neural network (cnn) processing method and apparatus performing high-speed and precision convolution operations
WO2018108126A1 (fr) Neural network convolution operation device and method
JP7325158B2 (ja) Data representation for dynamic precision in neural network cores
US20190340510A1 (en) Sparsifying neural network models
US10650230B2 (en) Image data extraction using neural networks
US20170097884A1 (en) Pipelined convolutional operations for processing clusters
JP7287397B2 (ja) Information processing method, information processing apparatus, and information processing program
CN107392842A (zh) Image stylization processing method and apparatus, computing device, and computer storage medium
CN107610146A (zh) Image scene segmentation method and apparatus, computing device, and computer storage medium
CN111126559A (zh) Neural network processor and convolution operation method thereof
WO2022111002A1 (fr) Method and apparatus for training a neural network, and computer readable storage medium
WO2019215907A1 (fr) Arithmetic processing device
WO2021036362A1 (fr) Data processing method and apparatus, and related product
WO2022151779A1 (fr) Convolution operation implementation method and device, and data processing method and device
US20200118002A1 (en) Down-sampling for convolutional neural networks
US20200073911A1 (en) System, method and apparatus for computationally efficient data manipulation
TWI758223B (zh) Computing method with dynamic minimum batch size, and computing system and computer readable storage medium for performing the method
CN115953651B (zh) Cross-domain-device-based model training method, apparatus, device and medium
CN111178513B (zh) Convolution implementation method and convolution implementation apparatus for a neural network, and terminal device
WO2021081854A1 (fr) Convolution operation circuit and convolution operation method
CN112446472A (zh) Method and apparatus for processing data, and related products
CN114207694B (zh) Secure gradient descent computation method and system, secure deep learning method and system, secure computation apparatus, and recording medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16924104

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16924104

Country of ref document: EP

Kind code of ref document: A1