CN111428188A - Convolution operation method and device - Google Patents

Convolution operation method and device Download PDF

Info

Publication number
CN111428188A
Authority
CN
China
Prior art keywords
convolution kernel
data
gradient
convolution
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010239668.7A
Other languages
Chinese (zh)
Inventor
岳涛
赵思杰
胡雪梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN202010239668.7A priority Critical patent/CN111428188A/en
Publication of CN111428188A publication Critical patent/CN111428188A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/15 Correlation function computation including computation of convolution operations
    • G06F 17/156 Correlation function computation including computation of convolution operations using a domain transform, e.g. Fourier transform, polynomial transform, number theoretic transform
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F 17/141 Discrete Fourier transforms
    • G06F 17/142 Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a convolution operation method and a convolution operation device. The method comprises the following steps: expanding an original convolution kernel by cyclically shifting it element by element or channel by channel, convolving each shifted kernel with the input feature map, accelerating the computation with the fast Fourier transform, and adjusting the convolution kernel parameters by learning. The convolution operation device comprises a cyclic shift unit, a forward inference unit, a data unit, a back propagation unit and an updating unit. The method and the device greatly reduce both the number of parameters and the amount of computation of a convolutional neural network, thereby accelerating training and inference and allowing the network to be conveniently deployed on embedded devices, mobile devices or other terminals.

Description

Convolution operation method and device
Technical Field
The invention relates to the field of convolutional neural networks, and in particular to a convolution operation method and device.
Background
In recent years, Convolutional Neural Networks (CNNs) have developed rapidly and are widely used in fields such as object recognition, image restoration, and semantic segmentation. A convolutional neural network typically consists of a variable number of convolutional layers, pooling layers, normalization layers, activation layers, and other modules. A single conventional standard convolution may contain tens of thousands of parameters, which generate millions of floating-point operations when applied to the input data and therefore require a large amount of storage space and computing resources; convolution operations usually account for more than 90% of the computation of the whole convolutional neural network. Due to limits on computing power, memory, and power consumption, such convolutional neural networks are difficult to apply directly on embedded or mobile devices.
MobileNet proposes the depthwise separable convolution, which decomposes a standard k × k convolution into a k × k depthwise (grouped) convolution followed by a 1 × 1 pointwise convolution. ShuffleNet replaces the standard convolution with grouped convolutions and uses the channel shuffle technique to promote information flow across channels. LegoNet proposes an efficient convolution that assembles a small number of convolution kernels, in a building-block manner, to generate more feature layers. GhostNet applies simple linear mappings to a small number of feature layers to expand the number of feature layers.
Disclosure of Invention
The invention mainly addresses the large parameter count, heavy computation, and high power consumption of existing convolutional neural networks, and provides a convolution operation method and device that effectively reduce both the computation and the parameter count of a neural network.
The method adopts the technical scheme that:
A convolution operation method, comprising the following steps:
step 1, performing convolution calculation on an original convolution kernel and an input feature map to generate first output data;
step 2, performing a cyclic shift operation on the original convolution kernel to obtain an intermediate convolution kernel: unfolding the original convolution kernel from the low dimension to the high dimension into a one-dimensional vector, shifting all data of the vector by t bits, appending the t bits of data shifted out of the vector to the end of the vector to obtain a new one-dimensional vector, and recombining the vector from the low dimension to the high dimension to obtain an intermediate convolution kernel whose dimensions are the same as those of the original convolution kernel (a code sketch follows this list);
step 3, performing convolution calculation on the intermediate convolution kernel and the input feature map again to generate second output data;
step 4, repeating steps 2 to 3 a total of o-1 times with a different shift amount t each time, and concatenating all o pieces of obtained data, including the first output data and the second output data, along the channel dimension to obtain an output feature map;
step 5, calculating the gradient of the input feature map according to the gradient of the output feature map;
step 6, calculating the gradient of the original convolution kernel according to the gradient of the output feature map and the gradient of the input feature map;
step 7, updating the original convolution kernel and the learning rate by using the original convolution kernel gradient calculated in step 6.
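As an illustration of the cyclic shift of step 2, the following minimal NumPy sketch (not part of the patent text; the function name and the row-major, channel-first kernel layout are assumptions) unfolds a kernel, shifts it by t positions, and recombines it:

```python
import numpy as np

def cyclic_shift_kernel(F, t):
    """Step 2 sketch: unfold the kernel into a one-dimensional vector, cyclically
    shift it by t positions (data shifted out of one end re-enters at the other),
    and recombine it into a kernel with the same shape as the original."""
    flat = F.reshape(-1)           # unfold from low dimension to high dimension (row-major assumption)
    shifted = np.roll(flat, -t)    # shift all data of the vector by t bits (left shift)
    return shifted.reshape(F.shape)

# Example: for a (c=4, k=3, k=3) kernel, a shift of t = k*k moves every channel up by one.
F = np.arange(4 * 3 * 3).reshape(4, 3, 3)
assert np.array_equal(cyclic_shift_kernel(F, 9), np.roll(F, -1, axis=0))
```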
Further, in step 1, the convolution calculation specifically comprises: multiplying the original convolution kernel data element by element with the data at a given position of the input feature map and summing the products to obtain output data of one element, computing such single-element output data for all positions of the input feature map, and finally forming the single-channel first output data.
Further, in steps 1 to 4, the fast Fourier transform is used to accelerate the convolution calculation, specifically: unfolding the original convolution kernel data from the low dimension to the high dimension into a one-dimensional vector and mapping it into a one-dimensional vector in the frequency domain using the fast Fourier transform; meanwhile, unfolding the local data at a given position of the input feature map from the low dimension to the high dimension into a one-dimensional vector and mapping it into a one-dimensional vector in the frequency domain using the fast Fourier transform; performing a dot product of the two frequency-domain vectors and mapping the result vector back into a one-dimensional result vector in the time domain using the inverse fast Fourier transform; and selecting o pieces of data from the one-dimensional result vector as the data of the respective channels at the corresponding position of the output feature map, the selection being based on the intervals represented by the shift amounts in step 4.
Further, in step 5, the output feature map gradient is first padded with "0" around its border to obtain an expanded output feature map gradient, and the original convolution kernel of step 1 is then rotated to generate an original rotated convolution kernel, where the rotation operation means that the two-dimensional sub-matrix of the convolution kernel in each channel of the original convolution kernel is successively reflected along its two diagonals. The cyclic shift operation of step 2 is applied o-1 times to the original rotated convolution kernel, each shift corresponding to a shift in step 4, and a channel rearrangement operation is applied to the intermediate convolution kernel group formed by the generated o-1 convolution kernels together with the original rotated convolution kernel, producing a final convolution kernel group of c convolution kernels, where the channel rearrangement operation extracts the i-th channel of each convolution kernel in the intermediate group to form a new convolution kernel. Each convolution kernel in the final group is convolved with the expanded output feature map gradient to obtain c partial input feature map gradients in total, and these partial gradients are concatenated along the channel dimension to finally obtain the input feature map gradient.
Further, in step 6, the o-1 intermediate feature maps obtained by applying the cyclic shift operation o-1 times to the input feature map, together with the input feature map itself, form an intermediate feature map group; the convolution operation of step 1 is applied between each feature map in the group and the output feature map gradient to obtain c partial convolution kernel gradients in total, and these partial gradients are concatenated along the channel dimension to finally obtain the original convolution kernel gradient.
Further, in step 7, the original convolution kernel gradient is multiplied by the negative of the learning rate and accumulated onto the original convolution kernel to obtain the updated original convolution kernel, and a constant is subtracted from the learning rate to obtain the updated learning rate.
The convolution operation device of the invention comprises a cyclic shift unit, a forward inference unit, a data unit, a back propagation unit and an updating unit. The cyclic shift unit shifts the one-dimensional vector data of the convolution kernel and of the input feature map; the forward inference unit performs the convolution calculation between the convolution kernel and the input feature map; the data unit stores the input feature map, the convolution kernel, the output feature map and the gradient data; the back propagation unit solves the input feature map gradient and the original convolution kernel gradient from the output feature map gradient; and the updating unit updates the original convolution kernel and the learning rate.
The invention has the following advantages. Compared with existing convolution methods in convolutional neural networks, a single convolution kernel is expanded by the cyclic shift operation, so that when feature layers with the same number of channels are generated, the required weight parameters amount to only 1/o of those of a standard convolution, where o denotes the number of output channels (for example, with k = 3 and c = o = 64, a standard convolution stores o·c·k² = 36864 weights while the proposed layer stores c·k² = 576). Meanwhile, forward inference is mapped into the frequency domain by means of the fast Fourier transform; under common parameter settings this greatly reduces the amount of computation, thereby accelerating the forward inference of the neural network and allowing it to be conveniently deployed on embedded devices, mobile devices or other terminals.
Drawings
FIG. 1 is a training flowchart in embodiment 1 of the present invention;
FIG. 2 is a schematic flowchart of the method of the present invention;
FIG. 3 is a schematic diagram of the framework of the device of the present invention.
Detailed Description
The invention provides a convolution operation method and a convolution operation device, which are applied to a neural network. The present invention will be described in detail with reference to the accompanying drawings.
Example 1
Fig. 1 is a training flowchart of the present embodiment, which includes the following steps:
(1) building a neural network and initializing all weight parameters;
(2) inputting a training sample pair;
(3) forward inference is carried out layer by layer; for the convolutional network part, forward inference is performed with the method provided by the invention, and for other general neural network modules, forward inference is performed in the given manner;
(4) a loss value is obtained by evaluating a loss function on the final output of the neural network and the sample label;
(5) the loss value is back-propagated layer by layer through the loss function and the weight parameter gradient of each network layer is solved; for the convolutional network part, the convolution kernel gradient is solved with the method provided by the invention, and for other general neural network modules, back propagation is performed in the given manner;
(6) all weight parameters of the neural network, including the convolution kernels of the convolutional network part and the weight parameters of the other general neural network modules, are updated according to the weight parameter gradients, and the learning rate is updated;
(7) steps (2) to (6) are repeated until the specified number of iterations is reached.
The operation of the above steps involving the convolutional network portion will be described in detail below with reference to the flow chart of the method of the present invention shown in fig. 2.
In this embodiment, the data related to a single convolutional layer include the input feature map, the original convolution kernel, the output feature map, the input feature map gradient, the original convolution kernel gradient, and the output feature map gradient. In this embodiment the input feature map is organized as $X \in \mathbb{R}^{c \times h_x \times w_x}$, the original convolution kernel as $F \in \mathbb{R}^{c \times k \times k}$, and the output feature map as $Y \in \mathbb{R}^{o \times h_y \times w_y}$, where $\mathbb{R}$ denotes the real number field, c denotes the number of channels of the input data, $h_x$ and $w_x$ denote the height and width of the input data, o denotes the number of output channels, k denotes the size of the convolution kernel, and $h_y$ and $w_y$ denote the height and width of the output data. The gradient of each tensor is organized in the same way as the tensor itself.
The convolutional network part involved in step (3) is described in detail in this embodiment, and as shown in fig. 2, the process includes the following steps:
Step 1, the original convolution kernel F is convolved with the input feature map X to generate the first output data. The convolution calculation refers to: multiplying the first input data element by element with the data at a given position of the second input data and summing the products to obtain output data of one element, computing such single-element output data for all positions of the second input data, and finally forming single-channel output data. Here, the first input data is the original convolution kernel F and the second input data is the input feature map X.
Step 2, a cyclic shift operation is applied to the original convolution kernel to obtain an intermediate convolution kernel. The original convolution kernel is unfolded from the low dimension to the high dimension into a one-dimensional vector, all data of the vector are shifted by t bits, the t bits of data shifted out of the vector are appended at the end of the vector to obtain a new one-dimensional vector, and the vector is recombined from the low dimension to the high dimension into an intermediate convolution kernel whose dimensions are the same as those of the original convolution kernel of step 1. In this embodiment, the intermediate convolution kernel obtained by this cyclic shift operation is denoted $F_{g\leftarrow t_1}$, indicating that the shift direction is to the left and the shift amount is $t_1$.
Step 3, the intermediate convolution kernel is convolved with the input feature map again to generate the second output data.
Step 4, the operations of steps 2 to 3 are repeated o-1 times with a different shift amount t each time, and all o pieces of data, including the first output data and the second output data, are concatenated along the channel dimension to obtain the output feature map. The above steps can be expressed as

$$Y = \mathrm{concat}\big(F * X,\; F_{g\leftarrow t_1} * X,\; \ldots,\; F_{g\leftarrow t_{o-1}} * X\big) \tag{1}$$

where $*$ denotes the convolution operation. In this embodiment the shift amounts $t_i$ are taken from the arithmetic progression of o numbers $(0, k^2, 2k^2, \ldots, (o-1)k^2)$. Because a cyclic left shift by $i\,k^2$ positions keeps the lowest two dimensions of the original convolution kernel F unchanged and acts only along the third dimension, the i-th shift can equally be regarded as the i-th number of the arithmetic progression $(0, 1, 2, \ldots, o-1)$ of o numbers; expressed as a number of channels, the shift amounts are therefore

$$(0,\; 1,\; 2,\; \ldots,\; o-1). \tag{2}$$
In the present embodiment, the convolution stride is set to 1, and equation (1) can be expressed as the following mathematical equation:

$$Y_{d,i,j} = \sum_{b=1}^{c}\sum_{m=1}^{k}\sum_{n=1}^{k} F_{g\leftarrow d}(b, m, n)\, X(b,\; i+m-1,\; j+n-1) \tag{3}$$

where d, i, j are the indices of the output Y from the high dimension to the low dimension, and b, m, n are the indices of the weight parameter $F_{g\leftarrow d}$ (the original convolution kernel cyclically shifted to the left by d channels) from the high dimension to the low dimension. This formula describes the specific calculation flow of steps 1 to 4 above and finally yields the output feature map Y.
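The following NumPy sketch (an illustration, not part of the patent; names, layouts and the stride-1, no-padding setting are assumptions consistent with this embodiment) computes the output feature map directly according to equation (3), generating each output channel from a cyclically shifted copy of the single original kernel:

```python
import numpy as np

def shift_conv_forward(X, F, o):
    """Steps 1-4 sketch: X has shape (c, h_x, w_x), F has shape (c, k, k).
    Output channel d is produced by the kernel cyclically left-shifted by
    t_d = d*k*k positions (equivalently, by d channels), stride 1, no padding."""
    c, k, _ = F.shape
    _, hx, wx = X.shape
    hy, wy = hx - k + 1, wx - k + 1
    Y = np.zeros((o, hy, wy))
    for d in range(o):
        Fd = np.roll(F.reshape(-1), -d * k * k).reshape(F.shape)  # intermediate kernel F_{g<-d}
        for i in range(hy):
            for j in range(wy):
                # equation (3): element-wise multiply-and-sum over the local patch
                Y[d, i, j] = np.sum(Fd * X[:, i:i + k, j:j + k])
    return Y
```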
Optionally, in this embodiment the fast Fourier transform is used to accelerate the computation, as follows. Equation (3) can first be expressed as a dot product between two vectors:

$$Y_{d,i,j} = x_{i,j} \cdot f_{g\leftarrow d k^2} \tag{4}$$

where $x_{i,j}$ denotes the data of the input X involved in computing $Y_{d,i,j}$, unfolded into a one-dimensional vector; $f_{g\leftarrow d k^2}$ denotes the weight parameter F unfolded from the high dimension to the low dimension into a one-dimensional vector and cyclically shifted to the left by $d\,k^2$ bits; and $\cdot$ denotes the dot product of two vectors. The vector formed by the o values $Y_{d,i,j}$ of the different channels is denoted $y_{i,j}$; it is part of the vector t obtained by circularly convolving x and f, and can be expressed as

$$y_{i,j} = \big(t_0,\; t_{k^2},\; t_{2k^2},\; \ldots,\; t_{(o-1)k^2}\big), \qquad t = x_{i,j} \circledast f \tag{5}$$

where $\circledast$ denotes circular convolution and $t_{k^2 c - 1}$ denotes the $(k^2 c - 1)$-th (last) value of the vector t. The Fourier transform of the circular convolution of two vectors equals the dot product of their respective Fourier transforms, so

$$t = \mathcal{F}^{-1}\big(\mathcal{F}(x_{i,j}) \odot \mathcal{F}(f)\big) \tag{6}$$

where $\mathcal{F}^{-1}$ denotes the inverse fast Fourier transform and $\mathcal{F}$ denotes the fast Fourier transform. As shown in formula (5), o points are taken from the one-dimensional vector t as part of the data of the o channels of the output feature map; traversing the input feature map according to the stride and repeating the above operation yields the complete o-channel data, i.e., the output feature map Y. Accelerating steps 1 to 4 with formula (6) reduces the number of floating-point operations required to compute the output data Y from $h_y \times w_y \times k^2 c \times o$ to $h_y \times w_y \times k^2 c \times 3\log_2(k^2 c)$. Under a common parameter setting, for example k = 3 and c = o = 64, the amount of computation is reduced to about 43% of that of steps 1 to 4 without the acceleration method, and the larger the number of output channels o, the more significant the acceleration.
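A frequency-domain sketch in the spirit of equations (4) to (6) follows (again an illustration with assumed names and layouts, not the patent's implementation). Because the o entries of $y_{i,j}$ are dot products of the patch with left-shifted copies of the kernel vector, the sketch uses the conjugated spectrum of the patch, i.e. circular correlation, so that entry $d\,k^2$ of t equals $Y_{d,i,j}$; it reproduces the result of the direct loop above:

```python
import numpy as np

def shift_conv_forward_fft(X, F, o):
    """FFT-accelerated sketch of steps 1-4. The kernel spectrum is computed once;
    at every output position the flattened patch is transformed, combined with it
    in the frequency domain, transformed back, and sampled at intervals of k*k."""
    c, k, _ = F.shape
    _, hx, wx = X.shape
    hy, wy = hx - k + 1, wx - k + 1
    N = c * k * k
    Ff = np.fft.fft(F.reshape(-1))              # spectrum of the flattened kernel f
    picks = (np.arange(o) * k * k) % N          # positions 0, k^2, 2k^2, ... inside t
    Y = np.zeros((o, hy, wy))
    for i in range(hy):
        for j in range(wy):
            x = X[:, i:i + k, j:j + k].reshape(-1)   # flattened local patch x_{i,j}
            # conjugated spectrum -> circular correlation: t[s] = sum_m x[m] * f[(m + s) mod N],
            # so t[d*k*k] is the dot product with the kernel vector left-shifted by d*k*k bits
            t = np.fft.ifft(np.conj(np.fft.fft(x)) * Ff).real
            Y[:, i, j] = t[picks]
    return Y

# Sanity check against the direct sketch above:
# np.allclose(shift_conv_forward_fft(X, F, o), shift_conv_forward(X, F, o))
```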
The convolutional network part involved in step (5) is described in detail in this embodiment, and as shown in fig. 1, the process includes the following steps:
and 5, calculating the gradient of the input feature map according to the gradient of the output feature map. The specific operation flow is as follows: firstly, expanding the periphery of the output characteristic graph gradient by using 0 to obtain an expanded output characteristic graph gradient, and then carrying out rotation operation on the original convolution kernel in the step 1 to generate an original rotation convolution kernel, wherein the rotation operation refers to that data of a convolution kernel two-dimensional sub-matrix in each channel of the original convolution kernel are sequentially and symmetrically transformed along two matrix diagonals. And (3) performing cyclic displacement operation o-1 times on the original convolution kernel in the step (2), wherein each displacement corresponds to the step (4), performing channel arrangement operation on an intermediate convolution kernel group formed by the generated o-1 convolution kernels and the original convolution kernel to generate a final convolution kernel group comprising c convolution kernels, wherein the channel arrangement operation is to extract ith channel data of each convolution kernel in the intermediate convolution kernel group to form a new convolution kernel, performing convolution operation on each convolution kernel in the final convolution kernel group and the expansion output characteristic map gradient to obtain c partial input characteristic map gradients in total, performing channel dimension splicing on the partial input characteristic map gradients, and finally obtaining the input characteristic map gradient. With reference to the present embodiment, the equivalent calculation formula of the above operation flow can be expressed as:
Figure BDA0002432129540000061
in the formula, d ', i ', j ' represents the index of the input feature diagram X from the high dimension to the low dimension, and L represents the final loss value of the neural network.
Step 6, the original convolution kernel gradient is calculated from the output feature map gradient. The specific procedure is as follows: the cyclic shift operation is applied o-1 times to the input feature map to obtain o-1 intermediate feature maps, which together with the input feature map form an intermediate feature map group; the convolution operation of step 1 is applied between each feature map in the group and the output feature map gradient to obtain c partial convolution kernel gradients in total, and these are concatenated along the channel dimension to finally obtain the original convolution kernel gradient. In this embodiment, the above procedure is equivalent to

$$\frac{\partial L}{\partial F_{b,m,n}} = \sum_{d=1}^{o}\sum_{i=1}^{h_y}\sum_{j=1}^{w_y} \frac{\partial L}{\partial Y_{d,i,j}}\; X_{g\to d}(b,\; i+m-1,\; j+n-1) \tag{8}$$

where $X_{g\to d}$ denotes the input data X with its lowest two dimensions kept unchanged and cyclically shifted to the right by d positions along the third dimension.
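For completeness, a naive backward-pass sketch matching equations (7) and (8) is given below (an illustration under the same assumed layout as the forward sketches, not the patent's implementation; dY stands for the output feature map gradient ∂L/∂Y):

```python
import numpy as np

def shift_conv_backward(X, F, dY):
    """Backward sketch of equations (7) and (8): X is (c, h_x, w_x), F is (c, k, k),
    dY is the output feature map gradient of shape (o, h_y, w_y).
    Returns the input feature map gradient dX and the original kernel gradient dF."""
    c, k, _ = F.shape
    o, hy, wy = dY.shape
    dX = np.zeros(X.shape)
    dF = np.zeros(F.shape)
    for d in range(o):
        Fd = np.roll(F, -d, axis=0)   # F_{g<-d}: channels cyclically shifted left by d
        Xd = np.roll(X, d, axis=0)    # X_{g->d}: channels cyclically shifted right by d
        for i in range(hy):
            for j in range(wy):
                g = dY[d, i, j]
                dX[:, i:i + k, j:j + k] += g * Fd          # equation (7)
                dF += g * Xd[:, i:i + k, j:j + k]          # equation (8)
    return dX, dF
```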
The convolutional network part involved in step (6) is described in detail in this embodiment, and as shown in fig. 1, the process includes the following steps:
and 7, updating the original convolution kernel and the learning rate by using the original convolution kernel gradient calculated in the step 6. The method comprises the following specific steps: and multiplying the original convolution kernel gradient and the negative number of the learning rate and accumulating the result to the original convolution kernel to obtain an updated original convolution kernel, and simultaneously subtracting a constant from the learning rate to obtain an updated learning rate. In this embodiment, the equivalent calculation formula of the above operation flow can be expressed as:
Figure BDA0002432129540000063
in the formula, F ', η' represents the updated weight parameter and learning rate, and α is a constant.
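A one-line realization of the update rule of equation (9) might look as follows (illustrative; the function name and the value of alpha are assumptions):

```python
def update_parameters(F, dF, lr, alpha=1e-4):
    """Equation (9) sketch: accumulate -lr * dF onto the original kernel and
    decrease the learning rate by a constant (the value of alpha is illustrative)."""
    return F - lr * dF, lr - alpha
```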
Example 2
This embodiment provides a convolution operation device, named convolution operation device 100, as shown in fig. 3, which includes a cyclic shift unit, a forward inference unit, a data unit, a back propagation unit, an updating unit, and an acceleration unit.
The cyclic shift unit shifts the one-dimensional vector data of the original convolution kernel, the original rotated convolution kernel, and the input feature map; the forward inference unit performs the convolution calculation between the original convolution kernel or the intermediate convolution kernel and the input feature map;
the data unit stores the input feature map, the original convolution kernel, the output feature map, the input feature map gradient, the original convolution kernel gradient, and the output feature map gradient;
the back propagation unit solves the input feature map gradient and the original convolution kernel gradient from the output feature map gradient;
and the updating unit updates the original convolution kernel and the learning rate.
The acceleration unit accelerates the computation in the cyclic shift unit and the forward inference unit using the fast Fourier transform.
For the data unit, in this embodiment the input feature map is organized as $X \in \mathbb{R}^{c \times h_x \times w_x}$, the original convolution kernel as $F \in \mathbb{R}^{c \times k \times k}$, and the output feature map as $Y \in \mathbb{R}^{o \times h_y \times w_y}$, where $\mathbb{R}$ denotes the real number field, c denotes the number of channels of the input data, $h_x$ and $w_x$ denote the height and width of the input data, o denotes the number of output channels, k denotes the size of the convolution kernel, and $h_y$ and $w_y$ denote the height and width of the output data; the gradient of each tensor is organized in the same way as the tensor itself. Compared with a conventional standard convolutional layer in deep learning, for the same amount of input and output feature map data, the convolution kernel in the data unit of the device requires only 1/o of the weight parameters of the conventional standard convolutional layer, where o denotes the number of channels of the output data.
The forward inference unit is characterized in that the original convolution kernel F is expanded by an element-wise or channel-wise cyclic shift operation, and the expanded kernels are convolved with the input feature map X.
The acceleration unit is characterized in that the forward inference is mapped into the frequency domain for computation using the fast Fourier transform, and the result is then mapped back into the time domain using the inverse fast Fourier transform.
The back propagation unit comprises two parts: a first sub-processing unit and a second sub-processing unit. The first sub-processing unit solves the input feature map gradient $\partial L / \partial X$ from the output feature map gradient $\partial L / \partial Y$, providing the basis for the gradient solution of the previous network layer according to the chain rule; the second sub-processing unit solves the original convolution kernel gradient $\partial L / \partial F$ from the output feature map gradient $\partial L / \partial Y$, providing the basis for updating the original convolution kernel F.
The updating unit multiplies the original convolution kernel gradient $\partial L / \partial F$ by the negative of the set learning rate η, accumulates the result onto the original convolution kernel F to update it, and updates the learning rate.
The convolution operation device 100 of this embodiment is built into a single-layer neural network: the convolution operation device 100 replaces a conventional standard convolutional layer in a convolutional neural network and is combined with other general neural network structures, mainly comprising activation layers, pooling layers, normalization layers, and a loss function, to build the neural network. The built neural network is trained with training data until the specified number of iterations is reached, and the trained neural network model is then deployed to an embedded device, a mobile device, or another terminal.
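Purely as an illustration of how the units of convolution operation device 100 cooperate in a single-layer network, the following sketch strings together the forward, back propagation, and updating sketches defined earlier in this description (shift_conv_forward_fft, shift_conv_backward, update_parameters) with a simple squared-error loss; all names, shapes, and the loss function are assumptions, not part of the patent:

```python
import numpy as np

# One training iteration of a single layer built from the sketches above.
c, k, o, h, w = 8, 3, 8, 10, 10
rng = np.random.default_rng(0)
X = rng.standard_normal((c, h, w))            # input feature map (data unit)
F = 0.1 * rng.standard_normal((c, k, k))      # original convolution kernel (data unit)
target = rng.standard_normal((o, h - k + 1, w - k + 1))
lr = 0.01

Y = shift_conv_forward_fft(X, F, o)           # forward inference unit + acceleration unit
loss = 0.5 * np.sum((Y - target) ** 2)        # stand-in for the network's loss function
dY = Y - target                               # output feature map gradient
dX, dF = shift_conv_backward(X, F, dY)        # back propagation unit
F, lr = update_parameters(F, dF, lr)          # updating unit
```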

Claims (8)

1. A method of convolution operation, comprising the steps of:
step 1, performing convolution calculation on an original convolution kernel and an input feature map to generate first output data;
step 2, performing a cyclic shift operation on the original convolution kernel to obtain an intermediate convolution kernel: unfolding the original convolution kernel from the low dimension to the high dimension into a one-dimensional vector, shifting all data of the vector by t bits, appending the t bits of data shifted out of the vector to the end of the vector to obtain a new one-dimensional vector, and recombining the vector from the low dimension to the high dimension to obtain an intermediate convolution kernel whose dimensions are the same as those of the original convolution kernel;
step 3, performing convolution calculation on the intermediate convolution kernel and the input feature map again to generate second output data;
step 4, repeating steps 2 to 3 a total of o-1 times with a different shift amount t each time, and concatenating all o pieces of obtained data, including the first output data and the second output data, along the channel dimension to obtain an output feature map;
step 5, calculating the gradient of the input feature map according to the gradient of the output feature map;
step 6, calculating the gradient of the original convolution kernel according to the gradient of the output feature map and the gradient of the input feature map;
step 7, updating the original convolution kernel and the learning rate by using the original convolution kernel gradient calculated in step 6.
2. The convolution operation method according to claim 1, wherein in step 1 the convolution calculation specifically comprises: multiplying the original convolution kernel data element by element with the data at a given position of the input feature map and summing the products to obtain output data of one element, computing such single-element output data for all positions of the input feature map, and finally forming the single-channel first output data.
3. The convolution operation method according to claim 1, wherein in steps 1 to 4 the fast Fourier transform is used to accelerate the convolution calculation, specifically: unfolding the original convolution kernel data from the low dimension to the high dimension into a one-dimensional vector and mapping it into a one-dimensional vector in the frequency domain using the fast Fourier transform; meanwhile, unfolding the local data at a given position of the input feature map from the low dimension to the high dimension into a one-dimensional vector and mapping it into a one-dimensional vector in the frequency domain using the fast Fourier transform; performing a dot product of the two frequency-domain one-dimensional vectors and mapping the result vector back into a one-dimensional result vector in the time domain using the inverse fast Fourier transform; and selecting o pieces of data from the one-dimensional result vector as the data of the respective channels at the corresponding position of the output feature map, the selection being based on the intervals represented by the shift amounts in step 4.
4. The convolution operation method according to claim 1, wherein in step 5 the output feature map gradient is first padded with "0" around its border to obtain an expanded output feature map gradient, and the original convolution kernel of step 1 is rotated to generate an original rotated convolution kernel; the cyclic shift operation of step 2 is applied o-1 times to the original rotated convolution kernel, each shift corresponding to a shift in step 4, and a channel rearrangement operation is applied to the intermediate convolution kernel group formed by the generated o-1 convolution kernels together with the original rotated convolution kernel to generate a final convolution kernel group comprising c convolution kernels; and each convolution kernel in the final convolution kernel group is convolved with the expanded output feature map gradient to obtain c partial input feature map gradients in total, which are concatenated along the channel dimension to finally obtain the input feature map gradient.
5. The convolution operation method according to claim 1, wherein in step 6 an intermediate feature map group is formed by the o-1 intermediate feature maps obtained by applying the cyclic shift operation o-1 times to the input feature map together with the input feature map itself, the convolution operation of step 1 is applied between each feature map in the intermediate feature map group and the output feature map gradient to obtain c partial convolution kernel gradients in total, and the c partial convolution kernel gradients are concatenated along the channel dimension to obtain the original convolution kernel gradient.
6. The convolution operation method according to claim 1, wherein in step 7 the original convolution kernel gradient is multiplied by the negative of the learning rate and accumulated onto the original convolution kernel to obtain the updated original convolution kernel, and a constant is subtracted from the learning rate to obtain the updated learning rate.
7. A convolution operation device, comprising a cyclic shift unit, a forward inference unit, a data unit, a back propagation unit, and an updating unit;
the cyclic shift unit is used for shifting the one-dimensional vector data of the convolution kernel and of the input feature map;
the forward inference unit is used for performing convolution calculation on the convolution kernel and the input feature map;
the data unit is used for storing the input feature map, the convolution kernel, the output feature map, and the gradient data;
the back propagation unit is used for solving the input feature map gradient and the original convolution kernel gradient from the output feature map gradient;
and the updating unit is used for updating the original convolution kernel and the learning rate.
8. The convolution operation device according to claim 7, further comprising an acceleration unit for accelerating the computation in the cyclic shift unit and the forward inference unit by means of the fast Fourier transform.
CN202010239668.7A 2020-03-30 2020-03-30 Convolution operation method and device Pending CN111428188A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010239668.7A CN111428188A (en) 2020-03-30 2020-03-30 Convolution operation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010239668.7A CN111428188A (en) 2020-03-30 2020-03-30 Convolution operation method and device

Publications (1)

Publication Number Publication Date
CN111428188A true CN111428188A (en) 2020-07-17

Family

ID=71550304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010239668.7A Pending CN111428188A (en) 2020-03-30 2020-03-30 Convolution operation method and device

Country Status (1)

Country Link
CN (1) CN111428188A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181358A (en) * 2020-10-26 2021-01-05 南京大学 Reconfigurable neural network training calculation method and device
CN112830359A (en) * 2021-01-08 2021-05-25 燕山大学 System for detecting abnormal behavior of passengers in elevator car based on deep learning
CN112830359B (en) * 2021-01-08 2022-04-15 燕山大学 System for detecting abnormal behavior of passengers in elevator car based on deep learning
CN112784969A (en) * 2021-02-01 2021-05-11 东北大学 Convolutional neural network accelerated learning method based on sampling
CN112836823A (en) * 2021-03-02 2021-05-25 东南大学 Convolutional neural network back propagation mapping method based on cyclic recombination and blocking
CN112836823B (en) * 2021-03-02 2024-03-05 东南大学 Convolutional neural network back propagation mapping method based on cyclic recombination and blocking
CN113254996A (en) * 2021-05-31 2021-08-13 平安科技(深圳)有限公司 Graph neural network training method and device, computing equipment and storage medium
CN113254996B (en) * 2021-05-31 2022-12-27 平安科技(深圳)有限公司 Graph neural network training method and device, computing equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111428188A (en) Convolution operation method and device
Oliva et al. Transformation autoregressive networks
CN107977704B (en) Weight data storage method and neural network processor based on same
CN108345939B (en) Neural network based on fixed-point operation
US11645529B2 (en) Sparsifying neural network models
EP3373210A1 (en) Transposing neural network matrices in hardware
US20170193361A1 (en) Neural network training performance optimization framework
JP7325158B2 (en) Data Representation for Dynamic Accuracy in Neural Network Cores
TWI670651B (en) Apparatus, method and system for increasing a speed at which a rocessing unit performs machinelearning computations
CN107610146A (en) Image scene segmentation method, apparatus, computing device and computer-readable storage medium
CN109410114B (en) Compressed Sensing Image Reconstruction Algorithm Based on Deep Learning
CN107204008B (en) Quantum image matching method
CN110008952B (en) Target identification method and device
Kirby Fast simplicial finite element algorithms using Bernstein polynomials
CN107730514A (en) Scene cut network training method, device, computing device and storage medium
CN109447897B (en) Real scene image synthesis method and system
CN111598227B (en) Data processing method, device, electronic equipment and computer readable storage medium
CN111882028B (en) Convolution operation device for convolution neural network
CN114418105B (en) Method and device for processing quantum application problem based on quantum circuit
EP4323924A2 (en) Classical and quantum algorithms for orthogonal neural networks
CN114418104B (en) Quantum application problem processing method and device
Turay et al. SSP Framework: A New Approach to Designing Lightweight Convolutional Neural Networks
CN114595641A (en) Method and system for solving combined optimization problem
Woolfe Matrix product operator simulations of quantum algorithms
CN112184592A (en) Image restoration method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20200717)