CN113407904A - Winograd processing method, system and medium compatible with multi-dimensional convolutional neural network - Google Patents

Winograd processing method, system and medium compatible with multi-dimensional convolutional neural network

Info

Publication number
CN113407904A
CN113407904A
Authority
CN
China
Prior art keywords
winograd
conversion
neural network
calculation
activation
Prior art date
Legal status
Granted
Application number
CN202110641820.9A
Other languages
Chinese (zh)
Other versions
CN113407904B (en)
Inventor
王鉴
虞志益
邓慧鹏
叶华锋
肖山林
Current Assignee
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202110641820.9A priority Critical patent/CN113407904B/en
Publication of CN113407904A publication Critical patent/CN113407904A/en
Application granted granted Critical
Publication of CN113407904B publication Critical patent/CN113407904B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a Winograd processing method, system and medium compatible with multi-dimensional convolutional neural networks. The method comprises the following steps: determining the Winograd dimension calculation mode according to the network dimension, and proceeding to the Winograd conversion step; inputting the activations of the neural network into a Winograd activation conversion module for conversion calculation; inputting the weights of the neural network into a weight conversion module for Winograd conversion calculation; after the Winograd conversions are complete, performing matrix multiplication on the converted activations and weights to obtain the partial sum before Winograd conversion; determining the calculation mode of the partial-sum conversion according to the Winograd convolution size, and performing the Winograd conversion calculation on the partial sum; and determining the final effective output link according to the Winograd dimension mode. By combining Winograd formulas of different sizes, the method can flexibly switch the Winograd calculation size within the calculation of a single neural-network layer, and can be widely applied in the field of convolutional neural network algorithms.

Description

Winograd processing method, system and medium compatible with multi-dimensional convolutional neural network
Technical Field
The invention relates to the field of convolutional neural network algorithms, and in particular to a Winograd processing method, system and medium compatible with multi-dimensional convolutional neural networks.
Background
With the development of deep learning and convolutional neural network research, networks of different dimensionalities have shown excellent performance across diverse application scenarios. Because these dimensionalities differ, hardware implementations and deployments of convolutional neural networks are increasingly divergent, and hardware designs lack the flexibility to support networks of multiple dimensionalities. Meanwhile, because neural networks carry a huge number of parameters, the Winograd algorithm was proposed to reduce computational complexity and the number of multiplications. However, the Winograd formulas for convolutional neural networks of different dimensionalities differ, so a single design cannot simultaneously meet the requirements of multi-dimensional convolutional neural networks.
Disclosure of Invention
To solve, at least to some extent, at least one of the technical problems in the prior art, the present invention aims to provide a Winograd processing method, system and medium compatible with multi-dimensional convolutional neural networks.
The technical scheme adopted by the invention is as follows:
a Winograd processing method compatible with a multi-dimensional convolutional neural network comprises the following steps:
determining the Winograd dimension calculation mode according to the network dimension, and proceeding to the Winograd conversion step;
in the Winograd conversion step, inputting the activations of the neural network into a Winograd activation conversion module for conversion calculation;
inputting the weights of the neural network into a weight conversion module for Winograd conversion calculation;
after the Winograd conversions are complete, performing matrix multiplication on the converted activations and weights to obtain the partial sum before Winograd conversion;
determining the calculation mode of the partial-sum conversion according to the Winograd convolution size, and performing the Winograd conversion calculation on the partial sum;
and determining the final effective output link according to the Winograd dimension mode.
Further, determining the Winograd dimension calculation mode according to the network dimension comprises:
a 3D convolutional neural network has three dimension directions (rows, columns and frames), a 2D neural network has two dimension directions (rows and columns), and a 1D neural network has only one dimension direction;
for network data of different dimensionalities, different Winograd formulas are used to perform the conversion processing and calculation of the activations, weights and partial sums.
Further, the Winograd calculation formula for a 1D neural network is:
y = G[(Bx) ⊙ (Aw)]
The Winograd calculation formula for a 2D neural network is:
y = G[(BxB^T) ⊙ (AwA^T)]G^T
The Winograd calculation formula for a 3D neural network is:
y = (G[((BxB^T)^R B^T) ⊙ ((AwA^T)^R A^T)]G^T)^R G^T
where G denotes the Winograd partial-sum conversion matrix, B denotes the Winograd activation conversion matrix, A denotes the Winograd weight conversion matrix, x denotes the activation, w denotes the weight, the superscript T denotes transposition, and the superscript R denotes rotation.
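As an illustration of the 1D formula, the following Python sketch plugs in the standard F(2,3) transform matrices (a common textbook choice; the patent's own matrices appear only in its figures), using the patent's naming (B for activation, A for weight, G for partial sum), and checks the result against direct convolution:

```python
import numpy as np

# Standard F(2,3) Winograd matrices, written in the patent's notation.
# Illustrative only: the patent's actual matrices are given in its figures.
B = np.array([[1, 0, -1, 0],
              [0, 1,  1, 0],
              [0, -1, 1, 0],
              [0, 1,  0, -1]], dtype=float)   # activation conversion
A = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])              # weight conversion
G = np.array([[1, 1,  1,  0],
              [0, 1, -1, -1]], dtype=float)   # partial-sum conversion

def winograd_1d(x, w):
    """y = G[(Bx) ⊙ (Aw)] for a length-4 activation tile and length-3 weight."""
    return G @ ((B @ x) * (A @ w))

x = np.array([1.0, 2.0, 3.0, 4.0])   # activation tile
w = np.array([5.0, 6.0, 7.0])        # weight (filter)
y = winograd_1d(x, w)
# Direct valid cross-correlation produces the same two outputs.
ref = np.array([x[0]*w[0] + x[1]*w[1] + x[2]*w[2],
                x[1]*w[0] + x[2]*w[1] + x[3]*w[2]])
assert np.allclose(y, ref)
```

Here the element-wise product costs four multiplications, where direct convolution of the same tile needs six, which is the source of Winograd's savings.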
Further, the weight conversion module is composed of an adder, an inverter, a shifter and a selector.
Further, the partial sum conversion module is composed of an adder, an inverter and a selector.
Further, determining the final effective output link according to the Winograd dimension mode comprises:
selecting the corresponding effective output link as the final result according to the network dimension;
for a 3D convolutional neural network, selecting the 3D output link;
for a 2D convolutional neural network, selecting the 2D output link;
for a 1D convolutional neural network, selecting the 1D output link.
The other technical scheme adopted by the invention is as follows:
a Winograd processing system compatible with a multi-dimensional convolutional neural network comprises:
the dimensionality judgment module is used for judging according to network dimensionality, determining a Winograd dimensionality calculation mode and entering a Winograd conversion step;
the activation calculation module is used for inputting the activation of the neural network into the Winograd activation conversion module for conversion calculation in the Winograd conversion step;
the weight calculation module is used for inputting the weight of the neural network into the weight conversion module to carry out Winograd conversion calculation;
the partial-sum calculation module is used for performing matrix multiplication on the converted activations and weights after the Winograd conversions are complete, to obtain the partial sum before Winograd conversion;
the partial-sum conversion module is used for determining the calculation mode of the partial-sum conversion according to the Winograd convolution size, and performing the Winograd conversion calculation on the partial sum;
and the output module is used for determining a final effective output link according to different Winograd dimension modes.
The other technical scheme adopted by the invention is as follows:
a Winograd processing system compatible with a multi-dimensional convolutional neural network comprises:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method described above.
The other technical scheme adopted by the invention is as follows:
a storage medium having stored therein a processor-executable program for performing the method as described above when executed by a processor.
The invention has the following beneficial effects: by combining Winograd formulas of different sizes, the method flexibly switches the Winograd calculation size within the calculation of a single neural-network layer, reduces the number of zero-padding operations in a single layer, improves calculation efficiency, and reduces the area overhead of the hardware implementation.
Drawings
To more clearly illustrate the embodiments of the present invention and the technical solutions in the prior art, the drawings of the embodiments are described below. It should be understood that the drawings described below cover only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating steps of a Winograd processing method compatible with a multidimensional convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a flow chart of Winograd calculation oriented to different dimensions in an embodiment of the present invention;
FIG. 3 is an overall architecture diagram including the activation, weight and partial-sum conversion modules in an embodiment of the present invention;
FIG. 4 is an overall architecture diagram of the weight conversion module in an embodiment of the present invention;
FIG. 5 is an overall architecture diagram of the partial-sum conversion module in an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more, and "a plurality" means two or more; "greater than", "less than", "exceeding", etc. are understood as excluding the stated number, while "above", "below", "within", etc. are understood as including it. Where "first" and "second" are used only to distinguish technical features, they should not be understood as indicating or implying relative importance, the number of the indicated features, or their precedence.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
As shown in fig. 1 to fig. 3, the present embodiment provides a Winograd processing method compatible with a multidimensional convolutional neural network, including the following steps:
s1, judging according to the network dimension, determining a Winograd dimension calculation mode, and entering a Winograd conversion step.
A 3D convolutional neural network has three dimension directions: rows, columns and frames (W, H and F); a 2D neural network has two dimension directions (W and H); a 1D neural network has only one dimension direction (W). A first judgment is made according to the network dimension, and after the Winograd dimension calculation mode is determined, the method proceeds to the Winograd conversion step.
The Winograd conversion formulas comprise three parts: activation conversion, weight conversion, and partial-sum conversion. For network data of different dimensionalities, different Winograd formulas must be used for these conversions and calculations. After the dimension calculation mode has been determined from the data dimensionality, the subsequent steps use the 1D Winograd conversion formulas and calculation for 1D data, the 2D formulas and calculation for 2D data, and the 3D formulas and calculation for 3D data.
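The dimension dispatch just described can be sketched as follows (the function and mode names are hypothetical, not from the patent):

```python
def select_winograd_mode(num_spatial_dims):
    """Map a layer's spatial dimensionality to a Winograd calculation mode.

    1 -> 1D formulas (W), 2 -> 2D formulas (W, H),
    3 -> 3D formulas (W, H, F: rows, columns, frames).
    """
    modes = {1: "winograd_1d", 2: "winograd_2d", 3: "winograd_3d"}
    if num_spatial_dims not in modes:
        raise ValueError("Winograd mode is defined only for 1D/2D/3D networks")
    return modes[num_spatial_dims]

# Each mode selects the matching conversion formulas for the later steps.
assert select_winograd_mode(3) == "winograd_3d"
```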
The Winograd calculation formula for a 1D neural network is:
y = G[(Bx) ⊙ (Aw)]
The Winograd calculation formula for a 2D neural network is:
y = G[(BxB^T) ⊙ (AwA^T)]G^T
The Winograd calculation formula for a 3D neural network is:
y = (G[((BxB^T)^R B^T) ⊙ ((AwA^T)^R A^T)]G^T)^R G^T
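As a hedged sketch of the 2D case, the following Python code applies the standard F(2x2, 3x3) transforms (an illustrative choice; the patent's matrices are given only in its figures) and verifies y = G[(BxB^T) ⊙ (AwA^T)]G^T against direct 2D correlation on a 4x4 tile:

```python
import numpy as np

# Standard F(2x2, 3x3) Winograd transforms in the patent's notation.
# Illustrative only: the patent's actual matrices are given in its figures.
B = np.array([[1, 0, -1, 0],
              [0, 1,  1, 0],
              [0, -1, 1, 0],
              [0, 1,  0, -1]], dtype=float)   # activation conversion
A = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])              # weight conversion
G = np.array([[1, 1,  1,  0],
              [0, 1, -1, -1]], dtype=float)   # partial-sum conversion

def winograd_2d(x, w):
    """y = G[(B x B^T) ⊙ (A w A^T)] G^T for a 4x4 activation tile, 3x3 weight."""
    return G @ ((B @ x @ B.T) * (A @ w @ A.T)) @ G.T

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4))
w = rng.standard_normal((3, 3))
y = winograd_2d(x, w)

# Reference: direct valid 2D cross-correlation over the 2x2 output positions.
ref = np.array([[np.sum(x[i:i + 3, j:j + 3] * w) for j in range(2)]
                for i in range(2)])
assert np.allclose(y, ref)
```

The element-wise product costs 16 multiplications versus 36 for direct 3x3 convolution over the same 2x2 output tile; the 3D formula extends the same pattern with the rotation R along the frame dimension.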
s2, in the Winograd conversion step, the activation of the neural network is input into a Winograd activation conversion module for conversion calculation.
And S3, inputting the weight of the neural network into the weight conversion module to perform Winograd conversion calculation.
Winograd weight conversion first requires determining the Winograd calculation size, and different Winograd calculation formulas must be used for different weight sizes. This embodiment simultaneously supports the weight-size-2 and weight-size-3 Winograd formats, can switch freely between them within the calculation of a given convolutional layer, and thereby completes the convolution calculation with fewer zero-padding operations. The Winograd conversion parameters for weight size 2 and weight size 3 are as follows.
Winograd parameters with activation size 4 and weight size 3:
(transform matrices shown as an image in the original publication; not reproduced here)
Winograd parameters with activation size 4 and weight size 2:
(transform matrices shown as an image in the original publication; not reproduced here)
referring to fig. 4, the circuit of the simultaneous weight conversion module is composed of an adder, a negation device, a shifter and a selector, and compared with a Winograd formula in which multiplication is directly performed, resources of the multiplier on hardware are reduced, and the hardware implementation area of the circuit is smaller.
And S4, matrix multiplication is carried out on the activation and the weight after Winograd calculation conversion is completed, and the partial sum before Winograd conversion is obtained.
Winograd partial-sum conversion likewise requires determining the Winograd calculation size first, and different Winograd calculation formulas must be used for different partial-sum sizes. This embodiment simultaneously supports the size-2 and size-3 Winograd partial-sum formats, can switch freely between them within the calculation of a given convolutional layer, and thereby completes the convolution calculation with fewer zero-padding operations.
Meanwhile, referring to fig. 5, the circuit of the partial-sum conversion module consists of adders, inverters and selectors. Compared with implementing the Winograd formula directly with multipliers, this reduces multiplier resources in hardware, so the circuit occupies a smaller implementation area.
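The partial-sum conversion needs no shifter at all: with the illustrative F(2,3) partial-sum matrix (our choice for the sketch, not necessarily the patent's), every output is a signed sum of element-wise products, matching an adder/inverter/selector circuit:

```python
def partial_sum_transform_f23(m0, m1, m2, m3):
    """Compute G m for the illustrative F(2,3) partial-sum matrix
    G = [[1, 1, 1, 0], [0, 1, -1, -1]]
    using additions and sign inversions only (no multiplier, no shifter).
    m0..m3 are the element-wise products of converted activation and weight."""
    return [m0 + m1 + m2,       # output 0
            m1 - m2 - m3]       # output 1 (inverters on m2 and m3)

# Example products and the two convolution outputs they reduce to.
assert partial_sum_transform_f23(-10, 45, 3, -14) == [38, 56]
```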
And S5, determining the calculation mode of the partial-sum conversion according to the Winograd convolution size, and performing the Winograd conversion calculation on the partial sum.
And S6, determining a final effective output link according to different Winograd dimension modes.
And selecting the corresponding effective output link according to different network dimensions to obtain a final result. For the case of a 3D convolutional neural network, the link for the 3D output is selected. For the case of a 2D convolutional neural network, the link for the 2D output is selected. For the case of a 1D convolutional neural network, the link for the 1D output is selected.
Referring to fig. 2, in summary, the method of this embodiment determines the Winograd dimension calculation mode according to the network dimension and proceeds to the Winograd conversion step. The activations of the neural network are input into the Winograd activation conversion module: a 1D activation undergoes the 1D conversion, a 2D activation the 2D conversion, and a 3D activation the 3D conversion after rotation; once conversion is complete, the converted activations are obtained. Meanwhile, the weights of the neural network are input into the weight conversion module for Winograd conversion calculation to obtain the converted weights. The converted activations and weights are matrix-multiplied to obtain the partial sum before Winograd conversion. The calculation mode of the partial-sum conversion is then determined according to the Winograd convolution size, and the Winograd conversion calculation is performed on the partial sum. Finally, the effective output link is determined according to the Winograd dimension mode: a 1D result is output through the 1D link, a 2D result through the 2D link, and a 3D result through the 3D link.
In summary, compared with the prior art, the method of the embodiment has the following beneficial effects:
(1) Most prior-art neural network accelerator designs target only a network of one specific dimensionality and lack compatibility with, and flexibility for, multi-dimensional convolutional neural networks. This embodiment supports convolutional neural networks of different dimensionalities by combining Winograd calculation formulas of different dimensions, benefiting from a novel cross-dimension calculation flow.
(2) Prior-art Winograd computing modules are designed one-sidedly: they support only one Winograd size, or cannot switch flexibly within the calculation of a given neural-network layer, which wastes computing resources. In contrast, by combining Winograd formulas of different sizes, this embodiment flexibly switches the Winograd calculation size within the calculation of a single layer, reducing the number of zero-padding operations in that layer and improving calculation efficiency.
(3) The conversion module circuits consist of shifters, adders and inverters, without multipliers, reducing the area overhead of the hardware implementation.
The embodiment further provides a Winograd processing system compatible with the multidimensional convolutional neural network, which includes:
the dimensionality judgment module is used for judging according to network dimensionality, determining a Winograd dimensionality calculation mode and entering a Winograd conversion step;
the activation calculation module is used for inputting the activation of the neural network into the Winograd activation conversion module for conversion calculation in the Winograd conversion step;
the weight calculation module is used for inputting the weight of the neural network into the weight conversion module to carry out Winograd conversion calculation;
the partial-sum calculation module is used for performing matrix multiplication on the converted activations and weights after the Winograd conversions are complete, to obtain the partial sum before Winograd conversion;
the partial-sum conversion module is used for determining the calculation mode of the partial-sum conversion according to the Winograd convolution size, and performing the Winograd conversion calculation on the partial sum;
and the output module is used for determining a final effective output link according to different Winograd dimension modes.
The Winograd processing system compatible with the multidimensional convolutional neural network can execute the Winograd processing method compatible with the multidimensional convolutional neural network provided by the method embodiment of the invention, can execute any combination implementation steps of the method embodiment, and has corresponding functions and beneficial effects of the method.
The embodiment further provides a Winograd processing system compatible with the multidimensional convolutional neural network, which includes:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of fig. 1.
The Winograd processing system compatible with the multidimensional convolutional neural network can execute the Winograd processing method compatible with the multidimensional convolutional neural network provided by the method embodiment of the invention, can execute any combination implementation steps of the method embodiment, and has corresponding functions and beneficial effects of the method.
The embodiment of the application also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.
The embodiment also provides a storage medium, which stores an instruction or a program capable of executing the Winograd processing method compatible with the multidimensional convolutional neural network provided by the embodiment of the method of the present invention, and when the instruction or the program is executed, the arbitrary combination implementation steps of the embodiment of the method can be executed, and the method has corresponding functions and advantages of the method.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims (9)

1. A Winograd processing method compatible with a multi-dimensional convolutional neural network, characterized by comprising the following steps:
determining a Winograd dimension calculation mode according to the dimensionality of the network, and entering the Winograd conversion step;
in the Winograd conversion step, inputting the activations of the neural network into a Winograd activation conversion module for conversion calculation;
inputting the weights of the neural network into a weight conversion module for Winograd conversion calculation;
performing matrix multiplication on the activations and weights after the Winograd conversion calculation is completed, to obtain the partial sums before the Winograd partial-sum conversion; determining the calculation mode of the partial-sum conversion according to the Winograd convolution size, and performing the Winograd conversion calculation on the partial sums;
and determining the final valid output link according to the Winograd dimension mode.
2. The Winograd processing method compatible with the multi-dimensional convolutional neural network according to claim 1, wherein determining a Winograd dimension calculation mode according to the dimensionality of the network comprises:
dividing the size of a 3D convolutional neural network into three dimension directions (row, column and frame), dividing a 2D neural network into two dimension directions (row and column), a 1D neural network having only one dimension direction;
and for network data of different dimensions, applying different Winograd formulas to convert and compute the activations, weights and partial sums.
3. The Winograd processing method compatible with the multi-dimensional convolutional neural network according to claim 2, wherein the Winograd calculation formula for the 1D neural network is:
y = G[(Bx) ⊙ (Aw)]
the Winograd calculation formula for the 2D neural network is:
y = G[(BxB^T) ⊙ (AwA^T)]G^T
and the Winograd calculation formula for the 3D neural network is:
y = (G[((BxB^T)RB^T) ⊙ ((AwA^T)RA^T)]G^T)RG^T
wherein G denotes the Winograd partial-sum conversion matrix, B the Winograd activation conversion matrix, A the Winograd weight conversion matrix, x the activation, w the weight, T the transpose, and R rotation.
4. The Winograd processing method compatible with the multi-dimensional convolutional neural network according to claim 3, wherein the weight conversion module is composed of adders, inverters, shifters and selectors.
5. The Winograd processing method compatible with the multi-dimensional convolutional neural network according to claim 3, wherein the partial-sum conversion module is composed of adders, inverters and selectors.
6. The Winograd processing method compatible with the multi-dimensional convolutional neural network according to claim 1, wherein determining the final valid output link according to the Winograd dimension mode comprises:
selecting the corresponding valid output link as the final result according to the network dimensionality:
for a 3D convolutional neural network, selecting the 3D output link;
for a 2D convolutional neural network, selecting the 2D output link;
and for a 1D convolutional neural network, selecting the 1D output link.
7. A Winograd processing system compatible with a multi-dimensional convolutional neural network, characterized by comprising:
a dimension judgment module for determining a Winograd dimension calculation mode according to the dimensionality of the network and entering the Winograd conversion step;
an activation calculation module for inputting the activations of the neural network into the Winograd activation conversion module for conversion calculation in the Winograd conversion step;
a weight calculation module for inputting the weights of the neural network into the weight conversion module for Winograd conversion calculation;
a partial-sum calculation module for performing matrix multiplication on the activations and weights after the Winograd conversion calculation is completed, to obtain the partial sums before the Winograd partial-sum conversion;
a partial-sum conversion module for determining the calculation mode of the partial-sum conversion according to the Winograd convolution size and performing the Winograd conversion calculation on the partial sums;
and an output module for determining the final valid output link according to the Winograd dimension mode.
8. A Winograd processing system compatible with a multi-dimensional convolutional neural network is characterized by comprising:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1 to 6.
9. A storage medium storing a processor-executable program, wherein the processor-executable program, when executed by a processor, is adapted to perform the method of any one of claims 1 to 6.
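The claimed 1D and 2D conversion formulas can be checked numerically. The sketch below is only an illustration under assumed concrete values: it uses the well-known F(2,3) and F(2×2, 3×3) Winograd transform matrices (the claims do not fix particular matrix values), renamed to match the claim notation, where B is the activation conversion matrix, A the weight conversion matrix, and G the partial-sum conversion matrix:

```python
import numpy as np

# Assumed concrete transform matrices: the standard F(2,3) Winograd
# matrices, renamed to the patent's notation (B = activation conversion,
# A = weight conversion, G = partial-sum conversion).
B = np.array([[1,  0, -1,  0],
              [0,  1,  1,  0],
              [0, -1,  1,  0],
              [0,  1,  0, -1]], float)          # 4x4 activation transform
A = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # 4x3 weight transform
G = np.array([[1, 1,  1,  0],
              [0, 1, -1, -1]], float)           # 2x4 partial-sum transform

# --- 1D: y = G[(Bx) ⊙ (Aw)] gives 2 outputs of a 3-tap convolution ---
x = np.array([1.0, 2.0, 3.0, 4.0])              # activation tile (length 4)
w = np.array([1.0, 1.0, 1.0])                   # kernel (length 3)
y1 = G @ ((B @ x) * (A @ w))
ref1 = np.array([x[i] + x[i + 1] + x[i + 2] for i in range(2)])

# --- 2D: y = G[(BxB^T) ⊙ (AwA^T)]G^T gives a 2x2 output tile ---
X = np.arange(16, dtype=float).reshape(4, 4)    # 4x4 activation tile
W = np.ones((3, 3))                             # 3x3 kernel
y2 = G @ ((B @ X @ B.T) * (A @ W @ A.T)) @ G.T
ref2 = np.array([[np.sum(X[i:i + 3, j:j + 3] * W) for j in range(2)]
                 for i in range(2)])            # direct 2D "valid" conv

print(np.allclose(y1, ref1), np.allclose(y2, ref2))  # prints: True True
```

The 3D formula of claim 3 follows the same pattern, with the rotation R permuting the frame axis into position so that the row/column transforms can be reused along the third dimension.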
CN202110641820.9A 2021-06-09 2021-06-09 Winograd processing method, system and medium compatible with multi-dimensional convolutional neural network Active CN113407904B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110641820.9A CN113407904B (en) 2021-06-09 2021-06-09 Winograd processing method, system and medium compatible with multi-dimensional convolutional neural network

Publications (2)

Publication Number Publication Date
CN113407904A true CN113407904A (en) 2021-09-17
CN113407904B CN113407904B (en) 2023-04-07

Family

ID=77683220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110641820.9A Active CN113407904B (en) 2021-06-09 2021-06-09 Winograd processing method, system and medium compatible with multi-dimensional convolutional neural network

Country Status (1)

Country Link
CN (1) CN113407904B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837923A (en) * 2021-09-26 2021-12-24 安徽寒武纪信息科技有限公司 Data processing device, data processing method and related product

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107993186A (en) * 2017-12-14 2018-05-04 中国人民解放军国防科技大学 3D CNN acceleration method and system based on Winograd algorithm
US20190130250A1 (en) * 2017-10-30 2019-05-02 Samsung Electronics Co., Ltd. Method and apparatus with neural network performing convolution
CN109767000A (en) * 2019-01-16 2019-05-17 厦门美图之家科技有限公司 Neural network convolution method and device based on Winograd algorithm
CN110288086A (en) * 2019-06-13 2019-09-27 天津大学 A kind of configurable convolution array accelerator structure based on Winograd
CN110334800A (en) * 2019-07-18 2019-10-15 南京风兴科技有限公司 A kind of lightweight 3D convolutional network system for video identification
US20200234124A1 (en) * 2019-01-23 2020-07-23 Samsung Electronics Co., Ltd. Winograd transform convolution operations for neural networks
CN111610963A (en) * 2020-06-24 2020-09-01 上海西井信息科技有限公司 Chip structure and multiply-add calculation engine thereof
CN112199636A (en) * 2020-10-15 2021-01-08 清华大学 Fast convolution method and device suitable for microprocessor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QIAO, Yang: "Research on Program Classification Technology Based on Deep Neural Networks", China Masters' Theses Full-text Database, Information Science and Technology Series (Monthly) *

Also Published As

Publication number Publication date
CN113407904B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN111667051B (en) Neural network accelerator applicable to edge equipment and neural network acceleration calculation method
CN106445471A (en) Processor and method for executing matrix multiplication on processor
US20080126758A1 (en) Digital signal processing apparatus and method for multiply-and-accumulate operation
CN102043760B (en) Data processing method and system
CN111915001A (en) Convolution calculation engine, artificial intelligence chip and data processing method
CN113407904B (en) Winograd processing method, system and medium compatible with multi-dimensional convolutional neural network
CN114003194A (en) Operation method and device based on multiplier and computer readable storage medium
CN112765540B (en) Data processing method and device and related products
JP4255475B2 (en) Data-driven information processing device
US6704834B1 (en) Memory with vectorial access
CN112784951A (en) Winograd convolution operation method and related product
CN116227599A (en) Inference model optimization method and device, electronic equipment and storage medium
CN104657335A (en) FFT (fast Fourier transform)-based data sampling method and FFT-based data sampling device
CN112801276B (en) Data processing method, processor and electronic equipment
CN114065123A (en) Sparse matrix calculation method and acceleration device
CN113536221B (en) Operation method, processor and related products
US20230169144A1 (en) Operation method, processor, and related product
CN114115804B (en) Multiplier conversion method, system, equipment and medium
CN113536219B (en) Operation method, processor and related products
CN113625994B (en) Data processing method and processing core
CN111353125B (en) Operation method, operation device, computer equipment and storage medium
CN112765539B (en) Computing device, computing method and related product
CN112132275A (en) Parallel computing method and device
CN116257464A (en) Cross-layer operation method and device for machine learning chip and computer equipment
CN115238877A (en) Data mapping method and data mapping device for improving utilization rate of storage array

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant