CN110633785B - Method and system for calculating convolutional neural network - Google Patents

Method and system for calculating convolutional neural network

Info

Publication number
CN110633785B
Authority
CN
China
Prior art keywords
convolution, optimal, strategy, convolutional, strategies
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810646058.1A
Other languages
Chinese (zh)
Other versions
CN110633785A (en)
Inventor
Zhang Guangyan (张广艳)
Li Xiaqing (李夏青)
Zheng Weimin (郑纬民)
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201810646058.1A
Publication of CN110633785A
Application granted
Publication of CN110633785B

Classifications

    • GPHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The invention provides a method and a system for calculating a convolutional neural network. The method comprises the following steps: acquiring the configuration parameters corresponding to all convolutional layers in a target convolutional neural network and, for any convolutional layer, searching a knowledge base for the optimal convolution strategy combination corresponding to that layer according to its configuration parameters; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolutional layer into a plurality of convolution stages, acquiring, for any one convolution stage, a plurality of candidate convolution strategies corresponding to that stage, and screening out the optimal convolution strategy corresponding to the stage from all the candidate strategies; and combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolutional layer, so that all convolutional layers in the target convolutional neural network perform convolution calculation according to their respective optimal convolution strategy combinations. The method and the system can improve the overall calculation efficiency and the overall performance of the convolutional neural network.

Description

Method and system for calculating convolutional neural network
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a system for calculating a convolutional neural network.
Background
Compared with traditional machine learning methods, convolutional neural networks can more effectively handle complex recognition problems, such as computer vision, semantic recognition, natural language processing, and other recognition tasks. A convolutional neural network (especially a deep convolutional neural network) has a complex structure and many parameters, so its internal computation amount is very large. Especially in the training phase, a high-precision convolutional neural network often needs to be trained on a large-scale data set, and each training run requires millions or even tens of millions of computational iterations, consuming a large amount of computing resources and running time.
General-purpose GPUs are efficient convolutional neural network accelerators, and a number of convolutional neural network frameworks and acceleration libraries have therefore been designed and developed, such as Caffe, cuDNN, cuda-convnet2, Torch, Theano, and fbfft. However, these GPU-based convolutional neural network implementations do not fully meet the computational requirements of convolutional neural networks. First, there are large performance differences between these implementations, and none of them runs fastest in all scenarios. These performance differences are mainly due to the different convolution strategies and GPU optimization techniques they employ. For example, cuda-convnet2 adopts a direct convolution strategy, which achieves good memory usage because it does not require additional storage space for intermediate computation results. However, because of its specially optimized memory layout, cuda-convnet2 computes efficiently only in specific scenarios and has low computational efficiency in others. This performance difference also exists in convolution strategies based on matrix multiplication, such as cuDNN. Although fbfft, which adopts a Fourier-transform-based convolution strategy, can remain efficient in many computational scenarios by reducing computational complexity and optimizing GPU memory, it is computationally inefficient in scenarios where the convolution kernel is small.
Furthermore, the usability of these implementations also differs greatly across computing scenarios. cuda-convnet2, based on the direct convolution strategy, can only operate over part of the parameter space; for example, it can only operate when the input pictures are square. fbfft, based on the Fourier transform, does not support strided convolution, that is, it only supports a convolution stride of 1; if the stride is greater than 1, the operation fails.
In summary, these typical implementations cannot effectively meet the computational requirements of convolutional neural networks: no single implementation can efficiently perform the computation of a convolutional neural network in all computing scenarios, and each implementation exhibits different performance in different scenarios. This results in low overall computational efficiency of the convolutional neural network and affects its overall performance to a certain extent.
Disclosure of Invention
The invention provides a method and a system for calculating a convolutional neural network, aiming at solving the prior-art problem that the overall calculation efficiency of a convolutional neural network is low because different convolution implementations perform differently in different scenarios.
In one aspect, the present invention provides a method for calculating a convolutional neural network, including:
acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer;
if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
and combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolutional neural network carry out convolution calculation according to their respective optimal convolution strategy combinations.
Preferably, the searching for the optimal convolution strategy combination corresponding to the convolutional layer in the knowledge base according to the configuration parameters corresponding to the convolutional layer further includes:
and if the optimal convolution strategy combination exists in the knowledge base, acquiring the optimal convolution strategy combination from the knowledge base so that the convolution layer carries out convolution calculation according to the optimal convolution strategy combination.
Preferably, the obtaining of the multiple candidate convolution strategies corresponding to the convolution stage specifically includes:
and selecting the existing convolution strategy for instantiation to obtain a plurality of candidate convolution strategies corresponding to the convolution stage.
Preferably, the selecting the best convolution strategy corresponding to the convolution stage from all the candidate convolution strategies specifically includes:
testing the availability of all the candidate convolution strategies and screening out effective convolution strategies from all the candidate convolution strategies;
and testing the running time corresponding to all the effective convolution strategies, and determining the effective convolution strategy corresponding to the shortest running time as the optimal convolution strategy corresponding to the convolution stage.
Preferably, the obtaining of the optimal convolution strategy combination corresponding to the convolution layer further includes:
and associating and storing the configuration parameters corresponding to the convolutional layer and the optimal convolutional strategy combination corresponding to the convolutional layer to the knowledge base.
Preferably, the selecting the best convolution strategy corresponding to the convolution stage from all the candidate convolution strategies further includes:
and calling the optimal convolution strategy corresponding to the convolution stage from the pre-packaged candidate convolution strategies by using the interface function corresponding to the convolution stage.
Preferably, the configuration parameters include a batch size, an input channel, the number of convolution kernels, a length and a width of a convolution window, a length and a width of an input picture, and a convolution step size.
In one aspect, the present invention provides a convolutional neural network computing system, comprising:
the convolution strategy searching module is used for acquiring configuration parameters corresponding to all convolution layers in the target convolution neural network, and searching the optimal convolution strategy combination corresponding to the convolution layer in the knowledge base according to the configuration parameters corresponding to the convolution layer for any convolution layer;
a convolution strategy screening module, configured to decompose the convolution layer into multiple convolution stages if the optimal convolution strategy combination does not exist in the knowledge base, acquire multiple candidate convolution strategies corresponding to the convolution stage for any one convolution stage, and screen out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
and the convolution calculation module is used for combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combination.
In one aspect, the present invention provides an electronic device comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, and the processor, when invoking the program instructions, is capable of performing any of the methods described above.
In one aspect, the invention provides a non-transitory computer readable storage medium storing computer instructions that cause a computer to perform any of the methods described above.
The invention provides a method and a system for calculating a convolutional neural network, in which the configuration parameters corresponding to all convolutional layers in a target convolutional neural network are obtained and, for any convolutional layer, a knowledge base is searched for the optimal convolution strategy combination corresponding to that layer according to its configuration parameters; if the optimal convolution strategy combination does not exist in the knowledge base, the convolutional layer is decomposed into a plurality of convolution stages, a plurality of candidate convolution strategies corresponding to each convolution stage are acquired, and the optimal convolution strategy corresponding to each stage is screened out from all the candidates; and the optimal convolution strategies corresponding to all the convolution stages are combined to obtain the optimal convolution strategy combination corresponding to the convolutional layer, so that all convolutional layers in the target convolutional neural network perform convolution calculation according to their respective optimal convolution strategy combinations.
The method and the system screen out the corresponding optimal convolution strategy for each convolution stage in each convolutional layer of the convolutional neural network, so that each convolution stage can carry out convolution calculation according to its optimal strategy. This effectively improves the calculation efficiency of each convolution stage and thus the overall calculation efficiency of the convolutional neural network, improves the overall performance of the network in a fine-grained manner, enables the convolutional neural network to compute efficiently in any scenario, and overcomes the performance differences that different convolution implementations in the prior art exhibit in different scenarios.
Drawings
Fig. 1 is a schematic overall flow chart of a calculation method of a convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a schematic overall flow chart of a convolutional neural network computing system according to an embodiment of the present invention;
fig. 3 is a schematic structural framework diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The convolutional neural network is a network model with a multilayer structure comprising an input layer, convolutional layers, pooling layers, an output layer, and the like, where the convolutional layer is the hotspot computation layer of the network, that is, most of the computation of the convolutional neural network lies in the convolution calculation of the convolutional layers. In view of this, in order to improve the overall calculation efficiency of the convolutional neural network and thereby its overall performance, the present invention mainly optimizes the convolution calculation of the convolutional neural network and provides a calculation method that can effectively improve the network's convolution calculation efficiency, which in turn benefits its overall calculation efficiency. The specific implementation is as follows:
fig. 1 is a schematic overall flow chart of a method for calculating a convolutional neural network according to an embodiment of the present invention, and as shown in fig. 1, the present invention provides a method for calculating a convolutional neural network, including:
s1, acquiring configuration parameters corresponding to all convolutional layers in the target convolutional neural network, and searching the optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layer for any convolutional layer;
specifically, the target convolutional neural network is subjected to structure splitting, the target neural network is split into a plurality of base layers, convolutional layers are positioned from the base layers, and all convolutional layers of the target convolutional neural network can be obtained. Because any convolutional neural network sets corresponding configuration parameters for each convolutional layer during construction, the configuration parameters are often stored in a configuration file form, that is, each convolutional neural network corresponds to one configuration file. In view of this, in this embodiment, after obtaining all convolutional layers in the target convolutional neural network, and then obtaining the configuration file corresponding to the target convolutional neural network, the configuration parameters corresponding to all convolutional layers may be obtained from the configuration file corresponding to the target convolutional neural network. The configuration parameters include batch size, the number of convolution kernels, the size of a convolution window, the input size, the convolution step size and the like.
When a convolutional layer carries out convolution calculation, a plurality of corresponding convolution stages need to be provided with corresponding convolution strategies, and the convolution strategies corresponding to all the convolution stages are combined to form the convolution strategy combination corresponding to the convolutional layer. Because the configuration parameters corresponding to different convolutional layers are different, and the configuration parameters of the convolutional layers represent the attributes of the convolutional layers, the optimal convolutional strategy combinations applicable to the convolutional layers with different configuration parameters during convolutional calculation are different, the calculation time required by the convolutional layers for performing convolutional calculation according to the optimal convolutional strategy combinations is shortest, and the corresponding calculation efficiency is highest.
Based on the above technical solution, in this embodiment, after all convolutional layers in the target convolutional neural network and the configuration parameters corresponding to all convolutional layers are obtained, for any convolutional layer, first, an optimal convolution strategy combination corresponding to the convolutional layer is searched in a knowledge base according to the configuration parameters corresponding to the convolutional layer, and a calculation time required by the convolutional layer to perform convolution calculation according to the optimal convolution strategy combination is shortest. It should be noted that, through history learning and accumulation, the knowledge base stores the optimum convolution strategy combination applied to the convolution layers with different configuration parameters.
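The knowledge-base lookup described above can be sketched as a plain mapping from a layer's configuration parameters to its previously learned optimal strategy combination; the function names and data shapes here are assumptions for the sketch, not the patent's interface.

```python
# Illustrative knowledge base: configuration parameters -> optimal strategy
# combination, accumulated through history learning (steps S2/S3).
knowledge_base = {}

def lookup(config):
    """Return the stored optimal strategy combination, or None on a miss."""
    return knowledge_base.get(config)

def store(config, combination):
    """Associate a configuration with its optimal strategy combination."""
    knowledge_base[config] = combination

config = (128, 3, 64, 3, 3, 224, 224, 1)    # one layer's parameters as a key
assert lookup(config) is None               # miss: fall through to screening
store(config, [("P0", "ws0"), ("P1", "ws1"), ("P2", "ws2")])
assert lookup(config) == [("P0", "ws0"), ("P1", "ws1"), ("P2", "ws2")]
```

On a hit, the combination is reused directly; on a miss, the per-stage screening of step S2 runs and its result is stored for future lookups.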
S2, if the optimum convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimum convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
specifically, on the basis of the above technical solution, if the optimal convolution strategy combination corresponding to the convolution layer is not found in the knowledge base, it can be determined that the optimal convolution strategy combination corresponding to the convolution layer does not exist in the knowledge base. In view of this, in the embodiment, the convolutional layer is decomposed into a plurality of convolution stages, which mainly includes three convolution stages, i.e., a forward output calculation stage, a backward input gradient calculation stage, and a backward weight gradient calculation stage. On the basis, for any convolution stage, the existing convolution strategy is instantiated, and a plurality of candidate convolution strategies corresponding to the convolution stage are obtained. And then testing all candidate convolution strategies, specifically, the convolution stage can perform convolution calculation according to each candidate convolution strategy to test the corresponding running time of each candidate convolution strategy, and determine the convolution strategy corresponding to the shortest running time as the optimal convolution strategy corresponding to the convolution layer. Therefore, the optimal convolution strategy corresponding to the convolution stage can be screened from all candidate convolution strategies.
And S3, combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolutional neural network are subjected to convolution calculation according to the respective corresponding optimal convolution strategy combination.
Specifically, on the basis of the above technical solution, after the optimal convolution strategy corresponding to each convolution stage is obtained, the optimal convolution strategies corresponding to all convolution stages are combined to obtain the optimal convolution strategy combination corresponding to the convolutional layer. Taking a convolutional layer comprising three convolution stages as an example, the finally obtained optimal convolution strategy combination S*_i can be represented as S*_i = {(P0, ws0), (P1, ws1), (P2, ws2)}, where P0, P1, P2 denote the three convolution stages, and ws0, ws1, ws2 denote the optimal convolution strategies of the three stages, respectively.
After the optimal convolution strategy combination corresponding to the convolutional layer is obtained, the layer can perform convolution calculation according to it. For example, if a convolutional layer corresponds to the optimal convolution strategy combination S*_i = {(P0, ws0), (P1, ws1), (P2, ws2)}, the three convolution stages P0, P1, P2 in that layer perform convolution calculation according to the three optimal strategies ws0, ws1, ws2, respectively. Because the time each convolution stage needs to perform convolution calculation according to its optimal strategy is shortest, the time the convolutional layer needs to perform convolution calculation according to the optimal strategy combination is also shortest, which can effectively improve the layer's convolution calculation efficiency.
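Dispatching each stage P_k to its strategy ws_k can be sketched as below; the strategy implementations are toy arithmetic functions standing in for real convolution kernels, and every name is illustrative.

```python
# Illustrative only: run a layer stage by stage according to S*_i.
combination = [("P0", "ws0"), ("P1", "ws1"), ("P2", "ws2")]

impls = {
    "ws0": lambda x: x + 1,   # stand-in for the forward output stage
    "ws1": lambda x: x * 2,   # stand-in for the backward input-gradient stage
    "ws2": lambda x: x - 3,   # stand-in for the backward weight-gradient stage
}

def run_layer(combination, impls, data):
    """Run each stage with its screened-out optimal strategy."""
    for _stage, strategy in combination:
        data = impls[strategy](data)
    return data

assert run_layer(combination, impls, 10) == 19   # (10 + 1) * 2 - 3
```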
On the basis of the technical scheme, corresponding optimal convolution strategy combinations can be obtained for all convolution layers in the target convolution neural network, and the convolution layers carry out convolution calculation according to the corresponding optimal convolution strategy combinations, so that the overall calculation efficiency of the target convolution neural network can be effectively improved, and the overall performance of the target convolution neural network can be improved.
The invention provides a calculation method of a convolutional neural network, which comprises the steps of obtaining configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolution neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combination. 
The method screens out the corresponding optimal convolution strategy for each convolution stage in each convolutional layer of the convolutional neural network, so that each convolution stage can carry out convolution calculation according to its optimal strategy. This effectively improves the calculation efficiency of each convolution stage and thus the overall calculation efficiency of the convolutional neural network, improves the overall performance of the network in a fine-grained manner, enables the convolutional neural network to compute efficiently in any scenario, and overcomes the performance differences that different convolution implementations in the prior art exhibit in different scenarios.
Based on any of the above embodiments, there is provided a method for calculating a convolutional neural network, which searches for an optimal convolutional policy combination corresponding to the convolutional layer in a knowledge base according to a configuration parameter corresponding to the convolutional layer, and then further includes: and if the optimal convolution strategy combination exists in the knowledge base, acquiring the optimal convolution strategy combination from the knowledge base so that the convolution layer carries out convolution calculation according to the optimal convolution strategy combination.
Specifically, after all convolutional layers in the target convolutional neural network and configuration parameters corresponding to all convolutional layers are obtained, for any convolutional layer, firstly, an optimal convolutional strategy combination corresponding to the convolutional layer is searched in a knowledge base according to the configuration parameters corresponding to the convolutional layer. It should be noted that, through history learning and accumulation, the knowledge base stores the optimum convolution strategy combination applied to the convolution layers with different configuration parameters. On the basis, after the search is carried out, if the optimal convolution strategy combination exists in the knowledge base, the optimal convolution strategy combination is obtained from the knowledge base. After the optimal convolution strategy combination corresponding to the convolution layer is obtained, the convolution layer can carry out convolution calculation according to the optimal convolution strategy combination, and because the optimal convolution strategy combination comprises the optimal convolution strategy corresponding to each convolution stage, and the time required for carrying out convolution calculation according to the optimal convolution strategy in each convolution stage is shortest, the time for carrying out convolution calculation according to the optimal convolution strategy combination by the convolution layer is also shortest, and the convolution calculation efficiency of the convolution layer can be effectively improved.
According to the calculation method of the convolutional neural network, after the optimal convolution strategy combination corresponding to the convolutional layer is searched in the knowledge base according to the configuration parameters corresponding to the convolutional layer, if the optimal convolution strategy combination exists in the knowledge base, the optimal convolution strategy combination is obtained from the knowledge base, so that the convolutional layer carries out convolution calculation according to the optimal convolution strategy combination. According to the method, the optimal convolution strategy combination corresponding to each convolution layer can be rapidly obtained through the knowledge base, and meanwhile, each convolution layer can carry out convolution calculation according to the optimal convolution strategy combination, so that the calculation efficiency of a convolution stage is effectively improved, and further, the overall calculation efficiency of the convolutional neural network is effectively improved.
Based on any of the above embodiments, a method for calculating a convolutional neural network is provided, and a plurality of candidate convolution strategies corresponding to the convolution stage are obtained, specifically: and selecting the existing convolution strategy for instantiation to obtain a plurality of candidate convolution strategies corresponding to the convolution stage.
Specifically, in this embodiment, for any convolution stage, existing convolution strategies are selected and instantiated to obtain a plurality of candidate convolution strategies corresponding to that stage. The existing convolution strategies are the existing typical convolution strategies, namely cuda-convnet2 (direct convolution), torch-cunn (convolution based on matrix multiplication), cuDNN (convolution based on matrix multiplication), and fbfft (convolution based on the Fourier transform). On this basis, in this embodiment, four candidate convolution strategies are set for each convolution stage. In other embodiments, the number and the types of the candidate convolution strategies may be set according to actual requirements and are not specifically limited here.
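The instantiation step can be sketched as a small registry over the four typical implementations named above; the registry values are plain placeholders, not real library bindings.

```python
# Sketch of candidate instantiation: one candidate per existing typical
# convolution strategy, per convolution stage.
EXISTING_STRATEGIES = ["cuda-convnet2", "torch-cunn", "cuDNN", "fbfft"]

def instantiate_candidates(stage):
    """Instantiate every existing strategy as a candidate for this stage."""
    return {name: {"strategy": name, "stage": stage}
            for name in EXISTING_STRATEGIES}

candidates = instantiate_candidates("forward_output")
assert len(candidates) == 4
assert candidates["cuDNN"]["stage"] == "forward_output"
```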
According to this calculation method for a convolutional neural network, for any convolution stage, existing convolution strategies are selected and instantiated to obtain a plurality of candidate convolution strategies corresponding to that stage, which facilitates screening the optimal convolution strategy for the stage from all the candidate strategies.
Based on any of the above embodiments, a method for calculating a convolutional neural network is provided, which screens out an optimal convolution strategy corresponding to the convolution stage from all candidate convolution strategies, and specifically includes: testing the availability of all candidate convolution strategies, and screening out effective convolution strategies from all candidate convolution strategies; and testing the running time corresponding to all the effective convolution strategies, and determining the effective convolution strategy corresponding to the shortest running time as the optimal convolution strategy corresponding to the convolution stage.
Specifically, for any convolution stage, after the plurality of candidate convolution strategies corresponding to the stage are obtained, all of them need to be tested. In this embodiment, the availability of all candidate convolution strategies is tested first: the convolution stage may be run according to each candidate strategy, and if an exception occurs during the run, the corresponding candidate strategy is determined to be invalid. Invalid strategies are thereby excluded, and all effective convolution strategies are screened out from the candidates.
On this basis, all the effective convolution strategies are tested further: the convolution stage may be run according to each effective strategy, and the corresponding running time is measured, where the running time is the time required to perform the convolution calculation of the stage according to that strategy. After the running time of each effective convolution strategy is obtained, the shortest running time is selected, and the effective strategy corresponding to it is determined to be the optimal convolution strategy for the convolution stage.
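The two-step screening just described — an availability test followed by a running-time test — can be sketched as follows (a minimal Python sketch; `run_stage` and the function name are illustrative assumptions, not the patent's implementation):

```python
import time

def screen_optimal_strategy(run_stage, candidates):
    """Test availability, then running time, of each candidate strategy
    for one convolution stage; return the fastest effective strategy."""
    timings = {}
    for strategy in candidates:
        try:
            start = time.perf_counter()
            run_stage(strategy)                              # availability test
            timings[strategy] = time.perf_counter() - start  # running time
        except Exception:
            pass            # an abnormal run marks the strategy as invalid
    if not timings:
        raise RuntimeError("no effective convolution strategy for this stage")
    # the effective strategy with the shortest running time is optimal
    return min(timings, key=timings.get)
```

For example, a candidate that raises during the run is excluded, and among the remaining effective candidates the one with the shortest measured time is selected.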
The invention provides a calculation method for a convolutional neural network in which the availability of all candidate convolution strategies is tested and the effective convolution strategies are screened out from them; the running times of all the effective strategies are then measured, and the effective strategy with the shortest running time is determined to be the optimal convolution strategy for the convolution stage. The method thus not only screens out the optimal convolution strategy for each convolution stage but also excludes invalid strategies, effectively ensuring both the availability and the efficiency of the optimal convolution strategy.
Based on any of the above embodiments, a method for calculating a convolutional neural network is provided, where an optimal convolution strategy combination corresponding to the convolutional layer is obtained, and then the method further includes: and associating and storing the configuration parameters corresponding to the convolutional layer and the optimal convolutional strategy combination corresponding to the convolutional layer to a knowledge base.
Specifically, after the optimal convolution strategy combination corresponding to each convolutional layer is obtained, the configuration parameters of the layer and its optimal convolution strategy combination are stored in the knowledge base in association, so that for subsequently encountered convolutional layers with the same configuration the optimal combination can be looked up directly in the knowledge base by configuration parameters, effectively avoiding repeated testing of identically configured layers.
According to this calculation method for a convolutional neural network, after the optimal convolution strategy combination corresponding to a convolutional layer is obtained, the configuration parameters of the layer and the optimal combination are stored in the knowledge base in association, so that for a convolutional layer with the same configuration the optimal combination can be looked up directly from the knowledge base according to its configuration parameters, effectively avoiding repeated testing of identically configured layers.
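The associated storage and lookup of configuration parameters and strategy combinations can be sketched as a simple mapping (a minimal illustration; a real knowledge base could equally be a file or database, and the function names are assumptions):

```python
knowledge_base = {}  # configuration-parameter tuple -> optimal strategy combination

def store_combination(config, combination):
    """Associate a layer's configuration parameters with its optimal
    convolution strategy combination."""
    knowledge_base[tuple(config)] = combination

def search_combination(config):
    """Return the stored combination for this configuration, or None on a miss."""
    return knowledge_base.get(tuple(config))
```

A later layer with the same eight configuration parameters then resolves in one lookup instead of a full round of availability and timing tests.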
Based on any of the above embodiments, a method for calculating a convolutional neural network is provided, in which, after the optimal convolution strategy corresponding to a convolution stage is screened out from all the candidate convolution strategies, the method further includes: calling the optimal convolution strategy corresponding to the convolution stage from the pre-packaged candidate convolution strategies by using the interface function corresponding to the convolution stage.
Specifically, in this embodiment, a uniform interface is designed for the calculation of the convolutional layers of the target convolutional neural network; within this interface, an interface function is designed for each convolution stage, and a plurality of candidate convolution strategies are pre-packaged in each interface function. Accordingly, for any convolution stage, after the optimal convolution strategy corresponding to the stage is obtained, it is invoked from the pre-packaged candidate strategies through the interface function corresponding to the stage. That is, the candidate convolution strategies corresponding to the same convolution stage in the target convolutional neural network are all encapsulated in the same interface function, so that when a convolution stage performs convolution calculation, the corresponding optimal strategy can be selected from the corresponding interface function.
For example, suppose the target convolutional neural network includes 6 convolutional layers, each comprising 3 convolution stages: a forward output calculation stage, a backward input gradient calculation stage, and a backward weight gradient calculation stage. On this basis, the candidate convolution strategies corresponding to the forward output calculation stage of the 6 convolutional layers can be pre-packaged in one interface function; those corresponding to the backward input gradient calculation stage in a second interface function; and those corresponding to the backward weight gradient calculation stage in a third interface function. Thus, after the optimal convolution strategy corresponding to each convolution stage is obtained, it is called from the pre-packaged candidate strategies by using the interface function corresponding to that stage.
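An interface function that encapsulates all candidate strategies of one stage and dispatches by strategy name can be sketched as follows (a hedged illustration; `make_stage_interface` and the lambda implementations are hypothetical stand-ins for real convolution kernels):

```python
def make_stage_interface(candidates):
    """Encapsulate all candidate strategies of one convolution stage behind
    a single interface function; the caller selects a strategy by name."""
    registry = dict(candidates)  # strategy name -> implementation callable
    def interface(strategy_name, *args, **kwargs):
        return registry[strategy_name](*args, **kwargs)
    return interface

# e.g. one interface for the forward output stage of all layers
forward_output = make_stage_interface({
    "direct": lambda x: ("direct", x),
    "fft":    lambda x: ("fft", x),
})
```

Switching a stage between strategies is then just a change of the name passed to its interface function, which is the switching behaviour the text describes.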
According to this calculation method for a convolutional neural network, after the optimal convolution strategy corresponding to a convolution stage is screened out from all the candidate strategies, it is called from the pre-packaged candidate strategies through the interface function corresponding to the stage. This helps each convolution stage select its optimal strategy from the corresponding interface function when performing convolution calculation, and enables switching between different convolution strategies.
Based on any one of the embodiments, a method for calculating a convolutional neural network is provided, where the configuration parameters include batch size, input channel, number of convolution kernels, length and width of convolution window, length and width of input picture, and convolution step size.
Specifically, in the present embodiment, the configuration parameters corresponding to each convolutional layer in the target convolutional neural network comprise 8 parameters: the batch size, the number of input channels, the number of convolution kernels, the length and width of the convolution window, the length and width of the input picture, and the convolution step size, denoted B, N_i, N_o, H, W, K_w, K_h, and S respectively. Their value ranges are: B, 1-512; N_i, 1-1000; N_o, 1-1000; H, 1-512; W, 1-512; K_w, 1-50; K_h, 1-50; and S, 1-50.
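The eight configuration parameters and their stated ranges can be captured in a small value type (a sketch only; the class name and the range validation are illustrative, not part of the patent):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConvLayerConfig:
    """The 8 configuration parameters of a convolutional layer,
    with the value ranges stated in the text."""
    B: int   # batch size, 1-512
    Ni: int  # number of input channels, 1-1000
    No: int  # number of convolution kernels, 1-1000
    H: int   # input picture height, 1-512
    W: int   # input picture width, 1-512
    Kw: int  # convolution window width, 1-50
    Kh: int  # convolution window height, 1-50
    S: int   # convolution step size, 1-50

    def __post_init__(self):
        limits = {"B": 512, "Ni": 1000, "No": 1000, "H": 512,
                  "W": 512, "Kw": 50, "Kh": 50, "S": 50}
        for name, hi in limits.items():
            v = getattr(self, name)
            if not 1 <= v <= hi:
                raise ValueError(f"{name}={v} outside the range 1-{hi}")
```

Because the instance is frozen and hashable, it can double as the knowledge-base key for the layer.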
According to this calculation method for a convolutional neural network, the configuration parameters corresponding to each convolutional layer comprise the batch size, the number of input channels, the number of convolution kernels, the length and width of the convolution window, the length and width of the input picture, and the convolution step size, so that the optimal convolution strategy combination for a convolutional layer is determined according to its configuration parameters.
Fig. 2 is a schematic overall flow chart of a convolutional neural network computing system according to an embodiment of the present invention, and as shown in fig. 2, based on any one of the above method embodiments, a convolutional neural network computing system is provided, including:
the convolution strategy searching module 1 is used for acquiring configuration parameters corresponding to all convolution layers in a target convolution neural network, and searching an optimal convolution strategy combination corresponding to the convolution layer in a knowledge base according to the configuration parameters corresponding to the convolution layer for any convolution layer;
the convolution strategy screening module 2 is used for decomposing the convolution layer into a plurality of convolution stages if the optimal convolution strategy combination does not exist in the knowledge base, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
and the convolution calculation module 3 is configured to combine the optimal convolution strategies corresponding to all convolution stages to obtain an optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolutional neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combinations.
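The cooperation of the three modules — knowledge-base search, per-stage screening, and combination — can be sketched end to end as follows (a minimal Python sketch; the function names and the pluggable `screen_stage` callback are assumptions for illustration):

```python
def optimal_combination_for_layer(config, knowledge_base, stages, screen_stage):
    """Per-layer flow mirroring modules 1-3: search the knowledge base by
    configuration parameters; on a miss, screen each convolution stage,
    combine the per-stage optima, and cache the result for identical layers."""
    key = tuple(config)
    if key in knowledge_base:                                # module 1: search
        return knowledge_base[key]
    combination = tuple(screen_stage(s) for s in stages)     # module 2: screen
    knowledge_base[key] = combination                        # store for reuse
    return combination                                       # module 3: combined result
```

A second layer with the same configuration hits the cache, so the (expensive) screening callback is never invoked again for it.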
The invention provides a calculation system of a convolutional neural network, which comprises a convolutional strategy searching module 1, a convolutional strategy screening module 2 and a convolutional calculation module 3, wherein the method in any embodiment is realized through the cooperation of the modules, and specific implementation steps can be referred to the method embodiment, which is not described herein again.
The invention provides a calculation system of a convolutional neural network, which is characterized in that configuration parameters corresponding to all convolutional layers in a target convolutional neural network are obtained, and for any convolutional layer, an optimal convolutional strategy combination corresponding to the convolutional layer is searched in a knowledge base according to the configuration parameters corresponding to the convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolution neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combination. 
The system screens out the corresponding optimal convolution strategy for each convolution stage in each convolutional layer of the convolutional neural network, so that every convolution stage can perform convolution calculation according to its optimal strategy. This effectively improves the calculation efficiency of each convolution stage and, in turn, the overall calculation efficiency of the convolutional neural network; the overall performance of the network is improved in a fine-grained manner, efficient calculation is achieved in any scenario, and the problem in the prior art that different convolution implementations perform differently in different scenarios is resolved.
Fig. 3 shows a block diagram of an electronic device according to an embodiment of the present invention. Referring to fig. 3, the electronic device includes: a processor 31, a memory 32, and a bus 33; the processor 31 and the memory 32 communicate with each other through the bus 33, and the processor 31 is configured to call program instructions in the memory 32 to perform the methods provided by the above-mentioned method embodiments, for example, including: acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolution neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combination.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above-mentioned method embodiments, for example, comprising: acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network are subjected to convolution calculation according to the respective corresponding optimal convolution strategy combination.
The present embodiments provide a non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform the methods provided by the above method embodiments, for example, including: acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network are subjected to convolution calculation according to the respective corresponding optimal convolution strategy combination.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The above-described embodiments of the electronic device and the like are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may also be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that the above embodiments are only preferred embodiments of the present application and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.

Claims (9)

1. A method of computing a convolutional neural network, comprising:
acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer;
if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network carry out convolution calculation according to the respective optimal convolution strategy combination;
the configuration parameters comprise batch size, input channel, number of convolution kernels, length and width of a convolution window, length and width of an input picture and convolution step size.
2. The method of claim 1, wherein the searching for the optimal convolution strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layer further comprises:
and if the optimal convolution strategy combination exists in the knowledge base, acquiring the optimal convolution strategy combination from the knowledge base so that the convolution layer carries out convolution calculation according to the optimal convolution strategy combination.
3. The method according to claim 1, wherein the obtaining of the candidate convolution policies corresponding to the convolution stage specifically includes:
and selecting the existing convolution strategy for instantiation to obtain a plurality of candidate convolution strategies corresponding to the convolution stage.
4. The method according to claim 1, wherein the screening out of the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies is specifically:
testing the availability of all the candidate convolution strategies and screening out effective convolution strategies from all the candidate convolution strategies;
and testing the running time corresponding to all the effective convolution strategies, and determining the effective convolution strategy corresponding to the shortest running time as the optimal convolution strategy corresponding to the convolution stage.
5. The method of claim 1, wherein obtaining the optimal convolution strategy combination for the convolution layer further comprises:
and associating and storing the configuration parameters corresponding to the convolutional layer and the optimal convolutional strategy combination corresponding to the convolutional layer to the knowledge base.
6. The method of claim 1, wherein the step of screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies further comprises:
and calling the optimal convolution strategy corresponding to the convolution stage from the pre-packaged candidate convolution strategies by using the interface function corresponding to the convolution stage.
7. A computing system of a convolutional neural network, comprising:
the convolution strategy searching module is used for acquiring configuration parameters corresponding to all convolution layers in the target convolution neural network, and searching the optimal convolution strategy combination corresponding to the convolution layer in the knowledge base according to the configuration parameters corresponding to the convolution layer for any convolution layer;
a convolution strategy screening module, configured to decompose the convolution layer into multiple convolution stages if the optimal convolution strategy combination does not exist in the knowledge base, acquire multiple candidate convolution strategies corresponding to the convolution stage for any one convolution stage, and screen out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
the convolution calculation module is used for combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network carry out convolution calculation according to the respective corresponding optimal convolution strategy combination;
the configuration parameters comprise batch size, input channel, number of convolution kernels, length and width of a convolution window, length and width of an input picture and convolution step size.
8. An electronic device, comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 6.
9. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1 to 6.
CN201810646058.1A 2018-06-21 2018-06-21 Method and system for calculating convolutional neural network Active CN110633785B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810646058.1A CN110633785B (en) 2018-06-21 2018-06-21 Method and system for calculating convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810646058.1A CN110633785B (en) 2018-06-21 2018-06-21 Method and system for calculating convolutional neural network

Publications (2)

Publication Number Publication Date
CN110633785A CN110633785A (en) 2019-12-31
CN110633785B true CN110633785B (en) 2021-01-05

Family

ID=68967553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810646058.1A Active CN110633785B (en) 2018-06-21 2018-06-21 Method and system for calculating convolutional neural network

Country Status (1)

Country Link
CN (1) CN110633785B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311599B (en) * 2020-01-17 2024-03-26 北京达佳互联信息技术有限公司 Image processing method, device, electronic equipment and storage medium
CN111260036B (en) 2020-01-19 2023-01-10 苏州浪潮智能科技有限公司 Neural network acceleration method and device
CN113033422A (en) * 2021-03-29 2021-06-25 中科万勋智能科技(苏州)有限公司 Face detection method, system, equipment and storage medium based on edge calculation
CN113419931B (en) * 2021-05-24 2024-05-17 北京达佳互联信息技术有限公司 Performance index determining method and device for distributed machine learning system
CN114398040A (en) * 2021-12-24 2022-04-26 上海商汤科技开发有限公司 Neural network reasoning method, device, computer equipment and storage medium

Citations (5)

Publication number Priority date Publication date Assignee Title
CN105869117A (en) * 2016-03-28 2016-08-17 上海交通大学 Method for accelerating GPU directed at deep learning super-resolution technology
CN106326939A (en) * 2016-08-31 2017-01-11 深圳市诺比邻科技有限公司 Parameter optimization method and system of convolutional neural network
CN107341761A (en) * 2017-07-12 2017-11-10 成都品果科技有限公司 A kind of calculating of deep neural network performs method and system
CN107844828A (en) * 2017-12-18 2018-03-27 北京地平线信息技术有限公司 Convolutional calculation method and electronic equipment in neural network
CN108009634A (en) * 2017-12-21 2018-05-08 美的集团股份有限公司 A kind of optimization method of convolutional neural networks, device and computer-readable storage medium

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US10223333B2 (en) * 2014-08-29 2019-03-05 Nvidia Corporation Performing multi-convolution operations in a parallel processing system
US10621486B2 (en) * 2016-08-12 2020-04-14 Beijing Deephi Intelligent Technology Co., Ltd. Method for optimizing an artificial neural network (ANN)


Non-Patent Citations (2)

Title
Research on Optimization Algorithm of Convolution Neural Network in Speech Recognition; Liu Chang-zheng et al.; Journal of Harbin University of Science and Technology; 2016-03-31; pp. 1-5 *
Design and Implementation of an FPGA-based Image Convolution IP Core; Zhu Xueliang et al.; Microelectronics & Computer; 2011-06, No. 6; pp. 188-192 *

Also Published As

Publication number Publication date
CN110633785A (en) 2019-12-31

Similar Documents

Publication Publication Date Title
CN110633785B (en) Method and system for calculating convolutional neural network
CN110058922B (en) Method and device for extracting metadata of machine learning task
US20230008597A1 (en) Neural network model processing method and related device
EP3678068A1 (en) Distributed system for executing machine learning and method therefor
CN111047563B (en) Neural network construction method applied to medical ultrasonic image
CN113994350A (en) Generating parallel computing schemes for neural networks
CN113703775A (en) Compiling method, device, equipment and storage medium
CN112101525A (en) Method, device and system for designing neural network through NAS
WO2021011914A1 (en) Scheduling operations on a computation graph
EP3926546A2 (en) Neural network model splitting method, apparatus, computer device and storage medium
US20230394330A1 (en) A method and system for designing ai modeling processes based on graph algorithms
Ali Next-generation ETL Framework to Address the Challenges Posed by Big Data.
US11003960B2 (en) Efficient incident management in large scale computer systems
CN112306452A (en) Method, device and system for processing service data by merging and sorting algorithm
CN112990461A (en) Method and device for constructing neural network model, computer equipment and storage medium
CN112633516B (en) Performance prediction and machine learning compiling optimization method and device
KR102372869B1 (en) Matrix operator and matrix operation method for artificial neural network
CN113190352B (en) General CPU-oriented deep learning calculation acceleration method and system
CN111767204A (en) Overflow risk detection method, device and equipment
CN113762469B (en) Neural network structure searching method and system
CN117114087B (en) Fault prediction method, computer device, and readable storage medium
CN117196015A (en) Operator execution method, device, electronic equipment and storage medium
CN117786416B (en) Model training method, device, equipment, storage medium and product
US20230130747A1 (en) Computer-readable recording medium storing learning program, learning method, and information processing device
US20240061661A1 (en) Method, apparatus and device for optimizing compiler based on tensor data calculation inference

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant