CN110633785B - Method and system for calculating convolutional neural network - Google Patents

Method and system for calculating convolutional neural network

Info

Publication number
CN110633785B
Authority
CN
China
Prior art keywords
convolution, optimal, strategy, convolutional, strategies
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810646058.1A
Other languages
Chinese (zh)
Other versions
CN110633785A (en)
Inventor
Zhang Guangyan (张广艳)
Li Xiaqing (李夏青)
Zheng Weimin (郑纬民)
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201810646058.1A
Publication of CN110633785A
Application granted
Publication of CN110633785B

Classifications

    • GPHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The invention provides a method and a system for calculating a convolutional neural network. The method comprises the following steps: acquiring the configuration parameters corresponding to all convolutional layers in a target convolutional neural network and, for any convolutional layer, searching a knowledge base for the optimal convolution strategy combination corresponding to that layer according to its configuration parameters; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolutional layer into a plurality of convolution stages, acquiring, for any one convolution stage, a plurality of candidate convolution strategies corresponding to that stage, and screening out the optimal convolution strategy corresponding to the stage from all the candidate strategies; and combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolutional layer, so that all convolutional layers in the target convolutional neural network perform convolution calculation according to their respective optimal convolution strategy combinations. The method and the system can improve the overall calculation efficiency and the overall performance of the convolutional neural network.

Description

Method and system for calculating convolutional neural network
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a system for calculating a convolutional neural network.
Background
Compared with traditional machine learning methods, convolutional neural networks can more effectively handle complex recognition problems, such as computer vision, semantic recognition, natural language processing, and other recognition tasks. A convolutional neural network (especially a deep convolutional neural network) has a complex structure and many parameters, so its internal computation amount is very large. Especially in the training phase, a high-precision convolutional neural network often needs to be trained on a large-scale data set, and each training run requires millions or even tens of millions of computational iterations, consuming a large amount of computing resources and running time.
General-purpose GPUs are efficient convolutional neural network accelerators, and a number of convolutional neural network frameworks and acceleration libraries have therefore been designed and developed, such as Caffe, cuDNN, cuda-convnet2, Torch, Theano, and fbfft. However, these GPU-based convolutional neural network implementations do not fully meet the computational requirements of convolutional neural networks. First, there are large performance differences between these implementations, and none of them runs fastest in all scenarios. These performance differences are mainly due to the different convolution strategies and GPU optimization techniques they employ. For example, cuda-convnet2 adopts a direct convolution strategy, which achieves good memory usage because it does not require additional storage space for intermediate computation results. However, because of its specially optimized memory layout, cuda-convnet2 computes efficiently only in specific scenarios and has low computational efficiency in others. This performance difference also exists in convolution strategies based on matrix multiplication, such as cuDNN. Although fbfft, which adopts a Fourier-transform-based convolution strategy, can remain efficient in many computational scenarios by reducing computational complexity and optimizing GPU memory, it is computationally inefficient in scenarios where the convolution kernel is small.
Furthermore, the usability of these implementations also differs greatly across computing scenarios. cuda-convnet2, based on the direct convolution strategy, can only operate over part of the parameter space; for example, it can only operate when the input pictures are square. fbfft, based on the Fourier transform, does not support strided convolution, that is, it only supports a convolution stride of 1; if the stride is greater than 1, the operation fails.
In summary, these typical implementations cannot effectively meet the computational requirements of convolutional neural networks: no single implementation can efficiently perform the computation of a convolutional neural network in all computing scenarios, and each implementation exhibits different performance in different scenarios. This results in low overall computational efficiency of the convolutional neural network and affects its overall performance to a certain extent.
Disclosure of Invention
The invention provides a method and a system for calculating a convolutional neural network, aiming at solving the prior-art problem that the overall calculation efficiency of a convolutional neural network is low because different convolution implementations perform differently in different scenarios.
In one aspect, the present invention provides a method for calculating a convolutional neural network, including:
acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer;
if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
and combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolutional neural network carry out convolution calculation according to their respective optimal convolution strategy combinations.
Preferably, the searching for the optimal convolution strategy combination corresponding to the convolutional layer in the knowledge base according to the configuration parameters corresponding to the convolutional layer further includes:
and if the optimal convolution strategy combination exists in the knowledge base, acquiring the optimal convolution strategy combination from the knowledge base so that the convolution layer carries out convolution calculation according to the optimal convolution strategy combination.
Preferably, the obtaining of the multiple candidate convolution strategies corresponding to the convolution stage specifically includes:
and selecting the existing convolution strategy for instantiation to obtain a plurality of candidate convolution strategies corresponding to the convolution stage.
Preferably, the selecting the best convolution strategy corresponding to the convolution stage from all the candidate convolution strategies specifically includes:
testing the availability of all the candidate convolution strategies and screening out effective convolution strategies from all the candidate convolution strategies;
and testing the running time corresponding to all the effective convolution strategies, and determining the effective convolution strategy corresponding to the shortest running time as the optimal convolution strategy corresponding to the convolution stage.
Preferably, the obtaining of the optimal convolution strategy combination corresponding to the convolution layer further includes:
and associating and storing the configuration parameters corresponding to the convolutional layer and the optimal convolutional strategy combination corresponding to the convolutional layer to the knowledge base.
Preferably, the selecting the best convolution strategy corresponding to the convolution stage from all the candidate convolution strategies further includes:
and calling the optimal convolution strategy corresponding to the convolution stage from the pre-packaged candidate convolution strategies by using the interface function corresponding to the convolution stage.
Preferably, the configuration parameters include a batch size, an input channel, the number of convolution kernels, a length and a width of a convolution window, a length and a width of an input picture, and a convolution step size.
In one aspect, the present invention provides a convolutional neural network computing system, comprising:
the convolution strategy searching module is used for acquiring configuration parameters corresponding to all convolution layers in the target convolution neural network, and searching the optimal convolution strategy combination corresponding to the convolution layer in the knowledge base according to the configuration parameters corresponding to the convolution layer for any convolution layer;
a convolution strategy screening module, configured to decompose the convolution layer into multiple convolution stages if the optimal convolution strategy combination does not exist in the knowledge base, acquire multiple candidate convolution strategies corresponding to the convolution stage for any one convolution stage, and screen out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
and the convolution calculation module is used for combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combination.
In one aspect, the present invention provides an electronic device comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, and the processor, when invoking the program instructions, is capable of performing any of the methods described above.
In one aspect, the invention provides a non-transitory computer readable storage medium storing computer instructions that cause a computer to perform any of the methods described above.
The invention provides a method and a system for calculating a convolutional neural network, in which the configuration parameters corresponding to all convolutional layers in a target convolutional neural network are obtained and, for any convolutional layer, a knowledge base is searched for the optimal convolution strategy combination corresponding to that layer according to its configuration parameters; if the optimal convolution strategy combination does not exist in the knowledge base, the convolutional layer is decomposed into a plurality of convolution stages, a plurality of candidate convolution strategies corresponding to each convolution stage are acquired, and the optimal convolution strategy corresponding to each stage is screened out from all the candidates; and the optimal convolution strategies corresponding to all the convolution stages are combined to obtain the optimal convolution strategy combination corresponding to the convolutional layer, so that all convolutional layers in the target convolutional neural network perform convolution calculation according to their respective optimal convolution strategy combinations.
The method and the system screen out the corresponding optimal convolution strategy for each convolution stage in each convolutional layer of the convolutional neural network, so that each convolution stage can carry out convolution calculation according to its optimal strategy. This effectively improves the calculation efficiency of each convolution stage and thus the overall calculation efficiency of the convolutional neural network, improves the overall performance of the network in a fine-grained manner, enables the convolutional neural network to compute efficiently in any scenario, and overcomes the performance differences that different convolution implementations in the prior art exhibit in different scenarios.
Drawings
Fig. 1 is a schematic overall flow chart of a calculation method of a convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a schematic overall flow chart of a convolutional neural network computing system according to an embodiment of the present invention;
fig. 3 is a schematic structural framework diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The convolutional neural network is a network model with a multilayer structure comprising an input layer, convolutional layers, pooling layers, an output layer, and the like, where the convolutional layer is the hotspot computation layer of the network, that is, most of the computation of the convolutional neural network lies in the convolution calculation of the convolutional layers. In view of this, in order to improve the overall calculation efficiency of the convolutional neural network and thereby its overall performance, the present invention mainly optimizes the convolution calculation of the convolutional neural network and provides a calculation method that can effectively improve the network's convolution calculation efficiency, which in turn benefits its overall calculation efficiency. The specific implementation is as follows:
fig. 1 is a schematic overall flow chart of a method for calculating a convolutional neural network according to an embodiment of the present invention, and as shown in fig. 1, the present invention provides a method for calculating a convolutional neural network, including:
s1, acquiring configuration parameters corresponding to all convolutional layers in the target convolutional neural network, and searching the optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layer for any convolutional layer;
specifically, the target convolutional neural network is subjected to structure splitting, the target neural network is split into a plurality of base layers, convolutional layers are positioned from the base layers, and all convolutional layers of the target convolutional neural network can be obtained. Because any convolutional neural network sets corresponding configuration parameters for each convolutional layer during construction, the configuration parameters are often stored in a configuration file form, that is, each convolutional neural network corresponds to one configuration file. In view of this, in this embodiment, after obtaining all convolutional layers in the target convolutional neural network, and then obtaining the configuration file corresponding to the target convolutional neural network, the configuration parameters corresponding to all convolutional layers may be obtained from the configuration file corresponding to the target convolutional neural network. The configuration parameters include batch size, the number of convolution kernels, the size of a convolution window, the input size, the convolution step size and the like.
When a convolutional layer carries out convolution calculation, a plurality of corresponding convolution stages need to be provided with corresponding convolution strategies, and the convolution strategies corresponding to all the convolution stages are combined to form the convolution strategy combination corresponding to the convolutional layer. Because the configuration parameters corresponding to different convolutional layers are different, and the configuration parameters of the convolutional layers represent the attributes of the convolutional layers, the optimal convolutional strategy combinations applicable to the convolutional layers with different configuration parameters during convolutional calculation are different, the calculation time required by the convolutional layers for performing convolutional calculation according to the optimal convolutional strategy combinations is shortest, and the corresponding calculation efficiency is highest.
Based on the above technical solution, in this embodiment, after all convolutional layers in the target convolutional neural network and the configuration parameters corresponding to all convolutional layers are obtained, for any convolutional layer, first, an optimal convolution strategy combination corresponding to the convolutional layer is searched in a knowledge base according to the configuration parameters corresponding to the convolutional layer, and a calculation time required by the convolutional layer to perform convolution calculation according to the optimal convolution strategy combination is shortest. It should be noted that, through history learning and accumulation, the knowledge base stores the optimum convolution strategy combination applied to the convolution layers with different configuration parameters.
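The knowledge-base lookup described above can be sketched as a plain mapping from a layer's configuration parameters to its previously learned optimal strategy combination; the function names and data shapes here are assumptions for the sketch, not the patent's interface.

```python
# Illustrative knowledge base: configuration parameters -> optimal strategy
# combination, accumulated through history learning (steps S2/S3).
knowledge_base = {}

def lookup(config):
    """Return the stored optimal strategy combination, or None on a miss."""
    return knowledge_base.get(config)

def store(config, combination):
    """Associate a configuration with its optimal strategy combination."""
    knowledge_base[config] = combination

config = (128, 3, 64, 3, 3, 224, 224, 1)    # one layer's parameters as a key
assert lookup(config) is None               # miss: fall through to screening
store(config, [("P0", "ws0"), ("P1", "ws1"), ("P2", "ws2")])
assert lookup(config) == [("P0", "ws0"), ("P1", "ws1"), ("P2", "ws2")]
```

On a hit, the combination is reused directly; on a miss, the per-stage screening of step S2 runs and its result is stored for future lookups.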
S2, if the optimum convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimum convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
specifically, on the basis of the above technical solution, if the optimal convolution strategy combination corresponding to the convolution layer is not found in the knowledge base, it can be determined that the optimal convolution strategy combination corresponding to the convolution layer does not exist in the knowledge base. In view of this, in the embodiment, the convolutional layer is decomposed into a plurality of convolution stages, which mainly includes three convolution stages, i.e., a forward output calculation stage, a backward input gradient calculation stage, and a backward weight gradient calculation stage. On the basis, for any convolution stage, the existing convolution strategy is instantiated, and a plurality of candidate convolution strategies corresponding to the convolution stage are obtained. And then testing all candidate convolution strategies, specifically, the convolution stage can perform convolution calculation according to each candidate convolution strategy to test the corresponding running time of each candidate convolution strategy, and determine the convolution strategy corresponding to the shortest running time as the optimal convolution strategy corresponding to the convolution layer. Therefore, the optimal convolution strategy corresponding to the convolution stage can be screened from all candidate convolution strategies.
And S3, combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolutional neural network are subjected to convolution calculation according to the respective corresponding optimal convolution strategy combination.
Specifically, on the basis of the above technical solution, after the optimal convolution strategy corresponding to each convolution stage is obtained, the optimal convolution strategies corresponding to all convolution stages are combined to obtain the optimal convolution strategy combination corresponding to the convolutional layer. Taking a convolutional layer comprising three convolution stages as an example, the finally obtained optimal convolution strategy combination S*_i can be represented as S*_i = {(P0, ws0), (P1, ws1), (P2, ws2)}, where P0, P1, P2 denote the three convolution stages, and ws0, ws1, ws2 denote the optimal convolution strategies of the three stages, respectively.
After the optimal convolution strategy combination corresponding to the convolutional layer is obtained, the layer can perform convolution calculation according to it. For example, if a convolutional layer corresponds to the optimal convolution strategy combination S*_i = {(P0, ws0), (P1, ws1), (P2, ws2)}, the three convolution stages P0, P1, P2 in that layer perform convolution calculation according to the three optimal strategies ws0, ws1, ws2, respectively. Because the time each convolution stage needs to perform convolution calculation according to its optimal strategy is shortest, the time the convolutional layer needs to perform convolution calculation according to the optimal strategy combination is also shortest, which can effectively improve the layer's convolution calculation efficiency.
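Dispatching each stage P_k to its strategy ws_k can be sketched as below; the strategy implementations are toy arithmetic functions standing in for real convolution kernels, and every name is illustrative.

```python
# Illustrative only: run a layer stage by stage according to S*_i.
combination = [("P0", "ws0"), ("P1", "ws1"), ("P2", "ws2")]

impls = {
    "ws0": lambda x: x + 1,   # stand-in for the forward output stage
    "ws1": lambda x: x * 2,   # stand-in for the backward input-gradient stage
    "ws2": lambda x: x - 3,   # stand-in for the backward weight-gradient stage
}

def run_layer(combination, impls, data):
    """Run each stage with its screened-out optimal strategy."""
    for _stage, strategy in combination:
        data = impls[strategy](data)
    return data

assert run_layer(combination, impls, 10) == 19   # (10 + 1) * 2 - 3
```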
On the basis of the technical scheme, corresponding optimal convolution strategy combinations can be obtained for all convolution layers in the target convolution neural network, and the convolution layers carry out convolution calculation according to the corresponding optimal convolution strategy combinations, so that the overall calculation efficiency of the target convolution neural network can be effectively improved, and the overall performance of the target convolution neural network can be improved.
The invention provides a calculation method of a convolutional neural network, which comprises the steps of obtaining configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolution neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combination. 
The method screens out the corresponding optimal convolution strategy for each convolution stage in each convolutional layer of the convolutional neural network, so that each convolution stage can carry out convolution calculation according to its optimal strategy. This effectively improves the calculation efficiency of each convolution stage and thus the overall calculation efficiency of the convolutional neural network, improves the overall performance of the network in a fine-grained manner, enables the convolutional neural network to compute efficiently in any scenario, and overcomes the performance differences that different convolution implementations in the prior art exhibit in different scenarios.
Based on any of the above embodiments, there is provided a method for calculating a convolutional neural network, which searches for an optimal convolutional policy combination corresponding to the convolutional layer in a knowledge base according to a configuration parameter corresponding to the convolutional layer, and then further includes: and if the optimal convolution strategy combination exists in the knowledge base, acquiring the optimal convolution strategy combination from the knowledge base so that the convolution layer carries out convolution calculation according to the optimal convolution strategy combination.
Specifically, after all convolutional layers in the target convolutional neural network and configuration parameters corresponding to all convolutional layers are obtained, for any convolutional layer, firstly, an optimal convolutional strategy combination corresponding to the convolutional layer is searched in a knowledge base according to the configuration parameters corresponding to the convolutional layer. It should be noted that, through history learning and accumulation, the knowledge base stores the optimum convolution strategy combination applied to the convolution layers with different configuration parameters. On the basis, after the search is carried out, if the optimal convolution strategy combination exists in the knowledge base, the optimal convolution strategy combination is obtained from the knowledge base. After the optimal convolution strategy combination corresponding to the convolution layer is obtained, the convolution layer can carry out convolution calculation according to the optimal convolution strategy combination, and because the optimal convolution strategy combination comprises the optimal convolution strategy corresponding to each convolution stage, and the time required for carrying out convolution calculation according to the optimal convolution strategy in each convolution stage is shortest, the time for carrying out convolution calculation according to the optimal convolution strategy combination by the convolution layer is also shortest, and the convolution calculation efficiency of the convolution layer can be effectively improved.
According to the calculation method of the convolutional neural network, after the optimal convolution strategy combination corresponding to the convolutional layer is searched in the knowledge base according to the configuration parameters corresponding to the convolutional layer, if the optimal convolution strategy combination exists in the knowledge base, the optimal convolution strategy combination is obtained from the knowledge base, so that the convolutional layer carries out convolution calculation according to the optimal convolution strategy combination. According to the method, the optimal convolution strategy combination corresponding to each convolution layer can be rapidly obtained through the knowledge base, and meanwhile, each convolution layer can carry out convolution calculation according to the optimal convolution strategy combination, so that the calculation efficiency of a convolution stage is effectively improved, and further, the overall calculation efficiency of the convolutional neural network is effectively improved.
Based on any of the above embodiments, a method for calculating a convolutional neural network is provided, and a plurality of candidate convolution strategies corresponding to the convolution stage are obtained, specifically: and selecting the existing convolution strategy for instantiation to obtain a plurality of candidate convolution strategies corresponding to the convolution stage.
Specifically, in this embodiment, for any convolution stage, existing convolution strategies are selected and instantiated to obtain a plurality of candidate convolution strategies corresponding to that stage. The existing convolution strategies are the existing typical convolution strategies, namely cuda-convnet2 (direct convolution), torch-cunn (convolution based on matrix multiplication), cuDNN (convolution based on matrix multiplication), and fbfft (convolution based on the Fourier transform). On this basis, in this embodiment, four candidate convolution strategies are set for each convolution stage. In other embodiments, the number and the types of the candidate convolution strategies may be set according to actual requirements and are not specifically limited here.
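The instantiation step can be sketched as a small registry over the four typical implementations named above; the registry values are plain placeholders, not real library bindings.

```python
# Sketch of candidate instantiation: one candidate per existing typical
# convolution strategy, per convolution stage.
EXISTING_STRATEGIES = ["cuda-convnet2", "torch-cunn", "cuDNN", "fbfft"]

def instantiate_candidates(stage):
    """Instantiate every existing strategy as a candidate for this stage."""
    return {name: {"strategy": name, "stage": stage}
            for name in EXISTING_STRATEGIES}

candidates = instantiate_candidates("forward_output")
assert len(candidates) == 4
assert candidates["cuDNN"]["stage"] == "forward_output"
```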
According to this calculation method for a convolutional neural network, for any convolution stage, existing convolution strategies are selected and instantiated to obtain a plurality of candidate convolution strategies corresponding to that stage, which facilitates screening the optimal convolution strategy for the stage from all the candidate strategies.
Based on any of the above embodiments, a method for calculating a convolutional neural network is provided, which screens out an optimal convolution strategy corresponding to the convolution stage from all candidate convolution strategies, and specifically includes: testing the availability of all candidate convolution strategies, and screening out effective convolution strategies from all candidate convolution strategies; and testing the running time corresponding to all the effective convolution strategies, and determining the effective convolution strategy corresponding to the shortest running time as the optimal convolution strategy corresponding to the convolution stage.
Specifically, for any convolution stage, after the plurality of candidate convolution strategies corresponding to the stage are obtained, all of them need to be tested. In this embodiment, the availability of all candidate convolution strategies is tested first: the convolution stage may be run according to each candidate strategy, and if an exception occurs during the run, the corresponding candidate strategy is determined to be invalid. Invalid strategies are thereby excluded, and all effective convolution strategies are screened out from the candidates.
On this basis, all the effective convolution strategies are tested further: the convolution stage may be run according to each effective strategy, and the corresponding running time is measured, where the running time is the time required to perform the convolution calculation of the stage according to that strategy. After the running time of each effective convolution strategy is obtained, the shortest running time is selected, and the effective strategy corresponding to it is determined to be the optimal convolution strategy for the convolution stage.
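The two-step screening just described — an availability test followed by a running-time test — can be sketched as follows (a minimal Python sketch; `run_stage` and the function name are illustrative assumptions, not the patent's implementation):

```python
import time

def screen_optimal_strategy(run_stage, candidates):
    """Test availability, then running time, of each candidate strategy
    for one convolution stage; return the fastest effective strategy."""
    timings = {}
    for strategy in candidates:
        try:
            start = time.perf_counter()
            run_stage(strategy)                              # availability test
            timings[strategy] = time.perf_counter() - start  # running time
        except Exception:
            pass            # an abnormal run marks the strategy as invalid
    if not timings:
        raise RuntimeError("no effective convolution strategy for this stage")
    # the effective strategy with the shortest running time is optimal
    return min(timings, key=timings.get)
```

For example, a candidate that raises during the run is excluded, and among the remaining effective candidates the one with the shortest measured time is selected.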
The invention provides a calculation method for a convolutional neural network in which the availability of all candidate convolution strategies is tested and the effective convolution strategies are screened out from them; the running times of all the effective strategies are then measured, and the effective strategy with the shortest running time is determined to be the optimal convolution strategy for the convolution stage. The method thus not only screens out the optimal convolution strategy for each convolution stage but also excludes invalid strategies, effectively ensuring both the availability and the efficiency of the optimal convolution strategy.
Based on any of the above embodiments, a method for calculating a convolutional neural network is provided, where an optimal convolution strategy combination corresponding to the convolutional layer is obtained, and then the method further includes: and associating and storing the configuration parameters corresponding to the convolutional layer and the optimal convolutional strategy combination corresponding to the convolutional layer to a knowledge base.
Specifically, after the optimal convolution strategy combination corresponding to each convolutional layer is obtained, the configuration parameters of the layer and its optimal convolution strategy combination are stored in the knowledge base in association, so that for subsequently encountered convolutional layers with the same configuration the optimal combination can be looked up directly in the knowledge base by configuration parameters, effectively avoiding repeated testing of identically configured layers.
According to this calculation method for a convolutional neural network, after the optimal convolution strategy combination corresponding to a convolutional layer is obtained, the configuration parameters of the layer and the optimal combination are stored in the knowledge base in association, so that for a convolutional layer with the same configuration the optimal combination can be looked up directly from the knowledge base according to its configuration parameters, effectively avoiding repeated testing of identically configured layers.
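The associated storage and lookup of configuration parameters and strategy combinations can be sketched as a simple mapping (a minimal illustration; a real knowledge base could equally be a file or database, and the function names are assumptions):

```python
knowledge_base = {}  # configuration-parameter tuple -> optimal strategy combination

def store_combination(config, combination):
    """Associate a layer's configuration parameters with its optimal
    convolution strategy combination."""
    knowledge_base[tuple(config)] = combination

def search_combination(config):
    """Return the stored combination for this configuration, or None on a miss."""
    return knowledge_base.get(tuple(config))
```

A later layer with the same eight configuration parameters then resolves in one lookup instead of a full round of availability and timing tests.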
Based on any of the above embodiments, a method for calculating a convolutional neural network is provided, in which, after the optimal convolution strategy corresponding to a convolution stage is screened out from all the candidate convolution strategies, the method further includes: calling the optimal convolution strategy corresponding to the convolution stage from the pre-packaged candidate convolution strategies by using the interface function corresponding to the convolution stage.
Specifically, in this embodiment, a uniform interface is designed for the calculation of the convolutional layers of the target convolutional neural network; within this interface, an interface function is designed for each convolution stage, and a plurality of candidate convolution strategies are pre-packaged in each interface function. Accordingly, for any convolution stage, after the optimal convolution strategy corresponding to the stage is obtained, it is invoked from the pre-packaged candidate strategies through the interface function corresponding to the stage. That is, the candidate convolution strategies corresponding to the same convolution stage in the target convolutional neural network are all encapsulated in the same interface function, so that when a convolution stage performs convolution calculation, the corresponding optimal strategy can be selected from the corresponding interface function.
For example, suppose the target convolutional neural network includes 6 convolutional layers, each comprising 3 convolution stages: a forward output calculation stage, a backward input gradient calculation stage, and a backward weight gradient calculation stage. On this basis, the candidate convolution strategies corresponding to the forward output calculation stage of the 6 convolutional layers can be pre-packaged in one interface function; those corresponding to the backward input gradient calculation stage in a second interface function; and those corresponding to the backward weight gradient calculation stage in a third interface function. Thus, after the optimal convolution strategy corresponding to each convolution stage is obtained, it is called from the pre-packaged candidate strategies by using the interface function corresponding to that stage.
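An interface function that encapsulates all candidate strategies of one stage and dispatches by strategy name can be sketched as follows (a hedged illustration; `make_stage_interface` and the lambda implementations are hypothetical stand-ins for real convolution kernels):

```python
def make_stage_interface(candidates):
    """Encapsulate all candidate strategies of one convolution stage behind
    a single interface function; the caller selects a strategy by name."""
    registry = dict(candidates)  # strategy name -> implementation callable
    def interface(strategy_name, *args, **kwargs):
        return registry[strategy_name](*args, **kwargs)
    return interface

# e.g. one interface for the forward output stage of all layers
forward_output = make_stage_interface({
    "direct": lambda x: ("direct", x),
    "fft":    lambda x: ("fft", x),
})
```

Switching a stage between strategies is then just a change of the name passed to its interface function, which is the switching behaviour the text describes.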
According to this calculation method for a convolutional neural network, after the optimal convolution strategy corresponding to a convolution stage is screened out from all the candidate strategies, it is called from the pre-packaged candidate strategies through the interface function corresponding to the stage. This helps each convolution stage select its optimal strategy from the corresponding interface function when performing convolution calculation, and enables switching between different convolution strategies.
Based on any one of the embodiments, a method for calculating a convolutional neural network is provided, where the configuration parameters include batch size, input channel, number of convolution kernels, length and width of convolution window, length and width of input picture, and convolution step size.
Specifically, in the present embodiment, the configuration parameters corresponding to each convolutional layer in the target convolutional neural network comprise 8 parameters: the batch size, the number of input channels, the number of convolution kernels, the length and width of the convolution window, the length and width of the input picture, and the convolution step size, denoted B, N_i, N_o, H, W, K_w, K_h, and S respectively. Their value ranges are: B, 1-512; N_i, 1-1000; N_o, 1-1000; H, 1-512; W, 1-512; K_w, 1-50; K_h, 1-50; and S, 1-50.
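The eight configuration parameters and their stated ranges can be captured in a small value type (a sketch only; the class name and the range validation are illustrative, not part of the patent):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConvLayerConfig:
    """The 8 configuration parameters of a convolutional layer,
    with the value ranges stated in the text."""
    B: int   # batch size, 1-512
    Ni: int  # number of input channels, 1-1000
    No: int  # number of convolution kernels, 1-1000
    H: int   # input picture height, 1-512
    W: int   # input picture width, 1-512
    Kw: int  # convolution window width, 1-50
    Kh: int  # convolution window height, 1-50
    S: int   # convolution step size, 1-50

    def __post_init__(self):
        limits = {"B": 512, "Ni": 1000, "No": 1000, "H": 512,
                  "W": 512, "Kw": 50, "Kh": 50, "S": 50}
        for name, hi in limits.items():
            v = getattr(self, name)
            if not 1 <= v <= hi:
                raise ValueError(f"{name}={v} outside the range 1-{hi}")
```

Because the instance is frozen and hashable, it can double as the knowledge-base key for the layer.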
According to this calculation method for a convolutional neural network, the configuration parameters corresponding to each convolutional layer comprise the batch size, the number of input channels, the number of convolution kernels, the length and width of the convolution window, the length and width of the input picture, and the convolution step size, so that the optimal convolution strategy combination for a convolutional layer is determined according to its configuration parameters.
Fig. 2 is a schematic overall flow chart of a convolutional neural network computing system according to an embodiment of the present invention, and as shown in fig. 2, based on any one of the above method embodiments, a convolutional neural network computing system is provided, including:
the convolution strategy searching module 1 is used for acquiring configuration parameters corresponding to all convolution layers in a target convolution neural network, and searching an optimal convolution strategy combination corresponding to the convolution layer in a knowledge base according to the configuration parameters corresponding to the convolution layer for any convolution layer;
the convolution strategy screening module 2 is used for decomposing the convolution layer into a plurality of convolution stages if the optimal convolution strategy combination does not exist in the knowledge base, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
and the convolution calculation module 3 is configured to combine the optimal convolution strategies corresponding to all convolution stages to obtain an optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolutional neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combinations.
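The cooperation of the three modules — knowledge-base search, per-stage screening, and combination — can be sketched end to end as follows (a minimal Python sketch; the function names and the pluggable `screen_stage` callback are assumptions for illustration):

```python
def optimal_combination_for_layer(config, knowledge_base, stages, screen_stage):
    """Per-layer flow mirroring modules 1-3: search the knowledge base by
    configuration parameters; on a miss, screen each convolution stage,
    combine the per-stage optima, and cache the result for identical layers."""
    key = tuple(config)
    if key in knowledge_base:                                # module 1: search
        return knowledge_base[key]
    combination = tuple(screen_stage(s) for s in stages)     # module 2: screen
    knowledge_base[key] = combination                        # store for reuse
    return combination                                       # module 3: combined result
```

A second layer with the same configuration hits the cache, so the (expensive) screening callback is never invoked again for it.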
The invention provides a calculation system of a convolutional neural network, which comprises a convolutional strategy searching module 1, a convolutional strategy screening module 2 and a convolutional calculation module 3, wherein the method in any embodiment is realized through the cooperation of the modules, and specific implementation steps can be referred to the method embodiment, which is not described herein again.
The invention provides a calculation system of a convolutional neural network, which is characterized in that configuration parameters corresponding to all convolutional layers in a target convolutional neural network are obtained, and for any convolutional layer, an optimal convolutional strategy combination corresponding to the convolutional layer is searched in a knowledge base according to the configuration parameters corresponding to the convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolution neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combination. 
The system screens out the corresponding optimal convolution strategy for each convolution stage in each convolutional layer of the convolutional neural network, so that every convolution stage can perform convolution calculation according to its optimal strategy. This effectively improves the calculation efficiency of each convolution stage and, in turn, the overall calculation efficiency of the convolutional neural network; the overall performance of the network is improved in a fine-grained manner, efficient calculation is achieved in any scenario, and the problem in the prior art that different convolution implementations perform differently in different scenarios is resolved.
Fig. 3 shows a block diagram of an electronic device according to an embodiment of the present invention. Referring to fig. 3, the electronic device includes: a processor 31, a memory 32, and a bus 33; the processor 31 and the memory 32 communicate with each other through the bus 33, and the processor 31 is configured to call program instructions in the memory 32 to perform the methods provided by the above-mentioned method embodiments, for example, including: acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all convolution layers in the target convolution neural network perform convolution calculation according to the respective corresponding optimal convolution strategy combination.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above-mentioned method embodiments, for example, comprising: acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network are subjected to convolution calculation according to the respective corresponding optimal convolution strategy combination.
The present embodiments provide a non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform the methods provided by the above method embodiments, for example, including: acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer; if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies; and combining the optimal convolution strategies corresponding to all convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network are subjected to convolution calculation according to the respective corresponding optimal convolution strategy combination.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The above-described embodiments of the electronic device and the like are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may also be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that the above embodiments are only preferred embodiments of the present application and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.

Claims (9)

1. A method of computing a convolutional neural network, comprising:
acquiring configuration parameters corresponding to all convolutional layers in a target convolutional neural network, and searching an optimal convolutional strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layers for any convolutional layer;
if the optimal convolution strategy combination does not exist in the knowledge base, decomposing the convolution layer into a plurality of convolution stages, acquiring a plurality of candidate convolution strategies corresponding to the convolution stages for any one convolution stage, and screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network carry out convolution calculation according to the respective optimal convolution strategy combination;
the configuration parameters comprise batch size, input channel, number of convolution kernels, length and width of a convolution window, length and width of an input picture and convolution step size.
2. The method of claim 1, wherein the searching for the optimal convolution strategy combination corresponding to the convolutional layer in a knowledge base according to the configuration parameters corresponding to the convolutional layer further comprises:
and if the optimal convolution strategy combination exists in the knowledge base, acquiring the optimal convolution strategy combination from the knowledge base so that the convolution layer carries out convolution calculation according to the optimal convolution strategy combination.
3. The method according to claim 1, wherein the obtaining of the candidate convolution policies corresponding to the convolution stage specifically includes:
and selecting the existing convolution strategy for instantiation to obtain a plurality of candidate convolution strategies corresponding to the convolution stage.
4. The method according to claim 1, wherein the screening out of the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies is specifically:
testing the availability of all the candidate convolution strategies and screening out effective convolution strategies from all the candidate convolution strategies;
and testing the running time corresponding to all the effective convolution strategies, and determining the effective convolution strategy corresponding to the shortest running time as the optimal convolution strategy corresponding to the convolution stage.
5. The method of claim 1, wherein obtaining the optimal convolution strategy combination for the convolution layer further comprises:
and associating and storing the configuration parameters corresponding to the convolutional layer and the optimal convolutional strategy combination corresponding to the convolutional layer to the knowledge base.
6. The method of claim 1, wherein the step of screening out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies further comprises:
and calling the optimal convolution strategy corresponding to the convolution stage from the pre-packaged candidate convolution strategies by using the interface function corresponding to the convolution stage.
7. A computing system of a convolutional neural network, comprising:
the convolution strategy searching module is used for acquiring configuration parameters corresponding to all convolution layers in the target convolution neural network, and searching the optimal convolution strategy combination corresponding to the convolution layer in the knowledge base according to the configuration parameters corresponding to the convolution layer for any convolution layer;
a convolution strategy screening module, configured to decompose the convolution layer into multiple convolution stages if the optimal convolution strategy combination does not exist in the knowledge base, acquire multiple candidate convolution strategies corresponding to the convolution stage for any one convolution stage, and screen out the optimal convolution strategy corresponding to the convolution stage from all the candidate convolution strategies;
the convolution calculation module is used for combining the optimal convolution strategies corresponding to all the convolution stages to obtain the optimal convolution strategy combination corresponding to the convolution layer, so that all the convolution layers in the target convolution neural network carry out convolution calculation according to the respective corresponding optimal convolution strategy combination;
the configuration parameters comprise batch size, input channel, number of convolution kernels, length and width of a convolution window, length and width of an input picture and convolution step size.
8. An electronic device, comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 6.
9. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1 to 6.
CN201810646058.1A 2018-06-21 2018-06-21 Method and system for calculating convolutional neural network Active CN110633785B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810646058.1A CN110633785B (en) 2018-06-21 2018-06-21 Method and system for calculating convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810646058.1A CN110633785B (en) 2018-06-21 2018-06-21 Method and system for calculating convolutional neural network

Publications (2)

Publication Number Publication Date
CN110633785A CN110633785A (en) 2019-12-31
CN110633785B true CN110633785B (en) 2021-01-05

Family

ID=68967553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810646058.1A Active CN110633785B (en) 2018-06-21 2018-06-21 Method and system for calculating convolutional neural network

Country Status (1)

Country Link
CN (1) CN110633785B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311599B (en) * 2020-01-17 2024-03-26 北京达佳互联信息技术有限公司 Image processing method, device, electronic equipment and storage medium
CN111260036B (en) 2020-01-19 2023-01-10 苏州浪潮智能科技有限公司 Neural network acceleration method and device
CN113033422A (en) * 2021-03-29 2021-06-25 中科万勋智能科技(苏州)有限公司 Face detection method, system, equipment and storage medium based on edge calculation
CN113419931B (en) * 2021-05-24 2024-05-17 北京达佳互联信息技术有限公司 Performance index determining method and device for distributed machine learning system
CN114398040A (en) * 2021-12-24 2022-04-26 上海商汤科技开发有限公司 Neural network reasoning method, device, computer equipment and storage medium

Citations (5)

Publication number Priority date Publication date Assignee Title
CN105869117A (en) * 2016-03-28 2016-08-17 上海交通大学 Method for accelerating GPU directed at deep learning super-resolution technology
CN106326939A (en) * 2016-08-31 2017-01-11 深圳市诺比邻科技有限公司 Parameter optimization method and system of convolutional neural network
CN107341761A (en) * 2017-07-12 2017-11-10 成都品果科技有限公司 A kind of calculating of deep neural network performs method and system
CN107844828A (en) * 2017-12-18 2018-03-27 北京地平线信息技术有限公司 Convolutional calculation method and electronic equipment in neural network
CN108009634A (en) * 2017-12-21 2018-05-08 美的集团股份有限公司 A kind of optimization method of convolutional neural networks, device and computer-readable storage medium

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US10223333B2 (en) * 2014-08-29 2019-03-05 Nvidia Corporation Performing multi-convolution operations in a parallel processing system
US10621486B2 (en) * 2016-08-12 2020-04-14 Beijing Deephi Intelligent Technology Co., Ltd. Method for optimizing an artificial neural network (ANN)


Non-Patent Citations (2)

Title
Research on Optimization Algorithm of Convolution Neural Network in Speech Recognition; Liu Chang-zheng et al.; Journal of Harbin University of Science and Technology; 2016-03-31; pp. 1-5 *
Design and Implementation of an FPGA-based Image Convolution IP Core; Zhu Xueliang et al.; Microelectronics & Computer; 2011-06, No. 6; pp. 188-192 *

Also Published As

Publication number Publication date
CN110633785A (en) 2019-12-31

Similar Documents

Publication Publication Date Title
CN110633785B (en) Method and system for calculating convolutional neural network
CN110058922B (en) Method and device for extracting metadata of machine learning task
US20230008597A1 (en) Neural network model processing method and related device
EP3678068A1 (en) Distributed system for executing machine learning and method therefor
CN111047563B (en) Neural network construction method applied to medical ultrasonic image
CN113994350A (en) Generating parallel computing schemes for neural networks
CN113703775A (en) Compiling method, device, equipment and storage medium
CN112101525A (en) Method, device and system for designing neural network through NAS
WO2021011914A1 (en) Scheduling operations on a computation graph
EP3926546A2 (en) Neural network model splitting method, apparatus, computer device and storage medium
US20230394330A1 (en) A method and system for designing ai modeling processes based on graph algorithms
Ali Next-generation ETL Framework to Address the Challenges Posed by Big Data.
US11003960B2 (en) Efficient incident management in large scale computer systems
CN112306452A (en) Method, device and system for processing service data by merging and sorting algorithm
CN112990461A (en) Method and device for constructing neural network model, computer equipment and storage medium
CN112633516B (en) Performance prediction and machine learning compiling optimization method and device
KR102372869B1 (en) Matrix operator and matrix operation method for artificial neural network
CN113190352B (en) General CPU-oriented deep learning calculation acceleration method and system
CN111767204A (en) Overflow risk detection method, device and equipment
CN113762469B (en) Neural network structure searching method and system
CN117114087B (en) Fault prediction method, computer device, and readable storage medium
CN117196015A (en) Operator execution method, device, electronic equipment and storage medium
CN117786416B (en) Model training method, device, equipment, storage medium and product
US20230130747A1 (en) Computer-readable recording medium storing learning program, learning method, and information processing device
US20240061661A1 (en) Method, apparatus and device for optimizing compiler based on tensor data calculation inference

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant