CN109858610A - A kind of accelerated method of convolutional neural networks, device, equipment and storage medium - Google Patents
- Publication number: CN109858610A (application CN201910016345.9A, filed 2019-01-08)
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Abstract
The invention discloses an acceleration method, apparatus, device and storage medium for a convolutional neural network (CNN). The method includes: receiving in advance calculating-operation models of multiple preset types for CNNs; from the multiple calculating-operation models, obtaining the models that can realize each calculating operation of the CNN to be accelerated, as the stand-by calculating-operation models; controlling the field-programmable gate array (FPGA) of an accelerator card to compile, according to the stand-by calculating-operation models, a kernel program for executing the CNN to be accelerated; obtaining an action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated; and controlling the FPGA to execute the kernel program according to the action sequence in the action-sequence parameter and to operate on preset data, so as to realize acceleration. With the present invention, the acceleration of any CNN to be accelerated can be executed on any single accelerator card, without developing multiple kinds of accelerator card; flexibility is higher and development cost is saved.
Description
Technical field
The present invention relates to the field of algorithm acceleration, and in particular to an acceleration method for convolutional neural networks; the present invention also relates to an acceleration apparatus, device and storage medium for convolutional neural networks.
Background technique
A CNN (Convolutional Neural Network) is a kind of artificial neural network. To meet requirements such as operation speed, an accelerator card is usually used to accelerate the calculation process of a CNN. However, there are many different types of CNN, and in the prior art, when the calculation process of a CNN is accelerated, a dedicated accelerator card for that type of CNN must be used; that is, each type of CNN requires its own dedicated accelerator card to realize acceleration. This is inflexible, and developing multiple types of accelerator card incurs high development cost.
Therefore, how to provide a scheme that solves the above technical problem is a problem that those skilled in the art currently need to solve.
Summary of the invention
The object of the present invention is to provide an acceleration method for convolutional neural networks that is more flexible and saves development cost; a further object of the present invention is to provide an acceleration apparatus, device and storage medium for convolutional neural networks with the same advantages.
In order to solve the above technical problem, the present invention provides an acceleration method for convolutional neural networks, comprising:
receiving in advance calculating-operation models of multiple preset types for convolutional neural networks (CNNs);
from the multiple calculating-operation models, obtaining the calculating-operation models that can realize each calculating operation of the CNN to be accelerated, as the stand-by calculating-operation models;
controlling the field-programmable gate array (FPGA) of an accelerator card to compile, according to the stand-by calculating-operation models, a kernel program for executing the CNN to be accelerated;
obtaining an action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated;
controlling the FPGA to execute the kernel program according to the action sequence in the action-sequence parameter, and to operate on preset data, so as to realize acceleration.
Preferably, obtaining the action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated specifically comprises:
converting the CNN to be accelerated into a CNN to be accelerated in a predetermined deep-learning framework;
obtaining the action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated in the predetermined deep-learning framework.
Preferably, the predetermined deep learning framework is caffe or TensorFlow.
Preferably, controlling the FPGA of the accelerator card to compile, according to the stand-by calculating-operation models, the kernel program for executing the CNN to be accelerated specifically comprises:
controlling the FPGA of the accelerator card to compile, according to the stand-by calculating-operation models and through its own hardware compilation platform, the kernel program for executing the CNN to be accelerated.
Preferably, receiving in advance the calculating-operation models of multiple preset types for convolutional neural networks specifically comprises:
receiving in advance calculating-operation models of multiple preset types for CNNs, implemented with the Open Computing Language (OpenCL).
Preferably, the preset types include the convolution operation, the pooling operation, the rectified linear unit (ReLU) function and the Norm function.
In order to solve the above technical problem, the present invention also provides an acceleration apparatus for convolutional neural networks, comprising:
a receiving module, for receiving in advance the calculating-operation models of multiple preset types for CNNs;
a first obtaining module, for obtaining, from the multiple calculating-operation models, the calculating-operation models that can realize each calculating operation of the CNN to be accelerated, as the stand-by calculating-operation models;
a first control module, for controlling the FPGA of an accelerator card to compile, according to the stand-by calculating-operation models, a kernel program for executing the CNN to be accelerated;
a second obtaining module, for obtaining the action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated;
a second control module, for controlling the FPGA to execute the kernel program according to the action sequence in the action-sequence parameter, and to operate on preset data, so as to realize acceleration.
Preferably, the second obtaining module comprises:
a conversion module, for converting the CNN to be accelerated into a CNN to be accelerated in the predetermined deep-learning framework;
an obtaining submodule, for obtaining the action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated in the predetermined deep-learning framework.
In order to solve the above technical problem, the present invention also provides an acceleration device for convolutional neural networks, comprising:
a memory, for storing a computer program;
a processor, for realizing, when executing the computer program, the steps of the acceleration method for convolutional neural networks of any one of the above.
In order to solve the above technical problem, the present invention also provides a computer-readable storage medium, on which a computer program is stored; when the computer program is executed by a processor, the steps of the acceleration method for convolutional neural networks of any one of the above are realized.
The present invention provides an acceleration method for convolutional neural networks, including: receiving in advance the calculating-operation models of multiple preset types for CNNs; from the multiple calculating-operation models, obtaining the models that can realize each calculating operation of the CNN to be accelerated, as the stand-by calculating-operation models; controlling the FPGA of an accelerator card to compile, according to the stand-by models, a kernel program for executing the CNN to be accelerated; obtaining an action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated; and controlling the FPGA to execute the kernel program according to that action sequence and to operate on preset data, so as to realize acceleration.
It can be seen that, whenever an acceleration operation is to be executed for any CNN, the calculating-operation models realizing each of its calculating operations can be obtained from the preset models of multiple types and used as the stand-by models; the FPGA in the accelerator card can then be controlled to compile the kernel program for executing the CNN to be accelerated according to the stand-by models; the action-sequence parameter containing the action sequence of each calculating operation can then be obtained, and the FPGA is controlled to execute the kernel program according to that action sequence and to operate on the preset data, realizing acceleration. The present invention can therefore execute the acceleration of any CNN to be accelerated with any single accelerator card, without developing multiple kinds of accelerator card; flexibility is higher, and development cost is saved.
The present invention also provides an acceleration apparatus, device and storage medium for convolutional neural networks, which have the same beneficial effects as the above acceleration method.
Detailed description of the invention
In order to describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed in the description of the prior art and the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a kind of flow diagram of the accelerated method of convolutional neural networks provided by the invention;
Fig. 2 is a kind of structural schematic diagram of the accelerator of convolutional neural networks provided by the invention;
Fig. 3 is a kind of structural schematic diagram of the acceleration equipment of convolutional neural networks provided by the invention.
Specific embodiment
The core of the present invention is to provide an acceleration method for convolutional neural networks that is more flexible and saves development cost; another core of the invention is to provide an acceleration apparatus, device and storage medium for convolutional neural networks with the same advantages.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative work shall fall within the protection scope of the present invention.
Referring to FIG. 1, Fig. 1 is a flow diagram of an acceleration method for convolutional neural networks provided by the present invention, which includes:
Step S1: receiving in advance the calculating-operation models of multiple preset types for CNNs;
Specifically, the calculating-operation models of the multiple preset types may be models each realizing one calculating operation commonly used in various convolutional neural networks; for example, calculating-operation model A may realize calculating operation A, calculating-operation model B may realize calculating operation B, and so on. Their number can be set independently according to demand, which is not limited in the embodiments of the present invention.
Specifically, the executing subject in the embodiments of the present invention may be a CPU. This step may specifically be that a storage module in the CPU receives in advance the calculating-operation models of the multiple preset types for CNNs, or that the CPU stores the models in the storage module after receiving them. In either case, once each calculating-operation model is present in the storage module, the subsequent steps can be executed, so as to realize the acceleration of various algorithms.
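The patent does not specify how the pre-received models are stored; one plausible host-side sketch of step S1 is a registry keyed by operation type (all names here, such as `OperationModelRegistry` and the `.cl` file names, are illustrative assumptions, not from the patent):

```python
# Hypothetical sketch of step S1: a host-side registry holding one pre-built
# calculating-operation model per preset operation type. A "model" is stood in
# for by the name of an OpenCL source file; real storage could differ.

class OperationModelRegistry:
    """Stores one calculating-operation model per preset operation type."""

    def __init__(self):
        self._models = {}

    def register(self, op_type, model):
        # e.g. op_type = "conv", model = handle/path of the pre-built model
        self._models[op_type] = model

    def lookup(self, op_type):
        if op_type not in self._models:
            raise KeyError(f"no pre-built model for operation type {op_type!r}")
        return self._models[op_type]

# Receive the preset models once, up front.
registry = OperationModelRegistry()
for op in ("conv", "pool", "relu", "norm"):
    registry.register(op, f"{op}_kernel.cl")
```

Once the registry is populated, the later steps can consult it for any CNN to be accelerated.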
Step S2: from the multiple calculating-operation models, obtaining the calculating-operation models that can realize each calculating operation of the CNN to be accelerated, as the stand-by calculating-operation models;
Specifically, the CNN to be accelerated may be any one of various CNNs, which is not limited in the embodiments of the present invention.
Each calculating operation in the CNN to be accelerated can first be obtained, that is, it is determined what each calculating operation in the CNN to be accelerated is; then, from the multiple calculating-operation models, the models that can realize each of these calculating operations are taken as the stand-by calculating-operation models, for use in the subsequent steps.
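The selection in step S2 amounts to covering the network's operation list with the preset models; a minimal sketch, with illustrative names and an AlexNet-like operation list assumed for the example:

```python
# Hypothetical sketch of step S2: given the operations a target CNN actually
# uses, pick the matching pre-built models as the "stand-by" set.

PRESET_MODELS = {"conv": "conv.cl", "pool": "pool.cl",
                 "relu": "relu.cl", "norm": "norm.cl"}

def select_standby_models(cnn_operations, preset_models=PRESET_MODELS):
    """Return the subset of preset models covering every operation of the CNN."""
    standby = {}
    for op in cnn_operations:
        if op not in preset_models:
            raise ValueError(f"CNN uses operation {op!r} with no preset model")
        standby[op] = preset_models[op]
    return standby

# An AlexNet-like network uses all four preset operation types.
alexnet_ops = ["conv", "relu", "norm", "pool", "conv", "relu", "pool"]
standby = select_standby_models(alexnet_ops)
```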
Step S3: controlling the FPGA (Field-Programmable Gate Array) of the accelerator card to compile, according to the stand-by calculating-operation models, the kernel program for executing the CNN to be accelerated;
Specifically, for the FPGA to smoothly accelerate the CNN to be accelerated, the kernel program for executing that CNN can be compiled according to the stand-by calculating-operation models; the FPGA can then execute the kernel program, in cooperation with the subsequent steps, to accelerate the CNN to be accelerated.
The acceleration realized by the FPGA in the embodiments of the present invention may be heterogeneous acceleration, applicable to various types of CNN, which is not limited here.
After the kernel program is compiled, it can be loaded into the FPGA for subsequent execution.
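Step S3 can be modeled as two bookkeeping stages: assemble a program from the stand-by model sources, then load it onto the board. The following is only a stand-in for the real flow (in practice an OpenCL-for-FPGA toolchain compiles the sources into a bitstream offline); all names are illustrative:

```python
# Illustrative only: models the host-side bookkeeping of step S3. The real
# compilation is done by the FPGA's hardware compilation platform; here a
# "program" is just a record of which stand-by sources it covers.

def build_kernel_program(standby_models):
    """Combine the stand-by model sources into one kernel-program object."""
    sources = sorted(standby_models.values())
    return {"sources": sources, "loaded": False}

def load_onto_fpga(program):
    # Stands in for programming the FPGA with the compiled result.
    program["loaded"] = True
    return program

program = load_onto_fpga(build_kernel_program(
    {"conv": "conv.cl", "relu": "relu.cl"}))
```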
Step S4: obtaining the action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated;
Specifically, the action-sequence parameter of the CNN to be accelerated can be obtained in multiple ways; for example, it can be obtained by directly parsing the CNN to be accelerated, or from a pre-stored database, which is not limited here.
The action-sequence parameter may include the action sequence of the CNN to be accelerated, for example: execute action B after action A has finished, execute action D after action B has finished, and so on. The concrete form of the action sequence corresponds to the type of the CNN to be accelerated, which is not limited here.
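The patent leaves the encoding of the action-sequence parameter open; one natural sketch is an ordered list of per-operation records, from which the bare execution order can be read off (field names and layer parameters below are illustrative assumptions):

```python
# Hypothetical encoding of an action-sequence parameter: an ordered list of
# (operation, per-layer parameters) records describing the timing of the
# CNN's calculating operations. Values mimic an AlexNet-style first block.

action_sequence = [
    {"op": "conv", "out_channels": 96, "kernel": 11, "stride": 4},
    {"op": "relu"},
    {"op": "norm", "local_size": 5},
    {"op": "pool", "kernel": 3, "stride": 2},
]

def operation_order(sequence):
    """The bare ordering the FPGA must follow when executing the kernels."""
    return [step["op"] for step in sequence]
```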
Step S5: controlling the FPGA to execute the kernel program according to the action sequence in the action-sequence parameter, and to operate on preset data, so as to realize acceleration.
Specifically, after the above steps are completed, the FPGA can be controlled to execute the kernel program according to the action sequence in the action-sequence parameter, and during this process to operate on the preset data, realizing the acceleration of the CNN to be accelerated and improving the operation speed.
The preset data may be data of multiple types, for example the face data obtained in a face-recognition process; under the control of the CPU, the preset data can be transferred into the FPGA through global memory for computation, which is not limited here.
The action-sequence parameter can be saved in a data array, and the read/write operations on the data in the array can then be controlled so that the data is passed into the global memory of the FPGA; the FPGA kernel program is started and reads from global memory the input data, which includes the action-sequence parameter and the preset data, so as to accelerate the algorithm.
In addition, the CPU can also obtain the operation result of the FPGA after the operation ends. The process of obtaining the operation result may be: controlling the FPGA to store the operation result, then the CPU obtaining the result from storage and outputting it in various forms, such as a diagram or a voice prompt, which is not limited here.
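The host-side driving loop of step S5 (data to global memory, kernels launched per action, result read back) can be sketched end to end; here each device kernel is stood in for by a trivial Python function, purely to make the control flow concrete:

```python
# Toy sketch of step S5. Real work happens in the FPGA kernel program; the
# two "kernels" below are placeholders, and "global_memory" is just a list.

KERNELS = {
    "relu":   lambda xs: [max(0.0, x) for x in xs],
    "scale2": lambda xs: [2.0 * x for x in xs],  # placeholder operation
}

def run_on_accelerator(input_data, action_sequence):
    global_memory = list(input_data)       # host -> device transfer
    for action in action_sequence:         # follow the action sequence
        global_memory = KERNELS[action](global_memory)
    return global_memory                   # device -> host readback

result = run_on_accelerator([-1.0, 2.0], ["relu", "scale2"])
```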
It should be noted that deep learning, as a branch of machine learning, is one of the fastest-growing fields in artificial intelligence and can help computers understand large amounts of data in the form of images, sound and text. Recently, as deep-learning open-source tools such as caffe (Convolutional Architecture for Fast Feature Embedding) have matured, deep-learning technology has developed rapidly; at present, deep learning is widely applied in fields such as face recognition, speech recognition, precision medicine and autonomous driving. CNN is a kind of artificial neural network and the first deep-learning algorithm to truly and successfully train a multi-layer network structure. Developers create CNNs with computation-intensive algorithms and implement them on a variety of platforms. Because a CNN processes data through multiple connected layers of neurons, it can mimic the behavior of the biological visual nervous system and obtain very high recognition accuracy; it has become a research hotspot of current speech analysis and image recognition. CNNs have been used in check-reading systems, OCR (Optical Character Recognition) and handwriting-recognition systems, face recognition and license-plate recognition in street view, and face recognition in France Telecom's video-conferencing system.
Most existing CNN implementations are based on a general-purpose CPU. In a CNN network structure, the calculations within a layer are independent of one another, and the inter-layer structure can be understood as a pipeline structure. Owing to this specific calculation pattern, a general-purpose CPU, by its own nature, cannot fully exploit the parallelism inside a CNN; CPU implementations of CNNs are therefore inefficient and hard-pressed to meet performance requirements. Recently, accelerators based on FPGAs, GPUs (Graphics Processing Units) and even ASICs (Application-Specific Integrated Circuits) have successively been proposed to improve CNN performance. Among these schemes, FPGA-based accelerators have attracted the attention of more and more researchers because of their better performance, high energy efficiency, fast development cycle and reconfigurability. The FPGA is a computation-intensive acceleration component: by mapping an algorithm onto the parallel hardware of the FPGA, each hardware module designed on the FPGA can execute in parallel; the interconnection of the hardware modules' inputs and outputs and the pipeline structure provided by the FPGA match the CNN algorithm well, making full use of the parallelism inside the network structure and reducing energy consumption while improving operation speed. Scholars have previously implemented CNNs of different structures on FPGAs for simple real-time image recognition or classification, but most of these works only compute the more complex convolutional layers or are based on one specific neural network; for example, Aydonat et al. proposed a completely new CNN implementation framework that performs heterogeneous FPGA acceleration of the AlexNet network. When developers need to apply FPGA heterogeneous acceleration to a new convolutional neural network, they must redesign the FPGA implementation architecture according to the specific structure of the new network, which has poor versatility and flexibility.
The present invention provides an acceleration method for convolutional neural networks, including: receiving in advance the calculating-operation models of multiple preset types for CNNs; from the multiple calculating-operation models, obtaining the models that can realize each calculating operation of the CNN to be accelerated, as the stand-by calculating-operation models; controlling the FPGA of the accelerator card to compile, according to the stand-by models, the kernel program for executing the CNN to be accelerated; obtaining the action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated; and controlling the FPGA to execute the kernel program according to that action sequence and to operate on preset data, so as to realize acceleration.
It can be seen that, when an acceleration operation is to be executed for any CNN, the calculating-operation models realizing each of its calculating operations can be obtained from the preset models of multiple types and used as the stand-by models; the FPGA in the accelerator card is then controlled to compile the kernel program for executing the CNN to be accelerated according to the stand-by models; the action-sequence parameter containing the action sequence of each calculating operation is then obtained, and the FPGA is controlled to execute the kernel program according to that action sequence and to operate on the preset data, realizing acceleration. The present invention can therefore execute the acceleration of any CNN to be accelerated with any single accelerator card, without developing multiple kinds of accelerator card; flexibility is higher, and development cost is saved.
On the basis of the above embodiments:
As a preferred embodiment, obtaining the action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated specifically comprises:
converting the CNN to be accelerated into a CNN to be accelerated in the predetermined deep-learning framework;
obtaining the action-sequence parameter containing the action sequence of each calculating operation of the CNN to be accelerated in the predetermined deep-learning framework.
Specifically, considering that the CNN to be accelerated may come from deep-learning frameworks of multiple types, obtaining the action-sequence parameter directly from each of the different frameworks would require building every type of deep-learning framework in the CPU in advance. In the embodiments of the present invention, in order to save resources, only one predetermined deep-learning framework may be built in the CPU; in this case, it is only necessary to convert the CNN to be accelerated into a CNN of the predetermined framework, after which the CPU can obtain the action-sequence parameter of the CNN to be accelerated, saving resources.
Of course, the action-sequence parameter may also be obtained in other ways, for example by building multiple deep-learning frameworks in the CPU in advance and then directly obtaining the action-sequence parameter of the CNN to be accelerated, which is not limited here.
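At its simplest, the conversion step is a translation between framework layer vocabularies, so that a single parser for the predetermined framework can extract the action sequence. A hedged sketch, assuming caffe-style names as the target and a hand-written name table (the table and layer names are illustrative; real converters also translate weights and attributes):

```python
# Illustrative sketch: rename TensorFlow-style layer types into caffe-style
# equivalents, as one small part of converting a CNN into the predetermined
# deep-learning framework. The mapping table is an assumption for this sketch.

TF_TO_CAFFE = {
    "Conv2D": "Convolution",
    "MaxPool": "Pooling",
    "Relu": "ReLU",
    "LRN": "LRN",
}

def convert_layers(tf_layers):
    """Translate a list of TF-style layer types to caffe-style names."""
    return [TF_TO_CAFFE[layer] for layer in tf_layers]

converted = convert_layers(["Conv2D", "Relu", "LRN", "MaxPool"])
```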
As a preferred embodiment, the predetermined deep-learning framework is caffe or TensorFlow.
Specifically, caffe and TensorFlow are commonly used deep-learning frameworks; in this case, if the CNN to be accelerated is already a caffe or TensorFlow model, no deep-learning-framework conversion is needed, which further saves computing resources.
Of course, besides caffe and TensorFlow, the predetermined deep-learning framework may also be of another type, which is not limited here.
As a preferred embodiment, controlling the FPGA of the accelerator card to compile, according to the stand-by calculating-operation models, the kernel program for executing the CNN to be accelerated specifically comprises:
controlling the FPGA of the accelerator card to compile, according to the stand-by calculating-operation models and through its own hardware compilation platform, the kernel program for executing the CNN to be accelerated.
Specifically, compiling the kernel program with the FPGA's own hardware compilation platform saves cost, avoids exporting the data, and improves work efficiency.
Of course, ways other than using the FPGA's own hardware compilation platform to compile the kernel program may also be used, which is not limited here.
As a preferred embodiment, receiving in advance the calculating-operation models of multiple preset types for convolutional neural networks specifically comprises:
receiving in advance the calculating-operation models of multiple preset types for CNNs, implemented with OpenCL (Open Computing Language).
Specifically, OpenCL has advantages such as a simple structure and ease of use.
In the embodiments of the present invention, use can be made of the fact that the computing modules of the network layers in a CNN are mutually independent: each network-layer computing module commonly used in CNNs is implemented separately with OpenCL, the high-level programming language for the FPGA; the parallel optimization design of the OpenCL code is completed; the calculating-operation models of the multiple preset types are thus constructed; and all the calculating-operation models can be built into a network-layer calculation library.
Of course, the calculating-operation models may also be realized with programming languages other than OpenCL, which is not limited here.
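The network-layer calculation library described above can be pictured as a table mapping operation names to OpenCL kernel sources. The embedded kernel below is a plausible element-wise ReLU written for flavour; the patent gives no kernel code, so both the kernel and the library layout are assumptions:

```python
# Illustrative "network-layer calculation library": each preset operation
# maps to an OpenCL kernel source string the host can compile on demand.

RELU_KERNEL_SRC = """
__kernel void relu(__global const float *in, __global float *out) {
    size_t i = get_global_id(0);
    out[i] = in[i] > 0.0f ? in[i] : 0.0f;
}
"""

CALC_LIBRARY = {"relu": RELU_KERNEL_SRC}

def kernel_names(library):
    # The host selects kernels out of the library by operation name.
    return sorted(library)
```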
As a preferred embodiment, the preset types include the convolution operation, the pooling operation, ReLU (Rectified Linear Unit) and the Norm function.
Specifically, the convolution operation, the pooling operation, ReLU and the Norm function are calculating operations commonly used in various CNNs, so these preset types can realize various kinds of CNN well.
Of course, the preset types may also include other types, which is not limited here.
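To make concrete what each preset operation type computes, here are unoptimized 1-D reference versions in plain Python (the FPGA models would be 2-D/3-D and parallel; the L2 norm below merely stands in for a Norm-style layer, which the patent does not define precisely):

```python
# Pure-Python reference sketches of the four preset operation types.

def relu(xs):
    return [max(0.0, x) for x in xs]

def pool_max(xs, size):
    # Non-overlapping max pooling.
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]

def conv1d(xs, weights):
    # 'Valid' 1-D convolution in the correlation form used by CNNs.
    n = len(weights)
    return [sum(xs[i + j] * weights[j] for j in range(n))
            for i in range(len(xs) - n + 1)]

def norm(xs, eps=1e-6):
    # Simple L2 normalization standing in for a Norm layer.
    scale = (sum(x * x for x in xs) + eps) ** 0.5
    return [x / scale for x in xs]
```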
Referring to FIG. 2, Fig. 2 is a kind of accelerator of convolutional neural networks provided by the invention, comprising:
a receiving module 1, configured to receive, in advance, computing operation models of multiple preset types in a preset convolutional neural network (CNN);
a first obtaining module 2, configured to obtain, from the multiple computing operation models, the computing operation models capable of implementing each computing operation of a CNN to be accelerated, as standby computing operation models;
a first control module 3, configured to control a field-programmable gate array (FPGA) of an accelerator card to compile, according to the standby computing operation models, a kernel program for executing the CNN to be accelerated;
a second obtaining module 4, configured to obtain an action-sequence parameter containing the action sequence of each computing operation of the CNN to be accelerated; and
a second control module 5, configured to control the FPGA to execute the kernel program according to the action sequence in the action-sequence parameter and to operate on preset data, thereby achieving acceleration.
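The five-module flow above can be imitated in a short host-side sketch. Everything in it — the model registry, the simulated "compile" step, and the sequence format — is a hypothetical illustration standing in for the FPGA/OpenCL toolchain, not the patented implementation:

```python
# Hypothetical host-side sketch of the five modules: a library of preset
# operation models, selection of the standby models needed by the target CNN,
# a simulated compile step standing in for the FPGA hardware compilation of
# the OpenCL sources, and execution driven by an action-sequence parameter.

MODEL_LIBRARY = {                          # receiving module (1): preset models
    "conv": lambda x, p: [v * p["weight"] for v in x],
    "pool": lambda x, p: [max(x)],
    "relu": lambda x, p: [max(0, v) for v in x],
}

def select_standby_models(required_ops):   # first obtaining module (2)
    return {op: MODEL_LIBRARY[op] for op in required_ops}

def compile_kernel(standby_models):        # first control module (3)
    def kernel(op, data, params):          # the "compiled kernel program"
        return standby_models[op](data, params)
    return kernel

def run(kernel, action_sequence, data):    # second control module (5)
    for op, params in action_sequence:     # action-sequence parameter (4)
        data = kernel(op, data, params)
    return data
```

A two-step sequence such as `[("conv", {"weight": 2}), ("relu", {})]` then drives the same compiled kernel through both operations in order, which is the role the action-sequence parameter plays in the apparatus.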
As a preferred embodiment, the second obtaining module 4 comprises:
a conversion module, configured to convert the CNN to be accelerated into a CNN to be accelerated of a predetermined deep-learning framework; and
an acquisition submodule, configured to obtain the action-sequence parameter containing the action sequence of each computing operation of the CNN to be accelerated of the predetermined deep-learning framework.
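As a hypothetical illustration of the conversion module and the acquisition submodule, a framework-style network description (a Caffe-like list of layer dicts is assumed here purely for illustration) can be flattened into an ordered action-sequence parameter:

```python
# Hypothetical sketch: normalize a raw CNN description into a
# deep-learning-framework style layer list (conversion module), then
# extract the ordered action-sequence parameter (acquisition submodule).

def to_framework(layers):
    """Conversion module: normalize raw layer specs into framework form."""
    return [{"type": l["type"].lower(), "params": l.get("params", {})}
            for l in layers]

def action_sequence_parameter(framework_net):
    """Acquisition submodule: ordered (operation, params) pairs."""
    return [(layer["type"], layer["params"]) for layer in framework_net]
```

The resulting list of (operation, parameters) pairs is exactly the shape of sequence the second control module would feed to the kernel program.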
For an introduction to the acceleration apparatus for a convolutional neural network provided by the present invention, please refer to the foregoing embodiments of the acceleration method; details are not repeated here.
Please refer to FIG. 3. FIG. 3 shows acceleration equipment for a convolutional neural network provided by the present invention, comprising:
a memory 6, configured to store a computer program; and
a processor 7, configured to implement, when executing the computer program, the steps of the acceleration method for a convolutional neural network in the foregoing embodiments.
For an introduction to the acceleration equipment for a convolutional neural network provided by the present invention, please refer to the foregoing embodiments of the acceleration method; details are not repeated here.
The present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the acceleration method for a convolutional neural network in the foregoing embodiments are implemented.
For an introduction to the computer-readable storage medium provided by the present invention, please refer to the foregoing embodiments of the acceleration method; details are not repeated here.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may refer to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
It should also be noted that, in this specification, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements, but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. In the absence of further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes that element.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. An acceleration method for a convolutional neural network, characterized by comprising:
receiving, in advance, computing operation models of multiple preset types in a preset convolutional neural network (CNN);
obtaining, from the multiple computing operation models, the computing operation models capable of implementing each computing operation of a CNN to be accelerated, as standby computing operation models;
controlling a field-programmable gate array (FPGA) of an accelerator card to compile, according to the standby computing operation models, a kernel program for executing the CNN to be accelerated;
obtaining an action-sequence parameter containing the action sequence of each computing operation of the CNN to be accelerated; and
controlling the FPGA to execute the kernel program according to the action sequence in the action-sequence parameter and to operate on preset data, thereby achieving acceleration.
2. The acceleration method according to claim 1, characterized in that the obtaining of the action-sequence parameter containing the action sequence of each computing operation of the CNN to be accelerated is specifically:
converting the CNN to be accelerated into the CNN to be accelerated of a predetermined deep-learning framework; and
obtaining the action-sequence parameter containing the action sequence of each computing operation of the CNN to be accelerated of the predetermined deep-learning framework.
3. The acceleration method according to claim 2, characterized in that the predetermined deep-learning framework is Caffe or TensorFlow.
4. The acceleration method according to claim 2, characterized in that the controlling of the field-programmable gate array (FPGA) of the accelerator card to compile, according to the standby computing operation models, the kernel program for executing the CNN to be accelerated is specifically:
controlling the FPGA of the accelerator card to compile, according to the standby computing operation models and through its own hardware compilation platform, the kernel program for executing the CNN to be accelerated.
5. The acceleration method according to claim 4, characterized in that the receiving, in advance, of the computing operation models of multiple preset types in the preset convolutional neural network (CNN) is specifically:
receiving, in advance, computing operation models of multiple preset types in the preset CNN implemented with Open Computing Language (OpenCL).
6. The acceleration method according to any one of claims 1 to 5, characterized in that the preset types include a convolution operation, a pooling operation, a rectified linear unit (ReLU) function and a Norm function.
7. An acceleration apparatus for a convolutional neural network, characterized by comprising:
a receiving module, configured to receive, in advance, computing operation models of multiple preset types in a preset convolutional neural network (CNN);
a first obtaining module, configured to obtain, from the multiple computing operation models, the computing operation models capable of implementing each computing operation of a CNN to be accelerated, as standby computing operation models;
a first control module, configured to control a field-programmable gate array (FPGA) of an accelerator card to compile, according to the standby computing operation models, a kernel program for executing the CNN to be accelerated;
a second obtaining module, configured to obtain an action-sequence parameter containing the action sequence of each computing operation of the CNN to be accelerated; and
a second control module, configured to control the FPGA to execute the kernel program according to the action sequence in the action-sequence parameter and to operate on preset data, thereby achieving acceleration.
8. The acceleration apparatus according to claim 7, characterized in that the second obtaining module comprises:
a conversion module, configured to convert the CNN to be accelerated into the CNN to be accelerated of a predetermined deep-learning framework; and
an acquisition submodule, configured to obtain the action-sequence parameter containing the action sequence of each computing operation of the CNN to be accelerated of the predetermined deep-learning framework.
9. Acceleration equipment for a convolutional neural network, characterized by comprising:
a memory, configured to store a computer program; and
a processor, configured to implement, when executing the computer program, the steps of the acceleration method for a convolutional neural network according to any one of claims 1 to 6.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the acceleration method for a convolutional neural network according to any one of claims 1 to 6 are implemented.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910016345.9A CN109858610A (en) | 2019-01-08 | 2019-01-08 | A kind of accelerated method of convolutional neural networks, device, equipment and storage medium |
PCT/CN2019/103637 WO2020143236A1 (en) | 2019-01-08 | 2019-08-30 | Method, device, and equipment for accelerating convolutional neural network, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109858610A true CN109858610A (en) | 2019-06-07 |
Family
ID=66894174
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910016345.9A Pending CN109858610A (en) | 2019-01-08 | 2019-01-08 | A kind of accelerated method of convolutional neural networks, device, equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109858610A (en) |
WO (1) | WO2020143236A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110929860A (en) * | 2019-11-07 | 2020-03-27 | 深圳云天励飞技术有限公司 | Convolution acceleration operation method and device, storage medium and terminal equipment |
WO2020143236A1 (en) * | 2019-01-08 | 2020-07-16 | 广东浪潮大数据研究有限公司 | Method, device, and equipment for accelerating convolutional neural network, and storage medium |
WO2021077284A1 (en) * | 2019-10-22 | 2021-04-29 | 深圳鲲云信息科技有限公司 | Neural network operating system and method |
CN115829064A (en) * | 2023-02-17 | 2023-03-21 | 山东浪潮科学研究院有限公司 | Method, device and equipment for accelerating federated learning and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170103298A1 (en) * | 2015-10-09 | 2017-04-13 | Altera Corporation | Method and Apparatus for Designing and Implementing a Convolution Neural Net Accelerator |
CN107463990A (en) * | 2016-06-02 | 2017-12-12 | 国家计算机网络与信息安全管理中心 | A kind of FPGA parallel acceleration methods of convolutional neural networks |
US20180114117A1 (en) * | 2016-10-21 | 2018-04-26 | International Business Machines Corporation | Accelerate deep neural network in an fpga |
CN107992299A (en) * | 2017-11-27 | 2018-05-04 | 郑州云海信息技术有限公司 | Neutral net hyper parameter extraction conversion method, system, device and storage medium |
CN108710941A (en) * | 2018-04-11 | 2018-10-26 | 杭州菲数科技有限公司 | The hard acceleration method and device of neural network model for electronic equipment |
CN108764466A (en) * | 2018-03-07 | 2018-11-06 | 东南大学 | Convolutional neural networks hardware based on field programmable gate array and its accelerated method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10303953B2 (en) * | 2017-04-17 | 2019-05-28 | Intel Corporation | Person tracking and privacy and acceleration of data using autonomous machines |
CN107657581B (en) * | 2017-09-28 | 2020-12-22 | 中国人民解放军国防科技大学 | Convolutional neural network CNN hardware accelerator and acceleration method |
CN109858610A (en) * | 2019-01-08 | 2019-06-07 | 广东浪潮大数据研究有限公司 | A kind of accelerated method of convolutional neural networks, device, equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
Cao Fang et al., "Implementation of a single-machine multi-card CNN acceleration algorithm based on FPGA", 2017 Power Industry Informatization Annual Conference * |
Also Published As
Publication number | Publication date |
---|---|
WO2020143236A1 (en) | 2020-07-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109858610A (en) | A kind of accelerated method of convolutional neural networks, device, equipment and storage medium | |
US11783227B2 (en) | Method, apparatus, device and readable medium for transfer learning in machine learning | |
CN111651207B (en) | Neural network model operation chip, method, device, equipment and medium | |
CN110287489A (en) | Document creation method, device, storage medium and electronic equipment | |
CN106528613B (en) | Intelligent answer method and device | |
CN110750298B (en) | AI model compiling method, equipment and storage medium | |
CN110717584A (en) | Neural network compiling method, compiler, computer device, and readable storage medium | |
Rosenbloom et al. | Towards emotion in sigma: from appraisal to attention | |
CN105894043A (en) | Method and system for generating video description sentences | |
CN113157917A (en) | OpenCL-based optimized classification model establishing and optimized classification method and system | |
US9336195B2 (en) | Method and system for dictionary noise removal | |
CN110109658B (en) | ROS code generator based on formalized model and code generation method | |
EP4318319A1 (en) | Model processing method and apparatus | |
Wen et al. | Taso: Time and space optimization for memory-constrained DNN inference | |
US20220067495A1 (en) | Intelligent processor, data processing method and storage medium | |
CN111831285B (en) | Code conversion method, system and application for memory computing platform | |
Benmeziane | Comparison of deep learning frameworks and compilers | |
CN111190690A (en) | Intelligent training device based on container arrangement tool | |
US20230419039A1 (en) | Named Entity Recognition Using Capsule Networks | |
Tarasyuk et al. | Stochastic process reduction for performance evaluation in dtsiPBC | |
CN110825530B (en) | Instruction execution method and device for artificial intelligence chip | |
CN106126311A (en) | A kind of intermediate code optimization method based on algebraically calculation | |
US20230214598A1 (en) | Semantic Frame Identification Using Capsule Networks | |
CN111475775B (en) | Data processing method, text processing method, device and equipment of graphic processor | |
Benmeziane | Accelerating a Deep Learning Framework with Tiramisu |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190607 |