CN109740619A - Neural network terminal operating method and device for target identification - Google Patents
- Publication number
- CN109740619A (application CN201811609115.5A)
- Authority
- CN
- China
- Prior art keywords
- layer
- convolutional layer
- parameter
- model
- neural networks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a neural network terminal operating method and device for target identification. The method comprises: obtaining a trained deep convolutional neural network model; storing the model parameters of the deep convolutional neural network in an external DDR memory, wherein the model parameters include fully-connected layer parameters and preprocessed convolutional layer parameters; storing the model framework of the deep convolutional neural network in a system-on-chip FPGA, wherein the convolutional layers in the model framework are assigned to the programmable logic module PL, and the pooling layers, fully-connected layers and activation layers in the model framework are assigned to the processing system module PS; and processing a target image with the deep convolutional neural network model to identify the target. The invention solves the technical problem that processors in the related art cannot meet the service requirements of deep convolutional neural networks in target identification.
Description
Technical field
The invention belongs to the field of intelligent algorithms, and relates to a neural network terminal operating method and device for target identification.
Background technique
Infrared imaging is widely used in military seekers, where identifying the target while a munition closes in for a strike is critical. As the munition approaches, the imaged shape and outline of the target change considerably, so traditional image processing struggles to extract intrinsic features for identification. Automatic target recognition with a deep convolutional neural network can achieve a very high recognition rate, but because the parameter count is very large, storing the model parameters is a serious problem, and porting the network to a terminal also poses a major challenge.

In the related art, the deep convolutional neural network model is mostly compressed by pruning its parameters to fit the constraints of current hardware. But this optimization has limits: as deep convolutional neural networks evolve, the depth and width of network models keep growing, while the resources of the programmable logic module PL (Programmable Logic), such as block random access memory BRAM (block random access memory) and flip-flops FF (Flip Flop), grow far more slowly. Fully realizing a deep convolutional neural network in the PL during target identification is therefore impractical.

No effective solution to the above problem has yet been proposed.
Summary of the invention
The present invention provides a neural network terminal operating method and device for target identification, at least solving the technical problem that hardware processors in the related art cannot meet the service requirements of deep convolutional neural networks in target identification.

The technical solution of the invention is as follows: a neural network terminal operating method for target identification, comprising: obtaining a trained deep convolutional neural network model; storing the model parameters of the deep convolutional neural network in an external DDR memory, wherein the model parameters include fully-connected layer parameters and preprocessed convolutional layer parameters; storing the model framework of the deep convolutional neural network in a system-on-chip FPGA, wherein the convolutional layers in the model framework are assigned to the programmable logic module PL, and the pooling layers, fully-connected layers and activation layers in the model framework are assigned to the processing system module PS; and processing a target image with the deep convolutional neural network model to identify the target.
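The layer-to-module partitioning described above can be sketched as follows. This is a minimal illustrative model in Python; the names (`assign_layers`, `PL_LAYERS`, `PS_LAYERS`) are hypothetical and not taken from the patent:

```python
PL_LAYERS = {"conv"}                      # compute-heavy layers -> programmable logic
PS_LAYERS = {"pool", "fc", "activation"}  # lighter layers -> processing system

def assign_layers(model_layers):
    """Split a (name, kind) layer list into the PL and PS partitions."""
    placement = {"PL": [], "PS": []}
    for name, kind in model_layers:
        if kind in PL_LAYERS:
            placement["PL"].append(name)
        elif kind in PS_LAYERS:
            placement["PS"].append(name)
        else:
            raise ValueError(f"unknown layer kind: {kind}")
    return placement

layers = [("conv1", "conv"), ("pool1", "pool"), ("conv2", "conv"),
          ("fc1", "fc"), ("relu1", "activation")]
print(assign_layers(layers))
```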
Optionally, processing the target image with the deep convolutional neural network model to identify the target comprises: extracting the convolutional layer parameters to the PL and, based on the convolutional layers and the target image, performing the convolutional layer computation in the PL to obtain the feature data of the target image; and extracting the fully-connected layer parameters to the PS and, based on the pooling layers, fully-connected layers, activation layers and the feature data, computing in the PS to obtain and output the recognition result of the target.
Optionally, extracting the convolutional layer parameters to the PL and, based on the convolutional layers and the target image, performing the convolutional layer computation in the PL to obtain the feature data of the target image comprises: designing convolutional layer computing modules, each of which performs a convolution kernel computation, wherein there are multiple convolutional layers, each convolutional layer of the target image contains multiple convolution kernels, and there are multiple convolutional layer computing modules; assigning different convolutional layer computing modules to compute in parallel according to the PL; and obtaining the feature data of the target image from the target image and the multiple convolutional layer computing modules.
Optionally, designing a convolutional layer computing module comprises: computing a two-dimensional fast Fourier transform 2DFFT of the target image to obtain the frequency-domain image data of the target image; performing a complex multiplication of the frequency-domain image data with the convolutional layer parameters; and processing the product with a two-dimensional inverse fast Fourier transform 2DIFFT, extracting the real part of the result, and outputting it.
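A minimal numpy sketch of this FFT-based convolutional computing module may clarify the data flow: 2DFFT of the image, element-wise complex multiplication with the transformed kernel, 2DIFFT, then real-part extraction. Note this computes a circular convolution; the function names are illustrative, not from the patent:

```python
import numpy as np

def fft_conv2d(img, kernel):
    H, W = img.shape
    kernel_freq = np.fft.fft2(kernel, s=(H, W))  # kernel 2DFFT (the "preprocessing")
    img_freq = np.fft.fft2(img)                  # 2DFFT of the target image
    product = img_freq * kernel_freq             # element-wise complex multiplication
    return np.fft.ifft2(product).real            # 2DIFFT, then real-part extraction

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
ker = rng.standard_normal((3, 3))
out = fft_conv2d(img, ker)
```

The result matches a direct circular convolution of the image with the zero-padded kernel.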
Optionally, computing the 2DFFT of the target image comprises: for the original image IMG(x, y), taking the data of each row in turn and applying a 1DFFT; then, for the result of the previous step, taking the data of each column in turn and applying a 1DFFT again. Processing the complex-multiplication product with the 2DIFFT comprises: for the product IMG_K(x, y), taking the data of each row in turn and applying a 1DIFFT; then, for the result of the previous step, taking the data of each column in turn and applying a 1DIFFT again. Here K1, P1, K2 and P2 are parameters after the frequency-domain transform.
Optionally, the preprocessing is: performing a 2DFFT computation on the convolutional layer parameters.
Optionally, each time a convolutional layer computing module in the PL finishes a computation, the PS is notified through the interrupt system to schedule the next computation.
Optionally, the parameter precision of the first convolutional layer is INT32, that of the second convolutional layer is INT16, that of the third convolutional layer is INT8, and that of the fourth and subsequent convolutional layers is INT4.
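The decreasing precision schedule can be illustrated with a small sketch. The symmetric uniform quantizer below is an assumption for demonstration only; the patent does not specify the quantization scheme:

```python
# Convolutional layer index -> bit width; fourth and later layers use INT4.
PRECISION_BITS = {1: 32, 2: 16, 3: 8}

def layer_bits(layer_idx):
    """INT32 -> INT16 -> INT8, then the minimum INT4 for all further layers."""
    return PRECISION_BITS.get(layer_idx, 4)

def quantize(weights, bits):
    """Map float weights onto signed integers of the given width (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(w) for w in weights) or 1.0
    scale = peak / qmax
    return [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights], scale

q, scale = quantize([0.5, -1.0, 0.25], layer_bits(3))  # third layer -> INT8
```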
According to another aspect of the present invention, another technical solution is also proposed: a neural network terminal operating device for target identification, comprising: an obtaining module for obtaining a trained deep convolutional neural network model; a first access module for storing the model parameters of the deep convolutional neural network in an external DDR memory, wherein the model parameters include fully-connected layer parameters and preprocessed convolutional layer parameters; a second access module for storing the model framework of the deep convolutional neural network in a system-on-chip FPGA, wherein the convolutional layers in the model framework are assigned to the programmable logic module PL, and the pooling layers, fully-connected layers and activation layers in the model framework are assigned to the processing system module PS; and an identification module for processing a target image with the deep convolutional neural network model to identify the target.
According to another aspect of the present invention, a processor is also proposed, the processor being configured to run a program which, when running, executes any one of the above neural network terminal operating methods for target identification.
The neural network terminal operating method for target identification of the invention is based on an analysis of the computational characteristics of deep convolutional neural networks: the compute-intensive convolutional layer computations of the network model are built in the PL, while the lighter pooling, fully-connected and activation layer computations are built in the PS; the PS also performs network computation scheduling, and the convolutional layer parameters are preprocessed before porting to reduce the convolutional computation load in the PL after porting. Unlike the related-art approach of fully realizing the entire deep convolutional network in the PL after pruning-based parameter compression, the invention solves the technical problem that, because the parameter count of deep convolutional neural networks is large and the BRAM resources in the PL are scarce, hardware processors cannot meet the service requirements of deep convolutional neural networks in target identification, achieving the technical effect of higher target-identification precision and faster speed while the target's appearance changes in motion.
Detailed description of the invention
Fig. 1 is a flowchart of the neural network terminal operating method for target identification according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the acceleration design of a convolutional layer computing module according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the neural network terminal operating system for target identification according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the neural network terminal operating device for target identification according to an embodiment of the present invention.
Specific embodiment
To help those skilled in the art better understand the solution of the present invention, the invention is described below with reference to the accompanying drawings and embodiments.

According to an embodiment of the present invention, a method embodiment of neural network terminal operating for target identification is provided. It should be noted that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in an order different from the one given here.
Fig. 1 is a flowchart of the neural network terminal operating method for target identification according to an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:

Step S101: obtain a trained deep convolutional neural network model;
Step S102: store the model parameters of the deep convolutional neural network in an external DDR (Double Data Rate SDRAM) memory, wherein the model parameters include fully-connected layer parameters and preprocessed convolutional layer parameters;
Step S103: store the model framework of the deep convolutional neural network in a system-on-chip FPGA (Field Programmable Gate Array), wherein the convolutional layers in the model framework are assigned to the programmable logic module PL, and the pooling layers, fully-connected layers and activation layers are assigned to the processing system module PS (Processing System);
Step S104: process a target image with the deep convolutional neural network model to identify the target.
It should be noted that, for a deep convolutional neural network implementation, the percentage of total prediction time consumed by each layer can be obtained beforehand by simulation on candidate processor terminals (e.g. CPU, GPU, FPGA, ASIC). The convolutional layers account for the highest share of total computation time, while the fully-connected, pooling and activation layers account for less, so optimization of a deep convolutional neural network should mainly target the convolutional layers.
Based on this analysis of the computational characteristics of deep convolutional neural networks, the above steps allow the embodiment of the invention to build the compute-intensive convolutional layer computations of the network model in the PL and the lighter pooling, fully-connected and activation layer computations in the PS; the embodiment of the invention also uses the PS for network computation scheduling and preprocesses the convolutional layer parameters before porting, to reduce the convolutional computation load in the PL after porting.

Unlike the related-art approach of fully realizing the entire deep convolutional network in the PL after pruning-based parameter compression, the embodiment of the invention solves the technical problem that, because the parameter count of deep convolutional neural networks is large and the BRAM resources in the PL are scarce, hardware processors cannot meet the service requirements of deep convolutional neural networks in target identification, achieving the technical effect of higher target-identification precision and faster speed while the target's appearance changes in motion.
Optionally, processing the target image with the deep convolutional neural network model to identify the target may comprise: extracting the convolutional layer parameters to the PL and, based on the convolutional layers and the target image, performing the convolutional layer computation in the PL to obtain the feature data of the target image; and extracting the fully-connected layer parameters to the PS and, based on the pooling layers, fully-connected layers, activation layers and the feature data, computing in the PS to obtain and output the recognition result of the target. That is, performing the convolutional layer computation in the PL and the pooling, fully-connected and activation layer computations in the PS solves the porting problem brought by the expansion of deep convolutional neural networks, and further accelerates identification.
Further preferably, extracting the convolutional layer parameters to the PL and performing the convolutional layer computation in the PL to obtain the feature data of the target image comprises: designing convolutional layer computing modules, each of which performs a convolution kernel computation, wherein there are multiple convolutional layers, each convolutional layer of the target image contains multiple convolution kernels, and there are multiple convolutional layer computing modules; assigning different convolutional layer computing modules to compute in parallel according to the PL; and obtaining the feature data of the target image from the target image and the multiple convolutional layer computing modules.
Infrared imaging is widely used in military seekers, where the demand for target identification is especially prominent. For infrared targets, the resolution of the seeker's infrared image is itself very low, so the identification demand is large. When running the neural network terminal for identifying small infrared targets, because small infrared images are small in size and single-channel, the input target image is first standardized before entering the processor terminal. Since the munition approaches the target from far to near, the size of the target in the field of view (the number of pixels it occupies) varies, so a module size is determined according to realistic target-size characteristics, the input candidate-region target is scaled to the specified size, and the scaled image is then fed into the convolutional neural network for target identification.
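The standardization step above can be sketched with a nearest-neighbour resize of a single-channel candidate-region patch to a fixed module input size. The size 8 below is a placeholder; the patent does not state the actual module dimensions:

```python
def scale_to_module_size(patch, size):
    """Nearest-neighbour resize of a 2-D list to size x size (illustrative)."""
    h, w = len(patch), len(patch[0])
    return [[patch[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

small = [[r * 4 + c for c in range(4)] for r in range(4)]   # 4x4 test patch
resized = scale_to_module_size(small, size=8)               # upscale to 8x8
```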
Then, combining the design sizes of the trained convolution kernels in the model parameters, the corresponding convolutional layer computing modules are designed. For small infrared targets, 3 to 5 convolutional layers in a deep convolutional neural network can extract object features well. Considering the reusability of computing modules within the same convolutional layer, at least one convolutional layer computing module must be designed for each convolutional layer, and module categories are designed for the different input image sizes of the different convolutional layers. Meanwhile, according to the processing characteristics of the processor resources, multiple instances of the same category of convolutional layer computing module can be designed, so that when computing a given convolutional layer, multiple convolution kernels can be processed in multi-stage pipelined parallel fashion at the algorithm level, avoiding idle waste of logic resources.
To further accelerate computation at the algorithm level, designing a convolutional layer computing module preferably comprises: computing a two-dimensional fast Fourier transform 2DFFT (2D Fast Fourier Transformation) of the target image to obtain its frequency-domain image data; performing a complex multiplication of the frequency-domain image data with the convolutional layer parameters; and processing the product with a two-dimensional inverse fast Fourier transform 2DIFFT (2D Inverse Fast Fourier Transform), extracting the real part of the result, and outputting it.
Fig. 2 is a schematic diagram of the acceleration design of a convolutional layer computing module according to an embodiment of the present invention. As shown in Fig. 2, the embodiment first preprocesses the convolutional layer parameters before the network model is ported. After porting, when a target image is input, the convolutional layer computing module performs a 2DFFT on the target image, performs a complex multiplication with the preprocessed convolutional layer parameters, applies a 2DIFFT to the product, and extracts the real part of the final result as the output of each convolutional layer computing module.
Here the preprocessing of the convolutional layer parameters is: performing a 2DFFT on the convolutional layer parameters. Preprocessing the convolutional layer parameters before porting reduces the convolutional computation load in the PL after porting, further speeding up the computation of the whole network model. Moreover, accelerating the convolutional layer computation through the FFT not only reduces the computation load and occupies fewer logic resources in the PL, but also speeds up the computation and reduces the computation time.
Let the original image be IMG(x, y) (x = 0, 1, ..., M-1; y = 0, 1, ..., N-1), and let the preprocessed convolutional layer parameter complex matrix be Kernel(x, y) (x = 0, 1, ..., 2M-1; y = 0, 1, ..., N-1), where x and y denote the horizontal and vertical coordinates in the two-dimensional image. The computation proceeds as follows:

Step 1: for the original image IMG(x, y), take the data of each row in turn and apply a 1DFFT;
Step 2: for the result of step 1, take the data of each column in turn and apply a 1DFFT again;
Step 3: multiply the result IMG2(x, y) of step 2 element-wise with Kernel(x, y) as a complex multiplication: IMG_K(x, y) = IMG2(x, y) · Kernel(x, y);
Step 4: for the result of step 3, take the data of each row in turn and apply a 1DIFFT;
Step 5: for the result of step 4, take the data of each column in turn and apply a 1DIFFT again;
Step 6: take the real part of the step-5 output IMG4(x, y) as the output.

Here K1, P1, K2 and P2 are parameters after the frequency-domain transform.
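The displayed formulas for steps 1, 2, 4 and 5 are not reproduced in this text; the standard separable row/column DFT forms they describe are as follows (textbook definitions, which may differ in normalization and symbol names from the patent's originals):

$$\mathrm{IMG}_1(x, v) = \sum_{y=0}^{N-1} \mathrm{IMG}(x, y)\, e^{-j 2\pi v y / N}, \qquad \mathrm{IMG}_2(u, v) = \sum_{x=0}^{M-1} \mathrm{IMG}_1(x, v)\, e^{-j 2\pi u x / M},$$

and for the inverse path applied to the product IMG_K(u, v):

$$\mathrm{IMG}_3(u, y) = \frac{1}{N} \sum_{v=0}^{N-1} \mathrm{IMG\_K}(u, v)\, e^{j 2\pi v y / N}, \qquad \mathrm{IMG}_4(x, y) = \frac{1}{M} \sum_{u=0}^{M-1} \mathrm{IMG}_3(u, y)\, e^{j 2\pi u x / M}.$$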
Preferably, each time a convolutional layer computing module in the PL finishes a computation, the PS is notified through the interrupt system to schedule the next computation. During the computation of each convolutional layer computing module, the embodiment of the invention can allocate convolutional layer computing modules to parallel computation according to the resource capacity of the processor chip. Fig. 3 is a schematic diagram of the neural network terminal operating system for target identification according to an embodiment of the present invention. As shown in Fig. 3, each convolutional layer computing module is controlled and scheduled by the PS over an AXI_Lite interface. Whenever a convolutional layer computing module completes a computation, i.e. after the convolution kernels in that module finish computing, the PL notifies the PS through the interrupt system; the PS completes a new scheduling round and, through the GP ports, directs a waiting IP to start a new round of multi-stage pipelined computation. Each convolutional layer computing module comprises a CNN acceleration part and a CNN control part; ANN is the fully-connected layer computing module, whose computation can also be ported into the PL depending on the chip's logic-resource usage; the HP ports are the high-speed bus interfaces for communication between the PS and the PL.
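The interrupt-driven scheduling loop can be modelled in miniature as follows. This is a toy simulation, not firmware: the "interrupt" and module objects are simulated, and all names are hypothetical:

```python
from collections import deque

def ps_scheduler(conv_jobs, num_modules=2):
    """Toy PS-side loop: each PL module 'interrupts' when its kernel computation
    finishes, and the PS dispatches the next pending job to the freed module."""
    pending = deque(conv_jobs)
    free_modules = deque(range(num_modules))
    running = {}                               # module id -> job in flight
    completed = []
    while pending or running:
        while pending and free_modules:        # GP-port dispatch (simulated)
            m = free_modules.popleft()
            running[m] = pending.popleft()
        m, job = next(iter(running.items()))   # simulated interrupt from module m
        completed.append(job)
        del running[m]
        free_modules.append(m)                 # module freed for the next round
    return completed

print(ps_scheduler(["conv1_k0", "conv1_k1", "conv2_k0"]))
```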
To improve the generalization ability of the deep convolutional neural network when extracting typical features, preferably, in the network structure, the parameter precision of the first convolutional layer may be INT32, that of the second may be INT16, that of the third may be INT8, and that of the fourth and subsequent convolutional layers is INT4. The embodiment of the invention thus computes with successively decreasing parameter precision (INT32 → INT16 → INT8 → INT4); if further convolutional layers must be added, they are extended at the minimum precision of INT4. Converting the floating-point parameters to fixed point not only reduces parameter storage, occupying fewer storage resources and less output-transmission latency, but fixed-point computation also occupies fewer logic resources than floating-point computation and reduces computation delay.
Because the intermediate results of a small single-channel infrared image need only a small cache during FFT-accelerated computation, the result of the accelerated convolutional layer computation can, according to the actual situation, either be stored in BRAM or buffered in external memory through the HP1 and HP2 interfaces. When the result data volume is large, for example in high-definition image applications, the intermediate results in the computation are too large to be stored in the BRAM resources and must be dumped to external memory; the flexible design of this storage scheme therefore satisfies practical application requirements.
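The storage decision just described reduces to a size check against the on-chip budget. In this sketch the BRAM capacity is a placeholder value, not a figure from the patent:

```python
BRAM_CAPACITY_BYTES = 512 * 1024   # placeholder on-chip budget (illustrative)

def choose_buffer(height, width, bytes_per_elem=4):
    """Keep an intermediate result in BRAM when it fits; otherwise buffer it
    in external DDR over the HP ports, mirroring the scheme described above."""
    size = height * width * bytes_per_elem
    return "BRAM" if size <= BRAM_CAPACITY_BYTES else "external DDR"

print(choose_buffer(64, 64))       # small infrared intermediate result
print(choose_buffer(1920, 1080))   # high-definition intermediate result
```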
Further, the processor chip in the embodiment of the present invention may be the XILINX XC7Z030-2FBG484I.
According to an embodiment of the present invention, a device embodiment of neural network terminal operating for target identification is also provided. Fig. 4 is a schematic diagram of the neural network terminal operating device for target identification according to an embodiment of the present invention. As shown in Fig. 4, the device comprises an obtaining module 41, a first access module 42, a second access module 43 and an identification module 44, wherein:

the obtaining module 41 is configured to obtain a trained deep convolutional neural network model;

the first access module 42, connected to the obtaining module 41, is configured to store the model parameters of the deep convolutional neural network in an external DDR memory, wherein the model parameters include fully-connected layer parameters and preprocessed convolutional layer parameters;

the second access module 43, connected to the first access module 42, is configured to store the model framework of the deep convolutional neural network in a system-on-chip FPGA, wherein the convolutional layers in the model framework are assigned to the programmable logic module PL, and the pooling layers, fully-connected layers and activation layers in the model framework are assigned to the processing system module PS;

the identification module 44, connected to the second access module 43, is configured to process a target image with the deep convolutional neural network model to identify the target.
According to an embodiment of the present invention, a processor is also provided, the processor being configured to run a program which, when running, executes any one of the above neural network terminal operating methods for target identification.

Content not described in detail in the present description belongs to techniques well known to those skilled in the art.
Claims (10)
1. A neural network terminal operating method for target identification, characterized by comprising:
obtaining a trained deep convolutional neural network model;
storing the model parameters of the deep convolutional neural network in an external DDR memory, wherein the model parameters include fully-connected layer parameters and preprocessed convolutional layer parameters;
storing the model framework of the deep convolutional neural network in a system-on-chip FPGA, wherein the convolutional layers in the model framework are assigned to the programmable logic module PL, and the pooling layers, fully-connected layers and activation layers in the model framework are assigned to the processing system module PS; and
processing a target image with the deep convolutional neural network model to identify the target.
2. The method according to claim 1, characterized in that processing the target image with the deep convolutional neural network model to identify the target comprises:
extracting the convolutional layer parameters to the PL and, based on the convolutional layers and the target image, performing the convolutional layer computation in the PL to obtain the feature data of the target image; and
extracting the fully-connected layer parameters to the PS and, based on the pooling layers, the fully-connected layers, the activation layers and the feature data, computing in the PS to obtain and output the recognition result of the target.
3. The method according to claim 2, characterized in that extracting the convolutional layer parameters to the PL and, based on the convolutional layers and the target image, performing the convolutional layer computation in the PL to obtain the feature data of the target image comprises:
designing convolutional layer computing modules, each of which performs a convolution kernel computation, wherein there are multiple convolutional layers, each convolutional layer of the target image contains multiple convolution kernels, and there are multiple convolutional layer computing modules;
assigning different convolutional layer computing modules to compute in parallel according to the PL; and
obtaining the feature data of the target image from the target image and the multiple convolutional layer computing modules.
4. The method according to claim 3, characterized in that designing a convolutional layer computing module comprises:
computing a two-dimensional fast Fourier transform 2DFFT of the target image to obtain the frequency-domain image data of the target image;
performing a complex multiplication of the frequency-domain image data with the convolutional layer parameters; and
processing the product with a two-dimensional inverse fast Fourier transform 2DIFFT, extracting the real part of the result, and outputting it.
5. The method according to claim 4, characterized in that
computing the 2DFFT of the target image comprises:
for the original image IMG(x, y), taking the data of each row in turn and applying a 1DFFT; and
for the result of the previous step, taking the data of each column in turn and applying a 1DFFT again;
and processing the complex-multiplication product with the 2DIFFT comprises:
for the product IMG_K(x, y), taking the data of each row in turn and applying a 1DIFFT; and
for the result of the previous step, taking the data of each column in turn and applying a 1DIFFT again;
wherein K1, P1, K2 and P2 are parameters after the frequency-domain transform.
6. The method according to claim 1, characterized in that the preprocessing is: performing a 2DFFT computation on the convolutional layer parameters.
7. The method according to claim 3, characterized in that each time a convolutional layer computing module in the PL finishes a computation, the PS is notified through the interrupt system to schedule the next computation.
8. The method according to claim 3, characterized in that the parameter precision of the first convolutional layer is INT32, that of the second convolutional layer is INT16, that of the third convolutional layer is INT8, and that of the fourth and subsequent convolutional layers is INT4.
9. A neural network terminal operating device for target identification, characterized by comprising:
an obtaining module for obtaining a trained deep convolutional neural network model;
a first access module for storing the model parameters of the deep convolutional neural network in an external DDR memory, wherein the model parameters include fully-connected layer parameters and preprocessed convolutional layer parameters;
a second access module for storing the model framework of the deep convolutional neural network in a system-on-chip FPGA, wherein the convolutional layers in the model framework are assigned to the programmable logic module PL, and the pooling layers, fully-connected layers and activation layers in the model framework are assigned to the processing system module PS; and
an identification module for processing a target image with the deep convolutional neural network model to identify the target.
10. A processor, characterized in that the processor is configured to run a program which, when running, executes the neural network terminal operating method for target identification according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811609115.5A CN109740619B (en) | 2018-12-27 | 2018-12-27 | Neural network terminal operation method and device for target recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109740619A (en) | 2019-05-10 |
CN109740619B CN109740619B (en) | 2021-07-13 |
Family ID: 66360068
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811609115.5A Active CN109740619B (en) | 2018-12-27 | 2018-12-27 | Neural network terminal operation method and device for target recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109740619B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106875012A (en) * | 2017-02-09 | 2017-06-20 | 武汉魅瞳科技有限公司 | A kind of streamlined acceleration system of the depth convolutional neural networks based on FPGA |
CN107239829A (en) * | 2016-08-12 | 2017-10-10 | 北京深鉴科技有限公司 | A kind of method of optimized artificial neural network |
CN107341761A (en) * | 2017-07-12 | 2017-11-10 | 成都品果科技有限公司 | A kind of calculating of deep neural network performs method and system |
CN107657316A (en) * | 2016-08-12 | 2018-02-02 | 北京深鉴科技有限公司 | The cooperative system of general processor and neural network processor designs |
CN207458128U (en) * | 2017-09-07 | 2018-06-05 | 哈尔滨理工大学 | A kind of convolutional neural networks accelerator based on FPGA in vision application |
CN108229670A (en) * | 2018-01-05 | 2018-06-29 | 中国科学技术大学苏州研究院 | Deep neural network based on FPGA accelerates platform |
CN108764466A (en) * | 2018-03-07 | 2018-11-06 | 东南大学 | Convolutional neural networks hardware based on field programmable gate array and its accelerated method |
CN108959895A (en) * | 2018-08-16 | 2018-12-07 | 广东工业大学 | A kind of EEG signals EEG personal identification method based on convolutional neural networks |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110806640A (en) * | 2019-10-28 | 2020-02-18 | 西北工业大学 | Photonic integrated visual feature imaging chip |
CN111126309A (en) * | 2019-12-26 | 2020-05-08 | 长沙海格北斗信息技术有限公司 | Convolutional neural network architecture method based on FPGA and face recognition method thereof |
CN111582446A (en) * | 2020-04-28 | 2020-08-25 | 北京达佳互联信息技术有限公司 | System for neural network pruning and neural network pruning processing method |
CN111582446B (en) * | 2020-04-28 | 2022-12-06 | 北京达佳互联信息技术有限公司 | System for neural network pruning and neural network pruning processing method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20180157969A1 (en) | Apparatus and Method for Achieving Accelerator of Sparse Convolutional Neural Network | |
CN106875011B (en) | Hardware architecture of binary weight convolution neural network accelerator and calculation flow thereof | |
CN106844294B (en) | Convolution algorithm chip and communication equipment | |
CN106022468B (en) | the design method of artificial neural network processor integrated circuit and the integrated circuit | |
CN111242289B (en) | Convolutional neural network acceleration system and method with expandable scale | |
US10394929B2 (en) | Adaptive execution engine for convolution computing systems | |
CN109740619A (en) | Neural network terminal operating method and device for target identification | |
CN108665059A (en) | Convolutional neural networks acceleration system based on field programmable gate array | |
CN111144561B (en) | Neural network model determining method and device | |
CN107341544A (en) | A kind of reconfigurable accelerator and its implementation based on divisible array | |
CN111667051A (en) | Neural network accelerator suitable for edge equipment and neural network acceleration calculation method | |
CN108764466A (en) | Convolutional neural networks hardware based on field programmable gate array and its accelerated method | |
CN106951395A (en) | Towards the parallel convolution operations method and device of compression convolutional neural networks | |
CN106203617A (en) | A kind of acceleration processing unit based on convolutional neural networks and array structure | |
WO2019136764A1 (en) | Convolutor and artificial intelligent processing device applied thereto | |
WO2021051987A1 (en) | Method and apparatus for training neural network model | |
CN112163601B (en) | Image classification method, system, computer device and storage medium | |
CN109063824B (en) | Deep three-dimensional convolutional neural network creation method and device, storage medium and processor | |
CN112633490B (en) | Data processing device, method and related product for executing neural network model | |
CN109117187A (en) | Convolutional neural networks accelerated method and relevant device | |
CN107766292A (en) | A kind of Processing with Neural Network method and processing system | |
CN110147252A (en) | A kind of parallel calculating method and device of convolutional neural networks | |
CN110210278A (en) | A kind of video object detection method, device and storage medium | |
CN109583586A (en) | A kind of convolution kernel processing method and processing device | |
CN111210019A (en) | Neural network inference method based on software and hardware cooperative acceleration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |