CN109993279A - Double-layer XNOR binary neural network compression method based on look-up table calculation - Google Patents

Double-layer XNOR binary neural network compression method based on look-up table calculation

Info

Publication number
CN109993279A
Authority
CN
China
Prior art keywords
convolution
double-layer
look-up table
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910178528.0A
Other languages
Chinese (zh)
Other versions
CN109993279B (en)
Inventor
张萌
李建军
李国庆
沈旭照
曹晗翔
刘雪梅
陈子洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN201910178528.0A priority Critical patent/CN109993279B/en
Publication of CN109993279A publication Critical patent/CN109993279A/en
Application granted granted Critical
Publication of CN109993279B publication Critical patent/CN109993279B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a double-layer XNOR binary neural network compression method based on look-up table calculation. The compression method is realized by a double-layer convolution structure, and its algorithm comprises the following steps: first, the input feature map is passed through nonlinear activation, batch normalization and binary activation, and grouped first-layer convolutions with different kernel sizes are performed to obtain the first-layer output; next, a second-layer 1×1 convolution is applied to the first-layer output to obtain the output feature map. In the hardware implementation, the improved double-layer convolution uses a three-input XNOR operation that computes both layers in parallel, replacing the conventional sequential two-layer computation, and all double-layer convolution operations are completed with look-up tables, which improves hardware resource utilization. The compression method provided by the invention is a hardware-algorithm co-design scheme that combines full-precision efficient-network techniques with look-up-table computation; it achieves good structural compression and also reduces logic resource consumption in hardware.

Description

Double-layer XNOR binary neural network compression method based on look-up table calculation
Technical field
The present invention relates to an FPGA design optimization of binary neural networks, and belongs to the technical field of digital image processing.
Background art
With the rapid development of deep learning, convolutional neural networks (CNNs) are being applied ever more widely in digital image processing. From the classic AlexNet to the ResNet residual network, deep convolutional neural networks have entered a period of rapid development, and their performance keeps rising. In practical applications, companies such as Google have achieved remarkable results with CNNs in areas such as autonomous driving. At the same time, CNNs face challenges: their high computational load and high complexity make them difficult to deploy on embedded devices.
Moreover, with the spread of mobile intelligent terminals, it is desirable to run neural network algorithms even on devices that have only low-performance processors. The CNN variant BCNN (binary convolutional neural network) has therefore attracted much attention for lightweight, low-power applications, because it can extract features without performing multiplications. In 2016, Courbariaux et al. of the University of Montreal, in "Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1" (arXiv preprint arXiv:1602.02830, 2016), proposed a novel binarization method for convolutional neural networks. By binarizing the network weights and the activations of every layer, large amounts of storage, computing resources and forward-propagation time are saved; by compensating the convolution weights and the output feature maps with scaling coefficients, the computational complexity can theoretically be reduced by 60% without greatly lowering model accuracy. This shows that binarization can effectively reduce hardware resource consumption and computation cost, increase the processing speed of neural networks, and help realize neural network algorithms on chip. In the same year, XNOR-Net, proposed by Mohammad Rastegari of the University of Washington, turned the multiplications of traditional convolution into XNOR operations, making the hardware implementation of binary neural networks much easier.
However, compared with the classification ability of full-precision convolutional neural networks, the feature extraction ability of binary neural networks is still lacking. Binarizing a neural network is equivalent to a regularization of the full-precision network that further sparsifies it. The features extracted during binarization suffer a considerable loss after binary activation, so how to extract more effective features under binarization has become a key problem for binary neural networks. In the past two years, various dedicated binary algorithms have been proposed, such as the parallel network PC-BNN and ABC-Net, and they achieve good results; however, while the recognition performance of binary algorithms has improved, the cost of hardware implementation has not been greatly reduced, and algorithms have appeared that are simple but unsuitable for hardware. In summary, the algorithms of binary convolutional neural networks have begun to take shape, and further developing binary neural network algorithms that favor hardware implementation is a future direction of binary neural network development.
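As background for the XNOR-based convolution mentioned above, the following minimal Python sketch (an illustration, not part of the patent) shows why the multiplication of ±1-valued activations and weights can be replaced by an XNOR on their {0,1} encodings followed by a popcount.

```python
import numpy as np

def binarize(x):
    """Sign binarization: map real values to +1 / -1."""
    return np.where(x >= 0, 1, -1)

def to_bits(x_pm1):
    """Encode +1 -> 1, -1 -> 0."""
    return (x_pm1 > 0).astype(np.uint8)

rng = np.random.default_rng(0)
a = binarize(rng.standard_normal(64))   # binarized activations
w = binarize(rng.standard_normal(64))   # binarized weights

# Reference: ordinary dot product of the +1/-1 vectors.
dot_ref = int(np.dot(a, w))

# XNOR + popcount on the bit encodings gives the same value:
# every matching bit contributes +1, every mismatch contributes -1.
matches = int((~(to_bits(a) ^ to_bits(w)) & 1).sum())
dot_xnor = 2 * matches - len(a)

assert dot_ref == dot_xnor
print(dot_ref, dot_xnor)
```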
Since neural network algorithms involve an enormous amount of computation, implementing them directly in software on a terminal is extremely difficult, so research on dedicated acceleration hardware for neural networks is a current development trend. Accordingly, various dedicated acceleration architectures for different neural networks have been proposed. The main considerations when designing acceleration hardware are how to run faster and how to save hardware resources. For running faster, experts mainly study the parallelized execution of neural network algorithms, matching the parallel-execution characteristics of the hardware to accelerate the algorithm. For saving hardware resources, the main research direction is the data reuse and functional-unit reuse inherent in neural network algorithms, which can reduce hardware resource overhead.
Regarding the hardware implementation of binarized neural networks, running BCNN operations on existing general-purpose full-precision AI accelerator chips is inefficient and costly, and such high-performance processors cannot be used in embedded systems or other low-power scenarios anyway. Because binary neural network algorithms are developing rapidly and their structures are increasingly varied, FPGA implementations of such networks are becoming more and more common. Tsinghua University, in the article "FP-BNN: Binarized neural network on FPGA" (Neurocomputing, 2018, 275:1072-1086), implemented a general binary neural network accelerator that, on the AlexNet structure, achieves 11.6 times the computing speed of a CPU and 2.75 times the computing capability of a GPU, with the whole model reaching 384 GOP/s/W on the FPGA,
but the power consumption and logic resource consumption of that structure are relatively large. Therefore, in order to run high-accuracy algorithms on low-power embedded devices, a software-hardware co-design method is adopted that combines hardware-friendly algorithm optimization with a dedicated FPGA deployment.
Summary of the invention
Object of the invention: to overcome the deficiencies of the prior art, the present invention provides a double-layer XNOR binary neural network compression method based on look-up table calculation, which reduces network parameters, improves computational efficiency and lowers resource consumption.
Technical solution: a double-layer XNOR binary neural network compression method based on look-up table calculation.
The compression method is realized by a double-layer convolution structure, and the algorithm comprises the following steps:
First, the input feature map is passed through nonlinear activation, batch normalization and binary activation, and grouped first-layer convolutions with different kernel sizes are performed to obtain the first-layer output result;
Then, a second-layer 1×1 convolution is applied to the first-layer output result to obtain the output feature map.
The hardware implementation of the double-layer convolution structure comprises the following steps:
(1) After nonlinear activation, batch normalization and binary activation are implemented in hardware, the XNOR processing of the second-layer convolution is carried out at the same time as the XNOR processing of the first-layer convolution module, so that the double-layer convolution is computed simultaneously;
(2) The output values of the simultaneous double-layer convolution of step (1) are accumulated by pipelined addition using 5:3 adders.
The simultaneous double-layer convolution computation uses a three-input XNOR operation, the three input values being the input feature map value, the first-layer convolution weight and the second-layer convolution weight.
The double-layer convolution consists of grouped convolutions with different kernel sizes connected to a second-layer 1×1 convolution.
The simultaneous double-layer convolution is computed with look-up tables: exploiting the basic multiple-input, single-output characteristic of a look-up table, the three-input XNOR processing basic unit that composes the simultaneous double-layer computation is realized in a single look-up table.
Beneficial effects: the invention proposes a double-layer XNOR binary network based on look-up table calculation, which replaces a traditional convolution kernel with a composite double-layer convolution kernel of stronger feature extraction ability, and uses a three-input XNOR computation to eliminate the non-binary intermediate values inside the double-layer convolution, further reducing the parameter count and computational complexity of the binary neural network. The effectiveness of the algorithm is verified on the CIFAR-10 dataset.
Description of the drawings
Fig. 1 is a schematic diagram of a single double-layer module;
Fig. 2 shows the sign function and the gradient-update function;
Fig. 3 shows the binary neural network algorithm and the improved structure;
Fig. 4 shows the conversion of the three-input XNOR hardware implementation process;
Fig. 5 is the overall hardware implementation framework of the double-layer convolution.
Specific embodiments
The technical solution of the present invention is further described below with reference to the accompanying drawings.
A double-layer XNOR binary neural network compression method based on look-up table calculation: the compression method is realized by a double-layer convolution structure, and the algorithm comprises the following steps. First, the input feature map is passed through nonlinear activation, batch normalization and binary activation, and grouped first-layer convolutions with different kernel sizes are performed to obtain the first-layer output result. Then, a second-layer 1×1 convolution is applied to the first-layer output result to obtain the output feature map, as sketched in the example below.
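As an illustration of these two algorithm steps, the following TensorFlow/Keras sketch builds one double-layer convolution block with grouped first-layer kernels of different sizes followed by a 1×1 second layer. It is only a minimal full-precision sketch: the plain sign activation, the 8-group split with the 4/2/2 distribution of kernel shapes and the channel sizes are assumptions taken from the embodiment for illustration, and the weight binarization and the hardware XNOR fusion described later are not modelled here.

```python
import tensorflow as tf

def double_layer_block(x, out_channels):
    """One double-layer convolution block (illustrative sketch).

    First layer: the input channels are split into 8 groups; 4 groups use
    3x3 kernels, 2 use 1x3 and 2 use 3x1, each producing out_channels/8
    intermediate channels.  Second layer: a 1x1 convolution over the
    concatenated group outputs gives the output feature map.
    """
    mid = out_channels // 8
    kernel_sizes = [(3, 3)] * 4 + [(1, 3)] * 2 + [(3, 1)] * 2

    # PBA-style pre-processing: nonlinear activation, batch normalization,
    # binary activation (a plain sign here, with no gradient correction).
    x = tf.keras.layers.ReLU()(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Lambda(tf.sign)(x)

    groups = tf.keras.layers.Lambda(lambda t: tf.split(t, 8, axis=-1))(x)
    branches = [
        tf.keras.layers.Conv2D(mid, k, padding="same", use_bias=False)(g)
        for g, k in zip(groups, kernel_sizes)
    ]
    first_layer_out = tf.keras.layers.Concatenate(axis=-1)(branches)

    # Second layer: 1x1 convolution producing the output feature map.
    return tf.keras.layers.Conv2D(out_channels, (1, 1), use_bias=False)(first_layer_out)

inputs = tf.keras.Input(shape=(32, 32, 128))
outputs = double_layer_block(inputs, out_channels=128)
tf.keras.Model(inputs, outputs).summary()
```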
Its hardware implementation comprises the following steps:
(1) After nonlinear activation, batch normalization and binary activation are implemented in hardware, the XNOR processing of the second-layer convolution is carried out at the same time as the XNOR processing of the first-layer convolution module, so that the double-layer convolution is computed simultaneously;
(2) The output values of the simultaneous double-layer convolution of step (1) are accumulated by pipelined addition using 5:3 adders.
The simultaneous double-layer convolution computation uses a three-input XNOR operation, the three input values being the input feature map value, the first-layer convolution weight and the second-layer convolution weight.
The double-layer convolution consists of grouped convolutions with different kernel sizes connected to a second-layer 1×1 convolution.
The simultaneous double-layer convolution is computed with look-up tables: exploiting the basic multiple-input, single-output characteristic of a look-up table, the three-input XNOR processing basic unit that composes the simultaneous double-layer computation is realized in a single look-up table.
The invention is further explained below with an example that uses 3×3, 1×3 and 3×1 convolutions. As shown in Fig. 1, the ordinary 3×3 convolution operation on the left is replaced with the double-layer convolution on the right. Suppose the number of input activation channels is n and the number of output activation channels is m; the traditional convolution kernel on the left then has n*m*9 parameters. On the right, pconv3×3 denotes a 3×3 convolution, pconv1×3 a 1×3 convolution and pconv3×1 a 3×1 convolution, with parameter counts of n/8*m/8*9, n/8*m/8*3 and n/8*m/8*3 respectively; the 1×1 convolution has n*m parameters. The total parameter count of the double-layer convolution on the right is n*m*1.75, about 1/5 of the ordinary convolution, which greatly reduces both the number of parameters and the amount of convolution computation. This parameter reduction brings no further precision loss because, for the first time, no binary activation is placed between the two convolution layers; the binarization is realized in hardware instead. In a binary neural network, every binary activation turns the features extracted by the original convolution into new features that retain only part of the effective information, and it also severely disturbs the backward gradient propagation of the network: gradients cannot propagate through this point. As shown in Fig. 2(a), the sign function is 1 for inputs greater than zero and 0 for inputs less than zero; its gradient is infinite at zero and zero everywhere else. A new gradient function, shown in Fig. 2(b), is therefore needed to solve the problem that the gradient cannot be back-propagated. This function resembles a Gaussian distribution; it matches the gradient distribution of the sign function to some extent, further reduces the usual binarization loss and plays a certain correcting role, so that both the training speed and the accuracy of the network improve. Correcting the gradient, however, only solves the gradient-propagation problem and cannot effectively remove the loss during forward propagation. The loss caused by binary activation therefore needs to be reduced as much as possible during feature extraction, lowering the training difficulty and the precision loss of the network. From the above it follows that, when designing a binary neural network algorithm, as many effective features as possible should be extracted with as few layers as possible; the double-layer network of Fig. 1(b) matches exactly this characteristic of binary networks. Experiments on neural networks show that feature extraction along the feature-map (spatial) direction is more important than feature extraction across channels; by increasing the number of first-layer channels and reducing the number of second-layer channels, more features can be extracted with fewer parameters.
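The parameter figures above can be checked with a short calculation. The sketch below is only an illustration: it uses the 4/2/2 distribution of 3×3, 1×3 and 3×1 kernels given in the embodiment that follows, and the channel numbers are arbitrary.

```python
# Parameter count of one ordinary 3x3 convolution vs. the double-layer
# replacement, for n input and m output channels (illustrative numbers).
def param_counts(n, m):
    ordinary = n * m * 9
    # First layer: 4 groups of 3x3 kernels, 2 of 1x3 and 2 of 3x1,
    # each mapping n/8 input channels to m/8 intermediate channels.
    first = (4 * 9 + 2 * 3 + 2 * 3) * (n // 8) * (m // 8)
    # Second layer: a 1x1 convolution over all channels.
    second = n * m
    return ordinary, first + second

n, m = 256, 256
ordinary, double_layer = param_counts(n, m)
print(ordinary, double_layer, double_layer / ordinary)
# 589824 vs. 114688 (= n*m*1.75), a ratio of about 0.194, i.e. roughly 1/5
```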
To verify the algorithm part of the invention, a binary double-layer XNOR convolutional neural network was built in TensorFlow, using 4 parallel 3×3 kernels, 2 1×3 kernels, 2 3×1 kernels and one 1×1 kernel in place of each 3×3 kernel. The comparison uses the convolutional neural network structures shown in Fig. 3. Fig. 3(a) is an ordinary residual neural network containing seven convolution modules. In the first module, a binary-weight convolution is applied after a batch normalization operation, with 128 channels; each of the second to seventh convolution modules contains one 3×3 convolution, with channel numbers of 128, 256, 256 and 512 respectively. Every convolution operation is followed by a PBA layer, composed of a nonlinear activation layer, a batch normalization layer and a binary activation layer, and a max-pooling layer follows the second, fourth and seventh convolution modules. The seventh convolution module is followed by a fully connected layer. Since training and testing use the CIFAR-10 dataset of 32×32 three-channel color images with 10 classes, the output channel number of the last fully connected layer is 10, and a normalized exponential function (softmax) layer completes the classification. Fig. 3(b) is the network improved by the invention on the basis of Fig. 3(a): as shown by the dashed boxes, from the second convolution module onward the original 3×3 kernels are replaced by double-layer XNOR kernels; the number of intermediate feature channels of the double-layer convolution structure can be customized, and a moderate increase enhances the overall performance of the network, while the other parts are kept as in the original network. The network models of Fig. 3(a) and (b) were built in TensorFlow, trained and tested; Table 1 compares the models after 250 training epochs with the same number of layers.
As shown in Table 1, with the same number of layers and after 250 training epochs on CIFAR-10, the test accuracy of ResNet (a 7-layer residual neural network) is 87%, while the improved binary residual network of the invention (PM-ResNet-7) reaches a test accuracy of 86.1%. Table 1 also compares the parameter counts on the CIFAR-10 dataset: the original network has 2.83M parameters, whereas the improved network has only 1.08M, a reduction of 63%. Fewer parameters necessarily mean fewer convolution operations, so while maintaining the test accuracy and full binarization, the computational complexity of the network is greatly reduced and computation time is saved.
Table 1. Comparison of parameters and accuracy of different network models
Data set Model Number of parameters Accuracy rate
CIFAR-10 ResNet-7 2.83M 87%
CIFAR-10 PM-ResNet-7 1.08M 86.1%
In terms of hardware implementation, as shown in Fig. 4(a), the convolution procedure of an ordinary two-layer network first computes the first-layer convolution outputs O11, O12 and O13, and then performs the 1×1 convolution on these values to obtain O1, O2 and O3. This first computation mode is given by formulas (1) to (4) below; the first-layer convolution results have to be summed, so the values obtained are more than one bit wide.
O11 = I11·W111 + I12·W112 + … + I21·W121 + I22·W122 + … + I31·W131 + I32·W132 + … + I39·W139   (1)
O12 = I11·W211 + I12·W212 + … + I21·W221 + I22·W222 + … + I31·W231 + I32·W232 + … + I39·W239   (2)
O13 = I11·W311 + I12·W312 + … + I21·W321 + I22·W322 + … + I31·W331 + I32·W332 + … + I39·W339   (3)
O1 = O11·x11 + O12·x12 + O13·x13   (4)
When the next-step result O1 is computed, although the weights x11, x12 and x13 are single-bit, the first-layer outputs O11, O12 and O13 are multi-bit values, so only addition and subtraction can be used; the advantage that convolution can be carried out with XNOR operations in a hardware binary neural network is lost, and the hardware resource cost of this approach is relatively large. In the improved computation mode shown in Fig. 4(b), the fused XNOR method needs only one XNOR decision over the three input values and removes the first-layer intermediate results. This not only solves the difficulty that the non-single-bit intermediate layer cannot undergo a further XNOR computation, but also eliminates the precision loss caused by binary activation, allowing the binary neural network to extract more features with fewer parameters. From formula (5) it follows that the final result O1 is obtained simply by applying the three-input XNOR to the single bits and then accumulating the resulting 1-bit values. Since a multi-bit adder consumes more resources than a 1-bit adder and introduces more clock delay, it adds complexity to the circuit implementation.
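The following NumPy sketch illustrates the fused computation just described: for ±1 values, O1 = Σ_i Σ_j I_j·W_ij·x_i can be evaluated directly as an accumulation of three-input XNORs of single bits, with no multi-bit intermediate values O11, O12, O13. It is an illustration of the idea rather than a transcription of formula (5); the array shapes and the match-count conversion at the end are assumptions.

```python
import numpy as np

def to_bit(v):
    """Encode +1 -> 1, -1 -> 0."""
    return (v > 0).astype(np.uint8)

def xnor(a, b):
    """Two-input XNOR on {0,1} bits."""
    return (~(a ^ b)) & 1

rng = np.random.default_rng(1)
I = rng.choice([-1, 1], size=9)        # one 3x3 window of binary inputs
W = rng.choice([-1, 1], size=(3, 9))   # first-layer weights: 3 kernels of 9 values
x = rng.choice([-1, 1], size=3)        # second-layer 1x1 weights x11, x12, x13

# Sequential two-layer computation: multi-bit intermediates O11, O12, O13.
O1i = W @ I                    # formulas (1)-(3)
O1_sequential = int(O1i @ x)   # formula (4)

# Fused computation: one three-input XNOR per (i, j) pair (two cascaded
# XNORs, which reduces to the parity of the three bits), followed by a
# 1-bit accumulation converted back to a signed count.
xnor3 = xnor(xnor(to_bit(I)[None, :], to_bit(W)), to_bit(x)[:, None])
matches = int(xnor3.sum())
O1_fused = 2 * matches - xnor3.size    # each match +1, each mismatch -1

assert O1_sequential == O1_fused
print(O1_sequential, O1_fused)
```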
At the same time, the three-input single-bit XNOR network has additional advantages over the ordinary computation mode when it is computed on look-up tables. Taking FPGAs as an example, the programmable logic resources of an FPGA consist mainly of two parts: combinational circuits realized by look-up tables (LUTs) and sequential circuits realized by registers. What consumes the most resources in a neural network hardware implementation is the convolution computation. Convolution consists mainly of multiplications, additions and subtractions; in a binary neural network it consists mainly of additions/subtractions and XNOR operations, but it still consumes a large amount of combinational logic, so optimizing the combinational logic is a key point. By the nature of a LUT, each LUT can realize a different logic function, but its input/output pattern is fixed; in the FPGA devices currently mainstream for neural network hardware, the LUTs mostly have 4 or 6 inputs. In the 6-input mode a LUT can be configured in three ways: one 6-input/1-output function, two 3-input/1-output functions, or one 5-input/2-output function. By the XNOR principle, the XNOR of two single bits gives a single-bit output, and the XNOR of three single bits is still a single-bit output. Meanwhile, limited by the LUT output width, three 2-input XNOR operations cannot be realized in one LUT, so in general the number of LUTs required for two 2-input XNOR operations is the same as for two 3-input XNOR operations; moreover, in the first (sequential) mode a second convolution still has to be performed after the ordinary XNOR convolution, so its hardware consumes more logic resources than computing the double-layer convolution in one pass as in the second mode. The second mode disclosed by the invention can therefore save considerable hardware resources.
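To make the LUT argument concrete, the short sketch below (purely illustrative; the exact LUT configuration depends on the FPGA family) enumerates the truth table of the three-input XNOR basic unit. It has 3 single-bit inputs and 1 single-bit output, i.e. 8 entries, so it fits in a single 4- or 6-input LUT.

```python
from itertools import product

def xnor(a, b):
    """Two-input XNOR on {0,1} bits."""
    return 1 - (a ^ b)

# Truth table of the three-input XNOR basic unit (two cascaded XNORs):
# 3 single-bit inputs -> 1 single-bit output, only 8 entries in total,
# so a single 4- or 6-input LUT can hold the whole function.
table = {}
for i_bit, w1_bit, w2_bit in product((0, 1), repeat=3):
    table[(i_bit, w1_bit, w2_bit)] = xnor(xnor(i_bit, w1_bit), w2_bit)

for bits, out in table.items():
    print(bits, "->", out)
print("entries:", len(table))   # 8 <= 2**6
```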
The overall hardware implementation of the double-layer network is shown in Fig. 5. This example computes with 8 parallel modules: the convolution module with p input channels and feature maps of size m*n is divided into 8 modules, each with an input feature map of size p/8*m*n. For the 3×3 convolution module, the top-left corner of the input feature map, a block of size p/8*3*3, is first convolved with 4 different 3×3 convolution kernels, each a weight matrix of size p/8*3*3. As can be seen from the figure, the number of output values of this first part equals the number of weights, because the XNOR operation does not reduce the number of output values; the resulting matrix has size p/8*4*9, shown as the middle matrix of Fig. 5. The intermediate result then undergoes the next 1×1 convolution: 128 4*1 matrices slide over the intermediate matrix with XNOR operations, giving an output matrix with 128 channels and size 128*p/8*4*9. The resulting matrix is summed along the non-channel directions, each matrix summing p/8*4*9 values, i.e. no accumulation is needed along the channel direction. The adder for this matrix is compressed with 5:3 addition: first, thanks to the 5-input/2-output configuration of a LUT, summing five 1-bit values consumes only 2 LUTs; second, in the pipelined 5:3 addition, the more values are added in parallel, the higher the resource utilization, so performing parallel addition on the values produced by the double-layer XNOR greatly improves resource utilization. The 1×3 and 3×1 convolution modules are processed in the same way, and the final parallel computation sums the 128*1 matrix values from the 8 parallel modules to obtain the final 128 output feature map values.
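The 5:3 accumulation can be pictured with the behavioural sketch below. It is only an illustration under the assumption that a 5:3 adder here means a counter that compresses five 1-bit inputs into a 3-bit count; the actual mapping onto 5-input/2-output LUTs and the pipelining are not modelled.

```python
def compressor_5_to_3(bits):
    """Behavioural 5:3 counter: five 1-bit inputs -> 3-bit count (0..5)."""
    assert len(bits) == 5 and all(b in (0, 1) for b in bits)
    s = sum(bits)
    return (s & 1, (s >> 1) & 1, (s >> 2) & 1)

def accumulate_1bit(values):
    """Pipeline-style accumulation of many 1-bit XNOR outputs.

    Groups of five bits are compressed first (cheap in LUTs, since a
    5-input/2-output LUT can sum five bits in about 2 LUTs); the 3-bit
    partial counts are then added in a conventional adder tree.
    """
    total = 0
    remainder = len(values) % 5
    for i in range(0, len(values) - remainder, 5):
        b0, b1, b2 = compressor_5_to_3(values[i:i + 5])
        total += b0 + 2 * b1 + 4 * b2
    total += sum(values[len(values) - remainder:])   # leftover bits
    return total

bits = [1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1]   # e.g. 12 XNOR match bits
assert accumulate_1bit(bits) == sum(bits)
print(accumulate_1bit(bits))
```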
It should be pointed out that those skilled in the art can make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention. Components not described in detail in this embodiment can be realized with the available prior art.

Claims (5)

1. A double-layer XNOR binary neural network compression method based on look-up table calculation, characterized in that:
the compression method is realized by a double-layer convolution structure, and the algorithm comprises the following steps:
first, passing the input feature map through nonlinear activation, batch normalization and binary activation, and performing grouped first-layer convolutions with different kernel sizes to obtain a first-layer output result;
then, applying a second-layer 1×1 convolution to the first-layer output result to obtain the output feature map.
2. The double-layer XNOR binary neural network compression method based on look-up table calculation according to claim 1, characterized in that the hardware implementation of the double-layer convolution structure comprises the following steps:
(1) after nonlinear activation, batch normalization and binary activation are implemented in hardware, carrying out the XNOR processing of the second-layer convolution at the same time as the XNOR processing of the first-layer convolution module, so that the double-layer convolution is computed simultaneously;
(2) performing pipelined addition on the output values of the simultaneous double-layer convolution of step (1) using 5:3 adders.
3. The double-layer XNOR binary neural network compression method based on look-up table calculation according to claim 1, characterized in that the simultaneous double-layer convolution computation uses a three-input XNOR operation, the three input values being the input feature map value, the first-layer convolution weight and the second-layer convolution weight.
4. The double-layer XNOR binary neural network compression method based on look-up table calculation according to claim 1, characterized in that the double-layer convolution consists of grouped convolutions with different kernel sizes connected to a second-layer 1×1 convolution.
5. The double-layer XNOR binary neural network compression method based on look-up table calculation according to claim 1, characterized in that the simultaneous double-layer convolution is computed with look-up tables, and, exploiting the basic multiple-input, single-output characteristic of a look-up table, the three-input XNOR processing basic unit that composes the simultaneous double-layer computation is realized in a single look-up table.
CN201910178528.0A 2019-03-11 2019-03-11 Double-layer XNOR binary neural network compression method based on look-up table calculation Active CN109993279B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910178528.0A CN109993279B (en) 2019-03-11 2019-03-11 Double-layer XNOR binary neural network compression method based on look-up table calculation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910178528.0A CN109993279B (en) 2019-03-11 2019-03-11 Double-layer XNOR binary neural network compression method based on look-up table calculation

Publications (2)

Publication Number Publication Date
CN109993279A true CN109993279A (en) 2019-07-09
CN109993279B CN109993279B (en) 2023-08-04

Family

ID=67130485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910178528.0A Active CN109993279B (en) 2019-03-11 2019-03-11 Double-layer XNOR binary neural network compression method based on look-up table calculation

Country Status (1)

Country Link
CN (1) CN109993279B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111445012A (en) * 2020-04-28 2020-07-24 南京大学 FPGA-based packet convolution hardware accelerator and method thereof
CN111832718A (en) * 2020-06-24 2020-10-27 上海西井信息科技有限公司 Chip architecture
US20210150313A1 (en) * 2019-11-15 2021-05-20 Samsung Electronics Co., Ltd. Electronic device and method for inference binary and ternary neural networks
CN112906886A (en) * 2021-02-08 2021-06-04 合肥工业大学 Result-multiplexing reconfigurable BNN hardware accelerator and image processing method
CN113408713A (en) * 2021-08-18 2021-09-17 成都时识科技有限公司 Method for eliminating data copy, neural network processor and electronic product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148078A1 (en) * 2014-11-20 2016-05-26 Adobe Systems Incorporated Convolutional Neural Network Using a Binarized Convolution Layer
CN106355244A (en) * 2016-08-30 2017-01-25 深圳市诺比邻科技有限公司 CNN (convolutional neural network) construction method and system
US20180247180A1 (en) * 2015-08-21 2018-08-30 Institute Of Automation, Chinese Academy Of Sciences Deep convolutional neural network acceleration and compression method based on parameter quantification

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148078A1 (en) * 2014-11-20 2016-05-26 Adobe Systems Incorporated Convolutional Neural Network Using a Binarized Convolution Layer
US20180247180A1 (en) * 2015-08-21 2018-08-30 Institute Of Automation, Chinese Academy Of Sciences Deep convolutional neural network acceleration and compression method based on parameter quantification
CN106355244A (en) * 2016-08-30 2017-01-25 深圳市诺比邻科技有限公司 CNN (convolutional neural network) construction method and system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210150313A1 (en) * 2019-11-15 2021-05-20 Samsung Electronics Co., Ltd. Electronic device and method for inference binary and ternary neural networks
CN111445012A (en) * 2020-04-28 2020-07-24 南京大学 FPGA-based packet convolution hardware accelerator and method thereof
CN111832718A (en) * 2020-06-24 2020-10-27 上海西井信息科技有限公司 Chip architecture
CN112906886A (en) * 2021-02-08 2021-06-04 合肥工业大学 Result-multiplexing reconfigurable BNN hardware accelerator and image processing method
CN113408713A (en) * 2021-08-18 2021-09-17 成都时识科技有限公司 Method for eliminating data copy, neural network processor and electronic product
CN113408713B (en) * 2021-08-18 2021-11-16 成都时识科技有限公司 Method for eliminating data copy, neural network processor and electronic product

Also Published As

Publication number Publication date
CN109993279B (en) 2023-08-04

Similar Documents

Publication Publication Date Title
CN109993279A (en) A kind of double-deck same or binary neural network compression method calculated based on look-up table
Guo et al. FBNA: A fully binarized neural network accelerator
CN111459877B (en) Winograd YOLOv2 target detection model method based on FPGA acceleration
US20190087713A1 (en) Compression of sparse deep convolutional network weights
CN103176767B (en) The implementation method of the floating number multiply-accumulate unit that a kind of low-power consumption height is handled up
CN110991631A (en) Neural network acceleration system based on FPGA
CN108108809A (en) A kind of hardware structure and its method of work that acceleration is made inferences for convolutional Neural metanetwork
CN107092960A (en) A kind of improved parallel channel convolutional neural networks training method
CN109948784A (en) A kind of convolutional neural networks accelerator circuit based on fast filtering algorithm
CN110383300A (en) A kind of computing device and method
CN110163359A (en) A kind of computing device and method
CN109284824A (en) A kind of device for being used to accelerate the operation of convolution sum pond based on Reconfiguration Technologies
Li et al. AlphaGo policy network: A DCNN accelerator on FPGA
CN113283587A (en) Winograd convolution operation acceleration method and acceleration module
Duan et al. Energy-efficient architecture for FPGA-based deep convolutional neural networks with binary weights
Li et al. An efficient CNN accelerator using inter-frame data reuse of videos on FPGAs
Zhuang et al. Vlsi architecture design for adder convolution neural network accelerator
Jiang et al. Hardware implementation of depthwise separable convolution neural network
Zhan et al. Field programmable gate array‐based all‐layer accelerator with quantization neural networks for sustainable cyber‐physical systems
Tsai et al. A CNN accelerator on FPGA using binary weight networks
Liu et al. Tcp-net: Minimizing operation counts of binarized neural network inference
Yang et al. Data-aware adaptive pruning model compression algorithm based on a group attention mechanism and reinforcement learning
Paul et al. Hardware-software co-design approach for deep learning inference
Kang et al. Design of convolution operation accelerator based on FPGA
CN110163793B (en) Convolution calculation acceleration method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant