CN107862381A - An FIR filter implementation suitable for multiple convolution modes - Google Patents

An FIR filter implementation suitable for multiple convolution modes

Info

Publication number
CN107862381A
Authority
CN
China
Prior art keywords
convolution
hardware
fir filter
parallel
patterns
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711101343.7A
Other languages
Chinese (zh)
Inventor
王中风
袁炅
林军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201711101343.7A
Publication of CN107862381A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03H IMPEDANCE NETWORKS, e.g. RESONANT CIRCUITS; RESONATORS
    • H03H17/00 Networks using digital techniques
    • H03H17/02 Frequency selective networks
    • H03H17/06 Non-recursive filters
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03H IMPEDANCE NETWORKS, e.g. RESONANT CIRCUITS; RESONATORS
    • H03H17/00 Networks using digital techniques
    • H03H2017/0072 Theoretical filter design
    • H03H2017/0081 Theoretical filter design of FIR filters

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses an FIR filter applicable to multiple convolution modes, together with its hardware implementation. The structure supports the mainstream convolution operations of current convolutional neural networks, such as 3*3 and 5*5 convolution with stride 1 and 3*3 convolution with stride 2. It uses a 6-parallel fast FIR algorithm to reduce hardware cost and computational complexity and to improve data throughput. The invention derives a hardware structure for 3-parallel convolution with stride 2 and merges it with the 6-parallel fast FIR filter hardware structure without adding adders or multipliers, so that the structure makes the greatest possible use of its hardware resources in every mode. Through different configurations of this single hardware structure, the invention can perform most of the current mainstream convolutional neural network computations, which improves hardware utilization, provides high generality, and simplifies the hardware implementation of convolutional neural networks.

Description

An FIR filter implementation suitable for multiple convolution modes
Technical field
The invention relates to the fields of computing and electronics, and in particular to the hardware implementation of convolutional neural networks in deep learning: a general architecture and hardware implementation compatible with both stride-1 and stride-2 convolution.
Background art
Convolutional neural networks (CNNs) have become one of the most popular and most widely used deep learning algorithms because of their outstanding performance in fields such as image and audio processing. With the rapid development of CNNs in recent years, large convolution kernels are used less and less in models. The 3*3 and 5*5 convolutions are now the most widely used across models, and a growing number of models also use stride-2 convolution. For stride-2 convolution, however, there has been no good hardware optimization scheme. Conventional stride-1 convolution can use the fast FIR algorithm to increase parallelism and reduce multiplier resources.
In the time domain, an N-tap FIR filter can be expressed as:
$$y(n) = \sum_{i=0}^{N-1} h(i)\,x(n-i)$$
or, equivalently, in the z-domain as:
$$Y(z) = H(z)\,X(z) = \sum_{n=0}^{N-1} h(n)\,z^{-n}\,X(z)$$
If an FIR filter coefficient sequence {h(n)} of length N is used as the coefficients of an N-point discrete convolution, the FIR filter implements an N-point one-dimensional convolution. Combining N such filters implements the N*N convolution used in convolutional neural networks. The fast FIR algorithm achieves high parallelism and lowers complexity by adding adders in order to reduce the number of multipliers. This approach, however, is poorly suited to stride-2 convolution: computing the stride-1 result and then selecting outputs to obtain the stride-2 convolution wastes hardware badly, because in each cycle about 50% of the hardware resources have no effect on the result. There is therefore a need for a general hardware architecture that supports both conventional stride-1 convolution and stride-2 convolution with low complexity, high parallelism, and high hardware resource utilization.
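The relationship between an N-tap FIR filter and an N*N convolution can be sketched in a few lines of Python. This is an illustrative software model only, not the patented hardware. The function names `fir` and `conv2d_rows` are hypothetical, and the sketch assumes the CNN convention in which "convolution" is a sliding dot product without kernel flipping, which is why each kernel row is reversed before it is used as a tap sequence.

```python
import numpy as np

def fir(x, h):
    """Direct-form FIR: y[n] = sum_i h[i] * x[n - i], with x[n] = 0 for n < 0."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        for i, hi in enumerate(h):
            if n - i >= 0:
                y[n] += hi * x[n - i]
    return y

def conv2d_rows(image, kernel):
    """N*N CNN-style convolution (stride 1, no padding) built from N row-wise FIR filters."""
    n = kernel.shape[0]
    rows, cols = image.shape
    out = np.zeros((rows - n + 1, cols - n + 1))
    for r in range(rows - n + 1):
        for k in range(n):
            # FIR filter k runs over image row r + k with kernel row k as taps.
            # The taps are reversed so the FIR sum matches the sliding dot product;
            # the first n - 1 outputs are start-up samples and are dropped.
            out[r] += fir(image[r + k], kernel[k][::-1])[n - 1:]
    return out
```

Each output row is the sum of N one-dimensional filter outputs, which is the sense in which N filters combine into an N*N convolution.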
Summary of the invention
In view of the above problems, the invention proposes a convolution computation architecture, built on the fast FIR algorithm framework, that is compatible with both stride-1 and stride-2 convolution, together with its hardware implementation. The invention implements three computation modes on a single hardware architecture: one 6-tap 6-parallel convolution, three independent 3-tap 3-parallel convolutions, and two independent 3-tap 3-parallel convolutions with stride 2. The invention offers high generality: by configuring the hardware architecture differently, most of the current mainstream convolution operations can be implemented. The details of the invention are as follows:
An FIR filter applicable to multiple convolution modes, whose hardware architecture comprises:
1) A data input selection unit, which rearranges the input data according to the convolution mode and feeds it to the corresponding convolution computation module.
2) A convolution computation unit, whose basic building block is the 3-tap 3-parallel fast FIR filter, with data selectors inserted to steer the data flow for the different convolution operations.
3) A post-convolution computation unit, which processes the outputs of the convolution computation unit so that the independent building blocks inside it can be cascaded into a single fast FIR filter with higher parallelism and more taps.
4) A data output selection unit, which selects the computation result corresponding to the convolution mode as the module output.
The second computation mode of the invention consists of three independent 3-tap 3-parallel fast FIR hardware structures, the 3-tap 3-parallel fast FIR hardware structure being the most basic building block. The relation between each output Y and the inputs X can be derived as:
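$$Y_0 = H_0 X_0 - z^{-3} H_2 X_2 + z^{-3}\left[(H_1+H_2)(X_1+X_2) - H_1 X_1\right]$$
$$Y_1 = \left[(H_0+H_1)(X_0+X_1) - H_1 X_1\right] - \left[H_0 X_0 - z^{-3} H_2 X_2\right]$$
$$Y_2 = \left[(H_0+H_1+H_2)(X_0+X_1+X_2)\right] - \left[(H_0+H_1)(X_0+X_1) - H_1 X_1\right] - \left[(H_1+H_2)(X_1+X_2) - H_1 X_1\right]$$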
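The three relations for the 3-tap 3-parallel fast FIR structure can be checked numerically with a small software model. The sketch below is a minimal illustration, not the circuit of Fig. 2. It assumes that X0, X1, X2 are the three samples of the current input block, that H0, H1, H2 are the three scalar taps, and that z^-3 is a delay of one block. The function name `fast_fir_3parallel` is hypothetical.

```python
import numpy as np

def fast_fir_3parallel(x, h):
    """3-tap, 3-parallel fast FIR: 6 multiplications per block of 3 outputs."""
    h0, h1, h2 = h
    # Coefficient sums are constants and can be precomputed once.
    h01, h12, h012 = h0 + h1, h1 + h2, h0 + h1 + h2
    n_blocks = len(x) // 3          # samples beyond a whole block are ignored
    y = np.zeros(3 * n_blocks)
    prev_m2 = 0.0                   # H2*X2 of the previous block (z^-3)
    prev_b = 0.0                    # (H1+H2)(X1+X2) - H1*X1 of the previous block (z^-3)
    for k in range(n_blocks):
        x0, x1, x2 = x[3 * k:3 * k + 3]
        m0 = h0 * x0                # the six products shared by all three outputs
        m1 = h1 * x1
        m2 = h2 * x2
        m01 = h01 * (x0 + x1)
        m12 = h12 * (x1 + x2)
        m012 = h012 * (x0 + x1 + x2)
        a = m01 - m1                # (H0+H1)(X0+X1) - H1*X1
        b = m12 - m1                # (H1+H2)(X1+X2) - H1*X1
        c = m0 - prev_m2            # H0*X0 - z^-3 H2*X2
        y[3 * k] = c + prev_b       # Y0
        y[3 * k + 1] = a - c        # Y1
        y[3 * k + 2] = m012 - a - b # Y2
        prev_m2, prev_b = m2, b
    return y
```

A direct 3-parallel implementation would need 9 multiplications per block, so the model illustrates the multiplier saving that the fast FIR algorithm buys with extra additions.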
For the 3-tap 3-parallel convolution with stride 2, the relation between each output Y and the inputs X can be derived as:
$$Y_0 = H_0 X_0 + z^{-6}\left(H_2 X_4 + H_1 X_5\right)$$
$$Y_1 = H_2 X_0 + H_1 X_1 + H_0 X_2$$
$$Y_2 = H_2 X_2 + H_1 X_3 + H_0 X_4$$
By multiplexing multipliers and adders and adding multiplexers, the stride-2 convolution is merged with the fast FIR algorithm, yielding a convolution hardware architecture with high hardware resource utilization, high parallelism, and high generality.
Brief description of the drawings
Fig. 1 is the hardware structure diagram of the 3-parallel convolution with stride 2.
Fig. 2 is the hardware structure diagram of the 3-tap 3-parallel fast FIR filter.
Fig. 3 is the overall architecture diagram of the invention.
Fig. 4 is the hardware structure diagram of the FIR filter of the invention suitable for multiple convolution modes.
Embodiment
The specific implementation of the invention is further described below with reference to the drawings. In implementation, stride-2 convolution differs considerably from stride-1 convolution, because each input window is offset from the previous one by 2 samples. When the fast FIR algorithm is used for parallel computation, this discontinuity of the data makes half of the computed results useless. If the fast FIR algorithm is not used, the relation between inputs and outputs is:
$$Y_0 = H_0 X_0 + z^{-6}\left(H_2 X_4 + H_1 X_5\right)$$
$$Y_1 = H_2 X_0 + H_1 X_1 + H_0 X_2$$
$$Y_2 = H_2 X_2 + H_1 X_3 + H_0 X_4$$
From these relations, a 3-parallel structure for stride-2 convolution can be obtained. Its hardware structure is shown in Fig. 1.
This hardware structure uses 9 multipliers and 6 adders in total and computes 3 outputs for every 6 input samples. Compared with the fast FIR algorithm at the same degree of parallelism, it achieves high utilization of hardware resources.
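The stride-2 relations can be modeled in software to confirm both the arithmetic and the operation count. The following is a minimal sketch, not the circuit of Fig. 1. It assumes that X0 to X5 are the six samples of the current input block and that z^-6 is a delay of one block. The function name `stride2_3parallel` is hypothetical.

```python
import numpy as np

def stride2_3parallel(x, h):
    """3-tap convolution with stride 2: 6 inputs in, 3 outputs out per block."""
    h0, h1, h2 = h
    n_blocks = len(x) // 6          # samples beyond a whole block are ignored
    y = np.zeros(3 * n_blocks)
    prev_tail = 0.0                 # H2*X4 + H1*X5 of the previous block (z^-6)
    for k in range(n_blocks):
        x0, x1, x2, x3, x4, x5 = x[6 * k:6 * k + 6]
        # Nine multiplications and six additions per block in total.
        tail = h2 * x4 + h1 * x5                 # feeds Y0 of the next block
        y[3 * k] = h0 * x0 + prev_tail           # Y0
        y[3 * k + 1] = h2 * x0 + h1 * x1 + h0 * x2   # Y1
        y[3 * k + 2] = h2 * x2 + h1 * x3 + h0 * x4   # Y2
        prev_tail = tail
    return y
```

Every product contributes to an output that is kept, whereas computing the stride-1 result and discarding every other sample would spend about half of the arithmetic on outputs that are thrown away.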
The hardware structure of a fast FIR filter is shown in Fig. 2. By adding adders in order to reduce the number of multipliers, this structure lowers the hardware implementation complexity while achieving high parallelism. For conventional stride-1 convolution, it computes efficiently.
By merging the two hardware structures above, the invention efficiently supports both stride-1 and stride-2 convolution, combining high parallelism with low complexity, high hardware utilization, and strong generality. Fig. 3 shows the hardware architecture of the invention. As shown in the figure, the architecture comprises four modules:
1) A data input selection unit, which rearranges the input data according to the convolution mode and feeds it to the corresponding convolution computation module.
2) A convolution computation unit, whose basic building block is the 3-tap 3-parallel fast FIR filter, with data selectors inserted to steer the data flow for the different convolution operations.
3) A post-convolution computation unit, which processes the outputs of the convolution computation unit so that the independent building blocks inside it can be cascaded into a single fast FIR filter with higher parallelism and more taps.
4) A data output selection unit, which selects the computation result corresponding to the convolution mode as the module output.
The specific hardware circuit is shown in Fig. 4. By multiplexing multipliers and adders and adding multiplexers, the invention merges the stride-2 3-parallel convolution hardware structure with three 3-parallel fast FIR filters, so that both the stride-2 and the stride-1 convolution modes are supported without adding a single multiplier or adder. The three 3-parallel fast FIR filters are cascaded to form a 6-parallel fast FIR filter. The invention offers three selectable operating modes. The first is three independent 3-tap 3-parallel fast FIR filters, which implement 3*3 convolution when the tap coefficients H are set accordingly. The second is two independent 3-tap convolutions with stride 2. The third is a single 6-parallel fast FIR filter, which implements 5*5 convolution when its last tap coefficient is set to 0. Only the input data X, the tap coefficients H, and the operating mode M need to be supplied to the hardware circuit. The circuit automatically distributes X and H to the submodules according to the mode, performs the computation, and uses multiplexers to select the data matching the mode as the output.
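The behavior of the three operating modes, as seen from outside the circuit, can be summarized with a short behavioral model. The sketch below only reproduces the input/output relationship of each mode with `numpy.convolve`; it does not model the shared multipliers, adders, or multiplexers of Fig. 4. The function names and the mode strings are hypothetical stand-ins for the mode input M.

```python
import numpy as np

def causal_fir(x, h, stride=1):
    """Reference FIR output y[n] = sum_i h[i] * x[n - i], keeping every stride-th sample."""
    return np.convolve(x, h)[:len(x)][::stride]

def run_mode(mode, x, h):
    """Behavioral model of the mode input M (mode names are illustrative assumptions)."""
    if mode == "three_3tap":
        # Three independent 3-tap filters, e.g. the three rows of a 3*3 kernel.
        return [causal_fir(xi, hi) for xi, hi in zip(x, h)]
    if mode == "two_3tap_stride2":
        # Two independent 3-tap filters with stride 2.
        return [causal_fir(xi, hi, stride=2) for xi, hi in zip(x, h)]
    if mode == "one_6tap":
        # One 6-tap filter; a 5-tap kernel row is padded so the last tap is 0.
        h6 = np.zeros(6)
        h6[:len(h)] = h
        return causal_fir(x, h6)
    raise ValueError(f"unknown mode: {mode}")
```

Padding a 5-tap coefficient row with a zero sixth tap leaves the filter output unchanged, which is why the 6-tap mode can serve one row of a 5*5 kernel.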
The invention is of course not limited to convolutional neural network computation. It also applies to many scenarios in fields such as digital image processing, digital signal processing, and wireless communication, and the same approach of multiplexing multipliers and adders and inserting multiplexers can be used to improve other filters so that they support more convolution modes. Those of ordinary skill in the art may make improvements and refinements without departing from the principle of the invention, and these shall also be regarded as falling within the scope of protection of the invention. Any component not specified in this embodiment can be implemented with existing technology.

Claims (1)

1. An FIR filter applicable to multiple convolution modes, whose hardware architecture comprises:
1) A data input selection unit, which rearranges the input data according to the convolution mode and feeds it to the corresponding convolution computation module.
2) A convolution computation unit, whose basic building block is the 3-tap 3-parallel fast FIR filter, with data selectors inserted to steer the data flow for the different convolution operations.
3) A post-convolution computation unit, which processes the outputs of the convolution computation unit so that the independent building blocks inside it can be cascaded into a single fast FIR filter with higher parallelism and more taps.
4) A data output selection unit, which selects the computation result corresponding to the convolution mode as the module output.
CN201711101343.7A 2017-11-06 2017-11-06 An FIR filter implementation suitable for multiple convolution modes Pending CN107862381A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201711101343.7A | 2017-11-06 | 2017-11-06 | An FIR filter implementation suitable for multiple convolution modes

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201711101343.7A | 2017-11-06 | 2017-11-06 | An FIR filter implementation suitable for multiple convolution modes

Publications (1)

Publication Number | Publication Date
CN107862381A | 2018-03-30

Family

ID=61700161

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201711101343.7A | Pending | CN107862381A (en) | 2017-11-06 | 2017-11-06 | An FIR filter implementation suitable for multiple convolution modes

Country Status (1)

Country | Link
CN (1) | CN107862381A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109104197A (en) * 2018-11-12 2018-12-28 合肥工业大学 The coding and decoding circuit and its coding and decoding method of non-reduced sparse data applied to convolutional neural networks
WO2021046709A1 (en) * 2019-09-10 2021-03-18 深圳市南方硅谷半导体有限公司 Fir filter optimization method and device, and apparatus

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040103133A1 (en) * 2002-11-27 2004-05-27 Spectrum Signal Processing Inc. Decimating filter
CN106845635A (en) * 2017-01-24 2017-06-13 东南大学 CNN convolution kernel hardware design methods based on cascade form
CN106936406A (en) * 2017-03-10 2017-07-07 南京大学 A kind of realization of 5 parallel rapid finite impact response filter

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JICHEN WANG et al.: "Efficient Convolution Architectures for Convolutional Neural Network", 2016 8th International Conference on Wireless Communications & Signal Processing (WCSP) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109104197A (en) * 2018-11-12 2018-12-28 合肥工业大学 The coding and decoding circuit and its coding and decoding method of non-reduced sparse data applied to convolutional neural networks
CN109104197B (en) * 2018-11-12 2022-02-11 合肥工业大学 Coding and decoding circuit and coding and decoding method for non-reduction sparse data applied to convolutional neural network
WO2021046709A1 (en) * 2019-09-10 2021-03-18 深圳市南方硅谷半导体有限公司 Fir filter optimization method and device, and apparatus

Similar Documents

Publication Publication Date Title
Cheng et al. High-speed VLSI implementation of 2-D discrete wavelet transform
Cooklev An efficient architecture for orthogonal wavelet transforms
WO2016201216A1 (en) Sparse cascaded-integrator-comb filters
CN105183425A (en) Fixed-bit-width multiplier with high accuracy and low complexity properties
CN107862381A (en) A kind of FIR filter suitable for a variety of convolution patterns is realized
CN111510110A (en) Interpolation matched filtering method and filter for parallel processing
WO2008034027A2 (en) Processor architecture for programmable digital filters in a multi-standard integrated circuit
US10050607B2 (en) Polyphase decimation FIR filters and methods
Kumar et al. Exploiting coefficient symmetry in conventional polyphase FIR filters
CN113556101B (en) IIR filter and data processing method thereof
Ahmed et al. High-accuracy stochastic computing-based fir filter design
Srivastava et al. An efficient fir filter based on hardware sharing architecture using csd coefficient grouping for wireless application
CN107864017B (en) A kind of method for correcting phase and device
CN106505971A (en) A kind of low complex degree FIR filter structure of the row that rearranged based on structured adder order
CN108429546A (en) A kind of mixed type FIR filter design method
WO2014127663A1 (en) Interpolation filtering method and interpolation filter
CN105429610B (en) A kind of FIR filter optimization method based on SPT coefficients
CN106505973A (en) A kind of FIR filter of N taps
US6449630B1 (en) Multiple function processing core for communication signals
CN108270416A (en) A kind of high-order interpolation wave filter and method
Reddy et al. Shift add approach based implementation of RNS-FIR filter using modified product encoder
CN105048997A (en) Matched filer multiplexing apparatus and method, and digital communication receiver
CN112422102B (en) Digital filter capable of saving multiplier and implementation method thereof
TW201616810A (en) Finite impulse response filter and filtering method
CN204733137U (en) Matched filter multiplexer and digital communication receiver

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180330

WD01 Invention patent application deemed withdrawn after publication