CN103699355A - Variable-order pipeline serial multiply-accumulator - Google Patents
- Publication number
- CN103699355A
- Authority
- CN
- China
- Prior art keywords
- multiply-accumulate
- group
- order
- signal
- pipeline
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Complex Calculations (AREA)
Abstract
The invention relates to a variable-order pipeline serial multiply-accumulator comprising a group of multipliers, three groups of accumulators and a corresponding control circuit. The group of multipliers multiplies two channels of input data and outputs the products. The first group of accumulators accumulates the products; once that accumulation ends, the other two groups successively add the results held in the pipeline stages of the first group, so that the first group can go on to process the next stage of data. The corresponding control circuit adds extra control signals and control logic so that the head-and-tail zero-padding step of the algorithm can be omitted. In application, the variable-order pipeline serial multiply-accumulator avoids both the head-and-tail zero padding and the redundant multiply-accumulate operations it causes, and thus achieves performance close to the theoretical estimate.
Description
Technical field
The present invention relates to multiply-accumulators, and in particular to a variable-order pipeline serial multiply-accumulator.
Background
Digital signal processing has long been widely used across engineering fields as an important technical tool. In recent years, with the advance of science and technology, it has also become one of the theoretical foundations of emerging disciplines such as artificial intelligence, and both its importance and its range of application are considerable.
The main algorithms used in digital signal processing mostly perform filtering, convolution, correlation and spectral analysis on data. These algorithms share a similar architecture, namely the multiply-accumulate structure. Taking an N-order FIR filter as an example, its function expression may be defined as

$$y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k)$$

where h(k) are the filter coefficients, x(n) is the input sequence and y(n) is the filter output.
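As a minimal sketch of the multiply-accumulate structure that these algorithms share, the following Python function evaluates one output of an N-tap FIR filter, y[n] = sum over k of h[k] * x[n - k], with a single running sum. The names `fir_output`, `h` and `x` are illustrative, and treating samples outside the input as zero is an assumption made for this sketch, not something the patent specifies.

```python
def fir_output(h, x, n):
    """Return y[n] = sum over k of h[k] * x[n - k] for an N-tap FIR filter.

    h: list of N filter coefficients (illustrative name).
    x: list of input samples; samples outside the list are treated as zero
       (an assumption made for this sketch).
    """
    acc = 0.0
    for k, coeff in enumerate(h):
        idx = n - k
        if 0 <= idx < len(x):
            acc += coeff * x[idx]  # one multiply followed by one accumulate
    return acc


# Feeding a unit impulse should return the coefficients themselves.
h = [1.0, 2.0, 3.0]
x = [1.0, 0.0, 0.0, 0.0]
impulse_response = [fir_output(h, x, n) for n in range(4)]
```

Each loop iteration corresponds to one pass through a multiplier and an accumulator, which is why a purely serial implementation needs about N cycles per output.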
There are usually several ways to implement an FIR filter, for example the single-multiplier serial filter, the serial filter based on symmetric FIR coefficients, the parallel filter and the semi-parallel filter.
A general serial filter uses the multiply-accumulate structure and can be implemented with only one multiplier and one adder; its average pipeline interval is N cycles, as shown in Fig. 1. A parallel filter needs N multipliers and N-1 adders; from the entry of the first data item to the first filter output it needs N+logN cycles, after which the pipeline interval is only 1 clock cycle, as shown in Fig. 2. A semi-parallel filter combines the parallel architecture with the multiply-accumulate architecture, trading off area against speed, so its area and performance fall between those of the other two.
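The resource and timing figures quoted above can be tabulated with a short Python sketch. The function names and dictionary keys are illustrative. The source writes the parallel latency only as N+logN, so taking the logarithm to base 2 and rounding it up is an assumption here.

```python
import math


def serial_cost(n):
    """Resources and timing of the serial multiply-accumulate filter."""
    return {"multipliers": 1, "adders": 1, "cycles_per_output": n}


def parallel_cost(n):
    """Resources and timing of the fully parallel filter.

    The first-output latency follows the N + log N figure in the text;
    base 2 with rounding up is an assumption of this sketch.
    """
    return {
        "multipliers": n,
        "adders": n - 1,
        "first_output_latency": n + math.ceil(math.log2(n)),
        "cycles_per_output": 1,
    }


serial_16 = serial_cost(16)
parallel_16 = parallel_cost(16)
```

For N = 16 this gives one multiplier and 16 cycles per output for the serial form, against 16 multipliers, 15 adders and one cycle per output for the parallel form, which is the area and speed gap the semi-parallel filter sits inside.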
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art by providing a variable-order pipeline serial multiply-accumulator, realized by the following technical scheme:
The variable-order pipeline serial multiply-accumulator comprises:
A group of multipliers, for multiplying two channels of input data and outputting the multiplication results;
Three groups of adders (accumulators), wherein the first group accumulates the multiplication results, and the second and third groups, after the accumulation ends, successively add the results held in the pipeline stages of the first group, thereby ensuring that the first group can continue to process the next stage of data;
A corresponding control circuit, for adding extra control signals and control logic so as to omit the head-and-tail zero-padding operation of the algorithm.
In a further design of the variable-order pipeline serial multiply-accumulator, each adder is connected to a corresponding data selector.
In a further design of the variable-order pipeline serial multiply-accumulator, the corresponding control circuit comprises:
An order register, for recording the multiply-accumulate order and outputting it;
A counter module, which receives the multiply-accumulate order and the start signal of the multiply-accumulate operation, counts up to the multiply-accumulate order, and passes on the start signal;
Four controllers, three of which control the gating of the data selectors, while the remaining one controls the output of the multiply-accumulator;
A control logic unit, which receives, via the controllers, the registered signals from the counter module and from the order register, and outputs the multiply-accumulate result and an output-valid signal.
In a further design of the variable-order pipeline serial multiply-accumulator, the data selectors are connected to the counter module through their corresponding controllers, so that the counter module drives the gating of the data selectors.
In a further design of the variable-order pipeline serial multiply-accumulator, the output-valid signal serves as the write-enable input signal for storing the multiply-accumulate result.
Advantage of the present invention is as follows:
The variable-order multiply-accumulator provided by the invention makes it possible to omit, during computation, the head-and-tail zero-padding operation of the algorithm and the redundant multiply-accumulate operations that padding would generate, thereby achieving performance close to the theoretical estimate.
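To give a feel for the size of this saving, the sketch below counts multiply-accumulate operations for a full convolution of m samples with n coefficients. Reading "head-and-tail zero padding" as the boundary outputs of such a convolution is an assumption of this example, and the function names are illustrative. With a fixed order, every output costs n operations and the missing samples are padded with zeros; with a variable order, each output costs only as many operations as it has real products.

```python
def products_per_output(m, n):
    """Real (non-padded) products for each output of a full convolution
    of m input samples with n coefficients."""
    return [min(j, m - 1) - max(0, j - n + 1) + 1 for j in range(m + n - 1)]


def macs_fixed_order(m, n):
    """Every output uses n multiply-accumulates, zero-padded where needed."""
    return (m + n - 1) * n


def macs_variable_order(m, n):
    """Each output uses only as many multiply-accumulates as it has real products."""
    return sum(products_per_output(m, n))


orders = products_per_output(8, 4)
fixed_total = macs_fixed_order(8, 4)
variable_total = macs_variable_order(8, 4)
```

In this example the per-output order ramps up at the head and down at the tail, and the variable-order total equals m * n, the number of products that actually contribute to the result.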
Brief description of the drawings
Fig. 1 is the basic multiply-accumulate structure.
Fig. 2 is the parallel FIR filter architecture.
Fig. 3 is a schematic diagram of the pipeline serial multiply-accumulator.
Fig. 4 is the design architecture diagram of the variable-order pipeline serial multiply-accumulator.
Fig. 5 is the interconnection diagram of the basic modules used in system simulation.
Embodiment
The scheme of the invention is described in detail below with reference to the accompanying drawings.
The variable-order pipeline serial multiply-accumulator provided by this embodiment comprises a group of multipliers, three groups of adders (accumulators) and a corresponding control circuit. The group of multipliers multiplies two channels of input data and outputs the multiplication results. Of the three groups of adders, the first group accumulates the multiplication results, and the second and third groups, after the accumulation ends, successively add the results held in the pipeline stages of the first group, thereby ensuring that the first group can continue to process the next stage of data. The corresponding control circuit adds extra control signals and control logic so as to omit the head-and-tail zero-padding operation of the algorithm. Each adder is connected to a corresponding data selector.
The corresponding control circuit comprises an order register, a counter module, a control logic unit and four controllers. The order register records the multiply-accumulate order and outputs it. The counter module receives the multiply-accumulate order and the start signal of the multiply-accumulate operation, counts up to the multiply-accumulate order, and passes on the start signal. Of the four controllers, three control the gating of the data selectors, and the remaining one controls the output of the multiply-accumulator. The control logic unit receives, via the controllers, the registered signals from the counter module and from the order register, and outputs the multiply-accumulate result and an output-valid signal. The data selectors are connected to the counter module through their corresponding controllers, so that the counter module drives the gating of the data selectors.
The design architecture of the variable-order pipeline serial multiply-accumulator is shown in Fig. 4. Assume that each arithmetic IP has three internal pipeline stages and that its output is registered by one more stage (for timing closure), which is equivalent to a four-stage pipeline. As shown in the figure, start is the start signal of the multiply-accumulate operation, order_mul_accu is the signal that records the order, din0 and din1 are the two data inputs, dout is the multiply-accumulate result, and wen is the output-valid signal, which also serves as the write-enable input of the memory. Internally the module mainly comprises one multiplier, three adders with their respective muxes, four control blocks, a counter module and registered signals. Controllers 1 to 3 control the mux gating of the three adders respectively, and controller 4 is responsible for the output. All input signals of the control logic come from the registered versions of "count" (the counter) and "order_mul_accu".
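One way to read this structure: because the accumulating adder is itself pipelined, a product entering it can only be added to a running sum that left the adder several cycles earlier, so several interleaved partial sums build up in the pipeline, and the second and third adders then fold them into a single result. The Python sketch below models that behaviour functionally rather than cycle by cycle. The function name, the pairing order used by the second adder and the choice of exactly four partial sums are illustrative assumptions, not details taken from the patent.

```python
STAGES = 4  # assumed: three internal pipeline stages plus one output register


def pipelined_mac(din0, din1):
    """Functional model of a serial MAC built around a 4-stage pipelined adder.

    din0, din1: equal-length sequences; their length plays the role of the order.
    """
    if len(din0) != len(din1):
        raise ValueError("din0 and din1 must have the same length")

    # Adder 1: product i joins the partial sum occupying pipeline slot
    # i % STAGES, so up to STAGES independent partial sums are formed.
    partial = [0] * STAGES
    for i, (a, b) in enumerate(zip(din0, din1)):
        partial[i % STAGES] += a * b

    # Adder 2 (assumed pairing): combine the partial sums two at a time.
    pair_a = partial[0] + partial[1]
    pair_b = partial[2] + partial[3]

    # Adder 3: produce the final multiply-accumulate result (dout).
    return pair_a + pair_b


dout_order6 = pipelined_mac([1, 2, 3, 4, 5, 6], [1, 1, 1, 1, 1, 1])
dout_order3 = pipelined_mac([1, 2, 3], [4, 5, 6])
```

In this model the order is simply the input length, so an order shorter than the pipeline depth leaves some partial sums at zero and needs no zero-padded inputs.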
To verify the function and performance of the multiply-accumulate module, other submodules are needed to form a complete system, as shown in Fig. 5. The system mainly comprises the multiply-accumulate module, an AGU (address generator), memory, a top-level module and a testbench.
Multiply-accumulate module
This module instantiates one single-precision floating-point multiplier IP and three adder IPs (each with three internal pipeline stages). The design work mainly consists of RTL coding of the four control blocks, the counter, the output control and similar logic, together with shift-registering the input signal order_mul_accu and the counter.
Memory module
First a submodule is defined: a register file written in Verilog, with a data width of 64 bits and a depth of 8K. Three banks of this submodule are then instantiated in the memory top level, used to store the two channels of source data and the multiply-accumulate results respectively.
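A behavioural stand-in for this memory can be sketched in Python as follows. The class and method names are illustrative, and reading "8K" as 8192 words is an assumption. The patent's implementation is a Verilog register file, which this sketch does not reproduce.

```python
class RegisterBank:
    """Behavioural model of one memory bank: 64-bit words, 8K entries (assumed 8192)."""

    WIDTH = 64
    DEPTH = 8 * 1024

    def __init__(self):
        self.words = [0] * self.DEPTH

    def _check(self, addr):
        if not 0 <= addr < self.DEPTH:
            raise IndexError("address out of range")

    def write(self, addr, value):
        self._check(addr)
        # Keep only the low 64 bits, as a 64-bit register would.
        self.words[addr] = value & ((1 << self.WIDTH) - 1)

    def read(self, addr):
        self._check(addr)
        return self.words[addr]


# Three banks: two for source data, one for multiply-accumulate results.
memory = {name: RegisterBank() for name in ("bank1", "bank2", "bank3")}
memory["bank3"].write(0, (1 << 64) + 5)
stored = memory["bank3"].read(0)
```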
AGU module
Designed for the specific algorithm, the AGU generates the required vector address sequences. It mainly comprises a read module and a write module. The read module designs the address sequences for the source data according to the algorithm's requirements and sends them, together with chip-select and read-enable signals, to bank1 and bank2 to fetch data; the values read, rdata1 and rdata2, are passed to the multiply-accumulate module for computation. The write module uses the "wen" output of the multiply-accumulator to control address incrementing, and sends this address together with the multiply-accumulate module's wen and wdata signals to bank3 for data storage.
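The two halves of the AGU can be illustrated with a small Python sketch. The patent leaves the read address pattern to the algorithm, so the simple linear pattern below, the base-address parameters and all names are assumptions made for illustration only.

```python
def read_addresses(base1, base2, order):
    """Yield (bank1_addr, bank2_addr) pairs for one multiply-accumulate
    of `order` terms, assuming both operands are stored contiguously."""
    for k in range(order):
        yield base1 + k, base2 + k


class WriteAddressCounter:
    """Write-side address generator: the address advances only when wen is asserted."""

    def __init__(self, base=0):
        self.addr = base

    def step(self, wen):
        """Return the bank3 address to write this cycle, or None if wen is low."""
        if not wen:
            return None
        current = self.addr
        self.addr += 1
        return current


read_seq = list(read_addresses(0, 16, 3))
counter = WriteAddressCounter()
write_seq = [counter.step(wen) for wen in (False, True, False, True)]
```

Tying the write address to wen means results are stored back to back in bank3 regardless of how many cycles each multiply-accumulate takes.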
Top-level module
Interconnects the multiply-accumulate module, the memory module and the AGU module at the top level.
Testbench module
Defines the clock, the reset signal and the algorithm parameters, loads the data files into bank1 and bank2 of the memory for initialization, sends the start signal to launch the computation, and ends the simulation after receiving the finish signal from the top-level module.
System testing
The tests run on Linux, using the VCS simulation and verification tool and the Design Compiler synthesis tool.
(1) Basic module test: the multiply-accumulator submodule is built from the multiply and add units, and the timing of each piece of control logic is tested; the AGU module is tested to determine whether the address sequences it generates are as expected.
(2) Integration test: the test source operands are loaded into the designated memory, and the testbench generates the start signal to launch the algorithm while passing in the configuration parameters. When the algorithm finishes, the top level returns the finish signal. The results are obtained by inspecting the corresponding memory and are compared with MATLAB results.
(3) DC synthesis: the code is checked further; logic synthesis is performed with TSMC's 40 nm technology library to check whether the multiply-accumulator module has slack.
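The comparison against reference results in step (2) can be sketched as below. The patent does not say how the comparison is made, so the tolerance-based check, the tolerance values and the function name are all assumptions. A tolerance is used because floating-point hardware and a software reference may sum the products in different orders.

```python
import math


def mismatches(hw_results, reference, rel_tol=1e-6, abs_tol=1e-12):
    """Return the indices at which simulated results differ from the reference.

    rel_tol and abs_tol are illustrative values, not taken from the patent.
    """
    if len(hw_results) != len(reference):
        raise ValueError("result vectors must have the same length")
    return [
        i
        for i, (hw, ref) in enumerate(zip(hw_results, reference))
        if not math.isclose(hw, ref, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


bad = mismatches([1.0, 2.0, 3.0], [1.0, 2.0000001, 3.1])
```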
In summary, the variable-order multiply-accumulator provided by this embodiment makes it possible to omit, during computation, the head-and-tail zero-padding operation of the algorithm and the redundant multiply-accumulate operations that padding would generate, thereby achieving performance close to the theoretical estimate.
Claims (5)
1. A variable-order pipeline serial multiply-accumulator, characterized by comprising:
A group of multipliers, for multiplying two channels of input data and outputting the multiplication results;
Three groups of adders (accumulators), wherein the first group accumulates the multiplication results, and the second and third groups, after the accumulation ends, successively add the results held in the pipeline stages of the first group, thereby ensuring that the first group can continue to process the next stage of data;
A corresponding control circuit, for adding extra control signals and control logic so as to omit the head-and-tail zero-padding operation of the algorithm.
2. The variable-order pipeline serial multiply-accumulator according to claim 1, characterized in that each adder is connected to a corresponding data selector.
3. The variable-order pipeline serial multiply-accumulator according to claim 2, characterized in that the corresponding control circuit comprises:
An order register, for recording the multiply-accumulate order and outputting it;
A counter module, which receives the multiply-accumulate order and the start signal of the multiply-accumulate operation, counts up to the multiply-accumulate order, and passes on the start signal;
Four controllers, three of which control the gating of the data selectors, while the remaining one controls the output of the multiply-accumulator;
A control logic unit, which receives, via the controllers, the registered signals from the counter module and from the order register, and outputs the multiply-accumulate result and an output-valid signal.
4. The variable-order pipeline serial multiply-accumulator according to claim 3, characterized in that the data selectors are connected to the counter module through their corresponding controllers, so that the counter module drives the gating of the data selectors.
5. The variable-order pipeline serial multiply-accumulator according to claim 4, characterized in that the output-valid signal serves as the write-enable input signal for storing the multiply-accumulate result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310738598.XA CN103699355B (en) | 2013-12-30 | 2013-12-30 | Variable-order pipeline serial multiply-accumulator |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310738598.XA CN103699355B (en) | 2013-12-30 | 2013-12-30 | Variable-order pipeline serial multiply-accumulator |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103699355A (en) | 2014-04-02 |
CN103699355B (en) | 2017-02-08 |
Family
ID=50360896
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310738598.XA Active CN103699355B (en) | 2013-12-30 | 2013-12-30 | Variable-order pipeline serial multiply-accumulator |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103699355B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104504205A (en) * | 2014-12-29 | 2015-04-08 | 南京大学 | Parallelizing two-dimensional division method of symmetrical FIR (Finite Impulse Response) algorithm and hardware structure of parallelizing two-dimensional division method |
CN106325812A (en) * | 2015-06-15 | 2017-01-11 | 华为技术有限公司 | Processing method and device for multiplication and accumulation operation |
CN109976707A (en) * | 2019-03-21 | 2019-07-05 | 西南交通大学 | A kind of variable bit width multiplier automatic generating method |
CN117555515A (en) * | 2024-01-11 | 2024-02-13 | 成都市晶蓉微电子有限公司 | Digital ASIC serial-parallel combined multiplier for balancing performance and area |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4507725A (en) * | 1982-07-01 | 1985-03-26 | Rca Corporation | Digital filter overflow sensor |
US5923273A (en) * | 1996-11-18 | 1999-07-13 | Crystal Semiconductor Corporation | Reduced power FIR filter |
CN1963745A (en) * | 2006-12-01 | 2007-05-16 | 浙江大学 | High speed split multiply accumulator apparatus |
CN101834615A (en) * | 2009-03-12 | 2010-09-15 | 普然通讯技术(上海)有限公司 | Implementation method of Reed-Solomon encoder |
CN101916177A (en) * | 2010-07-26 | 2010-12-15 | 清华大学 | Configurable multi-precision fixed point multiplying and adding device |
CN102053186A (en) * | 2009-11-10 | 2011-05-11 | 北京普源精电科技有限公司 | Digital oscilloscope with variable-order digital filter |
CN102629189A (en) * | 2012-03-15 | 2012-08-08 | 湖南大学 | Water floating point multiply-accumulate method based on FPGA |
Non-Patent Citations (7)
Title |
---|
刘艳萍: "《EDA技术及应用教程》", 31 August 2012, 北京航空航天大学出版社 * |
徐远泽等: "FIR滤波器的FPGA实现方法", 《现代电子技术》 * |
王堃: "基于多核的并行程序设计及优化", 《中国优秀硕士学位论文全文数据库 信息科学辑》 * |
西瑞克斯(北京)通信设备有限公司: "《无线通信的MATLAB和FPGA实现》", 30 June 2009, 人民邮电出版社 * |
陆光华等: "《数字信号处理》", 31 October 2005, 西安电子科技大学出版社 * |
黄晓林: "NOC多核处理器FPGA开发板的设计与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
黄炎等: "NCS算法的并行化设计实现", 《计算机工程与设计》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104504205A (en) * | 2014-12-29 | 2015-04-08 | 南京大学 | Parallelizing two-dimensional division method of symmetrical FIR (Finite Impulse Response) algorithm and hardware structure of parallelizing two-dimensional division method |
CN104504205B (en) * | 2014-12-29 | 2017-09-15 | 南京大学 | A kind of two-dimentional dividing method of the parallelization of symmetrical FIR algorithm and its hardware configuration |
CN106325812A (en) * | 2015-06-15 | 2017-01-11 | 华为技术有限公司 | Processing method and device for multiplication and accumulation operation |
CN106325812B (en) * | 2015-06-15 | 2019-03-08 | 华为技术有限公司 | It is a kind of for the processing method and processing device for multiplying accumulating operation |
CN109976707A (en) * | 2019-03-21 | 2019-07-05 | 西南交通大学 | A kind of variable bit width multiplier automatic generating method |
CN117555515A (en) * | 2024-01-11 | 2024-02-13 | 成都市晶蓉微电子有限公司 | Digital ASIC serial-parallel combined multiplier for balancing performance and area |
CN117555515B (en) * | 2024-01-11 | 2024-04-02 | 成都市晶蓉微电子有限公司 | Digital ASIC serial-parallel combined multiplier for balancing performance and area |
Also Published As
Publication number | Publication date |
---|---|
CN103699355B (en) | 2017-02-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |