CN1564125A - Array type reconstructural DSP engine chip structure based on CORDIC unit - Google Patents

Array type reconstructural DSP engine chip structure based on CORDIC unit

Info

Publication number
CN1564125A
Authority
CN
China
Prior art keywords
interconnect bus
totalizer
restructural
reconfigurable processing
reconfigurable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 200410013670
Other languages
Chinese (zh)
Inventor
杨宇
毛志刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN 200410013670
Publication of CN1564125A
Legal status: Pending

Landscapes

  • Multi Processors (AREA)

Abstract

An interconnect bus, comprising a longitudinal interconnect bus and a transverse interconnect bus joined through a reconfigurable switch network, is provided between reconfigurable processing units arranged in an array. Vertically adjacent reconfigurable processing units are connected longitudinally through basic-unit data lines, which connect to the transverse interconnect bus through the reconfigurable switch network. In transversely adjacent reconfigurable processing units, the adders, shifters, accumulators, and registers at the same pipeline stage are connected through interconnect lines containing control switches. A reconfigurable processing unit based on the CORDIC algorithm is itself highly reconfigurable, can implement a wide range of DSP algorithms, and has a simple, regular structure that is easy to modularize. It is therefore well suited to serve as the core unit of a reconfigurable chip.

Description

An array-type reconfigurable DSP engine chip structure based on CORDIC units
Technical field:
The present invention relates to the internal structure of a reconfigurable (hardware-programmable) array chip built from coarse-grained basic units whose core is the CORDIC algorithm. The structure is intended mainly for the DSP field. By configuring the reconfigurable hardware resources in the chip, the core steps of most DSP algorithms can be executed efficiently, so the chip can serve as an acceleration engine in a DSP system.
Background technology:
CORDIC (COordinate Rotation DIgital Computer), also called the coordinate rotation digital computation method, is an iterative method for computing generalized vector rotations. By setting a small number of parameters in a CORDIC unit, it can realize many elementary functions and operations with simple shift-and-add iterations, for example: trigonometric functions, inverse trigonometric functions, hyperbolic functions, inverse hyperbolic functions, logarithms, exponentials, root extraction, multiplication, and division. These properties show that the CORDIC algorithm itself has good reconfigurability (hardware programmability). Many of these functions and operations are hard to implement by other methods, yet they occur frequently in DSP algorithms. Existing reconfigurable devices fall into two broad classes: fine-grained and coarse-grained. Our survey of existing coarse-grained array chips found that their data word width is always fixed, so each chip suits only the class of applications that share that word width. We believe that if a chip's data word width were itself reconfigurable, the chip's adaptability would increase greatly: fewer chip resources would be wasted, the resulting functional modules would have higher performance and lower power consumption, and comparatively little configuration data would be needed, which favors dynamic applications.
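For reference, the shift-and-add iteration mentioned above is commonly written in the unified form below. This is the textbook formulation rather than notation taken from this patent; the symbols m (coordinate mode), \sigma_i (rotation direction, +1 or -1), s_i (shift amount of stage i), and e_i (elementary angle) are assumptions introduced here for illustration.

```latex
\begin{aligned}
x_{i+1} &= x_i - m\,\sigma_i\,2^{-s_i}\,y_i,\\
y_{i+1} &= y_i + \sigma_i\,2^{-s_i}\,x_i,\\
z_{i+1} &= z_i - \sigma_i\,e_i,
\end{aligned}
\qquad
e_i =
\begin{cases}
\arctan 2^{-s_i}, & m = 1 \ \text{(circular)},\\
2^{-s_i}, & m = 0 \ \text{(linear)},\\
\operatorname{artanh} 2^{-s_i}, & m = -1 \ \text{(hyperbolic)}.
\end{cases}
```

Each stage needs only two shifters, three adder/subtractors, and a stored constant, which is why a small set of parameters (m, the shift sequence, and the rule that chooses \sigma_i) is enough to switch one datapath among many functions.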
Summary of the invention:
To solve the problem that the data word width of existing coarse-grained DSP array chips is fixed, the invention provides an array-type reconfigurable DSP chip that can change its data word width by reconfiguring the basic arithmetic components of adjacent units. The technical scheme is as follows: in an array-type reconfigurable DSP engine chip structure based on CORDIC units, an interconnect bus 2 is provided between a number of reconfigurable processing units 1 arranged in an array. The longitudinal interconnect bus 2-1 and the transverse interconnect bus 2-2 are connected to each other through a reconfigurable switch network 3. Adjacent reconfigurable processing units 1 in the same column are connected longitudinally through basic-unit data lines 4, and the basic-unit data lines 4 are connected to the transverse interconnect bus 2-2 through the reconfigurable switch network 3. Each reconfigurable processing unit 1 is a multi-stage pipeline structure implementing the CORDIC algorithm. At the same pipeline stage of transversely adjacent reconfigurable processing units 1, each adder and the adder at the corresponding position, each shifter and the shifter at the corresponding position, each accumulator and the accumulator at the corresponding position, and each register and the register at the corresponding position are respectively connected through an interconnect line 7 containing a control switch 5.
To make adjacent units reconfigurable, the DSP engine chip of the invention establishes reconfigurable paths mainly between pairs of arithmetic components (shifters, adders, accumulators, and registers) located at the same pipeline stage in adjacent units. When such a path is closed, the two shifters, adders, accumulators, or registers form a functional module whose data word width is twice the original. In other words, the shifters, adders, accumulators, and registers inside a unit are given a reconfiguration function, so that the corresponding components at the same pipeline stage of two transversely adjacent units can be joined by configuration into a functional unit of twice the original word width. This reconfiguration function makes it possible to build a 16/24/32-bit CORDIC unit from 4/9/16 8-bit CORDIC units that are adjacent in the layout. Because the data word width of the shifters, adders, registers, and accumulators can all be changed by reconfiguration, the generality and adaptability of the chip are greatly increased: fewer chip resources are wasted, and the resulting functional modules have high performance and low power consumption, which favors dynamic applications. A reconfigurable processing unit 1 based on the CORDIC algorithm is itself highly reconfigurable, can efficiently implement a wide range of DSP algorithms, and has a simple, regular structure that is easy to modularize, making it well suited to serve as the core unit of a reconfigurable chip. The invention is novel in design, reliable in operation, and has considerable value for wider adoption.
Description of drawings:
Fig. 1 is a structural schematic of the present invention; Fig. 2 is a structural schematic of the reconfigurable switch network 3 in Embodiment 2; Fig. 3 is a structural schematic of the reconfigurable processing unit 1 in Embodiment 1; Fig. 4 is a schematic of shifter reconfiguration in Embodiment 1; Fig. 5 is a schematic of adder reconfiguration; Fig. 6 is a schematic of the connections of the first pipeline stage in adjacent reconfigurable processing units 1.
Embodiment:
Embodiment 1: This embodiment is described with reference to Fig. 1 and Figs. 3 to 6. An interconnect bus 2 is provided between a number of reconfigurable processing units 1 arranged in an array. The longitudinal interconnect bus 2-1 and the transverse interconnect bus 2-2 are connected to each other through a reconfigurable switch network 3. Adjacent reconfigurable processing units 1 in the same column are connected longitudinally through basic-unit data lines 4, and the basic-unit data lines 4 are connected to the transverse interconnect bus 2-2 through the reconfigurable switch network 3. Each reconfigurable processing unit 1 is a multi-stage pipeline structure implementing the CORDIC algorithm. At the same pipeline stage of transversely adjacent reconfigurable processing units 1, each adder and the adder at the corresponding position, each shifter and the shifter at the corresponding position, each accumulator and the accumulator at the corresponding position, and each register and the register at the corresponding position are respectively connected through an interconnect line 7 containing a control switch 5.
As shown in Fig. 4, shifter 1-1-1 and shifter 1-1-2 belong to two adjacent reconfigurable processing units 1, and each is a shifter that shifts right by three bits. When a longer data word is required, an instruction closes the control switch 5 so that the interconnect line 7 becomes a conducting path, and shifter 1-1-1 and shifter 1-1-2 together form a single shifter whose data word width is twice the original. The adder can be reconfigured in the same way to lengthen the data word. As shown in Fig. 5, carry-lookahead adder 1-2-1 and carry-lookahead adder 1-2-2 occupy the same position in two adjacent reconfigurable processing units 1; closing the control switch 5 yields a carry-lookahead adder whose data word width is twice the original, which realizes the recombination function. The accumulator and the register can be reconfigured in the same way. The control switch 5 is implemented with a field-effect transistor. Fig. 6 shows the connections between the shifters, adders, registers, and accumulators at the same position in the first pipeline stage of two transversely adjacent reconfigurable processing units 1. Adders 1-2-1 and 1-2-2, adders 1-3-1 and 1-3-2, adders 1-4-1 and 1-4-2, registers 1-5-1 and 1-5-2, registers 1-6-1 and 1-6-2, registers 1-7-1 and 1-7-2, shifters 1-8-1 and 1-8-2, shifters 1-9-1 and 1-9-2, and flag registers 1-10-1 and 1-10-2 are each connected by an interconnect line 7 containing a switch 5.
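The width doubling described above can be pictured with a small behavioral model. The sketch below is only an illustration under stated assumptions: the function names, the `link_closed` flag (standing in for control switch 5), and the use of unsigned values with a logical right shift are all hypothetical and are not taken from the patent, which describes transistor-level links between carry-lookahead adders and shifters.

```python
WIDTH = 8                      # word width of one slice (assumed 8-bit units)
MASK = (1 << WIDTH) - 1

def add8(a, b, carry_in=0):
    """One 8-bit adder slice: returns (sum, carry_out)."""
    total = (a & MASK) + (b & MASK) + carry_in
    return total & MASK, total >> WIDTH

def add16_from_slices(a, b, link_closed=True):
    """Two 8-bit slices; a closed link passes the low slice's carry to the high slice."""
    lo, carry = add8(a & MASK, b & MASK)
    hi, _ = add8(a >> WIDTH, b >> WIDTH, carry if link_closed else 0)
    return (hi << WIDTH) | lo

def shr8(value, n, fill_bits=0):
    """One 8-bit right-shifter slice: returns (shifted value, bits that fell off the low end).
    fill_bits are the bits shifted in at the top of the slice."""
    value &= MASK
    spilled = value & ((1 << n) - 1)
    shifted = (value >> n) | ((fill_bits << (WIDTH - n)) & MASK)
    return shifted, spilled

def shr16_from_slices(value, n, link_closed=True):
    """Two 8-bit shifter slices; a closed link feeds the high slice's spilled bits into the low slice."""
    hi, spilled = shr8(value >> WIDTH, n)
    lo, _ = shr8(value & MASK, n, spilled if link_closed else 0)
    return (hi << WIDTH) | lo
```

With the link closed, `add16_from_slices(0x12FF, 0x0001)` behaves like a single 16-bit adder; with the link open, the two halves add independently and the carry out of the low byte is dropped.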
As shown in Figs. 3 and 6, chip operation is divided into two main phases: a reconfiguration phase and a working phase. In the reconfiguration phase, the reconfigurable parts of the structure described above, namely the preset storage cells represented by circles in the figures, are loaded with preset configuration data. The chip's function is then fixed: it is equivalent to a hardware circuit that performs only one specific function, and it can load data and begin processing at any time. Once the working phase begins, one set of data enters the processing engine from the top of a unit on every clock tick and moves downward stage by stage in pipelined fashion. On reaching the position where the algorithm ends, the data are routed through the interconnect bus to a chip port.
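The two phases can be pictured with a small software model. The sketch below is illustrative only: `run_pipeline` and the example stage functions are hypothetical names, the list of stage functions stands in for the loaded configuration data, and the model assumes one register per stage with `None` marking an empty register.

```python
def run_pipeline(stage_fns, inputs):
    """Clock a register pipeline; stage_fns plays the role of the loaded configuration."""
    n = len(stage_fns)
    regs = [None] * n                      # one register per stage, empty at start
    outputs = []
    # Feed one input per clock tick, then n empty ticks to flush the pipeline.
    for item in list(inputs) + [None] * n:
        if regs[-1] is not None:
            outputs.append(regs[-1])       # a result leaves the last stage
        # Update from the bottom up so each stage reads its predecessor's old value.
        for i in range(n - 1, 0, -1):
            regs[i] = None if regs[i - 1] is None else stage_fns[i](regs[i - 1])
        regs[0] = None if item is None else stage_fns[0](item)
    return outputs

# "Reconfiguration phase": choose what each stage does.
stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
# "Working phase": stream data through, one item per tick.
results = run_pipeline(stages, [1, 2, 3])
```

After the pipeline fills, one result emerges per tick, so throughput does not depend on the number of stages; only the latency does.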
The internal operation of an 8-bit CORDIC unit is illustrated below, taking the computation of the sine and cosine of an angle α as an example. The unit inlet has three data inputs (X_0, Y_0, Z_0), with initial values (1, 0, α). The shift sequence of the first ten pipeline stages is (0, 0, 1, 2, 3, 4, 5, 6, 7, 8), and the shift sequence of the three modulus-correction stages is (2, 5, 8). On entering the first pipeline stage, X_0 is split into two paths: one goes directly to the first adder as the augend, and the other, after a right shift by 0, goes to the second adder as the addend. Y_0 is split into three paths on entering the first stage: the first goes directly to the second adder as the augend, the second, after a right shift by 0, goes to the first adder as the addend, and the third goes to the sign decision module. Z_0 is split into two paths on entering the first stage: one goes to an adder to be summed with ±arctan(2^-0), and the other enters the sign decision module. The sign decision module generates a sign bit according to the sign of Z_i and sends it to the three adders, controlling whether each adds or subtracts. The results of the three adders are stored in this stage's registers when the next clock arrives, for use by the second pipeline stage. The stages up to the tenth operate in the same way; the only difference between stages is the number of bits shifted (as given by the shift sequence). Stages eleven through thirteen perform the modulus correction. In this operation X is split into two paths: one goes directly to the first adder as the augend, and the other, after a right shift, goes to the same adder as the addend. Y is handled in the same way as X. Z undergoes no operation and is simply passed through three stages of registers. The shift sequence of the three modulus-correction stages is (2, 5, 8), and whether each stage's adder adds or subtracts is controlled by a preset storage bit. The data in the registers of the thirteenth stage are the result of the computation: X_13 = cos α, Y_13 = sin α, Z_13 = 0. The sine and cosine values can be output through the interconnect bus or passed to other functional modules.
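The shift-and-add rotation described above can be tried out in software. The following is a minimal fixed-point sketch of textbook rotation-mode CORDIC, not a model of the patented datapath: it assumes the standard shift sequence (0, 1, 2, ...) and folds the gain compensation into the initial X value, whereas the text above uses the sequence (0, 0, 1, ..., 8) followed by three shift-add modulus-correction stages. The names `FRAC_BITS`, `to_fixed`, and `cordic_sin_cos` are hypothetical.

```python
import math

FRAC_BITS = 16                 # assumed fixed-point format: 16 fractional bits
ONE = 1 << FRAC_BITS

def to_fixed(v):
    """Convert a float to the assumed fixed-point integer format."""
    return int(round(v * ONE))

def cordic_sin_cos(alpha, stages=10):
    """Return (cos(alpha), sin(alpha)) approximations for |alpha| up to about 1.74 rad."""
    shifts = list(range(stages))
    # Per-stage constants arctan(2^-s); hardware would store them as presets.
    angles = [to_fixed(math.atan(2.0 ** -s)) for s in shifts]
    # CORDIC gain of the chosen shift sequence, compensated in the start value.
    gain = math.prod(math.sqrt(1.0 + 4.0 ** -s) for s in shifts)
    x, y, z = to_fixed(1.0 / gain), 0, to_fixed(alpha)
    for s, a in zip(shifts, angles):
        if z >= 0:             # sign of Z selects add or subtract in all three adders
            x, y, z = x - (y >> s), y + (x >> s), z - a
        else:
            x, y, z = x + (y >> s), y - (x >> s), z + a
    return x / ONE, y / ONE

c, s = cordic_sin_cos(0.5)
print(c, s)                    # close to math.cos(0.5) and math.sin(0.5)
```

Each loop iteration corresponds to one pipeline stage: two shifts, three add/subtract operations, and one sign test. With ten stages the residual angle is bounded by arctan(2^-9), so the results are accurate to roughly two to three decimal places.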
Embodiment 2: This embodiment is described with reference to Figs. 1 and 2. It differs from Embodiment 1 in that the interconnect bus 2 is a 64-bit interconnect bus and the reconfigurable switch network 3 consists of a number of switch transistors 3-1. A switch transistor 3-1 is placed at each crossing point of the 64 longitudinal interconnect bus lines 2-1 and the 64 transverse interconnect bus lines 2-2. The two main terminals of each switch transistor 3-1 are connected to the longitudinal interconnect bus 2-1 and the transverse interconnect bus 2-2 respectively, and its control terminal is connected to the preset memory 3-2 that controls the switch transistor. In operation, programming the preset memory 3-2 sets each switch transistor 3-1 on or off, which determines how the chip is configured. The switch transistor 3-1 may be either a CMOS or an NMOS transistor.
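The switch network of this embodiment behaves like a crossbar whose crossing points are opened or closed by stored bits. The sketch below is a simplified behavioral model under stated assumptions: the class `SwitchNetwork` and its methods are hypothetical names, horizontal lines are treated as sources and vertical lines as sinks, and a vertical line driven by more than one source is rejected, whereas real pass-transistor switches are bidirectional.

```python
class SwitchNetwork:
    """Crossbar model: preset[v][h] is True when the switch at crossing (v, h) is closed."""

    def __init__(self, n_vertical=64, n_horizontal=64):
        self.n_vertical = n_vertical
        self.n_horizontal = n_horizontal
        self.preset = [[False] * n_horizontal for _ in range(n_vertical)]

    def configure(self, closed_points):
        """Reconfiguration phase: rewrite the preset memory from a list of (v, h) pairs."""
        self.preset = [[False] * self.n_horizontal for _ in range(self.n_vertical)]
        for v, h in closed_points:
            self.preset[v][h] = True

    def drive(self, horizontal_values):
        """Working phase: return the value seen on each vertical line (None if unconnected)."""
        seen = []
        for row in self.preset:
            sources = [horizontal_values[h] for h, closed in enumerate(row) if closed]
            if len(sources) > 1:
                raise ValueError("more than one horizontal line drives a vertical line")
            seen.append(sources[0] if sources else None)
        return seen

net = SwitchNetwork(4, 4)                  # small instance for illustration
net.configure([(0, 2), (3, 1)])            # close two crossing points
routed = net.drive([10, 11, 12, 13])       # vertical 0 sees 12, vertical 3 sees 11
```

Changing the routing requires only a new call to `configure`, which mirrors how reprogramming the preset memory changes the chip's structure without altering the wiring.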

Claims (2)

1. An array-type reconfigurable DSP engine chip structure based on CORDIC units, in which an interconnect bus (2) is provided between a number of reconfigurable processing units (1) arranged in an array; the longitudinal interconnect bus (2-1) and the transverse interconnect bus (2-2) are connected to each other through a reconfigurable switch network (3); adjacent reconfigurable processing units (1) in the same column are connected longitudinally through basic-unit data lines (4); and the basic-unit data lines (4) are connected to the transverse interconnect bus (2-2) through the reconfigurable switch network (3); characterized in that each reconfigurable processing unit (1) is a multi-stage pipeline structure implementing the CORDIC algorithm, and in that, at the same pipeline stage of transversely adjacent reconfigurable processing units (1), each adder and the adder at the corresponding position, each shifter and the shifter at the corresponding position, each accumulator and the accumulator at the corresponding position, and each register and the register at the corresponding position are respectively connected through an interconnect line (7) containing a control switch (5).
2. The array-type reconfigurable DSP engine chip structure based on CORDIC units according to claim 1, characterized in that the interconnect bus (2) is a 64-bit interconnect bus; the reconfigurable switch network (3) consists of a number of switch transistors (3-1); a switch transistor (3-1) is placed at each crossing point of the 64 longitudinal interconnect bus lines (2-1) and the 64 transverse interconnect bus lines (2-2); the two main terminals of each switch transistor (3-1) are connected to the longitudinal interconnect bus (2-1) and the transverse interconnect bus (2-2) respectively; and the control terminal of the switch transistor (3-1) is connected to the preset memory (3-2) that controls the switch transistor.
CN 200410013670 2004-04-09 2004-04-09 Array type reconstructural DSP engine chip structure based on CORDIC unit Pending CN1564125A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200410013670 CN1564125A (en) 2004-04-09 2004-04-09 Array type reconstructural DSP engine chip structure based on CORDIC unit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200410013670 CN1564125A (en) 2004-04-09 2004-04-09 Array type reconstructural DSP engine chip structure based on CORDIC unit

Publications (1)

Publication Number Publication Date
CN1564125A true CN1564125A (en) 2005-01-12

Family

ID=34478237

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200410013670 Pending CN1564125A (en) 2004-04-09 2004-04-09 Array type reconstructural DSP engine chip structure based on CORDIC unit

Country Status (1)

Country Link
CN (1) CN1564125A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101620587B (en) * 2008-07-03 2011-01-19 中国人民解放军信息工程大学 Flexible reconfigurable task processing unit structure
CN102650860A (en) * 2011-02-25 2012-08-29 西安邮电学院 Controller structure of signal processing hardware in novel data stream DSP (digital signal processor)
CN102163247A (en) * 2011-04-02 2011-08-24 北京大学深圳研究生院 Array structure of reconfigurable operators
CN102214158A (en) * 2011-06-08 2011-10-12 清华大学 Dynamic reconfigurable processor with full-interconnection routing structure
CN102339269B (en) * 2011-09-09 2017-10-27 北京大学深圳研究生院 A kind of reconfigurable operator array structure suitable for WLP packing forms
CN102339269A (en) * 2011-09-09 2012-02-01 北京大学深圳研究生院 Reconfigurable operator array structure suitable for WLP (Wafer Level Packaging) packaging mode
CN102624653A (en) * 2012-01-13 2012-08-01 清华大学 Extensible QR decomposition method based on pipeline working mode
CN102624653B (en) * 2012-01-13 2014-08-20 清华大学 Extensible QR decomposition method based on pipeline working mode
CN103390071A (en) * 2012-05-07 2013-11-13 北京大学深圳研究生院 Hierarchical interconnection structure of reconfigurable operator array
CN106326628B (en) * 2015-12-03 2018-12-28 西安邮电大学 A kind of reconfigurable array structure for realizing natural logrithm and natural exponential function
CN106326628A (en) * 2015-12-03 2017-01-11 西安邮电大学 Reconstructing array structure for natural logarithm and natural exponential functions
CN105843774B (en) * 2016-03-23 2018-10-02 东南大学—无锡集成电路技术研究所 A kind of Reconfigurable Computation cellular construction that dynamic multi-mode can match
CN105843774A (en) * 2016-03-23 2016-08-10 东南大学—无锡集成电路技术研究所 Dynamic multimode configurable reconstructed computation unit structure
CN109933372A (en) * 2019-02-26 2019-06-25 西安理工大学 A kind of changeable framework low power processor of multi-mode dynamic
CN109933372B (en) * 2019-02-26 2022-12-09 西安理工大学 Multi-mode dynamic switchable architecture low-power-consumption processor
CN110597755A (en) * 2019-08-02 2019-12-20 北京多思安全芯片科技有限公司 Recombination configuration method of safety processor
CN110597755B (en) * 2019-08-02 2024-01-09 北京多思安全芯片科技有限公司 Recombination configuration method of safety processor

Similar Documents

Publication Publication Date Title
CN101782893B (en) Reconfigurable data processing platform
KR100948512B1 (en) Floating point unit-processing elementFPU-PE structure, reconfigurable array processorRAP comprising the same FPU-PE structure, and multi-media platform comprising the same RAP
CN111178519A (en) Convolutional neural network acceleration engine, convolutional neural network acceleration system and method
CN105912501B (en) A kind of SM4-128 Encryption Algorithm realization method and systems based on extensive coarseness reconfigurable processor
CN101729463A (en) Hardware device and method for implementing Fourier transform and Fourier inverse transform
CN105335331B (en) A kind of SHA256 realization method and systems based on extensive coarseness reconfigurable processor
CN1564125A (en) Array type reconstructural DSP engine chip structure based on CORDIC unit
CN102508643A (en) Multicore-parallel digital signal processor and method for operating parallel instruction sets
CN101685385A (en) Complex multiplier
CN102945224A (en) High-speed variable point FFT (Fast Fourier Transform) processor based on FPGA (Field-Programmable Gate Array) and processing method of high-speed variable point FFT processor
CN101847137B (en) FFT processor for realizing 2FFT-based calculation
CN110717583B (en) Convolution circuit, processor, chip, board card and electronic equipment
CN102306141B (en) Method for describing configuration information of dynamic reconfigurable array
CN101625634A (en) Reconfigurable multiplier
CN109753268B (en) Multi-granularity parallel operation multiplier
CN100465877C (en) High speed split multiply accumulator apparatus
CN109284824A (en) A kind of device for being used to accelerate the operation of convolution sum pond based on Reconfiguration Technologies
CN102214158B (en) Dynamic reconfigurable processor with full-interconnection routing structure
CN103984677A (en) Embedded reconfigurable system based on large-scale coarseness and processing method thereof
CN105975251A (en) DES algorithm round iteration system and method based on coarse-grained reconfigurable architecture
CN110851779A (en) Systolic array architecture for sparse matrix operations
CN101038582B (en) Systolic array processing method and circuit used for self-adaptive optical wave front restoration calculation
Maliţa et al. Not multi-, but many-core: designing integral parallel architectures for embedded computation
CN112559954B (en) FFT algorithm processing method and device based on software-defined reconfigurable processor
CN102799564A (en) Fast fourier transformation (FFT) parallel method based on multi-core digital signal processor (DSP) platform

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication