CN101236576B - Interconnecting model suitable for heterogeneous reconfigurable processor - Google Patents
Interconnecting model suitable for heterogeneous reconfigurable processor
- Publication number
- CN101236576B
- Authority
- CN
- China
- Prior art keywords
- data
- input
- reconfigurable
- storage unit
- processing core
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Logic Circuits (AREA)
- Design And Manufacture Of Integrated Circuits (AREA)
Abstract
The invention belongs to the technical field of integrated circuit design and relates in particular to an interconnect model suitable for a heterogeneous reconfigurable processor, used for data transmission and exchange among the heterogeneous reconfigurable processing cores of the processor. The model normalizes the outputs of the heterogeneous reconfigurable cores and then provides an interconnect designed for the highest speed and the greatest flexibility: under the control of dedicated control bits, data output by one reconfigurable processing core can be delivered to any reconfigurable processing core for processing within two clock cycles.
Description
Technical field
The invention belongs to the technical field of integrated circuit (IC) design. It relates to an interconnection network model, and in particular to an interconnect model suitable for a heterogeneous reconfigurable processor, used for data transmission and exchange among the heterogeneous reconfigurable processing cores of the processor.
Background technology
Reconfigurable processors are increasingly widely used and developed because of their advantages in generality, flexibility, and performance. A heterogeneous reconfigurable processor architecture contains several different processing cores, each targeting different specific operations, so it outperforms a homogeneous reconfigurable processor in area, power consumption, and suitability for specific application domains. However, because the data widths and the numbers of inputs and outputs differ from core to core in such an architecture, the data-transfer interconnection network between the reconfigurable processing cores becomes a difficult design problem.
Existing interconnection networks broadly include full interconnection, bus-based schemes, mesh structures, and network-on-chip (NoC, Network on Chip) approaches. Each has shortcomings in area, flexibility, or latency.
Summary of the invention
The object of the present invention is to provide an interconnect model suitable for a heterogeneous reconfigurable processor, used for data transmission and exchange among the heterogeneous reconfigurable processing cores of the processor. The model normalizes the outputs of the heterogeneous reconfigurable cores and then provides an interconnect designed for the highest speed and the greatest flexibility.
The interconnect model provided by the invention is divided into a two-level interconnection network (as shown in Fig. 1), comprising the global interconnect 102 and the local interconnects 103 and 104. Data of different widths from the different reconfigurable processing cores 101 are normalized, and a unified granularity is used for storage, data transmission, and data exchange 105.
The global interconnect has three parts. The first is a set of storage units 201 of identical data granularity, used for global data storage and for buffering exchanged data; all of these storage units are implemented as registers. The second is the multiplexer array 202 from the local output data to the storage units 201: each storage unit can select any one item of the local output data 204 and store it. The third is the multiplexer array 203 from the storage units to the local input data 205: each local input data item 205 of each reconfigurable processing core can select any one storage unit and read the data stored in it. Every one of these selections operates at the same data granularity as the storage units 105.
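The three parts of the global interconnect can be sketched as a small behavioural model in Python. This is an illustrative sketch only, not the patented circuit: the class name `GlobalInterconnect`, the method names, and the convention that a select value of `None` makes a storage unit hold its previous value are all assumptions introduced here.

```python
from typing import List, Optional


class GlobalInterconnect:
    """Behavioural sketch: register storage units plus two multiplexer arrays."""

    def __init__(self, num_units: int, granularity_bits: int) -> None:
        # Every storage unit (201) holds one word of the unified granularity.
        self.mask = (1 << granularity_bits) - 1
        self.units: List[int] = [0] * num_units

    def write(self, local_outputs: List[int],
              write_sel: List[Optional[int]]) -> None:
        """Mux array 202: unit i stores local_outputs[write_sel[i]].

        A select of None keeps the old value (hypothetical hold behaviour).
        """
        if len(write_sel) != len(self.units):
            raise ValueError("one select value per storage unit is required")
        self.units = [
            old if sel is None else local_outputs[sel] & self.mask
            for old, sel in zip(self.units, write_sel)
        ]

    def read(self, read_sel: List[int]) -> List[int]:
        """Mux array 203: local input word j receives units[read_sel[j]]."""
        return [self.units[sel] for sel in read_sel]


# Usage: three local output words, four storage units, four local input words.
gi = GlobalInterconnect(num_units=4, granularity_bits=64)
gi.write([0xAAAA, 0xBBBB, 0xCCCC], [2, None, 0, 0])
local_inputs = gi.read([0, 2, 3, 1])
```

Because every select value is independent, one local output word can be broadcast to several storage units (as `0xAAAA` is above), and several local input words can read the same unit.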
The local interconnect is of two kinds: the local input interconnect 103 and the local output interconnect 104. The local input interconnect 103 uses a multiplexer array 301 to supply each input 303 of a reconfigurable processing core with data matching that input's width. Each input 303 can select any one width-matched data item from the local input data 304, which is then fed into the reconfigurable processing core for computation. In the local output interconnect 104, the multiple outputs 403 of a reconfigurable processing core are merged and reorganized into data items of the same size as the granularity of the storage units 105 in the global interconnect; these enter the global interconnect as the local output data 402 of that core.
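The width normalization performed by the two local interconnects can be illustrated with a short Python sketch. It assumes that the global word width is an exact multiple of the core's data width and that narrower values are packed starting from the least significant bits; the source does not specify the packing order, and the function names are hypothetical.

```python
from typing import List


def local_output_pack(outputs: List[int], output_bits: int,
                      word_bits: int) -> List[int]:
    """Local output interconnect: merge core outputs into global-width words."""
    per_word = word_bits // output_bits
    mask = (1 << output_bits) - 1
    words = []
    for start in range(0, len(outputs), per_word):
        word = 0
        # Place each narrow output in its own slice of the wide word.
        for k, value in enumerate(outputs[start:start + per_word]):
            word |= (value & mask) << (k * output_bits)
        words.append(word)
    return words


def local_input_select(local_words: List[int], word_bits: int,
                       input_bits: int, selects: List[int]) -> List[int]:
    """Local input interconnect: each core input picks one width-matched slice."""
    slices_per_word = word_bits // input_bits
    mask = (1 << input_bits) - 1
    # Cut every wide word into candidate slices of the core's input width.
    slices = [
        (word >> (k * input_bits)) & mask
        for word in local_words
        for k in range(slices_per_word)
    ]
    return [slices[s] for s in selects]


# Usage: eight 16-bit outputs become two 64-bit words, and three 16-bit
# inputs then pick arbitrary slices out of those words.
packed = local_output_pack([1, 2, 3, 4, 5, 6, 7, 8], 16, 64)
picked = local_input_select(packed, 64, 16, [0, 7, 4])
```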
Transferring data from the outputs 403 of each heterogeneous reconfigurable processing core to the global storage units 105 takes one clock cycle; transferring data from the global storage units 105 to the inputs 303 of each heterogeneous reconfigurable processing core takes another clock cycle.
In every clock cycle, the control information needed by each multiplexer comes independently from the externally supplied control bits 106, which steer where data is stored and where it flows. In this way the model provides an interconnect designed for the highest speed and the greatest flexibility.
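The two-cycle path can be made concrete with a minimal cycle-level sketch in Python. It assumes that both the global storage units and the core inputs are registers updated on the same clock edge from their pre-edge values; the function `clock_edge` and the select lists standing in for the control bits 106 are hypothetical names chosen for illustration.

```python
from typing import List, Tuple


def clock_edge(storage: List[int], core_outputs: List[int],
               write_sel: List[int],
               read_sel: List[int]) -> Tuple[List[int], List[int]]:
    """Advance one clock edge and return (next_storage, next_core_inputs)."""
    # Storage -> core input stage reads the values held before the edge.
    next_inputs = [storage[s] for s in read_sel]
    # Core output -> storage stage latches the selected outputs.
    next_storage = [core_outputs[s] for s in write_sel]
    return next_storage, next_inputs


# Usage: the value 42 on core output 0 reaches a core input after two edges.
storage = [0, 0]
outputs = [42, 7]
storage, inputs_after_1 = clock_edge(storage, outputs, [1, 0], [1])
storage, inputs_after_2 = clock_edge(storage, outputs, [1, 0], [1])
```

After the first edge the value sits in a storage unit but has not yet reached any core input; after the second edge it appears at the selected input, matching the two-cycle latency described above.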
The scale and latency of this interconnect model vary with the number of reconfigurable processing cores in the heterogeneous reconfigurable processor and with the number of inputs and outputs of each core.
Description of drawings
Fig. 1 shows the interconnect model for a heterogeneous reconfigurable processor proposed by the invention.
Fig. 2 shows the global interconnect structure of the proposed interconnect model.
Fig. 3 shows the local input interconnect structure of the proposed interconnect model.
Fig. 4 shows the local output interconnect structure of the proposed interconnect model.
Reference numerals in the figures:
101 denotes the reconfigurable processing core units in the heterogeneous reconfigurable processor architecture; 102 is the global interconnect provided by the invention; 103 is the local input interconnect provided by the invention; 104 is the local output interconnect provided by the invention; 105 denotes the global storage units of identical data granularity; 106 is the interconnect control information.
201 denotes the global storage units of identical data granularity; 202 is the multiplexer array from the local output data to the storage units; 203 is the multiplexer array from the storage units to the local input data; 204 is the local output data of each heterogeneous reconfigurable processing core; 205 is the local input data of each heterogeneous reconfigurable processing core.
301 is the multiplexer array in the local input interconnect; 302 is the local input data of reconfigurable processing core K; 303 denotes the individual inputs of reconfigurable processing core K.
401 is the output data merging and combining unit in the local output interconnect; 402 is the local output data of reconfigurable processing core K; 403 denotes the individual outputs of reconfigurable processing core K.
Embodiment
The invention is further illustrated below with an example of the interconnect model.
The proposed interconnect model is instantiated in a heterogeneous reconfigurable processor architecture to show its scale. The architecture contains four heterogeneous reconfigurable processing cores 101. Core A has at most 32 inputs and 8 outputs, all 16 bits wide. Core B has at most 16 inputs and 8 outputs, all 16 bits wide. Core C has at most 32 inputs, each 16 bits wide, and 16 outputs, each 8 bits wide. Core D has 1 input, 1 bit wide, and 6 outputs, each 64 bits wide.
Based on the requirements of the architecture and the volume of operand data, the granularity of the global storage data 105 is set to 64 bits, with 64 global storage units in total. The local input data 205 of core A is 4*64 bits wide and its local output data 204 is 2*64 bits wide. The local input data 205 of core B is 4*64 bits wide and its local output data 204 is 2*64 bits wide. The local input data 205 of core C is 4*64 bits wide and its local output data 204 is 2*64 bits wide. The local input data 205 of core D is 1*64 bits wide and its local output data 204 is 6*64 bits wide.
From these figures, in the global interconnect the path from the local output data 204 to the storage units 105 needs 64 16-to-1 multiplexers in total, and the path from the storage units 105 to the local input data 205 of the four reconfigurable processing cores needs 13 64-to-1 multiplexers in total. The local input interconnect 103 of core A needs 32 16-to-1 multiplexers; that of core B needs 16 16-to-1 multiplexers; that of core C needs 32 16-to-1 multiplexers; and that of core D needs one 64-to-1 multiplexer.
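The multiplexer counts above follow from the core parameters by simple arithmetic, reproduced in the Python sketch below. One step is an assumption made here rather than something the source states: the four cores supply 12 local output words in total, and the sketch rounds that up to the next power of two to arrive at the 16-to-1 multiplexers quoted for the storage units. The variable names are illustrative.

```python
GRANULARITY = 64  # bits per global storage unit
NUM_UNITS = 64    # number of global storage units

# name: (inputs, input bits, local input words, outputs, output bits)
CORES = {
    "A": (32, 16, 4, 8, 16),
    "B": (16, 16, 4, 8, 16),
    "C": (32, 16, 4, 16, 8),
    "D": (1, 1, 1, 6, 64),
}


def next_pow2(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


# Local output words per core: outputs * width / granularity (2, 2, 2, 6).
out_words = {
    name: n_out * out_bits // GRANULARITY
    for name, (_, _, _, n_out, out_bits) in CORES.items()
}
total_out_words = sum(out_words.values())
total_in_words = sum(core[2] for core in CORES.values())

# (number of multiplexers, inputs per multiplexer)
global_write = (NUM_UNITS, next_pow2(total_out_words))  # mux array 202
global_read = (total_in_words, NUM_UNITS)               # mux array 203

# Each core input chooses among all width-matched slices of its local words.
local_input = {
    name: (n_in, in_words * GRANULARITY // in_bits)
    for name, (n_in, in_bits, in_words, _, _) in CORES.items()
}
```

Under these assumptions the computed values agree with the counts given in the text: 64 16-to-1 and 13 64-to-1 multiplexers in the global interconnect, and 32, 16, and 32 16-to-1 multiplexers plus one 64-to-1 multiplexer in the local input interconnects of cores A to D.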
In this instantiation, with one clock cycle allotted to the transfer from the outputs 403 of each heterogeneous reconfigurable processing core to the global storage units 105 and one clock cycle allotted to the transfer from the global storage units 105 to the inputs 303 of each core, the proposed interconnection network can reach an operating frequency of 150 MHz.
Claims (4)
1. An interconnect system suitable for a heterogeneous reconfigurable processor, characterized in that it is divided into a two-level interconnection network, comprising a global interconnect and a local interconnect, and uses a unified granularity for storage, data transmission, and data exchange; wherein:
The global interconnect has three parts: the first is a set of storage units (201) of identical data granularity, used for global data storage and for buffering exchanged data, all of these storage units being registers; the second is the multiplexer array (202) from the local output data to the storage units (201), each storage unit being able to select any one item of the local output data (204) and store it; the third is the multiplexer array (203) from the storage units to the local input data (205), each local input data item (205) of each reconfigurable processing core being able to select any one storage unit and read the data stored in it; every one of these selections operates at the same data granularity as the storage units (201);
The local interconnect is of two kinds, the local input interconnect (103) and the local output interconnect (104); the local input interconnect (103) uses a multiplexer array (301) to supply each input (303) of a reconfigurable processing core with data matching that input's width, each input (303) of the reconfigurable processing core being able to select any one width-matched data item from the local input data (304), which is then fed into the reconfigurable processing core for computation; in the local output interconnect (104), the multiple outputs (403) of a reconfigurable processing core are merged and reorganized into data items of the same size as the granularity of the storage units (201) in the global interconnect, which enter the global interconnect as the local output data (402) of that reconfigurable processing core.
2. The interconnect system according to claim 1, characterized in that transferring data from the outputs of each heterogeneous reconfigurable processing core to the global storage units takes one clock cycle.
3. The interconnect system according to claim 1, characterized in that transferring data from the global storage units to the inputs of each heterogeneous reconfigurable processing core takes one clock cycle.
4. The interconnect system according to claim 1, characterized in that every data selection performed by the multiplexer arrays is independently controlled by externally supplied control bits.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008100333220A CN101236576B (en) | 2008-01-31 | 2008-01-31 | Interconnecting model suitable for heterogeneous reconfigurable processor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008100333220A CN101236576B (en) | 2008-01-31 | 2008-01-31 | Interconnecting model suitable for heterogeneous reconfigurable processor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101236576A CN101236576A (en) | 2008-08-06 |
CN101236576B true CN101236576B (en) | 2011-12-07 |
Family
ID=39920192
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2008100333220A Expired - Fee Related CN101236576B (en) | 2008-01-31 | 2008-01-31 | Interconnecting model suitable for heterogeneous reconfigurable processor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101236576B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102323916B (en) * | 2011-06-14 | 2013-05-22 | 清华大学 | Method and device for one-to-one data interaction among dynamic reconfigurable processors |
KR101912427B1 (en) * | 2011-12-12 | 2018-10-29 | 삼성전자주식회사 | Reconfigurable processor and mini-core of reconfigurable processor |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1641619A (en) * | 2004-01-17 | 2005-07-20 | 中国科学院计算技术研究所 | Multi-processor communication and its communication method |
WO2007067562A3 (en) * | 2005-12-06 | 2007-10-25 | Boston Circuits Inc | Methods and apparatus for multi-core processing with dedicated thread management |
Non-Patent Citations (1)
Title |
---|
Dong Peiliang et al. Design of a reconfigurable processor. Journal of Fudan University (Natural Science). 2004, Vol. 43 (No. 1), 45-49. * |
Also Published As
Publication number | Publication date |
---|---|
CN101236576A (en) | 2008-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Pellauer et al. | Buffets: An efficient and composable storage idiom for explicit decoupled data orchestration | |
Cong et al. | A fully pipelined and dynamically composable architecture of CGRA | |
CN103744644B (en) | The four core processor systems built using four nuclear structures and method for interchanging data | |
CN105378651B (en) | Memory-network processing unit with programmable optimization | |
US7200735B2 (en) | High-performance hybrid processor with configurable execution units | |
CN101799750B (en) | Data processing method and device | |
CN108537331A (en) | A kind of restructural convolutional neural networks accelerating circuit based on asynchronous logic | |
CN103730149B (en) | A kind of read-write control circuit of dual-ported memory | |
CN103761075B (en) | Coarse granularity dynamic reconfigurable data integration and control unit structure | |
CN102782672A (en) | A tile-based processor architecture model for high efficiency embedded homogneous multicore platforms | |
JP2008537268A (en) | An array of data processing elements with variable precision interconnection | |
Ax et al. | CoreVA-MPSoC: A many-core architecture with tightly coupled shared and local data memories | |
CN107562549A (en) | Isomery many-core ASIP frameworks based on on-chip bus and shared drive | |
US20070136560A1 (en) | Method and apparatus for a shift register based interconnection for a massively parallel processor array | |
US7624209B1 (en) | Method of and circuit for enabling variable latency data transfers | |
CN101236576B (en) | Interconnecting model suitable for heterogeneous reconfigurable processor | |
Leibson et al. | Configurable processors: a new era in chip design | |
CN103365821B (en) | A kind of address generator of heterogeneous multi-nucleus processor | |
CN103455367B (en) | For realizing administrative unit and the method for multi-task scheduling in reconfigurable system | |
US7720636B1 (en) | Performance monitors (PMs) for measuring performance in a system and providing a record of transactions performed | |
Sievers et al. | Comparison of shared and private l1 data memories for an embedded mpsoc in 28nm fd-soi | |
Shang et al. | LACS: A high-computational-efficiency accelerator for CNNs | |
Prasad et al. | Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality With At-MRAM Neural Engine | |
CN102129495B (en) | Method for reducing power consumption of reconfigurable operator array structure | |
CN109902040A (en) | A kind of System on Chip/SoC of integrated FPGA and artificial intelligence module |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20111207 Termination date: 20140131 |