CN101236576B - Interconnecting model suitable for heterogeneous reconfigurable processor - Google Patents

Interconnecting model suitable for heterogeneous reconfigurable processor

Info

Publication number
CN101236576B
CN101236576B CN2008100333220A CN200810033322A CN101236576B CN 101236576 B CN101236576 B CN 101236576B CN 2008100333220 A CN2008100333220 A CN 2008100333220A CN 200810033322 A CN200810033322 A CN 200810033322A CN 101236576 B CN101236576 B CN 101236576B
Authority
CN
China
Prior art keywords
data
input
reconfigurable
storage unit
processing core
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2008100333220A
Other languages
Chinese (zh)
Other versions
CN101236576A (en)
Inventor
陆雯青
赵爽
陆超
周晓方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN2008100333220A
Publication of CN101236576A
Application granted
Publication of CN101236576B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Logic Circuits (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The invention belongs to the technical field of integrated circuit design and relates in particular to an interconnect model suitable for a heterogeneous reconfigurable processor, used for data transmission and exchange among the heterogeneous reconfigurable processing cores of the processor. The model normalizes the outputs of the heterogeneous reconfigurable cores and then provides an interconnect with the highest speed and maximum flexibility: under the control of certain control bits, data output from one reconfigurable processing core can be input to any reconfigurable processing core for processing within two clock cycles.

Description

An interconnect model suitable for a heterogeneous reconfigurable processor
Technical field
The invention belongs to the technical field of integrated circuit (IC) design. It relates to an interconnection network model, and in particular to an interconnect model suitable for a heterogeneous reconfigurable processor, used for data transmission and exchange among the heterogeneous reconfigurable processing cores of the processor.
Background art
Reconfigurable processors are increasingly widely used and developed because of their advantages in generality, flexibility, and performance. A heterogeneous reconfigurable processor architecture contains several different processing cores, each targeting different specific operations; it is therefore better than a homogeneous reconfigurable processor in area, power consumption, and suitability for specific application domains. However, because the data width and the number of input and output data differ from core to core in a heterogeneous reconfigurable processor architecture, designing the data-transmission interconnection network between these reconfigurable processing cores is difficult.
Existing interconnection networks broadly include full interconnect, bus, mesh, and network-on-chip (NoC, Network on Chip) approaches, each of which has shortcomings in area, flexibility, or latency.
Summary of the invention
The object of the invention is to provide an interconnect model suitable for a heterogeneous reconfigurable processor, used for data transmission and exchange among the heterogeneous reconfigurable processing cores of the processor. The model normalizes the outputs of the multiple heterogeneous reconfigurable cores and then provides an interconnect with the highest speed and maximum flexibility.
The interconnect model provided by the invention is divided into a two-level interconnection network (as shown in Fig. 1), called the global interconnect 102 and the local interconnects 103 and 104. Data of different widths from the different reconfigurable processing cores 101 are normalized, and a unified granularity 105 is used for storage, data transmission, and exchange.
The global interconnect consists of three parts. The first is a set of storage units 201 with identical data granularity, used for global data storage and for temporarily holding data being exchanged; all of these storage units are implemented as registers. The second is the multiplexer array 202 from the local output data to each storage unit 201: each storage unit can select any one item of the local output data 204 and store it. The third is the multiplexer array 203 from the storage units to the local input data 205: the local input data 205 of each reconfigurable processing core can select any storage unit and obtain the data stored in it. The data granularity of all these selections is the same as the data granularity of the storage units 105.
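The three parts of the global interconnect can be sketched as a small behavioural model. This is an illustrative sketch only, not the patented circuit: the class name `GlobalInterconnect`, the method names, and the use of `None` to mean "keep the stored value" are assumptions introduced here (the text does not describe a write enable), and the 64-bit word size is borrowed from the embodiment.

```python
from typing import List, Optional

WORD_BITS = 64                      # storage-unit granularity (value from the embodiment)
WORD_MASK = (1 << WORD_BITS) - 1


class GlobalInterconnect:
    """Hypothetical model: registers (201) with a write-side mux array (202)
    and a read-side mux array (203)."""

    def __init__(self, num_units: int) -> None:
        self.regs: List[int] = [0] * num_units   # storage units 201

    def write(self, local_out_words: List[int],
              write_sel: List[Optional[int]]) -> None:
        """Mux array 202: unit i stores local_out_words[write_sel[i]].
        A select of None keeps the old value (an assumption of this sketch)."""
        if len(write_sel) != len(self.regs):
            raise ValueError("one select value is needed per storage unit")
        for i, sel in enumerate(write_sel):
            if sel is not None:
                self.regs[i] = local_out_words[sel] & WORD_MASK

    def read(self, read_sel: List[int]) -> List[int]:
        """Mux array 203: local input word j receives regs[read_sel[j]]."""
        return [self.regs[sel] for sel in read_sel]


# Usage: three local output words, four storage units, two local input words.
gi = GlobalInterconnect(num_units=4)
gi.write([0xAAAA, 0xBBBB, 0xCCCC], write_sel=[2, 0, None, 1])
routed = gi.read([3, 0])
print([hex(w) for w in routed])
```

Because every storage unit has its own write select and every local input word has its own read select, any local output word can reach any local input word, which is the flexibility the text describes.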
The local interconnect is of two kinds: the local input interconnect 103 and the local output interconnect 104. The local input interconnect 103 uses a multiplexer array 301 to provide each input 303 of a reconfigurable processing core with data matching its input data width. Each input 303 of the reconfigurable processing core can select, from the local input data 304, any one item whose width matches its input data width, which is then fed into the reconfigurable processing core for computation. In the local output interconnect 104, the multiple outputs 403 of a reconfigurable processing core are merged and reorganized into data of the same size as the granularity of the storage units 105 in the global interconnect, and enter the global interconnect as the local output data 402 of that reconfigurable core.
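The two local interconnects can be illustrated with a pair of helper functions. This is a minimal sketch under stated assumptions: the function names are hypothetical, the lane order (lowest bits first) is chosen arbitrarily because the text does not specify one, and the core's data width is assumed to divide the 64-bit word evenly.

```python
from typing import List

WORD_BITS = 64   # global storage granularity (value from the embodiment)


def local_input_select(local_in_words: List[int], in_width: int,
                       sel: List[int]) -> List[int]:
    """Mux array 301: cut the local input words into in_width-bit lanes and
    let each core input pick one lane by index."""
    assert WORD_BITS % in_width == 0
    lanes_per_word = WORD_BITS // in_width
    mask = (1 << in_width) - 1
    lanes = [(word >> (k * in_width)) & mask
             for word in local_in_words
             for k in range(lanes_per_word)]
    return [lanes[s] for s in sel]


def local_output_pack(outputs: List[int], out_width: int) -> List[int]:
    """Unit 401: merge the core outputs into 64-bit words, lowest lane first."""
    assert WORD_BITS % out_width == 0
    per_word = WORD_BITS // out_width
    mask = (1 << out_width) - 1
    words = []
    for base in range(0, len(outputs), per_word):
        word = 0
        for k, value in enumerate(outputs[base:base + per_word]):
            word |= (value & mask) << (k * out_width)
        words.append(word)
    return words


# Usage: eight 16-bit outputs pack into two 64-bit words; a core input then
# picks individual 16-bit lanes back out of those words.
packed = local_output_pack(list(range(1, 9)), out_width=16)
picked = local_input_select(packed, in_width=16, sel=[7, 0])
print([hex(w) for w in packed], picked)
```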
Data takes one clock cycle to travel from the outputs 403 of each heterogeneous reconfigurable processing core to the global storage structure 105, and another clock cycle to travel from the global storage structure 105 to the inputs 303 of each heterogeneous reconfigurable processing core.
In every clock cycle, the control information needed by each multiplexer comes independently from the externally supplied control bits 106, which control where data is stored and where it flows. This provides an interconnect with the highest speed and maximum flexibility.
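One way to picture the independent control bits is as a flat control word that is split into one select field per multiplexer. The sketch below is an assumption for illustration: the text does not define any encoding, so the binary select fields, the field order, and the names `select_bits` and `decode_control` are all hypothetical.

```python
from typing import List


def select_bits(fan_in: int) -> int:
    """Smallest number of bits that can index fan_in choices in binary."""
    return (fan_in - 1).bit_length()


def decode_control(word: int, fan_ins: List[int]) -> List[int]:
    """Split a flat control word into one select value per mux,
    taking the lowest-order field first (an assumed layout)."""
    selects = []
    for fan_in in fan_ins:
        bits = select_bits(fan_in)
        selects.append(word & ((1 << bits) - 1))
        word >>= bits
    return selects


# Usage: one 16-to-1 mux (4 select bits) followed by one 64-to-1 mux (6 bits).
fields = decode_control(0b000011_0101, fan_ins=[16, 64])
print(select_bits(16), select_bits(64), fields)
```

Since each multiplexer reads its own field, changing the route of one data item does not disturb any other route in the same clock cycle.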
The scale and delay of this interconnect model vary with the number of reconfigurable processing cores in the heterogeneous reconfigurable processor and with the number of inputs and outputs of each reconfigurable core.
Description of drawings
Fig. 1 shows the interconnect model for a heterogeneous reconfigurable processor proposed by the invention.
Fig. 2 shows the global interconnect structure of the interconnect model proposed by the invention.
Fig. 3 shows the local input interconnect structure of the interconnect model proposed by the invention.
Fig. 4 shows the local output interconnect structure of the interconnect model proposed by the invention.
Reference numerals in the figures:
101 denotes the reconfigurable processing core units in the heterogeneous reconfigurable processor architecture; 102 is the global interconnect provided by the invention; 103 is the local input interconnect provided by the invention; 104 is the local output interconnect provided by the invention; 105 denotes the global storage units with identical data granularity; 106 is the interconnect control information.
201 denotes the global storage units with identical data granularity; 202 denotes the multiplexers from the local output data to each storage unit; 203 denotes the multiplexers from the storage units to the local input data; 204 is the local output data of each heterogeneous reconfigurable processing core; 205 is the local input data of each heterogeneous reconfigurable processing core.
301 denotes the multiplexers in the local input interconnect; 302 is the local input data of reconfigurable processing core K; 303 denotes the individual inputs of reconfigurable processing core K.
401 is the output-data merging and combining unit in the local output interconnect; 402 is the local output data of reconfigurable processing core K; 403 denotes the individual outputs of reconfigurable processing core K.
Embodiment
The interconnect model is further explained below by means of an example:
The interconnect model proposed by the invention is instantiated in a heterogeneous reconfigurable processor architecture in order to illustrate its scale. The processor architecture contains four heterogeneous reconfigurable processing cores 101. Reconfigurable processing core A has at most 32 inputs and 8 outputs, with a data width of 16 bits. Reconfigurable processing core B has at most 16 inputs and 8 outputs, with a data width of 16 bits. Reconfigurable processing core C has at most 32 inputs with a data width of 16 bits and 16 outputs with a data width of 8 bits. Reconfigurable processing core D has 1 input with a data width of 1 bit and 6 outputs with a data width of 64 bits.
According to the requirements of the architecture and the volume of data to be processed, the granularity of the global storage data 105 is set to 64 bits, with 64 global storage units in total. The local input data 205 of reconfigurable processing core A is 4*64 bits wide and its local output data 204 is 2*64 bits wide; the local input data 205 of reconfigurable processing core B is 4*64 bits wide and its local output data 204 is 2*64 bits wide; the local input data 205 of reconfigurable processing core C is 4*64 bits wide and its local output data 204 is 2*64 bits wide; the local input data 205 of reconfigurable processing core D is 1*64 bits wide and its local output data 204 is 6*64 bits wide.
From the above figures, the global interconnect needs a total of 64 16-to-1 multiplexers from the local output data 204 to the storage units 105, and a total of 13 64-to-1 multiplexers from the storage units 105 to the local input data 205 of the four reconfigurable processing cores. The local input interconnect 103 of reconfigurable processing core A needs 32 16-to-1 multiplexers; the local input interconnect 103 of reconfigurable processing core B needs 16 16-to-1 multiplexers; the local input interconnect 103 of reconfigurable processing core C needs 32 16-to-1 multiplexers; and the local input interconnect 103 of reconfigurable processing core D needs 1 64-to-1 multiplexer.
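The multiplexer counts in this example follow from the core parameters, and the arithmetic can be reproduced with a short script. This is a sketch for checking the numbers only; the `Core` record and its field names are introduced here for illustration. Note that summing the local output words gives 12 candidate sources per storage unit, while the text specifies 16-to-1 multiplexers; reading 16 as the next power of two above 12 is an inference, not something the text states.

```python
from dataclasses import dataclass

WORD_BITS = 64    # global storage granularity in bits
NUM_UNITS = 64    # number of global storage units


@dataclass(frozen=True)
class Core:
    name: str
    num_inputs: int        # maximum number of core inputs
    in_width: int          # width of each core input in bits
    local_in_words: int    # local input data width, in 64-bit words
    local_out_words: int   # local output data width, in 64-bit words


CORES = [
    Core("A", 32, 16, 4, 2),
    Core("B", 16, 16, 4, 2),
    Core("C", 32, 16, 4, 2),
    Core("D", 1, 1, 1, 6),
]

# Global write side: one mux per storage unit, fed by every local output word.
write_muxes = NUM_UNITS
write_sources = sum(c.local_out_words for c in CORES)

# Global read side: one mux per local input word, fed by every storage unit.
read_muxes = sum(c.local_in_words for c in CORES)
read_fan_in = NUM_UNITS

# Local input side: one mux per core input, fed by every width-matched lane
# of that core's local input data.
local_muxes = {
    c.name: (c.num_inputs, c.local_in_words * WORD_BITS // c.in_width)
    for c in CORES
}

print(write_muxes, write_sources)
print(read_muxes, read_fan_in)
print(local_muxes)
```

Each entry of `local_muxes` is a pair of (number of multiplexers, inputs per multiplexer), so core A's entry of (32, 16) corresponds to the 32 16-to-1 multiplexers given in the text.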
In the above example, with data taking one clock cycle to travel from the outputs 403 of each heterogeneous reconfigurable processing core to the global storage structure 105, and one clock cycle to travel from the global storage structure 105 to the inputs 303 of each heterogeneous reconfigurable processing core, the interconnection network proposed by the invention can reach an operating frequency of 150 MHz.
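As a worked figure, and assuming the two clock cycles described above are the only stages on the path (an assumption; the text gives the frequency but no latency figure), a core-to-core transfer at 150 MHz would take roughly:

```latex
T_{\mathrm{clk}} = \frac{1}{150\ \mathrm{MHz}} \approx 6.67\ \mathrm{ns},
\qquad
T_{\mathrm{transfer}} = 2\,T_{\mathrm{clk}} \approx 13.3\ \mathrm{ns}
```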

Claims (4)

1. An interconnect system suitable for a heterogeneous reconfigurable processor, characterized in that: it is divided into a two-level interconnection network, called the global interconnect and the local interconnect, and uses a unified granularity for storage, data transmission, and exchange; wherein:
The global interconnect consists of three parts: the first is a set of storage units (201) with identical data granularity, used for global data storage and for temporarily holding data being exchanged, all of these storage units being implemented as registers; the second is the multiplexer array (202) from the local output data to each storage unit (201), whereby each storage unit can select any one item of the local output data (204) and store it; the third is the multiplexer array (203) from the storage units to the local input data (205), whereby the local input data (205) of each reconfigurable processing core can select any storage unit and obtain the data stored in it; the data granularity of all these selections is the same as the data granularity of the storage units (201);
The local interconnect is of two kinds, the local input interconnect (103) and the local output interconnect (104); the local input interconnect (103) uses a multiplexer array (301) to provide each input (303) of a reconfigurable processing core with data matching its input data width, whereby each input (303) of the reconfigurable processing core can select, from the local input data (304), any one item whose width matches its input data width, which is then fed into the reconfigurable processing core for computation; in the local output interconnect (104), the multiple outputs (403) of a reconfigurable processing core are merged and reorganized into data of the same size as the granularity of the storage units (201) in the global interconnect, and enter the global interconnect as the local output data (402) of that reconfigurable processing core.
2. The interconnect system according to claim 1, characterized in that data takes one clock cycle to travel from the outputs of each heterogeneous reconfigurable processing core to the global storage structure.
3. The interconnect system according to claim 1, characterized in that data takes one clock cycle to travel from the global storage structure to the inputs of each heterogeneous reconfigurable processing core.
4. The interconnect system according to claim 1, characterized in that each data selection performed by the multiplexer arrays is independently controlled by externally supplied control bits.
CN2008100333220A 2008-01-31 2008-01-31 Interconnecting model suitable for heterogeneous reconfigurable processor Expired - Fee Related CN101236576B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100333220A CN101236576B (en) 2008-01-31 2008-01-31 Interconnecting model suitable for heterogeneous reconfigurable processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008100333220A CN101236576B (en) 2008-01-31 2008-01-31 Interconnecting model suitable for heterogeneous reconfigurable processor

Publications (2)

Publication Number Publication Date
CN101236576A CN101236576A (en) 2008-08-06
CN101236576B true CN101236576B (en) 2011-12-07

Family

ID=39920192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100333220A Expired - Fee Related CN101236576B (en) 2008-01-31 2008-01-31 Interconnecting model suitable for heterogeneous reconfigurable processor

Country Status (1)

Country Link
CN (1) CN101236576B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102323916B (en) * 2011-06-14 2013-05-22 清华大学 Method and device for one-to-one data interaction among dynamic reconfigurable processors
KR101912427B1 (en) * 2011-12-12 2018-10-29 삼성전자주식회사 Reconfigurable processor and mini-core of reconfigurable processor

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1641619A (en) * 2004-01-17 2005-07-20 中国科学院计算技术研究所 Multi-processor communication and its communication method
WO2007067562A3 (en) * 2005-12-06 2007-10-25 Boston Circuits Inc Methods and apparatus for multi-core processing with dedicated thread management

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Dong Peiliang et al. Design of a reconfigurable processor. Journal of Fudan University (Natural Science). 2004, Vol. 43 (No. 1), 45-49. *

Also Published As

Publication number Publication date
CN101236576A (en) 2008-08-06

Similar Documents

Publication Publication Date Title
Pellauer et al. Buffets: An efficient and composable storage idiom for explicit decoupled data orchestration
Cong et al. A fully pipelined and dynamically composable architecture of CGRA
CN103744644B (en) The four core processor systems built using four nuclear structures and method for interchanging data
CN105378651B (en) Memory-network processing unit with programmable optimization
US7200735B2 (en) High-performance hybrid processor with configurable execution units
CN101799750B (en) Data processing method and device
CN108537331A (en) A kind of restructural convolutional neural networks accelerating circuit based on asynchronous logic
CN103730149B (en) A kind of read-write control circuit of dual-ported memory
CN103761075B (en) Coarse granularity dynamic reconfigurable data integration and control unit structure
CN102782672A (en) A tile-based processor architecture model for high efficiency embedded homogneous multicore platforms
JP2008537268A (en) An array of data processing elements with variable precision interconnection
Ax et al. CoreVA-MPSoC: A many-core architecture with tightly coupled shared and local data memories
CN107562549A (en) Isomery many-core ASIP frameworks based on on-chip bus and shared drive
US20070136560A1 (en) Method and apparatus for a shift register based interconnection for a massively parallel processor array
US7624209B1 (en) Method of and circuit for enabling variable latency data transfers
CN101236576B (en) Interconnecting model suitable for heterogeneous reconfigurable processor
Leibson et al. Configurable processors: a new era in chip design
CN103365821B (en) A kind of address generator of heterogeneous multi-nucleus processor
CN103455367B (en) For realizing administrative unit and the method for multi-task scheduling in reconfigurable system
US7720636B1 (en) Performance monitors (PMs) for measuring performance in a system and providing a record of transactions performed
Sievers et al. Comparison of shared and private l1 data memories for an embedded mpsoc in 28nm fd-soi
Shang et al. LACS: A high-computational-efficiency accelerator for CNNs
Prasad et al. Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality With At-MRAM Neural Engine
CN102129495B (en) Method for reducing power consumption of reconfigurable operator array structure
CN109902040A (en) A kind of System on Chip/SoC of integrated FPGA and artificial intelligence module

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20111207

Termination date: 20140131