CN102013984B - Two-dimensional net network-on-chip system - Google Patents
Two-dimensional net network-on-chip system
- Publication number
- CN102013984B CN2010105072008A CN201010507200A
- Authority
- CN
- China
- Prior art keywords
- processing unit
- mux
- cache device
- kernel
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention discloses a two-dimensional mesh network-on-chip system that addresses the long transmission delay and high power consumption of multi-core systems-on-chip when processing large volumes of data. The technical scheme is as follows: the L2 cache is placed outside the cores; a new switch with a memory access port is used, so that the L2 cache exchanges data with each processing unit PE through the memory access port in the switch; all processing units PE share the L2 cache; and each write/read operation between processing units PE in the conventional two-dimensional mesh network-on-chip system is split into two steps, first from the source processing unit PE to the shared L2 cache and then from the shared L2 cache to the destination processing unit PE. The system relieves the congestion caused by concentrated read/write requests among the processing units PE, reduces the transmission delay and power consumption of the network-on-chip system, and is suited to processing large-scale data.
Description
Technical field
The invention belongs to the technical field of integrated circuits and relates to the network-on-chip structure of multi-core processor chips. It can be used to process the large-scale data produced by multimedia or wireless applications.
Background technology
A network-on-chip (NoC) applies interconnection-network techniques to system-on-chip design in order to solve the problem of communication between on-chip components. Compared with traditional structures such as buses and crossbars, it offers high reliability, good scalability, and low power consumption.
The conventional two-dimensional mesh network-on-chip has a regular structure, is simple to implement, and has good durability, so the two-dimensional mesh is currently the most commonly used network-on-chip structure in research. Its structure is shown in Fig. 1. Each routing node is connected to four adjacent routing nodes and to one core; each routing node is a switch S; and in each core an L2 cache is integrated together with the processing unit PE, the L1 cache, and the network adapter NI.
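The mesh connectivity described above can be sketched as a small lookup table. This is only an illustrative sketch: the function name `mesh_neighbors`, the (row, column) coordinates, and the choice that row 0 is the north edge are assumptions, not part of the patent. It also assumes that nodes on the edge of the mesh simply have fewer than four neighbours.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]


def mesh_neighbors(rows: int, cols: int) -> Dict[Coord, Dict[str, Coord]]:
    """Map each routing node (row, col) to its neighbours, keyed by port name."""
    # Port name -> (row offset, column offset); row 0 is assumed to be the north edge.
    offsets = {"North": (-1, 0), "South": (1, 0), "East": (0, 1), "West": (0, -1)}
    table: Dict[Coord, Dict[str, Coord]] = {}
    for r in range(rows):
        for c in range(cols):
            links: Dict[str, Coord] = {}
            for port, (dr, dc) in offsets.items():
                nr, nc = r + dr, c + dc
                # Keep only neighbours that lie inside the mesh.
                if 0 <= nr < rows and 0 <= nc < cols:
                    links[port] = (nr, nc)
            table[(r, c)] = links
    return table


mesh = mesh_neighbors(3, 3)
```

In this sketch an interior node such as `mesh[(1, 1)]` has all four links, while a corner node such as `mesh[(0, 0)]` has only two.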
The structure of the switch S is shown in Fig. 2. The switch S consists of four I/O ports (North, South, East, and West), a processing unit access port (PE port), five multiplexers (MUX), five selection units, five FIFO queues (Queue), and one crossbar switch array. Each of the four I/O ports and the PE port consists of an input port and an output port. Each input port is connected to the FIFO queue of that input port; each output port is connected to the MUX of that output direction; each MUX is connected to the selection unit of its own direction and, through the crossbar switch array, to the MUXes and FIFO queues of all other directions.
The switch S transfers data from one input port to one or more output ports, which realizes data transmission across the network-on-chip. The transmission proceeds as follows: data enters at an input port and is buffered in that port's FIFO queue; the crossbar switch array then determines the transmission path; next, the MUX selects among the arriving data under the control of its selection unit; finally, the selected data leaves through the output port.
According to Pande's performance model, the network-on-chip transmission delay of a write/read operation between processing units PE is modeled as follows.
Write operation: as shown in Fig. 3(a), when the i-th processing unit $PE_i$ writes data to the j-th processing unit $PE_j$, $PE_i$ first sends a write request to $PE_j$; $PE_j$ then responds to the request, and $PE_i$ begins writing data to $PE_j$. The network-on-chip transmission delay $T_{\text{noc write}}$ of the write operation of $PE_i$ can therefore be expressed as:
$$T_{\text{noc write}} = T_h + T_s + T_c + T_W = H t_r + L/b + T_c + T_W$$
where $T_h$, $T_s$, $T_c$, and $T_W$ are the head delay, serialization delay, communication delay, and response time, respectively; $H$ is the hop count, $t_r$ is the routing delay, $L$ is the packet length, and $b$ is the bandwidth.
Read operation: as shown in Fig. 3(b), when the i-th processing unit $PE_i$ reads data from the j-th processing unit $PE_j$, $PE_i$ first sends a read request to $PE_j$; $PE_j$ then responds to the request and begins sending data to $PE_i$. The network-on-chip transmission delay $T_{\text{noc read}}$ of the read operation of $PE_i$ can therefore be expressed as:
$$T_{\text{noc read}} = 2T_h + T_s + 2T_c + T_W = 2H t_r + L/b + 2T_c + T_W$$
where $T_h$, $T_s$, $T_c$, and $T_W$ are the head delay, serialization delay, communication delay, and response time, respectively; $H$ is the hop count, $t_r$ is the routing delay, $L$ is the packet length, and $b$ is the bandwidth.
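The two conventional delay formulas above can be evaluated directly. The following is a minimal sketch, not code from the patent: the function and parameter names are hypothetical, the numeric inputs are made up purely to exercise the arithmetic, and all quantities are assumed to be expressed in one consistent time unit (for example, clock cycles).

```python
def conventional_write_delay(hops, t_r, packet_len, bandwidth, t_c, t_w):
    """T_noc_write = H*t_r + L/b + T_c + T_W (conventional PE-to-PE write)."""
    head = hops * t_r                    # T_h: head delay
    serialization = packet_len / bandwidth  # T_s: serialization delay
    return head + serialization + t_c + t_w


def conventional_read_delay(hops, t_r, packet_len, bandwidth, t_c, t_w):
    """T_noc_read = 2*H*t_r + L/b + 2*T_c + T_W (conventional PE-to-PE read)."""
    head = hops * t_r
    serialization = packet_len / bandwidth
    # The request and the returned data each cross the network once,
    # so the head and communication delays are counted twice.
    return 2 * head + serialization + 2 * t_c + t_w


# Hypothetical inputs chosen only for illustration.
write_delay = conventional_write_delay(hops=3, t_r=2, packet_len=64, bandwidth=16, t_c=5, t_w=10)
read_delay = conventional_read_delay(hops=3, t_r=2, packet_len=64, bandwidth=16, t_c=5, t_w=10)
```

With these illustrative inputs the response time contributes the same fixed amount to both the write and the read delay, which is the term the invention aims to take out of the network-on-chip transmission path.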
In the conventional two-dimensional mesh network-on-chip system, overly concentrated requests to the processing units PE cause congestion, and the system must wait for a processing unit PE to respond to each write/read request. The communication delay $T_c$ and the response time $T_W$ are therefore large, which makes the transmission delay and power consumption of the network-on-chip large. The problem is particularly evident when processing large-scale data, so the system cannot meet the requirement of processing large amounts of data in a short time.
Summary of the invention
The objective of the invention is to overcome the deficiencies of the prior art described above by providing a new two-dimensional mesh network-on-chip system that reduces transmission delay and power consumption and meets the requirement of processing large amounts of data in a short time.
The technical idea for achieving this objective is to place the L2 cache outside the cores and to adopt a new switch with a memory access port, so that the L2 cache is shared and the data-transmission mode between processing units PE is changed into one that uses the L2 cache as an intermediary, which in turn yields low transmission delay and low power consumption. The whole network-on-chip system comprises N cores, N routing nodes (N ≥ 2), and an L2 cache. Each routing node is connected to four adjacent routing nodes and to one core. Each core consists of a processing unit PE, an L1 cache, and a network adapter NI. Each routing node is a switch S, which consists of four I/O ports (North, South, East, and West), a memory access port L2port, a processing unit access port PE port, a crossbar switch array, six multiplexers (MUX), six selection units, and six FIFO queues (Queue). The L2 cache is placed outside the cores and is shared; it is connected to all routing nodes and exchanges data with the processing units PE in the cores through the memory access port in the switch S, which realizes low transmission delay.
The processing unit PE and L1 cache in each core are connected to the other routing nodes through the four I/O ports of the switch S, and to the L2 cache outside the cores through the memory access port of the switch S. This realizes a two-step write/read operation: first from the i-th processing unit $PE_i$ to the shared L2 cache, and then from the shared L2 cache to the j-th processing unit $PE_j$.
Each of the four I/O ports (North, South, East, and West), the memory access port L2port, and the processing unit access port PE port consists of an input port and an output port. Each input port is connected to the FIFO queue of that input direction; each output port is connected to the MUX of that output direction; each MUX is connected, through the crossbar switch array, to the MUXes and FIFO queues of all other directions, and also to the selection unit of its own direction.
Compared with the prior art, the present invention has the following advantages:
(1) Because the switch is provided with a memory access port, the processing unit PE and L1 cache in each core are connected to the shared L2 cache outside the cores; because the switch is provided with four I/O ports, the cores and the L2 cache are connected to the other routing nodes. A write/read operation between processing units PE in the conventional two-dimensional mesh network-on-chip system is split into two steps, first from a processing unit PE to the L2 cache and then from the L2 cache to a processing unit PE. This relieves the congestion caused by overly concentrated read/write requests to the processing units PE and reduces the communication delay between processing units PE, which reduces the transmission delay of the network-on-chip system and, with it, the power consumption.
(2) Because the L2 cache is placed outside the cores and shared, and the shared L2 cache exchanges data with the processing units PE through the memory access port in the switch, there is no response time $T_W$. This further reduces the transmission delay and power consumption of the network-on-chip system and meets the requirement of processing large amounts of data in a short time.
Description of drawings
Fig. 1 is a schematic diagram of the structure of the conventional two-dimensional mesh network-on-chip system;
Fig. 2 is a schematic diagram of the switch structure in the conventional two-dimensional mesh network-on-chip system;
Fig. 3 is a schematic diagram of the read/write operation delay model of the processing unit PE in the conventional two-dimensional mesh network-on-chip system;
Fig. 4 is a schematic diagram of the structure of the two-dimensional mesh network-on-chip system of the present invention;
Fig. 5 is a schematic diagram of the switch structure in the two-dimensional mesh network-on-chip system of the present invention;
Fig. 6 is a schematic diagram of the read/write operation delay model of the processing unit PE in the two-dimensional mesh network-on-chip system of the present invention.
Embodiment
With reference to Fig. 4, the two-dimensional mesh network-on-chip system of the present invention consists of N cores, N routing nodes (N ≥ 2), and an L2 cache. Each routing node is connected to four adjacent routing nodes and to one core. Each core consists of a processing unit PE, an L1 cache, and a network adapter NI. The L2 cache, which is integrated inside each core in the conventional structure, is placed outside the cores and connected to all routing nodes, so that it is shared. The shared L2 cache is connected to the processing unit PE and L1 cache in each core through the memory access port L2port of the switch S, which realizes a two-step write/read operation: first from the i-th processing unit $PE_i$ to the shared L2 cache, and then from the shared L2 cache to the j-th processing unit $PE_j$. Each routing node is a switch S, whose structure is shown in Fig. 5.
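The two-step transfer through the shared L2 cache can be sketched as a toy model. This is an assumption-laden illustration rather than the patented design: the class `SharedL2`, the helpers `send` and `receive`, and the per-PE list used as a region are all hypothetical names and structures. The only detail taken from the text is that a region of the shared L2 cache is allocated to each destination processing unit.

```python
class SharedL2:
    """Toy shared L2 cache with one region per destination PE (hypothetical model)."""

    def __init__(self, num_pes):
        self.regions = {j: [] for j in range(num_pes)}

    def write(self, dest_pe, data):
        self.regions[dest_pe].append(data)

    def read(self, dest_pe):
        region = self.regions[dest_pe]
        # Return the oldest pending item, or None if nothing is waiting.
        return region.pop(0) if region else None


def send(l2, src_pe, dest_pe, payload):
    """Step 1: the source PE writes into the L2 region of the destination PE."""
    l2.write(dest_pe, (src_pe, payload))


def receive(l2, my_pe):
    """Step 2: the destination PE later fetches the data from its own region."""
    return l2.read(my_pe)


l2 = SharedL2(num_pes=4)
send(l2, src_pe=0, dest_pe=2, payload="frame")
first = receive(l2, my_pe=2)
```

The point of the sketch is that `send` completes without any action by the destination PE; the destination picks the data up whenever it calls `receive`.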
With reference to Fig. 5, the switch S of the present invention comprises four I/O ports (North, South, East, and West), a memory access port L2port, a processing unit access port PE port, six multiplexers (MUX), six selection units, six FIFO queues (Queue), and one crossbar switch array. Each of the four I/O ports, the memory access port L2port, and the processing unit access port PE port consists of an input port and an output port. Each input port is connected to the FIFO queue of that input direction; each output port is connected to the MUX of that output direction; each MUX is connected to the selection unit of its own direction and, through the crossbar switch array, to the MUXes and FIFO queues of all other directions.
The switch S transfers data from one input port to one or more output ports. The transmission proceeds as follows: data enters at an input port and is buffered in the FIFO queue of that input direction; the crossbar switch array then determines the transmission path of the data; next, the MUX selects among the arriving data under the control of its selection unit; finally, the selected data leaves through the output port. When data is transferred between the processing unit access port PE port and the memory access port L2port, the network-on-chip system realizes data exchange between the processing unit PE and the shared L2 cache.
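The data path through the six-port switch (input FIFO, crossbar, MUX under control of a selection unit, output port) can be sketched behaviourally as follows. Several things here are assumptions made for the sake of a short example: the class name `Switch`, the fixed-priority arbitration order, the restriction to one destination per item (the text allows one or more), and the list used to record what leaves each output port.

```python
from collections import deque

# Four mesh directions plus the PE port and the memory access port L2port.
PORTS = ("North", "South", "East", "West", "PE", "L2")


class Switch:
    def __init__(self):
        self.queues = {p: deque() for p in PORTS}  # one input FIFO per port
        self.outputs = {p: [] for p in PORTS}      # data emitted on each output port

    def receive(self, in_port, data, out_port):
        """Buffer incoming data in the FIFO of its input port."""
        self.queues[in_port].append((data, out_port))

    def step(self):
        """One round: each output selects at most one head-of-queue item."""
        granted = set()  # outputs already claimed in this round
        for in_port in PORTS:  # fixed priority order (an assumption)
            queue = self.queues[in_port]
            if not queue:
                continue
            data, out_port = queue[0]
            if out_port in granted:
                continue  # lost the selection; the item stays buffered
            granted.add(out_port)
            queue.popleft()
            self.outputs[out_port].append(data)


sw = Switch()
sw.receive("PE", "block", "L2")     # PE writes toward the shared L2 cache
sw.receive("North", "packet", "L2")  # a neighbour's packet targets the same output
sw.step()
sw.step()
```

Because both items target the `L2` output, only one is forwarded per round; the other waits in its FIFO until the next call to `step`.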
The effect of the present invention is further illustrated by the following theoretical analysis and simulation results:
1. Theoretical analysis
In the present invention, a write/read operation between processing units PE is divided into a network-on-chip transmission process from the processing unit PE to the L2 cache and a data reception process from the L2 cache to the processing unit PE. The response time $T_W$ of the processing unit PE, which affects the network-on-chip transmission time in the conventional structure, affects only the data reception time in the new structure and not the network-on-chip transmission time. Only the network-on-chip transmission time is considered here.
With reference to Fig. 6, a delay model is established for the write/read operation from the i-th processing unit $PE_i$ to the j-th processing unit $PE_j$ in the network-on-chip system of the present invention.
Write operation: as shown in Fig. 6(a), when the i-th processing unit $PE_i$ writes data to the j-th processing unit $PE_j$, $PE_i$ first sends a write request to $L2_j$, the L2 cache allocated to $PE_j$, and then writes the data to $L2_j$. The network-on-chip transmission delay $T_{\text{SM noc write}}$ of the write operation of $PE_i$ is:
$$T_{\text{SM noc write}} = T_h + T_s + T_c = H t_r + L/b + T_c \tag{1}$$
where $T_h$, $T_s$, and $T_c$ are the head delay, serialization delay, and communication delay, respectively; $H$ is the hop count, $t_r$ is the routing delay, $L$ is the packet length, and $b$ is the bandwidth.
Read operation: as shown in Fig. 6(b), when the i-th processing unit $PE_i$ reads data from the j-th processing unit $PE_j$, $PE_i$ first sends a read request to $L2_j$, the L2 cache allocated to $PE_j$, and then reads the data directly from $L2_j$. The network-on-chip transmission delay $T_{\text{SM noc read}}$ of the read operation of $PE_i$ is:
$$T_{\text{SM noc read}} = 2T_h + T_s + 2T_c = 2H t_r + L/b + 2T_c \tag{2}$$
where $T_h$, $T_s$, and $T_c$ are the head delay, serialization delay, and communication delay, respectively; $H$ is the hop count, $t_r$ is the routing delay, $L$ is the packet length, and $b$ is the bandwidth.
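Equations (1) and (2) can be written as two small functions. As before, this is a hedged sketch: the function and parameter names are hypothetical, the inputs are invented for illustration only, and a single consistent time unit is assumed.

```python
def shared_l2_write_delay(hops, t_r, packet_len, bandwidth, t_c):
    """Equation (1): T_SM_noc_write = H*t_r + L/b + T_c."""
    return hops * t_r + packet_len / bandwidth + t_c


def shared_l2_read_delay(hops, t_r, packet_len, bandwidth, t_c):
    """Equation (2): T_SM_noc_read = 2*H*t_r + L/b + 2*T_c."""
    return 2 * hops * t_r + packet_len / bandwidth + 2 * t_c


# Hypothetical inputs chosen only to exercise the formulas.
sm_write = shared_l2_write_delay(hops=3, t_r=2, packet_len=64, bandwidth=16, t_c=5)
sm_read = shared_l2_read_delay(hops=3, t_r=2, packet_len=64, bandwidth=16, t_c=5)
```

Neither function takes a response-time argument, which mirrors the absence of $T_W$ from equations (1) and (2).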
As given in the background section, the network-on-chip transmission delays $T_{\text{noc write}}$ and $T_{\text{noc read}}$ of the write/read operations of the conventional network-on-chip system are, respectively:
$$T_{\text{noc write}} = T_h + T_s + T_c + T_W = H t_r + L/b + T_c + T_W \tag{3}$$
$$T_{\text{noc read}} = 2T_h + T_s + 2T_c + T_W = 2H t_r + L/b + 2T_c + T_W \tag{4}$$
where $T_h$, $T_s$, $T_c$, and $T_W$ are the head delay, serialization delay, communication delay, and response time, respectively; $H$ is the hop count, $t_r$ is the routing delay, $L$ is the packet length, and $b$ is the bandwidth.
Comparing equation (1) with (3), and (2) with (4): the network-on-chip transmission process of the present invention exchanges data between the i-th processing unit $PE_i$ and the shared L2 cache. This process does not need to wait for the j-th processing unit $PE_j$ to respond to the write/read request, so there is no response time $T_W$, and the transmission delay of the network-on-chip is reduced. The network-on-chip of the present invention uses a data-transmission mode that goes first from a processing unit PE to the shared L2 cache and then from the L2 cache to a processing unit PE. Compared with the direct PE-to-PE transmission mode of the conventional network-on-chip, this relieves the congestion caused by overly concentrated read/write requests between processing units PE and makes the communication delay $T_c$ of the network-on-chip smaller, which further reduces the transmission delay of the network-on-chip.
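The comparison can be made explicit by subtracting the equations term by term. In the worked step below, $T_c'$ is a notation introduced here (not in the patent) for the communication delay in the new structure, and $H$, $t_r$, $L$, and $b$ are assumed to be the same in both structures so that the head and serialization terms cancel.

```latex
\begin{aligned}
T_{\text{noc write}} - T_{\text{SM noc write}}
  &= (H t_r + L/b + T_c + T_W) - (H t_r + L/b + T_c')
   = T_W + (T_c - T_c'), \\
T_{\text{noc read}} - T_{\text{SM noc read}}
  &= (2H t_r + L/b + 2T_c + T_W) - (2H t_r + L/b + 2T_c')
   = T_W + 2\,(T_c - T_c').
\end{aligned}
```

Under these assumptions, and given the text's argument that congestion relief makes $T_c' \le T_c$, the saving per operation is at least $T_W$.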
2. emulation experiment
The simulation uses the SIMC 0.13 µm process and a 1.1 V supply voltage. Using MPSOCS simulation software based on OPNET, the transmission delay and power consumption of three decoding algorithms (H.264, M-JPEG, and MP3) were simulated on the conventional two-dimensional mesh network-on-chip system and on the two-dimensional mesh network-on-chip system of the present invention. The simulation results are shown in Table 1.
Table 1. Comparison of simulation results
As Table 1 shows, compared with the conventional two-dimensional mesh network-on-chip system, the two-dimensional mesh network-on-chip system of the present invention reduces transmission delay by 37.6% and power consumption by 33.7% on average.
Claims (1)
1. A two-dimensional mesh network-on-chip system, comprising N cores, N routing nodes, and an L2 cache, where N ≥ 2 and each routing node is connected to four adjacent routing nodes and to one core, characterized in that: each core consists of a processing unit PE, an L1 cache, and a network adapter NI; each routing node is a switch S, which consists of four I/O ports (North, South, East, and West), a processing unit access port PE port, a memory access port L2port, six multiplexers (MUX), six selection units, a crossbar switch array, and six FIFO queues (Queue); the L2 cache is placed outside the cores and is shared, is connected to all routing nodes, and exchanges data with the processing units PE in the cores through the memory access port L2port, which realizes low transmission delay; the memory access port L2port and the processing unit access port PE port each consist of an input port and an output port; each input port is connected to the FIFO queue of that input direction; each output port is connected to the MUX of that output direction; each MUX is connected, through the crossbar switch array, to the MUXes and FIFO queues of all other directions, and also to the selection unit of its own direction; the processing unit PE and L1 cache in each core are connected to the other routing nodes through the four I/O ports (North, South, East, and West) of the switch S, and to the shared L2 cache outside the cores through the memory access port L2port of the switch S, which realizes a two-step write/read operation, first from the i-th processing unit $PE_i$ to the shared L2 cache and then from the shared L2 cache to the j-th processing unit $PE_j$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010105072008A CN102013984B (en) | 2010-10-14 | 2010-10-14 | Two-dimensional net network-on-chip system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102013984A CN102013984A (en) | 2011-04-13 |
CN102013984B true CN102013984B (en) | 2012-05-09 |
Family
ID=43844014
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2010105072008A Expired - Fee Related CN102013984B (en) | 2010-10-14 | 2010-10-14 | Two-dimensional net network-on-chip system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102013984B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103188158B (en) * | 2011-12-28 | 2016-07-20 | 清华大学 | A kind of network-on-chip router and method for routing |
CN102868604B (en) * | 2012-09-28 | 2015-05-06 | 中国航空无线电电子研究所 | Two-dimension Mesh double buffering fault-tolerant route unit applied to network on chip |
CN105812063B (en) * | 2016-03-22 | 2018-08-03 | 西安电子科技大学 | Network on mating plate system based on statistic multiplexing and communication means |
CN108897701B (en) * | 2018-06-20 | 2020-07-14 | 珠海市杰理科技股份有限公司 | cache storage device |
CN113162906B (en) * | 2021-02-26 | 2023-04-07 | 西安微电子技术研究所 | NoC transmission method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101232456A (en) * | 2008-01-25 | 2008-07-30 | 浙江大学 | Distributed type testing on-chip network router |
CN101383712A (en) * | 2008-10-16 | 2009-03-11 | 电子科技大学 | Routing node microstructure for on-chip network |
CN101582854A (en) * | 2009-06-12 | 2009-11-18 | 华为技术有限公司 | Data exchange method, device and system thereof |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070280224A1 (en) * | 2006-06-05 | 2007-12-06 | Via Technologies | System and method for an output independent crossbar |
US8102884B2 (en) * | 2008-10-15 | 2012-01-24 | International Business Machines Corporation | Direct inter-thread communication buffer that supports software controlled arbitrary vector operand selection in a densely threaded network on a chip |
Also Published As
Publication number | Publication date |
---|---|
CN102013984A (en) | 2011-04-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120509; Termination date: 20151014 |
EXPY | Termination of patent right or utility model | ||