CN101625673A - Method for mapping task of network on two-dimensional grid chip - Google Patents


Info

Publication number
CN101625673A
CN101625673A (application CN200810116245A)
Authority
CN
China
Prior art keywords
thread
common thread
common
desired location
threads
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200810116245A
Other languages
Chinese (zh)
Other versions
CN101625673B (en)
Inventor
刘祥
陈曦
黄毅
张金龙
任菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN2008101162455A priority Critical patent/CN101625673B/en
Publication of CN101625673A publication Critical patent/CN101625673A/en
Application granted granted Critical
Publication of CN101625673B publication Critical patent/CN101625673B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a method for mapping tasks onto a two-dimensional mesh network-on-chip. The method comprises the following steps: 1) pre-allocating desired locations on the two-dimensional grid for all threads, the threads including common threads that may be mapped to any position; 2) for each common thread, calculating the variation Com_diff of the total communication power consumption factor that would result from exchanging it with a common thread or idle position near its desired location, and performing the exchange with whichever common thread or idle position minimizes Com_diff, until every exchange between a common thread and a nearby common thread or idle position yields a Com_diff greater than or equal to zero; and 3) outputting a mapping file according to the positions of all threads. The method achieves a high degree of optimization, allows the user to adjust a parameter to control the time complexity, and also solves the partial-mapping case in which some threads must be placed at specific positions.

Description

A task mapping method for a two-dimensional grid network-on-chip
Technical field
The present invention relates to a method of using multi-core processors, and in particular to a task mapping method for a two-dimensional grid (2-D Mesh) network-on-chip (Network-on-Chip, NoC).
Background technology
With the development of semiconductor and integrated-circuit technology, the integration level of systems-on-chip (System-on-Chip, SoC) keeps rising, and a single chip can now integrate hundreds of IP cores such as microprocessors, memories and I/O interfaces. At the same time, the functions of embedded electronic products are becoming increasingly complex; a single-processor SoC can no longer satisfy the growing functional and performance requirements of embedded systems, so the emergence of the multi-core SoC (Multi-Processor SoC, MPSoC) has become inevitable. A multi-core SoC places higher demands on on-chip communication, and the network-on-chip was proposed to solve the global communication problem of nanometer-era multi-core SoCs. Drawing on the design ideas of parallel computing and computer networks, a network-on-chip builds a packet-switched micro-network on a single silicon die: IP cores are interconnected through switches, and a globally asynchronous, locally synchronous (Global Asynchronous Local Synchronous, GALS) mechanism is used to realize efficient communication among the many processing units, storage units and other computing modules of the multi-core SoC.
Network-on-chip topologies are varied; among them the two-dimensional mesh is simple in structure, scalable, and easy to implement and analyze, and has therefore been widely used in the network-on-chip field. As the number of transistors on a chip grows into the billions, power consumption is gradually becoming the primary constraint in chip design, and many power-aware methods exist for mapping threads onto the processing units of a network-on-chip. Among them, Jingcao Hu and R. Marculescu, in "Energy- and performance-aware mapping for regular NoC architectures", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 24, Issue 4, April 2005, pages 551-562, hereinafter referred to as document 1, describe a branch-and-bound approach: while generating the next feasible solution, an upper-bound function UBC (upper bound cost) and a lower-bound function LBC (lower bound cost) are used to terminate early those branches that cannot lead to an optimal solution, thereby guiding the method toward the "branch" containing the optimum. However, every step of this method must compute UBC and LBC, which inevitably increases the time complexity. Moreover, the method may produce several "optimal solutions" at the same time, so the degree of optimization of the final solution is not high.
Summary of the invention
The object of the invention is to overcome the long execution time of prior-art mapping methods and the fact that the degree of optimization of the final solution cannot be guaranteed, and to provide a task mapping method for a two-dimensional grid network-on-chip.
According to one aspect of the present invention, a task mapping method for a two-dimensional grid network-on-chip is provided, comprising the following steps:
1) pre-allocating desired locations on the two-dimensional grid for all threads, said threads including common threads that may be mapped to any position;
2) for each common thread, calculating the variation Com_diff of the total communication power consumption factor after exchanging it with a common thread or idle position near its desired location, said common thread performing the exchange with whichever common thread or idle position minimizes Com_diff, until every exchange between a common thread and a common thread or idle position near its desired location yields a Com_diff greater than or equal to 0;
3) outputting a mapping file according to the positions of all said threads.
Wherein said step 1) comprises:
11) listing said common threads in a queue in order of decreasing traffic of each common thread;
12) assigning the first common thread in said queue to the centre of said two-dimensional grid;
13) calculating the desired location of each common thread to be allocated according to the desired locations of the threads already allocated.
Wherein all said threads further include special threads that must be mapped to specific positions.
Wherein said step 1) comprises:
11') listing said special threads in a queue;
12') adding said common threads to said queue in order of decreasing traffic of each common thread;
13) calculating the desired location of each common thread to be allocated according to the desired locations of the threads already allocated.
Wherein said step 13) calculates the desired location of a common thread to be allocated from the desired locations of the already-allocated threads according to the following formulas:

x_i = \left[ \frac{\sum_k Com_{i,k} \cdot x_k}{\sum_k Com_{i,k}} \right], \quad y_i = \left[ \frac{\sum_k Com_{i,k} \cdot y_k}{\sum_k Com_{i,k}} \right]

where the sums run over the already-allocated threads k, [\cdot] denotes rounding to the nearest integer, Com_{i,k} denotes the total amount of data communicated between threads i and k, x_k and y_k denote the x and y coordinates of thread k, and x_i and y_i denote the x and y coordinates of thread i.
Wherein said step 2) comprises:
21) forming all the common threads in said queue into a circular queue and taking any one of the common threads;
22) assuming that said common thread is not yet mapped, calculating the variation Com_diff of the total communication power consumption factor after exchanging said common thread with each common thread or idle position near its desired location, and exchanging said common thread with whichever common thread or idle position minimizes Com_diff;
23) repeating step 22) until every exchange between a common thread and a common thread or idle position near its desired location yields a Com_diff greater than or equal to 0.
Wherein a common thread or idle position near the desired location is a common thread or idle position whose distance from the desired location is less than a predetermined threshold.
Wherein said distance from the desired location is a Manhattan distance.
The invention provides a power-aware network-on-chip mapping method that continually adjusts the optimal position of each thread so that the degree of optimization of the final solution is maximized. In addition, the user can set a threshold to trade off execution time against the degree of optimization of the final solution: when a small threshold is chosen, each iteration of the method only needs to compare the values of a few key positions, which significantly reduces the time complexity. The invention also considers and solves the partial-mapping case, i.e., the special case in a NoC system in which some threads must be mapped onto specific PUs.
Description of drawings
Fig. 1 is a schematic diagram of a NoC with a 2-D Mesh structure;
Fig. 2 is a flowchart of an embodiment of the task mapping method for a two-dimensional grid network-on-chip of the present invention;
Fig. 3 shows a mapping result produced by the task mapping method of the present invention on the H.264 decoder data with S=0 and d=1;
Fig. 4 shows another mapping result produced by the task mapping method of the present invention on the H.264 decoder data with S=1 and d=2.
Embodiment
The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 shows a specific embodiment of a NoC with a 4 x 3 2-D Mesh structure, where S denotes a switch node, LR denotes a local resource, Adapt denotes an adapter, PU denotes a processing unit, and (0,0), (0,1), (0,2), ..., (3,2) denote the position coordinates (x, y) of the thread mapping.
Suppose the mapping position of thread i in the 2-D Mesh is (x_i, y_i) and the mapping position of thread j is (x_j, y_j). In this embodiment the Manhattan distance D_{i,j} = |x_i - x_j| + |y_i - y_j| is used to represent the hop distance of the data when thread i communicates with thread j; those skilled in the art will understand that other distances may also be used without departing from the inventive concept. The communication power consumption of transmitting 1 bit of data from thread i to thread j is then:
E_{bit}^{i,j} = (D_{i,j} + 1) E_{Sbit} + D_{i,j} E_{Lbit}    (1)
where E_{Sbit} denotes the power consumed by each switch node to receive and forward 1 bit of data, and E_{Lbit} denotes the link power consumed to transmit 1 bit of data between two adjacent processing units. Letting E_{Sbit} / E_{Lbit} = \theta, we have
E_{bit}^{i,j} = [(\theta + 1) D_{i,j} + \theta] E_{Lbit}    (2)
The total communication power consumption of the whole system is then
E_{com} = \sum_{i=0}^{T-1} \sum_{j=i+1}^{T-1} (C_{i,j} + C_{j,i}) [(\theta + 1) D_{i,j} + \theta] E_{Lbit}    (3)
where T denotes the total number of threads and C_{i,j} denotes the amount of data thread i sends to thread j. Writing Com_{i,j} = C_{i,j} + C_{j,i} for the total amount of data communicated between threads i and j, we have
E_{com} = \sum_{i=0}^{T-1} \sum_{j=i+1}^{T-1} [(\theta + 1) D_{i,j} + \theta] Com_{i,j} \cdot E_{Lbit}    (4)
The starting point of the present invention is to optimize the communication power consumption factor of the system and thereby determine the mapping between the threads and the processing units of the network-on-chip that minimizes the system power consumption. From formula (4), the communication power consumption factor of the system is:
W = \sum_{i=0}^{T-1} \sum_{j=i+1}^{T-1} [(\theta + 1) D_{i,j} + \theta] Com_{i,j}    (5)
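As an illustration of formula (5), the following Python sketch (not part of the patent; the names manhattan, power_factor, pos, com and theta are assumptions introduced here) computes the communication power consumption factor W for a given placement, with pos mapping each thread to its (x, y) grid cell and com holding the symmetric totals Com_{i,j}:

# A minimal sketch of formula (5), not the patent's reference code.
def manhattan(p, q):
    # Manhattan (hop) distance between two grid cells
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def power_factor(pos, com, theta):
    T = len(pos)
    W = 0.0
    for i in range(T):
        for j in range(i + 1, T):
            D = manhattan(pos[i], pos[j])
            W += ((theta + 1) * D + theta) * com[i][j]
    return W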
All threads are divided into two classes: common threads, which may be mapped to any position, and special threads, which must be mapped to specific positions. Special threads may or may not exist in a given system; when they exist, they simply have to be mapped to their specific positions during the mapping process. For any common thread i, we wish to map it to the position that minimizes the communication power consumption factor of the system defined above. This is achieved as follows: compute the variation of the total communication power consumption factor after exchanging common thread i with each thread (special threads excepted) or idle position near its desired location, and exchange common thread i with the thread or idle position whose variation is minimal and less than 0. The variation of the total communication power consumption factor after exchanging thread i with another thread or with an idle position is calculated as follows:
The coefficient of the sum of the communication power consumption between thread i and all the other threads is
w_i = \sum_{k=0, k \neq i}^{T-1} [(\theta + 1) D_{i,k} + \theta] Com_{i,k}    (6)
The coefficient of the sum of the communication power consumption between thread j and all the other threads is
w_j = \sum_{k=0, k \neq j}^{T-1} [(\theta + 1) D_{j,k} + \theta] Com_{j,k}    (7)
After threads i and j exchange mapping positions, the coefficients of the sums of the communication power consumption between thread i and all the other threads and between thread j and all the other threads are, respectively,
w_i' = \sum_{k=0, k \neq i,j}^{T-1} [(\theta + 1) D_{j,k} + \theta] Com_{i,k} + [(\theta + 1) D_{i,j} + \theta] Com_{i,j}    (8)

w_j' = \sum_{k=0, k \neq i,j}^{T-1} [(\theta + 1) D_{i,k} + \theta] Com_{j,k} + [(\theta + 1) D_{i,j} + \theta] Com_{i,j}    (9)
Therefore, after threads i and j are swapped, the variation of the total communication power consumption factor of the system is
Com_diff = (w_i' + w_j') - (w_i + w_j) = \sum_{k=0, k \neq i,j}^{T-1} (\theta + 1)(D_{j,k} - D_{i,k})(Com_{i,k} - Com_{j,k})    (10)
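As a sketch of formula (10) under the same assumed conventions (pos, com, theta and manhattan from the previous snippet), the change in W caused by swapping the positions of threads i and j can be computed as:

# Sketch of formula (10): change in W when threads i and j swap grid positions.
def com_diff_swap(pos, com, theta, i, j):
    diff = 0.0
    for k in range(len(pos)):
        if k in (i, j):
            continue  # the i-j term cancels because their mutual distance is unchanged
        d_ik = manhattan(pos[i], pos[k])
        d_jk = manhattan(pos[j], pos[k])
        diff += (theta + 1) * (d_jk - d_ik) * (com[i][k] - com[j][k])
    return diff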
If thread i is exchanged with an idle position (s, t), that is, thread i is mapped to the idle position (s, t) and the original position of thread i becomes idle, then after the exchange the coefficient of the sum of the communication power consumption between thread i and all the other threads is:
w_{new} = \sum_{k=0, k \neq i}^{T-1} [(\theta + 1)(|s - x_k| + |t - y_k|) + \theta] Com_{i,k}    (11)
Therefore, after thread i is exchanged with the idle position (s, t), the variation of the total communication power consumption factor of the system is:
Com_diff = w_{new} - w_i = \sum_{k=0, k \neq i}^{T-1} (\theta + 1)(|s - x_k| + |t - y_k| - D_{i,k}) Com_{i,k}    (12)
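Similarly, a sketch of formula (12), the change in W when thread i is moved to an idle cell (s, t), again reusing the assumed helpers above:

# Sketch of formula (12): change in W when thread i moves to an idle cell (s, t).
def com_diff_move(pos, com, theta, i, s, t):
    diff = 0.0
    for k in range(len(pos)):
        if k == i:
            continue
        old_d = manhattan(pos[i], pos[k])
        new_d = abs(s - pos[k][0]) + abs(t - pos[k][1])
        diff += (theta + 1) * (new_d - old_d) * com[i][k]
    return diff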
If the set \Omega contains all the threads that have already been mapped, the desired location (x_i, y_i) of a thread i still to be mapped is calculated with formula (13), in which the bracket [\cdot] denotes rounding the enclosed expression to the nearest integer; those skilled in the art will appreciate that other rounding schemes may also be used.

x_i = \left[ \frac{\sum_{k \in \Omega} Com_{i,k} \cdot x_k}{\sum_{k \in \Omega} Com_{i,k}} \right], \quad y_i = \left[ \frac{\sum_{k \in \Omega} Com_{i,k} \cdot y_k}{\sum_{k \in \Omega} Com_{i,k}} \right]    (13)
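A sketch of formula (13) under the same assumed conventions; the fallback to the grid centre when thread i has no traffic with any already-mapped thread is an assumption added here, not something the text above specifies:

# Sketch of formula (13): desired location of an unmapped thread i is the
# traffic-weighted centroid of the already-mapped threads (set `mapped`),
# rounded to the nearest grid cell.
def desired_location(i, mapped, pos, com, grid_centre):
    total = sum(com[i][k] for k in mapped)
    if total == 0:
        return grid_centre  # assumed fallback when i has no traffic with mapped threads
    x = round(sum(com[i][k] * pos[k][0] for k in mapped) / total)
    y = round(sum(com[i][k] * pos[k][1] for k in mapped) / total)
    return (x, y)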
Based on the above analysis, a specific embodiment of the present invention proceeds as follows:
Suppose that in a given task the number of special threads is S and the number of common threads is T-S. List all the special threads in an ordering queue, then add all the common threads to this queue; preferably a common thread is added according to the size of its traffic with the other threads in the queue, as follows. Take the pair of threads with the largest traffic that contains at least one common thread not yet in the queue. If neither thread of the pair is in the queue, append both to the end of the queue, their order determined by the maximum traffic each has with the threads already in the queue, the larger first; if one of them is already in the queue, simply append the other to the end of the queue. Then take the next pair of threads with the largest traffic and repeat this operation until all threads are in the queue. Finally, number the T threads 0 to T-1 in their queue order. If the number of special threads in the task is 0, the order of the first pair of threads in the queue is arbitrary.
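The queue-building procedure just described can be sketched as follows (illustrative Python; build_queue, special and in_q are names assumed here, and ties are broken arbitrarily because the text does not say how):

# Sketch of the ordering-queue construction: special threads first, then common
# threads added pairwise in order of decreasing traffic.
def build_queue(T, special, com):
    queue = list(special)
    in_q = set(queue)
    while len(queue) < T:
        # pair with the largest traffic that still involves an unqueued thread
        best = max(
            ((com[a][b], a, b) for a in range(T) for b in range(a + 1, T)
             if a not in in_q or b not in in_q),
            key=lambda t: t[0],
        )
        _, a, b = best
        if a in in_q or b in in_q:
            new = [b if a in in_q else a]
        else:
            # neither queued: the one with more traffic to the queue goes first
            key = lambda x: max((com[x][q] for q in queue), default=0.0)
            new = sorted((a, b), key=key, reverse=True)
        for x in new:
            queue.append(x)
            in_q.add(x)
    return queue  # threads are numbered 0..T-1 in this order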
Take the H.264 decoder as an example: it involves 12 modules (tasks) that are required to be mapped onto a 4 x 4 2-D Mesh processor array, and the communication traffic between these modules is shown in Table 1.
Table 1: inter-module communication traffic of the H.264 decoder
From/To  IP0      IP1       IP2      IP3      IP4    IP5      IP6    IP7    IP8    IP9    IP10  IP11
IP0      0.0      0.0       0.0      0.0      0.0    0.0      0.0    0.0    0.0    0.0    0.0   7,098.7
IP1      4,465.1  0.0       0.0      0.0      0.0    0.0      0.0    0.0    0.0    0.0    0.0   344.5
IP2      0.0      0.0       0.0      0.0      62.7   4,791.9  0.0    0.0    0.0    0.0    0.0   13,197.0
IP3      0.0      5,936.1   0.0      0.0      0.0    0.0      0.0    641.0  0.0    0.0    0.0   0.0
IP4      0.0      0.0       0.0      6,577.1  0.0    0.0      406.6  0.0    494.7  0.0    0.0   0.0
IP5      0.0      0.0       0.0      0.0      0.0    0.0      0.0    0.0    0.0    0.0    0.0   0.0
IP6      324.9    321.4     0.0      186.0    232.0  11.6     0.0    6.9    990.2  59.2   11.6  0.0
IP7      320.5    13.5      0.0      0.0      0.0    0.0      0.0    0.0    0.0    145.0  0.0   26.7
IP8      0.0      0.0       0.0      0.0      0.0    0.0      826.3  0.0    0.0    0.0    0.0   0.0
IP9      0.0      0.0       0.0      0.0      0.0    0.0      0.0    320.5  0.0    0.0    0.0   0.0
IP10     0.0      0.0       62.7     0.0      0.0    0.0      0.0    0.0    0.0    0.0    0.0   0.0
IP11     2,644.3  10,628.0  7,470.4  0.0      0.0    0.0      0.0    39.6   0.0    0.0    0.0   0.0
Suppose S=1 and IP5 is the special thread, so the initial ordering queue is {IP5}. Take the pair of threads with the largest traffic, IP2 and IP11; neither is in the queue, and since the traffic between IP2 and the queued thread IP5 is greater than the traffic between IP11 and IP5, IP2 joins the tail of the queue first. The new ordering queue is therefore {IP5, IP2, IP11}. The next pair with the largest traffic is IP1 and IP11; since IP11 is already in the queue, IP1 is simply appended to the tail, giving {IP5, IP2, IP11, IP1}. In the same way IP0 is appended, and the queue becomes {IP5, IP2, IP11, IP1, IP0}. The next pair with the largest traffic is IP3 and IP4; neither is in the queue, and since the maximum traffic of IP3 with the queued threads (its traffic with IP1) is greater than the traffic of IP4 with any queued thread, IP3 joins the tail first. The new ordering queue is therefore {IP5, IP2, IP11, IP1, IP0, IP3, IP4}. Continuing in this way, after all threads have joined, the queue is {IP5, IP2, IP11, IP1, IP0, IP3, IP4, IP6, IP8, IP7, IP9, IP10}. Finally these 12 threads are numbered 0 to 11 in this queue order.
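For a rough check of this walkthrough, the build_queue sketch above can be fed the symmetric totals Com_{i,j} = C_{i,j} + C_{j,i} computed here from Table 1 (IP5 being the single special thread); the expected output is the queue derived in the preceding paragraph:

# Nonzero symmetric traffic totals from Table 1, indexed by (IPi, IPj) pairs.
pairs = {(0, 11): 9743.0, (0, 1): 4465.1, (0, 6): 324.9, (0, 7): 320.5,
         (1, 11): 10972.5, (1, 3): 5936.1, (1, 6): 321.4, (1, 7): 13.5,
         (2, 4): 62.7, (2, 5): 4791.9, (2, 11): 20667.4, (2, 10): 62.7,
         (3, 7): 641.0, (3, 4): 6577.1, (3, 6): 186.0,
         (4, 6): 638.6, (4, 8): 494.7, (5, 6): 11.6,
         (6, 7): 6.9, (6, 8): 1816.5, (6, 9): 59.2, (6, 10): 11.6,
         (7, 9): 465.5, (7, 11): 66.3}
com = [[0.0] * 12 for _ in range(12)]
for (a, b), v in pairs.items():
    com[a][b] = com[b][a] = v
print(build_queue(12, special=[5], com=com))
# expected order per the walkthrough above: [5, 2, 11, 1, 0, 3, 4, 6, 8, 7, 9, 10]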
The mapping that minimizes the communication power consumption factor of the system is then computed with the above formulas; the concrete steps are as follows:
Step 1: initialize a two-dimensional array A[M][N] and set every element to -1, indicating that the position is idle, where M is the number of rows of the network-on-chip two-dimensional grid and N is the number of columns. If S=0, assign thread 0 to the centre of the grid; otherwise assign all special threads to the positions corresponding to their particular processing units. Then use formula (13) to calculate the desired location of the next thread, assign that thread to the unallocated position with the smallest Manhattan distance to its desired location, and proceed to the next thread until all threads have been assigned. When thread i is assigned to position (a, b), set A[a][b]=i, x[i]=a, y[i]=b.
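A sketch of Step 1 under the same assumed conventions (reusing manhattan and desired_location from the earlier snippets; preallocate and pinned are illustrative names):

# Sketch of Step 1: A[a][b] = -1 marks an idle cell; special threads (if any)
# are pinned first, then each remaining thread in queue order is placed on the
# free cell closest (Manhattan distance) to its desired location.
def preallocate(queue, special, pinned, com, M, N):
    A = [[-1] * N for _ in range(M)]
    pos = {}
    def place(t, cell):
        A[cell[0]][cell[1]] = t
        pos[t] = cell
    centre = (M // 2, N // 2)
    if not special:
        place(queue[0], centre)          # thread 0 (first in queue order) goes to the centre
        rest = queue[1:]
    else:
        for t in special:
            place(t, pinned[t])          # special threads keep their fixed cells
        rest = [t for t in queue if t not in special]
    for t in rest:
        want = desired_location(t, list(pos), pos, com, centre)
        free = [(a, b) for a in range(M) for b in range(N) if A[a][b] == -1]
        place(t, min(free, key=lambda c: manhattan(c, want)))
    return A, pos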
Step 2: set unextimes=0, form the T-S common threads of the ordering queue into a circular queue, and take the first thread of this circular queue. In this embodiment a linked list is used as the concrete implementation of the queue.
Step 3: assume that this thread does not belong to \Omega and use formula (13) to calculate its desired location; then use formula (10) and formula (12) to calculate the Com_diff of exchanging this thread's original position with each position of A whose Manhattan distance from that desired location is no greater than the threshold d (specific positions and infeasible positions excepted). The threshold d can be set by the user and determines the extent of the diamond-shaped region of "comparison positions": the larger d is, the larger the diamond-shaped region, but d must not exceed max(M, N); the preferred value is 1 or 2. This threshold affects both the complexity of the whole method and the degree of optimization of the final solution. Compare the Com_diff values, take the minimum Com_diff and record its corresponding position (x, y). If Com_diff >= 0, go to Step 4; otherwise go to Step 6.
Step 4: increment unextimes; if unextimes = T-S, go to Step 7, otherwise go to Step 5.
Step 5: take the next thread and go to Step 3.
Step 6: set unextimes=0. If A[x][y] = -1, assign this thread to position [x, y] and set its original position to -1; otherwise swap this thread with the thread at [x, y]. Go to Step 5.
Step 7: output the mapping file according to the position of each thread, and finish.
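Putting Steps 2-7 together, a sketch of the improvement loop under the same assumed conventions (reusing com_diff_swap, com_diff_move and desired_location from the earlier snippets; optimise and fixed are illustrative names):

# Sketch of Steps 2-7: `fixed` is the set of special thread ids (their cells are
# never touched) and d bounds the diamond of comparison positions around the
# desired location of the thread being examined.
def optimise(A, pos, queue, fixed, com, theta, d):
    M, N = len(A), len(A[0])
    centre = (M // 2, N // 2)
    common = [t for t in queue if t not in fixed]
    unextimes, idx = 0, 0
    while common and unextimes < len(common):
        i = common[idx % len(common)]
        idx += 1
        ix, iy = pos[i]
        others = [k for k in pos if k != i]          # treat i as not yet mapped
        wx, wy = desired_location(i, others, pos, com, centre)
        best_diff, best_cell = 0.0, None
        for a in range(max(0, wx - d), min(M, wx + d + 1)):
            for b in range(max(0, wy - d), min(N, wy + d + 1)):
                if abs(a - wx) + abs(b - wy) > d or (a, b) == (ix, iy):
                    continue
                j = A[a][b]
                if j in fixed:
                    continue                         # never displace a special thread
                diff = (com_diff_move(pos, com, theta, i, a, b) if j == -1
                        else com_diff_swap(pos, com, theta, i, j))
                if diff < best_diff:
                    best_diff, best_cell = diff, (a, b)
        if best_cell is None:                        # Step 4: no improving exchange
            unextimes += 1
            continue
        unextimes = 0                                # Step 6: perform the exchange
        a, b = best_cell
        j = A[a][b]
        A[ix][iy], A[a][b] = j, i
        pos[i] = (a, b)
        if j != -1:
            pos[j] = (ix, iy)
    return A, pos

With d=1 the diamond around a desired location contains only a handful of cells, which is consistent with the earlier remark that a small threshold keeps each pass of the method cheap.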
Still taking the above H.264 decoder as an example, suppose no particular thread needs to be fixed on a particular processing unit and set d=1. The result of executing the method is: IP10 maps to (0,0), IP9 to (0,2), IP7 to (0,3), IP2 to (1,0), IP11 to (1,1), IP1 to (1,2), IP3 to (1,3), IP5 to (2,0), IP0 to (2,1), IP6 to (2,2), IP4 to (2,3), and IP8 to (3,2).
Suppose instead that thread IP5 must be mapped to (3,1) in advance and set d=2. The result of executing the method in this case is: IP10 maps to (3,3), IP9 to (1,3), IP7 to (0,3), IP2 to (3,2), IP11 to (2,2), IP1 to (1,2), IP3 to (0,2), IP5 to (3,1), IP0 to (2,1), IP6 to (1,1), IP4 to (0,1), and IP8 to (1,0).
Taking as a concrete example the encoding and decoding of two video clips containing only a box and a hand, which maps 20 threads onto a 5 x 5 NoC, the power consumption and running time of the method described in document 1 and of the inventive method are compared in Table 2 and Table 3, respectively:
Table 2: power consumption comparison between the method of document 1 and the inventive method
[Table 2 data is given as an image in the original publication]
Table 3: running time comparison between the method of document 1 and the inventive method
[Table 3 data is given as an image in the original publication]
The data in Table 2 and Table 3 show that the optimal solution computed by the method of the present invention consumes less power than the optimal solution of document 1, and that with d=1 the solving time is also shorter.
Should be noted that and understand, under the situation that does not break away from the desired the spirit and scope of the present invention of accompanying Claim, can make various modifications and improvement the present invention of foregoing detailed description.Therefore, the scope of claimed technical scheme is not subjected to the restriction of given any specific exemplary teachings.

Claims (8)

1. A task mapping method for a two-dimensional grid network-on-chip, comprising the following steps:
1) pre-allocating desired locations on the two-dimensional grid for all threads, said threads including common threads that may be mapped to any position;
2) for each common thread, calculating the variation Com_diff of the total communication power consumption factor after exchanging it with a common thread or idle position near its desired location, said common thread performing the exchange with whichever common thread or idle position minimizes Com_diff, until every exchange between a common thread and a common thread or idle position near its desired location yields a Com_diff greater than or equal to 0;
3) outputting a mapping file according to the positions of all said threads.
2. The method according to claim 1, characterized in that said step 1) comprises:
11) listing said common threads in a queue in order of decreasing traffic of each common thread;
12) assigning the first common thread in said queue to the centre of said two-dimensional grid;
13) calculating the desired location of each common thread to be allocated according to the desired locations of the threads already allocated.
3. The method according to claim 1, characterized in that said threads further include special threads that must be mapped to specific positions.
4. The method according to claim 3, characterized in that said step 1) comprises:
11') listing said special threads in a queue;
12') adding said common threads to said queue in order of decreasing traffic of each common thread;
13) calculating the desired location of each common thread to be allocated according to the desired locations of the threads already allocated.
5. The method according to claim 2 or 4, characterized in that said step 13) calculates the desired location of a common thread to be allocated from the desired locations of the already-allocated threads according to the following formulas:

x_i = \left[ \frac{\sum_k Com_{i,k} \cdot x_k}{\sum_k Com_{i,k}} \right], \quad y_i = \left[ \frac{\sum_k Com_{i,k} \cdot y_k}{\sum_k Com_{i,k}} \right]

where the sums run over the already-allocated threads k, [\cdot] denotes rounding to the nearest integer, Com_{i,k} denotes the total amount of data communicated between threads i and k, x_k and y_k denote the x and y coordinates of thread k, and x_i and y_i denote the x and y coordinates of thread i.
6. The method according to claim 2 or 4, characterized in that said step 2) comprises:
21) forming all the common threads in said queue into a circular queue and taking any one of the common threads;
22) assuming that said common thread is not yet mapped, calculating the variation Com_diff of the total communication power consumption factor after exchanging said common thread with each common thread or idle position near its desired location, and exchanging said common thread with whichever common thread or idle position minimizes Com_diff;
23) repeating step 22) until every exchange between a common thread and a common thread or idle position near its desired location yields a Com_diff greater than or equal to 0.
7. The method according to claim 1, characterized in that a common thread or idle position near said desired location is a common thread or idle position whose distance from the desired location is less than a predetermined threshold.
8. The method according to claim 6, characterized in that said distance from the desired location is calculated as a Manhattan distance.
CN2008101162455A 2008-07-07 2008-07-07 Method for mapping task of network on two-dimensional grid chip Active CN101625673B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101162455A CN101625673B (en) 2008-07-07 2008-07-07 Method for mapping task of network on two-dimensional grid chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008101162455A CN101625673B (en) 2008-07-07 2008-07-07 Method for mapping task of network on two-dimensional grid chip

Publications (2)

Publication Number Publication Date
CN101625673A true CN101625673A (en) 2010-01-13
CN101625673B CN101625673B (en) 2012-06-27

Family

ID=41521524

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101162455A Active CN101625673B (en) 2008-07-07 2008-07-07 Method for mapping task of network on two-dimensional grid chip

Country Status (1)

Country Link
CN (1) CN101625673B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103428804A (en) * 2013-07-31 2013-12-04 电子科技大学 Method for searching mapping scheme between tasks and nodes of network-on-chip (NoC) and network code position
CN103885842A (en) * 2014-03-19 2014-06-25 浙江大学 Task mapping method for optimizing whole of on-chip network with acceleration nodes
CN104079439A (en) * 2014-07-18 2014-10-01 合肥工业大学 NoC (network-on-chip) mapping method based on discrete firefly algorithm
CN104270308A (en) * 2014-10-15 2015-01-07 重庆大学 On-radio-frequency-piece network application mapping method facing unbalanced communication feature
CN106254254A (en) * 2016-09-19 2016-12-21 复旦大学 A kind of network-on-chip communication means based on Mesh topological structure
CN107391247A (en) * 2017-07-21 2017-11-24 同济大学 A kind of breadth First greed mapping method of network-on-chip application
WO2018014300A1 (en) * 2016-07-21 2018-01-25 张升泽 Power implementation method and system for multi-core chip

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6112085A (en) * 1995-11-30 2000-08-29 Amsc Subsidiary Corporation Virtual network configuration and management system for satellite communication system
CN101075961B (en) * 2007-06-22 2011-05-11 清华大学 Self-adaptable package for designing on-chip network

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103428804A (en) * 2013-07-31 2013-12-04 电子科技大学 Method for searching mapping scheme between tasks and nodes of network-on-chip (NoC) and network code position
CN103428804B (en) * 2013-07-31 2016-03-30 电子科技大学 Find mapping scheme and network code location method between network-on-chip task and node
CN103885842A (en) * 2014-03-19 2014-06-25 浙江大学 Task mapping method for optimizing whole of on-chip network with acceleration nodes
CN103885842B (en) * 2014-03-19 2017-08-25 浙江大学 A kind of band accelerates the overall duty mapping method of the optimization of the network-on-chip of node
CN104079439A (en) * 2014-07-18 2014-10-01 合肥工业大学 NoC (network-on-chip) mapping method based on discrete firefly algorithm
CN104079439B (en) * 2014-07-18 2017-02-22 合肥工业大学 NoC (network-on-chip) mapping method based on discrete firefly algorithm
CN104270308A (en) * 2014-10-15 2015-01-07 重庆大学 On-radio-frequency-piece network application mapping method facing unbalanced communication feature
WO2018014300A1 (en) * 2016-07-21 2018-01-25 张升泽 Power implementation method and system for multi-core chip
CN106254254A (en) * 2016-09-19 2016-12-21 复旦大学 A kind of network-on-chip communication means based on Mesh topological structure
CN107391247A (en) * 2017-07-21 2017-11-24 同济大学 A kind of breadth First greed mapping method of network-on-chip application
CN107391247B (en) * 2017-07-21 2020-06-26 同济大学 Breadth-first greedy mapping method for network-on-chip application

Also Published As

Publication number Publication date
CN101625673B (en) 2012-06-27


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant