CN101540786A

CN101540786A - Optimization method for network-on-chip communication oriented to peripheral device requirements

Info

Publication number: CN101540786A
Application number: CN200910097646A
Authority: CN
Inventors: 陈天洲; 陈剑; 汪达舟; 王超; 蒋冠军
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2009-04-13
Filing date: 2009-04-13
Publication date: 2009-09-23

Abstract

The invention discloses a method for optimizing network-on-chip (NoC) communication oriented to peripheral device requirements. The method partitions the NoC into virtual networks according to IO requirements, yielding a plurality of logically independent networks, and balances the communication traffic between peripheral IO and internal threads (or tasks), so that both compute-intensive and IO-intensive workloads are supported. Differences in the geographic positions of nodes are exploited through appropriate node placement and task mapping to optimize network-on-chip communication.

Description

Optimization method for network-on-chip communication oriented to peripheral device requirements

Technical field

The present invention relates to optimization methods for processor cores in computer architecture, and in particular to an optimization method for network-on-chip communication oriented to peripheral device requirements.

Background technology

In recent years, the results of advances in computer science and technology have appeared in personal desktop computers and in the various digital products people keep at hand. From an architectural point of view, their common trait is that performance keeps growing while power consumption keeps shrinking, which is exactly what users demand.

In both academia and industry, computer architecture research has shifted from simply raising the performance of a single processing core to integrating multiple processing modules on one die, that is, multi-core. Multi-core has become the mainstream trend in architecture development, owing both to progress in silicon technology and to the limitations of silicon itself: the former allows more and more transistors to be integrated per unit area (Moore's law), while the latter restricts increases in core frequency (the speed of light sets an upper bound, and wire delay keeps growing relative to gate delay). The direct consequence is that single-core frequency scaling is limited and the available transistors cannot be fully used; together with the need to control power consumption, this gave rise to multi-core.

To meet the growing demands of compute-dense and high-throughput applications, people have begun to look for on-chip communication schemes better than the traditional shared bus.

A scheme in which the various functional components (processors, memory, peripheral I/O controllers and so on) communicate on chip by sending packets over a network is called a network-on-chip. In fields such as communications, multimedia and consumer electronics, a system-on-chip (SoC) can meet strict size and power requirements by integrating the various functional components onto a single chip. However, SoC systems based on a shared bus have critical drawbacks: the design is complex, functional correctness is hard to achieve, and individual modules are hard to reuse. Network-on-Chip (NoC) technology was therefore first applied in SoC research.

NoC was proposed precisely to overcome the problems of the traditional shared-bus architecture, and so it has advantages that a bus cannot match. The main ones, though not all, are listed below.

1. NoC has good scalability and better parallelism than a bus. All functional modules (nodes) are connected through a network interface (NI) to a uniform router-based network. Routers can forward packets; with routing, a global wire becomes several short wire segments joined together, which keeps signals from being distorted. In theory the network can therefore scale to thousands of nodes. The links between routers can send and receive data concurrently, which improves parallelism.

2. Modules are highly reusable. As noted above, every functional module is interconnected through an NI, so there is a uniform interface. As electronic products are updated, the network remains essentially unchanged, so modules can be reused. This reduces repeated design work for designers, lowers cost, and shortens time to market (TTM).

3. Global signal control (including the clock) becomes simple. Each functional module is relatively independent, and there is little or no global coordination mechanism, so global signal distortion is minimized. Each functional module can also use its own clock; the system becomes distributed, and a module need not monitor the state of other modules in real time.

Because NoC research is at an early stage, no commercial systems have appeared yet, but the prospects are considerable. It can be expected that the NoC model will become increasingly clear in the near future. Several problems remain: standardizing each building block increases module reusability but costs performance, so the trade-off needs careful thought; simulation systems for systems-on-chip are not yet settled, and benchmarks for performance evaluation need to be redesigned; and power consumption, an important metric in existing chip design, is no exception in NoC design.

In the traditional shared-bus architecture, every functional component is connected directly to the bus, so physically the components occupy fully symmetric positions. Logical differences are limited to different priorities when requesting bus service. Hence there is no topology problem among components in traditional architecture design.

In an NoC system, by contrast, each functional component (also called a node) is connected to a network, which introduces an interconnect topology problem similar to that of conventional networks. In general, because routing is distributed and router capacity is finite, node positions in the network are not fully symmetric, only partially so. In short, a feasible NoC topology inevitably exhibits differences between node geographic positions.

Summary of the invention

The object of the present invention is to provide an optimization method for network-on-chip communication oriented to peripheral device requirements, which optimizes network-on-chip communication by exploiting differences in geographic position through appropriate node placement and task mapping.

The technical solution adopted by the present invention to solve its technical problem is as follows:

1) Peripheral connection design oriented to peripheral IO:

The topology of the network-on-chip-based processor is brought into the scope of the design. Because of the throughput demands of peripheral IO, a feasible NoC topology inevitably exhibits differences between node geographic positions; through deliberate node placement and task mapping, the impact of these differences is minimized;

2) Dividing virtual subnetworks:

The network-on-chip has n nodes, connected to one another through uniform routers. Because tasks cluster, a peripheral channel together with a group of processing nodes constitutes the hardware resources required by one task, that is, a temporary virtual subnetwork with significant internal communication; nodes inside this subnetwork communicate more closely with each other than with nodes outside the group;

3) Balancing communication traffic:

After the division into virtual subnetworks, there is a relatively large amount of communication inside each subnetwork and a small amount between virtual subnetworks. Router design is optimized around balanced communication: by balancing the communication traffic, the whole system is brought into a better communication operating state, and this state is reached by following "Ohm's law".

Compared with the background art, the beneficial effects of the present invention are:

The present invention is a necessary optimization in the design of network-on-chip processors with peripheral communication requirements, and is one of the factors to be considered in future designs of processors based on network-on-chip communication. The invention partitions the NoC into virtual networks according to IO requirements, yielding a plurality of logically independent networks, and balances the communication traffic between peripheral IO and internal threads (or tasks), so that both compute-intensive and IO-intensive workloads are supported.

(1) Independence, reliability and efficiency. In satisfying the communication requirements of the network-on-chip, the method is relatively independent of other design factors, and the sub-blocks work asynchronously, which greatly increases the reliability of the network-on-chip. Every functional module is interconnected through an NI, so there is a uniform interface. Global signal control (including the clock) becomes simple. Each functional module is relatively independent, and there is little or no global coordination mechanism, so global signal distortion is minimized. Each functional module can also use its own clock; the system becomes distributed, and a module need not monitor the state of other modules in real time. Communication across nodes is relatively balanced and there is no especially congested central point, so the distributed system is efficient.

(2) Good scalability and better parallelism. All functional modules (nodes) are connected through a network interface (NI) to a uniform router-based network. Routers can forward packets; with routing, a global wire becomes several short wire segments joined together, which keeps signals from being distorted. In theory the network can therefore scale to thousands of nodes. The links between routers can send and receive data concurrently, which improves parallelism. The communication requirements of the peripheral channels are logically divided into several virtual subnetworks, and "interference" between subnetworks is minimized.

Description of drawings

Fig. 1 is a peripheral channel connection diagram of the present invention.

Fig. 2 is a diagram of the virtual subnetwork division method of the present invention.

Fig. 3 is a diagram of the optimal node distribution of a network within a group.

Fig. 4 is a schematic diagram of two paths that can both reach the destination.

Embodiment

The present invention is a necessary optimization, within computer architecture, for the design of network-on-chip processors with peripheral communication requirements. Current NoC development mostly aims at a qualitative improvement in intensive computation and neglects the demands of fast peripheral IO. The invention partitions the NoC into virtual networks according to IO requirements, yielding a plurality of logically independent networks, and balances the communication traffic between peripheral IO and internal threads (or tasks), so that both compute-intensive and IO-intensive workloads are supported.

The specific implementation is described below with reference to Fig. 1 and Fig. 2.

1) Peripheral connection design oriented to peripheral IO

This optimization design brings the topology of the network-on-chip-based processor into consideration. Because of the throughput demands of peripheral IO, a feasible NoC topology inevitably exhibits differences between node geographic positions. Through appropriate node placement, appropriate task mapping and similar measures, the impact of this problem is minimized.

In Fig. 1, the LVDS, PCI-E and MEM controllers are high-traffic peripheral channels. Suppose node 0 needs to send one packet to node 1 and one to node 15: under an optimal routing algorithm, the delay to node 1 is clearly shorter than the delay to node 15, and the communication cost to node 15 is higher. Yet from the logical viewpoint of node 0, node 1 and node 15 are completely equivalent. This phenomenon can be called the geographic position difference of the network topology. Besides the obvious delay, the difference also consumes more resources, such as routers and links. It cannot be eliminated completely. In the present invention, the positions of the high-throughput peripheral channels, such as the fast graphics card channel and the memory controller channel in the figure, are fixed first. Then, exploiting the fact that the position of a thread (task) among the processing cores (nodes) is not fixed, threads with heavy traffic are placed at the nearest positions and threads with light traffic at more distant positions; this determines the design of the connection between the peripheral channels and the network-on-chip processor.
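The position difference between node 1 and node 15 as seen from node 0 can be illustrated with a hop count. The following is only a sketch under stated assumptions: Fig. 1 is not reproduced here, so a 4x4 mesh with row-major node numbering and minimal (Manhattan) routing is assumed, and the names `MESH_WIDTH`, `coords` and `hop_count` are hypothetical.

```python
# Assumption: a 4x4 mesh, nodes numbered row by row (0..15), minimal routing.
# The real layout of Fig. 1 may differ; this only illustrates position difference.
MESH_WIDTH = 4


def coords(node):
    """Return (row, column) of a node in the row-major mesh."""
    return divmod(node, MESH_WIDTH)


def hop_count(src, dst):
    """Number of router-to-router hops on a shortest path in the mesh."""
    (r1, c1), (r2, c2) = coords(src), coords(dst)
    return abs(r1 - r2) + abs(c1 - c2)


print(hop_count(0, 1))   # 1
print(hop_count(0, 15))  # 6
```

Under these assumptions a packet to node 15 crosses six links while a packet to node 1 crosses one, even though both destinations look identical to the software on node 0.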

2) Dividing virtual subnetworks

In Fig. 2, each core represents a processing unit on the on-chip network; that is, the network-on-chip has n nodes, connected to one another through essentially uniform routers. Because tasks cluster, a peripheral channel together with a group of processing nodes constitutes the hardware resources required by one task. On this basis, a temporary virtual subnetwork with significant internal communication is formed. Nodes in this subnetwork communicate more closely with each other than with other nodes, so communication organized on this basis is also efficient.

Because chip area is limited and applications need more compute cores, some device controllers other than the processing cores cannot be integrated on chip. These can simply be moved onto an existing shared-bus architecture, while communication among the on-chip processing cores uses the NoC; this relieves pressure on the bus and yields a computer system with a hybrid communication scheme. Such a design remains compatible with the conventional bus structure and thereby reduces cost.

Such a processor design is generally used for high-density computation and suits cases where I/O throughput is low or I/O channels are very scarce, so its usable conditions are easily restricted. When a peripheral port, such as the VGA port in Fig. 2, needs to exchange large amounts of data, the router it is attached to can become overloaded. The reason is that the data produced by every node that exchanges data with the VGA port must pass through this router, much like water converging at the downstream end of a ditch.

When an application is mapped onto an NoC system of this structure, a framework resembling virtual networks is divided logically, according to the application's IO demands combined with the communication demands among its own cores, as shown by the curved outlines in Fig. 2. Nodes with high inter-core communication cost are generally placed far from the required IO port; conversely, nodes that exchange large amounts of data directly with an IO port need to be very close to that port. This brings the load on the routers at the nodes into rough balance. Some high-speed IO channels can thus be supported well, and the cost of inter-core communication is reduced in general.
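The placement rule described above can be sketched as a simple greedy assignment. This is an illustration, not the patent's own procedure: the function `map_tasks_by_io`, the task names, the node numbers and the traffic figures are all hypothetical.

```python
def map_tasks_by_io(io_traffic, node_distance):
    """Greedy mapping: the heavier a task's IO traffic, the closer its node.

    io_traffic:    {task_name: traffic exchanged with the IO port}
    node_distance: {node_id: hops from that node to the IO port's router}
    """
    tasks = sorted(io_traffic, key=io_traffic.get, reverse=True)
    nodes = sorted(node_distance, key=node_distance.get)
    if len(tasks) > len(nodes):
        raise ValueError("more tasks than free nodes")
    return dict(zip(tasks, nodes))


# Hypothetical virtual subnetwork around a VGA port: four free nodes, three tasks.
distance_to_vga = {5: 1, 6: 2, 9: 2, 10: 3}
traffic_to_vga = {"render": 90, "decode": 40, "compute": 5}

mapping = map_tasks_by_io(traffic_to_vga, distance_to_vga)
print(mapping)  # {'render': 5, 'decode': 6, 'compute': 9}
```

A fuller mapper would also weigh inter-core traffic, as the text requires; this sketch considers IO traffic only so that the placement rule stays visible.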

3) Balancing communication traffic

After the division into virtual subnetworks, there is a relatively large amount of communication inside each subnetwork and a small amount between subnetworks. Router design is optimized around balanced communication, so that the whole system is brought into a better communication operating state;

(1) When a virtual subnetwork arrives, its nodes are always placed in the distribution that is optimal for that network; see Fig. 3.

(2) The channel between a pair of nodes is an important communication hardware resource, and has its own "resistance value" x when it carries no traffic. When a channel carries load, its "resistance value" increases to x'. It follows that routing a required communication over the nearest channel is possible but not necessarily optimal. Suppose node 0 needs to communicate with node 1, but the shortest path already carries traffic 10 (its resistance value). In that case the path 0-3-4-1 is chosen instead, because the traffic on it is 2+2+2=6 < 10. This way of balancing traffic matches the way current flows, i.e. "Ohm's law": traffic is treated as added resistance, and the path that current would favor is the communication path selected by the invention.
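The path choice in this example can be reproduced by treating each link's current "resistance value" as an edge weight and searching for the path with the smallest total. The sketch below uses Dijkstra's algorithm as a stand-in; the patent does not name a search algorithm, and the function name `least_loaded_path` and the link table are assumptions built from the numbers in the text.

```python
import heapq


def least_loaded_path(links, src, dst):
    """Find the path with the lowest summed "resistance" (Dijkstra's algorithm).

    links: {(node_a, node_b): resistance} for undirected links.
    Returns (total_resistance, [nodes on the path]) or None if unreachable.
    """
    graph = {}
    for (a, b), load in links.items():
        graph.setdefault(a, []).append((b, load))
        graph.setdefault(b, []).append((a, load))

    best = {src: 0}
    queue = [(0, src, [src])]
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, load in graph.get(node, []):
            new_cost = cost + load
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt, path + [nxt]))
    return None


# Link loads taken from the example: direct link 0-1 carries 10, detour links 2 each.
links = {(0, 1): 10, (0, 3): 2, (3, 4): 2, (4, 1): 2}
cost, path = least_loaded_path(links, 0, 1)
print(cost, path)  # 6 [0, 3, 4, 1]
```

The detour wins because its summed load of 6 is lower than the 10 already on the direct link, which matches the choice described in the text.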

(3) Even when there is no existing traffic between two nodes, the entire communication is not loaded onto a single path between the two points. Referring to Fig. 4, two paths can reach the destination. Suppose the traffic to send is 10; then 8 is carried on 0-1 and 2 on 0-3-4-1. This proportion follows "Ohm's law". The concrete traffic amounts depend on the "resistance value" settings, which are derived from experimental data and experience so as to optimize the system.
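The 8/2 split can be read as a current divider over two parallel paths, where each path receives traffic in inverse proportion to its resistance. The sketch below is an interpretation under stated assumptions: the patent gives no resistance values, so the figures here (1.5 for the direct link, 2 per detour link) are hypothetical and chosen only because they reproduce the split in the text.

```python
def split_traffic(total, path_resistances):
    """Split traffic over parallel paths in inverse proportion to resistance."""
    conductances = [1.0 / r for r in path_resistances]
    total_conductance = sum(conductances)
    return [total * g / total_conductance for g in conductances]


# Hypothetical resistance values (not from the patent):
direct = 1.5          # single link 0-1
detour = 2 + 2 + 2    # links 0-3, 3-4 and 4-1 in series

shares = split_traffic(10, [direct, detour])
print([round(s, 6) for s in shares])  # [8.0, 2.0]
```

With other resistance settings the proportion changes accordingly, which is why the text leaves the concrete values to experiment and experience.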

Claims

1. An optimization method for network-on-chip communication oriented to peripheral device requirements, characterized in that:

1) Peripheral connection design oriented to peripheral IO:

2) Dividing virtual subnetworks:

3) Balancing communication traffic: