CN103152281A - Two-level switch-based load balanced scheduling method - Google Patents

Publication number
CN103152281A
Authority
CN
China
Prior art keywords
input port
voq
unit frame
zone
cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013100693918A
Other languages
Chinese (zh)
Other versions
CN103152281B (en)
Inventor
戴艺
肖立权
伍楠
曹继军
高蕾
张鹤颖
童元满
董德尊
王绍刚
沈胜宇
刘路
肖灿文
张磊
王永庆
齐星云
陆平静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201310069391.8A priority Critical patent/CN103152281B/en
Publication of CN103152281A publication Critical patent/CN103152281A/en
Application granted granted Critical
Publication of CN103152281B publication Critical patent/CN103152281B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a two-level switch-based load balanced scheduling method, which comprises the following steps that first-level input ports buffer arriving cells in a virtual output queue (VOQ) according to destination ports; a scheduler switches messages to second-level input ports through a first-level switching network, wherein k cells from the same stream in the VOQ are called unit frames; each first-level input port executes minimum length scheduling according to a traffic distribution matrix, and transmits the unit frames of the same stream to a fixed mapping area of the stream through the first-level switching network in k continuous external timeslots; and N second-level input ports are sequentially divided into N/k groups of which each comprises k continuous second-level input ports forming an area, and each area buffers the cells in an output queue (OQ) according to the destination ports, and switches the cells to the destination output ports through a second-level switching network. The method has the advantages that a scheduling process is simple, computation or communication is not required, the method is easily implemented by hardware, the throughput of 100 percent is ensured, the sequence of the messages can be ensured, and the like.

Description

Load-balanced scheduling method based on two-level switching
Technical field
The present invention relates mainly to packet scheduling in parallel switching fabrics, and in particular to a packet scheduling method that achieves both load balancing and packet order preservation in a parallel switching fabric.
Background
Researchers have studied packet reordering extensively using both active and passive measurement. J. Bennett measured packet reordering at the switching center of the MAE-East ISP and found it to be severe under heavy load and a high degree of parallelism in the network equipment: more than 90% of connections experienced reordering. Bennett attributed this high reordering rate mainly to local parallel processing inside the network, including parallel switching devices and parallel transmission links, and pointed out that packet reordering is not pathological network behavior. Most high-performance parallel switching fabrics, such as the load-balanced switch, the parallel packet switch and the multi-level switch, suffer from packet reordering. Deeply parallel designs thus raise two major issues for a switching fabric: load balancing and packet reordering. Load balancing is the key to delay and throughput guarantees, but it may cause packet reordering, and out-of-order packets harm the Internet: TCP, the transport protocol most widely used on the Internet, can mistake out-of-order packets for a sign of packet loss caused by congestion, which triggers unnecessary retransmissions and TCP timeouts. These retransmissions and timeouts reduce TCP throughput and increase packet delay, so it is essential to preserve packet order while load-balancing the input traffic.
Methods for preventing packet reordering fall into two classes: 1) bounding the number of out-of-order packets and placing a resequencing buffer of limited capacity at the output to put them back in order; 2) guaranteeing that packets leave the output port in their order of arrival, so that reordering never occurs. Because the buffer capacity is limited, the first class can only handle reordering within a certain range. Enlarging the resequencing buffer to O(N^2), where N is the number of ports, solves the reordering problem completely, but packet delay then grows on a correspondingly quadratic scale. Bounding the number of out-of-order packets therefore does not solve the reordering problem effectively and is hard to reconcile with the high port density required of routers. The Full Ordered Frame First (FOFF) algorithm proposed at Stanford University is representative of the first class: it tolerates a certain number of out-of-order packets inside the router and provides N × N buffer queues at the output to resequence them. It can be proved that, to guarantee in-order departure of cells, the FOFF resequencing buffer needs to hold at most N^2 cells, and that FOFF provides load balancing and achieves 100% throughput.
In recent years more researchers have favored the second class, which removes the resequencing operation and its buffer overhead at the output and helps improve delay. The common feature of these scheduling methods is that some message-passing mechanism is used to collect the state of all arriving packets, and centralized scheduling is performed on this global packet state. For example, the mailbox switch uses a symmetric connection pattern to create a feedback path that reports packet departure times, and the scheduler schedules packets according to those departure times. This policy guarantees that the packets of every flow leave the switch in arrival order, but it cannot provide the load balancing needed for 100% throughput. The alternating matching scheduling method takes a centralized approach: assuming that the traffic characteristics are known in advance and fixed, it solves the centralized scheduling problem offline by matrix decomposition and then implements online scheduling in a distributed manner with service guarantees. When traffic becomes unpredictable and changes dynamically, however, it is hard to meet the centralized scheduling requirement at large switch sizes. The parallel matching scheduling method preserves packet order by passing request-grant tokens between first-level and second-level input line cards, but in a concrete hardware implementation the frequent exchange of tokens between line cards multiplies the scheduling period.
Summary of the invention
The technical problem addressed by the present invention is as follows: in view of the problems of the prior art, the invention provides a load-balanced scheduling method based on two-level switching whose scheduling process is simple, requires no computation or communication, is easy to implement in hardware, achieves 100% throughput and preserves packet order.
To solve the above technical problem, the present invention adopts the following technical solution:
A load-balanced scheduling method based on two-level switching, in which each first-level input port buffers arriving cells in VOQs according to their destination port, and the scheduler switches packets to the second-level input ports through the first-level switching network. k cells of the same flow in a VOQ are called a unit frame, and the unit frame is the minimum scheduling unit. Each first-level input port performs minimum-length dispatch according to the traffic distribution matrix and, in k consecutive external time slots, sends a unit frame of a flow through the first-level switching network to the fixed mapped region of that flow. The N second-level input ports are divided in sequence into N/k groups, each group of k consecutive second-level input ports forming a region. Each region buffers cells in OQs according to their destination port; because the flow-to-region mapping is fixed, the k cells of a unit frame reach the heads of their OQs in order and are switched to the destination output port through the second-level switching network.
As a further improvement of the present invention, the flow-to-region mapping is built by dual rotation:
(1.1) the N/k flow branches of each first-level input port are mapped in rotation onto the N/k regions of the second-level input ports, so that every region can carry all flows;
(1.2) the association between the input ports and the N/k mapping patterns is further adjusted in rotation, so that every region carries all flows evenly distributed over the input ports.
As a further improvement of the present invention, when an input port schedules, it dispatches cells according to the flow-to-region mapping and serves its N/k flow branches in round-robin order. The operation of a first-level input port in each external time slot is as follows:
(2.1) If a VOQ of flow branch f holds a full frame, the full frame is scheduled first: the N/k unit frames of this full frame are given the highest scheduling priority, the traffic distribution matrix L is looked up, and the first unit frame of the full frame is extracted and sent to the region R_g whose L_{g,j} is currently smallest, i.e. steps (2.1.1) to (2.1.k) are executed. If no full frame exists, go to step (2.2).
(2.1.1) If the internal link from input port i to second-level input port S_{g,1} is idle, send the head-of-line cell of VOQ_{i,j} to second-level input port S_{g,1} of region R_g; otherwise set f = (f+1) mod N/k and go to step (2.1).
(2.1.2) Send the head-of-line cell of VOQ_{i,j} to second-level input port S_{g,2} of region R_g.
(2.1.3) Send the head-of-line cell of VOQ_{i,j} to second-level input port S_{g,3} of region R_g.
And so on, up to (2.1.k): send the head-of-line cell of VOQ_{i,j} to second-level input port S_{g,k} of region R_g, set g = (g+1) mod N/k and go to step (2.1).
(2.2) If a VOQ of flow branch f, VOQ_{i,j} (kf ≤ j < kf+k), holds a unit frame with the highest scheduling priority, look up the traffic distribution matrix L and send this unit frame to the region R_g whose L_{g,j} is currently smallest, i.e. execute steps (2.1.1) to (2.1.k); otherwise go to step (2.3).
(2.3) If the VOQs of flow branch f hold one or more unit frames, look up the traffic distribution matrix L, select the unit frame of the VOQ with the smallest balance coefficient and send it to the fixed mapped region R_g of the flow branch, i.e. execute steps (2.1.1) to (2.1.k). If the flow branch holds only one unit frame, the lookup of the traffic distribution matrix can be skipped and that unit frame is sent directly to its fixed mapped region R_g. Otherwise the flow branch holds no unit frame: set g = (g+1) mod N/k and go to step (2.1).
Compared with the prior art, the advantages of the present invention are:
1. The scheduling method can be distributed over the input ports and executed independently at each of them. Cells are dispatched according to local VOQ information alone, with no communication overhead, and the method achieves 100% throughput and preserves packet order with O(1) time complexity.
2. The invention achieves packet order preservation and load balancing without any communication overhead between the input port schedulers. Building a fixed flow-to-region mapping avoids packet reordering and eliminates the resequencing overhead. To prevent traffic from concentrating in particular regions, the flow-to-region mappings of the different input ports are built by dual rotation; each input port maintains a globally consistent view of the traffic distribution matrix and schedules unit frames according to it. It can be proved that, for any output port j, the OQ_j queues within one region have the same length and the OQ_j queues of different regions differ in length by at most 1, which gives a load balance degree of 100%.
3. With a suitably chosen aggregation granularity k, the invention can in theory attain the lowest delay. The delay of the scheduling method was evaluated by simulation for different values of k and compared with current mainstream load-balanced scheduling algorithms. The simulation results show that, at k = 2, the invention has the best delay performance among all current scheduling algorithms that guarantee packet order and, under a bursty traffic model, performs comparably to algorithms that do not preserve packet order.
4. The invention schedules packets according to the fixed flow-to-region mapping; the scheduling process is simple, requires no computation or communication and is easy to implement in hardware.
Description of drawings
Fig. 1 shows an example of the two-level switching architecture to which the scheduling method of the present invention applies.
Fig. 2 shows, for a specific application example with port count N = 32 and aggregation granularity k = 8, the flow-to-region mapping obtained with the dual-rotation mapping method.
Fig. 3 is a flow chart of the load-balanced scheduling method as executed in a specific application example.
Fig. 4 shows, for a specific application example, how cells are distributed over the OQ buffers of the second-level input ports after minimum-length dispatch under bursty traffic.
Embodiment
The present invention is described in further detail below with reference to the drawings and specific embodiments.
In the load-balanced scheduling method of the present invention based on two-level switching, each first-level input port first buffers arriving cells in VOQs according to their destination port, and the scheduler switches packets to the second-level input ports through the first-level switching network (the Mesh network shown in the figure). k cells of the same flow in a VOQ are called a unit frame, which is the minimum scheduling unit of the invention. Each first-level input port executes the scheduling method independently: it performs minimum-length dispatch according to the traffic distribution matrix and, in k consecutive external time slots, sends a unit frame of a flow through the first-level switching network (Mesh network) to the fixed mapped region of that flow. The N second-level input ports are divided in sequence into N/k groups, each group of k consecutive second-level input ports forming a region. Each region buffers cells in OQs according to their destination port; because the flow-to-region mapping is fixed, the k cells of a unit frame reach the heads of their OQs in order, wait for the second-level switching network (the Mesh network shown in the figure) to become idle and then reach the destination output port in order. If the first-level buffers are implemented on first-level line cards and the second-level buffers on second-level line cards, the two-level switching structure above becomes a typical load-balanced switching structure, and the invention is particularly suitable as a packet scheduling method for load-balanced routers. For ease of exposition the invention is described in terms of Mesh networks; the Mesh network should be understood as one technical means of switching packets to the second-level input ports and to the destination output ports, and other switching technologies may be used instead.
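The transmission of one unit frame described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `send_unit_frame`, the 0-based port numbering and the rule that region g owns ports g*k to g*k+k-1 are assumptions made for the example.

```python
from collections import deque


def send_unit_frame(voq, g, k, start_slot):
    """Pop one unit frame (k cells) from a VOQ and plan its transmission.

    Returns a list of (external_slot, second_level_port, cell) tuples, or an
    empty list if the VOQ holds fewer than k cells.
    Assumption: ports are numbered from 0 and region g owns ports
    g*k .. g*k+k-1.
    """
    if len(voq) < k:
        return []
    # Cell z of the unit frame goes to the z-th port of region g,
    # one cell per consecutive external time slot.
    return [(start_slot + z, g * k + z, voq.popleft()) for z in range(k)]
```

With k = 2, a VOQ holding three cells yields one unit frame for region 2 (ports 4 and 5) and keeps the third cell queued until a second cell arrives.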
As described above, the core of the invention is to group k consecutive input ports into a region and to use, at the input, a load-sharing algorithm based on flow mapping that dispatches k cells of the same flow to a fixed mapped region in a fine-grained manner. It can be proved theoretically that this scheduling policy achieves 100% throughput and preserves packet order. Here k is the aggregation granularity, which determines how many cells of the same flow are scheduled at a time. To prevent traffic from concentrating in particular regions, the invention further builds the flow-to-region mappings of the different input ports by dual rotation. To distribute load evenly over the second-level input ports, each input port further maintains a globally consistent view of the traffic distribution matrix and schedules unit frames according to it, which yields a load balance degree of 100%.
Fig. 1 shows an example of the two-level switching architecture to which the scheduling method applies. In the figure, VOQ_{i,j} denotes VOQ j of first-level input port i; [symbol image: Figure BDA00002885718400051] denotes output queue j of second-level input port l; f(i, j) denotes the flow from first-level input port i to output port j; k is the aggregation granularity, i.e. the number of cells of the same flow that are scheduled consecutively (k is a factor of the port count N). Every k cells of VOQ_{i,j} form a unit frame; the unit frames of N/k different VOQs at first-level input port i (N cells in total) form an aggregate frame; N cells of VOQ_{i,j} form a full frame. The second-level input ports 1, 2, ..., N are divided in sequence into N/k groups of k consecutive second-level input ports; the k second-level input ports of group g form a region, denoted R_g, and S_{r,z} denotes the z-th second-level input port of region r. The N flows of each input port are likewise divided into N/k groups of k flows, corresponding to the N/k regions; the k flows of group f of input port i form a flow branch.
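The index arithmetic behind these definitions can be sketched as below. The helper names are hypothetical, and 0-based numbering of ports, regions and VOQs is an assumption (the text above numbers second-level ports from 1, while the Fig. 2 example numbers input ports from 0).

```python
def region_of_port(port, k):
    """Region index g of a second-level input port (0-based, assumed)."""
    return port // k


def ports_of_region(g, k):
    """The k consecutive second-level input ports that form region R_g."""
    return list(range(g * k, g * k + k))


def flow_branch_of_voq(j, k):
    """Flow branch f that contains VOQ j, i.e. k*f <= j < k*f + k."""
    return j // k


def frame_sizes(n, k):
    """Cells per unit frame, aggregate frame and full frame."""
    if n % k != 0:
        raise ValueError("k must be a factor of the port count N")
    return {
        "unit_frame": k,                    # k cells of one flow
        "aggregate_frame": (n // k) * k,    # N/k unit frames, N cells
        "full_frame": n,                    # N cells of a single VOQ
    }
```

For N = 32 and k = 8 this gives four regions of eight ports each; an aggregate frame and a full frame both hold 32 cells, but the former draws on N/k different VOQs and the latter on a single VOQ.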
To reduce the memory bandwidth required for packet buffering, the Mesh network normally runs at rate R/N (internal link speedup of 1), which leads to the following definitions:
Definition 1. The time taken to send or receive one cell on a link of rate R is an external time slot.
Definition 2. The time taken to send or receive one unit frame on a link of rate R/N is a time slot; a time slot is N times an external time slot.
In general, in each time slot, i.e. every N external time slots, the UFFS-k algorithm can aggregate N/k unit frames from the N/k flow branches of an input port into one aggregate frame and dispatch it to the second-level input ports. The balance coefficient of a VOQ reflects how evenly traffic is distributed over the OQs of the second-level input ports; the invention schedules unit frames according to the OQ lengths of the regions and thereby balances load between regions. The working principle of the balance coefficient is explained next. By induction on time slots it can be proved theoretically that, under the above flow-mapping-based scheduling method, for any region R_g and any output j, 0 ≤ j < N, the k output queues j of the k second-level input ports l ∈ R_g have the same length (ignoring the transmission delay of a unit frame). This leads to the following definitions:
Definition 3. Since, for any region R_g, the output queues j of all second-level input ports l ∈ R_g have the same length, the balance coefficient of queue VOQ_{i,j} is defined as the length of output queue j of its mapped region R_g, denoted L_{g,j}.
Definition 4. If queue VOQ_{i,j} holds a unit frame and its balance coefficient satisfies the condition [formula image: Figure BDA00002885718400055], the unit frame of VOQ_{i,j} is sent to region R_g in k consecutive external time slots; this is called minimum-length dispatch.
It can be proved theoretically that, under the minimum-length dispatch policy, after any time slot T ends, the OQ lengths L_{g1,j} and L_{g2,j} of any two regions R_{g1} and R_{g2} differ by at most 1, so that 100% throughput and a load balance degree of 100% can be achieved. To perform minimum-length dispatch, the first-level input port schedulers must maintain a globally consistent view of the traffic distribution matrix L = [L_{g,j}]. To keep the view of this matrix consistent across input ports, writes to it by different ports must be mutually exclusive. The invention uses a lock mechanism for mutually exclusive writes to L_{g,j}: if g and j satisfy the condition [formula image: Figure BDA00002885718400061] and L_{g,j} is unlocked, first-level input port i locks L_{g,j}, sends the unit frame of VOQ_{i,j} to its mapped region R_g, increments L_{g,j} by 1 and then unlocks it. A VOQ_{i,j} in the flow branch whose balance coefficient L_{g,j} is locked is simply skipped. It is easy to see that a write conflict on the same L_{g,j} can arise only when input ports with the same flow-to-region mapping simultaneously schedule VOQ_{i,j} queues with the same destination port; mutually exclusive writes to L_{g,j} prevent these input ports from sending several unit frames to output queue j of the same region at the same time and thereby unbalancing the traffic distribution.
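The lock, send, increment, unlock sequence can be sketched in software as below. This is only an illustration under stated assumptions: the class name `TrafficMatrix`, the `send` callback and the use of `threading.Lock` as a stand-in for whatever hardware mechanism a real switch would use are all hypothetical.

```python
import threading


class TrafficMatrix:
    """Shared matrix L = [L_{g,j}] with one lock per entry (illustrative)."""

    def __init__(self, num_regions, num_outputs):
        self.length = [[0] * num_outputs for _ in range(num_regions)]
        self._locks = [
            [threading.Lock() for _ in range(num_outputs)]
            for _ in range(num_regions)
        ]

    def try_dispatch(self, g, j, send):
        """Send one unit frame to region g for output j if L_{g,j} is unlocked.

        Returns False without blocking when the entry is locked, mirroring
        the rule that a VOQ whose balance coefficient is locked is skipped.
        """
        lock = self._locks[g][j]
        if not lock.acquire(blocking=False):
            return False
        try:
            send(g, j)               # hypothetical callback: transmit the frame
            self.length[g][j] += 1   # L_{g,j} is incremented before unlocking
        finally:
            lock.release()
        return True
```

A scheduler would call `try_dispatch` for the selected VOQ and move on to the next candidate in the flow branch whenever it returns `False`.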
The flow-mapping-based packet scheduling algorithm schedules cells according to local VOQ information and can be distributed over the first-level input ports and executed independently at each of them; step 2 describes the procedure by which each first-level input port executes the load-balanced scheduling method of the invention.
In the first step of the invention, the flow-to-region mapping is built by dual rotation. The mapping method determines how well the second-level storage resources and the two Mesh networks are utilized; to avoid throughput loss, the mapping algorithm should distribute the input load evenly over the regions. The invention proposes a dual-rotation mapping algorithm that accommodates both load balancing and packet order preservation. Its design follows from an analysis of the root cause of reordering: cells become out of order when the second-level OQs that buffer cells of the same flow differ in length, and the number of out-of-order cells grows with the difference in OQ length. If k cells of the same flow are dispatched in a fine-grained manner to a predefined mapped region (k consecutive second-level input ports), then, because the flow-to-region mapping is fixed, the OQs of the mapped region that hold the cells of any given flow have the same length, and cells are delivered in order.
1.1 Since any given region can receive only a fixed set of k flows from a given input port, the N/k flow branches of each first-level input port are mapped in rotation onto the N/k regions of the second-level input ports so that load is distributed evenly over the regions. Step 1.1 corresponds to the first-level for loop of the dual-rotation mapping pseudocode and guarantees that every region can carry all flows.
1.2 The association between the input ports and the N/k mapping patterns is then adjusted in rotation. Step 1.2 corresponds to the second-level for loop of the pseudocode and guarantees that every region carries all flows evenly distributed over the input ports.
The dual-rotation mapping algorithm builds the flow-to-region mapping with simple modulo operations and is easy to implement in hardware. Its pseudocode is as follows:
[Pseudocode figure BDA00002885718400071: dual-rotation mapping algorithm]
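Because the pseudocode survives only as an image, the following is a hedged reconstruction rather than the original algorithm. It assumes that flow branch f of input port i maps to region (f + i) mod (N/k). That formula is an assumption, chosen because it uses only modulo operations and reproduces the property stated for the N = 32, k = 8 example, namely that input ports with the same value of i mod 4 share a mapping pattern. The function name `dual_rotation_map` is hypothetical.

```python
def dual_rotation_map(n, k):
    """Return mapping[i][f] = region index for flow branch f of input port i.

    Assumed rule (not confirmed by the source): region = (f + i) mod (N/k).
    The rotation over f spreads the N/k flow branches of one input port over
    N/k distinct regions; the rotation over i cycles the input ports through
    the N/k mapping patterns.
    """
    if n % k != 0:
        raise ValueError("k must be a factor of the port count N")
    groups = n // k
    mapping = []
    for i in range(n):                 # rotation over input ports
        row = []
        for f in range(groups):        # rotation over flow branches
            row.append((f + i) % groups)
        mapping.append(row)
    return mapping
```

Under this assumed rule, for N = 32 and k = 8 each row of the mapping is a permutation of the four regions, and each (flow branch, region) pair is used by exactly k = 8 input ports, so every region receives every flow branch from an equal share of the input ports.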
In the second step of the invention, the input port scheduler dispatches cells according to the flow-to-region mapping and serves the N/k flow branches in round-robin order. With the unit frame as the minimum scheduling unit, it sends a unit frame of one VOQ in the flow branch in k consecutive external time slots. The operation of the scheduler of first-level input port i in each external time slot is as follows:
2.1 If a VOQ of flow branch f (0 ≤ f < N/k, f initialized to 0), VOQ_{i,j} (kf ≤ j < kf+k), holds a full frame, the full frame is scheduled first: the N/k unit frames of this full frame are given the highest scheduling priority, the traffic distribution matrix L is looked up, and the first unit frame of the full frame is extracted and sent to the region R_g whose L_{g,j} is currently smallest, i.e. steps 2.1.1 to 2.1.k are executed. If no full frame exists, go to step 2.2.
2.1.1 If the internal link from input port i to second-level input port S_{g,1} is idle, send the head-of-line cell of VOQ_{i,j} to second-level input port S_{g,1} of region R_g; otherwise set f = (f+1) mod N/k and go to step 2.1.
2.1.2 Send the head-of-line cell of VOQ_{i,j} to second-level input port S_{g,2} of region R_g.
2.1.3 Send the head-of-line cell of VOQ_{i,j} to second-level input port S_{g,3} of region R_g.
...
And so on, up to 2.1.k: send the head-of-line cell of VOQ_{i,j} to second-level input port S_{g,k} of region R_g, set g = (g+1) mod N/k and go to step 2.1.
2.2 If a VOQ of flow branch f, VOQ_{i,j} (kf ≤ j < kf+k), holds a unit frame with the highest scheduling priority, look up the traffic distribution matrix L and send this unit frame to the region R_g whose L_{g,j} is currently smallest, i.e. execute steps 2.1.1 to 2.1.k; otherwise go to step 2.3.
2.3 If the VOQs of flow branch f, VOQ_{i,j} (kf ≤ j < kf+k), hold one or more unit frames, look up the traffic distribution matrix L, select the unit frame of the VOQ_{i,j} (kf ≤ j < kf+k) with the smallest balance coefficient and send it to the fixed mapped region R_g of the flow branch, i.e. execute steps 2.1.1 to 2.1.k. If the flow branch holds only one unit frame, the lookup of the traffic distribution matrix can be skipped and that unit frame is sent directly to its fixed mapped region R_g. Otherwise the flow branch holds no unit frame: set g = (g+1) mod N/k and go to step 2.1.
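The priority order of steps 2.1 to 2.3 can be sketched as a single selection function. This is a simplification under stated assumptions: it only chooses which VOQ of flow branch f to serve, it omits the link-idle check of step 2.1.1 and the choice of region, and the names `pick_voq`, `voq_cells`, `prioritized` and `l_g` are hypothetical. Breaking ties by the lowest VOQ index is also an assumption.

```python
def pick_voq(voq_cells, prioritized, l_g, f, k, n):
    """Choose the VOQ j of flow branch f to serve next, or None.

    voq_cells[j] : number of cells queued in VOQ_{i,j}
    prioritized  : set of VOQ indices whose full frame is already being sent,
                   so their unit frames carry the highest priority
    l_g[j]       : balance coefficient L_{g,j} of the branch's mapped region
    """
    branch = range(k * f, k * f + k)

    # Step 2.1: a VOQ holding a full frame (N cells) is served first.
    full = [j for j in branch if voq_cells[j] >= n]
    if full:
        return full[0]

    # Step 2.2: then a unit frame that already has the highest priority.
    marked = [j for j in branch if j in prioritized and voq_cells[j] >= k]
    if marked:
        return marked[0]

    # Step 2.3: otherwise the unit frame with the smallest balance coefficient.
    ready = [j for j in branch if voq_cells[j] >= k]
    if ready:
        return min(ready, key=lambda j: l_g[j])

    return None  # the flow branch holds no unit frame
```

Returning `None` corresponds to the last case of step 2.3, where the scheduler advances to the next flow branch.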
The invention uses the load balance degree to measure how evenly load is distributed over the second-level OQ buffers. It is defined as follows:
Definition 5. Load balance degree: suppose that in the time interval [t_r, t_v] the l-th switching module forwards S_l[t_r, t_v] cells. The load balance degree is the ratio of the smallest to the largest number of cells forwarded by the switching modules in this interval, that is:
$$E[t_r, t_v] = \frac{\min_{l=0,\dots,K-1} S_l[t_r, t_v]}{\max_{l=0,\dots,K-1} S_l[t_r, t_v]}$$
where K is the number of second-level switching modules.
Clearly E[t_r, t_v] ≤ 1. The closer E[t_r, t_v] is to 1, the more nearly equal the numbers of cells handled by the switching modules and the more balanced the load; the smaller E[t_r, t_v], the poorer the balance. On this measure it can be proved that the scheduling method of the invention attains a load balance degree of 100%.
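Definition 5 translates directly into a short function. The sketch below assumes the per-module cell counts for the interval have already been collected; the function name is hypothetical, and returning 1.0 for an interval in which no module forwarded any cell is an assumption, since the definition leaves that case open.

```python
def load_balance_degree(forwarded):
    """Ratio of the smallest to the largest per-module cell count.

    forwarded[l] is S_l[t_r, t_v], the number of cells forwarded by
    switching module l in the interval.
    """
    if not forwarded:
        raise ValueError("at least one switching module is required")
    largest = max(forwarded)
    if largest == 0:
        return 1.0  # assumption: an idle interval counts as balanced
    return min(forwarded) / largest
```

Three modules that each forward 10 cells give a degree of 1.0, while counts of 5 and 10 give 0.5.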
Fig. 2 shows, for a specific application example, the flow-to-region mapping designed in the first step of the invention: the mapping obtained with the dual-rotation method for port count N = 32 and aggregation granularity k = 8 (VOQ_{ij} stands for the flow from input port i to output port j, and → denotes the mapping). A relatively large granularity, k = 8, was chosen to make the mapping easy to inspect. The mapping method builds the fixed flow-to-region mapping; it determines how well the second-level input storage resources and the two switching networks (e.g. Mesh networks) are utilized, and to avoid loss of throughput the mapping algorithm should distribute the input load evenly over the regions.
The invention uses dual-rotation mapping to arrange the mapping from flow branches to regions. As shown in Fig. 2, each input port i (0 ≤ i ≤ 31) has N/k = 4 flow branches, and the second-level input ports are divided into N/k = 4 regions {R_0, R_1, R_2, R_3}. To keep the first-level switching resources well utilized, the flow branches of one input port should be mapped to different regions, which gives 4 mapping patterns. So that every region can carry all flows, the second-level rotation assigns the flow-branch-to-region mappings of the different input ports in round-robin order, which in theory guarantees that load balancing is feasible. Under the second-level rotation, input ports i = {0, 4, ..., 28} use the first mapping pattern, input ports i = {1, 5, ..., 29} the second, input ports i = {2, 6, ..., 30} the third and input ports i = {3, 7, ..., 31} the fourth. The dual-rotation mapping thus accommodates both load balancing and packet order preservation, and by scheduling unit frames according to the traffic distribution matrix of the second-level input ports the invention further balances every flow across the regions.
Figure 3 is a flowchart of the load-balanced scheduling method in a specific application example; it corresponds to the second step of the invention described above.
Figure 4 shows how cells are distributed over the OQ buffers of the second-stage input ports under bursty traffic after the minimum-length dispatch of the invention is applied; the invention balances the load while preserving message order. Because the mapping from flow branches to regions is fixed, some regions may overflow their buffers under heavy load while other regions stay relatively idle. For example, a VOQ queue of some flow branch at an input port holds N cells and keeps growing, while the other flow branches have no unit frame to schedule. The internal links from the heavily loaded flow branch to its mapped region then become the performance bottleneck, and because that flow branch cannot use the other, idle links of its input line card, internal bandwidth is wasted. To solve this problem, under bursty traffic the invention lets the heavily loaded flow branch preempt link resources: full frames are given the highest priority so that the bursty message flow is spread evenly over all regions. To avoid the cell reordering that scheduling a full frame could cause, and to follow the load-balancing principle, the invention dispatches the N/k unit frames read from queue VOQ_{i,j}, that is, one full frame, one after another to the region R_g whose L_{g,j} is currently smallest. The resulting dispatch order is: the first and second unit frames read from VOQ_{i,j} are dispatched in turn to regions 2 and 3, the third and fourth unit frames are dispatched in turn to region 1 or region 4, and the four unit frames reach output port j in the order in which they were read. Because the lengths of output queue j in different regions differ by at most 1, sending the N/k unit frames of a full frame one after another to the region with the currently smallest L_{g,j} cannot put cells out of order. In essence, scheduling a unit frame and scheduling a full frame both use the minimum-length dispatch policy. The difference is that a unit frame can only be dispatched to its fixed mapped region, whereas a full frame is split into N/k unit frames that are distributed evenly over the N/k regions.
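The full-frame dispatch order described above can be reproduced with a short Python sketch. It rests on two assumptions that the text does not state explicitly: L[g][j] is counted in unit frames, and each region receives exactly one unit frame of the full frame, so the regions are visited in ascending order of L[g][j] with ties broken by region index. The function name is hypothetical.

```python
def dispatch_full_frame(L, j):
    """Split one full frame from VOQ_{i,j} into N/k unit frames.

    L[g][j] is the traffic distribution matrix entry for region g and
    output j (assumed here to count unit frames). Unit frames are sent in
    read order to regions sorted by current L[g][j]; L is updated in place.
    Returns the region index (0-based) chosen for each unit frame.
    """
    order = sorted(range(len(L)), key=lambda g: L[g][j])  # stable sort
    for g in order:
        L[g][j] += 1  # each region takes exactly one unit frame
    return order


# Four regions, one output port (j = 0); regions 1 and 4 are one frame ahead.
L = [[1], [0], [0], [1]]
order = dispatch_full_frame(L, 0)
print([g + 1 for g in order])    # 1-based region numbers, as in the text
print([row[0] for row in L])     # queue lengths after the dispatch
```

In this example the first two unit frames go to regions 2 and 3 and the last two to regions 1 and 4, matching the order given in the text, and the queue lengths still differ by at most 1 afterwards.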
The above is only a preferred embodiment of the invention. The scope of protection of the invention is not limited to the above embodiment; all technical solutions within the concept of the invention fall within its scope of protection. It should be noted that, for those of ordinary skill in the art, improvements and modifications made without departing from the principle of the invention should also be regarded as falling within the scope of protection of the invention.

Claims (3)

1. A load-balanced scheduling method based on two-stage switching, characterized in that:
The first-stage input ports buffer arriving cells in VOQ queues according to destination port, and a scheduler switches the messages to the second-stage input ports through the first-stage switching network; k cells from the same flow in a VOQ queue are called a unit frame, and the unit frame is the minimum scheduling unit; each first-stage input port performs minimum-length dispatch according to the traffic distribution matrix and, in k consecutive external time slots, sends the unit frames of the same flow through the first-stage switching network to the fixed mapped region of that flow;
The N second-stage input ports are divided in order into N/k groups, each group containing k consecutive second-stage input ports that form a region; each region buffers cells in OQ queues according to destination port; because the flow-to-region mapping is fixed, the k cells of the same unit frame reach the head of the OQ queues in order and are switched to the destination output port through the second-stage switching network.
2. The load-balanced scheduling method based on two-stage switching according to claim 1, characterized in that a dual-rotation scheme is used to build the flow-to-region mapping:
(1.1) map the N/k flow branches of each first-stage input port to the N/k regions of the second-stage input ports in rotating fashion, so that every region can hold all flows;
(1.2) further adjust, in rotating fashion, the association between the different input ports and the N/k mapping modes, so that every region holds all flows in a balanced distribution across the input ports.
3. The load-balanced scheduling method based on two-stage switching according to claim 1, characterized in that, when scheduling, an input port dispatches cells according to the flow-to-region mapping and serves its N/k flow branches in round-robin fashion, and the operation of said first-stage input port in each external time slot is as follows:
(2.1) if a VOQ queue of the flow branch holds a full frame, schedule the full frame first: give the N/k unit frames of this full frame the highest scheduling priority, look up the traffic distribution matrix L, take the first unit frame of the full frame and send it to the region R_g whose L_{g,j} is currently smallest, that is, execute steps (2.1.1)~(2.1.k); if there is no full frame, go to step (2.2);
(2.1.1) if the internal link from input port i to second-stage input port S_{g,1} is idle, send the head-of-queue cell of VOQ_{i,j} to second-stage input port S_{g,1} of region R_g; otherwise set f=(f+1) mod N/k and go to step (2.1);
(2.1.2) send the head-of-queue cell of VOQ_{i,j} to second-stage input port S_{g,2} of region R_g;
(2.1.3) send the head-of-queue cell of VOQ_{i,j} to second-stage input port S_{g,3} of region R_g;
and so on, up to (2.1.k): send the head-of-queue cell of VOQ_{i,j} to second-stage input port S_{g,k} of region R_g, set g=(g+1) mod N/k, and go to step (2.1);
(2.2) if a VOQ queue VOQ_{i,j}, with kf≤j<kf+k, of flow branch (formula image FDA00002885718300012) holds a unit frame with the highest scheduling priority, look up the traffic distribution matrix L and send this unit frame to the region R_g whose L_{g,j} is currently smallest, that is, execute steps (2.1.1)~(2.1.k); otherwise go to step (2.3);
(2.3) if the VOQ queues of the flow branch hold one or more unit frames, look up the traffic distribution matrix L, select according to the lookup result the unit frame of the VOQ queue with the smallest balance coefficient, and send it to the fixed mapped region R_g of flow branch (formula image FDA00002885718300022), that is, execute steps (2.1.1)~(2.1.k); if the flow branch holds only one unit frame, the lookup of the traffic distribution matrix can be skipped and that unit frame is sent directly to its fixed mapped region R_g; otherwise, if flow branch (formula image FDA00002885718300023) holds no unit frame, set g=(g+1) mod N/k and go to step (2.1).
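The priority order of steps (2.1) to (2.3) can be sketched in Python as a single decision function. This is only an illustration under stated assumptions: a full frame is taken to mean N queued cells in one VOQ, the balance coefficient of step (2.3) is assumed to be the entry L[fixed_region][j], and `prio_left` is hypothetical bookkeeping for unit frames of a full frame that still hold the highest priority. Link-idle checks and the cell-by-cell sends of steps (2.1.1)~(2.1.k) are left out.

```python
def choose_frame(voq_len, prio_left, L, f, fixed_region, N, k):
    """Decide what one first-stage input port sends next for flow branch f.

    voq_len[j]   : cells waiting in VOQ_{i,j}
    prio_left    : dict j -> prioritized unit frames still waiting (assumed)
    L[g][j]      : traffic distribution matrix entry for region g, output j
    fixed_region : region that branch f is permanently mapped to
    Returns (kind, j, g), or None when the branch has no unit frame.
    """
    branch = range(k * f, k * f + k)  # VOQ_{i,j} with kf <= j < kf+k
    regions = range(N // k)

    def emptiest(j):
        # Region whose L[g][j] is currently smallest (lowest index on ties).
        return min(regions, key=lambda g: L[g][j])

    for j in branch:  # step (2.1): a full frame is served first
        if voq_len[j] >= N:
            return ("full", j, emptiest(j))
    for j in branch:  # step (2.2): leftover highest-priority unit frames
        if prio_left.get(j, 0) > 0 and voq_len[j] >= k:
            return ("priority", j, emptiest(j))
    ready = [j for j in branch if voq_len[j] >= k]
    if ready:  # step (2.3): ordinary unit frame goes to the fixed region
        j = min(ready, key=lambda c: L[fixed_region][c])
        return ("unit", j, fixed_region)
    return None  # nothing to send; the caller moves on to the next branch
```

A caller would loop over the flow branches in round-robin order, invoke this function once per branch, and update `voq_len`, `prio_left`, and `L` after the k cells of the chosen unit frame have been sent.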
CN201310069391.8A 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method Active CN103152281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310069391.8A CN103152281B (en) 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method

Publications (2)

Publication Number Publication Date
CN103152281A true CN103152281A (en) 2013-06-12
CN103152281B CN103152281B (en) 2014-09-17

Family

ID=48550152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310069391.8A Active CN103152281B (en) 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method

Country Status (1)

Country Link
CN (1) CN103152281B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103825845A (en) * 2014-03-17 2014-05-28 北京航空航天大学 Matrix decomposition-based packet scheduling algorithm of reconfigurable VOQ (virtual output queuing) structure switch
CN107770093A (en) * 2017-09-29 2018-03-06 内蒙古农业大学 A kind of method of work of preposition continuous feedback type two-stage switching fabric
CN108243113A (en) * 2016-12-26 2018-07-03 深圳市中兴微电子技术有限公司 The method and device of Random Load equilibrium
CN108259382A (en) * 2017-12-06 2018-07-06 中国航空工业集团公司西安航空计算技术研究所 3 × 256 priority scheduling circuits
CN108540398A (en) * 2018-03-29 2018-09-14 江汉大学 Feedback-type load balancing alternate buffer dispatching algorithm
CN108632143A (en) * 2017-03-16 2018-10-09 华为数字技术(苏州)有限公司 A kind of method and apparatus of transmission data
CN109391556A (en) * 2017-08-10 2019-02-26 深圳市中兴微电子技术有限公司 A kind of method for dispatching message, device and storage medium
CN112653623A (en) * 2020-12-21 2021-04-13 国家电网有限公司信息通信分公司 Relay protection service-oriented route distribution method and device
CN113179226A (en) * 2021-03-31 2021-07-27 新华三信息安全技术有限公司 Queue scheduling method and device
CN113722113A (en) * 2021-08-30 2021-11-30 北京天空卫士网络安全技术有限公司 Traffic statistic method and device
CN114415969A (en) * 2022-02-09 2022-04-29 杭州云合智网技术有限公司 Dynamic storage method for message of switching chip
CN114448899A (en) * 2022-01-20 2022-05-06 天津大学 Method for balancing network load of data center
CN114500581A (en) * 2022-01-24 2022-05-13 芯河半导体科技(无锡)有限公司 Equal-delay distributed cache Ethernet MAC (media access control) architecture
CN114697275A (en) * 2020-12-30 2022-07-01 深圳云天励飞技术股份有限公司 Data processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7362733B2 (en) * 2001-10-31 2008-04-22 Samsung Electronics Co., Ltd. Transmitting/receiving apparatus and method for packet retransmission in a mobile communication system
CN101404616A (en) * 2008-11-04 2009-04-08 北京大学深圳研究生院 Load balance grouping and switching structure and its construction method
WO2011050541A1 (en) * 2009-10-31 2011-05-05 北京大学深圳研究生院 Load balancing packet switching structure with the minimum buffer complexity and construction method thereof
CN102123087A (en) * 2011-02-18 2011-07-13 天津博宇铭基信息科技有限公司 Method for quickly calibrating multi-level forwarding load balance and multi-level forwarding network system

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103825845A (en) * 2014-03-17 2014-05-28 北京航空航天大学 Matrix decomposition-based packet scheduling algorithm of reconfigurable VOQ (virtual output queuing) structure switch
CN108243113A (en) * 2016-12-26 2018-07-03 深圳市中兴微电子技术有限公司 The method and device of Random Load equilibrium
CN108243113B (en) * 2016-12-26 2020-06-16 深圳市中兴微电子技术有限公司 Random load balancing method and device
CN108632143A (en) * 2017-03-16 2018-10-09 华为数字技术(苏州)有限公司 A kind of method and apparatus of transmission data
CN109391556B (en) * 2017-08-10 2022-02-18 深圳市中兴微电子技术有限公司 Message scheduling method, device and storage medium
CN109391556A (en) * 2017-08-10 2019-02-26 深圳市中兴微电子技术有限公司 A kind of method for dispatching message, device and storage medium
CN107770093B (en) * 2017-09-29 2020-10-23 内蒙古农业大学 Working method of preposed continuous feedback type two-stage exchange structure
CN107770093A (en) * 2017-09-29 2018-03-06 内蒙古农业大学 A kind of method of work of preposition continuous feedback type two-stage switching fabric
CN108259382A (en) * 2017-12-06 2018-07-06 中国航空工业集团公司西安航空计算技术研究所 3 × 256 priority scheduling circuits
CN108259382B (en) * 2017-12-06 2021-10-15 中国航空工业集团公司西安航空计算技术研究所 3x256 priority scheduling circuit
CN108540398A (en) * 2018-03-29 2018-09-14 江汉大学 Feedback-type load balancing alternate buffer dispatching algorithm
CN112653623A (en) * 2020-12-21 2021-04-13 国家电网有限公司信息通信分公司 Relay protection service-oriented route distribution method and device
CN114697275B (en) * 2020-12-30 2023-05-12 深圳云天励飞技术股份有限公司 Data processing method and device
CN114697275A (en) * 2020-12-30 2022-07-01 深圳云天励飞技术股份有限公司 Data processing method and device
WO2022142917A1 (en) * 2020-12-30 2022-07-07 深圳云天励飞技术股份有限公司 Data processing method and apparatus
CN113179226A (en) * 2021-03-31 2021-07-27 新华三信息安全技术有限公司 Queue scheduling method and device
CN113179226B (en) * 2021-03-31 2022-03-29 新华三信息安全技术有限公司 Queue scheduling method and device
CN113722113A (en) * 2021-08-30 2021-11-30 北京天空卫士网络安全技术有限公司 Traffic statistic method and device
CN114448899A (en) * 2022-01-20 2022-05-06 天津大学 Method for balancing network load of data center
CN114500581A (en) * 2022-01-24 2022-05-13 芯河半导体科技(无锡)有限公司 Equal-delay distributed cache Ethernet MAC (media access control) architecture
CN114500581B (en) * 2022-01-24 2024-01-19 芯河半导体科技(无锡)有限公司 Method for realizing equal-delay distributed cache Ethernet MAC architecture
CN114415969A (en) * 2022-02-09 2022-04-29 杭州云合智网技术有限公司 Dynamic storage method for message of switching chip
CN114415969B (en) * 2022-02-09 2023-09-29 杭州云合智网技术有限公司 Method for dynamically storing messages of exchange chip

Also Published As

Publication number Publication date
CN103152281B (en) 2014-09-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant