WO2012116655A1 - Switching unit chip, router, and method for sending cell information (交换单元芯片、路由器及信元信息的发送方法) - Google Patents

Switching unit chip, router, and method for sending cell information (交换单元芯片、路由器及信元信息的发送方法)

Info

Publication number
WO2012116655A1
WO2012116655A1 (PCT/CN2012/071845)
Authority
WO
WIPO (PCT)
Prior art keywords
cell
data
module
queue
switching unit
Prior art date
Application number
PCT/CN2012/071845
Other languages
English (en)
French (fr)
Inventor
拉米⋅茨卡里埃
艾利克斯⋅乌曼斯基
熊礼霞
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2012116655A1 publication Critical patent/WO2012116655A1/zh

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/10Packet switching elements characterised by the switching fabric construction
    • H04L49/109Integrated on microchip, e.g. switch-on-chip

Definitions

  • TECHNICAL FIELD Embodiments of the present invention relate to data exchange technologies, and in particular, to a switching unit chip, a router, and a method for transmitting cell information.
  • At present, the switching unit of a large-capacity router mainly adopts a shared-cache structure: all input and output ports access the same cache, and in each clock cycle (Clock Cycle) all input and output ports can read and write simultaneously, which greatly improves the processing capability of the switching unit.
  • However, because of the limitation of the memory access cycle, the cell size (cell length) that an N-in, N-out switching unit can process at line rate and its link speed satisfy: Memory access cycle < cell length / (N × link speed). With a main frequency of 400 MHz, dual-port read/write processing, and a link speed of 10 Gbps, line-rate handling of pure 64-byte cells requires N <= 20; that is, the number of physical connections (Serdes) between switching chips cannot exceed 20, which greatly limits the number of physical connections of the switching unit chip.
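As a worked check of this constraint (a hedged sketch; the 400 MHz dual-port clock, 10 Gbps link speed, and 64-byte cells are the figures given in the text, and the variable names are ours):

```python
# Worked example of the background constraint
#   memory access cycle < cell length / (N * link speed)
# using the figures from the text: 400 MHz dual-port clock, 10 Gbps links,
# pure 64-byte cells. N is the number of input (and output) ports.
cell_bits = 64 * 8                        # 64-byte cell = 512 bits
link_speed_bps = 10e9                     # 10 Gbps per link
access_cycle_s = 1 / 400e6                # 2.5 ns per dual-port access

cell_time_s = cell_bits / link_speed_bps  # 51.2 ns to receive one cell per link
n_max = int(cell_time_s / access_cycle_s)

print(n_max)  # 20, matching the text's N <= 20 Serdes limit
```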
  • Completely splitting the switching unit physically into several parts is one prior-art way around the memory-access-cycle limitation. For example, for a switching unit with 128 Serdes, a link-number limit is determined according to the constraint among the memory access cycle, the link processing rate, the cell size, and the number of links, and the switching unit is then divided into multiple small units that perform data exchange processing independently.
  • Embodiments of the present invention provide a switching unit chip, a router, and a method for sending cell information, to meet the higher-traffic switching requirements of the Internet.
  • An embodiment of the present invention provides a switching unit chip, including:
  • a cell input module having a plurality of input ports, configured to buffer cell information received through each input port, allocate data cells in the buffered cell information according to a corresponding allocation rule, and send the data cells read from the buffer to the queue engine module according to the allocation result;
  • the queue engine module, connected to the cell input module and including a plurality of independent data queue engine sub-modules, where each data queue engine sub-module is configured to receive the data cells sent by the cell input module according to the corresponding allocation rule and store the data cells in corresponding queues in the data queue engine sub-module;
  • a cell output module having a plurality of output ports, configured to schedule the data cells stored in the plurality of data queue engine sub-modules according to a scheduling rule and send them out through the corresponding output ports.
  • the embodiment of the invention provides a router, which includes the switching unit chip provided by the embodiments of the present invention.
  • the embodiment of the invention further provides a method for sending cell information, including:
  • a cell input module having a plurality of input ports in the switching unit chip buffers cell information received through each input port, and allocates data cells in the cached cell information according to a corresponding allocation rule, and according to the allocation result Transmitting the data cell read from the cache to the queue engine module in the switching unit chip;
  • the queue engine module is connected to the cell input module, and includes a plurality of independent data queue engine sub-modules;
  • Each data queue engine sub-module in the switching unit chip receives a data cell sent by the cell input module according to the corresponding allocation rule, and stores the data cell in the data queue engine sub-module Corresponding queue;
  • a cell output module having a plurality of output ports in the switching unit chip schedules data cells stored in the plurality of data queue engine sub-modules according to a scheduling rule, and sends the data cells through the corresponding output ports.
  • the chip is divided into a plurality of independent data processing units, and a well-performing data cell distribution algorithm ensures queue-state consistency among the different data processing units, so that the higher-traffic switching requirements of the Internet can be met.
  • FIG. 1 is a schematic structural diagram of a switching unit chip according to an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a switching unit chip according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a cell input module in an implementation of the present invention.
  • FIG. 4 is a schematic diagram of reading IQ cells in a switching unit chip according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a data table in an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of an information matrix in a process of implementing a distribution algorithm according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of QE selection in a process of implementing a distribution algorithm according to an embodiment of the present invention.
  • FIG. 8 is another schematic diagram of QE selection during implementation of a distribution algorithm according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of an updated data table in a process of implementing a distribution algorithm according to an embodiment of the present invention.
  • FIG. 10 is another schematic diagram of an update data table during implementation of a distribution algorithm according to an embodiment of the present invention.
  • FIG. 11 is still another schematic diagram of updating a data table during implementation of a distribution algorithm according to an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of queue scheduling of a QE according to an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of coordinated polling between OQ Groups and QEs according to an embodiment of the present invention.
  • FIG. 14 is a schematic diagram of a three-level switching network system using a switching unit chip according to an embodiment of the present invention.
  • To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Clearly, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
  • The embodiments of the present invention address the prior-art defects of piecing switching units together from independent small units, which is unfavorable to system expansion in terms of resource utilization and unit-splicing processing and which, because data cells must be distributed to different small units, adds the cost of distribution load-balancing and reordering and can even degrade performance. To address these defects, in the design of the switching unit of the switching fabric of the embodiments of the present invention, the chip is divided into several independent data processing units, and a well-performing data cell distribution algorithm guarantees the state consistency of the different data processing units, to meet the higher-traffic switching requirements of the future Internet.
  • the 8QE data distribution algorithm used by the switching unit chip in the embodiments of the present invention can also be applied to the field of stream classification.
  • As shown in FIG. 1, the switching unit chip includes a cell input module 1, a queue engine module 2, and a cell output module 3. The cell input module 1 has multiple input ports, through which the switching unit chip receives cell information sent by the upstream chip; the cell input module 1 is configured to buffer the cell information received through each input port, allocate the data cells in the buffered cell information according to a preset corresponding allocation rule, and send the data cells read from the buffer to the queue engine module 2 according to the allocation result.
  • The queue engine module 2 is connected to the cell input module 1 and includes a plurality of data queue engine sub-modules 21, which are independent of one another and are configured to receive the data cells sent by the cell input module 1 according to the corresponding allocation rule and to store the received data cells in corresponding queues in the data queue engine sub-modules 21.
  • the cell output module 3 has a plurality of output ports for scheduling data cells stored in the plurality of data queue engine sub-modules 21 according to the set scheduling rules, and transmitting the data cells through the corresponding output ports.
  • Specifically, the cell input module 1 in the switching unit chip receives data cells through its input ports and buffers them in its Input Queues (IQ);
  • in each clock cycle, the data cells to be distributed are allocated according to a preset distribution algorithm to determine which data queue engine sub-module 21 in the queue engine module 2 they are assigned to, and the data cells are sent to the corresponding data queue engine sub-modules 21 according to the allocation result. The distribution algorithm used in this embodiment needs to ensure that the queue states of the different data queue engine sub-modules 21 are consistent.
  • After receiving data cells, a data queue engine sub-module 21 stores them in the corresponding queues it contains according to the allocation result. In this embodiment, the internal queue design of each data queue engine sub-module 21 is the same, each including multiple unicast queues and multiple multicast queues.
  • the cell output module 3 dispatches the data cells stored in each data queue engine sub-module 21 out of the queue according to the scheduling rule in each clock cycle, and sends the data cells to the lower-level chip to exchange the cells.
  • the queue engine module 2 may further include a control queue engine sub-module 22 for receiving control cells in the cell information received by the cell input module 1 through each input port, and sending Give the cell output module 3.
  • the queue engine module 2 stores the data cells received by the cell input module 1 in its data queue engine sub-modules 21, and stores the control cells received by the cell input module 1 in its control queue engine sub-module 22, so as to implement control of cell exchange within the switching unit chip.
  • FIG. 2 is a schematic structural diagram of a switching unit chip according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a cell input module according to an embodiment of the present invention. The chip is described in detail taking 128 input ports, 128 output ports, and 8 data queue engine sub-modules as an example. As shown in FIG. 2 and FIG. 3, RX port0~RX port127 denote the 128 input ports, TX port0~TX port127 denote the 128 output ports, and QE0~QE7 denote the 8 data queue engine sub-modules.
  • the number of Serdes of the chip is 128 and, constrained by the shared-memory processing formula, the chip is divided into 8 independent processing units, called data QEs (Queue Engine);
  • these 8 data QEs, i.e., the data queue engine sub-modules, are used to process data cells.
  • the cell input module includes a cell receiving unit (RX Ports Arbiter), configured to receive cell information through each input port and buffer it, and a cell mapping unit (Cells-2QE Arbiter), configured to establish, according to the corresponding allocation rule, the allocation relationship between the data cells in the cell information buffered by the cell receiving unit and the data QEs.
  • a cell sending unit (Cells Data MUXs array) is configured to send the data cells read from the buffer to the corresponding data QEs in the queue engine module according to the allocation relationship established by the cell mapping unit. Specifically, the RX Ports Arbiter unit passes the data cells to the Cells-2QE Arbiter unit, which decides via the preset distribution algorithm which data QE each data cell is sent to; the data cells are then sent to the corresponding QEs by the Cells Data MUXs array unit.
  • a control queue engine sub-module that specifically processes control cells (e.g., Request/Grant, BP) may be included, called the control QE; it is responsible for forwarding the scheduling cells (e.g., Request/Grant) of uplink and downlink line-card traffic, and the flow-control cells (e.g., BP) from downlink line cards to uplink line cards.
  • each data QE uses an identical queue design; for example, each data QE contains 512 unicast queues and 256 multicast queues, and a data cell entering a data QE selects its queue according to the following rules (with 16 FICs per frame in the system and 4 priorities recognized in the switching unit chip):
  • Systems with 4 frames or fewer: unicast cells are enqueued according to the cell's destination FIC number and priority;
  • multicast cells are enqueued according to the cell's source FIC number and priority;
  • Systems with more than 4 frames: unicast cells are enqueued according to the cell's destination frame number and priority (in SE13 mode, cells destined for a FIC in the local frame are still enqueued according to the in-frame FIC number and priority);
  • multicast cells are enqueued according to the cell's source frame number and priority.
  • the control queue engine sub-module, i.e., the control QE, establishes three kinds of FIFO queues, including a first first-in-first-out queue, a second first-in-first-out queue, and a third first-in-first-out queue, used respectively to store scheduling control cells (Request/Grant), global BP cells, and queue BP cells; these three kinds of FIFO queues are also implemented with a shared cache.
  • each link corresponds to one IQ and one OQ queue, and each queue consists of three FIFOs, which store data cells, Request/Grant cells, and BP cells respectively.
  • each data queue engine sub-module is a data processing unit within the same chip, which is entirely different from the prior art of physically piecing a switching unit together from a small number of chips.
  • data cells can be evenly distributed among the data processing units within the chip based on the distribution algorithm, without incurring the cost of distribution load-balancing and reordering and without affecting system performance.
  • FIG. 4 is a schematic diagram of reading IQ cells in a switching unit chip according to an embodiment of the present invention.
  • in each Clock Cycle, the switching unit chip, through its cell input module, sequentially reads up to 8 data cells in order from the 128 IQs;
  • for example, in Clock Cycle 0 it reads cells from IQs 0 to 7, in Clock Cycle 1 from IQs 8 to 15, and so on cyclically.
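The round-robin read described above can be sketched as follows (a minimal illustration; the function name is ours, not from the patent):

```python
NUM_IQS = 128          # input queues, one per link
CELLS_PER_CYCLE = 8    # at most 8 data cells read per Clock Cycle

def iqs_read_in_cycle(clock_cycle):
    """Return the IQ indices read in the given clock cycle (round robin)."""
    start = (clock_cycle * CELLS_PER_CYCLE) % NUM_IQS
    return [(start + i) % NUM_IQS for i in range(CELLS_PER_CYCLE)]

print(iqs_read_in_cycle(0))   # [0, 1, 2, 3, 4, 5, 6, 7]
print(iqs_read_in_cycle(1))   # [8, 9, 10, 11, 12, 13, 14, 15]
print(iqs_read_in_cycle(16))  # wraps around to [0, 1, 2, 3, 4, 5, 6, 7]
```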
  • if control cells are buffered, the control cells in the cell information received through each input port may also be read in the Clock Cycle, for example one scheduling control cell (Request/Grant cell) of the uplink/downlink line cards and one system flow-control cell (BP cell).
  • the data structures of the 8 data QEs are identical, consisting of unicast VOQ and multicast MVIQ queues.
  • a cell entering a data QE enters the corresponding queue according to {unicast/multicast, priority, destination}. Since the structure and scheduling processing of the eight data QEs are identical, their queue occupancy states are expected to be similar, so that they can remain in similar working states. Otherwise, if the occupancy states of some queues are severely unbalanced across the 8 QEs, one QE may have no data to schedule while another accumulates a large backlog it cannot schedule in time, causing large delay and jitter of cells in the switching unit and severely degrading the scheduling performance of the Switch Element (SE).
  • to implement the distribution algorithm, the cell input module in the switching unit chip of the embodiment of the present invention can maintain two data tables: an information table containing the occupied length of every queue in each QE (hereinafter: the queue occupancy information table) and an information table containing the total cache occupancy of each QE.
  • FIG. 5 is a schematic diagram of a data table according to an embodiment of the present invention. As shown in FIG. 5, it includes an occupation length information table of each queue in eight QEs and a total cache occupancy information table of eight QEs.
  • the values recorded in the information table of per-queue occupied lengths are the relative differences of the occupied length of each queue across the QEs.
  • Table 1 shows the differences in the number of data cells stored in the respective VOQ0 queues of QE0 to QE7; assuming the numbers of data cells actually stored in QE0, QE1, QE2, ..., QE7 are 97, 98, 99, and 100 respectively (a byte count may also be used to represent cell occupancy depth), the table records the differences between them, {0, 1, 2, ..., 3}, so the number of bits required for this operation is much smaller than recording the actual lengths.
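The relative-difference bookkeeping can be illustrated with a small sketch (a hypothetical helper, assuming the table stores each length minus the smallest one):

```python
def to_relative(lengths):
    """Store per-QE queue lengths as differences from the smallest one."""
    base = min(lengths)
    return [x - base for x in lengths]

# VOQ0 holds 97, 98, 99, and 100 cells in four of the QEs; the table
# records only the differences, which need far fewer bits.
print(to_relative([97, 98, 99, 100]))  # [0, 1, 2, 3]
```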
  • the corresponding allocation rule in the embodiment of the present invention includes: according to the foregoing two information tables, selecting for each data cell the QE with the smallest corresponding queue occupied length and the smallest total cache occupancy.
  • for example, if VOQ0 has a queue length of 30 in both QE0 and QE1, QE0's total queue length is 100, and QE1's total queue length is 90, then QE1 is selected. This avoids always preferentially selecting the QE with the smaller index when the queue occupied lengths are equal.
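A minimal sketch of this allocation rule (the function name is ours; ties on VOQ length are broken by total cache occupancy, as in the example above):

```python
def select_qe(voq_len, total_len):
    """Pick the QE with the smallest VOQ occupancy for this cell,
    breaking ties by the smallest total cache occupancy."""
    return min(range(len(voq_len)), key=lambda q: (voq_len[q], total_len[q]))

# The example above: VOQ0 is 30 deep in both QE0 and QE1, but QE1's
# total occupancy (90) is below QE0's (100), so QE1 is selected.
voq0 = [30, 30, 45, 50, 60, 40, 35, 55]          # VOQ0 depth in QE0..QE7
totals = [100, 90, 120, 130, 140, 110, 105, 125]
print(select_qe(voq0, totals))  # 1
```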
  • in each Clock Cycle, the cell input module receives up to 8 data cells, and the corresponding entry contents are obtained from the above queue occupancy information table. At the end of this step, an information matrix of up to 64 entries (8 QEs × 8 VOQs) is obtained, as shown in FIG. 6.
  • the distribution algorithm selects the corresponding QE for each cell one by one, starting from the "first cell to be distributed", which is determined by simple polling: for example, if a given cell is treated as the "first cell to be distributed" in clock cycle N, then in clock cycle N+1 the next cell in sequence is treated as the "first cell to be distributed".
  • for each cell, the distribution algorithm selects the QE with the smallest occupancy based on the occupancy information of the VOQ to which the cell maps; if that QE has already been selected in this Clock Cycle, the VOQ takes the next-smallest QE, as shown in FIG. 7.
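Combining the two rules, one cycle's distribution might look like this sketch (a hypothetical illustration, not the patented implementation: each of the up-to-8 cells picks its smallest QE, skipping QEs already taken this cycle):

```python
def distribute_cells(occupancy_rows, total_len):
    """occupancy_rows: one row per cell, giving that cell's VOQ depth in
    each QE. Returns the QE chosen for each cell; a QE already chosen in
    this Clock Cycle is skipped and the next-smallest is taken."""
    chosen, result = set(), []
    for row in occupancy_rows:
        candidates = [q for q in range(len(total_len)) if q not in chosen]
        q = min(candidates, key=lambda q: (row[q], total_len[q]))
        chosen.add(q)
        result.append(q)
    return result

rows = [[5, 7, 6], [4, 9, 8], [2, 3, 1]]   # 3 cells, 3 QEs for brevity
totals = [50, 40, 60]
print(distribute_cells(rows, totals))  # [0, 2, 1]
```

Cell 1 would prefer QE0 (depth 4), but QE0 was already taken by cell 0, so it falls back to QE2, the next-smallest.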
  • the cell input module is further configured to update the two information tables after data cells are distributed to the QEs and after data cells are scheduled out of the QEs.
  • the switching unit chip provided by the embodiment of the invention can solve the line-rate scheduling problem of packet data and can process 64-byte cells at line rate in the switch fabric; the VOQ queue structure eliminates, in principle, head-of-line blocking of unicast traffic; and the state consistency of the different data QEs is ensured by a well-performing data cell distribution algorithm.
  • FIG. 12 is a schematic diagram of queue scheduling of a QE according to an embodiment of the present invention. As shown in FIG. 12, the queues in each QE are dequeued in a three-level scheduling manner: first, the priority to be scheduled is selected according to the absolute-priority principle or the weighted round robin (WRR) principle; next, among the queues of the selected priority, unicast or multicast scheduling is chosen according to the unicast/multicast weighted-polling principle; finally, the queue is selected by simple polling.
  • First, the priority of the scheduled queue is determined: if the absolute-priority principle, i.e., strict-priority scheduling, is used, the highest schedulable priority is selected as the priority of the current scheduling; if WRR mode is used, the priority of the current scheduling is selected according to the priority weights.
  • Then, the unicast/multicast type of the current scheduling is determined, and scheduling unicast or scheduling multicast is chosen according to the scheduling weights of unicast and multicast.
  • Finally, the cells in the queues to be scheduled correspond to multiple output queue (OQ) exits; an OQ is selected by simple polling, and the queue to be scheduled corresponding to that OQ is determined.
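The three levels can be sketched as follows (a simplified illustration with names of our own: strict priority at level 1, a plain preference order standing in for the unicast/multicast WRR at level 2, and simple polling over OQs at level 3):

```python
def dequeue_once(queues, prefer_unicast, oq_rr):
    """queues: dict mapping (priority, cast) -> {oq: [cells]}; priority 0 is
    highest. Returns (priority, cast, oq, cell) or None if all queues are empty."""
    casts = ("unicast", "multicast") if prefer_unicast else ("multicast", "unicast")
    for prio in sorted({p for (p, _) in queues}):            # level 1: strict priority
        for cast in casts:                                   # level 2: uc/mc choice
            nonempty = {oq: q for oq, q in queues.get((prio, cast), {}).items() if q}
            if not nonempty:
                continue
            order = sorted(nonempty)                         # level 3: simple polling
            oq = order[oq_rr % len(order)]
            return prio, cast, oq, nonempty[oq].pop(0)
    return None

qs = {(0, "unicast"): {3: ["c1"]}, (1, "multicast"): {5: ["c2"]}}
print(dequeue_once(qs, True, 0))  # (0, 'unicast', 3, 'c1')
```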
  • the scheduling parameters here refer to the relevant parameters of the scheduler (for example, the record of simple polling over the OQs, and the status of the queues to be scheduled, i.e., whether they contain cells); the data tables used for the 8-QE distribution described above are updated after cells enter and are scheduled out of their queues (these data tables are distinct from the scheduler parameters). This completes queue scheduling.
  • the OQ Group design in the switching unit chip provided by the embodiment of the present invention is described below.
  • the 128 OQs (Output Queues) corresponding to the Serdes are divided into 8 groups, called OQ Groups, with 16 Serdes links in each group; when connecting to different lower-level chips, each OQ Group is configured by one of the following mapping methods:
  • each OQ Group connects all the lower-level chips
  • plane mapping is performed by parity or other means; for example, when the number of SE2s is 32, Group 0/2 of an SE1 chip connects to the SE2s with even ID numbers, and Group 1/3 of the SE1 connects to the SE2s with odd ID numbers.
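The parity mapping in this example can be sketched as (a hypothetical helper for the 32-SE2 case described above):

```python
def se2_ids_for_group(group, num_se2=32):
    """SE1 Groups 0/2 connect to even-numbered SE2s, Groups 1/3 to odd."""
    parity = group % 2
    return [i for i in range(num_se2) if i % 2 == parity]

print(se2_ids_for_group(0)[:4])  # [0, 2, 4, 6]
print(se2_ids_for_group(1)[:4])  # [1, 3, 5, 7]
```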
  • each data QE corresponds to one OQ Group, that is, the data of a QE can only be scheduled to the OQs in its corresponding OQ Group; for example, polling may be performed in the manner shown in FIG. 13.
  • FIG. 14 is a schematic diagram of a three-level switching network system using a switching unit chip according to an embodiment of the present invention.
  • the three-stage switching network system shown in the figure adopts the switching unit architecture design mentioned in the foregoing embodiments, that is, SE13 (logically divided in the figure into SE1 and SE3) and SE2 are the switching unit chips proposed by the present invention.
  • the data cells are extracted from the IQ in each switching unit chip and distributed to different data QEs.
  • the corresponding VOQ/MVIQ is entered in the data QE, and finally the data cells are scheduled by queue scheduling into the corresponding OQ.
  • the control cells are taken out of the IQ in each switching unit chip and sent to the control QE, then scheduled according to the priority of the flow-control cells and sent to the corresponding OQ.
  • the embodiment of the present invention further provides a router, wherein the switching unit chip used may use the switching unit chip provided by the foregoing embodiments, and the structure and function thereof are not described herein again.
  • the switching unit chip provided by the embodiment of the present invention is also applicable to a single-stage and back-to-back switching network structure.
  • the switching unit chip and the router provided by the embodiments of the present invention ensure, by design, line-rate processing of 64-byte cells in the switching unit chip; the scheduling process of the queues is optimized by design, and the schedulability of a queue also determines its OQ, so that the queue schedulability check and scheduling are integrated and only simple polling and scheduling over the OQs is needed; and unicast head-of-line (HOL) blocking is eliminated by design.
  • the embodiment of the invention further provides a method for sending cell information, including the following steps:
  • a cell input module having a plurality of input ports in the switching unit chip buffers cell information received through each input port, and allocates data cells in the cached cell information according to a corresponding allocation rule, and according to the allocation result Transmitting the data cell read from the cache to the queue engine module in the switching unit chip;
  • the queue engine module is connected to the cell input module, and includes a plurality of independent data queue engine sub-modules;
  • Each data queue engine sub-module in the switching unit chip receives a data cell sent by the cell input module according to the corresponding allocation rule, and stores the data cell in the data queue engine sub-module Corresponding queue;
  • a cell output module having a plurality of output ports in the switching unit chip schedules data cells stored in the plurality of data queue engine sub-modules according to a scheduling rule, and sends the data cells through the corresponding output ports.
  • the method for sending cell information may further include: receiving, by the control queue engine sub-module included in the queue engine module, the control cells in the cell information received by the cell input module through each input port, and sending them to the cell output module.
  • the switching unit chip in the method for transmitting the cell information provided in this embodiment may use the switching unit chip provided by the foregoing embodiments of the switching unit chip.
  • for their structure and function, refer to the foregoing embodiments; details are not described herein again.
  • for the operation steps included in the method for sending cell information provided in this embodiment, reference may also be made to the processing steps mentioned in the foregoing embodiments; details are not described herein again.
  • a person skilled in the art can understand that all or part of the steps of the foregoing method embodiments may be implemented by program instructions running on relevant hardware, and the foregoing program may be stored in a computer-readable storage medium.
  • when executed, the program performs the steps of the foregoing method embodiments; the foregoing storage medium includes various media that can store program code, such as ROM, RAM, magnetic disk, or optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

An embodiment of the present invention provides a switching unit chip, a router, and a method for sending cell information. The switching unit chip includes a cell input module having multiple input ports, which buffers cell information received through each input port, allocates the data cells in the cell information according to a corresponding allocation rule, and sends the data cells to a queue engine module according to the allocation result. The queue engine module includes multiple independent data queue engine sub-modules; each data queue engine sub-module receives the data cells sent by the cell input module according to the corresponding allocation rule and stores them in corresponding queues in the data queue engine sub-module. A cell output module having multiple output ports schedules the data cells according to a scheduling rule and sends them out through the corresponding output ports. Embodiments of the present invention can meet the higher-traffic switching requirements of the Internet.

Description

Switching unit chip, router, and method for sending cell information

This application claims priority to Chinese Patent Application No. 201110050100.1, filed with the Chinese Patent Office on March 2, 2011 and entitled "交换单元芯片、路由器及信元信息的发送方法" (Switching Unit Chip, Router, and Method for Sending Cell Information), which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

Embodiments of the present invention relate to data exchange technologies, and in particular to a switching unit chip, a router, and a method for sending cell information.

BACKGROUND

At present, the switching unit of a large-capacity router mainly adopts a shared-cache structure: all input and output ports access the same cache, and in each clock cycle (Clock Cycle) all input and output ports can read and write simultaneously, which greatly improves the processing capability of the switching unit. However, because of the limitation of the memory access cycle, the cell size (cell length) that an N-in, N-out switching unit can process at line rate and its link speed satisfy: Memory access cycle < cell length / (N × link speed). With a main frequency of 400 MHz and dual-port read/write processing, a link speed of 10 Gbps, and pure 64-byte cells at line rate, N <= 20; that is, the number of physical connections (Serdes) between switching chips cannot exceed 20, which clearly greatly limits the number of physical connections of the switching unit chip.

Completely splitting the switching unit physically into several parts is one prior-art way around the memory-access-cycle limitation. For example, for a switching unit with 128 Serdes, a link-number limit is determined according to the constraint among the memory access cycle, the link processing rate, the cell size, and the number of links, and the switching unit is then divided into multiple small units that perform data exchange processing independently.

In the course of implementing the present invention, the inventors found at least the following problems in the prior art: the prior art does not solve the problem fundamentally, but pieces the switching unit together from identical independent small units, which is unfavorable to system expansion in terms of both resource utilization and unit-splicing processing; moreover, because data cells need to be distributed to the different small units, the cost of distribution load-balancing and reordering is added, and uneven distribution may cause performance degradation.

SUMMARY

Embodiments of the present invention provide a switching unit chip, a router, and a method for sending cell information, to meet the higher-traffic switching requirements of the Internet.
An embodiment of the present invention provides a switching unit chip, including:

a cell input module having multiple input ports, configured to buffer cell information received through each input port, allocate data cells in the buffered cell information according to a corresponding allocation rule, and send the data cells read from the buffer to a queue engine module according to the allocation result;

the queue engine module, connected to the cell input module and including multiple independent data queue engine sub-modules, where each data queue engine sub-module is configured to receive the data cells sent by the cell input module according to the corresponding allocation rule and store the data cells in corresponding queues in the data queue engine sub-module;

a cell output module having multiple output ports, configured to schedule the data cells stored in the multiple data queue engine sub-modules according to a scheduling rule and send them out through the corresponding output ports.

An embodiment of the present invention provides a router, including the switching unit chip provided by the embodiments of the present invention. An embodiment of the present invention further provides a method for sending cell information, including:

buffering, by a cell input module having multiple input ports in a switching unit chip, cell information received through each input port, allocating data cells in the buffered cell information according to a corresponding allocation rule, and sending, according to the allocation result, the data cells read from the buffer to a queue engine module in the switching unit chip, where the queue engine module is connected to the cell input module and includes multiple independent data queue engine sub-modules;

receiving, by each data queue engine sub-module in the switching unit chip, the data cells sent by the cell input module according to the corresponding allocation rule, and storing the data cells in corresponding queues in the data queue engine sub-module;

scheduling, by a cell output module having multiple output ports in the switching unit chip, the data cells stored in the multiple data queue engine sub-modules according to a scheduling rule, and sending them out through the corresponding output ports.

In the switching unit chip, the router, and the method for sending cell information provided by the embodiments of the present invention, the chip is divided into several independent data processing units, and a well-performing data cell distribution algorithm ensures queue-state consistency among the different data processing units, so that the higher-traffic switching requirements of the Internet can be met.

BRIEF DESCRIPTION OF THE DRAWINGS

To describe the technical solutions in the embodiments of the present invention or the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Clearly, the accompanying drawings described below are some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of a switching unit chip according to an embodiment of the present invention;

FIG. 2 is a schematic structural block diagram of a switching unit chip according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a cell input module according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of reading IQ cells in a switching unit chip according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of data tables according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of an information matrix in the implementation of a distribution algorithm according to an embodiment of the present invention;

FIG. 7 is a first schematic diagram of QE selection in the implementation of a distribution algorithm according to an embodiment of the present invention;

FIG. 8 is a second schematic diagram of QE selection in the implementation of a distribution algorithm according to an embodiment of the present invention;

FIG. 9 is a first schematic diagram of updating a data table in the implementation of a distribution algorithm according to an embodiment of the present invention;

FIG. 10 is a second schematic diagram of updating a data table in the implementation of a distribution algorithm according to an embodiment of the present invention;

FIG. 11 is a third schematic diagram of updating a data table in the implementation of a distribution algorithm according to an embodiment of the present invention;

FIG. 12 is a schematic diagram of queue scheduling of a QE according to an embodiment of the present invention;

FIG. 13 is a schematic diagram of coordinated polling between OQ Groups and QEs according to an embodiment of the present invention;

FIG. 14 is a schematic diagram of a three-stage switching network system using a switching unit chip according to an embodiment of the present invention.

DETAILED DESCRIPTION

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Clearly, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The embodiments of the present invention address the prior-art defects of piecing switching units together from independent small units, which is unfavorable to system expansion in terms of resource utilization and unit-splicing processing and which, because data cells must be distributed to different small units, adds the cost of distribution load-balancing and reordering and can even degrade performance. To address these defects, in the design of the switching unit of the switching fabric of the embodiments of the present invention, the chip is divided into several independent data processing units, and a well-performing data cell distribution algorithm ensures state consistency among the different data processing units, to meet the higher-traffic switching requirements of the future Internet. The 8-QE data distribution algorithm used by the switching unit chip in the embodiments of the present invention can also be applied to the field of flow classification.

FIG. 1 is a schematic structural diagram of a switching unit chip according to an embodiment of the present invention. As shown in FIG. 1, the switching unit chip includes a cell input module 1, a queue engine module 2, and a cell output module 3. The cell input module 1 has multiple input ports, through which the switching unit chip receives the cell information sent by the upstream chip; the cell input module 1 is configured to buffer the cell information received through each input port, allocate the data cells in the buffered cell information according to a preset corresponding allocation rule, and send the data cells read from the buffer to the queue engine module 2 according to the allocation result. The queue engine module 2 is connected to the cell input module 1 and includes multiple data queue engine sub-modules 21, which are independent of one another and are configured to receive the data cells sent by the cell input module 1 according to the corresponding allocation rule and to store the received data cells in corresponding queues in the data queue engine sub-modules 21. The cell output module 3 has multiple output ports and is configured to schedule the data cells stored in the multiple data queue engine sub-modules 21 according to a set scheduling rule and send them out through the corresponding output ports.

Specifically, the cell input module 1 in the switching unit chip of the embodiment of the present invention receives data cells through its input ports and buffers them in its Input Queues (IQ). In each clock cycle (Clock Cycle), the data cells to be distributed are allocated according to a preset distribution algorithm to determine which data queue engine sub-module 21 in the queue engine module 2 they are assigned to, and the data cells are sent to the corresponding data queue engine sub-modules 21 according to the allocation result. The distribution algorithm used in this embodiment needs to ensure that the queue states of the different data queue engine sub-modules 21 are consistent. After receiving data cells, a data queue engine sub-module 21 stores them in the corresponding queues it contains according to the allocation result; in this embodiment, the internal queue design of each data queue engine sub-module 21 is the same, each including multiple unicast queues and multiple multicast queues. In each Clock Cycle, the cell output module 3 dequeues the data cells stored in the data queue engine sub-modules 21 according to the scheduling rule and sends them out to the downstream chip to accomplish cell switching.

In the switching unit chip provided by the embodiment of the present invention, the queue engine module 2 may further include a control queue engine sub-module 22, configured to receive the control cells in the cell information received by the cell input module 1 through the input ports and to send them to the cell output module 3. The queue engine module 2 stores the data cells received by the cell input module 1 in its data queue engine sub-modules 21 and stores the control cells received by the cell input module 1 in its control queue engine sub-module 22, so as to implement control of cell exchange within the switching unit chip.

FIG. 2 is a schematic structural block diagram of a switching unit chip according to an embodiment of the present invention, and FIG. 3 is a schematic structural diagram of a cell input module according to an embodiment of the present invention; the switching unit chip is described in detail taking 128 input ports, 128 output ports, and 8 data queue engine sub-modules as an example. As shown in FIG. 2 and FIG. 3, RX port0~RX port127 denote the 128 input ports, TX port0~TX port127 denote the 128 output ports, and QE0~QE7 denote the 8 data queue engine sub-modules. Specifically, in the switching-fabric switching unit chip design of the embodiment of the present invention, the chip has 128 Serdes and, constrained by the shared-memory processing formula, is divided into 8 independent processing units, called data QEs (Queue Engine); these 8 data QEs (i.e., the data queue engine sub-modules) are used to process data cells. As shown in FIG. 3, the cell input module includes a cell receiving unit (RX Ports Arbiter), configured to receive cell information through the input ports and buffer it; a cell mapping unit (Cells-2QE Arbiter), configured to establish, according to the corresponding allocation rule, the allocation relationship between the data cells in the cell information buffered by the cell receiving unit and the data QEs; and a cell sending unit (Cells Data MUXs array), configured to send the data cells read from the buffer to the corresponding data QEs in the queue engine module according to the allocation relationship established by the cell mapping unit. Specifically, the RX Ports Arbiter unit passes the data cells to the Cells-2QE Arbiter unit, which decides via the preset distribution algorithm which data QE each data cell is sent to; the data cells are then sent to the corresponding QEs by the Cells Data MUXs array unit.
The switching unit chip of the embodiment of the present invention may include a control queue engine submodule dedicated to processing control cells (e.g., Request/Grant, BP), called the control QE. It is responsible for forwarding the scheduling cells of upstream and downstream line card traffic (e.g., Request/Grant), as well as the flow control cells from the downstream line cards to the upstream line cards (e.g., BP).
In the switching unit chip of the embodiment of the present invention, every data QE adopts exactly the same queue design; for example, each data QE contains 512 unicast queues and 256 multicast queues. A data cell entering a data QE selects its queue according to the following rules (each chassis in the system has 16 FICs, and the switching unit chip distinguishes 4 priorities):
Systems with 4 chassis or fewer: unicast cells are enqueued according to the cell's destination FIC number and priority;
multicast cells are enqueued according to the cell's source FIC number and priority.
Systems with more than 4 chassis: unicast cells are enqueued according to the cell's destination chassis number and priority (in SE13 mode, cells destined for a FIC in the local chassis are still enqueued by the intra-chassis FIC number and priority);
multicast cells are enqueued according to the cell's source chassis number and priority. In the switching unit chip of the embodiment of the present invention, three kinds of FIFO queues are established inside the control queue engine submodule, i.e., the control QE, including a first first-in-first-out queue, a second first-in-first-out queue, and a third first-in-first-out queue, respectively used to store scheduling control cells (Request/Grant), global BP cells, and queue BP cells. These three kinds of FIFO queues are also implemented using a shared buffer.
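The enqueue rules above can be expressed as a small selection function. The sketch below is illustrative only; the cell field names (`multicast`, `priority`, `src_fic`/`dst_fic`, `src_chassis`/`dst_chassis`) are assumptions, not taken from the patent:

```python
def select_queue(cell, num_chassis, se13_mode=False, local_chassis=None):
    """Pick the queue key for a cell entering a data QE.

    Unicast cells map to a VOQ, multicast cells to an MVIQ; the key is
    (queue kind, FIC or chassis number, priority) per the rules above."""
    if num_chassis <= 4:
        if cell["multicast"]:
            return ("MVIQ", cell["src_fic"], cell["priority"])
        return ("VOQ", cell["dst_fic"], cell["priority"])
    # more than 4 chassis: key on the chassis number instead of the FIC number
    if cell["multicast"]:
        return ("MVIQ", cell["src_chassis"], cell["priority"])
    if se13_mode and cell["dst_chassis"] == local_chassis:
        # SE13 mode: cells for the local chassis still use the FIC number
        return ("VOQ", cell["dst_fic"], cell["priority"])
    return ("VOQ", cell["dst_chassis"], cell["priority"])
```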
In the switching unit chip of the embodiment of the present invention, each link corresponds to one IQ and one OQ queue, and each queue consists of three FIFOs, which respectively store data cells, Request/Grant cells, and BP cells.
In the switching unit chip provided by the embodiment of the present invention, the data queue engine submodules are data processing units inside the same chip, which is completely different from the prior art in which a small number of chips are physically pieced together into one switching unit. In the switching unit chip of the embodiment of the present invention, data cells can be evenly allocated, based on the distribution algorithm, among the data processing units inside the whole chip, without incurring the cost of distribution balancing and reordering and without affecting system performance.
The following describes the processing by which the switching unit chip of the embodiment of the present invention implements data cell switching.
FIG. 4 is a schematic diagram of reading IQ cells in the switching unit chip according to an embodiment of the present invention. As shown in FIG. 4, in each clock cycle, the switching unit chip, through its cell input module, sequentially reads at most 8 data cells from the 128 IQs in order. For example, in Clock Cycle 0, cells are read from the IQs numbered 0 to 7; in Clock Cycle 1, cells are read from the IQs numbered 8 to 15; and so on, cyclically. If control cells are buffered, the control cells in the cell information received through each input port may also be read within the same clock cycle, for example, one scheduling control cell (Request/Grant cell) of the upstream/downstream line cards and one system flow control cell (BP cell).
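The sliding read window over the 128 IQs can be sketched as follows (an illustrative sketch only; a real arbiter would also skip empty IQs, which this sketch ignores):

```python
def iq_read_window(cycle, num_iqs=128, cells_per_cycle=8):
    """Return the IQ indices polled in a given clock cycle: a window of
    8 consecutive IQs that advances each cycle and wraps around, so
    cycle 0 reads IQs 0-7, cycle 1 reads IQs 8-15, and so on."""
    start = (cycle * cells_per_cycle) % num_iqs
    return [(start + i) % num_iqs for i in range(cells_per_cycle)]
```

With the default parameters the pattern repeats every 16 cycles, matching the 128/8 rotation described above.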
The following describes the cell distribution algorithm for the data QEs of the switching unit chip according to an embodiment of the present invention.
The data structures of the 8 data QEs are exactly the same, consisting of unicast VOQs and multicast MVIQ queues. A cell entering a data QE enters the corresponding queue according to {unicast/multicast, priority, destination}. Since the structure and scheduling processing of these 8 data QEs are identical, their queue occupancy states are expected to be similar, so that they can operate in similar working states. Otherwise, if the occupancy states of some queues are severely unbalanced across the 8 QEs, some QE may have no data to schedule while another QE accumulates a large amount of data that cannot be scheduled in time, causing large delay and jitter of cells in the switching unit and severely affecting the scheduling performance of the switch element (Switch Element, SE).
What determines the distribution of cells among the 8 data QEs is the data QE distribution algorithm. To implement this distribution algorithm, the cell input module in the switching unit chip of the embodiment of the present invention may maintain two data tables: an information table containing the occupied length of each queue in each QE (hereinafter: the queue occupancy information table) and an information table containing the total buffer occupancy of each QE. FIG. 5 is a schematic diagram of the data tables in an embodiment of the present invention; as shown in FIG. 5, they include the occupancy length information table of each queue in the 8 QEs and the total buffer occupancy information table of the 8 QEs. The values recorded in the table containing the occupied length of each queue in each data queue engine submodule are the relative differences of the queue occupancies across the QEs. Table 1 shows the differences in the number of data cells stored in the VOQ0 queue of QE0 through QE7. Assuming that QE0, QE1, QE2, ..., QE7 actually store 97, 98, 99, ..., and 100 data cells respectively (the occupancy depth may also be expressed in bytes), the table records the differences between them, {0, 1, 2, ..., 3}; the number of bits this requires is much smaller than recording the actual lengths.
Table 1

         QE0    QE1    QE2    ...    QE7
VOQ0      0      1      2     ...     3

The allocation rule in the embodiments of the present invention includes: selecting, for each data cell and according to the two information tables above, the QE in which the cell's corresponding queue occupancy is the smallest and the total buffer occupancy is the smallest. For example, if the length of VOQ 0 is 30 in both QE0 and QE1, the total queue length of QE0 is 100, and the total queue length of QE1 is 90, then QE1 is selected. This avoids always preferring the QE with the smaller index when the queue occupancies are equal.
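For a single cell, this selection rule amounts to a lexicographic minimum over the two tables. A minimal sketch, in which the per-QE occupancy lists are assumed inputs:

```python
def pick_qe(voq_occupancy, total_occupancy):
    """Choose a data QE for one cell: smallest occupancy of the cell's
    mapped VOQ, ties broken by smallest total buffer occupancy (and then
    by lowest QE index, since min() keeps the first minimum)."""
    return min(range(len(voq_occupancy)),
               key=lambda q: (voq_occupancy[q], total_occupancy[q]))
```

With VOQ 0 at length 30 in both QE0 and QE1 and totals 100 versus 90, this returns QE1, matching the example above.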
The specific implementation of this distribution algorithm is as follows:
1. In each clock cycle, the cell input module receives at most 8 data cells, so it fetches the corresponding entries from the queue occupancy information table described above. At the end of this step, an information matrix table of at most 64 entries (8 QEs x 8 VOQs) is obtained, as shown in FIG. 6.
2. Based on the occupancy information of each cell's mapped VOQ across the 8 QEs, the distribution algorithm selects a QE for each cell in turn, starting from the first cell to be distributed. The "first cell to be distributed" is determined by simple round robin; for example, if in clock cycle N the first cell in order is designated as the "first cell to be distributed", then in clock cycle N+1 the second cell in order is designated as the "first cell to be distributed". The distribution algorithm selects the QE with the smallest occupancy according to the occupancy information of the cell's mapped VOQ. If the corresponding QE has already been selected in the current clock cycle, the QE with the next-smallest VOQ occupancy is selected in order, as shown in FIG. 7.
3. If multiple QEs have the same occupancy for the same VOQ, the QE with the smallest total buffer occupancy is selected in order, as shown in FIG. 8.
In this embodiment, the cell input module is further configured to update the above two information tables after distributing data cells to the QEs and after data cells are scheduled out of the QEs.
4. In each clock cycle, after the distribution of all data cells is completed, the queue occupancy information table and the QE total occupancy information table must be updated. As shown in FIG. 9, if in the current clock cycle one data cell of the same VOQ has been sent to each of QE3, QE4, and QE6, the table information is updated accordingly.
5. In each clock cycle, after data cells are scheduled out of a QE, the corresponding information table entries must also be modified. FIG. 10 shows the entry update after QE4 and QE7 each schedule a cell of the same VOQ out of the queue.
6. If the queue from which a QE dequeues a cell happens to be the one with the smallest current occupancy, then after the cell is dequeued, the queue occupancy of this QE remains 0, while the corresponding entries of the other QEs are each incremented by 1. As shown in FIG. 11, if QE3 schedules one cell out of the queue, the table entries are modified as shown in the figure.
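Steps 1 to 6 can be sketched together as follows. This is an illustrative Python sketch under stated assumptions, not the hardware implementation, and all names are assumptions: `distribute_batch` assigns one clock cycle's cells to QEs with the rotating start and the one-QE-per-cycle constraint of steps 2 and 3, and `update_on_dequeue` adjusts one VOQ row of the relative-difference table per steps 5 and 6, keeping the minimum entry at 0:

```python
def distribute_batch(cells, voq_occ, total_occ, start):
    """Distribute up to 8 cells of one clock cycle over the QEs.

    cells: mapped VOQ id per cell; voq_occ[voq][qe] and total_occ[qe]
    are the two information tables; start rotates which cell is served
    first. Each QE is selected at most once per cycle; ties on VOQ
    occupancy fall back to the smallest total buffer occupancy."""
    taken, result = set(), {}
    n = len(cells)
    for k in range(n):
        i = (start + k) % n                      # rotating "first cell"
        candidates = [q for q in range(len(total_occ)) if q not in taken]
        qe = min(candidates,
                 key=lambda q: (voq_occ[cells[i]][q], total_occ[q]))
        taken.add(qe)
        result[i] = qe
    return result

def update_on_dequeue(diffs, qe):
    """Update one VOQ row of the relative-difference table after `qe`
    dequeues a cell; entries are occupancies relative to the minimum,
    so the row never goes negative."""
    if diffs[qe] == 0:
        # dequeued from a minimum-occupancy QE: the others grow relatively
        return [0 if i == qe else d + 1 for i, d in enumerate(diffs)]
    return [d - 1 if i == qe else d for i, d in enumerate(diffs)]
```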
The switching unit chip provided by the embodiment of the present invention can solve the problem of line-rate scheduling of small packets; 64-byte cells can be processed at line rate in the switching fabric. The design uses the VOQ queue structure, which in principle eliminates head-of-line blocking of unicast traffic, and the well-performing data cell distribution algorithm guarantees state consistency among the different data QEs. FIG. 12 is a schematic diagram of QE queue scheduling according to an embodiment of the present invention. As shown in FIG. 12, in this embodiment the queues in each QE are dequeued using a three-level scheduling scheme, including: selecting the priority of the queue to be scheduled according to the absolute priority principle or the weighted round robin (WRR) principle; among the queues with the selected priority, selecting unicast or multicast scheduling according to the unicast/multicast weighted round robin principle; and selecting the output queue schedulable this round by simple round robin and sending the data cell to be scheduled to that output queue.
Specifically, first, the priority of the queue to be scheduled is determined. Under the absolute priority principle, i.e., strict priority scheduling, the highest priority currently schedulable is selected as the priority of this scheduling round; under WRR, the selection is made according to the current scheduling weight values of the priorities.
Second, within the determined scheduling priority, the unicast/multicast type of this scheduling round is determined: unicast or multicast scheduling is selected according to the unicast and multicast scheduling weight values.
Then, there may be multiple queues awaiting scheduling, and the cells in these queues correspond to multiple output queue (OQ) exits; one OQ is selected by simple round robin, and the queue awaiting scheduling that corresponds to this OQ is determined.
Finally, the cell in the queue to be scheduled is sent to the selected OQ, and the corresponding scheduling parameters are modified (the scheduling parameters here are the scheduler's own parameters, such as the record of the simple round robin over the OQs and the states of the queues awaiting scheduling (whether cells remain), whereas the contents of the data tables for 8-QE distribution described earlier are modified when cells enter and are scheduled out of their queues; the data tables and the scheduler parameters in the 8 QEs are separate items), completing this round of queue scheduling.
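One dequeue decision under this three-level scheme might be sketched as follows. This is illustrative only: level 1 is shown as strict priority, and the WRR levels are reduced to a simple credit comparison, which is an assumption rather than the patented scheduler:

```python
def three_level_dequeue(nonempty, cast_credit, oq_rr_ptr, num_oqs):
    """One dequeue decision in a data QE (illustrative sketch).

    nonempty: {(priority, cast): set of schedulable OQ ids}, with cast
    in {"unicast", "multicast"}; a lower priority value is higher
    priority. Level 1: strict priority; level 2: unicast vs. multicast
    by remaining WRR credit; level 3: round robin over candidate OQs."""
    for prio in sorted({p for p, _ in nonempty}):
        casts = [c for c in ("unicast", "multicast") if nonempty.get((prio, c))]
        if not casts:
            continue
        # WRR between unicast and multicast: pick the cast with more credit
        cast = max(casts, key=lambda c: cast_credit[c])
        oqs = nonempty[(prio, cast)]
        # simple round robin over OQs, starting after the last served one
        for step in range(1, num_oqs + 1):
            oq = (oq_rr_ptr + step) % num_oqs
            if oq in oqs:
                return prio, cast, oq
    return None  # nothing schedulable this cycle
```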
The following describes the OQ Group design in the switching unit chip provided by the embodiment of the present invention. The OQs (Output Queues) corresponding to the 128 Serdes are divided into 8 groups, i.e., OQ Groups, and the 16 Serdes links in each group connect to different chips of the next stage. Each OQ Group is configured using the following mapping method:
If the number of downstream chips is no greater than 16, each OQ Group connects to all downstream chips;
If the number of downstream chips exceeds 16, the mapping is split into planes by parity or other means. For example, when there are 32 SE2s, Group0/2 of the SE1 chip connects to the SE2s with even ID numbers, and Group1/3 of SE1 connects to the SE2s with odd ID numbers.
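The parity-split mapping can be sketched as follows (illustrative only; `group_id` and `downstream_ids` are assumed parameters, and the even/odd pairing shown is the example variant, one of the possible plane splits):

```python
def oq_group_targets(group_id, downstream_ids):
    """Downstream chips reachable from one OQ Group.

    With at most 16 downstream chips every group reaches all of them;
    otherwise even-numbered groups serve even chip IDs and odd-numbered
    groups serve odd chip IDs (the parity split in the example above)."""
    if len(downstream_ids) <= 16:
        return list(downstream_ids)
    return [i for i in downstream_ids if i % 2 == group_id % 2]
```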
The following describes the cooperation between the OQ Groups and the QEs in the switching unit chip provided by the embodiment of the present invention. To avoid scheduling multiple cells into the same OQ, in any clock cycle each data QE corresponds to exactly one OQ Group; that is, the data of a QE can only be scheduled into the OQs of its corresponding OQ Group. For example, the rotation may proceed in the manner shown in FIG. 13.
FIG. 14 is a schematic diagram of a three-stage switching network system using the switching unit chip according to an embodiment of the present invention. As shown in FIG. 14, the three-stage switching network system in the figure adopts the switching unit architecture design of the above embodiments; that is, SE13 (logically divided into SE1 and SE3 in the figure) and SE2 are the switching unit chips proposed by the present invention. In each switching unit chip, data cells are taken out of the IQs and distributed to the different data QEs, enter the corresponding VOQ/MVIQ in the data QEs, and are finally scheduled by the queue scheduling into the corresponding OQs. Control cells are taken out of the IQs in each switching unit chip and sent into the control QE, then scheduled according to the priorities of the flow control cells and sent into the corresponding OQs.
An embodiment of the present invention further provides a router, in which the switching unit chip may be the switching unit chip provided by the above embodiments; its structure and functions are not repeated here. The switching unit chip provided by the embodiments of the present invention is also applicable to single-stage and back-to-back switching network structures.
The switching unit chip and router provided by the embodiments of the present invention guarantee by design line-rate processing of 64-byte small packets in the switching unit chip; the queue scheduling process is optimized by design, the OQ number being determined at the same time the schedulability of a queue is checked, so that schedulability checking and scheduling are merged and only simple round robin scheduling over the OQs is required; and head-of-line blocking (Head of Line blocking; HOL) of unicast traffic is eliminated by design.
An embodiment of the present invention further provides a method for sending cell information, including the following steps:
A cell input module having multiple input ports in a switching unit chip buffers the cell information received through each input port, allocates the data cells in the buffered cell information according to an allocation rule, and sends the data cells read from the buffer to a queue engine module in the switching unit chip according to the allocation result; the queue engine module is connected to the cell input module and includes multiple independent data queue engine submodules.
Each data queue engine submodule in the switching unit chip receives the data cells sent by the cell input module according to the allocation rule and stores the data cells in the corresponding queues in the data queue engine submodule.
A cell output module having multiple output ports in the switching unit chip schedules the data cells stored in the multiple data queue engine submodules according to a scheduling rule and sends them out through the corresponding output ports.
Further, the method for sending cell information provided by this embodiment may further include the step in which a control queue engine submodule included in the queue engine module receives the control cells in the cell information received by the cell input module through each input port and sends them to the cell output module.
The switching unit chip involved in the method for sending cell information provided by this embodiment may be the switching unit chip provided by the above switching unit chip embodiments; for its structure and functions, reference may be made to the above embodiments, which are not repeated here. For the operation steps included in the method for sending cell information provided by this embodiment, reference may likewise be made to the processing steps mentioned in the above embodiments, which are also not repeated here. Those of ordinary skill in the art can understand that all or part of the steps of the above method embodiments may be implemented by program instructions controlling the relevant hardware. The aforementioned program may be stored in a computer-readable storage medium; when the program is executed, the steps of the above method embodiments are performed; and the aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features therein, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A switching unit chip, comprising:
a cell input module having multiple input ports, configured to buffer cell information received through each input port, to allocate data cells in the buffered cell information according to an allocation rule, and to send the data cells read from the buffer to a queue engine module according to the allocation result;
the queue engine module, connected to the cell input module and comprising multiple independent data queue engine submodules, each data queue engine submodule being configured to receive the data cells sent by the cell input module according to the allocation rule and to store the data cells in corresponding queues in the data queue engine submodule; and
a cell output module having multiple output ports, configured to schedule the data cells stored in the multiple data queue engine submodules according to a scheduling rule and to send them out through the corresponding output ports.
2. The switching unit chip according to claim 1, wherein the queue engine module further comprises: a control queue engine submodule, configured to receive control cells in the cell information received by the cell input module through each input port and to send them to the cell output module.
3. The switching unit chip according to claim 1 or 2, wherein the cell input module comprises: a cell receiving unit, configured to receive the cell information through each input port and to buffer the cell information; a cell mapping unit, configured to establish, according to the allocation rule, an allocation mapping between the data cells in the cell information buffered by the cell receiving unit and the data queue engine submodules in the queue engine module; and
a cell sending unit, configured to send, according to the allocation mapping established by the cell mapping unit, the data cells read from the buffer to the corresponding data queue engine submodule in the queue engine module.
4. The switching unit chip according to claim 2, wherein the control queue engine submodule comprises:
a first first-in-first-out queue, configured to store scheduling control cells among the control cells;
a second first-in-first-out queue, configured to store global flow control cells among the control cells; and
a third first-in-first-out queue, configured to store queue flow control cells among the control cells.
5. The switching unit chip according to claim 1, 2, or 4, wherein each of the data queue engine submodules comprises multiple unicast queues and multiple multicast queues.
6. The switching unit chip according to claim 5, wherein the numbers of the input ports and the output ports are each 128, the number of the data queue engine submodules is 8, and each data queue engine submodule comprises 512 unicast queues and 256 multicast queues.
7. The switching unit chip according to claim 1, 2, 4, or 6, wherein the cell input module is further configured to read, in each clock cycle, 8 data cells in order from the multiple buffered data cells.
8. The switching unit chip according to claim 7, wherein the cell input module is further configured to read, in each clock cycle, control cells in the cell information received through each input port.
9. The switching unit chip according to claim 1, 2, 4, 6, or 8, wherein the cell input module further maintains an information table containing the occupied length of each queue in each data queue engine submodule and an information table containing the total buffer occupancy of each data queue engine submodule.
10. The switching unit chip according to claim 9, wherein the values recorded in the information table containing the occupied length of each queue in each data queue engine submodule are the relative differences of the queue occupancies among the data queue engine submodules.
11. The switching unit chip according to claim 9, wherein the allocation rule comprises: selecting, for each data cell and according to the two information tables, the data queue engine submodule in which the corresponding queue occupancy is the smallest and the total buffer occupancy is the smallest.
12. The switching unit chip according to claim 11, wherein the allocation rule further comprises: if, within the same clock cycle, the data queue engine submodule determined according to the allocation rule has already been selected, selecting in order the data queue engine submodule whose corresponding queue occupancy is the next smallest.
13. The switching unit chip according to claim 9, wherein the cell input module is further configured to update the information table containing the occupied length of each queue in each data queue engine submodule and the information table containing the total buffer occupancy of each data queue engine submodule after distributing data cells to the data queue engine submodules and after data cells are scheduled out of the data queue engine submodules.
14. The switching unit chip according to claim 1, 2, 4, 6, or 8, wherein the scheduling rule is a three-level dequeue scheduling scheme, comprising:
selecting the priority of the queue to be scheduled according to an absolute priority principle or a weighted round robin principle;
among the queues with the selected priority, selecting unicast or multicast scheduling according to a unicast/multicast weighted round robin principle; and
selecting the currently schedulable output queue by simple round robin, and sending the data cell to be scheduled to that output queue.
15. The switching unit chip according to claim 14, comprising 8 groups of the output queues, each group of output queues being configured using the following mapping method: if the number of downstream chips is no greater than 16, the output queues connect to all downstream chips;
if the number of downstream chips is greater than 16, the 8 groups of output queues are mapped into planes split by parity.
16. The switching unit chip according to claim 14, wherein, in any clock cycle, each data queue engine submodule corresponds to only one group of output queues.
17. A router, comprising the switching unit chip according to any one of claims 1 to 16.
18. A method for sending cell information, comprising:
buffering, by a cell input module having multiple input ports in a switching unit chip, cell information received through each input port, allocating data cells in the buffered cell information according to an allocation rule, and sending the data cells read from the buffer to a queue engine module in the switching unit chip according to the allocation result, the queue engine module being connected to the cell input module and comprising multiple independent data queue engine submodules;
receiving, by each data queue engine submodule in the switching unit chip, the data cells sent by the cell input module according to the allocation rule, and storing the data cells in corresponding queues in the data queue engine submodule; and
scheduling, by a cell output module having multiple output ports in the switching unit chip, the data cells stored in the multiple data queue engine submodules according to a scheduling rule, and sending them out through the corresponding output ports.
19. The method for sending cell information according to claim 18, further comprising: receiving, by a control queue engine submodule included in the queue engine module, control cells in the cell information received by the cell input module through each input port, and sending them to the cell output module.
PCT/CN2012/071845 2011-03-02 2012-03-02 Switching unit chip, router, and method for transmitting cell information WO2012116655A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201110050100.1A CN102088412B (zh) 2011-03-02 2011-03-02 Switching unit chip, router, and method for transmitting cell information
CN201110050100.1 2011-03-02

Publications (1)

Publication Number Publication Date
WO2012116655A1 true WO2012116655A1 (zh) 2012-09-07

Family

ID=44100031

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/071845 WO2012116655A1 (zh) 2011-03-02 2012-03-02 Switching unit chip, router, and method for transmitting cell information

Country Status (2)

Country Link
CN (1) CN102088412B (zh)
WO (1) WO2012116655A1 (zh)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10750023B2 (en) 2008-01-28 2020-08-18 Afiniti Europe Technologies Limited Techniques for hybrid behavioral pairing in a contact center system
CN102088412B (zh) 2011-03-02 2014-09-03 Huawei Technologies Co., Ltd. Switching unit chip, router, and method for transmitting cell information
CA3004240C (en) * 2016-04-18 2019-12-31 Afiniti Europe Technologies Limited Techniques for benchmarking pairing strategies in a contact center system
CN109391559B (zh) * 2017-08-10 2022-10-18 Huawei Technologies Co., Ltd. Network device
CN109802896B (zh) * 2017-11-16 2022-04-22 Huawei Technologies Co., Ltd. Method for scheduling data and switching device
US10686714B2 (en) * 2018-04-27 2020-06-16 Avago Technologies International Sales Pte. Limited Traffic management for high-bandwidth switching
CN110809033B (zh) * 2019-10-23 2022-07-12 New H3C Security Technologies Co., Ltd. Packet forwarding method, apparatus, and switching server
WO2021146964A1 (zh) * 2020-01-21 2021-07-29 Huawei Technologies Co., Ltd. Switching network chip and switching device
CN111522643A (zh) * 2020-04-22 2020-08-11 Hangzhou DPtech Technologies Co., Ltd. FPGA-based multi-queue scheduling method and apparatus, computer device, and storage medium
CN113179226B (zh) * 2021-03-31 2022-03-29 New H3C Security Technologies Co., Ltd. Queue scheduling method and apparatus
CN114257557B (zh) * 2021-11-26 2023-04-11 Institute of Computing Technology, Chinese Academy of Sciences Data packet switching system and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1694435A (zh) * 2004-05-04 2005-11-09 Alcatel Frame-to-cell traffic scheduling
CN101035067A (zh) * 2007-01-25 2007-09-12 Huawei Technologies Co., Ltd. Method and apparatus for implementing flow control based on output queues
CN101567855A (zh) * 2009-06-11 2009-10-28 Hangzhou H3C Technologies Co., Ltd. Distributed packet switching system and distributed packet switching method
CN102088412A (zh) * 2011-03-02 2011-06-08 Huawei Technologies Co., Ltd. Switching unit chip, router, and method for transmitting cell information

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1184777C (zh) * 2002-04-17 2005-01-12 Huawei Technologies Co., Ltd. Method for managing and allocating buffers during data transmission by an Ethernet switching chip

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1694435A (zh) * 2004-05-04 2005-11-09 Alcatel Frame-to-cell traffic scheduling
CN101035067A (zh) * 2007-01-25 2007-09-12 Huawei Technologies Co., Ltd. Method and apparatus for implementing flow control based on output queues
CN101567855A (zh) * 2009-06-11 2009-10-28 Hangzhou H3C Technologies Co., Ltd. Distributed packet switching system and distributed packet switching method
CN102088412A (zh) * 2011-03-02 2011-06-08 Huawei Technologies Co., Ltd. Switching unit chip, router, and method for transmitting cell information

Also Published As

Publication number Publication date
CN102088412B (zh) 2014-09-03
CN102088412A (zh) 2011-06-08

Similar Documents

Publication Publication Date Title
WO2012116655A1 (zh) Switching unit chip, router, and method for transmitting cell information
EP1779607B1 (en) Network interconnect crosspoint switching architecture and method
TWI477109B (zh) Traffic manager and method for a traffic manager
US8917740B2 (en) Channel service manager
US7391786B1 (en) Centralized memory based packet switching system and method
US6754222B1 (en) Packet switching apparatus and method in data network
US20030026205A1 (en) Packet input thresholding for resource distribution in a network switch
US7859999B1 (en) Memory load balancing for single stream multicast
CN109861931B (zh) 一种高速以太网交换芯片的存储冗余系统
US11677676B1 (en) Shared traffic manager
CN1866927A (zh) System and method for implementing information exchange, and scheduling algorithm
CN101695051A (zh) Queue-length balancing scheduling method for a buffered Crossbar
US20080273546A1 (en) Data switch and a method of switching
CN114531488B (zh) Efficient buffer management system for Ethernet switches
US8040907B2 (en) Switching method
US10846225B1 (en) Buffer read optimizations in a network device
CN113110943B (zh) Software-defined switching fabric and data switching method based on the fabric
US10742558B1 (en) Traffic manager resource sharing
Lin et al. Two-stage fair queuing using budget round-robin
US9225672B1 (en) Systems and methods for packet grouping in networks
WO2023130835A1 (zh) Data exchange method and apparatus
CN110430146A (zh) Cell reassembly method based on CrossBar switching and switching fabric
WO2022160307A1 (zh) Router and system on chip
WO2023202294A1 (zh) Data flow order-preserving method, data exchange apparatus, and network
CN1728682A (zh) Switching system and switching method based on variable-length packets

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12752010

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12752010

Country of ref document: EP

Kind code of ref document: A1