WO2019214801A1 - Memory device for a high bandwidth high capacity switch - Google Patents

Memory device for a high bandwidth high capacity switch

Info

Publication number
WO2019214801A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
write
pipe
memory blocks
packet
Application number
PCT/EP2018/061671
Other languages
French (fr)
Inventor
Rami Zecharia
Itzhak Barak
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/EP2018/061671
Priority to CN201880093306.2A (published as CN112088521A)
Publication of WO2019214801A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00: Packet switching elements
    • H04L 49/30: Peripheral units, e.g. input or output ports
    • H04L 49/3036: Shared queuing


Abstract

The present invention provides a shared memory device for a high bandwidth, high capacity switch. The memory device comprises a plurality of memory blocks. Further, the memory device comprises a plurality of ingress pipes, wherein each ingress pipe is configured to request a write of a packet to the memory blocks. The memory device also comprises a plurality of egress pipes, wherein each egress pipe is configured to request a read of a packet from the memory blocks. Thereby, each egress pipe is associated with (soft allocated to) a set of the memory blocks.

Description

MEMORY DEVICE FOR A HIGH BANDWIDTH HIGH CAPACITY SWITCH
TECHNICAL FIELD
The present invention relates to the usage of high capacity and high bandwidth switches in networking systems. In particular, the present invention relates to a memory device for such a switch, and to a switch including the memory device. The memory device provides an implementation of a new memory block allocation scheme for ingress pipes and egress pipes, respectively. The new allocation scheme is particularly suited for providing a shared memory architecture with multiple memory blocks. The present invention further relates to a corresponding control method for a memory device.
BACKGROUND
A conventional switch usually contains multiple bi-directional ports. Arriving traffic from the input ports, typically Ethernet packets, is directed to the output ports based on decisions made within the switch.
A port is typically characterized by its rate, which is usually the same for input and for output. For instance, a 100 Gbps (100 gigabits per second) port is capable of receiving traffic at a rate of 100 Gbps and sending traffic at a rate of 100 Gbps.
The conventional switch further contains a memory to temporarily hold incoming traffic, before the traffic is sent out to the output ports. There are many reasons why traffic is to be held within the switch, such as:
1. Multiple input ports may receive traffic that is directed to a single output port (many-to-one). If the output port does not have the capacity to deliver all received traffic, then some of the received traffic to that output port must be stored temporarily.
2. Back-pressure given from outside the switch to an output port may prevent further outgoing traffic on that output port. Therefore, all received traffic that is directed to this port must be stored temporarily.
3. A scheduling rate of an output port is a parameter within the switch, which may be configured to limit the output rate of a certain port. Therefore, received traffic directed to this port must be stored temporarily.
The memory in a high capacity and high bandwidth switch is typically built as a shared memory. That is, it is shared among all output ports. Share-ability has the main benefit of requiring less total memory than a dedicated memory per output port.
A simplified architectural view of a conventional switch is shown in FIG. 7. Arriving traffic from the different input ports goes through a classification engine configured to select an output port and a corresponding queue, in order to store the received traffic. The classification engine also decides on any editing that may be required on the arriving traffic. After classification, the traffic is stored in a memory to temporarily buffer the received traffic. The memory is virtually arranged in and managed as queues. Queues, buffer management and scheduling to the output ports are managed by the control logic. Queuing can follow any method, such as input queues, output queues, virtual output queues (VOQs), etc.
In a conventional high capacity, high bandwidth switch, the memory architecture is typically shared among all output ports. That means received traffic from any input port directed to any output port can be written to this shared memory. Algorithms exist for managing the memory per output port. The switch is typically built on a single piece of silicon, i.e. it is a single device, such that all high-speed accesses to the shared memory are confined to the inside of the device without external interfaces. External interfaces could make it impractical to build the switch as a single device.
Disadvantageously, the shared memory architecture of the conventional switch has limitations when used specifically for high capacity, high bandwidth switches.
Firstly, a single switch with N ports, where each port supports a bandwidth B (e.g. 10 Gbps, 100 Gbps, etc.), must support a write bandwidth of N*B and a read bandwidth of N*B from the shared memory in the worst case, i.e. when the switch is fully utilized, that is, at 100% load. For example, a 64-port switch, where each port supports 100 Gbps, must support 6.4 Tbps read bandwidth and 6.4 Tbps write bandwidth from the shared memory at worst-case traffic.
Secondly, since the network traffic consists of packets of variable size (e.g. the size of an Ethernet packet can be from 64 bytes up to 9 KB), and since the shared memory in the switch is built with a fixed width (each memory location contains C bytes), each arriving packet must be segmented into chunks of C bytes to be written to the shared memory. The last chunk of data can have a size of between 1 byte and C bytes. In the worst case, if the arriving traffic contains a stream of packets of size (C+1) bytes, each packet must be written to the shared memory in 2 locations and read from the memory from 2 locations when it is to be sent to an output port. This scenario doubles the bandwidth required from the memory, to at most 2*N*B for read and 2*N*B for write.
Thirdly, in such a high bandwidth, high capacity switch, the buffering shared memory is typically implemented inside the switch. The shared memory is usually built of single-port memory blocks, in order to conserve silicon area, and not of dual-port memory blocks, which are double in size relative to single-port memory blocks. A single-port memory block can perform either 1 read per clock or 1 write per clock, but not both. This means that a shared memory built of single-port memory blocks must again double its bandwidth capacity to support N*B bandwidth for read and for write at the same time.
Fourthly, due to physical limitations on the operating frequency, it is not possible to support such high bandwidth requirements with a single block of single-port memory. For example, a 64-port switch with 100 Gbps ports must support 6.4 Tbps read bandwidth and 6.4 Tbps write bandwidth to the shared memory, without taking into account the speedup required for the segmentation of packets to fixed size C. For a 64-byte segment size, which is also the memory width, a single-port memory would have to operate at 25 GHz to sustain the required 12.8 Tbps. When the required bandwidth is doubled, in order to support a stream of packets of size C+1, the frequency is even higher. Increasing C to get a wider memory has implications for the memory speed and the silicon area (of the memory and of the logic needed to support such a big C).
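This arithmetic can be checked with a few lines of Python; the values below are simply the example figures quoted above (64 ports, 100 Gbps, 64-byte segments), not parameters fixed by the invention:

```python
import math

N = 64        # ports (example value from the text)
B = 100e9     # port rate in bits per second (100 Gbps)
C = 64        # segment size and memory width in bytes

rw_bandwidth = 2 * N * B          # read + write bandwidth: 12.8 Tbps
word_bits = C * 8                 # one 64-byte memory word = 512 bits
print(rw_bandwidth / word_bits)   # -> 2.5e10, i.e. a 25 GHz clock required

# Worst-case segmentation: a (C+1)-byte packet occupies two C-byte
# locations, i.e. a further 2x speedup on top of the figure above.
print(math.ceil((C + 1) / C))     # -> 2
```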
To resolve the above-described bandwidth issues of the shared memory architecture, multiple blocks of single-port memories are typically used instead of one memory block. Specifically, there are two steps in using the multiple memory blocks:
1. A group of ports is bundled into a pipe, such that at the operating frequency and under the worst-case traffic scenario (100% load and segmentation), the selected segment size C is sufficient to provide C bytes every clock cycle without creating any bottlenecks.
2. M memory blocks are connected in parallel, such that each pipe can access a memory block for read or for write.
FIG. 8 shows such an architecture. On every clock cycle, each input pipe (also named ingress pipe or pipeline) can request a write to the shared memory and each output pipe (also named egress pipe or pipeline) can request a read from the shared memory. For proper operation, for a given number of pipes P, the number of memory blocks M should be at least 2P memory blocks, so that P reads and P writes can occur at the same clock cycle (at the same time).
The 'queueing engine and control' block accepts write requests from all ingress pipes every clock cycle and read requests from all egress pipes every clock cycle. It then decides which egress pipe performs a read and from which memory block, and which ingress pipe performs a write and to which memory block.
The sequence of reads from the egress pipes depends on scheduling algorithms, which are independent of the placement of the packets in the memory blocks and independent of the scheduling decisions of other egress pipes. If two or more egress pipes request to read from the same memory block, only one egress pipe is granted the read, while the rest wait and do not perform a read in this clock cycle. This scenario is called a collision. Reads have to be in sequence, since the segments of a packet have to be transmitted in the order of the original packet; in typical implementations, reads therefore cannot be out of order.

A combined read and write process for the architecture shown in FIG. 8 is now described. Every clock cycle, the control logic accepts W requests for write from the ingress pipes and R requests for read from the egress pipes (not all pipes request a read or write every clock cycle). The control logic then selects memory blocks for read and for write based on the requests as follows:

1. Select memory blocks for read.
   a. Perform maximal matching between the egress pipes requesting reads (at most P) and the memory blocks (M).
   b. Note that, due to possible collisions, not all read requests can be served in the same clock cycle.
   c. Set matching pairs {egress pipe, memory block}.
2. Select memory blocks for write.
   a. Make a list M' of the memory blocks that are available for write, i.e. any memory block that is not selected for read and is not full.
   b. Out of the M' list, select W memory blocks and attach each to an ingress pipe that has a valid write request. Memory blocks are selected using one of the following mechanisms:
      i. Round-robin order between all available memory blocks.
      ii. Ordering the memory blocks by occupancy level from least to most occupied, and selecting the W least-occupied memory blocks.
   c. Set matching pairs {ingress pipe, memory block}.
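A minimal Python sketch of one such arbitration cycle may look as follows; all names and data structures are illustrative, not prescribed by the text:

```python
def select_blocks(read_reqs, write_reqs, occupancy, capacity):
    """One arbitration cycle of the conventional scheme (sketch).

    read_reqs:  dict egress_pipe -> memory block holding its next segment
    write_reqs: list of ingress pipes that have a valid write request
    occupancy:  occupancy[m] = number of used entries in memory block m
    capacity:   number of entries per memory block
    """
    # 1. Reads: each egress pipe needs one specific block (its next
    #    in-order segment), so the matching grants at most one pipe per
    #    block; the losers collide and must retry in a later cycle.
    granted_reads, busy = {}, set()
    for egress, block in read_reqs.items():
        if block not in busy:
            granted_reads[egress] = block
            busy.add(block)

    # 2. Writes: the M' list is every block neither read this cycle nor
    #    full; least-occupied-first shown here (round-robin or random
    #    selection are the alternatives named above).
    m_prime = sorted((m for m in range(len(occupancy))
                      if m not in busy and occupancy[m] < capacity),
                     key=lambda m: occupancy[m])
    granted_writes = dict(zip(write_reqs, m_prime))
    return granted_reads, granted_writes
```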
In the conventional shared memory architecture using multiple memory blocks, the read bandwidth is disadvantageously reduced due to possible collisions of read requests from different egress pipes on the same memory block.
Note that the read requests from the egress pipes are calculated at each egress pipe with no regard to the requests from other egress pipes. Therefore, it is possible that two or more egress pipes will request a read from the same memory block at the same time. In fact, the probability of a collision increases with the number of pipes for a given number of memory blocks, and decreases with the number of memory blocks for a given number of pipes.
The following equation calculates the probability of no collision at all:

$$P(\text{no collision}) = \prod_{i=0}^{P-1} \frac{M-i}{M} = \frac{M!}{(M-P)!\, M^P}$$

where M is the number of memory blocks and P is the number of egress pipes requesting reads. Therefore, the probability of any collision is:

$$P(\text{collision}) = 1 - \frac{M!}{(M-P)!\, M^P}$$
For example, with 32 memory blocks:

• 4 egress pipes: P(collision) = 0.177
• 8 egress pipes: P(collision) = 0.614
• 12 egress pipes: P(collision) = 0.857
• 16 egress pipes: P(collision) = 0.990
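These figures can be checked numerically, assuming the reconstructed equation above (the standard birthday-problem bound for P independent uniform choices among M blocks). Note that this formula reproduces the quoted 0.177, 0.614 and 0.990 exactly, while the 12-pipe case evaluates to about 0.906:

```python
from math import prod

def p_collision(M: int, P: int) -> float:
    """Probability that at least two of P egress pipes, each reading from
    a uniformly random one of M memory blocks, target the same block."""
    return 1.0 - prod((M - i) / M for i in range(P))

for pipes in (4, 8, 12, 16):
    print(f"M=32, P={pipes:2d}: P(collision) = {p_collision(32, pipes):.3f}")
# -> 0.177, 0.614, 0.906, 0.990 (the text quotes 0.857 for 12 pipes)
```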
Accordingly, there is a high probability of collision in the conventional shared memory architecture. Thus, at the maximum possible capacity, at the worst-case traffic scenario, and with the worst-case collision pattern (for example, all reads in the same clock cycle are requested from one specific memory block, so that the last pipe reads only after waiting P-1 clock cycles), the shared memory bandwidth is too low to support the outgoing traffic.
Multiple mechanisms have been suggested to reduce the increase in latency and the reduction of bandwidth caused by the read collision phenomenon, for instance:
1. Input queues (hard allocation of memory blocks to input pipes): This mechanism shows good memory utilization and simpler logic. However, it produces a high collision rate on the read side and can cause head-of-line blocking.
2. Output queues (hard allocation of memory blocks to output pipes): This mechanism has simplified logic that eliminates collisions on read, but suffers from poor memory utilization and no share-ability of the memory.
3. Increasing the number of memory blocks relative to the number of pipes: This mechanism increases the silicon area of the memory blocks and of the logic (multiplexers and de-multiplexers).
4. Allowing out-of-order reads of segments and reordering them at each egress pipe: With this mechanism, the latency problem is not resolved, as transmission cannot start before the in-order segment has been read.
5. Using dual-port memory blocks instead of single-port ones: With this mechanism, the silicon area of the memory blocks is doubled for the same total shared memory size.
SUMMARY
In view of the above-mentioned disadvantages, the present invention aims to improve the conventional shared memory architecture and the suggested mechanisms. The present invention particularly has the objective to provide a shared memory architecture with improved latency and throughput. Thereby, a goal of the invention is to significantly reduce the probability of read collisions, so that the shared memory bandwidth does not have to be increased. Optimally, read collisions are even eliminated completely.
The objective of the present invention is achieved by the solution provided in the enclosed independent claims. Advantageous implementations of the present invention are further defined in the dependent claims.
A first aspect of the present invention provides a memory device for a switch, the memory device comprising a plurality of memory blocks, a plurality of ingress pipes, each ingress pipe being configured to request a write of a packet to the memory blocks, and a plurality of egress pipes, each egress pipe being configured to request a read of a packet from the memory blocks, wherein each egress pipe is associated with a set of the memory blocks.
A "set" is a group of memory blocks that are associated with a single egress pipe. "Associated" means that these memory blocks are the preferred ones to be used for a write destined to the associated egress pipe, and are the preferred ones to be used for a read by the associated egress pipe. However, if e.g. more writes are needed at the same time, the rest of the writes may also be allocated to other memory blocks, and thus may also be read from these other memory blocks by the egress pipe.
By associating the egress pipes with sets of memory blocks, the writing of the arriving packets is controlled by the memory device, and thus the probability of read collisions is at least significantly reduced. As a consequence, the probability of high latency and low throughput is also reduced, since these drawbacks often result from read collisions on the same memory block by different egress pipes. The memory device of the first aspect thus enables an improved switch with higher bandwidth and higher capacity.

In an implementation form of the first aspect, the sets are disjoint sets.
This leads to the lowest possible probability of collisions, potentially even to no collisions at all, thus optimally reducing the latency and increasing the throughput.
In a further implementation form of the first aspect, each set includes a same number of memory blocks.
This allows an efficient implementation of the memory device as shared memory architecture of the switch.
In a further implementation form of the first aspect, the memory device comprises a controller configured to select, for an egress pipe requesting a read of a packet, a memory block from the set associated with said egress pipe for the read of the packet.
In a further implementation form of the first aspect, the controller is further configured to select, for any ingress pipe requesting a write of a packet destined to a determined egress pipe, a memory block from the set associated with the determined egress pipe for the write of the packet.
The controller may be a processor. The controller is particularly able to implement the "soft allocation" (i.e. the association) of memory blocks and egress pipes, in order to reduce the read collision probability.
In a further implementation form of the first aspect, the controller is further configured to exclude one or more full memory blocks when selecting the memory block from the set associated with the determined egress pipe for the write of the packet. This further improves the memory block allocation efficiency.
In a further implementation form of the first aspect, if a total number of write requests for packets destined to the determined egress pipe is smaller than a number of write-permitted memory blocks in the set associated with the determined egress pipe, the controller is further configured to select a memory block from the write-permitted memory blocks in the set associated with the determined egress pipe based on a minimum occupancy or randomly.
This reduces the number of selections to be made by the controller, thus increasing the efficiency of the allocation and reducing the probability of collisions.

In a further implementation form of the first aspect, if a total number of write requests for packets destined to the determined egress pipe is larger than a number of write-permitted memory blocks in the set associated with the determined egress pipe, the controller is further configured to create a list of all ingress pipes having an unfulfilled write request. This allows the controller to monitor all unfulfilled write requests for a more efficient processing of all requests.
In a further implementation form of the first aspect, if a total number of write requests for packets destined to the determined egress pipe is larger than a number of write-permitted memory blocks in the set associated with the determined egress pipe, the controller is further configured to select the memory block from another set not associated with the determined egress pipe for the write of the packet, in particular allocate a remaining available memory block to each unfulfilled write request.
This ensures that all requests are fulfilled in the end, while at the same time the probability of collisions is kept as low as possible.

In a further implementation form of the first aspect, the controller is configured to select, for the determined egress pipe requesting a read of a packet, a memory block from another set not associated with said determined egress pipe for the read of the packet.
This ensures that each packet arrives at the correct destination.
In a further implementation form of the first aspect, the controller is further configured to select the remaining available memory blocks for the unfulfilled write requests based on a minimum occupancy or randomly.
This implementation form can further increase the efficiency of the memory block allocation.
In a further implementation form of the first aspect, the controller is further configured to associate the egress pipes with the sets of the memory blocks. The controller thus has complete control over the memory device. The controller may also be able to change the association of egress pipes with sets of memory blocks if needed.
A second aspect of the present invention provides a switch for packet switching, the switch comprising a memory device according to the first aspect or any of its implementation forms.

In an implementation form of the second aspect, the switch comprises a plurality of input ports and a plurality of output ports, wherein each ingress pipe is associated with a group of the input ports and each egress pipe is associated with a group of the output ports.
A third aspect of the present invention provides a method for controlling a memory device including a plurality of memory blocks, ingress pipes and egress pipes, the method comprising selecting, for an egress pipe requesting a read of a packet, a memory block from a set of memory blocks associated with said egress pipe for the read of the packet, and/or selecting, for any ingress pipe requesting a write of a packet destined to a determined egress pipe, a memory block from a set of memory blocks associated with the determined egress pipe for the write of the packet.
In an implementation form of the third aspect, the sets are disjoint sets.
In a further implementation form of the third aspect, each set includes a same number of memory blocks.
In a further implementation form of the third aspect, the method comprises selecting, for an egress pipe requesting a read of a packet, a memory block from the set associated with said egress pipe for the read of the packet.
In a further implementation form of the third aspect, the method further comprises selecting, for any ingress pipe requesting a write of a packet destined to a determined egress pipe, a memory block from the set associated with the determined egress pipe for the write of the packet.
In a further implementation form of the third aspect, the method further comprises excluding one or more full memory blocks when selecting the memory block from the set associated with the determined egress pipe for the write of the packet.
In a further implementation form of the third aspect, if a total number of write requests for packets destined to the determined egress pipe is smaller than a number of write-permitted memory blocks in the set associated with the determined egress pipe, the method comprises selecting a memory block from the write-permitted memory blocks in the set associated with the determined egress pipe based on a minimum occupancy or randomly.
In a further implementation form of the third aspect, if a total number of write requests for packets destined to the determined egress pipe is larger than a number of write-permitted memory blocks in the set associated with the determined egress pipe, the method comprises creating a list of all ingress pipes having an unfulfilled write request.
In a further implementation form of the third aspect, if a total number of write requests for packets destined to the determined egress pipe is larger than a number of write-permitted memory blocks in the set associated with the determined egress pipe, the method comprises selecting the memory block from another set not associated with the determined egress pipe for the write of the packet, in particular allocating a remaining available memory block to each unfulfilled write request.
In a further implementation form of the third aspect, the method comprises selecting, for the determined egress pipe requesting a read of a packet, a memory block from another set not associated with said determined egress pipe for the read of the packet.
In a further implementation form of the third aspect, the method comprises selecting the remaining available memory blocks for the unfulfilled write requests based on a minimum occupancy or randomly.

In a further implementation form of the third aspect, the method comprises associating the egress pipes with the sets of the memory blocks.
The method of the third aspect achieves all advantages and effects described above for the memory device of the first aspect.
A fourth aspect of the present invention provides a computer program product storing a program code for controlling a memory device according to the first aspect or any of its implementation forms and/or a switch according to the second aspect or any of its implementation forms, or for performing, when implemented on a computer, a method according to the third aspect or any of its implementation forms.
Accordingly, with the computer program product of the fourth aspect the advantages of the first, second and third aspect can be achieved, respectively.
It has to be noted that all devices, elements, units and means described in the present application could be implemented in the software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof.
BRIEF DESCRIPTION OF DRAWINGS
The above described aspects and implementation forms of the present invention will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which
FIG. 1 shows a memory device according to an embodiment of the present invention.
FIG. 2 shows a switch according to an embodiment of the present invention.
FIG. 3 shows a method according to an embodiment of the present invention.
FIG. 4 shows simulations of random traffic with uniform distribution through a memory device according to an embodiment of the present invention.
FIG. 5 shows simulations of random traffic with uniform distribution through a memory device according to an embodiment of the present invention.
FIG. 6 shows simulations of random burst traffic with uniform distribution through a memory device according to an embodiment of the present invention.
FIG. 7 shows a conventional switch shared memory architecture.
FIG. 8 shows a conventional switch shared memory architecture with multiple memory blocks and multiple ingress and egress pipes.
DETAILED DESCRIPTION OF EMBODIMENTS
FIG. 1 shows a memory device 100 according to an embodiment of the present invention. The memory device 100 is particularly suited to be implemented in a switch 200 (see FIG. 2). In particular, the memory device 100 can serve as the shared memory architecture of the switch 200.
The memory device 100 comprises a plurality of memory blocks 101, which may each be a conventional memory block, a plurality of ingress pipes 102, and a plurality of egress pipes 103. The ingress pipes 102 and egress pipes 103 may also per se be implemented like conventional pipes. Each ingress pipe 102 is configured to request a write of a packet to the memory blocks 101. The packet may, for instance, be an Ethernet packet. Further, each egress pipe 103 is configured to request a read of a packet from the memory blocks 101. According to the invention, each egress pipe 103 is associated with a set 104 of the memory blocks 101 (this is indicated by the dotted lines between the memory blocks 101 and the egress pipes 103, but does not mean that the egress pipes 103 are only able to read from these memory blocks 101). The sets 104 of the memory blocks 101 may be disjoint sets, i.e. two sets 104 do not share any memory blocks 101. Further, each set 104 may include the same number of memory blocks 101. However, it is also possible that two sets 104 include a different number of memory blocks 101.
As the read requests and read sequences from the different egress pipes 103 into the memory blocks 101 are mandatory and cannot be controlled, the memory device 100 of FIG. 1 controls the writes of arriving packets to the memory blocks 101, such that the probability of a read collision is significantly reduced. Hence, the probability of high latency and low throughput, which is due to read collisions on the same memory blocks 101 by different egress pipes 103, is also reduced.
In the following, an example of the memory device 100 of FIG. 1 is explained. In the memory device 100, a plurality of M memory blocks 101 is divided among a plurality of P egress pipes 103, such that each egress pipe 103 is associated with M/P memory blocks 101. For instance, for M=64 and P=16, each egress pipe 103 may be associated with 4 memory blocks 101 as follows:
Egress pipe 0: memory blocks 0-3
Egress pipe 1: memory blocks 4-7
...
Egress pipe 15: memory blocks 60-63
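A minimal sketch of this association, assuming the contiguous mapping implied by the example (the variable names are illustrative):

```python
M, P = 64, 16                  # memory blocks and egress pipes (example)
blocks_per_pipe = M // P       # = 4 memory blocks per egress pipe

# Soft allocation: egress pipe p is associated with a disjoint set of
# blocks_per_pipe contiguous memory blocks.
block_sets = {p: list(range(p * blocks_per_pipe, (p + 1) * blocks_per_pipe))
              for p in range(P)}

assert block_sets[0] == [0, 1, 2, 3]
assert block_sets[15] == [60, 61, 62, 63]
```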
In general, the memory blocks 101 associated with an egress pipe 103 are selected to be written from any ingress pipe 102, if the packet to be written is destined to that egress pipe 103. Notably, these memory blocks 101 may not be selected for a write if they are full. As there are M/P memory blocks 101 associated with a certain egress pipe 103, and assuming one memory block is used for a read, there remain M/P-1 memory blocks 101 that are dedicated to that egress pipe 103 and available for write, assuming that none of these M/P-1 memory blocks 101 is full.
Given this basic rule of selecting memory blocks 101 for a write, the probability of a collision at a read is reduced to zero if there are M/P-1 or fewer write requests to each egress pipe 103 in the same clock cycle. The selection algorithm of memory blocks for write requests, as shown above for the conventional shared memory architecture, is modified accordingly to produce significantly fewer read collisions.
In particular, the write process of the invention may be divided into two steps: a first step (step 1) to allocate memory blocks 101 to requesting ingress pipes 102 for write per egress pipe 103, but only from the associated range of memory blocks 101, i.e. M/P; and a second step (step 2) to allocate memory blocks 101 to the remaining write requests (that is, writes exceeding M/P-1 for an egress pipe 103 if a read was already allocated, or exceeding M/P for an egress pipe 103 if a read was not allocated). These two steps are described in more detail as follows:
Step 1:
1. For each ingress pipe 102, independently select memory blocks 101 for write from within the M/P associated memory blocks 101.
   a. For example, for M=64 and P=16, memory blocks 0-3 can be selected for any write to egress pipe 0 (unless one of them was already selected for a read).
   b. If fewer than 4 writes are needed for the specific egress pipe 103 (or fewer than 3 writes, where a memory block 101 was already selected for a read), then only some of the M/P (or M/P-1) memory blocks 101 have to be selected. Notably, the invention is not limited to a particular memory block selection process, which may be:
      i. Select memory blocks 101 with minimum occupancy.
      ii. Select memory blocks 101 randomly.
      iii. Etc.
2. Make a list of all ingress pipes 102 with a request that was not fulfilled.
   a. This can occur if:
      i. there are more than M/P-1 write requests to a single egress pipe 103 and a read was allocated to that egress pipe 103, or
      ii. there are more than M/P write requests to a single egress pipe 103 and a read was not allocated to that egress pipe 103.
Step 2:
1. Remaining write requests from all ingress pipes 102 are allocated to the remaining available memory blocks 101.
   a. An available memory block 101 is one that is not already allocated for a read or a write, and that is not full (i.e. it can be written into).
   b. The invention is not limited to a particular memory block selection process, which may be:
      i. Select memory blocks 101 with minimum occupancy.
      ii. Select memory blocks 101 randomly.
      iii. Etc.
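The two steps can be summarized in the following Python sketch; the function signature and data structures are invented for illustration and are not prescribed by the invention:

```python
def allocate_writes(write_reqs, read_blocks, block_sets, occupancy, capacity):
    """Two-step soft-allocation write process (sketch).

    write_reqs:  list of (ingress_pipe, destination_egress_pipe) pairs
    read_blocks: set of memory blocks already granted to reads this cycle
    block_sets:  dict egress_pipe -> list of associated memory blocks
    occupancy:   occupancy[m] = used entries in memory block m
    capacity:    entries per memory block
    """
    taken = set(read_blocks)              # blocks unavailable this cycle
    grants, leftovers = {}, []

    def writable(m):
        return m not in taken and occupancy[m] < capacity

    # Step 1: serve each write from the set associated with its destination
    # egress pipe, least-occupied block first (random choice also allowed).
    for ingress, egress in write_reqs:
        candidates = sorted(filter(writable, block_sets[egress]),
                            key=lambda m: occupancy[m])
        if candidates:
            grants[ingress] = candidates[0]
            taken.add(candidates[0])
        else:
            leftovers.append(ingress)     # associated set exhausted

    # Step 2: remaining requests take any still-available block, even one
    # associated with another egress pipe ("contaminating" it).
    remaining = sorted((m for m in range(len(occupancy)) if writable(m)),
                       key=lambda m: occupancy[m])
    for ingress, m in zip(leftovers, remaining):
        grants[ingress] = m
    return grants
```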
As defined, the memory blocks 101 are “softly allocated” to (associated with) egress pipes 103. This means that the M/P memory blocks 101 per egress pipe 103 are the preferred ones to be used for a write; but if more writes are needed at the same time, the rest of the writes are allocated to other memory blocks 101, thus “contaminating” those with different egress pipes 103 as destinations.

FIG. 2 shows a switch 200 according to an embodiment of the present invention. The switch 200 is particularly suited for packet switching in a networking system, and may be a high bandwidth and high capacity switch. The switch 200 includes at least one memory device 100 according to an embodiment of the present invention, particularly as shown in FIG. 1. As shown in FIG. 2, the switch 200 may further comprise a plurality of input ports 201 and a plurality of output ports 202. Each ingress pipe 102 of the memory device 100 may be associated with a group 203 of the input ports 201, and each egress pipe 103 of the memory device 100 may be associated with a group 204 of the output ports 202.
FIG. 3 shows a method 300 according to an embodiment of the present invention. The method 300 is particularly for controlling a memory device 100 according to an embodiment of the present invention, particularly one as shown in FIG. 1. The method 300 may be carried out by a controller of the memory device 100, or by a controller of a switch 200 (as e.g. shown in FIG. 2), which comprises the memory device 100.
The method 300 comprises a step 301 of selecting, for an egress pipe 103 of the memory device 100 requesting a read of a packet, a memory block 101 of the memory device 100 from a set of memory blocks 101 associated with said egress pipe 103 for the read of the packet. Additionally or alternatively, the method 300 comprises a step 302 of selecting, for any ingress pipe 102 of the memory device 100 requesting a write of a packet destined to a determined egress pipe 103, a memory block 101 of the memory device 100 from a set 104 of memory blocks 101 associated with the determined egress pipe 103 for the write of the packet.
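Continuing the hypothetical sketch from above, one write-allocation cycle of such a controller could be exercised as follows (all values are illustrative):

```python
# Three writes destined to egress pipe 0 and one to egress pipe 1; block 0 is
# already serving a read for pipe 0 and block 2 is full.
requests = [(0, 0), (1, 0), (2, 0), (3, 1)]
alloc = allocate_writes(requests, read_blocks={0}, full={2})
print(alloc)  # {0: 1, 1: 3, 3: 4, 2: 5}; request 2 spills outside pipe 0's range 0-3
```

Note how the first two writes to egress pipe 0 stay inside its associated range (blocks 1 and 3), while the third, unfulfilled request is placed on the next available block outside that range, exactly as step 2 prescribes.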
In the following, the performance improvement of the memory device 100, switch 200, and method 300 according to the embodiments of the present invention is analyzed.
As an example, a memory device 100 containing P=16 egress pipes 103 and M=32-256 memory blocks 101 was simulated for two memory block allocation schemes:
1. Random allocation of memory blocks 101 to write requests (conventional).
2. “Soft allocation” of memory blocks 101 to write requests (egress pipes 103) according to the solution of the present invention.

FIG. 4 shows a first test, which simulates random traffic with uniform distribution. A worst case traffic pattern (small packets) from all input ports was applied, where a random destination with uniform distribution is selected for each packet.
FIG. 5 shows a second test, which simulates random traffic with uniform distribution and periodic bursts. A worst case traffic pattern (small packets) from all input ports was applied, where a random destination with uniform distribution is selected for each packet for 9 clock cycles, and where in the 10th clock cycle all arriving packets are destined to a single destination selected at random with uniform distribution.
FIG. 6 shows a third test, which simulates random burst traffic with uniform distribution. A worst case traffic pattern (small packets) from all input ports is applied, where in every clock cycle a single destination is selected at random with uniform distribution, and all packets arriving in that clock cycle are destined to this single destination.
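For reproducibility, the three traffic patterns could be generated along the following lines; this is a hedged sketch under the assumption of per-cycle destination draws, and the function name destinations is hypothetical:

```python
import random

def destinations(test, cycle, num_packets, P=16):
    """Destination choice per clock cycle for the three simulated traffic patterns."""
    if test == 1 or (test == 2 and cycle % 10 != 9):
        # tests 1 and 2 (outside the burst cycle): uniform random destination per packet
        return [random.randrange(P) for _ in range(num_packets)]
    # test 2 (every 10th cycle) and test 3: one random destination shared by all packets
    burst_destination = random.randrange(P)
    return [burst_destination] * num_packets
```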
In summary of FIG. 4, 5 and 6, all simulation results show a higher transmitted bandwidth at a lower number of memory blocks 101 when using the soft allocation scheme of the present invention (as compared to the conventional random memory block selection). Only at a very high number of memory blocks 101 does the performance of both schemes match.
Notably, a large number of memory blocks 101 increases the overall silicon area and the power consumption. An additional benefit of the allocation scheme of the invention is a reduction in latency: since the overall shared memory structure of the memory device 100 provides more bandwidth, the latency is reduced. Latency is the time for which a packet remains in the memory device 100 or switch 200 before it is transmitted. This time is measured from the arrival of the first byte to the departure of the first byte of the packet.
The present invention has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed invention, from a study of the drawings, this disclosure and the independent claims. In the claims as well as in the description, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.

Claims

1. Memory device (100) for a switch (200), the memory device (100) comprising
a plurality of memory blocks (101),
a plurality of ingress pipes (102), each ingress pipe (102) being configured to request a write of a packet to the memory blocks (101), and
a plurality of egress pipes (103), each egress pipe (103) being configured to request a read of a packet from the memory blocks (101),
wherein each egress pipe (103) is associated with a set (104) of the memory blocks (101).
2. Memory device (100) according to claim 1, wherein
the sets (104) are disjoint sets.
3. Memory device (100) according to claim 1 or 2, wherein
each set (104) includes a same number of memory blocks (101).
4. Memory device (100) according to one of the claims 1 to 3, comprising
a controller configured to select, for an egress pipe (103) requesting a read of a packet, a memory block (101) from the set (104) associated with said egress pipe (103) for the read of the packet.
5. Memory device (100) according to claim 4, wherein
the controller is further configured to select, for any ingress pipe (102) requesting a write of a packet destined to a determined egress pipe (103), a memory block (101) from the set (104) associated with the determined egress pipe (103) for the write of the packet.
6. Memory device (100) according to claim 5, wherein
the controller is further configured to exclude one or more full memory blocks (101) when selecting the memory block (101) from the set (104) associated with the determined egress pipe (103) for the write of the packet.
7. Memory device (100) according to claim 5 or 6, wherein
if a total number of write requests for packets destined to the determined egress pipe (103) is smaller than a number of write-permitted memory blocks (101) in the set (104) associated with the determined egress pipe (103), the controller is further configured to select a memory block (101) from the write-permitted memory blocks (101) in the set (104) associated with the determined egress pipe (103) based on a minimum occupancy or randomly.
8. Memory device (100) according to one of the claims 5 to 7, wherein
if a total number of write requests for packets destined to the determined egress pipe (103) is larger than a number of write-permitted memory blocks (101) in the set (104) associated with the determined egress pipe (103), the controller is further configured to
create a list of all ingress pipes (102) having an unfulfilled write request.
9. Memory device (100) according to one of the claims 5 to 8, wherein
if a total number of write requests for packets destined to the determined egress pipe (103) is larger than a number of write-permitted memory blocks (101) in the set (104) associated with the determined egress pipe (103), the controller is further configured to
select the memory block (101) from another set (104) not associated with the determined egress pipe (103) for the write of the packet, in particular allocate a remaining available memory block (101) to each unfulfilled write request.
10. Memory device (100) according to claim 9, wherein
the controller is configured to select, for the determined egress pipe (103) requesting a read of a packet, a memory block (101) from another set (104) not associated with said determined egress pipe (103) for the read of the packet.
11. Memory device (100) according to claim 9, wherein
the controller is further configured to select the remaining available memory blocks (101) for the unfulfilled write requests based on a minimum occupancy or randomly.
12. Memory device (100) according to one of the claims 4 to 11, wherein
the controller is further configured to associate the egress pipes (103) with the sets (104) of the memory blocks (101).
13. Switch (200) for packet switching, the switch (200) comprising
a memory device (100) according to one of the claims 1 to 12.
14. Switch (200) according to claim 13, comprising
a plurality of input ports (201) and a plurality of output ports (202), wherein each ingress pipe (102) is associated with a group (203) of the input ports (201) and each egress pipe (103) is associated with a group (204) of the output ports (202).
15. Method (300) for controlling a memory device (100) including a plurality of memory blocks (101), ingress pipes (102) and egress pipes (103), the method comprising
selecting (301), for an egress pipe (103) requesting a read of a packet, a memory block (101) from a set of memory blocks (101) associated with said egress pipe (103) for the read of the packet, and/or
selecting (302), for any ingress pipe (102) requesting a write of a packet destined to a determined egress pipe (103), a memory block (101) from a set (104) of memory blocks (101) associated with the determined egress pipe (103) for the write of the packet.
16. Computer program product storing a program code for controlling a memory device (100) according to one of the claims 1 to 12 and/or a switch according to one of the claims 13 or 14, or for performing, when implemented on a computer, a method (300) according to claim 15.

Priority Applications (2)

PCT/EP2018/061671 (WO2019214801A1), priority date 2018-05-07, filing date 2018-05-07: Memory device for a high bandwidth high capacity switch
CN201880093306.2A (CN112088521A), priority date 2018-05-07, filing date 2018-05-07: Memory device for high-bandwidth and high-capacity switch


Publications (1)

Publication Number: WO2019214801A1

Family ID: 62152541


Country Status (2)

CN: CN112088521A
WO: WO2019214801A1

Citations (4)

* Cited by examiner, † Cited by third party

US5790545A * (Motorola Inc.), priority date 1996-03-14, publication date 1998-08-04: Efficient output-request packet switch and method
EP1616415A2 * (Agere Systems Inc.), priority date 2003-04-22, publication date 2006-01-18: Method and apparatus for shared multi-bank memory
US20100054268A1 * (Integrated Device Technology, Inc.), priority date 2006-03-28, publication date 2010-03-04: Method of Tracking Arrival Order of Packets into Plural Queues
US20150026361A1 * (Broadcom Corporation), priority date 2013-07-19, publication date 2015-01-22: Ingress Based Headroom Buffering For Switch Architectures

Family Cites Families (5)

* Cited by examiner, † Cited by third party

US7933283B1 * (Cortina Systems, Inc.), priority date 2008-03-04, publication date 2011-04-26: Shared memory management
US8630286B2 * (Broadcom Corporation), priority date 2011-09-30, publication date 2014-01-14: System and method for improving multicast performance in banked shared memory architectures
EP3241314B1 * (Oracle International Corporation), priority date 2014-12-29, publication date 2021-05-12: System and method for supporting efficient virtual output queue (VOQ) resource utilization in a networking device
CN112737976B * (Oracle International Corporation), priority date 2014-12-29, publication date 2024-05-28: System, method, medium, and apparatus for supporting packet switching
EP3295628B1 * (Cisco Technology, Inc.), priority date 2015-05-13, publication date 2020-11-11: Dynamic protection of shared memory used by output queues in a network device


Also Published As

CN112088521A, publication date 2020-12-15


Legal Events

121 (EP): The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 18724188; country of ref document: EP; kind code of ref document: A1.
NENP: Non-entry into the national phase. Ref country code: DE.
122 (EP): PCT application non-entry in European phase. Ref document number: 18724188; country of ref document: EP; kind code of ref document: A1.